{"id":539,"date":"2018-06-10T18:55:01","date_gmt":"2018-06-10T22:55:01","guid":{"rendered":"https:\/\/micah.waldste.in\/?p=539"},"modified":"2018-06-24T18:56:54","modified_gmt":"2018-06-24T22:56:54","slug":"introduction-to-spherical-densities-in-r","status":"publish","type":"post","link":"https:\/\/micah.waldste.in\/blog\/blog\/2018\/06\/introduction-to-spherical-densities-in-r\/","title":{"rendered":"Introduction to Spherical Densities in R"},"content":{"rendered":"<p><em>DISCLAIMER: While I know a thing or two, there&#39;s a reasonable chance I got\r\nsome things wrong or at very least there are certainly more efficient ways to\r\ngo about things. Feedback always appreciated!<\/em><\/p>\r\n\r\n<p>It always happens&hellip; I get interested in what I think will be a small data\r\nproject to scratch some itch and end up down a deep rabbit hole. In this case,\r\na passing interest in the geographic distribution of some samples (more on that\r\nin a future post) led to a deep dive into spherical distributions and\r\ndensities.<\/p>\r\n\r\n<h1>Motivation<\/h1>\r\n\r\n<p>While I got interested in figuring out densities for the purpose of figuring\r\nout the density of points on a map, there are plenty of other cases where you\r\nmight be interested in the distribution of points on a sphere. The trouble is\r\nthat most functions commonly available, e.g. <code>geom_density_2d<\/code> from ggplot2,\r\nonly handles regular grid coordinates.<\/p>\r\n\r\n<p>The forms:<\/p>\r\n\r\n<ul>\r\n<li>Global densities simply fail at the &#39;edge&#39; of coordinates &#8211; e.g. near the\r\npoles or near +\/- 180 degrees longitude.<\/li>\r\n<li>Projection issues. On small scales and near the equator, it is generally\r\nsafe to make the simplification that longitude\/latitude forms a square grid.\r\nAs you look to larger scales and close to the poles, that assumption breaks\r\ndown.<\/li>\r\n<\/ul>\r\n\r\n<p>I think it is important to point out that there are many tutorials on plotting\r\nevent densities on maps (e.g. crime occurrences), but that these are all at the\r\ncity level, where the problems of using existing methods is a reasonable\r\napproximation.<\/p>\r\n\r\n<h1>Set-Up<\/h1>\r\n\r\n<p>First, we&#39;ll make use of a number of libraries and setup our plotting\r\nenvironment:<\/p>\r\n\r\n<pre><code class=\"r\">library(ggplot2)     # For most of our plotting\r\nlibrary(cowplot)     # grid arrangement of plots\r\nlibrary(Directional) # For spherical density functions\r\nlibrary(maps)        # vector maps of the world\r\nlibrary(hrbrthemes)  # hrbrmstr themes\r\n\r\n# And set some theme defaults\r\ntheme_set(theme_ipsum())\r\n# Axis settings we&#39;ll reuse a lot\r\nno.axis &lt;- theme(axis.ticks.y = element_blank(), axis.text.y = element_blank(),\r\n                 axis.ticks.x = element_blank(), axis.text.x = element_blank(),\r\n                 axis.title.x = element_blank(), axis.title.y = element_blank())\r\n<\/code><\/pre>\r\n\r\n<p>Next, for this example, we&#39;ll be using a random blob placed on a sphere. I&#39;ll\r\nuse the <code>rvmf<\/code> function from the Directional package. Directional is a general\r\npurpose library using Latitude defined from 0 to 180 degrees and Longitude from\r\n0 to 360 instead of -90 to 90 and -180 to 180 respectively. The <code>random_points<\/code>\r\nfunction here gives us points in a coordinate system we&#39;re used to.<\/p>\r\n\r\n<pre><code class=\"r\">random_points &lt;- function(n_points, lat, lon, concentration) {\r\n  # Directional defines lat + long as 0-180 and 0-360 respectively so we\r\n  # have to shift back and forth\r\n  mu &lt;- euclid(c(lat + 90, lon + 180))[1,]\r\n  pts &lt;- euclid.inv(rvmf(n_points, mu, concentration))\r\n  pts[,1] &lt;- pts[,1] - 90\r\n  pts[,2] &lt;- pts[,2] - 180\r\n  data.frame(pts)\r\n}\r\n<\/code><\/pre>\r\n\r\n<h1>Problem<\/h1>\r\n\r\n<p>To  visualize the problem, we&#39;ll create 2 sets of points, one centered on the\r\nmap, the other near the pole and near 180 degrees. We&#39;ll then plot the contours\r\nof the densities to show the issue.<\/p>\r\n\r\n<pre><code class=\"r\">offset.pos &lt;- list(Lat = 75, Long = 175)\r\npositions.center &lt;- random_points(1000, 0, 0, 10)\r\npositions.offset &lt;- random_points(1000, offset.pos$Lat, offset.pos$Long, 10)\r\nplot.colors &lt;- hcl(h = c(0:3)*90, c = 50 , l = 70)\r\ng.base &lt;- ggplot(positions.center, aes(x = Long, y = Lat)) +\r\n          scale_y_continuous(breaks = (-2:2) * 30, limits = c(-90, 90)) +\r\n          scale_x_continuous(breaks = (-4:4) * 45, limits = c(-180, 180)) +\r\n          coord_map()\r\n\r\ng.broken &lt;- g.base +\r\n     # The centered random points\r\n     geom_density_2d(color = plot.colors[1]) +\r\n     geom_point(size = 0.5, stroke = 0, color = plot.colors[1]) +\r\n     # The offset random points\r\n     geom_density_2d(data = positions.offset, color = plot.colors[2]) +\r\n     geom_point(data = positions.offset, size = 0.5, stroke = 0,\r\n                color = plot.colors[2])\r\n\r\northo.projections &lt;- plot_grid(\r\n  g.broken + coord_map(&quot;ortho&quot;, orientation = c(0, 0, 0)) + no.axis,\r\n  g.broken + coord_map(&quot;ortho&quot;, orientation = c(offset.pos$Lat, offset.pos$Long, 0))\r\n           + no.axis,\r\n  labels = NULL,\r\n  align = &#39;h&#39;)\r\ng.broken\r\northo.projections\r\n<\/code><\/pre>\r\n\r\n<p><img decoding=\"async\" src=\"https:\/\/i.imgur.com\/oz4xjBt.png\" alt=\"plot of chunk problem\"\/><img decoding=\"async\" src=\"https:\/\/i.imgur.com\/XHLNc5A.png\" alt=\"plot of chunk problem\"\/><\/p>\r\n\r\n<p>We can quickly see the problem looking at the blue offset density plot &#8211; there\r\nare multiple &ldquo;centers&rdquo; and the contours don&#39;t connect cleanly.<\/p>\r\n\r\n<h1>Spherical Densities<\/h1>\r\n\r\n<p>The solution is to use spherical densities an fortunately, the Directional\r\npackage provides functions for spherical (and in fact, circular and spheres of\r\narbitrary dimensions) distributions using the <a href=\"https:\/\/en.wikipedia.org\/wiki\/Von_Mises%E2%80%93Fisher_distribution\">von Mises-Fisher\r\ndistribution<\/a>.<\/p>\r\n\r\n<p>Our basic approach will be the following steps:<\/p>\r\n\r\n<ul>\r\n<li>Calculate a &ldquo;grid&rdquo; of densities manually, covering the entire globe<\/li>\r\n<li>Use geom_contour to turn those density maps into contour curves<\/li>\r\n<li>Plot away!<\/li>\r\n<\/ul>\r\n\r\n<p>Before we fix the problem using spherical densities, we first need to do some\r\nsetup. We&#39;ll be using <code>vmf.kerncontour<\/code> from the Directional library, but in\r\ncurrent CRAN version (3.2), that function plots contours itself. We want to\r\nget the data to perform the plots ourselves, so we need a version that returns\r\nthe data. The next version of the package will have that option, but in the\r\nmeantime we put the code for the revised function in the Appendix as\r\n<code>vmf.kerncontour.new<\/code>.<\/p>\r\n\r\n<p>Similar to what we did for <code>random_points<\/code>, we also need to perform some\r\ntranslation of <code>vmf.kerncontour<\/code>&#39;s input and output to out more familiar\r\nformats.<\/p>\r\n\r\n<pre><code class=\"r\">vmf_density_grid &lt;- function(u, ngrid = 100) {\r\n  # Translate to (0,180) and (0,360)\r\n  u[,1] &lt;- u[,1] + 90\r\n  u[,2] &lt;- u[,2] + 180\r\n  res &lt;- vmf.kerncontour.new(u, thumb = &quot;none&quot;, ret.all = T, full = T,\r\n                             ngrid = ngrid)\r\n\r\n  # Translate back to (-90, 90) and (-180, 180) and create a grid of\r\n  # coordinates\r\n  ret &lt;- expand.grid(Lat = res$Lat - 90, Long = res$Long - 180)\r\n  ret$Density = c(res$d)\r\n  ret\r\n}\r\n<\/code><\/pre>\r\n\r\n<p>Now we can go ahead and calculate the densities and plot the contours. We&#39;ll\r\nkeep the &ldquo;bad&rdquo; contours for comparison.<\/p>\r\n\r\n<pre><code class=\"r\">densities.center &lt;- vmf_density_grid(positions.center)\r\ndensities.offset &lt;- vmf_density_grid(positions.offset)\r\n\r\ng.broken &lt;- g.base +\r\n     geom_density_2d(color = plot.colors[1], alpha = .5) +\r\n     geom_point(size = 0.5, stroke = 0, color = plot.colors[1], alpha = .5) +\r\n     geom_density_2d(data = positions.offset, color = plot.colors[2], alpha = .5) +\r\n     geom_point(data = positions.offset, size = 0.5, stroke = 0, color =\r\n                plot.colors[2], alpha = .5)\r\n\r\ng.densities &lt;- g.broken +\r\n  geom_contour(data = densities.center,\r\n               aes(x=Long, y=Lat, z=Density),\r\n               color = plot.colors[3]) +\r\n  geom_contour(data = densities.offset,\r\n               aes(x=Long, y=Lat, z=Density),\r\n               color = plot.colors[4])\r\n\r\northo.projections &lt;- plot_grid(\r\n  g.densities + coord_map(&quot;ortho&quot;, orientation = c(0, 0, 0)) + no.axis,\r\n  g.densities + coord_map(&quot;ortho&quot;,\r\n                          orientation = c(offset.pos$Lat, offset.pos$Long, 0))\r\n              + no.axis,\r\n  labels = NULL,\r\n  align = &#39;h&#39;)\r\ng.densities\r\northo.projections\r\n<\/code><\/pre>\r\n\r\n<p><img decoding=\"async\" src=\"https:\/\/i.imgur.com\/uTvEMzO.png\" alt=\"plot of chunk fixed_densitites\"\/><img decoding=\"async\" src=\"https:\/\/i.imgur.com\/oecjPmI.png\" alt=\"plot of chunk fixed_densitites\"\/><\/p>\r\n\r\n<p>Particularly looking at the orthographic plots, it is easy to see that the\r\nspherical density process gives the same rings in both locations, with\r\ncontinuous curves.<\/p>\r\n\r\n<h2>Practical Example: Global Earthquakes<\/h2>\r\n\r\n<p>Earthquake density is used in one of the few existing attempts to perform\r\ndensity calculations with spherical coordiates on\r\n<a href=\"https:\/\/www.r-bloggers.com\/circular-or-spherical-data-and-density-estimation\/\">R-Bloggers<\/a>.\r\nThe <a href=\"http:\/\/www.ncedc.org\/anss\/catalog-search.html\">Northern California Earthquake Data\r\nCenter<\/a> provides an archive of\r\nearthquakes for download, so we start with a set of quakes since Jan 1, 1950 of\r\nmagnitude 5.9 or higher.\r\nGiven that data, we then follow the same process as we did with our random data\r\nto plot both the 2d density contours and the density contours using spherical\r\nfunctions.<\/p>\r\n\r\n<pre><code class=\"r\">earthquakes &lt;- read.csv(file.path(&quot;..&quot;, &quot;data&quot;, &quot;earthquakes.csv&quot;))\r\nearthquake.densities &lt;- vmf_density_grid(earthquakes[,c(&quot;Latitude&quot;,\r\n                                                        &quot;Longitude&quot;)],\r\n                                         ngrid = 300)\r\n<\/code><\/pre>\r\n\r\n<pre><code class=\"r\">world &lt;- map_data(&quot;world&quot;)\r\ng.earthquakes &lt;- ggplot() +\r\n  geom_map(data = world, map = world,\r\n           mapping = aes(map_id = region),\r\n           color = &quot;grey90&quot;, fill = &quot;grey80&quot;) +\r\n  geom_point(data = earthquakes,\r\n             mapping = aes(x = Longitude, y = Latitude),\r\n             color = &quot;red&quot;, alpha = .2, size = .5, stroke = 0) +\r\n  geom_density_2d(data = earthquakes,\r\n                  aes(x=Longitude, y=Latitude),\r\n                  color = plot.colors[2], alpha = 1) +\r\n  geom_contour(data = earthquake.densities, aes(x=Long, y=Lat, z=Density),\r\n               color = plot.colors[4]) +\r\n  scale_y_continuous(breaks = (-2:2) * 30, limits = c(-90, 90)) +\r\n  scale_x_continuous(breaks = (-4:4) * 45, limits = c(-180, 180)) +\r\n  coord_map(&quot;mercator&quot;)\r\n\r\ng.earthquakes\r\n<\/code><\/pre>\r\n\r\n<p><img decoding=\"async\" src=\"https:\/\/i.imgur.com\/lCvudBY.png\" alt=\"plot of chunk earthquake_plot\"\/><\/p>\r\n\r\n<pre><code class=\"r\"># We use the built-in knitr options to animate the globe\r\nn.frames &lt;- 30\r\nfor (i in 1:n.frames) {\r\n  long &lt;- 170 + (i - 1) * 360 \/ n.frames\r\n  # We Explicitly use the &#39;plot&#39; command to show the ggplot\r\n  plot(g.earthquakes + coord_map(&quot;ortho&quot;, orientation = c(0, long, 0)) + no.axis)\r\n}\r\n<\/code><\/pre>\r\n\r\n<p><img decoding=\"async\" src=\"https:\/\/i.imgur.com\/UCO91cx.gif\" alt=\"plot of chunk earthquake_ani\"\/><\/p>\r\n\r\n<p>The yellow shows default 2d density, and you can again see the continuity\r\nproblems. The blue shows the expected <a href=\"https:\/\/en.wikipedia.org\/wiki\/Ring_of_fire\">Ring of\r\nFire<\/a> thanks to the spherical\r\ndensity. It isn&#39;t perfect &#8211; if we were really interested in the most accurate\r\nresults, we&#39;d probably want to turn up the grid size to better follow the\r\nchains of quakes or tweak the contour breakpoints to see the fine features.<\/p>\r\n\r\n<p>This should be a good first step to looking at densities in geo events.<\/p>\r\n\r\n<h1>Next<\/h1>\r\n\r\n<p>While this should have given a good introduction to densities on a sphere and\r\nthe issues with using the default density functions, there is still more we can\r\ndo. We&#39;ve got a few more posts coming:<\/p>\r\n\r\n<ul>\r\n<li><strong>Heatmaps<\/strong> &#8211; Working with heatmaps means generating raster data and\r\nprojections with raster data adds more complexity<\/li>\r\n<li><strong>More Real Examples<\/strong> &#8211; I mentioned I had an actual project I was curious\r\nabout, right?<\/li>\r\n<\/ul>\r\n\r\n<h1>Appendix<\/h1>\r\n\r\n<h2>Spherical Density Function<\/h2>\r\n\r\n<p>This calculates a grid of densities which can then be used with <code>geom_contour<\/code>.\r\nThe code basically comes directly from <a href=\"https:\/\/rdrr.io\/cran\/Directional\/man\/vmf.kerncontour.html\">Directional&#39;s\r\nvmf.kerncontour<\/a>,\r\nonly returning a data.frame instead of actually plotting the output.<\/p>\r\n\r\n<pre><code class=\"r\">vmf.kerncontour.new &lt;- function(u, thumb = &quot;none&quot;, ret.all = FALSE, full = FALSE,\r\n                            ngrid = 100) {\r\n  ## u contains the data in latitude and longitude\r\n  ## the first column is the latitude and the\r\n  ## second column is the longitude\r\n  ## thumb is either &#39;none&#39; (default), or &#39;rot&#39; (Garcia-Portugues, 2013)\r\n  ## ret.all if set to TRUE returns a matrix with latitude, longitude and density\r\n  ## full if set to TRUE calculates densities for the full sphere, otherwise\r\n  ##   using extents of the data\r\n  ## ngrid specifies the number of points taken at each axis\r\n  n &lt;- dim(u)[1]  ## sample size\r\n  x &lt;- euclid(u)\r\n\r\n  if (thumb == &quot;none&quot;) {\r\n    h &lt;- as.numeric( vmfkde.tune(x, low = 0.1, up = 1)[1] )\r\n  } else if (thumb == &quot;rot&quot;) {\r\n    k &lt;- vmf(x)$kappa\r\n    h &lt;- ( (8 * sinh(k)^2) \/ (k * n * ( (1 + 4 * k^2) * sinh(2 * k) -\r\n    2 * k * cosh(2 * k)) ) ) ^ ( 1\/6 )\r\n  }\r\n\r\n  if (full) {\r\n    x1 &lt;- seq( 0, 180, length = ngrid )  ## latitude\r\n    x2 &lt;- seq( 0, 360, length = ngrid )  ## longitude\r\n  } else {\r\n    x1 &lt;- seq( min(u[, 1]) - 5, max(u[, 1]) + 5, length = ngrid )  ## latitude\r\n    x2 &lt;- seq( min(u[, 2]) - 5, max(u[, 2]) + 5, length = ngrid )  ## longitude\r\n  }\r\n  cpk &lt;- 1 \/ (  ( h^2)^0.5 *(2 * pi)^1.5 * besselI(1\/h^2, 0.5) )\r\n  mat &lt;- matrix(nrow = ngrid, ncol = ngrid)\r\n\r\n  for (i in 1:ngrid) {\r\n    for (j in 1:ngrid) {\r\n      y &lt;- euclid( c(x1[i], x2[j]) )\r\n      a &lt;- as.vector( tcrossprod(x, y \/ h^2) )\r\n      can &lt;- sum( exp(a + log(cpk)) ) \/ ngrid\r\n      if (abs(can) &lt; Inf)   mat[i, j] &lt;- can\r\n    }\r\n  }\r\n\r\n  if (ret.all) {\r\n    return(list(Lat = x1, Long = x2, h = h, d = mat))\r\n  } else {\r\n    contour(mat$Lat, mat$Long, mat, nlevels = 10, col = 2, xlab = &quot;Latitude&quot;,\r\n            ylab = &quot;Longitude&quot;)\r\n    points(u[, 1], u[, 2])\r\n  }\r\n}\r\n<\/code><\/pre>\r\n\r\n<h2>References<\/h2>\r\n\r\n<ul>\r\n<li>Earthquake data was accessed through the <a href=\"http:\/\/www.ncedc.org\/anss\/catalog-search.html\">Northern California Earthquake Data Center (NCEDC)<\/a>, doi:10.7932\/NCEDC.<\/li>\r\n<\/ul>","protected":false},"excerpt":{"rendered":"DISCLAIMER: While I know a thing or two, there&#39;s a reasonable chance I got some things wrong or at very least there are certainly more efficient ways to go about things. Feedback always appreciated! It always happens&hellip; I get interested in what I think will be a small data project to scratch some itch and &hellip; <p class=\"link-more\"><a href=\"https:\/\/micah.waldste.in\/blog\/blog\/2018\/06\/introduction-to-spherical-densities-in-r\/\" class=\"more-link\">Continue reading<span class=\"screen-reader-text\"> &#8220;Introduction to Spherical Densities in R&#8221;<\/span><\/a><\/p>","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[16],"tags":[],"class_list":["post-539","post","type-post","status-publish","format-standard","hentry","category-rstats","no-wpautop"],"_links":{"self":[{"href":"https:\/\/micah.waldste.in\/blog\/wp-json\/wp\/v2\/posts\/539","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/micah.waldste.in\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/micah.waldste.in\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/micah.waldste.in\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/micah.waldste.in\/blog\/wp-json\/wp\/v2\/comments?post=539"}],"version-history":[{"count":1,"href":"https:\/\/micah.waldste.in\/blog\/wp-json\/wp\/v2\/posts\/539\/revisions"}],"predecessor-version":[{"id":540,"href":"https:\/\/micah.waldste.in\/blog\/wp-json\/wp\/v2\/posts\/539\/revisions\/540"}],"wp:attachment":[{"href":"https:\/\/micah.waldste.in\/blog\/wp-json\/wp\/v2\/media?parent=539"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/micah.waldste.in\/blog\/wp-json\/wp\/v2\/categories?post=539"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/micah.waldste.in\/blog\/wp-json\/wp\/v2\/tags?post=539"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}