In the Introduction to colorist and Tuning visualizations in colorist vignettes, we provided an overview of colorist
functionality and explored how users might adjust visualizations to address specific narrative goals. Here, we focus on several issues that users may confront when making finished visualizations:
colorist
maps using supplementary spatial datacolorist
legendscolorist
maps in other R spatial packages and external GIS softwareWe start by loading the necessary libraries.
library(colorist)
library(ggplot2)
library(raster)
library(rnaturalearth)
library(sf)
library(tigris)
It’s easy to imagine users wanting to add context to visualizations generated by colorist
, to help explain the causes of particular spatiotemporal distributions. Here, we focus on showing how Natural Earth and US Census TIGER/Line data can be accessed from within R and incorporated into colorist
maps.
rnaturalearth
packageNatural Earth is a public domain data set that contains vector and raster data developed for display at several different map scales. Files describing physical and cultural geographies can be accessed through R using functions in the rnaturalearth
package. Here, we download Natural Earth vector data and add it to a map describing the spatiotemporal distribution of Field Sparrow, a small bird that occurs in the eastern United States. We begin by calculating distribution metrics for Field Sparrow with metrics_distill()
, choosing a palette with palette_timecycle()
, and creating a map with map_single()
. Note that we choose to save the ggplot2
map as an object.
# calculate metrics, choose a palette, make a map
<- metrics_distill(fiespa_occ)
m_fiespa <- palette_timecycle(fiespa_occ)
p_fiespa <- map_single(m_fiespa, p_fiespa) map_fiespa
Then, we use the ne_download()
function from rnaturalearth
to download country, river, and lake spatial data as sf
(simple features) objects, which are easy to incorporate into our existing map. In the process, we project and crop each data set to the same projection and extent as our Field Sparrow data. To avoid annoying topology and projection issues, we create our own ocean polygon by calculating the difference between a polygon describing the extent of the Field Sparrow map and the polygon describing countries.
# download, transform, and crop spatial data
<- ne_download(category = "cultural", type = "countries",
countries returnclass = "sf", scale = 50) %>%
st_transform(crs = st_crs(fiespa_occ)) %>%
st_crop(st_bbox(fiespa_occ))
<- ne_download(category = "physical", type = "rivers_lake_centerlines",
rivers returnclass = "sf", scale = 50) %>%
st_transform(crs = st_crs(fiespa_occ)) %>%
st_crop(st_bbox(fiespa_occ))
<- ne_download(category = "physical", type = "lakes", returnclass = "sf",
lakes scale = 50) %>%
st_transform(crs = st_crs(fiespa_occ)) %>%
st_crop(st_bbox(fiespa_occ))
# create polygon describing ocean
<- st_as_sfc(st_bbox(fiespa_occ)) %>%
ocean st_difference(st_union(countries))
We add the supplementary spatial data to the map by appending geom_sf()
elements to the underlying map and symbolizing each element according to our specifications. We also add a bounding rectangle to our map.
# add supplementary spatial data to map
<- map_fiespa +
map_fiespa_ne geom_sf(data = ocean, fill = "gray95", color = "black", size = 0.25) +
geom_sf(data = rivers, size = 0.25, color = "gray35") +
geom_sf(data = lakes, fill = "gray95", size = 0.25, color = "gray35") +
geom_sf(data = st_as_sfc(st_bbox(fiespa_occ)), fill = NA, color = "black", size = 0.25)
# show the map
print(map_fiespa_ne)
Now, we have a clearer perspective on the spatiotemporal distribution of Field Sparrow in relation to a variety of easily recognizable geographic features in North America.
tigris
packageNext, we integrate spatial data from the tigris
package into a map describing Fisher use of a territory in upstate New York over the course of nine nights. The tigris
package enables download of TIGER/Line data generated by the United States Census Bureau. Files detail a wide array of geographic features, area delineations, and place names throughout the United States.
In contrast to the previous example, we start by using the metrics_pull()
, palette_timeline()
, and map_multiples()
functions to visualize individual raster layers from the Fisher data set. We adjust lambda_i
within map_multiples()
to emphasize areas that were used only briefly by the Fisher.
# calculate metrics, choose a palette, make a series of maps
<- metrics_pull(fisher_ud)
m_fisher <- palette_timeline(9, start_hue = -40)
p_fisher <- map_multiples(m_fisher, p_fisher, lambda_i = -5,
map_fisher labels = paste("April", 7:15))
To download spatial data using tigris
we need to know that the Fisher study area is located in Rensselaer County, New York. Then, we can access linear water features, areal water features, and roads for the county using tigris
functions, projecting and cropping our supplementary spatial data along the way. tigris
functions provide data formatted as sf
objects.
# download, transform, and crop spatial data
<- linear_water("NY", "Rensselaer") %>%
streams st_transform(crs = st_crs(fisher_ud)) %>%
st_crop(st_bbox(fisher_ud))
<- area_water("NY", "Rensselaer") %>%
ponds st_transform(crs = st_crs(fisher_ud)) %>%
st_crop(st_bbox(fisher_ud))
<- roads("NY", "Rensselaer") %>%
roads st_transform(crs = st_crs(fisher_ud)) %>%
st_crop(st_bbox(fisher_ud))
As in the previous example, we add our spatial data to the underlying colorist
map using geom_sf()
elements and we add an unobtrusive bounding box to more clearly define the facets of our plot.
# add supplementary spatial data to the series of maps
+
map_fisher geom_sf(data = streams, linetype = 6, color = "lightblue4", size = 0.25) +
geom_sf(data = ponds, linetype = 6, color = "lightblue4", fill = "lightblue",
size = 0.25) +
geom_sf(data = roads, size = 0.25, color = alpha("black", 0.5)) +
geom_sf(data = st_as_sfc(st_bbox(fisher_ud)), fill = NA, color = "gray",
size = 0.25)
Now we can clearly see how the Fisher moved throughout its territory across nights and in relation to the locations of roads, streams, and ponds.
We recommend trying to keep a “light touch” with edits to colorist
maps to ensure that information about spatiotemporal distributions remains the focus. In addition, it’s important to keep in mind that colorist
visualizations rely on differences in cell opacity to communicate information about differences in intensity; changing the plot background color from white will make the map difficult to interpret.
The legend functions in colorist
generate ggplot2
plot objects, some of which are fairly complicated. Here, we focus on modifying legends to aid interpretation and combining legends and maps within a ggplot2
framework.
legend_timecyle()
When we make a legend for the Field Sparrow annual cycle map described above, the default settings do not communicate the meanings of intensity, specificity, or layer very clearly.
# create a legend using default settings
legend_timecycle(p_fiespa)
Remember, we have distilled distribution metrics from a stack of monthly occurrence probabilities, so intensity values represent maximum occurrence throughout the year, specificity values indicate the degree of seasonality in occurrences, and layer values reveal the month of maximum occurrence. We can edit the legend to clarify these interpretations by making several changes in the legend_timecycle()
function.
# change labels on legend
<- legend_timecycle(p_fiespa,
l_fiespa origin_label = "Jan 1",
# specificity labels
label_s = c("Low\nseasonality",
"Moderate\nseasonality",
"High\nseasonality"),
# intensity label
label_i = "Peak occurrence",
# layers label
label_l = "Month of peak occurrence")
# show legend
print(l_fiespa)
Better.
Now, we need to decide where to position the legend in relation to the map describing the spatiotemporal distribution of Field Sparrow. After playing with several ideas, we place the legend beneath the main map. This requires expanding the spatial extent of the plot using the coord_sf()
function and then inserting the legend into the “empty landscape” using annotation_custom()
. Note that the values in coord_sf()
and annotation_custom()
are in projected map coordinates.
# position legend below map
+
map_fiespa_ne coord_sf(ylim = c(-3153281, 1405830)) +
annotation_custom(ggplotGrob(l_fiespa),
xmin = -1482551, xmax = 1850606,
ymin = -3153281, ymax = -1453281)
legend_timeline()
The strategy for editing details of legend_timeline()
plots is similar to that for legend_timecycle()
plots, despite differences in their layouts. Before changing the legend, however, we should use metrics_distill()
and map_single()
to describe Fisher use of its territory over nine nights in a single map. Because we’re making one larger map, we’re also going to add another layer of spatial data describing building footprints in the area. These features come from the OpenStreeMap Microsoft Building Footprint dataset, a cropped and projected copy of which is available on the GitHub repository for this package.
# calculate distribution metrics
<- metrics_distill(fisher_ud)
m_fisher_distill
# download building footprints
<- file.path(tempdir(), "buildings.rds")
f_buildings download.file(paste0("https://github.com/mstrimas/colorist/raw/master/",
"data-raw/buildings.rds"),
f_buildings)<- readRDS(f_buildings)
buildings unlink(f_buildings)
# make a map
<- map_single(m_fisher_distill, p_fisher,
map_fisher_distill lambda_i = -5, lambda_s = 10) +
geom_sf(data = streams, linetype = 6, color = "lightblue4", size = 0.25) +
geom_sf(data = ponds, linetype = 6, color = "lightblue4", fill = "lightblue",
size = 0.25) +
geom_sf(data = roads, size = 0.25, color = alpha("black", 0.5)) +
geom_sf(data = buildings, size = 0.25, color = alpha("black", 0.5)) +
geom_sf(data = st_as_sfc(st_bbox(fisher_ud)), fill = NA, color = "black",
size = 0.25)
# show the map
print(map_fisher_distill)
Setting lambda_i = -5
makes rarely used parts of the landscape more conspicuous and setting lambda_s = 10
emphasizes differences in how consistently areas were used across nights. Spatial data for Rensselaer County are added to the map using geom_sf()
elements, just as in the previous section.
Now, let us generate a default legend for our Fisher data using legend_timeline()
.
# create a legend using default settings
legend_timeline(p_fisher)
To make the legend more helpful as a guide to interpreting the spatiotemporal distribution of the Fisher, we need to edit intensity, specificity, and layer labels. So, we modify label_s
, label_i
, and label_l
in legend_timeline()
to reflect their meanings with respect to the Fisher data. We also prescribe how the ends of the timeline are labeled using thetime_labels
argument.
# change labels of legend
<- legend_timeline(p_fisher,
l_fisher time_labels = c("Apr 7", "Apr 15"),
# intensity label
label_i = "Peak use",
# layer label
label_l = "Night of peak use",
# specificity labels
label_s = c("Consistent use", "Occasional use", "Ephemeral use"))
# show legend
print(l_fisher)
To place our legend next to the map, we expand the extent of our plot by modifying coord_sf()
and we specify its position using the annotation_custom()
function.
# position legend to the left of the map
+
map_fisher_distill coord_sf(xlim = c(-4300, 2150)) +
annotation_custom(ggplotGrob(l_fisher),
xmin = -4400, xmax = -2500,
ymin = 5100266, ymax = 5104666)
legend_set()
Here, we return to data describing the utilization distributions of two African Elephants in Etosha National Park in 2011. The two layers of the RasterStack form an unordered set of distributions, one for each individual. To visualize how the elephants partitioned space, we begin by calculating distribution metrics using metrics_distill()
, creating a palette with palette_set()
, and generating a map with map_single()
. We add visual emphasis to rarely used areas via the lambda_i
argument.
# calculate metrics, choose a palette, make a series of maps
<- metrics_distill(elephant_ud)
m_elephants <- palette_set(elephant_ud)
p_elephants <- map_single(m_elephants, p_elephants, lambda_i = -5) map_elephants
This should all seem familiar.
Next, we build a legend using our palette and the default settings in legend_set()
.
# create a legend using default settings
legend_set(p_elephants)
The legend is meant to tell us how the two individuals partitioned space in Etosha National Park over the course of a year, allowing viewers to understand locations that were used predominantly by one individual, or the other, and locations that were used equally by both individuals.
To help communicate this information more clearly, we should make some adjustments to the legend. We can label the two facets with individual identifiers using the group_labels
argument in legend_set()
and we can modifylabel_i
,label_s
, and axis_s
values to orient viewers.
# change labels of legend
<- legend_set(p_elephants,
l_elephants group_labels = names(elephant_ud),
# intensity label
label_i = "Intensity of use",
# specificity label
label_s = "Share of use",
# specificty axis
axis_s = c("equal", "sole"))
# show legend
print(l_elephants)
Now, we have a legend that further reveals the narrative of space use by LA11 and LA14 in 2011 and we can position it within the map using annotation_custom()
.
# position legend within map
+
map_elephants annotation_custom(ggplotGrob(l_elephants),
xmin = -3000, xmax = 37000,
ymin = -2347500, ymax = -2305000)
colorist
maps in other softwarecolorist
uses ggplot2
to construct visualizations because it is an incredibly flexible visualization tool that enables maps and legends to be amended and adjusted in many different ways. However, some users may want to employ other tools for visualizing colorist
outputs, because of their greater familiarity with those tools or because of specific features that are not available in colorist
or ggplot2
.
Many of the R packages that enable plotting of spatial data (and most external GIS software) rely on RGBa descriptions of color and opacity instead of the hexadecimal-alpha descriptions used in colorist
functions. To accommodate display of colorist
maps in these other tools, the map_single()
function can be used to return a RasterStack
object containing layers of red, green, blue, and alpha information. Users can then visualize those data in a variety of other tools.
raster
packageThe raster
package is a workhorse for handling and analyzing gridded spatial data in R. Here, we return to our Field Sparrow data, but instead of accepting the default colorist
map output, we specify return_type = "stack"
in the map_single()
function. This change directs map_single()
to return an RGBa stack instead of a ggplot2
plot object.
# ask for map as a raster stack
<- map_single(m_fiespa, p_fiespa, return_type = "stack")
s_fiesp
# show stack information
s_fiesp
With a RasterStack
describing our map in RGBa values, we can now opt to visualize the data using plot functions available in the raster
package. The standard raster::plot()
function illustrates each layer of the RasterStack in a standard color palette. In addition to the R, G, B, and alpha layers, note the last layer (i.e., n_layers
) which describes, in every cell, the number of layers in the original fiespa_occ
data set containing non-NA occurrence values.
# plot layers using raster plot function
plot(s_fiesp)
To reconstruct our colorist
map with the appropriate color scheme, we visualize the stack using raster::plotRGB()
, which requires specification of R, G, and B layers. It also requires specification of alpha values as a vector. If all has gone well, the resulting plot should look the same as the ggplot2
plot generated within colorist
.
# plot r, g, b, and alpha layers to reconstruct colorist map
plotRGB(s_fiesp, 1, 2, 3, alpha = as.vector(s_fiesp[["alpha"]]))
You can then add supplementary spatial data to your raster using other plot functions within raster
and related packages.
There are situations in which you may want to use external GIS software instead of R to create a refined cartographic visualization of colorist
output. Here, we show how to recreate colorist
maps in QGIS, a powerful and free GIS.
Here, we reuse metrics and palette information that we have already generated for two African Elephants to create a map. Instead of accepting the default settings in the map_single()
function, however, we specify return_type = "stack"
and save the output as a RasterStack
.
# ask for map as a RasterStack object
<- map_single(m_elephants, p_elephants, lambda_i = -5,
s_elephants return_type = "stack")
# save RasterStack to R working directory
writeRaster(s_elephants, "elephants_rgba.grd")
Then, we drag the resulting file into an empty QGIS workspace to begin the visualization process.
The default symbology assigned to the multiband raster by QGIS makes it appear as if our map were created using a box of pastels. To correct the appearance of the map colors, double-click on the elephant_rgba layer to open Layer Properties and ensure that data are being rendered in multiband color, RGB bands are in the correct order, and min and max values for all three bands are 0 and 255, respectively. In addition, be sure that band 4 (i.e., alpha) is specified as the transparency band in the Transparency tab of Layer Properties.
After clicking Apply, the thumbnail at the base of the layer properties interface should correctly reproduce the map we were able to generate in R. Click OK. Avoid making further changes to the symbology of elephant_rgba, otherwise map colors will not match those generated in the colorist
palette and the map will become difficult to interpret. Now, users can add and symbolize spatial data using sophisticated cartographic tools available within QGIS.