A journey into mapping

mapping allows to easily show data using maps, automatically link spatial coordinates to data, and transform data with spatial coordinates. The mapping functions use the already available and well implemented function in tmap, cartography, and leaflet packages.

Given the different boundary structure of each country, the package provide specific mapping functions and building object functions to link e manipulate data with spatial coordinates.

Country	Object creation function	Object Class	Map Function
World	`WR()`	WR	`mappingWR()`
European Union	`EU()`	EU	`mappingEU()`
Italy	`IT()`	IT	`mappingIT()`
United states of america	`US()`	US	`mappingUS()`
Germany	`DE()`	DE	`mappingDE()`
United kingdom	`UK()`	UK	`mappingUK()`
France	`FR()`	FR	`mappingFR()`

Datasets

Description	Dataset name
World population	`WR()`
European population nuts 2	`EU()`
Italy	`IT()`
USA 208-2016 election results	`US()`
Tax wedge OCDE countries 2018

data("popWR")
str(popWR)
## 'data.frame':    269 obs. of  5 variables:
##  $ country     : Factor w/ 265 levels "","Afghanistan",..: 2 3 4 5 6 7 8 10 11 12 ...
##  $ country_code: Factor w/ 265 levels "","ABW","AFG",..: 3 5 60 11 6 4 12 9 10 2 ...
##  $ total       : num  37172386 2866376 42228429 55465 77006 ...
##  $ male        : num  19093281 1460043 21332000 NA NA ...
##  $ female      : num  18079105 1406333 20896429 NA NA ...

data("popEU")
str(popEU)
## 'data.frame':    2252 obs. of  5 variables:
##  $ TIME  : num  2019 2019 2019 2019 2019 ...
##  $ GEO   : chr  "BE" "BE1" "BE10" "BE100" ...
##  $ total : num  11455519 1215290 1215290 1215290 6596233 ...
##  $ male  : num  5644826 597008 597008 597008 3265134 ...
##  $ female: num  5810693 618282 618282 618282 3331099 ...

data("popIT")
str(popIT)
## 'data.frame':    107 obs. of  4 variables:
##  $ ID     : chr  "Roma" "Milano" "Napoli" "Torino" ...
##  $ maschi : num  2081239 1576316 1497289 1092504 624201 ...
##  $ femmine: num  2260973 1673999 1587601 1167019 641753 ...
##  $ totale : num  4342212 3250315 3084890 2259523 1265954 ...

data("popUS")
str(popUS)
## 'data.frame':    52 obs. of  2 variables:
##  $ id        : chr  "Maine" "North Carolina" "Georgia" "Alaska" ...
##  $ population: int  1338404 10383620 10519475 737438 4887871 626299 3034392 1805832 3943079 5813568 ...

Load coordinates and check names

Coordinates can be separately downloaded using this specific functions

Coordinates	Function
World	`loadCoordWR()`
European Union	`loadCoordEU()`
Italy	`loadCoordIT()`
United States of America	`loadCoordUS()`

Coordinates are download from the GitHub repository , which provides .geojson and .RData files with coordinates, which return an object of class sf.

coord_eu <- loadCoordEU(unit = "nuts0")

The unit argument in the load functions, indicates the type of statistical unit, geographical subdivision or level of aggregation.

library(tmap)
tm_shape(coord_eu) + tm_borders()

Returning an object of class sf, we can also use the mapping function available in the other R packages.

Note that, the data are downloaded from an online repository, and then an internet connection should be preferred. Nevertheless, if the use_internet argument set to FALSE, we will get the coordinates locally available in the package.

checkNames functions will return the nomatching names:

checkNamesIT(popIT$ID, unit = "provincia")

GetNames functions returns the names used in the packages for each unit.

head(getNamesEU(unit = "nuts0"))
##       country iso2 iso3 country_code nuts0_id       nuts0
## 1     Germany   DE  DEU          276       DE     Germany
## 2     Czechia   CZ  CZE          203       CZ     Czechia
## 3    Bulgaria   BG  BGR          100       BG    Bulgaria
## 4 Switzerland   CH  CHE          756       CH Switzerland
## 5     Albania   AL  ALB            8       AL     Albania
## 6     Austria   AT  AUT           40       AT     Austria
head(getNamesEU(unit = "nuts0", all_levels = FALSE))
## [1] "Germany"     "Czechia"     "Bulgaria"    "Switzerland" "Albania"    
## [6] "Austria"

Building a mapping object

The popIT, as showed in the previous section, does not contain any information about the geographical geometries:

str(popIT)
## 'data.frame':    107 obs. of  4 variables:
##  $ ID     : chr  "Roma" "Milano" "Napoli" "Torino" ...
##  $ maschi : num  2081239 1576316 1497289 1092504 624201 ...
##  $ femmine: num  2260973 1673999 1587601 1167019 641753 ...
##  $ totale : num  4342212 3250315 3084890 2259523 1265954 ...

then, the coordinates are added as follows:

it <- IT(data = popIT, unit = "provincia", year = "2018",colID = "ID")

We have to specify the type of statistical unit, the column containing the ids and, if necessary, the year of the subdivision, and the functions will automatically download the coordinates and link them to the data.

str(it,1)
## Classes 'sf', 'IT', 'IT' and 'data.frame':   107 obs. of  11 variables:
##  $ ripartizione     : chr  "Nord-ovest" "Nord-ovest" "Nord-ovest" "Nord-ovest" ...
##  $ regione          : chr  "Piemonte" "Piemonte" "Piemonte" "Piemonte" ...
##  $ code_ripartizione: int  1 1 1 1 1 1 1 1 1 1 ...
##  $ code_regione     : int  1 1 1 1 1 1 2 7 7 7 ...
##  $ code_provincia   : int  1 2 3 4 5 6 7 8 9 10 ...
##  $ ID               : chr  "torino" "vercelli" "novara" "cuneo" ...
##  $ code             : chr  "TO" "VC" "NO" "CN" ...
##  $ maschi           : num  1092504 82848 179588 289459 105011 ...
##  $ femmine          : num  1167019 88063 189430 297639 109627 ...
##  $ totale           : num  2259523 170911 369018 587098 214638 ...
##  $ geometry         :sfc_MULTIPOLYGON of length 107; first list element: List of 1
##   ..- attr(*, "class")= chr [1:3] "XY" "MULTIPOLYGON" "sfg"
##  - attr(*, "sf_column")= chr "geometry"
##  - attr(*, "agr")= Factor w/ 3 levels "constant","aggregate",..: NA NA NA NA NA NA NA NA NA NA
##   ..- attr(*, "names")= chr [1:10] "ripartizione" "regione" "code_ripartizione" "code_regione" ...
##  - attr(*, "unit")= chr "provincia"
##  - attr(*, "year")= chr "2018"
##  - attr(*, "colID")= chr "ID"

In linking the data and the coordinates, the functions available in this packages will return also the information about larger units

library(tmap)
tm_shape(it) + tm_borders() + tm_fill("totale")

Data can be subsets before mapping

it <- IT(data = popIT, unit = "provincia", 
         year = "2018",colID = "ID",
         subset = ~ I(regione == "Lazio"))

tm_shape(it) + tm_borders()

and add variables

it <- IT(data = popIT, unit = "provincia", 
         year = "2018",colID = "ID",
         add = ~I(maschi/totale) + I(femmine/totale), 
         new_var_names = c("Male percentage", "Female percentage"),
         print = FALSE)

str(it,1)
## Classes 'sf', 'IT', 'IT' and 'data.frame':   107 obs. of  13 variables:
##  $ ripartizione     : chr  "Nord-ovest" "Nord-ovest" "Nord-ovest" "Nord-ovest" ...
##  $ regione          : chr  "Piemonte" "Piemonte" "Piemonte" "Piemonte" ...
##  $ code_ripartizione: int  1 1 1 1 1 1 1 1 1 1 ...
##  $ code_regione     : int  1 1 1 1 1 1 2 7 7 7 ...
##  $ code_provincia   : int  1 2 3 4 5 6 7 8 9 10 ...
##  $ ID               : chr  "torino" "vercelli" "novara" "cuneo" ...
##  $ code             : chr  "TO" "VC" "NO" "CN" ...
##  $ maschi           : num  1092504 82848 179588 289459 105011 ...
##  $ femmine          : num  1167019 88063 189430 297639 109627 ...
##  $ totale           : num  2259523 170911 369018 587098 214638 ...
##  $ Male.percentage  : 'AsIs' num  0.483510.... 0.484743.... 0.486664.... 0.493033.... 0.489247.... ...
##  $ Female.percentage: 'AsIs' num  0.516489.... 0.515256.... 0.513335.... 0.506966.... 0.510752.... ...
##  $ geometry         :sfc_MULTIPOLYGON of length 107; first list element: List of 1
##   ..- attr(*, "class")= chr [1:3] "XY" "MULTIPOLYGON" "sfg"
##  - attr(*, "sf_column")= chr "geometry"
##  - attr(*, "agr")= Factor w/ 3 levels "constant","aggregate",..: NA NA NA NA NA NA NA NA NA NA ...
##   ..- attr(*, "names")= chr [1:12] "ripartizione" "regione" "code_ripartizione" "code_regione" ...
##  - attr(*, "unit")= chr "provincia"
##  - attr(*, "year")= chr "2018"
##  - attr(*, "colID")= chr "ID"

Note that, the matchWith argument indicates the type of names we have to link, for example code number, iso code, and so on

str(popEU)
## 'data.frame':    2252 obs. of  5 variables:
##  $ TIME  : num  2019 2019 2019 2019 2019 ...
##  $ GEO   : chr  "BE" "BE1" "BE10" "BE100" ...
##  $ total : num  11455519 1215290 1215290 1215290 6596233 ...
##  $ male  : num  5644826 597008 597008 597008 3265134 ...
##  $ female: num  5810693 618282 618282 618282 3331099 ...
eu <- EU(data = popEU, unit = "nuts0", colID = "GEO", 
         matchWith = "id", check.unit.names = FALSE)

Static maps

coord_eu <- loadCoordEU(unit = "nuts0")
mappingEU(data = coord_eu)

Now we look at the distribution of population among European countries

eu <- EU(data = popEU, unit = "nuts0", colID = "GEO", 
         matchWith = "iso2", check.unit.names = FALSE)
mappingEU(eu, var = "total")

matching with iso2 code.

It is equivalent to use mapping function without building as EU object

mappingEU(data = popEU,unit = "nuts0", colID = "GEO", matchWith = "iso2", var = "total")

The loadCoord functions, as explained in the previous section, automatically return all the bigger statistical unit aggregation starting from the input unit.

eu_aggr <- EU(data = popEU, unit = "nuts1", colID = "GEO", 
         matchWith = "id", check.unit.names = FALSE,
         aggregation_var = "total", aggregation_unit = "nuts0", aggregation_fun = sum)

mappingEU(eu_aggr, var = "total")

or the same as follows

eu <- EU(data = popEU, unit = "nuts1", colID = "GEO", 
         matchWith = "id", check.unit.names = FALSE)


mappingEU(eu, var = "total", aggregation_unit = "nuts0", aggregation_fun = sum)

We can also provide multiple variables to generate multiple maps

mappingEU(eu, var = c("male","female"))

or a subset

mappingEU(eu, var = "total", 
          subset = ~I(country == "Spain" | country == "Italy"))

Let’s look at USA example.

data("popUS")
us <- US(data = popUS, unit = "state", matchWith = "name")
mappingUS(us)

mappingUS(us, var = "population")


mappingUS(us, var = "population", add_text = "state_id",
          options = mapping.options(nclass = 10, legend.portrait = FALSE))

mappingUS(us, var = "population", options = mapping.options(nclass = 10, legend.portrait = FALSE))

The facets argument returns the small multiples.

mappingUS(us, var = "population", aggregation_unit = "division", facets = "division")

or facets for some geographical unit

mappingUS(us, var = "population", subset = ~I(region == "Northeast"), facets = "id")

In this case it maps all the divisions.

Interactive maps

The interactive map functions work as the static functions and they share the same argument.

eu <- EU(data = popEU, unit = "nuts0", colID = "GEO", 
         matchWith = "id", check.unit.names = FALSE)
mappingEU(eu, var = "total", type = "interactive")

mappingEU(eu, var = "total", type = "interactive",
                      subset = ~I(country == "Spain" | country == "Italy"))

or aggregating for countries (“nuts0”)

mappingEU(eu, var = "total", type = "interactive",
                      aggregation_unit = "nuts0")

Multiple variable will provide a single interactive map with different layers:

mappingEU(eu, var = c("male","female"), type = "interactive")

and also the facets is implemented for interactive maps.

A generic `mapping()` function

The package also provide a generic function to map data, mapping. This accept object of class sf, WR, EU, IT, and US.

library(dplyr)

data("popIT")
popIT <- popIT
coords <- loadCoordIT(unit = "provincia", year = '2019')
cr <- left_join(coords, popIT, by = c( "provincia" = "ID"))
mapping(cr)

mapping(cr, var = "maschi")

library(sf)
nc = st_read(system.file("shape/nc.shp", package="sf"))
## Reading layer `nc' from data source 
##   `C:\Users\srfal\Documents\R\win-library\4.1\sf\shape\nc.shp' 
##   using driver `ESRI Shapefile'
## Simple feature collection with 100 features and 14 fields
## Geometry type: MULTIPOLYGON
## Dimension:     XY
## Bounding box:  xmin: -84.32385 ymin: 33.88199 xmax: -75.45698 ymax: 36.58965
## Geodetic CRS:  NAD27
class(nc)
## [1] "sf"         "data.frame"
mapping(nc)

Layout options

Aesthetic options are controlled by mapping.options() function. General options can be retrieved

mapping.options()

single or multiple options may be retrieved

mapping.options("palette.cont")
## [1] "YlGnBu"
mapping.options("legend.position")
## [1] "right" "top"

and we can globally change until a new R session, as follows

mapping.options(legend.position = c("left","bottom"))
mapping.options("legend.position")
## [1] "right" "top"

Options can be changed locally in mapping functions:

map <- mappingEU(eu, var = "total")
map_options <- mappingEU(eu, var = "total", 
                         options = mapping.options(list(legend.position = c("left","bottom"),
                                                        title = "EU total population",
                                                        map.frame = FALSE,
                                                        col.style = "pretty")))

library(tmap)
tmap_arrange(map, map_options)

mapping.options.reset()

or globally outside the functions. Original options can be reseted using mapping.options.reset().

Some other plot example

mappingIT(data = it, var = "totale")

mappingIT(data = it, var = c("maschi", "femmine"))

mappingIT(data = it, var = c("maschi", "femmine"), options = mapping.options(palette.cont = c("Greens", "Blues"), legend.position = c("left","bottom")))


mappingIT(data = it, var = "ripartizione")

mappingIT(data = it, var = "totale", add_text = "code")

mappingIT(data = it, var = "totale", type = "interactive")
mappingIT(data = it, var = "totale", type = "interactive",
          options = mapping.options(legend.position = c("left","bottom")))

mappingIT(data = it, var = c("maschi", "femmine"), type = "interactive", 
          options = mapping.options(interactive.control.collapse = FALSE))
mappingIT(data = it, var = c("maschi", "femmine"), type = "interactive",
          options = mapping.options(palette.cont = c("BuGn", "BuPu")))


mappingIT(data = it, var = "totale", type = "interactive", 
          options = mapping.options(col.style = "pretty"))
mappingIT(data = it, var = "totale", type = "interactive", 
          options = mapping.options(col.style = "equal"))
mappingIT(data = it, var = "totale", type = "interactive", 
          options = mapping.options(col.style = "equal", nclass = 7))

mappingIT(data = it, var = "ripartizione", type = "interactive")
mappingIT(data = it, var = "totale", type = "interactive", facets = "ripartizione")

Some other aggregation and facets examples:

data(popIT)
it <- IT(data = popIT, unit = "provincia", year = "2018",colID = "ID")


mappingIT(data = it, var = "totale", aggregation_unit = "regione")

mappingIT(data = it, var = "totale", 
          aggregation_unit = "ripartizione", 
          aggregation_fun = function(x) sum(x, na.rm = TRUE))



mappingIT(data = it, var = "totale", 
          aggregation_unit = "regione", facets = "ripartizione")

mappingIT(data = it, var = "totale", 
          aggregation_unit = "ripartizione", 
          aggregation_fun = function(x) sum(x, na.rm = TRUE),
          type = "interactive")


mappingIT(data = it, var = "totale", 
          aggregation_unit = "regione", facets = "regione")

data("popEU")
popEU <- popEU
euNuts2 <- EU(data = popEU, colID = "GEO",unit = "nuts2",matchWith = "id")
str(euNuts2,1)
## Classes 'sf', 'EU', 'EU' and 'data.frame':   328 obs. of  16 variables:
##  $ GEO         : chr  "cz05" "cz06" "cz07" "cz08" ...
##  $ country     : chr  "Czechia" "Czechia" "Czechia" "Czechia" ...
##  $ iso2        : chr  "CZ" "CZ" "CZ" "CZ" ...
##  $ iso3        : chr  "CZE" "CZE" "CZE" "CZE" ...
##  $ country_code: int  203 203 203 203 276 276 276 276 276 276 ...
##  $ nuts0_id    : chr  "CZ" "CZ" "CZ" "CZ" ...
##  $ nuts1_id    : chr  "CZ0" "CZ0" "CZ0" "CZ0" ...
##  $ nuts2_id    : chr  "CZ05" "CZ06" "CZ07" "CZ08" ...
##  $ nuts0       : chr  "Czechia" "Czechia" "Czechia" "Czechia" ...
##  $ nuts1       : chr  "Cesko" "Cesko" "Cesko" "Cesko" ...
##  $ nuts2       : chr  "Severovýchod" "Jihovýchod" "Strední Morava" "Moravskoslezsko" ...
##  $ TIME        : num  2019 2019 2019 2019 2019 ...
##  $ total       : num  1513693 1696941 1215413 1203299 4143418 ...
##  $ male        : num  747330 835577 595503 590516 2067094 ...
##  $ female      : num  766363 861364 619910 612783 2076324 ...
##  $ geometry    :sfc_MULTIPOLYGON of length 328; first list element: List of 1
##   ..- attr(*, "class")= chr [1:3] "XY" "MULTIPOLYGON" "sfg"
##  - attr(*, "sf_column")= chr "geometry"
##  - attr(*, "agr")= Factor w/ 3 levels "constant","aggregate",..: NA NA NA NA NA NA NA NA NA NA ...
##   ..- attr(*, "names")= chr [1:15] "GEO" "country" "iso2" "iso3" ...
##  - attr(*, "unit")= chr "nuts2"
##  - attr(*, "year")= chr "2021"
##  - attr(*, "colID")= chr "GEO"

mappingEU(data = euNuts2, var = "total", 
          aggregation_unit = "nuts0", aggregation_fun = mean)

mappingEU(data = euNuts2, var = "total", 
          aggregation_unit = "nuts0", aggregation_fun = sum)

mappingEU(data = euNuts2, var = "total", subset = ~I(iso3 == "FRA"),
          aggregation_unit = "nuts1", aggregation_fun = sum, facets = "nuts1")

data("popWR")
popWR <- popWR
wr <- WR(data = popWR, colID = "country_code",
         matchWith = "iso3_eh", check.unit.names = FALSE,
         res = "low")

mappingWR(data = wr, var = "total", 
          aggregation_unit = "continent", 
          aggregation_fun = function(x) sum(x, na.rm = TRUE))


mappingWR(data = wr, var = "total", 
          aggregation_unit = "continent", 
          aggregation_fun = function(x) sum(x, na.rm = TRUE),
          type = "interactive")