The Mangal project aims at archiving published ecological networks and at easing their retrieval. To do so, Mangal:
uses a data specification for ecological networks (described in Poisot et al. 2016);
archives ecological networks in a PostgreSQL database;
provides:
Currently, 172 datasets are included in the database, representing over 1300 ecological networks. In 2016, the first paper describing the project was published and introduced the first release of rmangal (Poisot et al. 2016). Since then, the structure of the database has been improved (new fields have been added), several ecological networks have been added, and the API has been entirely rewritten. Consequently, the first release of rmangal is obsolete (and archived), and we introduce rmangal v2.0 in this vignette.
The diagram on the left side represents the structure of the Mangal database. Each reference included in Mangal corresponds to a specific publication that includes one or several datasets. A dataset is a collection of ecological networks whose nodes and interactions (edges) are stored in separate tables. Below, we briefly describe the content of each table.
References – Information pertaining to a reference (scientific article, book, website, etc.) characterizing an original collection of ecological networks. URLs of data and publication sources are included as well as persistent identifiers (when available) such as digital object identifiers (DOIs). This allows the user to retrieve more details about original publications using appropriate R packages such as rcrossref.
Datasets – Metadata of the datasets attached to a reference. It includes a general description of the networks.
Networks – Metadata of the networks attached to a dataset. It provides the sampling location, date and specific description of the network.
Nodes – Information on the population, taxon or individual in the network. Each node has the original taxon name documented and a taxonomic backbone provided by all services embedded in taxize (Chamberlain et al. 2019).
Interactions – Information on the interaction type (e.g. mutualism, predation, etc.), the strength, and the direction of the interaction between two nodes.
So far, the rmangal package provides methods to access the data store. Data requests (performed via httr::GET()) do not require any authentication.
A bearer authentication strategy using ORCID credentials (as a third-party service) has been implemented on all POST, DELETE, and PUT API operations to allow the user to add and delete ecological networks in the database. These features are not currently included in the rmangal package, but are under consideration for future major releases.
In order to efficiently retrieve networks from the database, rmangal includes 7 search functions querying the 5 tables described above as well as a table dedicated to the taxonomy backbone.
search_references(): search in the reference table; for instance, the user can look for a specific DOI;
search_datasets(): search among datasets using a keyword;
search_networks() and search_networks_sf(): search networks based on a keyword or a geographical area;
search_interactions(): list all networks containing a specific interaction type;
search_nodes(): identify nodes based on node information;
search_taxonomy(): identify nodes based on taxonomic names and unique identifiers.
All of these functions return objects of specific classes with the information needed to retrieve the corresponding set of ecological networks with get_collection(). Hence, the user can easily retrieve data in two steps:
networks <- search_*() %>% get_collection()
Note that if there is only one network to be retrieved, get_collection() returns a mgNetwork object; otherwise it returns an object of class mgNetworksCollection, which is a collection (a list) of mgNetwork objects. Below, we exemplify how to use the search functions, how to get a collection of networks and how to use other packages to carry out specific analyses.
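Because the class of the returned object depends on the number of networks, downstream code may have to handle both cases. Below is a minimal sketch, assuming only the two class names mentioned above; as_network_list() is a hypothetical helper that is not part of rmangal, and the mock objects merely stand in for real query results.

```r
# Hypothetical helper (not part of rmangal): always return a plain list of
# mgNetwork objects, whatever get_collection() returned.
as_network_list <- function(x) {
  if (inherits(x, "mgNetwork")) {
    list(x)
  } else if (inherits(x, "mgNetworksCollection")) {
    unclass(x)
  } else {
    stop("expected a mgNetwork or a mgNetworksCollection object")
  }
}

# Mock objects standing in for real query results (no API call is made).
one_net <- structure(
  list(network = data.frame(network_id = 86L)),
  class = "mgNetwork"
)
two_nets <- structure(list(one_net, one_net), class = "mgNetworksCollection")

length(as_network_list(one_net))
length(as_network_list(two_nets))
```

With such a helper, the same lapply() call can be applied to the result of any search, whether it matched one network or several.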
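In rmangal, every function queries a specific table and allows only one query at a time (see section Batch analysis to learn how to perform more than one query). All the functions offer two ways to query the corresponding table: a keyword search, where a character string is passed as the query, and a custom query, where a named list identifies the field(s) to match. Both are illustrated below.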
Let’s load rmangal as well as two helper packages:
library(rmangal)
library(magrittr) # for the pipe %>%
library(tibble) # to use tibble (enhanced data frames)
Let’s assume we are looking for ecological networks including species living in lagoons. If we have no idea about any existing dataset, the best starting point is to query the dataset table with lagoon as a keyword:
lagoon <- search_datasets(query = "lagoon")
class(lagoon)
#> [1] "tbl_df" "tbl" "data.frame" "mgSearchDatasets"
lagoon
#> # A tibble: 2 × 10
#> id name description public created_at updated_at ref_id user_id references networks
#> <int> <chr> <chr> <lgl> <chr> <chr> <int> <int> <list> <list>
#> 1 22 zetina_2003 Dietary matrix of the Huizac… TRUE 2019-02-23T17:04… 2019-02-23T17:04… 22 3 <df [1 × 1… <df [1 × …
#> 2 52 yanez_1978 Food web of the Guerrero lag… TRUE 2019-02-24T23:42… 2019-02-24T23:42… 53 3 <df [1 × 1… <df [2 × …
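The networks column is a list-column holding one data frame per dataset. As a minimal sketch of how to summarize it, the example below counts the networks attached to each dataset with base R. To keep it runnable offline, it uses a mock data frame mimicking the result above; the nested column name id is an assumption.

```r
# Mock object mimicking the structure of `lagoon` (no API call is made).
# Identifiers come from the output shown in this vignette; the nested
# column name `id` is assumed for illustration.
lagoon_mock <- data.frame(
  id = c(22L, 52L),
  name = c("zetina_2003", "yanez_1978"),
  stringsAsFactors = FALSE
)
lagoon_mock$networks <- list(
  data.frame(id = 86L),
  data.frame(id = c(926L, 927L))
)

# Number of networks attached to each dataset
n_networks <- vapply(lagoon_mock$networks, nrow, integer(1))
setNames(n_networks, lagoon_mock$name)
```

With a real search result, the same idea would read vapply(lagoon$networks, nrow, integer(1)).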
If the Mangal reference id containing the lagoon networks were known, we could build a custom query as follows:
lagoon_zetina <- search_datasets(list(ref_id = 22))
lagoon_zetina
#> # A tibble: 1 × 10
#> id name description public created_at updated_at ref_id user_id references networks
#> <int> <chr> <chr> <lgl> <chr> <chr> <int> <int> <list> <list>
#> 1 22 zetina_2003 Dietary matrix of the Huizac… TRUE 2019-02-23T17:04… 2019-02-23T17:04… 22 3 <df [1 × 1… <df [1 × …
Note that if an empty character string is passed, i.e. "", all entries are returned. We can use this behavior to list all datasets available:
all_datasets <- search_datasets("", verbose = FALSE)
glimpse(all_datasets)
#> Rows: 175
#> Columns: 10
#> $ id <int> 2, 7, 9, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38,…
#> $ name <chr> "howking_1968", "lundgren_olesen_2005", "elberling_olesen_1999", "johnston_1956", "havens_1992", "kemp_1977"…
#> $ description <chr> "Insect activity recorded on flower at Lake Hazen, Ellesmere Island, N.W.T., Canada", "Pollnator activity re…
#> $ public <lgl> TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, …
#> $ created_at <chr> "2019-02-22T15:39:00.427Z", "2019-02-22T20:04:25.322Z", "2019-02-22T20:09:17.994Z", "2019-02-22T21:10:45.269…
#> $ updated_at <chr> "2019-02-22T15:39:00.427Z", "2019-02-22T20:04:25.322Z", "2019-02-22T20:09:17.994Z", "2019-02-22T21:10:45.269…
#> $ ref_id <int> 2, 7, 9, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 38, 39, 40,…
#> $ user_id <int> 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, …
#> $ references <list> [<data.frame[1 x 11]>], [<data.frame[1 x 11]>], [<data.frame[1 x 11]>], [<data.frame[1 x 11]>], [<data.fram…
#> $ networks <list> [<data.frame[1 x 13]>], [<data.frame[1 x 13]>], [<data.frame[1 x 13]>], [<data.frame[1 x 13]>], [<data.fram…
As shown in the diagram above, a dataset comes from a specific reference, and search_references() queries the reference table directly. A handy argument of this function is doi, as it allows the user to pass a Digital Object Identifier and so retrieve all datasets attached to a specific publication.
zetina_2003 <- search_references(doi = "10.1016/s0272-7714(02)00410-9")
We can also search by keyword across all networks.
insect_coll <- search_networks(query = "insect%")
glimpse(insect_coll)
#> Rows: 14
#> Error: Input must be a vector, not a <data.frame/mgSearchNetworks> object.
It is also possible to retrieve all networks based on interaction types involved:
# List all interaction types available
avail_type()
#> [1] "competition" "amensalism" "neutralism" "commensalism" "mutualism" "parasitism" "predation" "herbivory"
#> [9] "symbiosis" "scavenger" "detritivore" "unspecified"
comp_interac <- search_interactions(type = "competition")
# Number of competition interactions in mangal
nrow(comp_interac)
#> [1] 12
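Since search_interactions() expects one of the types listed by avail_type(), it can be convenient to validate user input before sending a request. Below is a minimal sketch, assuming the twelve types printed above (hard-coded here so that the example runs offline); check_type() is a hypothetical helper, not an rmangal function.

```r
# Interaction types as printed by avail_type() above (hard-coded copy)
known_types <- c(
  "competition", "amensalism", "neutralism", "commensalism",
  "mutualism", "parasitism", "predation", "herbivory",
  "symbiosis", "scavenger", "detritivore", "unspecified"
)

# Hypothetical helper: returns the matched type or stops with an error.
check_type <- function(type, choices = known_types) {
  # match.arg() accepts unambiguous partial matches and stops otherwise
  match.arg(type, choices)
}

check_type("predation")
check_type("herb") # partial match, resolved to "herbivory"
```

The validated value could then be passed on, e.g. search_interactions(type = check_type("herb")).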
search_networks_sf() handles spatial queries: its argument query_sf takes an sf object as input, and the function returns all networks included in the spatial extent of this object. For instance, one can retrieve all Californian networks included in Mangal like so:
library(sf)
library(mapview)
library(USAboundaries)
area <- us_states(state = "california")
in_CA <- search_networks_sf(area, verbose = FALSE)
mapView(st_geometry(area), color = "red", legend = FALSE, col.regions = "#FF000033") + mapView(in_CA, legend = FALSE)
#> Error in path.expand(path): invalid 'path' argument
The user can easily identify networks including a specific taxonomic entity with search_taxonomy():
sr_ficus <- search_taxonomy("Ficus")
This function allows the user to search for a specific taxonomic entity using its validated name or unique identifiers, i.e. EOL, TSN, GBIF, COL, BOLD and NCBI IDs. Taxon names of the taxonomy table were validated with TNRS (see https://tnrs.biendata.org/) and/or GNR (see https://resolver.globalnames.org/). The taxon names in this table might not be the taxon names documented in the original publication. In order to identify relevant networks with the original name, use search_nodes().
The validation of taxon names was performed by an automated procedure using taxize (Chamberlain et al. 2019) and if there is any doubt, the original names recorded by authors should be regarded as the most reliable information. Please report any issue related to taxonomy at https://github.com/mangal-interactions/contribute/issues/new/choose.
glimpse(search_taxonomy(tsn = 28749))
#> Rows: 1
#> Error: Input must be a vector, not a <data.frame/mgSearchTaxonomy> object.
glimpse(search_taxonomy(eol = 583069))
#> Rows: 1
#> Error: Input must be a vector, not a <data.frame/mgSearchTaxonomy> object.
Note that in some cases, one may need to find a dataset based on the original name included in the publication; in such cases, search_nodes() must be used:
sr_ficus2 <- search_nodes("Ficus")
## Get networks associated with a search_* object
Once the search is performed, ecological networks are accessible from the object returned with get_collection():
nets_lagoons <- lagoon %>% get_collection
nets_in_CA <- in_CA %>% get_collection
nets_competition <- comp_interac %>% get_collection
nets_lagoons
#> A collection of 3 networks
#>
#> * Network #86 included in dataset #22
#> * Description: Dietary matrix of the Huizache–Caimanero lagoon
#> * Includes 189 edges and 26 nodes
#> * Current taxonomic IDs coverage for nodes of this network:
#> --> ITIS: 81%, BOLD: 81%, EOL: 85%, COL: 81%, GBIF: 0%, NCBI: 85%
#> * Published in ref # DOI:10.1016/s0272-7714(02)00410-9
#>
#> * Network #927 included in dataset #52
#> * Description: Food web of the Brackish lagoon
#> * Includes 27 edges and 11 nodes
#> * Current taxonomic IDs coverage for nodes of this network:
#> --> ITIS: 45%, BOLD: 45%, EOL: 45%, COL: 45%, GBIF: 18%, NCBI: 45%
#> * Published in ref # DOI:NA
#>
#> * Network #926 included in dataset #52
#> * Description: Food web of the Costal lagoon
#> * Includes 34 edges and 13 nodes
#> * Current taxonomic IDs coverage for nodes of this network:
#> --> ITIS: 54%, BOLD: 54%, EOL: 54%, COL: 54%, GBIF: 15%, NCBI: 54%
#> * Published in ref # DOI:NA
class(nets_lagoons)
#> [1] "mgNetworksCollection"
Note that mgNetworksCollection objects are lists of mgNetwork objects, each of which is a list of five data frames reflecting the 5 tables presented in the diagram in the first section:
names(nets_lagoons[[1]])
#> [1] "network" "nodes" "interactions" "dataset" "reference"
glimpse(nets_lagoons[[1]]$network)
#> Rows: 1
#> Columns: 13
#> $ network_id <int> 86
#> $ name <chr> "zetina_2003_20030101_86"
#> $ date <chr> "2003-01-01T00:00:00.000Z"
#> $ description <chr> "Dietary matrix of the Huizache–Caimanero lagoon"
#> $ public <lgl> TRUE
#> $ all_interactions <lgl> FALSE
#> $ created_at <chr> "2019-02-23T17:04:34.046Z"
#> $ updated_at <chr> "2019-02-23T17:04:34.046Z"
#> $ dataset_id <int> 22
#> $ user_id <int> 3
#> $ geom_type <chr> "Point"
#> $ geom_lon <list> -106.1099
#> $ geom_lat <list> 22.98531
glimpse(nets_lagoons[[1]]$nodes)
#> Rows: 26
#> Columns: 19
#> $ node_id <int> 4904, 4905, 4906, 4907, 4908, 4909, 4910, 4911, 4912, 4913, 4914, 4915, 4916, 4917, 4918, 4919, 4920…
#> $ original_name <chr> "Scianids", "Elopids", "Lutjanids", "Carangids", "Centropomids", "Ariids", "Haemulids", "Pleuronecto…
#> $ node_level <chr> "taxon", "taxon", "taxon", "taxon", "taxon", "taxon", "taxon", "taxon", "taxon", "taxon", "taxon", "…
#> $ network_id <int> 86, 86, 86, 86, 86, 86, 86, 86, 86, 86, 86, 86, 86, 86, 86, 86, 86, 86, 86, 86, 86, 86, 86, 86, 86, …
#> $ taxonomy_id <int> 4363, 4364, 4365, 4366, 4367, 4368, 4369, 4355, 3823, 4370, 4371, 4372, 4373, 4374, 4279, 4375, 4376…
#> $ created_at <chr> "2019-02-23T17:04:42.505Z", "2019-02-23T17:04:42.571Z", "2019-02-23T17:04:42.622Z", "2019-02-23T17:0…
#> $ updated_at <chr> "2019-02-23T17:04:42.505Z", "2019-02-23T17:04:42.571Z", "2019-02-23T17:04:42.622Z", "2019-02-23T17:0…
#> $ taxonomy.id <int> 4363, 4364, 4365, 4366, 4367, 4368, 4369, 4355, 3823, 4370, 4371, 4372, 4373, 4374, 4279, 4375, 4376…
#> $ taxonomy.name <chr> "Sciaenidae", "Elops", "Lutjanidae", "Carangidae", "Centropomidae", "Ariidae", "Haemulidae", "Pleuro…
#> $ taxonomy.ncbi <int> 30870, 7927, 30850, 8157, 8184, 31017, 30840, 8256, 6762, 94935, 55118, 274463, 8079, 8219, 8189, 66…
#> $ taxonomy.tsn <int> 169237, 28630, 168845, 168584, 167642, 43998, 169055, 172859, 13951, 165546, NA, 169013, 165876, 171…
#> $ taxonomy.eol <int> 5211, 46561210, 5294, 5361, 5355, 5115, 5317, 5173, 46508442, 8246, 6893, 5321, 5517, 46575119, 5287…
#> $ taxonomy.bold <int> 1856, 4061, 1858, 1851, 586, 1313, 1855, 1126, 4985, 1326, 1252, 422, 1259, NA, 1863, 1504, 28100, 7…
#> $ taxonomy.gbif <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
#> $ taxonomy.col <chr> "81a86c329909d507edb5c296906ef3f4", "94532a14786adeb25bcec244a53aadc1", "7150078b7dd31a5f7575240f1b7…
#> $ taxonomy.rank <chr> "family", "genus", "family", "family", "family", "family", "family", "family", "genus", "family", "f…
#> $ taxonomy.created_at <chr> "2019-02-23T17:04:35.620Z", "2019-02-23T17:04:35.744Z", "2019-02-23T17:04:35.870Z", "2019-02-23T17:0…
#> $ taxonomy.updated_at <chr> "2019-06-14T15:25:46.438Z", "2019-06-14T15:25:46.492Z", "2019-06-14T15:25:46.546Z", "2019-06-14T15:2…
#> $ taxonomy <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
glimpse(nets_lagoons[[1]]$interactions)
#> Rows: 189
#> Columns: 20
#> $ interaction_id <int> 48376, 48377, 48378, 48379, 48380, 48381, 48382, 48383, 48384, 48385, 48388, 48389, 48390, 48391, …
#> $ node_from <int> 4912, 4912, 4912, 4912, 4912, 4912, 4912, 4912, 4912, 4912, 4913, 4913, 4913, 4913, 4913, 4913, 49…
#> $ node_to <int> 4912, 4914, 4915, 4918, 4919, 4920, 4921, 4922, 4925, 4926, 4914, 4916, 4917, 4919, 4920, 4922, 49…
#> $ date <chr> "2003-01-01T00:00:00.000Z", "2003-01-01T00:00:00.000Z", "2003-01-01T00:00:00.000Z", "2003-01-01T00…
#> $ direction <chr> "directed", "directed", "directed", "directed", "directed", "directed", "directed", "directed", "d…
#> $ type <chr> "predation", "predation", "predation", "predation", "predation", "predation", "predation", "predat…
#> $ method <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
#> $ attr_id <int> 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12…
#> $ value <dbl> 0.026, 0.025, 0.003, 0.009, 0.009, 0.016, 0.284, 0.231, 0.079, 0.090, 0.100, 0.002, 0.004, 0.006, …
#> $ geom <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
#> $ public <lgl> TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TR…
#> $ network_id <int> 86, 86, 86, 86, 86, 86, 86, 86, 86, 86, 86, 86, 86, 86, 86, 86, 86, 86, 86, 86, 86, 86, 86, 86, 86…
#> $ created_at <chr> "2019-02-23T17:05:45.061Z", "2019-02-23T17:05:45.131Z", "2019-02-23T17:05:45.193Z", "2019-02-23T17…
#> $ updated_at <chr> "2019-02-23T17:05:45.061Z", "2019-02-23T17:05:45.131Z", "2019-02-23T17:05:45.193Z", "2019-02-23T17…
#> $ attribute.id <int> 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12…
#> $ attribute.name <chr> "dietary matrix", "dietary matrix", "dietary matrix", "dietary matrix", "dietary matrix", "dietary…
#> $ attribute.description <chr> "Proportions of the consumer diets made up by the prey.", "Proportions of the consumer diets made …
#> $ attribute.unit <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
#> $ attribute.created_at <chr> "2019-02-23T17:04:25.350Z", "2019-02-23T17:04:25.350Z", "2019-02-23T17:04:25.350Z", "2019-02-23T17…
#> $ attribute.updated_at <chr> "2019-02-23T17:04:25.350Z", "2019-02-23T17:04:25.350Z", "2019-02-23T17:04:25.350Z", "2019-02-23T17…
glimpse(nets_lagoons[[1]]$dataset)
#> Rows: 1
#> Columns: 8
#> $ dataset_id <int> 22
#> $ name <chr> "zetina_2003"
#> $ description <chr> "Dietary matrix of the Huizache–Caimanero lagoon"
#> $ public <lgl> TRUE
#> $ created_at <chr> "2019-02-23T17:04:32.017Z"
#> $ updated_at <chr> "2019-02-23T17:04:32.017Z"
#> $ ref_id <int> 22
#> $ user_id <int> 3
glimpse(nets_lagoons[[1]]$reference)
#> Rows: 1
#> Columns: 11
#> $ ref_id <int> 22
#> $ doi <chr> "10.1016/s0272-7714(02)00410-9"
#> $ first_author <chr> "manuel j. zetina-rejon"
#> $ year <chr> "2003"
#> $ jstor <lgl> NA
#> $ pmid <lgl> NA
#> $ bibtex <chr> "@article{Zetina_Rej_n_2003, doi = {10.1016/s0272-7714(02)00410-9}, url = {https://doi.org/10.1016%2Fs0272-…
#> $ paper_url <chr> "https://doi.org/10.1016%2Fs0272-7714%2802%2900410-9"
#> $ data_url <chr> "https://globalwebdb.com/"
#> $ created_at <chr> "2019-02-23T17:04:28.307Z"
#> $ updated_at <chr> "2019-02-23T17:04:28.307Z"
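The nodes and interactions tables are linked through node identifiers: node_from and node_to in the interactions table refer to node_id in the nodes table. As a minimal sketch of how to exploit this, the hypothetical helper below (label_edges(), not part of rmangal) replaces identifiers with the original names using match(). The data frames are small mock objects (identifiers and names are taken from the output above, the edges themselves are made up) so that the example runs without querying the API.

```r
# Mock tables mimicking nets_lagoons[[1]]$nodes and $interactions.
# Only the columns needed here are included; the edges are illustrative.
nodes <- data.frame(
  node_id = c(4904L, 4912L, 4913L),
  original_name = c("Scianids", "Callinectes", "Belonoids"),
  stringsAsFactors = FALSE
)
interactions <- data.frame(
  node_from = c(4912L, 4913L, 4912L),
  node_to = c(4912L, 4904L, 4913L),
  value = c(0.026, 0.1, 0.05)
)

# Hypothetical helper: edge list labelled with the original node names
label_edges <- function(nodes, interactions) {
  data.frame(
    from = nodes$original_name[match(interactions$node_from, nodes$node_id)],
    to = nodes$original_name[match(interactions$node_to, nodes$node_id)],
    value = interactions$value,
    stringsAsFactors = FALSE
  )
}

edges <- label_edges(nodes, interactions)
edges
```

Applied to a real object, the call would be label_edges(nets_lagoons[[1]]$nodes, nets_lagoons[[1]]$interactions).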
So far, the search functions of rmangal allow the user to perform only a single search at a time. The simplest way to do more than one search is to loop over a vector or a list of queries. Below we exemplify how to do so using lapply():
tsn <- c(837855, 169237)
mgn <- lapply(tsn, function(x) search_taxonomy(tsn = x)) %>%
  lapply(get_collection) %>%
  combine_mgNetworks
mgn
#> A collection of 3 networks
#>
#> * Network #948 included in dataset #66
#> * Description: Flower and anthophilous insect interactions in the primary cool-temperate subalpine forests and meadows at Mt. Kushigata, Yamanashi Prefecture, Japan
#> * Includes 871 edges and 456 nodes
#> * Current taxonomic IDs coverage for nodes of this network:
#> --> ITIS: 20%, BOLD: 33%, EOL: 46%, COL: 43%, GBIF: 35%, NCBI: 38%
#> * Published in ref # DOI:NA
#>
#> * Network #86 included in dataset #22
#> * Description: Dietary matrix of the Huizache–Caimanero lagoon
#> * Includes 189 edges and 26 nodes
#> * Current taxonomic IDs coverage for nodes of this network:
#> --> ITIS: 81%, BOLD: 81%, EOL: 85%, COL: 81%, GBIF: 0%, NCBI: 85%
#> * Published in ref # DOI:10.1016/s0272-7714(02)00410-9
#>
#> * Network #1101 included in dataset #77
#> * Description: Food web of the Angolan fishery landings
#> * Includes 127 edges and 28 nodes
#> * Current taxonomic IDs coverage for nodes of this network:
#> --> ITIS: 61%, BOLD: 50%, EOL: 61%, COL: 54%, GBIF: 4%, NCBI: 57%
#> * Published in ref # DOI:10.3989/scimar.2011.75n2309
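When looping over many queries, a single failing request would abort the whole lapply() call. A minimal sketch of a more defensive loop is given below; batch_search() is a hypothetical helper (not part of rmangal), and mock_search() merely stands in for a search function such as search_taxonomy() so that the example runs offline.

```r
# Hypothetical helper: run one search per query, drop the queries that fail.
batch_search <- function(queries, search_fun) {
  results <- lapply(queries, function(q) {
    # Return NULL instead of stopping when a single query errors
    tryCatch(search_fun(q), error = function(e) NULL)
  })
  names(results) <- as.character(queries)
  Filter(Negate(is.null), results)
}

# Mock search function (assumption): fails on negative identifiers.
mock_search <- function(tsn) {
  if (tsn < 0) stop("invalid identifier")
  data.frame(tsn = tsn)
}

res <- batch_search(c(837855, -1, 169237), mock_search)
names(res)
```

With rmangal, one could call batch_search(tsn, function(x) search_taxonomy(tsn = x)) and pass the remaining results on to get_collection() as above.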
## Geolocalize networks with sf
The function get_collection() has an argument as_sf that converts the network metadata of mgNetwork objects to sf objects, which requires sf to be installed. This allows the user to easily geolocalize the networks retrieved from Mangal.
# assuming sf and mapview are loaded (as we did above)
mg_lag_sf <- search_datasets(query = 'lagoon') %>% get_collection(as_sf = TRUE)
class(mg_lag_sf[[1]]$network)
#> [1] "sf" "data.frame"
# let's combine all these sf objects into a single map
mapView(mg_lag_sf[[1]]$network) + mapView(mg_lag_sf[[2]]$network)
## Taxonomic analysis with taxize
As Mangal includes taxonomic identifiers, rmangal can readily be combined with taxize (see the taxize documentation for more details about this package):
library(taxize)
tsn_acer <- search_taxonomy("Acer")$taxonomy.tsn
classification(tsn_acer, db = "itis")
#> $`28749`
#> name rank id
#> 1 Plantae kingdom 202422
#> 2 Viridiplantae subkingdom 954898
#> 3 Streptophyta infrakingdom 846494
#> 4 Embryophyta superdivision 954900
#> 5 Tracheophyta division 846496
#> 6 Spermatophytina subdivision 846504
#> 7 Magnoliopsida class 18063
#> 8 Rosanae superorder 846548
#> 9 Sapindales order 28643
#> 10 Sapindaceae family 28657
#> 11 Acer genus 28727
#> 12 Acer negundo species 28749
#>
#> $`28757`
#> name rank id
#> 1 Plantae kingdom 202422
#> 2 Viridiplantae subkingdom 954898
#> 3 Streptophyta infrakingdom 846494
#> 4 Embryophyta superdivision 954900
#> 5 Tracheophyta division 846496
#> 6 Spermatophytina subdivision 846504
#> 7 Magnoliopsida class 18063
#> 8 Rosanae superorder 846548
#> 9 Sapindales order 28643
#> 10 Sapindaceae family 28657
#> 11 Acer genus 28727
#> 12 Acer saccharinum species 28757
#>
#> $<NA>
#> [1] NA
#>
#> $<NA>
#> [1] NA
#>
#> $`837855`
#> name rank id
#> 1 Plantae kingdom 202422
#> 2 Viridiplantae subkingdom 954898
#> 3 Streptophyta infrakingdom 846494
#> 4 Embryophyta superdivision 954900
#> 5 Tracheophyta division 846496
#> 6 Spermatophytina subdivision 846504
#> 7 Magnoliopsida class 18063
#> 8 Rosanae superorder 846548
#> 9 Sapindales order 28643
#> 10 Sapindaceae family 28657
#> 11 Acer genus 28727
#> 12 Acer japonicum species 837855
#>
#> attr(,"class")
#> [1] "classification"
#> attr(,"db")
#> [1] "itis"
## Network analysis with igraph
Once the data are retrieved and a mgNetwork or a mgNetworksCollection object is obtained, it is straightforward to convert it to an igraph object (see the dedicated website) and then to carry out network analysis:
library(igraph)
mg_lagoons <- search_datasets(query = 'lagoon') %>% get_collection
# NB the line below returns a list of igraph objects
ig_lagoons <- as.igraph(mg_lagoons)
## Modularity analysis for the first network
modularity(ig_lagoons[[1]], membership(cluster_walktrap(ig_lagoons[[1]])))
#> [1] 0.04824893
## Degree values for all networks
lapply(ig_lagoons, degree)
#> [[1]]
#> 4904 4905 4906 4907 4908 4909 4910 4911 4912 4913 4914 4915 4916 4917 4918 4919 4920 4921 4922 4924 4925 4926 4927 4923 4929 4928
#> 17 11 14 13 18 20 14 10 18 14 12 15 7 15 14 12 14 11 26 7 22 15 21 16 5 17
#>
#> [[2]]
#> 6459 6460 6461 6463 6464 6465 6458 6462 6466 6456 6457
#> 4 7 9 3 3 7 4 6 3 4 4
#>
#> [[3]]
#> 6445 6447 6448 6449 6450 6452 6453 6454 6446 6451 6455 6443 6444
#> 6 4 5 5 11 2 5 8 3 5 4 5 5
## Network manipulation and visualization with tidygraph and ggraph
The package tidygraph treats networks as two tidy tables (one for the edges and one for the nodes) that can be modified using the grammar of data manipulation developed in the tidyverse. Moreover, tidygraph wraps most of the igraph functions so that the user can call a wide variety of algorithms to analyze networks. Fortunately, objects of class mgNetwork can readily be converted into tbl_graph objects, which allows the user to benefit from all the tools included in tidygraph:
library(tidygraph)
# NB the line below would not work with a mgNetworksCollection (use lapply)
tg_lagoons <- as_tbl_graph(mg_lagoons[[1]]) %>%
  mutate(centrality_dg = centrality_degree(mode = 'in'))
tg_lagoons %E>% as_tibble
#> # A tibble: 189 × 19
#> from to interaction_id date direction type method attr_id value public network_id created_at updated_at attribute.id
#> <int> <int> <int> <chr> <chr> <chr> <lgl> <int> <dbl> <lgl> <int> <chr> <chr> <int>
#> 1 9 9 48376 2003-0… directed preda… NA 12 0.026 TRUE 86 2019-02-23… 2019-02-23… 12
#> 2 9 11 48377 2003-0… directed preda… NA 12 0.025 TRUE 86 2019-02-23… 2019-02-23… 12
#> 3 9 12 48378 2003-0… directed preda… NA 12 0.003 TRUE 86 2019-02-23… 2019-02-23… 12
#> 4 9 15 48379 2003-0… directed preda… NA 12 0.009 TRUE 86 2019-02-23… 2019-02-23… 12
#> 5 9 16 48380 2003-0… directed preda… NA 12 0.009 TRUE 86 2019-02-23… 2019-02-23… 12
#> 6 9 17 48381 2003-0… directed preda… NA 12 0.016 TRUE 86 2019-02-23… 2019-02-23… 12
#> 7 9 18 48382 2003-0… directed preda… NA 12 0.284 TRUE 86 2019-02-23… 2019-02-23… 12
#> 8 9 19 48383 2003-0… directed preda… NA 12 0.231 TRUE 86 2019-02-23… 2019-02-23… 12
#> 9 9 21 48384 2003-0… directed preda… NA 12 0.079 TRUE 86 2019-02-23… 2019-02-23… 12
#> 10 9 22 48385 2003-0… directed preda… NA 12 0.09 TRUE 86 2019-02-23… 2019-02-23… 12
#> # … with 179 more rows, and 5 more variables: attribute.name <chr>, attribute.description <chr>, attribute.unit <lgl>,
#> # attribute.created_at <chr>, attribute.updated_at <chr>
tg_lagoons %N>% as_tibble %>%
  select(original_name, taxonomy.tsn, centrality_dg)
#> # A tibble: 26 × 3
#> original_name taxonomy.tsn centrality_dg
#> <chr> <int> <dbl>
#> 1 Scianids 169237 1
#> 2 Elopids 28630 0
#> 3 Lutjanids 168845 1
#> 4 Carangids 168584 2
#> 5 Centropomids 167642 2
#> 6 Ariids 43998 1
#> 7 Haemulids 169055 4
#> 8 Pleuronectoids 172859 3
#> 9 Callinectes 13951 6
#> 10 Belonoids 165546 4
#> # … with 16 more rows
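To make explicit what an in-degree centrality such as centrality_degree(mode = 'in') measures, here is a minimal base R sketch that counts, for each node, the number of edges pointing to it. in_degree() is a hypothetical helper and the edge list is mock data; this is only an illustration of the quantity, not how tidygraph computes it.

```r
# Hypothetical helper: number of incoming edges per node.
in_degree <- function(node_ids, node_to) {
  # factor() with explicit levels keeps nodes without incoming edges (count 0)
  counts <- table(factor(node_to, levels = node_ids))
  setNames(as.integer(counts), node_ids)
}

# Mock edge targets for a graph with four nodes
nodes_mock <- 1:4
to_mock <- c(2, 2, 3, 4, 4, 4)
deg <- in_degree(nodes_mock, to_mock)
deg
```

Here node 1 receives no edge, so it is reported with a count of zero rather than being dropped.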
Another strong advantage of tbl_graph objects is that they are the objects used by the package ggraph, which offers various functions (themes, geoms, etc.) to efficiently visualize networks:
library(ggraph)
ggraph(tg_lagoons, layout = "stress") +
geom_edge_parallel(end_cap = circle(.5), start_cap = circle(.5),
arrow = arrow(length = unit(1, 'mm'), type = 'closed')) +
geom_node_point(aes(colour = taxonomy.rank), size = 8) +
theme_graph(background = "grey40", foreground = NA, text_colour = 'white')
We can easily print the BibTeX of all publications involved in the networks collection.
search_datasets(query = 'lagoon') %>%
get_collection %>% get_citation %>% cat(sep = "\n\n")
#> @article{Zetina_Rej_n_2003, doi = {10.1016/s0272-7714(02)00410-9}, url = {https://doi.org/10.1016%2Fs0272-7714%2802%2900410-9}, year = 2003, month = {aug}, publisher = {Elsevier {BV}}, volume = {57}, number = {5-6}, pages = {803--815}, author = {Manuel J. Zetina-Rejón and Francisco Arreguí-Sánchez and Ernesto A. Chávez}, title = {Trophic structure and flows of energy in the Huizache{ extendash}Caimanero lagoon complex on the Pacific coast of Mexico},journal = {Estuarine, Coastal and Shelf Science}}
#>
#> @book{yanez_1978, Author = {Yáñez-Arancibia, Alejandro}, Editor = {Universidad Nacional Autónoma de México, Centro de Ciencias del Mar y Limnología. Ciudad Universitaria, México, D.F. -- 1a ed.},Title = {Taxonomía, ecología y estructura de las comunidades de peces en lagunas costeras con bocas efímeras del Pacífico de México}, Year = {1978}}
Chamberlain, Scott, Eduard Szoecs, Zachary Foster, Zebulun Arendsee, Carl Boettiger, Karthik Ram, Ignasi Bartomeus, et al. 2019. Taxize: Taxonomic Information from Around the Web. https://github.com/ropensci/taxize.
Poisot, Timothée, Benjamin Baiser, Jennifer A. Dunne, Sonia Kéfi, François Massol, Nicolas Mouquet, Tamara N. Romanuk, Daniel B. Stouffer, Spencer A. Wood, and Dominique Gravel. 2016. “Mangal - Making Ecological Network Analysis Simple.” Ecography 39 (4): 384–90. https://doi.org/10.1111/ecog.00976.