The atlas_
functions are used to return data from the atlas chosen using galah_config()
. They are:
atlas_counts
atlas_occurrences
atlas_species
atlas_media
atlas_taxonomy
The final atlas_
function - atlas_citation
- is unusual in that it does not return any new data. Instead it provides a citation for an existing dataset ( downloaded using atlas_occurrences
) that has an associated DOI. The other functions are described below.
atlas_counts()
provides summary counts on records in the specified atlas, without needing to download all the records.
# Total number of records in the ALA
atlas_counts()
## # A tibble: 1 × 1
## count
## <int>
## 1 102070026
In addition to the filter arguments, it has an optional group_by
argument, which provides counts binned by the requested field.
galah_call() |>
galah_group_by(kingdom) |>
atlas_counts()
## # A tibble: 10 × 2
## kingdom count
## <chr> <int>
## 1 Animalia 76305582
## 2 Plantae 21998260
## 3 Fungi 1893778
## 4 Chromista 915235
## 5 Protista 67365
## 6 Bacteria 58156
## 7 Protozoa 23635
## 8 Archaea 1103
## 9 Eukaryota 735
## 10 Virus 472
A common use case of atlas data is to identify which species occur in a specified region, time period, or taxonomic group. atlas_species()
is similar to search_taxa
, in that it returns taxonomic information and unique identifiers in a tibble
. It differs in not being able to return information on taxonomic levels other than the species; but also in being more flexible by supporting filtering:
<- galah_call() |>
species galah_identify("Rodentia") |>
galah_filter(stateProvince == "Northern Territory") |>
atlas_species()
|> head() species
## # A tibble: 6 × 10
## kingdom phylum class order family genus species author species_guid vernacular_name
## <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
## 1 Animalia Chordata Mammalia Rodentia Muridae Mesembriomys Mesembriomys gouldii (J.E. G… urn:lsid:biodiver… Black-footed T…
## 2 Animalia Chordata Mammalia Rodentia Muridae Zyzomys Zyzomys argurus (Thomas… urn:lsid:biodiver… Common Rock-rat
## 3 Animalia Chordata Mammalia Rodentia Muridae Pseudomys Pseudomys hermannsburgensis (Waite,… urn:lsid:biodiver… Sandy Inland M…
## 4 Animalia Chordata Mammalia Rodentia Muridae Notomys Notomys alexis Thomas,… urn:lsid:biodiver… Spinifex Hoppi…
## 5 Animalia Chordata Mammalia Rodentia Muridae Melomys Melomys burtoni (Ramsay… urn:lsid:biodiver… Grassland Melo…
## 6 Animalia Chordata Mammalia Rodentia Muridae Mus Mus musculus Linnaeu… urn:lsid:biodiver… House Mouse
To download occurrence data you will need to specify your email in galah_config()
. This email must be associated with an active ALA account. See more information in the config section
galah_config(email = "your_email@email.com", atlas = "Australia")
Download occurrence records for Eolophus roseicapilla
<- galah_call() |>
occ galah_identify("Eolophus roseicapilla") |>
galah_filter(
== "Australian Capital Territory",
stateProvince >= 2010,
year profile = "ALA"
|>
) galah_select(institutionID, group = "basic") |>
atlas_occurrences()
|> head() occ
## # A tibble: 6 × 8
## decimalLatitude decimalLongitude eventDate scientificName taxonConceptID recordID dataResourceName institutionID
## <dbl> <dbl> <chr> <chr> <chr> <chr> <chr> <chr>
## 1 -35.9 149. "" Eolophus rosei… urn:lsid:biodiv… 17f46d4… eBird Australia ""
## 2 -35.9 149. "" Eolophus rosei… urn:lsid:biodiv… ef2b906… eBird Australia ""
## 3 -35.9 149. "2012-01-18T13:00:00Z" Eolophus rosei… urn:lsid:biodiv… 4f7cd71… BirdLife Austra… ""
## 4 -35.9 149. "" Eolophus rosei… urn:lsid:biodiv… 3236c47… eBird Australia ""
## 5 -35.8 149. "" Eolophus rosei… urn:lsid:biodiv… eec91d8… eBird Australia ""
## 6 -35.8 149. "" Eolophus rosei… urn:lsid:biodiv… e340c42… eBird Australia ""
In addition to text data describing individual occurrences and their attributes, ALA stores images, sounds and videos associated with a given record. These can be downloaded to R
using atlas_media()
and the same set of filters as the other data download functions.
<- galah_call() |>
media_data galah_identify("Eolophus roseicapilla") |>
galah_filter(year == 2020) |>
atlas_media(download_dir = "media")
atlas_taxonomy
provides a way to build taxonomic trees from one clade down to another using ALA’s internal taxonomy. Specify which taxonomic level your tree will go down to with galah_down_to
.
<- galah_call() |>
classes galah_identify("chordata") |>
galah_down_to(class) |>
atlas_taxonomy()
This function is unique within galah
as it is the only function that returns a data.tree
, rather than a tibble
.
## levelName
## 1 Chordata
## 2 ¦--Cephalochordata
## 3 ¦ °--Amphioxi
## 4 ¦--Craniata
## 5 ¦ °--Agnatha
## 6 ¦ ¦--Cephalasipidomorphi
## 7 ¦ °--Myxini
## 8 ¦--Tunicata
## 9 ¦ ¦--Appendicularia
## 10 ¦ ¦--Ascidiacea
## 11 ¦ °--Thaliacea
## 12 °--Vertebrata
## 13 °--Gnathostomata
## 14 ¦--Amphibia
## 15 ¦--Aves
## 16 ¦--Mammalia
## 17 ¦--Pisces
## 18 ¦ ¦--Actinopterygii
## 19 ¦ ¦--Chondrichthyes
## 20 ¦ ¦--Cephalaspidomorphi
## 21 ¦ °--Sarcopterygii
## 22 °--Reptilia
Although the tree format is useful, converting to a data.frame
is straightforward.
::ToDataFrameTypeCol(classes, type = "rank") |> head() data.tree
## rank_phylum rank_subphylum rank_superclass rank_informal rank_class
## 1 Chordata Cephalochordata <NA> <NA> Amphioxi
## 2 Chordata Craniata Agnatha <NA> Cephalasipidomorphi
## 3 Chordata Craniata Agnatha <NA> Myxini
## 4 Chordata Tunicata <NA> <NA> Appendicularia
## 5 Chordata Tunicata <NA> <NA> Ascidiacea
## 6 Chordata Tunicata <NA> <NA> Thaliacea
galah
Various aspects of the galah package can be customized. To preserve configuration for future sessions, set profile_path
to a location of a .Rprofile
file.
To download occurrence records, you will need to provide an email address registered with the ALA. You can create an account here. Once an email is registered with the ALA, it should be stored in the config:
galah_config(email = "myemail@gmail.com")
galah
can cache most results to local files. This means that if the same code is run multiple times, the second and subsequent iterations will be faster.
By default, this caching is session-based, meaning that the local files are stored in a temporary directory that is automatically deleted when the R session is ended. This behaviour can be altered so that caching is permanent, by setting the caching directory to a non-temporary location.
galah_config(cache_directory = "example/dir")
By default, caching is turned off. To turn caching on, run
galah_config(caching = FALSE)
ALA requires that you provide a reason when downloading occurrence data (via the galah atlas_occurrences()
function). The reason is set as “scientific research” by default, but you can change this using galah_config()
. See show_all_reasons()
for valid download reasons.
galah_config(download_reason_id = your_reason_id)
If things aren’t working as expected, more detail (particularly about web requests and caching behaviour) can be obtained by setting the verbose
configuration option:
galah_config(verbose = TRUE)