‘DemografixeR’ estimates gender, age and nationality from a name. The package is an API wrapper for all three ‘Demografix’ APIs (genderize.io, agify.io and nationalize.io), supported in one package.
You can find all the necessary documentation about the package here:
You can install the CRAN release version of DemografixeR with this R command:
install.packages("DemografixeR")
You can also install the development version of DemografixeR with these R commands:
if (!require("devtools")) install.packages("devtools")
devtools::install_github("matbmeijer/DemografixeR")
These are basic examples that show how to estimate nationality, gender and age from a given name, with and without specifying a country. The package takes care of multiple background tasks, such as handling missing values, and fits into workflows built with dplyr or data.table.
library(DemografixeR)
#Simple example without country_id
names<-c("Ben", "Allister", "Lucie", "Paula")
genderize(name = names)
#> [1] "male" "male" "female" "female"
nationalize(name = names)
#> [1] "AU" "ZA" "CZ" "PT"
agify(name = names)
#> [1] 48 44 24 50
#Simple example with country_id
genderize(name = names, country_id = "US")
#> [1] "male" "male" "female" "female"
agify(name = names, country_id = "US")
#> [1] 67 46 65 70
#Workflow example with dplyr, with missing values and multiple different countries
library(dplyr) # provides the %>% pipe used below
df <- data.frame(names = c("Ana", NA, "Pedro", "Francisco", "Maria", "Elena"),
                 country = c(NA, NA, "ES", "DE", "ES", "NL"),
                 stringsAsFactors = FALSE)
df %>%
  dplyr::mutate(guessed_nationality = nationalize(name = names),
                guessed_gender = genderize(name = names, country_id = country),
                guessed_age = agify(name = names, country_id = country)) %>%
  knitr::kable()
names | country | guessed_nationality | guessed_gender | guessed_age |
---|---|---|---|---|
Ana | NA | PT | female | 58 |
NA | NA | NA | NA | NA |
Pedro | ES | PT | male | 69 |
Francisco | DE | CL | male | 58 |
Maria | ES | CY | NA | 59 |
Elena | NL | CC | female | 69 |
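The table above shows that a missing name yields `NA` instead of an error. As a rough illustration of how a vectorised wrapper can achieve this, the sketch below queries each distinct, non-missing name only once and maps the answers back to the original positions. It is not the package's actual implementation: `lookup_stub()` and `vectorised_lookup()` are hypothetical names, and the stub returns hard-coded values in place of a real API call.

```r
# Hypothetical stand-in for a remote lookup; a real wrapper would call the API here.
lookup_stub <- function(unique_names) {
  known <- c(Ana = "female", Pedro = "male", Elena = "female")
  unname(known[unique_names])
}

# Query each distinct, non-missing name once and map results back to input order.
vectorised_lookup <- function(names, lookup = lookup_stub) {
  result <- rep(NA_character_, length(names))
  present <- !is.na(names)
  unique_names <- unique(names[present])
  if (length(unique_names) == 0) return(result)
  answers <- lookup(unique_names)
  # match() gives, for every non-missing input, the position of its answer
  result[present] <- answers[match(names[present], unique_names)]
  result
}

guesses <- vectorised_lookup(c("Ana", NA, "Pedro", "Ana", "Elena"))
print(guesses)
```

Deduplicating before the lookup keeps the number of requests small when the same name appears many times in a column.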
#Detailed data.frame example:
genderize(name = names, simplify = FALSE, meta = TRUE) %>% knitr::kable()
| | name | type | gender | probability | count | api_rate_limit | api_rate_remaining | api_rate_reset | api_request_timestamp |
---|---|---|---|---|---|---|---|---|---|
2 | Ben | gender | male | 0.95 | 77991 | 1000 | 831 | 5214 | 2020-05-04 22:33:05 |
1 | Allister | gender | male | 0.98 | 129 | 1000 | 831 | 5214 | 2020-05-04 22:33:05 |
3 | Lucie | gender | female | 0.99 | 85580 | 1000 | 831 | 5214 | 2020-05-04 22:33:05 |
4 | Paula | gender | female | 0.98 | 74130 | 1000 | 831 | 5214 | 2020-05-04 22:33:05 |
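The detailed output can be used to discard weak guesses. The sketch below rebuilds a few columns of the table above by hand, so that it runs without network access, and keeps a gender only when both `probability` and `count` pass a threshold. The thresholds are arbitrary illustrative values, not package defaults, and `gender_filtered` is a hypothetical column name; the sketch also assumes that `count` reflects the number of records behind each estimate.

```r
# Values copied from the detailed genderize() output shown above.
detailed <- data.frame(
  name = c("Ben", "Allister", "Lucie", "Paula"),
  gender = c("male", "male", "female", "female"),
  probability = c(0.95, 0.98, 0.99, 0.98),
  count = c(77991, 129, 85580, 74130),
  stringsAsFactors = FALSE
)

# Illustrative thresholds (assumptions, not package defaults).
min_probability <- 0.9
min_count <- 1000

# Keep a guess only when it is both confident and backed by enough records.
reliable <- detailed$probability >= min_probability & detailed$count >= min_count
detailed$gender_filtered <- ifelse(reliable, detailed$gender, NA_character_)
print(detailed[, c("name", "gender_filtered")])
```

With these thresholds, "Allister" is set to `NA` because its estimate rests on only 129 records, even though its probability is high.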
Please note that the ‘DemografixeR’ project is released with a Contributor Code of Conduct. By contributing to this project, you agree to abide by its terms.