This vignette is a brief introduction to the package including its installation and making some basic queries.
lehdr is an R package that allows users to draw Longitudinal and Employer Household Dynamics Origin-Destination Employment Statistics (LODES) datasets returned as dataframes. The LODES dataset forms the backbone of the US Census’s OnTheMap web app that allows users to track changing spatial employment patterns at a fine geographic scale. While OnTheMap is useful, it is a limited tool that does not easily allow comparisons over time or across geographies. This package exists to make querying the tables that form the OnTheMap easier for urban researchers and practitioners, such as transportation and economic development planners and disaster preparedness professionals.
lehdr has not yet been submitted to CRAN so installing using devtools is required. Additionally, we’ll be using dplyr.
::install_github("jamgreen/lehdr")
devtoolslibrary(lehdr)
library(dplyr)
library(stringr)
This first example pulls the Oregon (state = "or"
) 2014 (year = 2014
), origin-destination (lodes_type = "od"
), all jobs including private primary, secondary, and Federal (job_type = "JT01"
), all jobs across ages, earnings, and industry (segment = "S000"
), aggregated at the Census Tract level rather than the default Census Block (agg_geo = "tract"
).
<- grab_lodes(state = "or", year = 2014, lodes_type = "od", job_type = "JT01",
or_od segment = "S000", state_part = "main", agg_geo = "tract")
head(or_od)
The package can be used to retrieve multiple states and years at the same time by creating a vector or list. This second example pulls the Oregon AND Rhode Island (state = c("or", "ri")
) for 2013 and 2014 (year = c(2013, 2014)
or year = 2013:2014
).
<- grab_lodes(state = c("or", "ri"), year = c(2013, 2014), lodes_type = "od", job_type = "JT01",
or_ri_od segment = "S000", state_part = "main", agg_geo = "tract")
head(or_ri_od)
Not all years are available for each state. To see all options for lodes_type
, job_type
, and segment
and the availability for each state/year, please see the most recent LEHD Technical Document at https://lehd.ces.census.gov/data/lodes/LODES7/.
Other common uses might include retrieving Residential or Work Area Characteristics (lodes_type = "rac"
or lodes_type = "wac"
respectively), low income jobs (segment = "SE01"
) or good producing jobs (segment = "SI01"
). Other common geographies might include retrieving data at the Census Block level (agg_geo = "block"
, not necessary as it is default) – but see below for other aggregation levels.
The following examples loads work area characteristics (wac), then uses the work area geoid w_geocode
to create a variable that is just the county w_county_fips
. Similar transformations can be made on residence area characteristics (rac) by using the h_geocode
variable. Both variables are available in origin-destination (od) datasets and with od, one would need to set a h_county_fips
and on w_county_fips
.
<- grab_lodes(state = "md", year = 2015, lodes_type = "wac", job_type = "JT01", segment = "S000")
md_rac
head(md_rac)
<- md_rac %>% mutate(w_county_fips = str_sub(w_geocode, 1, 5))
md_rac_county
head(md_rac_county)
To aggregate at the county level, continuing the above example, we must also drop the original lock geoid w_geocode
, group by our new variable w_county_fips
and our existing variables year
and createdate
, then aggregate the remaining numeric variables.
<- md_rac %>% mutate(w_county_fips = str_sub(w_geocode, 1, 5)) %>%
md_rac_county select(-"w_geocode") %>%
group_by(w_county_fips, state, year, createdate) %>%
summarise_if(is.numeric, sum)
head(md_rac_county)
Alternatively, this functionality is also built-in to the package and advisable for origin-destination grabs. Here include an argument to aggregate at the County level (agg_geo = "county"
):
<- grab_lodes(state = "md", year = 2015, lodes_type = "rac", job_type = "JT01",
md_rac_county segment = "S000", agg_geo = "county")
head(md_rac_county)
As mentioned above, aggregating origin-destination is built-in. This takes care of aggregation on both the h_geocode
and w_geocode
variables:
<- grab_lodes(state = "md", year = 2015, lodes_type = "od", job_type = "JT01",
md_od_county segment = "S000", agg_geo = "county", state_part = "main")
head(md_od_county)
Similarly, built-in functions exist to group at Block Group, Tract, County, and State levels. County was demonstrated above. All require setting the agg_geo
argument. This aggregation works for all three LODES types.
<- grab_lodes(state = "md", year = 2015, lodes_type = "rac", job_type = "JT01",
md_rac_bg segment = "S000", agg_geo = "bg")
head(md_rac_bg)
<- grab_lodes(state = "md", year = 2015, lodes_type = "rac", job_type = "JT01",
md_rac_tract segment = "S000", agg_geo = "tract")
head(md_rac_tract)
<- grab_lodes(state = "md", year = 2015, lodes_type = "rac", job_type = "JT01",
md_rac_state segment = "S000", agg_geo = "state")
head(md_rac_state)