incidence() objects are easy to work with, and we provide helper functions for both manipulating and accessing the underlying data and attributes. As incidence() objects are subclasses of tibbles, they also integrate well with tidyverse verbs.
regroup()
Sometimes you may find you've created a grouped incidence object but now want to change the internal grouping. Assuming you are after a subset of the grouping already generated, you can use the regroup() function to get the desired aggregation:
library(outbreaks)
library(dplyr)
library(incidence2)
# load data
dat <- ebola_sim_clean$linelist
# generate the incidence object with 3 groups
inci <- incidence(dat, date_of_onset, groups = c(gender, hospital, outcome), interval = "week")
inci
#> An incidence object: 1,448 x 5
#> date range: [2014-W15] to [2015-W18]
#> cases: 5829
#> interval: 1 (Monday) week
#> cumulative: FALSE
#>
#> date_index gender hospital outcome count
#> <yrwk> <fct> <fct> <fct> <int>
#> 1 2014-W15 f Military Hospital <NA> 1
#> 2 2014-W16 m Connaught Hospital <NA> 1
#> 3 2014-W17 f <NA> <NA> 1
#> 4 2014-W17 f <NA> Death 1
#> 5 2014-W17 f other Recover 2
#> 6 2014-W17 m other Recover 1
#> 7 2014-W18 f <NA> Recover 1
#> 8 2014-W18 f Connaught Hospital Recover 1
#> 9 2014-W18 f Princess Christian Maternity Hospital (PCMH) Death 1
#> 10 2014-W18 f Rokupa Hospital Recover 1
#> # … with 1,438 more rows
# regroup to just two groups
inci %>% regroup(c(gender, outcome))
#> An incidence object: 320 x 4
#> date range: [2014-W15] to [2015-W18]
#> cases: 5829
#> interval: 1 (Monday) week
#> cumulative: FALSE
#>
#> date_index gender outcome count
#> <yrwk> <fct> <fct> <int>
#> 1 2014-W15 f <NA> 1
#> 2 2014-W16 m <NA> 1
#> 3 2014-W17 f <NA> 1
#> 4 2014-W17 f Death 1
#> 5 2014-W17 f Recover 2
#> 6 2014-W17 m Recover 1
#> 7 2014-W18 f Death 1
#> 8 2014-W18 f Recover 3
#> 9 2014-W19 f <NA> 4
#> 10 2014-W19 f Death 2
#> # … with 310 more rows
# drop all groups
inci %>% regroup()
#> An incidence object: 56 x 2
#> date range: [2014-W15] to [2015-W18]
#> cases: 5829
#> interval: 1 (Monday) week
#> cumulative: FALSE
#>
#> date_index count
#> <yrwk> <int>
#> 1 2014-W15 1
#> 2 2014-W16 1
#> 3 2014-W17 5
#> 4 2014-W18 4
#> 5 2014-W19 12
#> 6 2014-W20 17
#> 7 2014-W21 15
#> 8 2014-W22 19
#> 9 2014-W23 23
#> 10 2014-W24 21
#> # … with 46 more rows
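As a quick sanity check on the outputs above, regrouping only aggregates rows, so the total number of cases should stay the same. The sketch below assumes the version of incidence2 used in this document and that the counts live in a column named count, as in the printed output. The names total_full and total_none are illustrative only.

```r
library(outbreaks)
library(incidence2)

dat <- ebola_sim_clean$linelist
inci <- incidence(dat, date_of_onset, groups = c(gender, hospital, outcome), interval = "week")

# total cases with all three groups versus with every group dropped
total_full <- sum(inci$count)
total_none <- sum(regroup(inci)$count)

# regrouping aggregates rows, so the two totals should agree
total_full == total_none
```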
We also provide a helper function, cumulate(), to easily generate cumulative incidences:
inci %>%
  regroup(hospital) %>%
  cumulate() %>%
  facet_plot(n_breaks = 4, nrow = 3)
keep_first() and keep_last()
Once your data is grouped by date, you may want to select the first or last few entries based on a particular date grouping using keep_first() and keep_last():
inci %>% keep_first(3)
#> An incidence object: 6 x 5
#> date range: [2014-W15] to [2014-W17]
#> cases: 7
#> interval: 1 (Monday) week
#> cumulative: FALSE
#>
#> date_index gender hospital outcome count
#> <yrwk> <fct> <fct> <fct> <int>
#> 1 2014-W15 f Military Hospital <NA> 1
#> 2 2014-W16 m Connaught Hospital <NA> 1
#> 3 2014-W17 f <NA> <NA> 1
#> 4 2014-W17 f <NA> Death 1
#> 5 2014-W17 f other Recover 2
#> 6 2014-W17 m other Recover 1
inci %>% keep_last(3)
#> An incidence object: 63 x 5
#> date range: [2015-W16] to [2015-W18]
#> cases: 103
#> interval: 1 (Monday) week
#> cumulative: FALSE
#>
#> date_index gender hospital outcome count
#> <yrwk> <fct> <fct> <fct> <int>
#> 1 2015-W16 f <NA> <NA> 1
#> 2 2015-W16 f <NA> Death 7
#> 3 2015-W16 f <NA> Recover 1
#> 4 2015-W16 f Connaught Hospital <NA> 1
#> 5 2015-W16 f Connaught Hospital Death 5
#> 6 2015-W16 f Connaught Hospital Recover 3
#> 7 2015-W16 f Military Hospital Recover 1
#> 8 2015-W16 f other <NA> 1
#> 9 2015-W16 f other Death 2
#> 10 2015-W16 f other Recover 1
#> # … with 53 more rows
incidence2 has been written with tidyverse compatibility (in particular dplyr) at the forefront of our design choices. By this we mean that if a dplyr operation is applied to an incidence object then, as long as the invariants of the object are preserved (i.e. groups, interval and uniqueness of rows), the object returned will be an incidence object. If the invariants are not preserved, a tibble is returned instead.
library(dplyr)
# create incidence object
inci <- incidence(dat, date_of_onset, interval = "week", groups = c(hospital, gender))
# filtering preserves class
inci %>% filter(gender == "f", hospital == "Rokupa Hospital")
#> An incidence object: 48 x 4
#> date range: [2014-W18] to [2015-W18]
#> cases: 210
#> interval: 1 (Monday) week
#> cumulative: FALSE
#>
#> date_index hospital gender count
#> <yrwk> <fct> <fct> <int>
#> 1 2014-W18 Rokupa Hospital f 1
#> 2 2014-W20 Rokupa Hospital f 1
#> 3 2014-W22 Rokupa Hospital f 1
#> 4 2014-W23 Rokupa Hospital f 1
#> 5 2014-W25 Rokupa Hospital f 1
#> 6 2014-W27 Rokupa Hospital f 1
#> 7 2014-W28 Rokupa Hospital f 4
#> 8 2014-W29 Rokupa Hospital f 2
#> 9 2014-W30 Rokupa Hospital f 1
#> 10 2014-W31 Rokupa Hospital f 1
#> # … with 38 more rows
# slice operations preserve class
inci %>% slice_sample(n = 10)
#> An incidence object: 10 x 4
#> date range: [2014-W23] to [2015-W17]
#> cases: 93
#> interval: 1 (Monday) week
#> cumulative: FALSE
#>
#> date_index hospital gender count
#> <yrwk> <fct> <fct> <int>
#> 1 2015-W17 Military Hospital f 4
#> 2 2014-W25 <NA> m 3
#> 3 2014-W23 other f 2
#> 4 2015-W06 Rokupa Hospital m 2
#> 5 2014-W45 Military Hospital m 20
#> 6 2015-W15 <NA> m 7
#> 7 2014-W47 <NA> m 11
#> 8 2014-W42 other m 20
#> 9 2014-W38 Rokupa Hospital m 7
#> 10 2015-W03 <NA> f 17
inci %>% slice(1, 5, 10)
#> An incidence object: 3 x 4
#> date range: [2014-W15] to [2014-W19]
#> cases: 3
#> interval: 1 (Monday) week
#> cumulative: FALSE
#>
#> date_index hospital gender count
#> <yrwk> <fct> <fct> <int>
#> 1 2014-W15 Military Hospital f 1
#> 2 2014-W17 other m 1
#> 3 2014-W19 <NA> f 1
# mutate preserves class
inci %>% mutate(future = date_index + 999)
#> An incidence object: 601 x 5
#> date range: [2014-W15] to [2015-W18]
#> cases: 5829
#> interval: 1 (Monday) week
#> cumulative: FALSE
#>
#> date_index hospital gender count future
#> <yrwk> <fct> <fct> <int> <yrwk>
#> 1 2014-W15 Military Hospital f 1 2033-W22
#> 2 2014-W16 Connaught Hospital m 1 2033-W23
#> 3 2014-W17 <NA> f 2 2033-W24
#> 4 2014-W17 other f 2 2033-W24
#> 5 2014-W17 other m 1 2033-W24
#> 6 2014-W18 <NA> f 1 2033-W25
#> 7 2014-W18 Connaught Hospital f 1 2033-W25
#> 8 2014-W18 Princess Christian Maternity Hospital (PCMH) f 1 2033-W25
#> 9 2014-W18 Rokupa Hospital f 1 2033-W25
#> 10 2014-W19 <NA> f 1 2033-W26
#> # … with 591 more rows
# rename preserves class
inci %>% rename(left_bin = date_index)
#> An incidence object: 601 x 4
#> date range: [2014-W15] to [2015-W18]
#> cases: 5829
#> interval: 1 (Monday) week
#> cumulative: FALSE
#>
#> left_bin hospital gender count
#> <yrwk> <fct> <fct> <int>
#> 1 2014-W15 Military Hospital f 1
#> 2 2014-W16 Connaught Hospital m 1
#> 3 2014-W17 <NA> f 2
#> 4 2014-W17 other f 2
#> 5 2014-W17 other m 1
#> 6 2014-W18 <NA> f 1
#> 7 2014-W18 Connaught Hospital f 1
#> 8 2014-W18 Princess Christian Maternity Hospital (PCMH) f 1
#> 9 2014-W18 Rokupa Hospital f 1
#> 10 2014-W19 <NA> f 1
#> # … with 591 more rows
# select returns a tibble unless all date, count and group variables are preserved
inci %>% select(-1)
#> # A tibble: 601 × 3
#> hospital gender count
#> <fct> <fct> <int>
#> 1 Military Hospital f 1
#> 2 Connaught Hospital m 1
#> 3 <NA> f 2
#> 4 other f 2
#> 5 other m 1
#> 6 <NA> f 1
#> 7 Connaught Hospital f 1
#> 8 Princess Christian Maternity Hospital (PCMH) f 1
#> 9 Rokupa Hospital f 1
#> 10 <NA> f 1
#> # … with 591 more rows
inci %>% select(everything())
#> An incidence object: 601 x 4
#> date range: [2014-W15] to [2015-W18]
#> cases: 5829
#> interval: 1 (Monday) week
#> cumulative: FALSE
#>
#> date_index hospital gender count
#> <yrwk> <fct> <fct> <int>
#> 1 2014-W15 Military Hospital f 1
#> 2 2014-W16 Connaught Hospital m 1
#> 3 2014-W17 <NA> f 2
#> 4 2014-W17 other f 2
#> 5 2014-W17 other m 1
#> 6 2014-W18 <NA> f 1
#> 7 2014-W18 Connaught Hospital f 1
#> 8 2014-W18 Princess Christian Maternity Hospital (PCMH) f 1
#> 9 2014-W18 Rokupa Hospital f 1
#> 10 2014-W19 <NA> f 1
#> # … with 591 more rows
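The class-preservation rule can also be checked programmatically rather than by reading the printed header. The following is a minimal sketch that compares class vectors, so it does not rely on the exact class name incidence2 uses internally. The names kept and dropped are illustrative, and the behaviour is assumed to match the outputs shown above.

```r
library(outbreaks)
library(dplyr)
library(incidence2)

dat <- ebola_sim_clean$linelist
inci <- incidence(dat, date_of_onset, interval = "week", groups = c(hospital, gender))

# filter() keeps the invariants, so the class vector should be unchanged
kept <- inci %>% filter(gender == "f")
identical(class(kept), class(inci))

# dropping the date column breaks the invariants, so a plain tibble is expected
dropped <- inci %>% select(-1)
identical(class(dropped), class(inci))
tibble::is_tibble(dropped)
```

A check like this can be useful inside a pipeline where later steps depend on the result still being an incidence object.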
We provide multiple accessors to easily access information about an incidence() object's structure:
get_count_names(), get_dates_name(), and get_group_names() all return character vectors of the column names corresponding to the requested variables.
get_n() returns the number of observations.
get_interval() returns the interval of the object.
get_timespan() returns the number of days the object covers.
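As a brief illustration, the accessors can be called directly on an incidence object. This sketch rebuilds the two-group object from the previous section and assumes each accessor takes the object as its first argument. No outputs are shown, since they depend on the installed version of incidence2.

```r
library(outbreaks)
library(incidence2)

dat <- ebola_sim_clean$linelist
inci <- incidence(dat, date_of_onset, interval = "week", groups = c(hospital, gender))

# column names for the counts, dates and groups
get_count_names(inci)
get_dates_name(inci)
get_group_names(inci)

# number of observations, interval, and number of days covered
get_n(inci)
get_interval(inci)
get_timespan(inci)
```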