This article describes creating an ADSL ADaM. Examples are currently presented and tested using DM, EX, AE, LB and DS SDTM domains. However, other domains could be used.
Note: All examples assume CDISC SDTM and/or ADaM format as input unless otherwise specified.
To start, all data frames needed for the creation of ADSL should be read into the environment. This will be a company-specific process. Some of the data frames needed may be DM, EX, DS, AE, and LB.
For example purposes, the CDISC Pilot SDTM datasets, which are included in {admiral.test}, are used.
library(admiral)
library(dplyr)
library(admiral.test)
library(lubridate)
library(stringr)
data("admiral_dm")
data("admiral_ds")
data("admiral_ex")
data("admiral_ae")
data("admiral_lb")
dm <- admiral_dm
ds <- admiral_ds
ex <- admiral_ex
ae <- admiral_ae
lb <- admiral_lb
The DM domain is used as the basis for ADSL:
adsl <- dm %>%
  select(-DOMAIN)
USUBJID | RFSTDTC | COUNTRY | AGE | SEX | RACE | ETHNIC | ARM | ACTARM |
---|---|---|---|---|---|---|---|---|
01-701-1015 | 2014-01-02 | USA | 63 | F | WHITE | HISPANIC OR LATINO | Placebo | Placebo |
01-701-1023 | 2012-08-05 | USA | 64 | M | WHITE | HISPANIC OR LATINO | Placebo | Placebo |
01-701-1028 | 2013-07-19 | USA | 71 | M | WHITE | NOT HISPANIC OR LATINO | Xanomeline High Dose | Xanomeline High Dose |
01-701-1033 | 2014-03-18 | USA | 74 | M | WHITE | NOT HISPANIC OR LATINO | Xanomeline Low Dose | Xanomeline Low Dose |
01-701-1034 | 2014-07-01 | USA | 77 | F | WHITE | NOT HISPANIC OR LATINO | Xanomeline High Dose | Xanomeline High Dose |
01-701-1047 | 2013-02-12 | USA | 85 | F | WHITE | NOT HISPANIC OR LATINO | Placebo | Placebo |
01-701-1057 | | USA | 59 | F | WHITE | HISPANIC OR LATINO | Screen Failure | Screen Failure |
01-701-1097 | 2014-01-01 | USA | 68 | M | WHITE | NOT HISPANIC OR LATINO | Xanomeline Low Dose | Xanomeline Low Dose |
01-701-1111 | 2012-09-07 | USA | 81 | F | WHITE | NOT HISPANIC OR LATINO | Xanomeline Low Dose | Xanomeline Low Dose |
01-701-1115 | 2012-11-30 | USA | 84 | M | WHITE | NOT HISPANIC OR LATINO | Xanomeline Low Dose | Xanomeline Low Dose |
Derive Treatment Variables (TRT0xP, TRT0xA)
The mapping of the treatment variables is left to the ADaM programmer. An example mapping may be:
adsl <- dm %>%
  mutate(TRT01P = ARM, TRT01A = ACTARM)
Derive Treatment Date/Time and Duration Variables (TRTSDTM, TRTEDTM, TRTDURD)
The function derive_vars_merged() can be used to derive the treatment start and end date/times using the ex domain. A pre-processing step for ex is required to convert the variables EXSTDTC and EXENDTC to datetime variables and impute missing date or time components. Conversion and imputation are done by derive_vars_dtm().
Example calls:
# impute start and end time of exposure to first and last respectively, do not impute date
ex_ext <- ex %>%
  derive_vars_dtm(
    dtc = EXSTDTC,
    new_vars_prefix = "EXST"
  ) %>%
  derive_vars_dtm(
    dtc = EXENDTC,
    new_vars_prefix = "EXEN",
    time_imputation = "last"
  )
adsl <- adsl %>%
  derive_vars_merged(
    dataset_add = ex_ext,
    filter_add = (EXDOSE > 0 |
      (EXDOSE == 0 &
        str_detect(EXTRT, "PLACEBO"))) & !is.na(EXSTDTM),
    new_vars = vars(TRTSDTM = EXSTDTM, TRTSTMF = EXSTTMF),
    order = vars(EXSTDTM, EXSEQ),
    mode = "first",
    by_vars = vars(STUDYID, USUBJID)
  ) %>%
  derive_vars_merged(
    dataset_add = ex_ext,
    filter_add = (EXDOSE > 0 |
      (EXDOSE == 0 &
        str_detect(EXTRT, "PLACEBO"))) & !is.na(EXENDTM),
    new_vars = vars(TRTEDTM = EXENDTM, TRTETMF = EXENTMF),
    order = vars(EXENDTM, EXSEQ),
    mode = "last",
    by_vars = vars(STUDYID, USUBJID)
  )
This call returns the original data frame with the columns TRTSDTM, TRTSTMF, TRTEDTM, and TRTETMF added. Exposure observations with an incomplete date and zero doses of non-placebo treatments are ignored. Missing time parts are imputed as first or last for start and end date respectively.
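To make the selection logic concrete, here is a minimal base R sketch of the same idea for the start datetime: filter to valid doses, take the first record per subject, and left join it onto the subject-level data. It is not how {admiral} implements derive_vars_merged(); the data frames adsl_toy and ex_toy and all their values are made up for illustration.

```r
# Toy stand-ins for ADSL and a pre-processed EX (hypothetical values).
adsl_toy <- data.frame(USUBJID = c("S1", "S2", "S3"))
ex_toy <- data.frame(
  USUBJID = c("S1", "S1", "S2", "S2"),
  EXSEQ   = c(1, 2, 1, 2),
  EXTRT   = c("PLACEBO", "PLACEBO", "DRUG A", "DRUG A"),
  EXDOSE  = c(0, 0, 0, 54),
  EXSTDTM = as.POSIXct(
    c("2014-01-02 00:00:00", "2014-01-16 00:00:00",
      "2014-02-01 00:00:00", "2014-02-15 00:00:00"),
    tz = "UTC"
  )
)

# 1. Keep valid doses: positive dose, or zero dose of a placebo treatment,
#    and only records with a non-missing start datetime.
is_valid <- (ex_toy$EXDOSE > 0 |
  (ex_toy$EXDOSE == 0 & grepl("PLACEBO", ex_toy$EXTRT))) &
  !is.na(ex_toy$EXSTDTM)
valid <- ex_toy[is_valid, ]

# 2. Order by start datetime and sequence, then keep the first record per subject.
valid <- valid[order(valid$USUBJID, valid$EXSTDTM, valid$EXSEQ), ]
first_dose <- valid[!duplicated(valid$USUBJID), c("USUBJID", "EXSTDTM")]
names(first_dose)[2] <- "TRTSDTM"

# 3. Left join: subjects without a valid dose keep a missing TRTSDTM.
adsl_toy <- merge(adsl_toy, first_dose, by = "USUBJID", all.x = TRUE)
print(adsl_toy)
```

In this toy data, the zero-dose record of the non-placebo treatment for S2 is skipped, and S3, who has no exposure records, ends up with a missing start datetime.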
The datetime variables returned can be converted to dates using the derive_vars_dtm_to_dt() function.
adsl <- adsl %>%
  derive_vars_dtm_to_dt(source_vars = vars(TRTSDTM, TRTEDTM))
Now that TRTSDT and TRTEDT are derived, the function derive_var_trtdurd() can be used to calculate the treatment duration (TRTDURD).
adsl <- adsl %>%
  derive_var_trtdurd()
USUBJID | RFSTDTC | TRTSDTM | TRTSDT | TRTEDTM | TRTEDT | TRTDURD |
---|---|---|---|---|---|---|
01-701-1015 | 2014-01-02 | 2014-01-02 | 2014-01-02 | 2014-07-02 23:59:59 | 2014-07-02 | 182 |
01-701-1023 | 2012-08-05 | 2012-08-05 | 2012-08-05 | 2012-09-01 23:59:59 | 2012-09-01 | 28 |
01-701-1028 | 2013-07-19 | 2013-07-19 | 2013-07-19 | 2014-01-14 23:59:59 | 2014-01-14 | 180 |
01-701-1033 | 2014-03-18 | 2014-03-18 | 2014-03-18 | 2014-03-31 23:59:59 | 2014-03-31 | 14 |
01-701-1034 | 2014-07-01 | 2014-07-01 | 2014-07-01 | 2014-12-30 23:59:59 | 2014-12-30 | 183 |
01-701-1047 | 2013-02-12 | 2013-02-12 | 2013-02-12 | 2013-03-09 23:59:59 | 2013-03-09 | 26 |
01-701-1057 | | NA | NA | NA | NA | NA |
01-701-1097 | 2014-01-01 | 2014-01-01 | 2014-01-01 | 2014-07-09 23:59:59 | 2014-07-09 | 190 |
01-701-1111 | 2012-09-07 | 2012-09-07 | 2012-09-07 | 2012-09-16 23:59:59 | 2012-09-16 | 10 |
01-701-1115 | 2012-11-30 | 2012-11-30 | 2012-11-30 | 2013-01-23 23:59:59 | 2013-01-23 | 55 |
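The TRTDURD values in the table are consistent with counting both the first and the last day of treatment, that is, end date minus start date plus one. A quick base R check for the first two subjects, using the dates shown in the table (this is only an arithmetic sketch, not the code of derive_var_trtdurd()):

```r
# Start and end dates of subjects 01-701-1015 and 01-701-1023 from the table.
trtsdt <- as.Date(c("2014-01-02", "2012-08-05"))
trtedt <- as.Date(c("2014-07-02", "2012-09-01"))

# Duration in days, counting both the first and the last treatment day.
trtdurd <- as.numeric(trtedt - trtsdt) + 1
print(trtdurd)
```

The result matches the TRTDURD column for these two subjects (182 and 28).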
Disposition Date (EOSDT)
The functions derive_vars_dt() and derive_vars_merged() can be used to derive a disposition date. First the character disposition date (DS.DSSTDTC) is converted to a numeric date (DSSTDT) by calling derive_vars_dt(). Then the relevant disposition date is selected by adjusting the filter_add parameter.
To derive the End of Study date (EOSDT), a call could be:
# convert character date to numeric date without imputation
ds_ext <- derive_vars_dt(
  ds,
  dtc = DSSTDTC,
  new_vars_prefix = "DSST"
)
adsl <- adsl %>%
  derive_vars_merged(
    dataset_add = ds_ext,
    by_vars = vars(STUDYID, USUBJID),
    new_vars = vars(EOSDT = DSSTDT),
    filter_add = DSCAT == "DISPOSITION EVENT" & DSDECOD != "SCREEN FAILURE"
  )
USUBJID | DSCAT | DSDECOD | DSTERM | DSSTDTC |
---|---|---|---|---|
01-701-1015 | PROTOCOL MILESTONE | RANDOMIZED | RANDOMIZED | 2014-01-02 |
01-701-1015 | DISPOSITION EVENT | COMPLETED | PROTOCOL COMPLETED | 2014-07-02 |
01-701-1015 | OTHER EVENT | FINAL LAB VISIT | FINAL LAB VISIT | 2014-07-02 |
01-701-1023 | PROTOCOL MILESTONE | RANDOMIZED | RANDOMIZED | 2012-08-05 |
01-701-1023 | DISPOSITION EVENT | ADVERSE EVENT | ADVERSE EVENT | 2012-09-02 |
01-701-1023 | OTHER EVENT | FINAL LAB VISIT | FINAL LAB VISIT | 2012-09-02 |
01-701-1023 | OTHER EVENT | FINAL RETRIEVAL VISIT | FINAL RETRIEVAL VISIT | 2013-02-18 |
01-701-1028 | PROTOCOL MILESTONE | RANDOMIZED | RANDOMIZED | 2013-07-19 |
01-701-1028 | DISPOSITION EVENT | COMPLETED | PROTOCOL COMPLETED | 2014-01-14 |
01-701-1028 | OTHER EVENT | FINAL LAB VISIT | FINAL LAB VISIT | 2014-01-14 |
We would get:
USUBJID | EOSDT |
---|---|
01-701-1015 | 2014-07-02 |
01-701-1023 | 2012-09-02 |
01-701-1028 | 2014-01-14 |
01-701-1033 | 2014-04-14 |
01-701-1034 | 2014-12-30 |
01-701-1047 | 2013-03-29 |
01-701-1057 | NA |
01-701-1097 | 2014-07-09 |
01-701-1111 | 2012-09-17 |
01-701-1115 | 2013-01-23 |
This call would return the input dataset with the column EOSDT added. The function derive_vars_dt() allows the user to impute partial dates as well. If imputation is needed and the date is to be imputed to the first of the month, then set date_imputation = "first".
Disposition Status (EOSSTT)
The function derive_var_disposition_status() can be used to derive a disposition status at a specific timepoint. The relevant disposition variable (DS.DSDECOD) is selected by adjusting the filter parameter and used to derive EOSSTT.
To derive the End of Study status (EOSSTT), a call could be:
adsl <- adsl %>%
  derive_var_disposition_status(
    dataset_ds = ds,
    new_var = EOSSTT,
    status_var = DSDECOD,
    filter_ds = DSCAT == "DISPOSITION EVENT"
  )
USUBJID | EOSDT | EOSSTT |
---|---|---|
01-701-1015 | 2014-07-02 | COMPLETED |
01-701-1023 | 2012-09-02 | DISCONTINUED |
01-701-1028 | 2014-01-14 | COMPLETED |
01-701-1033 | 2014-04-14 | DISCONTINUED |
01-701-1034 | 2014-12-30 | COMPLETED |
01-701-1047 | 2013-03-29 | DISCONTINUED |
01-701-1057 | NA | NOT STARTED |
01-701-1097 | 2014-07-09 | COMPLETED |
01-701-1111 | 2012-09-17 | DISCONTINUED |
01-701-1115 | 2013-01-23 | DISCONTINUED |
This call would return the input dataset with the column EOSSTT added.
By default, the function will derive EOSSTT as
"NOT STARTED" if DSDECOD is "SCREEN FAILURE" or "SCREENING NOT COMPLETED"
"COMPLETED" if DSDECOD == "COMPLETED"
"DISCONTINUED" if DSDECOD is not "COMPLETED" or NA
"ONGOING" otherwise
If the default derivation must be changed, the user can create his/her own function and pass it to the format_new_var argument of the function (format_new_var = new_mapping) to map DSDECOD to a suitable EOSSTT value.
Example function format_eosstt():
format_eosstt <- function(DSDECOD) {
  case_when(
    DSDECOD %in% c("COMPLETED") ~ "COMPLETED",
    DSDECOD %in% c("SCREEN FAILURE") ~ NA_character_,
    !is.na(DSDECOD) ~ "DISCONTINUED",
    TRUE ~ "ONGOING"
  )
}
The customized mapping function format_eosstt() can now be passed to the main function:
adsl <- adsl %>%
  derive_var_disposition_status(
    dataset_ds = ds,
    new_var = EOSSTT,
    status_var = DSDECOD,
    format_new_var = format_eosstt,
    filter_ds = DSCAT == "DISPOSITION EVENT"
  )
This call would return the input dataset with the column EOSSTT added.
Disposition Reason(s) (DCSREAS, DCSREASP)
The main reason for discontinuation is usually stored in DSDECOD while DSTERM provides additional details regarding the subject's discontinuation (e.g., description of "OTHER").
The function derive_vars_disposition_reason() can be used to derive a disposition reason (along with the details, if required) at a specific timepoint. The relevant disposition variable(s) (DS.DSDECOD, DS.DSTERM) are selected by adjusting the filter parameter and used to derive the main reason (and details).
To derive the End of Study reason(s) (DCSREAS and DCSREASP), the call would be:
adsl <- adsl %>%
  derive_vars_disposition_reason(
    dataset_ds = ds,
    new_var = DCSREAS,
    reason_var = DSDECOD,
    new_var_spe = DCSREASP,
    reason_var_spe = DSTERM,
    filter_ds = DSCAT == "DISPOSITION EVENT" & DSDECOD != "SCREEN FAILURE"
  )
USUBJID | EOSDT | EOSSTT | DCSREAS | DCSREASP |
---|---|---|---|---|
01-701-1015 | 2014-07-02 | COMPLETED | NA | NA |
01-701-1023 | 2012-09-02 | DISCONTINUED | ADVERSE EVENT | NA |
01-701-1028 | 2014-01-14 | COMPLETED | NA | NA |
01-701-1033 | 2014-04-14 | DISCONTINUED | STUDY TERMINATED BY SPONSOR | NA |
01-701-1034 | 2014-12-30 | COMPLETED | NA | NA |
01-701-1047 | 2013-03-29 | DISCONTINUED | ADVERSE EVENT | NA |
01-701-1057 | NA | NOT STARTED | NA | NA |
01-701-1097 | 2014-07-09 | COMPLETED | NA | NA |
01-701-1111 | 2012-09-17 | DISCONTINUED | ADVERSE EVENT | NA |
01-701-1115 | 2013-01-23 | DISCONTINUED | ADVERSE EVENT | NA |
This call would return the input dataset with the columns DCSREAS and DCSREASP added.
By default, the function will map
DCSREAS as DSDECOD if DSDECOD is not "COMPLETED" or NA, NA otherwise
DCSREASP as DSTERM if DSDECOD is equal to OTHER, NA otherwise
If the default derivation must be changed, the user can create his/her own function and pass it to the format_new_vars argument of the function (format_new_vars = new_mapping) to map DSDECOD and DSTERM to a suitable DCSREAS/DCSREASP value.
Example function format_dcsreas():
format_dcsreas <- function(dsdecod, dsterm = NULL) {
  if (is.null(dsterm)) {
    if_else(dsdecod %notin% c("COMPLETED", "SCREEN FAILURE") & !is.na(dsdecod), dsdecod, NA_character_)
  } else {
    if_else(dsdecod == "OTHER", dsterm, NA_character_)
  }
}
The customized mapping function format_dcsreas() can now be passed to the main function:
adsl <- adsl %>%
  derive_vars_disposition_reason(
    dataset_ds = ds,
    new_var = DCSREAS,
    reason_var = DSDECOD,
    new_var_spe = DCSREASP,
    reason_var_spe = DSTERM,
    format_new_vars = format_dcsreas,
    filter_ds = DSCAT == "DISPOSITION EVENT"
  )
Randomization Date (RANDDT)
The function derive_vars_merged() can be used to derive the randomization date variable. To map Randomization Date (RANDDT), the call would be:
adsl <- adsl %>%
  derive_vars_merged(
    dataset_add = ds_ext,
    filter_add = DSDECOD == "RANDOMIZED",
    by_vars = vars(STUDYID, USUBJID),
    new_vars = vars(RANDDT = DSSTDT)
  )
This call would return the input dataset with the column RANDDT added.
USUBJID | RANDDT |
---|---|
01-701-1015 | 2014-01-02 |
01-701-1023 | 2012-08-05 |
01-701-1028 | 2013-07-19 |
01-701-1033 | 2014-03-18 |
01-701-1034 | 2014-07-01 |
01-701-1047 | 2013-02-12 |
01-701-1057 | NA |
01-701-1097 | 2014-01-01 |
01-701-1111 | 2012-09-07 |
01-701-1115 | 2012-11-30 |
Death Date (DTHDT)
The function derive_vars_dt() can be used to derive DTHDT. This function allows the user to impute the date as well.
Example calls:
adsl <- adsl %>%
  derive_vars_dt(
    new_vars_prefix = "DTH",
    dtc = DTHDTC
  )
USUBJID | TRTEDT | DTHDTC | DTHDT | DTHFL |
---|---|---|---|---|
01-701-1015 | 2014-07-02 | | NA | |
01-701-1023 | 2012-09-01 | | NA | |
01-701-1028 | 2014-01-14 | | NA | |
01-701-1033 | 2014-03-31 | | NA | |
01-701-1034 | 2014-12-30 | | NA | |
01-701-1047 | 2013-03-09 | | NA | |
01-701-1057 | NA | | NA | |
01-701-1097 | 2014-07-09 | | NA | |
01-701-1111 | 2012-09-16 | | NA | |
01-701-1115 | 2013-01-23 | | NA | |
This call would return the input dataset with the column DTHDT added and, by default, the associated date imputation flag (DTHDTF) populated with the controlled terminology outlined in the ADaM IG for date imputations. If the imputation flag is not required, the user must set the argument flag_imputation to "none".
If imputation is needed and the date is to be imputed to the first day of the month/year, the call would be:
adsl <- adsl %>%
  derive_vars_dt(
    new_vars_prefix = "DTH",
    dtc = DTHDTC,
    date_imputation = "first"
  )
See also Date and Time Imputation.
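As a rough illustration of what imputing to the first day of the month/year means, the base R sketch below completes partial ISO 8601 dates and records which component was filled in. The helper impute_first() is hypothetical and is not part of {admiral}; the flag values follow the ADaM convention of reporting the highest imputed component ("M" when month and day are imputed, "D" when only the day is imputed).

```r
# Hypothetical helper (not an {admiral} function): complete partial ISO 8601
# dates with the first possible month/day and flag what was imputed.
impute_first <- function(dtc) {
  year_only  <- grepl("^[0-9]{4}$", dtc)
  year_month <- grepl("^[0-9]{4}-[0-9]{2}$", dtc)
  completed <- ifelse(
    year_only, paste0(dtc, "-01-01"),
    ifelse(year_month, paste0(dtc, "-01"), dtc)
  )
  data.frame(
    DTC = dtc,
    # Strings that are not a complete date (e.g. "") become NA.
    DT = as.Date(completed, format = "%Y-%m-%d"),
    # "M": month and day imputed, "D": only day imputed, NA: nothing imputed.
    DTF = ifelse(year_only, "M", ifelse(year_month, "D", NA_character_))
  )
}

res <- impute_first(c("2014-11-03", "2014-11", "2014", ""))
print(res)
```

A complete date is passed through unchanged with a missing flag, while an empty string stays missing.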
Cause of Death (DTHCAUS)
The cause of death DTHCAUS can be derived using the function derive_var_dthcaus().
Since the cause of death could be collected/mapped in different domains (e.g. DS, AE, DD), it is important that the user specifies the right source(s) to derive the cause of death from.
For example, if the date of death is collected in the AE form when the AE is Fatal, the cause of death would be set to the preferred term (AEDECOD) of that Fatal AE, while if the date of death is collected in the DS form, the cause of death would be set to the disposition term (DSTERM). To achieve this, the dthcaus_source() objects must be specified and defined such that they fit the study requirements.
dthcaus_source() specifications:
dataset_name: the name of the dataset where to search for death information,
filter: the condition to define death,
date: the date of death,
mode: first or last to select the first/last date of death if multiple dates are collected,
dthcaus: variable or text used to populate DTHCAUS,
traceability_vars: whether the traceability variables need to be added (e.g. source domain, sequence, variable).
An example call to define the sources would be:
src_ae <- dthcaus_source(
  dataset_name = "ae",
  filter = AEOUT == "FATAL",
  date = AESTDTM,
  mode = "first",
  dthcaus = AEDECOD
)
USUBJID | AESTDTC | AEENDTC | AEDECOD | AEOUT |
---|---|---|---|---|
01-701-1211 | 2013-01-14 | 2013-01-14 | SUDDEN DEATH | FATAL |
01-704-1445 | 2014-10-31 | 2014-10-31 | COMPLETED SUICIDE | FATAL |
01-710-1083 | 2013-08-02 | 2013-08-02 | MYOCARDIAL INFARCTION | FATAL |
src_ds <- dthcaus_source(
  dataset_name = "ds",
  filter = DSDECOD == "DEATH" & grepl("DEATH DUE TO", DSTERM),
  date = DSSTDT,
  mode = "first",
  dthcaus = "Death in DS"
)
USUBJID | DSDECOD | DSTERM | DSSTDTC |
---|---|---|---|
01-701-1211 | DEATH | DEATH | 2013-01-14 |
01-704-1445 | DEATH | DEATH | 2014-11-01 |
01-710-1083 | DEATH | DEATH | 2013-08-02 |
Once the sources are defined, the function derive_var_dthcaus() can be used to derive DTHCAUS:
ae_ext <- derive_vars_dtm(
  ae,
  dtc = AESTDTC,
  new_vars_prefix = "AEST",
  highest_imputation = "M",
  flag_imputation = "none"
)
adsl <- adsl %>%
  derive_var_dthcaus(src_ae, src_ds, source_datasets = list(ae = ae_ext, ds = ds_ext))
USUBJID | EOSDT | DTHDTC | DTHDT | DTHCAUS |
---|---|---|---|---|
01-701-1211 | 2013-01-14 | 2013-01-14 | 2013-01-14 | SUDDEN DEATH |
01-704-1445 | 2014-11-01 | 2014-11-01 | 2014-11-01 | COMPLETED SUICIDE |
01-710-1083 | 2013-08-02 | 2013-08-02 | 2013-08-02 | MYOCARDIAL INFARCTION |
The function also offers the option to add some traceability variables (e.g. DTHDOM would store the domain where the date of death is collected, and DTHSEQ would store the xxSEQ value of that domain). To add them, the traceability_vars argument must be added to the dthcaus_source() arguments:
src_ae <- dthcaus_source(
  dataset_name = "ae",
  filter = AEOUT == "FATAL",
  date = AESTDTM,
  mode = "first",
  dthcaus = AEDECOD,
  traceability_vars = vars(DTHDOM = "AE", DTHSEQ = AESEQ)
)
src_ds <- dthcaus_source(
  dataset_name = "ds",
  filter = DSDECOD == "DEATH" & grepl("DEATH DUE TO", DSTERM),
  date = DSSTDT,
  mode = "first",
  dthcaus = DSTERM,
  traceability_vars = vars(DTHDOM = "DS", DTHSEQ = DSSEQ)
)
adsl <- adsl %>%
  select(-DTHCAUS) %>% # remove it before deriving it again
  derive_var_dthcaus(src_ae, src_ds, source_datasets = list(ae = ae_ext, ds = ds_ext))
USUBJID | TRTEDT | DTHDTC | DTHDT | DTHCAUS | DTHDOM | DTHSEQ |
---|---|---|---|---|---|---|
01-701-1211 | 2013-01-12 | 2013-01-14 | 2013-01-14 | SUDDEN DEATH | AE | 9 |
01-704-1445 | 2014-11-01 | 2014-11-01 | 2014-11-01 | COMPLETED SUICIDE | AE | 1 |
01-710-1083 | 2013-08-01 | 2013-08-02 | 2013-08-02 | MYOCARDIAL INFARCTION | AE | 1 |
The function derive_vars_duration() can be used to derive durations relative to death, like the Relative Day of Death (DTHADY) or the number of days from last dose to death (LDDTHELD).
Example calls:
adsl <- adsl %>%
  derive_vars_duration(
    new_var = DTHADY,
    start_date = TRTSDT,
    end_date = DTHDT
  )
adsl <- adsl %>%
  derive_vars_duration(
    new_var = LDDTHELD,
    start_date = TRTEDT,
    end_date = DTHDT,
    add_one = FALSE
  )
USUBJID | TRTEDT | DTHDTC | DTHDT | DTHCAUS | DTHADY | LDDTHELD |
---|---|---|---|---|---|---|
01-701-1211 | 2013-01-12 | 2013-01-14 | 2013-01-14 | SUDDEN DEATH | 61 | 2 |
01-704-1445 | 2014-11-01 | 2014-11-01 | 2014-11-01 | COMPLETED SUICIDE | 175 | 0 |
01-710-1083 | 2013-08-01 | 2013-08-02 | 2013-08-02 | MYOCARDIAL INFARCTION | 12 | 1 |
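The two calls differ only in whether one day is added to the difference. The base R sketch below shows that distinction with hypothetical dates for a single subject; it illustrates the arithmetic only and is not the implementation of derive_vars_duration().

```r
# Hypothetical dates for one subject (not taken from the study data).
trtsdt <- as.Date("2013-01-01") # first dose
trtedt <- as.Date("2013-01-12") # last dose
dthdt  <- as.Date("2013-01-14") # death

# Relative day: the start date itself counts as day 1, so one day is added.
dthady <- as.numeric(dthdt - trtsdt) + 1

# Elapsed days from last dose to death: plain difference, nothing added
# (this corresponds to add_one = FALSE in the call above).
lddtheld <- as.numeric(dthdt - trtedt)

print(c(DTHADY = dthady, LDDTHELD = lddtheld))
```

With these dates the relative day of death is 14 and the elapsed time from last dose to death is 2 days, the same pattern as subject 01-701-1211 in the table (last dose 2013-01-12, death 2013-01-14, LDDTHELD 2).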
Last Known Alive Date (LSTALVDT)
Similarly to the cause of death (DTHCAUS), the last known alive date (LSTALVDT) can be derived from multiple sources and the user must ensure the sources (date_source()) are correctly defined.
date_source() specifications:
dataset_name: the name of the dataset where to search for date information,
filter: the filter to apply on the datasets,
date: the date of interest,
date_imputation: whether and how to impute partial dates,
traceability_vars: whether the traceability variables need to be added (e.g. source domain, sequence, variable).
An example could be:
ae_start_date <- date_source(
  dataset_name = "ae",
  date = AESTDT
)
ae_end_date <- date_source(
  dataset_name = "ae",
  date = AEENDT
)
lb_date <- date_source(
  dataset_name = "lb",
  date = LBDT,
  filter = !is.na(LBDT)
)
trt_end_date <- date_source(
  dataset_name = "adsl",
  date = TRTEDT
)
Once the sources are defined, the function derive_var_extreme_dt() can be used to derive LSTALVDT:
# impute AE start and end date to first
ae_ext <- ae %>%
  derive_vars_dt(
    dtc = AESTDTC,
    new_vars_prefix = "AEST",
    highest_imputation = "M"
  ) %>%
  derive_vars_dt(
    dtc = AEENDTC,
    new_vars_prefix = "AEEN",
    highest_imputation = "M"
  )
# impute LB date to first
lb_ext <- derive_vars_dt(
  lb,
  dtc = LBDTC,
  new_vars_prefix = "LB",
  highest_imputation = "M"
)
adsl <- adsl %>%
  derive_var_extreme_dt(
    new_var = LSTALVDT,
    ae_start_date, ae_end_date, lb_date, trt_end_date,
    source_datasets = list(ae = ae_ext, adsl = adsl, lb = lb_ext),
    mode = "last"
  )
USUBJID | TRTEDT | DTHDTC | LSTALVDT |
---|---|---|---|
01-701-1015 | 2014-07-02 | | 2014-07-02 |
01-701-1023 | 2012-09-01 | | 2012-09-02 |
01-701-1028 | 2014-01-14 | | 2014-01-14 |
01-701-1033 | 2014-03-31 | | 2014-04-14 |
01-701-1034 | 2014-12-30 | | 2014-12-30 |
01-701-1047 | 2013-03-09 | | 2013-04-07 |
01-701-1097 | 2014-07-09 | | 2014-07-09 |
01-701-1111 | 2012-09-16 | | 2012-09-17 |
01-701-1115 | 2013-01-23 | | 2013-01-23 |
01-701-1118 | 2014-09-09 | | 2014-09-09 |
Similarly to dthcaus_source(), the traceability variables can be added by specifying the traceability_vars argument in date_source().
ae_start_date <- date_source(
  dataset_name = "ae",
  date = AESTDT,
  traceability_vars = vars(LALVDOM = "AE", LALVSEQ = AESEQ, LALVVAR = "AESTDTC")
)
ae_end_date <- date_source(
  dataset_name = "ae",
  date = AEENDT,
  traceability_vars = vars(LALVDOM = "AE", LALVSEQ = AESEQ, LALVVAR = "AEENDTC")
)
lb_date <- date_source(
  dataset_name = "lb",
  date = LBDT,
  filter = !is.na(LBDT),
  traceability_vars = vars(LALVDOM = "LB", LALVSEQ = LBSEQ, LALVVAR = "LBDTC")
)
trt_end_date <- date_source(
  dataset_name = "adsl",
  date = TRTEDTM,
  traceability_vars = vars(LALVDOM = "ADSL", LALVSEQ = NA_integer_, LALVVAR = "TRTEDTM")
)
adsl <- adsl %>%
  select(-LSTALVDT) %>% # created in the previous call
  derive_var_extreme_dt(
    new_var = LSTALVDT,
    ae_start_date, ae_end_date, lb_date, trt_end_date,
    source_datasets = list(ae = ae_ext, adsl = adsl, lb = lb_ext),
    mode = "last"
  )
USUBJID | TRTEDT | DTHDTC | LSTALVDT | LALVDOM | LALVSEQ | LALVVAR |
---|---|---|---|---|---|---|
01-701-1015 | 2014-07-02 | | 2014-07-02 | ADSL | NA | TRTEDTM |
01-701-1023 | 2012-09-01 | | 2012-09-02 | LB | 107 | LBDTC |
01-701-1028 | 2014-01-14 | | 2014-01-14 | ADSL | NA | TRTEDTM |
01-701-1033 | 2014-03-31 | | 2014-04-14 | LB | 107 | LBDTC |
01-701-1034 | 2014-12-30 | | 2014-12-30 | ADSL | NA | TRTEDTM |
01-701-1047 | 2013-03-09 | | 2013-04-07 | LB | 134 | LBDTC |
01-701-1097 | 2014-07-09 | | 2014-07-09 | ADSL | NA | TRTEDTM |
01-701-1111 | 2012-09-16 | | 2012-09-17 | LB | 73 | LBDTC |
01-701-1115 | 2013-01-23 | | 2013-01-23 | ADSL | NA | TRTEDTM |
01-701-1118 | 2014-09-09 | | 2014-09-09 | ADSL | NA | TRTEDTM |
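Conceptually, the derivation stacks the candidate dates from all sources and keeps the latest one per subject, together with a label of where it came from. The base R sketch below shows that idea on made-up data; the subjects, dates and the SRC column are hypothetical, and this is not the implementation of derive_var_extreme_dt().

```r
# Hypothetical candidate dates from three sources (all values are made up).
candidates <- rbind(
  data.frame(USUBJID = c("S1", "S2"),
             DT = as.Date(c("2014-06-20", "2012-08-30")), SRC = "AE"),
  data.frame(USUBJID = c("S1", "S2"),
             DT = as.Date(c("2014-07-01", "2012-09-02")), SRC = "LB"),
  data.frame(USUBJID = c("S1", "S2"),
             DT = as.Date(c("2014-07-02", "2012-09-01")), SRC = "ADSL")
)

# Drop missing dates, sort so that the latest date of each subject comes
# first, then keep one record per subject.
candidates <- candidates[!is.na(candidates$DT), ]
candidates <- candidates[order(candidates$USUBJID, -as.numeric(candidates$DT)), ]
lstalv <- candidates[!duplicated(candidates$USUBJID), ]

# Rename to the ADSL variable names; the source label plays the role of a
# traceability variable.
names(lstalv) <- c("USUBJID", "LSTALVDT", "LALVDOM")
print(lstalv)
```

Here S1 takes its last known alive date from the treatment end date and S2 from a later lab date, which mirrors the ADSL/LB pattern in the table above.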
Grouping (e.g. AGEGR1 or REGION1)
Numeric and categorical variables (AGE, RACE, COUNTRY, etc.) may need to be grouped to perform the required analysis. {admiral} does not currently have functionality to assist with all required groupings. Some functions exist for age grouping according to FDA or EMA conventions. For others, the user can create his/her own function to meet his/her study requirement.
To derive AGEGR1 as categorized AGE in < 18, 18-64, >= 65 (FDA convention):
adsl <- adsl %>%
  derive_var_agegr_fda(
    age_var = AGE,
    new_var = AGEGR1
  )
#> Warning: `derive_var_agegr_ema()` was deprecated in admiral 0.8.0.
#> This warning is displayed once every 8 hours.
#> Call `lifecycle::last_lifecycle_warnings()` to see where this warning was generated.
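For readers who want to see the grouping without a dedicated function, the same three categories can be sketched with base R's cut(). This is only an illustration on made-up ages, assuming whole-number ages; the labels "18-64" and ">=65" follow the output table in this article, while "<18" is an assumed spelling.

```r
# Made-up ages, including boundary values and a missing age.
age <- c(17, 18, 64, 65, 85, NA)

# Left-closed intervals: [-Inf, 18), [18, 65), [65, Inf).
agegr1 <- cut(
  age,
  breaks = c(-Inf, 18, 65, Inf),
  labels = c("<18", "18-64", ">=65"),
  right = FALSE
)
print(data.frame(AGE = age, AGEGR1 = as.character(agegr1)))
```

Setting right = FALSE is what puts age 18 in the middle group and age 65 in the upper group; a missing age stays missing.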
However, if, for example,
AGEGR2 would categorize AGE in < 65, >= 65,
REGION1 would categorize COUNTRY in North America, Rest of the World,
the user-defined function(s) would look like:
format_agegr2 <- function(var_input) {
  case_when(
    var_input < 65 ~ "< 65",
    var_input >= 65 ~ ">= 65",
    TRUE ~ NA_character_
  )
}
format_region1 <- function(var_input) {
  case_when(
    var_input %in% c("CAN", "USA") ~ "North America",
    !is.na(var_input) ~ "Rest of the World",
    TRUE ~ "Missing"
  )
}
These functions are then used in a mutate() statement to derive the required grouping variables:
adsl <- adsl %>%
  mutate(
    AGEGR2 = format_agegr2(AGE),
    REGION1 = format_region1(COUNTRY)
  )
USUBJID | AGE | SEX | COUNTRY | AGEGR1 | AGEGR2 | REGION1 |
---|---|---|---|---|---|---|
01-701-1015 | 63 | F | USA | 18-64 | < 65 | North America |
01-701-1023 | 64 | M | USA | 18-64 | < 65 | North America |
01-701-1028 | 71 | M | USA | >=65 | >= 65 | North America |
01-701-1033 | 74 | M | USA | >=65 | >= 65 | North America |
01-701-1034 | 77 | F | USA | >=65 | >= 65 | North America |
01-701-1047 | 85 | F | USA | >=65 | >= 65 | North America |
01-701-1057 | 59 | F | USA | 18-64 | < 65 | North America |
01-701-1097 | 68 | M | USA | >=65 | >= 65 | North America |
01-701-1111 | 81 | F | USA | >=65 | >= 65 | North America |
01-701-1115 | 84 | M | USA | >=65 | >= 65 | North America |
Population Flags (e.g. SAFFL)
Since the population flags are mainly company/study specific, no dedicated functions are provided, but in most cases they can easily be derived using derive_var_merged_exist_flag().
An example of an implementation could be:
adsl <- adsl %>%
  derive_var_merged_exist_flag(
    dataset_add = ex,
    by_vars = vars(STUDYID, USUBJID),
    new_var = SAFFL,
    condition = (EXDOSE > 0 | (EXDOSE == 0 & str_detect(EXTRT, "PLACEBO")))
  )
USUBJID | TRTSDT | ARM | ACTARM | SAFFL |
---|---|---|---|---|
01-701-1015 | 2014-01-02 | Placebo | Placebo | Y |
01-701-1023 | 2012-08-05 | Placebo | Placebo | Y |
01-701-1028 | 2013-07-19 | Xanomeline High Dose | Xanomeline High Dose | Y |
01-701-1033 | 2014-03-18 | Xanomeline Low Dose | Xanomeline Low Dose | Y |
01-701-1034 | 2014-07-01 | Xanomeline High Dose | Xanomeline High Dose | Y |
01-701-1047 | 2013-02-12 | Placebo | Placebo | Y |
01-701-1057 | NA | Screen Failure | Screen Failure | NA |
01-701-1097 | 2014-01-01 | Xanomeline Low Dose | Xanomeline Low Dose | Y |
01-701-1111 | 2012-09-07 | Xanomeline Low Dose | Xanomeline Low Dose | Y |
01-701-1115 | 2012-11-30 | Xanomeline Low Dose | Xanomeline Low Dose | Y |
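The flag logic amounts to asking, per subject, whether at least one exposure record fulfils the condition. A minimal base R sketch on made-up data is shown below; adsl_toy and ex_toy are hypothetical, and this is not the implementation of derive_var_merged_exist_flag().

```r
# Toy stand-ins for ADSL and EX (hypothetical values).
adsl_toy <- data.frame(USUBJID = c("S1", "S2", "S3"))
ex_toy <- data.frame(
  USUBJID = c("S1", "S2", "S2"),
  EXTRT   = c("PLACEBO", "DRUG A", "DRUG A"),
  EXDOSE  = c(0, 0, 54)
)

# Same condition as in the call above: a positive dose, or a zero dose of placebo.
dosed <- ex_toy$EXDOSE > 0 |
  (ex_toy$EXDOSE == 0 & grepl("PLACEBO", ex_toy$EXTRT))

# Subjects with at least one qualifying record get "Y", all others stay missing.
dosed_subjects <- unique(ex_toy$USUBJID[dosed])
adsl_toy$SAFFL <- ifelse(adsl_toy$USUBJID %in% dosed_subjects, "Y", NA_character_)
print(adsl_toy)
```

S3 has no exposure records and therefore keeps a missing flag, like the screen failure subject in the table above.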
The users can add specific code to cover their need for the analysis.
The following functions are helpful for many ADSL derivations:
derive_vars_merged() - Merge Variables from a Dataset to the Input Dataset
derive_var_merged_cat() - Merge a Categorization Variable
derive_var_merged_exist_flag() - Merge an Existence Flag
derive_var_merged_character() - Merge a Character Variable
Adding labels and attributes for SAS transport files is supported by the following packages:
metacore: establish a common foundation for the use of metadata within an R session.
metatools: enable the use of metacore objects. Metatools can be used to build datasets or enhance columns in existing datasets as well as checking datasets against the metadata.
xportr: functionality to associate all metadata information to a local R data frame, perform data set level validation checks and convert into a transport v5 file(xpt).
NOTE: All these packages are in the experimental phase, but the vision is to have them associated with an End to End pipeline under the umbrella of the pharmaverse.
ADaM | Sample Code |
---|---|
ADSL | ad_adsl.R |