Creating a BDS Finding ADaM

Introduction

This article describes how to create a BDS Finding ADaM. The examples are presented and tested in the context of ADVS, but they can be applied to other BDS Finding ADaMs such as ADEG and ADLB, where a single result is captured in an SDTM Findings domain on a single date and/or time.

Note: All examples assume CDISC SDTM and/or ADaM format as input unless otherwise specified.

Programming Workflow

Read in Data

To start, all data frames needed for the creation of ADVS should be read into the environment. This will be a company-specific process. Some of the data frames needed may be VS and ADSL.

For example purposes, the CDISC Pilot SDTM and ADaM datasets, which are included in {admiral.test}, are used.

library(admiral)
library(dplyr)
library(admiral.test)
library(lubridate)
library(stringr)
library(tibble)

data("admiral_adsl")
data("admiral_vs")

adsl <- admiral_adsl
vs <- admiral_vs

vs <- convert_blanks_to_na(vs)

At this step, it may be useful to join ADSL to your VS domain. Only the ADSL variables used for derivations are selected at this step. The rest of the relevant ADSL variables would be added later.


adsl_vars <- vars(TRTSDT, TRTEDT, TRT01A, TRT01P)

advs <- derive_vars_merged(
  vs,
  dataset_add = adsl,
  new_vars = adsl_vars,
  by_vars = vars(STUDYID, USUBJID)
)
USUBJID      VSTESTCD  VSDTC       VISIT   TRTSDT      TRTEDT      TRT01A               TRT01P
01-701-1015  DIABP     2014-01-16  WEEK 2  2014-01-02  2014-07-02  Placebo              Placebo
01-701-1015  DIABP     2014-01-16  WEEK 2  2014-01-02  2014-07-02  Placebo              Placebo
01-701-1015  DIABP     2014-01-16  WEEK 2  2014-01-02  2014-07-02  Placebo              Placebo
01-701-1023  DIABP     2012-08-27  WEEK 2  2012-08-05  2012-09-01  Placebo              Placebo
01-701-1023  DIABP     2012-08-27  WEEK 2  2012-08-05  2012-09-01  Placebo              Placebo
01-701-1023  DIABP     2012-08-27  WEEK 2  2012-08-05  2012-09-01  Placebo              Placebo
01-703-1086  DIABP     2012-09-16  WEEK 2  2012-09-02  2012-12-04  Xanomeline Low Dose  Xanomeline Low Dose
01-703-1086  DIABP     2012-09-16  WEEK 2  2012-09-02  2012-12-04  Xanomeline Low Dose  Xanomeline Low Dose
01-703-1086  DIABP     2012-09-16  WEEK 2  2012-09-02  2012-12-04  Xanomeline Low Dose  Xanomeline Low Dose
01-703-1096  DIABP     2013-02-09  WEEK 2  2013-01-25  2013-03-16  Placebo              Placebo

Derive/Impute Numeric Date/Time and Analysis Day (ADT, ADTM, ADY, ADTF, ATMF)

The function derive_vars_dt() can be used to derive ADT. This function allows the user to impute the date as well.

Example calls:

advs <- derive_vars_dt(advs, new_vars_prefix = "A", dtc = VSDTC)
USUBJID VISIT VSDTC ADT
01-701-1015 SCREENING 1 2013-12-26 2013-12-26
01-701-1015 SCREENING 1 2013-12-26 2013-12-26
01-701-1015 SCREENING 1 2013-12-26 2013-12-26
01-701-1015 SCREENING 2 2013-12-31 2013-12-31
01-701-1015 SCREENING 2 2013-12-31 2013-12-31
01-701-1015 SCREENING 2 2013-12-31 2013-12-31
01-701-1015 BASELINE 2014-01-02 2014-01-02
01-701-1015 BASELINE 2014-01-02 2014-01-02
01-701-1015 BASELINE 2014-01-02 2014-01-02
01-701-1015 AMBUL ECG PLACEMENT 2014-01-14 2014-01-14

If imputation is needed and the date is to be imputed to the first of the month, the call would be:

advs <- derive_vars_dt(
  advs,
  new_vars_prefix = "A",
  dtc = VSDTC,
  highest_imputation = "M"
)
USUBJID VISIT VSDTC ADT ADTF
01-716-1024 SCREENING 1 2012-07 2012-07-01 D
01-716-1024 SCREENING 1 2012-07 2012-07-01 D
01-716-1024 SCREENING 1 2012-07 2012-07-01 D
01-716-1024 SCREENING 2 2012-07-07 2012-07-07 NA
01-716-1024 SCREENING 2 2012-07-07 2012-07-07 NA
01-716-1024 SCREENING 2 2012-07-07 2012-07-07 NA
01-716-1024 BASELINE 2012-07-09 2012-07-09 NA
01-716-1024 BASELINE 2012-07-09 2012-07-09 NA
01-716-1024 BASELINE 2012-07-09 2012-07-09 NA
01-716-1024 AMBUL ECG PLACEMENT 2012-07-22 2012-07-22 NA

Similarly, ADTM may be created using the function derive_vars_dtm(). Imputation may be done on both the date and time components of ADTM.

# CDISC Pilot data does not contain times and the output of the derivation
# ADTM is not presented.
advs <- derive_vars_dtm(
  advs,
  new_vars_prefix = "A",
  dtc = VSDTC,
  highest_imputation = "M"
)

By default, the variable ADTF for derive_vars_dt() or ADTF and ATMF for derive_vars_dtm() will be created and populated with the controlled terminology outlined in the ADaM IG for date imputations.

See also Date and Time Imputation.

Once ADT is derived, the function derive_vars_dy() can be used to derive ADY. This example assumes both ADT and TRTSDT exist on the data frame.

advs <-
  derive_vars_dy(advs, reference_date = TRTSDT, source_vars = vars(ADT))
USUBJID VISIT ADT ADY TRTSDT
01-716-1024 SCREENING 1 2012-07-06 -3 2012-07-09
01-716-1024 SCREENING 1 2012-07-06 -3 2012-07-09
01-716-1024 SCREENING 1 2012-07-06 -3 2012-07-09
01-716-1024 SCREENING 2 2012-07-07 -2 2012-07-09
01-716-1024 SCREENING 2 2012-07-07 -2 2012-07-09
01-716-1024 SCREENING 2 2012-07-07 -2 2012-07-09
01-716-1024 BASELINE 2012-07-09 1 2012-07-09
01-716-1024 BASELINE 2012-07-09 1 2012-07-09
01-716-1024 BASELINE 2012-07-09 1 2012-07-09
01-716-1024 AMBUL ECG PLACEMENT 2012-07-22 14 2012-07-09
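
The study day convention behind these values can be sketched in base R. This is a minimal illustration and not the implementation of derive_vars_dy(); the helper name study_day() is hypothetical. It assumes the usual convention that there is no day 0: dates on or after the reference date get the difference in days plus one, and earlier dates get the plain (negative) difference.

```r
# Hypothetical helper (not part of {admiral}): study day with no day 0.
study_day <- function(date, reference_date) {
  diff_days <- as.integer(as.Date(date) - as.Date(reference_date))
  # On or after the reference date: add 1 so the reference date itself is day 1.
  ifelse(diff_days >= 0L, diff_days + 1L, diff_days)
}

adt <- c("2012-07-06", "2012-07-07", "2012-07-09", "2012-07-22")
ady <- study_day(adt, reference_date = "2012-07-09")
ady
```

For these four dates the sketch gives -3, -2, 1 and 14, which agrees with the ADY column above.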

Assign PARAMCD, PARAM, PARAMN, PARCAT1

To assign parameter level values such as PARAMCD, PARAM, PARAMN, PARCAT1, etc., a lookup can be created to join to the source data.

For example, when creating ADVS, a lookup based on the SDTM --TESTCD value may be created:

VSTESTCD  PARAMCD  PARAM                            PARAMN  PARCAT1                 PARCAT1N
HEIGHT    HEIGHT   Height (cm)                      1       Subject Characteristic  1
WEIGHT    WEIGHT   Weight (kg)                      2       Subject Characteristic  1
DIABP     DIABP    Diastolic Blood Pressure (mmHg)  3       Vital Sign              2
MAP       MAP      Mean Arterial Pressure           4       Vital Sign              2
PULSE     PULSE    Pulse Rate (beats/min)           5       Vital Sign              2
SYSBP     SYSBP    Systolic Blood Pressure (mmHg)   6       Vital Sign              2
TEMP      TEMP     Temperature (C)                  7       Vital Sign              2
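
The code that builds this lookup is not shown in the article. One possible way to construct it, sketched here in base R, is given below. The object name param_lookup matches the name used in the merge calls in this article; using data.frame() rather than a tibble is a choice made only to keep the sketch free of package dependencies.

```r
# One possible construction of the lookup table shown above (a sketch).
param_lookup <- data.frame(
  VSTESTCD = c("HEIGHT", "WEIGHT", "DIABP", "MAP", "PULSE", "SYSBP", "TEMP"),
  PARAMCD = c("HEIGHT", "WEIGHT", "DIABP", "MAP", "PULSE", "SYSBP", "TEMP"),
  PARAM = c(
    "Height (cm)",
    "Weight (kg)",
    "Diastolic Blood Pressure (mmHg)",
    "Mean Arterial Pressure",
    "Pulse Rate (beats/min)",
    "Systolic Blood Pressure (mmHg)",
    "Temperature (C)"
  ),
  PARAMN = 1:7,
  PARCAT1 = c(rep("Subject Characteristic", 2), rep("Vital Sign", 5)),
  PARCAT1N = c(1, 1, 2, 2, 2, 2, 2),
  stringsAsFactors = FALSE
)

# Each source test code should map to exactly one PARAMCD.
stopifnot(!anyDuplicated(param_lookup$VSTESTCD))
```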

This lookup may now be joined to the source data:

At this stage, only PARAMCD is required to perform the derivations. Additional derived parameters may be added, so only PARAMCD is joined to the datasets at this point. All other variables related to PARAMCD (e.g. PARAM, PARCAT1, …) will be added once all PARAMCD values have been derived.

advs <- derive_vars_merged(
  advs,
  dataset_add = param_lookup,
  new_vars = vars(PARAMCD),
  by_vars = vars(VSTESTCD)
)
USUBJID VSTESTCD PARAMCD
01-701-1015 DIABP DIABP
01-701-1015 HEIGHT HEIGHT
01-701-1015 PULSE PULSE
01-701-1015 SYSBP SYSBP
01-701-1015 TEMP TEMP
01-701-1015 WEIGHT WEIGHT
01-701-1023 DIABP DIABP
01-701-1023 HEIGHT HEIGHT
01-701-1023 PULSE PULSE
01-701-1023 SYSBP SYSBP

Please note that it may be necessary to include other variables in the join. For example, if PARAMCD is based on both VSTESTCD and VSPOS, it may be necessary to expand this lookup or create a separate lookup for PARAMCD.
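
A join on more than one key can be sketched in base R as follows. All data in this sketch are hypothetical: the VSPOS values, the PARAMCD values (SYSBPSUP, SYSBPSTD, ...) and the object names are assumptions made for illustration only.

```r
# Hypothetical source records: the same test measured in different positions.
vs_pos <- data.frame(
  USUBJID = c("P01", "P01", "P01"),
  VSTESTCD = c("SYSBP", "SYSBP", "DIABP"),
  VSPOS = c("SUPINE", "STANDING", "SUPINE"),
  stringsAsFactors = FALSE
)

# Hypothetical lookup keyed on both VSTESTCD and VSPOS.
pos_lookup <- data.frame(
  VSTESTCD = c("SYSBP", "SYSBP", "DIABP", "DIABP"),
  VSPOS = c("SUPINE", "STANDING", "SUPINE", "STANDING"),
  PARAMCD = c("SYSBPSUP", "SYSBPSTD", "DIABPSUP", "DIABPSTD"),
  stringsAsFactors = FALSE
)

# Left join on both keys so that every source record keeps one PARAMCD.
vs_pos_param <- merge(
  vs_pos, pos_lookup,
  by = c("VSTESTCD", "VSPOS"),
  all.x = TRUE
)
vs_pos_param
```

With derive_vars_merged(), the corresponding change would presumably be to pass both variables in by_vars, e.g. by_vars = vars(VSTESTCD, VSPOS).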

Derive Results (AVAL, AVALC)

The mapping of AVAL and AVALC is left to the ADaM programmer. An example mapping may be:

advs <- mutate(
  advs,
  AVAL = VSSTRESN,
  AVALC = VSSTRESC
)
VSTESTCD PARAMCD VSSTRESN VSSTRESC AVAL AVALC
DIABP DIABP 74 74 74 74
DIABP DIABP 74 74 74 74
DIABP DIABP 72 72 72 72
DIABP DIABP 78 78 78 78
DIABP DIABP 84 84 84 84
DIABP DIABP 78 78 78 78
DIABP DIABP 80 80 80 80
DIABP DIABP 80 80 80 80
DIABP DIABP 84 84 84 84
DIABP DIABP 90 90 90 90

Derive Additional Parameters (e.g. BSA, BMI or MAP for ADVS)

Optionally derive new parameters creating PARAMCD and AVAL. Note that only variables specified in the by_vars argument will be populated in the newly created records. This is relevant to the functions derive_param_map(), derive_param_bsa(), derive_param_bmi(), and derive_param_qtc().

Below is an example of creating Mean Arterial Pressure for ADVS. See also Example 3 in the section Derive New Rows below for an alternative way of creating new parameters.

advs <- derive_param_map(
  advs,
  by_vars = vars(STUDYID, USUBJID, !!!adsl_vars, VISIT, VISITNUM, ADT, ADY, VSTPT, VSTPTNUM),
  set_values_to = vars(PARAMCD = "MAP"),
  get_unit_expr = VSSTRESU,
  filter = VSSTAT != "NOT DONE" | is.na(VSSTAT)
)
VSTESTCD PARAMCD VISIT VSTPT AVAL AVALC
DIABP DIABP SCREENING 1 AFTER LYING DOWN FOR 5 MINUTES 64.00000 64
NA MAP SCREENING 1 AFTER LYING DOWN FOR 5 MINUTES 86.33333 NA
SYSBP SYSBP SCREENING 1 AFTER LYING DOWN FOR 5 MINUTES 131.00000 131
DIABP DIABP SCREENING 1 AFTER STANDING FOR 1 MINUTE 83.00000 83
NA MAP SCREENING 1 AFTER STANDING FOR 1 MINUTE 98.33333 NA
SYSBP SYSBP SCREENING 1 AFTER STANDING FOR 1 MINUTE 129.00000 129
DIABP DIABP SCREENING 1 AFTER STANDING FOR 3 MINUTES 57.00000 57
NA MAP SCREENING 1 AFTER STANDING FOR 3 MINUTES 87.00000 NA
SYSBP SYSBP SCREENING 1 AFTER STANDING FOR 3 MINUTES 147.00000 147
DIABP DIABP SCREENING 2 AFTER LYING DOWN FOR 5 MINUTES 68.00000 68

Likewise, the function calls below create the parameters Body Surface Area and Body Mass Index for ADVS.

advs <- derive_param_bsa(
  advs,
  by_vars = vars(STUDYID, USUBJID, !!!adsl_vars, VISIT, VISITNUM, ADT, ADY, VSTPT, VSTPTNUM),
  method = "Mosteller",
  set_values_to = vars(PARAMCD = "BSA"),
  get_unit_expr = VSSTRESU,
  filter = VSSTAT != "NOT DONE" | is.na(VSSTAT)
)

advs <- derive_param_bmi(
  advs,
  by_vars = vars(STUDYID, USUBJID, !!!adsl_vars, VISIT, VISITNUM, ADT, ADY, VSTPT, VSTPTNUM),
  set_values_to = vars(PARAMCD = "BMI"),
  get_unit_expr = VSSTRESU,
  filter = VSSTAT != "NOT DONE" | is.na(VSSTAT)
)
USUBJID VSTESTCD PARAMCD VISIT VSTPT AVAL AVALC
01-701-1015 NA BMI SCREENING 1 NA 24.871928 NA
01-701-1015 NA BSA SCREENING 1 NA 1.486264 NA
01-701-1023 NA BMI SCREENING 1 NA 29.694517 NA
01-701-1023 NA BSA SCREENING 1 NA 1.882381 NA
01-703-1086 NA BMI SCREENING 1 NA 24.665676 NA
01-703-1086 NA BSA SCREENING 1 NA 2.264029 NA
01-703-1096 NA BMI SCREENING 1 NA 31.886559 NA
01-703-1096 NA BSA SCREENING 1 NA 1.905083 NA
01-707-1037 NA BMI SCREENING 1 NA 23.826992 NA
01-707-1037 NA BSA SCREENING 1 NA 1.530597 NA
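
The formulas behind these two parameters can be sketched in base R. This is an illustration under stated assumptions and not the {admiral} implementation: the helper names are hypothetical, BMI is taken as weight (kg) divided by squared height (m), and BSA uses the Mosteller formula sqrt(height (cm) * weight (kg) / 3600). The height of 147.32 cm is the value listed for subject 01-701-1015 in the categorization section of this article; the weight of 53.98 kg is an assumed input.

```r
# Hypothetical helpers (not part of {admiral}); height in cm, weight in kg.
bmi <- function(height_cm, weight_kg) {
  weight_kg / (height_cm / 100)^2
}

bsa_mosteller <- function(height_cm, weight_kg) {
  sqrt(height_cm * weight_kg / 3600)
}

# Height as listed for subject 01-701-1015; the weight is an assumed input.
bmi_1015 <- bmi(height_cm = 147.32, weight_kg = 53.98)
bsa_1015 <- bsa_mosteller(height_cm = 147.32, weight_kg = 53.98)
c(BMI = bmi_1015, BSA = bsa_1015)
```

With these inputs the sketch gives roughly 24.87 for BMI and 1.486 for BSA, close to the values shown for that subject in the table above.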

Similarly, for ADEG, the parameters QTCBF, QTCBS and QTCL can be created with a function call. See the example below, which sets PARAMCD = QTCFR.


adeg <- tibble::tribble(
  ~USUBJID, ~EGSTRESU, ~PARAMCD, ~AVAL, ~VISIT,
  "P01", "msec", "QT", 350, "CYCLE 1 DAY 1",
  "P01", "msec", "QT", 370, "CYCLE 2 DAY 1",
  "P01", "msec", "RR", 842, "CYCLE 1 DAY 1",
  "P01", "msec", "RR", 710, "CYCLE 2 DAY 1"
)

adeg <- derive_param_qtc(
  adeg,
  by_vars = vars(USUBJID, VISIT),
  method = "Fridericia",
  set_values_to = vars(PARAMCD = "QTCFR"),
  get_unit_expr = EGSTRESU
)
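
The correction applied by method = "Fridericia" can be sketched in base R. This sketch assumes the standard Fridericia formula, QT divided by the cube root of RR in seconds; the helper name is hypothetical and the code is not the {admiral} implementation.

```r
# Hypothetical helper (not part of {admiral}): Fridericia-corrected QT.
# Both inputs are in milliseconds; RR is converted to seconds first.
qtc_fridericia <- function(qt_msec, rr_msec) {
  qt_msec / (rr_msec / 1000)^(1 / 3)
}

# QT and RR values from the example data above, paired by visit.
qtcf <- qtc_fridericia(qt_msec = c(350, 370), rr_msec = c(842, 710))
qtcf
```

Because both RR intervals are shorter than one second, the corrected values are larger than the uncorrected QT values.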

Similarly, for ADLB, the function derive_param_wbc_abs() can be used to create a new parameter for lab differentials converted to absolute values. See the example below:

adlb <- tibble::tribble(
  ~USUBJID, ~PARAMCD, ~AVAL, ~PARAM, ~VISIT,
  "P01", "WBC", 33, "Leukocyte Count (10^9/L)", "CYCLE 1 DAY 1",
  "P01", "WBC", 38, "Leukocyte Count (10^9/L)", "CYCLE 2 DAY 1",
  "P01", "LYMLE", 0.90, "Lymphocytes (fraction of 1)", "CYCLE 1 DAY 1",
  "P01", "LYMLE", 0.70, "Lymphocytes (fraction of 1)", "CYCLE 2 DAY 1"
)

derive_param_wbc_abs(
  dataset = adlb,
  by_vars = vars(USUBJID, VISIT),
  set_values_to = vars(
    PARAMCD = "LYMPH",
    PARAM = "Lymphocytes Abs (10^9/L)",
    DTYPE = "CALCULATION"
  ),
  get_unit_expr = extract_unit(PARAM),
  wbc_code = "WBC",
  diff_code = "LYMLE",
  diff_type = "fraction"
)

When all PARAMCD values have been derived and added to the dataset, the other information from the lookup table (PARAM, PARCAT1, …) should be added.


# Derive PARAM, PARAMN, PARCAT1 and PARCAT1N from the lookup
advs <- derive_vars_merged(
  advs,
  dataset_add = select(param_lookup, -VSTESTCD),
  by_vars = vars(PARAMCD)
)
VSTESTCD PARAMCD PARAM PARAMN PARCAT1 PARCAT1N
DIABP DIABP Diastolic Blood Pressure (mmHg) 3 Vital Sign 2
DIABP DIABP Diastolic Blood Pressure (mmHg) 3 Vital Sign 2
DIABP DIABP Diastolic Blood Pressure (mmHg) 3 Vital Sign 2
DIABP DIABP Diastolic Blood Pressure (mmHg) 3 Vital Sign 2
DIABP DIABP Diastolic Blood Pressure (mmHg) 3 Vital Sign 2
DIABP DIABP Diastolic Blood Pressure (mmHg) 3 Vital Sign 2
DIABP DIABP Diastolic Blood Pressure (mmHg) 3 Vital Sign 2
DIABP DIABP Diastolic Blood Pressure (mmHg) 3 Vital Sign 2
DIABP DIABP Diastolic Blood Pressure (mmHg) 3 Vital Sign 2
DIABP DIABP Diastolic Blood Pressure (mmHg) 3 Vital Sign 2

Derive Timing Variables (e.g. APHASE, AVISIT, APERIOD)

Categorical timing variables are protocol and analysis dependent. Below is a simple example.

advs <- advs %>%
  mutate(
    AVISIT = case_when(
      str_detect(VISIT, "SCREEN") ~ NA_character_,
      str_detect(VISIT, "UNSCHED") ~ NA_character_,
      str_detect(VISIT, "RETRIEVAL") ~ NA_character_,
      str_detect(VISIT, "AMBUL") ~ NA_character_,
      !is.na(VISIT) ~ str_to_title(VISIT)
    ),
    AVISITN = as.numeric(case_when(
      VISIT == "BASELINE" ~ "0",
      str_detect(VISIT, "WEEK") ~ str_trim(str_replace(VISIT, "WEEK", ""))
    )),
    ATPT = VSTPT,
    ATPTN = VSTPTNUM
  )


count(advs, VISITNUM, VISIT, AVISITN, AVISIT)
#> # A tibble: 15 x 5
#>    VISITNUM VISIT               AVISITN AVISIT       n
#>       <dbl> <chr>                 <dbl> <chr>    <int>
#>  1      1   SCREENING 1              NA <NA>       102
#>  2      2   SCREENING 2              NA <NA>        78
#>  3      3   BASELINE                  0 Baseline    84
#>  4      3.5 AMBUL ECG PLACEMENT      NA <NA>        65
#>  5      4   WEEK 2                    2 Week 2      84
#>  6      5   WEEK 4                    4 Week 4      70
#>  7      6   AMBUL ECG REMOVAL        NA <NA>        52
#>  8      7   WEEK 6                    6 Week 6      42
#>  9      8   WEEK 8                    8 Week 8      42
#> 10      9   WEEK 12                  12 Week 12     42
#> 11     10   WEEK 16                  16 Week 16     42
#> 12     11   WEEK 20                  20 Week 20     28
#> 13     12   WEEK 24                  24 Week 24     28
#> 14     13   WEEK 26                  26 Week 26     28
#> 15    201   RETRIEVAL                NA <NA>        26

count(advs, VSTPTNUM, VSTPT, ATPTN, ATPT)
#> # A tibble: 4 x 5
#>   VSTPTNUM VSTPT                        ATPTN ATPT                             n
#>      <dbl> <chr>                        <dbl> <chr>                        <int>
#> 1      815 AFTER LYING DOWN FOR 5 MINU…   815 AFTER LYING DOWN FOR 5 MINU…   232
#> 2      816 AFTER STANDING FOR 1 MINUTE    816 AFTER STANDING FOR 1 MINUTE    232
#> 3      817 AFTER STANDING FOR 3 MINUTES   817 AFTER STANDING FOR 3 MINUTES   232
#> 4       NA <NA>                            NA <NA>                           117

Timing Flag Variables (e.g. ONTRTFL)

In some analyses, it may be necessary to flag an observation as on-treatment. The admiral function derive_var_ontrtfl() can be used.

For example, if on-treatment is defined as any observation between treatment start and treatment end, the flag may be derived as:

advs <- derive_var_ontrtfl(
  advs,
  start_date = ADT,
  ref_start_date = TRTSDT,
  ref_end_date = TRTEDT
)
USUBJID PARAMCD ADT TRTSDT TRTEDT ONTRTFL
01-701-1015 DIABP 2014-01-16 2014-01-02 2014-07-02 Y
01-701-1015 DIABP 2014-01-16 2014-01-02 2014-07-02 Y
01-701-1015 DIABP 2014-01-16 2014-01-02 2014-07-02 Y
01-701-1023 DIABP 2012-08-27 2012-08-05 2012-09-01 Y
01-701-1023 DIABP 2012-08-27 2012-08-05 2012-09-01 Y
01-701-1023 DIABP 2012-08-27 2012-08-05 2012-09-01 Y
01-703-1086 DIABP 2012-09-16 2012-09-02 2012-12-04 Y
01-703-1086 DIABP 2012-09-16 2012-09-02 2012-12-04 Y
01-703-1086 DIABP 2012-09-16 2012-09-02 2012-12-04 Y
01-703-1096 DIABP 2013-02-09 2013-01-25 2013-03-16 Y

This function returns the original data frame with the column ONTRTFL added. Additionally, this function can apply a window to ref_end_date. For example, if on-treatment is defined as between treatment start and treatment end plus 60 days, the call would be:

advs <- derive_var_ontrtfl(
  advs,
  start_date = ADT,
  ref_start_date = TRTSDT,
  ref_end_date = TRTEDT,
  ref_end_window = 60
)
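
The basic rule, with and without an end window, can be sketched in base R. This is a simplified illustration and not the logic of derive_var_ontrtfl(), which handles further cases (for example missing reference dates) that are ignored here. The helper name and the example dates are hypothetical.

```r
# Hypothetical helper (not part of {admiral}): simplified on-treatment flag.
on_treatment <- function(adt, trtsdt, trtedt, end_window = 0) {
  adt <- as.Date(adt)
  trtsdt <- as.Date(trtsdt)
  trtedt <- as.Date(trtedt)
  in_window <- adt >= trtsdt & adt <= trtedt + end_window
  # "Y" inside the window, missing otherwise (including missing dates).
  ifelse(!is.na(in_window) & in_window, "Y", NA_character_)
}

adt <- c("2013-12-26", "2014-01-16", "2014-08-15")
flag_no_window <- on_treatment(adt, "2014-01-02", "2014-07-02")
flag_60_days <- on_treatment(adt, "2014-01-02", "2014-07-02", end_window = 60)
data.frame(adt, flag_no_window, flag_60_days)
```

The third date falls after treatment end but within 60 days of it, so it is flagged only when the window is applied.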

In addition, the function allows you to exclude pre-treatment observations that occurred on the treatment start date. For example, if observations with VSTPT == "PRE" should not be considered on-treatment when the observation date equals the treatment start date, this can be specified using the filter_pre_timepoint parameter. The call below uses the ATPT value "AFTER LYING DOWN FOR 5 MINUTES" as the filter condition for illustration:

advs <- derive_var_ontrtfl(
  advs,
  start_date = ADT,
  ref_start_date = TRTSDT,
  ref_end_date = TRTEDT,
  filter_pre_timepoint = ATPT == "AFTER LYING DOWN FOR 5 MINUTES"
)

Lastly, the function allows you to create any on-treatment flag based on the analysis needs. For example, if the variable ONTR01FL is needed to show the on-treatment flag during Period 01, set new_var = ONTR01FL. In addition, for Period 01 Start Date and Period 01 End Date, set ref_start_date = AP01SDT and ref_end_date = AP01EDT. The example assumes that ASTDT, AENDT, AP01SDT and AP01EDT exist on the data frame.

advs <- derive_var_ontrtfl(
  advs,
  new_var = ONTR01FL,
  start_date = ASTDT,
  end_date = AENDT,
  ref_start_date = AP01SDT,
  ref_end_date = AP01EDT,
  span_period = "Y"
)
USUBJID ASTDT AENDT AP01SDT AP01EDT ONTR01FL
P01 2020-03-15 2020-12-01 2020-01-01 2020-03-01 NA
P02 2019-04-30 2020-03-15 2020-01-01 2020-03-01 Y
P03 2019-04-30 NA 2020-01-01 2020-03-01 Y

Assign Reference Range Indicator (ANRIND)

The admiral function derive_var_anrind() may be used to derive the reference range indicator ANRIND.

This function requires the reference range boundaries to exist on the data frame (ANRLO, ANRHI) and also accommodates the additional boundaries A1LO and A1HI.

The function is called as:

advs <- derive_var_anrind(advs)
USUBJID PARAMCD AVAL ANRLO ANRHI A1LO A1HI ANRIND
01-701-1015 DIABP 56 60 80 40 90 LOW
01-701-1015 DIABP 50 60 80 40 90 LOW
01-701-1015 DIABP 54 60 80 40 90 LOW
01-701-1023 DIABP 88 60 80 40 90 HIGH
01-701-1023 DIABP 86 60 80 40 90 HIGH
01-701-1023 DIABP 90 60 80 40 90 HIGH
01-703-1086 DIABP 68 60 80 40 90 NORMAL
01-703-1086 DIABP 74 60 80 40 90 NORMAL
01-703-1086 DIABP 70 60 80 40 90 NORMAL
01-703-1096 DIABP 74 60 80 40 90 NORMAL
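
The classification can be sketched in base R as follows. This is an illustration and not the implementation of derive_var_anrind(): the helper name is hypothetical, missing inputs are not handled, and the labels "LOW LOW" and "HIGH HIGH" for values outside A1LO and A1HI are an assumption about how the additional boundaries are reported.

```r
# Hypothetical helper (not part of {admiral}); assumes no missing inputs.
range_indicator <- function(aval, anrlo, anrhi, a1lo, a1hi) {
  ifelse(aval < a1lo, "LOW LOW",
    ifelse(aval < anrlo, "LOW",
      ifelse(aval > a1hi, "HIGH HIGH",
        ifelse(aval > anrhi, "HIGH", "NORMAL")
      )
    )
  )
}

# The first four values appear in the table above; 35 and 95 are added
# to exercise the outer boundaries.
anrind <- range_indicator(
  aval = c(56, 88, 90, 68, 35, 95),
  anrlo = 60, anrhi = 80, a1lo = 40, a1hi = 90
)
anrind
```

For the values taken from the table (56, 88, 90 and 68) the sketch returns LOW, HIGH, HIGH and NORMAL, matching the ANRIND column.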

Derive Baseline (BASETYPE, ABLFL, BASE, BASEC, BNRIND)

The BASETYPE should be derived using the function derive_var_basetype(). The basetypes parameter of this function requires a named list of expressions detailing how the BASETYPE should be assigned. Note that if a record satisfies more than one of the expressions in basetypes, a row will be produced for each matching BASETYPE.

advs <- derive_var_basetype(
  dataset = advs,
  basetypes = rlang::exprs(
    "LAST: AFTER LYING DOWN FOR 5 MINUTES" = ATPTN == 815,
    "LAST: AFTER STANDING FOR 1 MINUTE" = ATPTN == 816,
    "LAST: AFTER STANDING FOR 3 MINUTES" = ATPTN == 817,
    "LAST" = is.na(ATPTN)
  )
)

count(advs, ATPT, ATPTN, BASETYPE)
#> # A tibble: 4 x 4
#>   ATPT                           ATPTN BASETYPE                                n
#>   <chr>                          <dbl> <chr>                               <int>
#> 1 AFTER LYING DOWN FOR 5 MINUTES   815 LAST: AFTER LYING DOWN FOR 5 MINUT…   232
#> 2 AFTER STANDING FOR 1 MINUTE      816 LAST: AFTER STANDING FOR 1 MINUTE     232
#> 3 AFTER STANDING FOR 3 MINUTES     817 LAST: AFTER STANDING FOR 3 MINUTES    232
#> 4 <NA>                              NA LAST                                  117

It is important to derive BASETYPE first so that it can be utilized in subsequent derivations. This will be important if the data frame contains multiple values for BASETYPE.

Next, the analysis baseline flag ABLFL can be derived using the {admiral} function derive_var_extreme_flag(). For example, if baseline is defined as the last non-missing AVAL prior to or on TRTSDT, the function call for ABLFL would be:

advs <- restrict_derivation(
  advs,
  derivation = derive_var_extreme_flag,
  args = params(
    by_vars = vars(STUDYID, USUBJID, BASETYPE, PARAMCD),
    order = vars(ADT, ATPTN, VISITNUM),
    new_var = ABLFL,
    mode = "last"
  ),
  filter = (!is.na(AVAL) & ADT <= TRTSDT & !is.na(BASETYPE))
)
USUBJID BASETYPE PARAMCD ADT TRTSDT ATPTN ABLFL
01-701-1015 LAST: AFTER LYING DOWN FOR 5 MINUTES DIABP 2014-01-02 2014-01-02 815 Y
01-701-1015 LAST: AFTER STANDING FOR 1 MINUTE DIABP 2014-01-02 2014-01-02 816 Y
01-701-1015 LAST: AFTER STANDING FOR 3 MINUTES DIABP 2014-01-02 2014-01-02 817 Y
01-701-1023 LAST: AFTER LYING DOWN FOR 5 MINUTES DIABP 2012-08-05 2012-08-05 815 Y
01-701-1023 LAST: AFTER STANDING FOR 1 MINUTE DIABP 2012-08-05 2012-08-05 816 Y
01-701-1023 LAST: AFTER STANDING FOR 3 MINUTES DIABP 2012-08-05 2012-08-05 817 Y
01-703-1086 LAST: AFTER LYING DOWN FOR 5 MINUTES DIABP 2012-09-02 2012-09-02 815 Y
01-703-1086 LAST: AFTER STANDING FOR 1 MINUTE DIABP 2012-09-02 2012-09-02 816 Y
01-703-1086 LAST: AFTER STANDING FOR 3 MINUTES DIABP 2012-09-02 2012-09-02 817 Y
01-703-1096 LAST: AFTER LYING DOWN FOR 5 MINUTES DIABP 2013-01-25 2013-01-25 815 Y
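
The rule "last non-missing AVAL prior to or on TRTSDT" can be sketched in base R. The data and object names below are hypothetical, and the grouping is reduced to USUBJID to keep the sketch short; it is not the implementation of derive_var_extreme_flag().

```r
# Hypothetical records for two subjects and a single parameter.
obs <- data.frame(
  USUBJID = c("P01", "P01", "P01", "P02", "P02", "P02"),
  ADT = as.Date(c(
    "2013-12-26", "2014-01-02", "2014-01-16",
    "2012-08-01", "2012-08-05", "2012-08-27"
  )),
  AVAL = c(64, 56, 50, 84, NA, 88),
  TRTSDT = as.Date(c(rep("2014-01-02", 3), rep("2012-08-05", 3))),
  stringsAsFactors = FALSE
)

# Sort so that "last" means the latest date within each subject.
obs <- obs[order(obs$USUBJID, obs$ADT), ]

# Candidate baseline records: non-missing result on or before treatment start.
eligible <- !is.na(obs$AVAL) & obs$ADT <= obs$TRTSDT

# Row position of the last eligible record per subject.
last_rows <- tapply(which(eligible), obs$USUBJID[eligible], max)

obs$ABLFL <- NA_character_
obs$ABLFL[as.vector(last_rows)] <- "Y"
obs
```

For subject P02 the record on the treatment start date has a missing AVAL, so the earlier record is flagged instead.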

Note: Additional examples of the derive_var_extreme_flag() function can be found in the section Derive Analysis Flags (e.g. ANL01FL) below.

Lastly, the BASE, BASEC and BNRIND columns can be derived using the {admiral} function derive_var_base(). Example calls are:

advs <- derive_var_base(
  advs,
  by_vars = vars(STUDYID, USUBJID, PARAMCD, BASETYPE),
  source_var = AVAL,
  new_var = BASE
)

advs <- derive_var_base(
  advs,
  by_vars = vars(STUDYID, USUBJID, PARAMCD, BASETYPE),
  source_var = AVALC,
  new_var = BASEC
)

advs <- derive_var_base(
  advs,
  by_vars = vars(STUDYID, USUBJID, PARAMCD, BASETYPE),
  source_var = ANRIND,
  new_var = BNRIND
)
USUBJID BASETYPE PARAMCD ABLFL BASE BASEC ANRIND BNRIND
01-701-1015 LAST: AFTER LYING DOWN FOR 5 MINUTES DIABP Y 56 56 LOW LOW
01-701-1015 LAST: AFTER STANDING FOR 1 MINUTE DIABP Y 51 51 LOW LOW
01-701-1015 LAST: AFTER STANDING FOR 3 MINUTES DIABP Y 61 61 NORMAL NORMAL
01-701-1023 LAST: AFTER LYING DOWN FOR 5 MINUTES DIABP Y 84 84 HIGH HIGH
01-701-1023 LAST: AFTER STANDING FOR 1 MINUTE DIABP Y 86 86 HIGH HIGH
01-701-1023 LAST: AFTER STANDING FOR 3 MINUTES DIABP Y 88 88 HIGH HIGH
01-703-1086 LAST: AFTER LYING DOWN FOR 5 MINUTES DIABP Y 80 80 NORMAL NORMAL
01-703-1086 LAST: AFTER STANDING FOR 1 MINUTE DIABP Y 82 82 HIGH HIGH
01-703-1086 LAST: AFTER STANDING FOR 3 MINUTES DIABP Y 72 72 NORMAL NORMAL
01-703-1096 LAST: AFTER LYING DOWN FOR 5 MINUTES DIABP Y 70 70 NORMAL NORMAL

Derive Change from Baseline (CHG, PCHG)

Change and percent change from baseline can be derived using the {admiral} functions derive_var_chg() and derive_var_pchg(). These functions expect AVAL and BASE to exist in the data frame. CHG is simply AVAL - BASE, and PCHG is (AVAL - BASE) / abs(BASE) * 100. Example calls are:

advs <- derive_var_chg(advs)

advs <- derive_var_pchg(advs)
USUBJID VISIT BASE AVAL CHG PCHG
01-701-1015 WEEK 2 56 56 0 0.000000
01-701-1015 WEEK 8 56 67 11 19.642857
01-701-1023 WEEK 2 84 88 4 4.761905
01-703-1086 WEEK 2 80 68 -12 -15.000000
01-703-1086 WEEK 8 80 80 0 0.000000
01-703-1096 WEEK 2 70 74 4 5.714286
01-707-1037 WEEK 2 88 72 -16 -18.181818
01-716-1024 WEEK 2 80 86 6 7.500000
01-716-1024 WEEK 8 80 78 -2 -2.500000
01-701-1015 WEEK 2 51 50 -1 -1.960784
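
The two formulas can be checked against the table with a short base R sketch. The helper names are hypothetical; returning a missing value when BASE is 0 is an assumption made in this sketch to avoid division by zero.

```r
# Hypothetical helpers (not part of {admiral}).
change_from_baseline <- function(aval, base) {
  aval - base
}

percent_change <- function(aval, base) {
  # Assumption in this sketch: percent change is left missing when BASE is 0.
  ifelse(base == 0, NA_real_, (aval - base) / abs(base) * 100)
}

# BASE and AVAL pairs taken from the table above.
base <- c(56, 84, 80)
aval <- c(67, 88, 68)

chg <- change_from_baseline(aval, base)
pchg <- percent_change(aval, base)
data.frame(base, aval, chg, pchg)
```

The resulting changes are 11, 4 and -12, and the percent changes agree with the PCHG column above to the displayed precision.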

If the variables should not be derived for all records, e.g., for post-baseline records only, restrict_derivation() can be used.

Derive Shift (e.g. SHIFT1)

Shift variables can be derived using the {admiral} function derive_var_shift(). This function derives a character shift variable concatenating shift in values based on a user-defined pairing, e.g., shift from baseline reference range BNRIND to analysis reference range ANRIND. An example call is:

advs <- derive_var_shift(advs,
  new_var = SHIFT1,
  from_var = BNRIND,
  to_var = ANRIND
)
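
The concatenation itself can be sketched in base R. The helper name is hypothetical, and both the separator " to " and the label "NULL" for missing values are assumptions about the function's defaults; check the documentation of derive_var_shift() before relying on them.

```r
# Hypothetical helper (not part of {admiral}): build a shift label.
shift_label <- function(from, to, missing_label = "NULL", sep = " to ") {
  from <- ifelse(is.na(from), missing_label, from)
  to <- ifelse(is.na(to), missing_label, to)
  paste0(from, sep, to)
}

bnrind <- c("LOW", "NORMAL", NA)
anrind <- c("NORMAL", "HIGH", "HIGH")
shift1 <- shift_label(from = bnrind, to = anrind)
shift1
```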

If the variables should not be derived for all records, e.g., for post-baseline records only, restrict_derivation() can be used.

Derive Analysis Ratio (R2BASE)

Analysis ratio variables can be derived using the {admiral} function derive_var_analysis_ratio(). This function derives a ratio variable based on a user-specified pair. For example, Ratio to Baseline is calculated as AVAL / BASE, and the function appends a new variable R2BASE to the dataset. Example calls are:

advs <- derive_var_analysis_ratio(advs,
  numer_var = AVAL,
  denom_var = BASE
)

advs <- derive_var_analysis_ratio(advs,
  numer_var = AVAL,
  denom_var = ANRLO,
  new_var = R01ANRLO
)
USUBJID VISIT BASE AVAL ANRLO R2BASE R01ANRLO
01-701-1015 WEEK 2 56 56 60 1.0000000 0.9333333
01-701-1015 WEEK 8 56 67 60 1.1964286 1.1166667
01-701-1023 WEEK 2 84 88 60 1.0476190 1.4666667
01-703-1086 WEEK 2 80 68 60 0.8500000 1.1333333
01-703-1086 WEEK 8 80 80 60 1.0000000 1.3333333
01-703-1096 WEEK 2 70 74 60 1.0571429 1.2333333
01-707-1037 WEEK 2 88 72 60 0.8181818 1.2000000
01-716-1024 WEEK 2 80 86 60 1.0750000 1.4333333
01-716-1024 WEEK 8 80 78 60 0.9750000 1.3000000
01-701-1015 WEEK 2 51 50 60 0.9803922 0.8333333

If the variables should not be derived for all records, e.g., for post-baseline records only, restrict_derivation() can be used.

Derive Analysis Flags (e.g. ANL01FL)

In most finding ADaMs, an analysis flag is derived to identify the appropriate observation(s) to use for a particular analysis when a subject has multiple observations within a particular timing period.

In this situation, an analysis flag (e.g. ANLxxFL) may be used to choose the appropriate record for analysis.

This flag may be derived using the {admiral} function derive_var_extreme_flag(). For this example, we will assume we would like to choose the latest and highest value by USUBJID, PARAMCD, AVISIT, and ATPT.


advs <- restrict_derivation(
  advs,
  derivation = derive_var_extreme_flag,
  args = params(
    by_vars = vars(STUDYID, USUBJID, BASETYPE, PARAMCD, AVISIT),
    order = vars(ADT, ATPTN, AVAL),
    new_var = ANL01FL,
    mode = "last"
  ),
  filter = !is.na(AVISITN)
)
USUBJID PARAMCD AVISIT ATPTN ADT AVAL ANL01FL
01-701-1015 DIABP Week 2 815 2014-01-16 56 Y
01-701-1015 DIABP Week 8 815 2014-03-05 67 Y
01-701-1015 DIABP Week 2 816 2014-01-16 50 Y
01-701-1015 DIABP Week 8 816 2014-03-05 62 Y
01-701-1015 DIABP Week 2 817 2014-01-16 54 Y
01-701-1015 DIABP Week 8 817 2014-03-05 71 Y
01-701-1023 DIABP Week 2 815 2012-08-27 88 Y
01-701-1023 DIABP Week 2 816 2012-08-27 86 Y
01-701-1023 DIABP Week 2 817 2012-08-27 90 Y
01-703-1086 DIABP Week 2 815 2012-09-16 68 Y

Another common example would be flagging the worst value for a subject, parameter, and visit. For this example, we will assume we have 3 PARAMCD values (SYSBP, DIABP, and PULSE). We will also assume high is worst for SYSBP and DIABP and low is worst for PULSE.


advs <- restrict_derivation(
  advs,
  derivation = derive_var_worst_flag,
  args = params(
    by_vars = vars(STUDYID, USUBJID, BASETYPE, PARAMCD, AVISIT),
    order = vars(ADT, ATPTN),
    new_var = WORSTFL,
    param_var = PARAMCD,
    analysis_var = AVAL,
    worst_high = c("SYSBP", "DIABP"),
    worst_low = "PULSE"
  ),
  filter = !is.na(AVISIT) & !is.na(AVAL)
)
USUBJID PARAMCD AVISIT AVAL ADT ATPTN WORSTFL
01-701-1015 PULSE Baseline 56 2014-01-02 815 Y
01-701-1015 PULSE Week 12 54 2014-03-26 815 Y
01-701-1015 PULSE Week 16 60 2014-05-07 815 Y
01-701-1015 PULSE Week 2 58 2014-01-16 815 Y
01-701-1015 PULSE Week 20 54 2014-05-21 815 Y
01-701-1015 PULSE Week 24 55 2014-06-18 815 Y
01-701-1015 PULSE Week 26 60 2014-07-02 815 Y
01-701-1015 PULSE Week 4 59 2014-01-30 815 Y
01-701-1015 PULSE Week 6 55 2014-02-12 815 Y
01-701-1015 PULSE Week 8 57 2014-03-05 815 Y
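
The idea of a parameter-dependent "worst" direction can be sketched in base R. The data are hypothetical and the grouping is reduced to PARAMCD; unlike derive_var_worst_flag(), this sketch does not use an order to break ties, so tied worst values would all be flagged.

```r
# Hypothetical values for one subject and one visit.
vals <- data.frame(
  PARAMCD = c("SYSBP", "SYSBP", "DIABP", "DIABP", "PULSE", "PULSE"),
  AVAL = c(120, 135, 80, 78, 60, 54),
  stringsAsFactors = FALSE
)

worst_low <- "PULSE"

# Negate values where low is worst, so that the maximum score is always worst.
score <- ifelse(vals$PARAMCD %in% worst_low, -vals$AVAL, vals$AVAL)

# Flag the record(s) that attain the maximum score within each parameter.
worst_score <- ave(score, vals$PARAMCD, FUN = max)
vals$WORSTFL <- ifelse(score == worst_score, "Y", NA_character_)
vals
```

Here the highest SYSBP and DIABP values and the lowest PULSE value are flagged.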

Assign Treatment (TRTA, TRTP)

TRTA and TRTP must correlate to treatment TRTxxP and/or TRTxxA in ADSL. The derivation of TRTA and TRTP for a record is protocol and analysis specific. {admiral} does not currently have functionality to assist with TRTA and TRTP assignment.

However, an example of a simple implementation could be:

advs <- mutate(advs, TRTP = TRT01P, TRTA = TRT01A)

count(advs, TRTP, TRTA, TRT01P, TRT01A)
#> # A tibble: 2 x 5
#>   TRTP               TRTA              TRT01P            TRT01A                n
#>   <chr>              <chr>             <chr>             <chr>             <int>
#> 1 Placebo            Placebo           Placebo           Placebo             588
#> 2 Xanomeline Low Do… Xanomeline Low D… Xanomeline Low D… Xanomeline Low D…   225

Assign ASEQ

The {admiral} function derive_var_obs_number() can be used to derive ASEQ. An example call is:

advs <- derive_var_obs_number(
  advs,
  new_var = ASEQ,
  by_vars = vars(STUDYID, USUBJID),
  order = vars(PARAMCD, ADT, AVISITN, VISITNUM, ATPTN),
  check_type = "error"
)
USUBJID PARAMCD ADT AVISITN ATPTN VISIT ASEQ
01-701-1015 BMI 2013-12-26 NA NA SCREENING 1 1
01-701-1015 BSA 2013-12-26 NA NA SCREENING 1 2
01-701-1015 DIABP 2013-12-26 NA 815 SCREENING 1 3
01-701-1015 DIABP 2013-12-26 NA 816 SCREENING 1 4
01-701-1015 DIABP 2013-12-26 NA 817 SCREENING 1 5
01-701-1015 DIABP 2013-12-31 NA 815 SCREENING 2 6
01-701-1015 DIABP 2013-12-31 NA 816 SCREENING 2 7
01-701-1015 DIABP 2013-12-31 NA 817 SCREENING 2 8
01-701-1015 DIABP 2014-01-02 0 815 BASELINE 9
01-701-1015 DIABP 2014-01-02 0 816 BASELINE 10
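
Numbering records within a subject can be sketched in base R. The data are hypothetical and the sort keys are reduced to PARAMCD and ADT; the explicit uniqueness check plays a role similar to check_type = "error" in the call above, in that it stops when the sort keys do not identify records uniquely.

```r
# Hypothetical records in arbitrary order.
recs <- data.frame(
  USUBJID = c("P02", "P01", "P01", "P02", "P01"),
  PARAMCD = c("DIABP", "SYSBP", "DIABP", "DIABP", "DIABP"),
  ADT = as.Date(c(
    "2012-08-27", "2014-01-02", "2014-01-16", "2012-08-05", "2014-01-02"
  )),
  stringsAsFactors = FALSE
)

# The sort keys must identify each record uniquely for a stable sequence.
stopifnot(!anyDuplicated(recs[c("USUBJID", "PARAMCD", "ADT")]))

# Sort, then number the records from 1 within each subject.
recs <- recs[order(recs$USUBJID, recs$PARAMCD, recs$ADT), ]
recs$ASEQ <- ave(seq_len(nrow(recs)), recs$USUBJID, FUN = seq_along)
recs
```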

Derive Categorization Variables (AVALCATx)

{admiral} does not currently have a generic function to aid in assigning AVALCATx/AVALCAxN values. Below is a simple example of how these values may be assigned:

avalcat_lookup <- tibble::tribble(
  ~PARAMCD, ~AVALCA1N, ~AVALCAT1,
  "HEIGHT", 1, ">140 cm",
  "HEIGHT", 2, "<= 140 cm"
)

format_avalcat1n <- function(param, aval) {
  case_when(
    param == "HEIGHT" & aval > 140 ~ 1,
    param == "HEIGHT" & aval <= 140 ~ 2
  )
}

advs <- advs %>%
  mutate(AVALCA1N = format_avalcat1n(param = PARAMCD, aval = AVAL)) %>%
  derive_vars_merged(
    dataset_add = avalcat_lookup,
    by_vars = vars(PARAMCD, AVALCA1N)
  )
USUBJID PARAMCD AVAL AVALCA1N AVALCAT1
01-701-1015 HEIGHT 147.32 1 >140 cm
01-701-1023 HEIGHT 162.56 1 >140 cm
01-703-1086 HEIGHT 195.58 1 >140 cm
01-703-1096 HEIGHT 160.02 1 >140 cm
01-707-1037 HEIGHT 152.40 1 >140 cm
01-716-1024 HEIGHT 154.94 1 >140 cm

Add ADSL variables

If needed, the other ADSL variables can now be added. The ADSL variables that have already been merged are held in adsl_vars.

advs <- advs %>%
  derive_vars_merged(
    dataset_add = select(adsl, !!!negate_vars(adsl_vars)),
    by_vars = vars(STUDYID, USUBJID)
  )
USUBJID RFSTDTC RFENDTC DTHDTC DTHFL AGE AGEU
01-701-1015 2014-01-02 2014-07-02 NA NA 63 YEARS
01-701-1015 2014-01-02 2014-07-02 NA NA 63 YEARS
01-701-1015 2014-01-02 2014-07-02 NA NA 63 YEARS
01-701-1015 2014-01-02 2014-07-02 NA NA 63 YEARS
01-701-1015 2014-01-02 2014-07-02 NA NA 63 YEARS
01-701-1015 2014-01-02 2014-07-02 NA NA 63 YEARS
01-701-1015 2014-01-02 2014-07-02 NA NA 63 YEARS
01-701-1015 2014-01-02 2014-07-02 NA NA 63 YEARS
01-701-1015 2014-01-02 2014-07-02 NA NA 63 YEARS
01-701-1015 2014-01-02 2014-07-02 NA NA 63 YEARS

Derive New Rows

When deriving new rows for a data frame, it is essential the programmer takes time to insert this derivation in the correct location of the code. The location will vary depending on what previous computations should be retained on the new record and what computations must be done with the new records.

Example 1 (Creating a New Record):

To add a new record based on the selection of a certain criterion (e.g. minimum, maximum) derive_extreme_records() can be used. The new records include all variables of the selected records.

Adding a New Record for the Last Value

For each subject and Vital Signs parameter, add a record holding last valid observation before end of treatment. Set AVISIT to "End of Treatment" and assign a unique AVISITN value.

advs_ex1 <- advs %>%
  derive_extreme_records(
    by_vars = vars(STUDYID, USUBJID, PARAMCD),
    order = vars(ADT, AVISITN, ATPTN, AVAL),
    mode = "last",
    filter = (4 < AVISITN & AVISITN <= 12 & ANL01FL == "Y"),
    set_values_to = vars(
      AVISIT = "End of Treatment",
      AVISITN = 99,
      DTYPE = "LOV"
    )
  )
USUBJID PARAMCD ADT AVISITN AVISIT ATPTN AVAL DTYPE ANL01FL
01-701-1015 DIABP 2014-03-26 99 End of Treatment 817 64 LOV Y
01-701-1015 DIABP 2014-07-02 26 Week 26 815 61 NA Y
01-701-1015 DIABP 2014-07-02 26 Week 26 816 59 NA Y
01-701-1015 DIABP 2014-07-02 26 Week 26 817 55 NA Y
01-701-1015 DIABP 2014-06-18 24 Week 24 815 63 NA Y
01-701-1015 DIABP 2014-06-18 24 Week 24 816 57 NA Y
01-701-1015 DIABP 2014-06-18 24 Week 24 817 71 NA Y
01-701-1015 DIABP 2014-05-21 20 Week 20 815 67 NA Y
01-701-1015 DIABP 2014-05-21 20 Week 20 816 65 NA Y
01-701-1015 DIABP 2014-05-21 20 Week 20 817 63 NA Y

Adding a New Record for the Minimum Value

For each subject and Vital Signs parameter, add a record holding the minimum value before end of treatment. If the minimum is attained by multiple observations the first one is selected. Set AVISIT to "Minimum on Treatment" and assign a unique AVISITN value.

advs_ex1 <- advs %>%
  derive_extreme_records(
    by_vars = vars(STUDYID, USUBJID, PARAMCD),
    order = vars(AVAL, ADT, AVISITN, ATPTN),
    mode = "first",
    filter = (4 < AVISITN & AVISITN <= 12 & ANL01FL == "Y" & !is.na(AVAL)),
    set_values_to = vars(
      AVISIT = "Minimum on Treatment",
      AVISITN = 98,
      DTYPE = "MINIMUM"
    )
  )
USUBJID PARAMCD ADT AVISITN AVISIT ATPTN AVAL DTYPE ANL01FL
01-701-1015 DIABP 2014-02-12 98 Minimum on Treatment 815 55 MINIMUM Y
01-701-1015 DIABP 2014-07-02 26 Week 26 815 61 NA Y
01-701-1015 DIABP 2014-07-02 26 Week 26 816 59 NA Y
01-701-1015 DIABP 2014-07-02 26 Week 26 817 55 NA Y
01-701-1015 DIABP 2014-06-18 24 Week 24 815 63 NA Y
01-701-1015 DIABP 2014-06-18 24 Week 24 816 57 NA Y
01-701-1015 DIABP 2014-06-18 24 Week 24 817 71 NA Y
01-701-1015 DIABP 2014-05-21 20 Week 20 815 67 NA Y
01-701-1015 DIABP 2014-05-21 20 Week 20 816 65 NA Y
01-701-1015 DIABP 2014-05-21 20 Week 20 817 63 NA Y

Example 2 (Deriving a Summary Record)

To add new records based on aggregating records, derive_summary_records() can be used. For the new records, only the variables specified by by_vars, analysis_var, and set_values_to are populated.

For each subject, Vital Signs parameter, visit, and date add a record holding the average value for observations on that date. Set DTYPE to AVERAGE.

advs_ex2 <- derive_summary_records(
  advs,
  by_vars = vars(STUDYID, USUBJID, PARAMCD, VISITNUM, ADT),
  analysis_var = AVAL,
  summary_fun = mean,
  set_values_to = vars(DTYPE = "AVERAGE")
)
USUBJID PARAMCD VISITNUM ADT AVAL DTYPE
01-701-1015 BMI 1 2013-12-26 24.871928 AVERAGE
01-701-1015 BMI 1 2013-12-26 24.871928 NA
01-701-1015 BSA 1 2013-12-26 1.486264 AVERAGE
01-701-1015 BSA 1 2013-12-26 1.486264 NA
01-701-1015 DIABP 1 2013-12-26 68.000000 AVERAGE
01-701-1015 DIABP 1 2013-12-26 64.000000 NA
01-701-1015 DIABP 1 2013-12-26 83.000000 NA
01-701-1015 DIABP 1 2013-12-26 57.000000 NA
01-701-1015 DIABP 2 2013-12-31 66.000000 AVERAGE
01-701-1015 DIABP 2 2013-12-31 68.000000 NA
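
The aggregation for a single group can be reproduced with a base R sketch. It uses the three DIABP values shown above for VISITNUM 1; the object names are hypothetical and aggregate() stands in for derive_summary_records() only for illustration.

```r
# The three DIABP records shown above for VISITNUM 1 on 2013-12-26.
diabp <- data.frame(
  USUBJID = "01-701-1015",
  PARAMCD = "DIABP",
  VISITNUM = 1,
  ADT = as.Date("2013-12-26"),
  AVAL = c(64, 83, 57),
  stringsAsFactors = FALSE
)

# One summary row per combination of the by variables.
avg <- aggregate(
  AVAL ~ USUBJID + PARAMCD + VISITNUM + ADT,
  data = diabp,
  FUN = mean
)
avg$DTYPE <- "AVERAGE"
avg
```

The mean of 64, 83 and 57 is 68, which is the AVAL of the AVERAGE record in the table.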

Example 3 (Deriving a New PARAMCD)

Use function derive_param_computed() to create a new PARAMCD. Note that only variables specified in the by_vars argument will be populated in the newly created records.

Below is an example of creating Mean Arterial Pressure (PARAMCD = MAP2) with an alternative formula.

advs_ex3 <- derive_param_computed(
  advs,
  by_vars = vars(USUBJID, VISIT, ATPT),
  parameters = c("SYSBP", "DIABP"),
  analysis_value = (AVAL.SYSBP - AVAL.DIABP) / 3 + AVAL.DIABP,
  set_values_to = vars(
    PARAMCD = "MAP2",
    PARAM = "Mean Arterial Pressure 2 (mmHg)"
  )
)
USUBJID PARAMCD VISIT ATPT AVAL
01-701-1015 DIABP AMBUL ECG PLACEMENT AFTER LYING DOWN FOR 5 MINUTES 67.00000
01-701-1015 MAP2 AMBUL ECG PLACEMENT AFTER LYING DOWN FOR 5 MINUTES 90.33333
01-701-1015 SYSBP AMBUL ECG PLACEMENT AFTER LYING DOWN FOR 5 MINUTES 137.00000
01-701-1015 DIABP AMBUL ECG PLACEMENT AFTER STANDING FOR 1 MINUTE 61.00000
01-701-1015 MAP2 AMBUL ECG PLACEMENT AFTER STANDING FOR 1 MINUTE 83.00000
01-701-1015 SYSBP AMBUL ECG PLACEMENT AFTER STANDING FOR 1 MINUTE 127.00000
01-701-1015 DIABP AMBUL ECG PLACEMENT AFTER STANDING FOR 3 MINUTES 65.00000
01-701-1015 MAP2 AMBUL ECG PLACEMENT AFTER STANDING FOR 3 MINUTES 92.00000
01-701-1015 SYSBP AMBUL ECG PLACEMENT AFTER STANDING FOR 3 MINUTES 146.00000
01-701-1015 DIABP AMBUL ECG REMOVAL AFTER LYING DOWN FOR 5 MINUTES 72.00000
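
The formula used for MAP2 can be checked with a short base R sketch. It also illustrates that (SYSBP - DIABP) / 3 + DIABP is algebraically the same as the weighted mean (2 * DIABP + SYSBP) / 3, which is consistent with the MAP values shown earlier in this article. The helper names are hypothetical.

```r
# Hypothetical helpers (not part of {admiral}).
map_alternative <- function(sysbp, diabp) {
  (sysbp - diabp) / 3 + diabp
}

map_weighted <- function(sysbp, diabp) {
  (2 * diabp + sysbp) / 3
}

# SYSBP and DIABP pairs from the AMBUL ECG PLACEMENT rows above.
sysbp <- c(137, 127, 146)
diabp <- c(67, 61, 65)

map2 <- map_alternative(sysbp, diabp)
map2

# Both forms agree up to floating point error.
all.equal(map2, map_weighted(sysbp, diabp))
```

For these inputs the sketch gives about 90.33, 83 and 92, matching the MAP2 rows in the table.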

Example Scripts

ADaM Sample Code
ADEG ad_adeg.R
ADVS ad_advs.R
ADLB ad_adlb.R