This vignette shows the functionality for annotating, filtering and transforming data within the `clinDataReview` package.
The package provides utility functions that automate standard pre-processing steps of the data.
These functions are mainly useful when their parameters are specified in the config files of the clinical data reports (see the dedicated reporting vignette).
For this vignette, we will use example data available in the `clinUtils` package.
The input dataset for the clinical data review should be a `data.frame` with clinical data. Such data is typically imported from a SAS data file or an xpt data file.
Multiple files can be imported at once via the `clinUtils::loadDataADaMSDTM` function.
The labels of the variables stored in the SAS datasets are also used for titles/captions.
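To make the role of the variable labels concrete, here is a minimal base-R sketch. It assumes that each imported column carries its label in a `"label"` attribute (as the `haven` import functions do); the data frame `dataToy` and the helper `getLabels` are hypothetical names used for illustration only, not part of `clinDataReview` or `clinUtils`.

```r
# Toy data frame standing in for an imported SAS dataset (hypothetical values)
dataToy <- data.frame(
  USUBJID = c("01-701-1148", "01-701-1192"),
  VISIT = c("SCREENING 1", "WEEK 2"),
  stringsAsFactors = FALSE
)

# Assumption: each column carries its label in a "label" attribute
attr(dataToy$USUBJID, "label") <- "Unique Subject Identifier"
attr(dataToy$VISIT, "label") <- "Visit Name"

# Hypothetical helper: collect the labels into a named character vector,
# falling back to the column name when no label is available
getLabels <- function(data) {
  vapply(names(data), function(var) {
    label <- attr(data[[var]], "label")
    if (is.null(label)) var else label
  }, character(1))
}

labelsToy <- getLabels(dataToy)

# Look up the label to display for a given variable
labelsToy[["VISIT"]]
```

A named character vector of this kind is convenient because a title or caption can then be built by simple lookup on the variable name.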
A few ADaM datasets are included in the `clinUtils` package for demonstration, via the dataset `dataADaMCDISCP01` and its corresponding variable labels.
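The `annotateData` function adds metadata to a specific domain/dataset.
Here, the laboratory data is annotated with the `ETHNIC` and `ARM` variables from the demographics data.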
```r
dataLBAnnot <- annotateData(
    data = dataLB,
    annotations = list(data = dataDM, vars = c("ETHNIC", "ARM")),
    verbose = TRUE
)
```

```
## Data annotated with variable(s): ETHNIC ('ETHNIC'), ARM ('ARM') from the 'custom' dataset based on the variable(s): USUBJID ('USUBJID').
```

```r
pander(
    head(dataLBAnnot),
    caption = paste(
        "Laboratory parameters annotated with",
        "demographics information with the `annotateData` function"
    )
)
```
STUDYID | SUBJID | USUBJID | TRTP | TRTPN |
---|---|---|---|---|
CDISCPILOT01 | 1148 | 01-701-1148 | Xanomeline High Dose | 81 |
CDISCPILOT01 | 1148 | 01-701-1148 | Xanomeline High Dose | 81 |
CDISCPILOT01 | 1148 | 01-701-1148 | Xanomeline High Dose | 81 |
CDISCPILOT01 | 1148 | 01-701-1148 | Xanomeline High Dose | 81 |
CDISCPILOT01 | 1148 | 01-701-1148 | Xanomeline High Dose | 81 |
CDISCPILOT01 | 1148 | 01-701-1148 | Xanomeline High Dose | 81 |
TRTA | TRTAN | TRTSDT | TRTEDT | AGE | AGEGR1 |
---|---|---|---|---|---|
Xanomeline High Dose | 81 | 2013-08-23 | 2014-02-20 | 57 | <65 |
Xanomeline High Dose | 81 | 2013-08-23 | 2014-02-20 | 57 | <65 |
Xanomeline High Dose | 81 | 2013-08-23 | 2014-02-20 | 57 | <65 |
Xanomeline High Dose | 81 | 2013-08-23 | 2014-02-20 | 57 | <65 |
Xanomeline High Dose | 81 | 2013-08-23 | 2014-02-20 | 57 | <65 |
Xanomeline High Dose | 81 | 2013-08-23 | 2014-02-20 | 57 | <65 |
AGEGR1N | RACE | RACEN | SEX | COMP24FL | DSRAEFL | SAFFL | AVISIT |
---|---|---|---|---|---|---|---|
1 | WHITE | 1 | M | Y |  | Y | Baseline |
1 | WHITE | 1 | M | Y |  | Y | Baseline |
1 | WHITE | 1 | M | Y |  | Y | Baseline |
1 | WHITE | 1 | M | Y |  | Y | Baseline |
1 | WHITE | 1 | M | Y |  | Y | Baseline |
1 | WHITE | 1 | M | Y |  | Y | Baseline |
AVISITN | ADY | ADT | VISIT | VISITNUM |
---|---|---|---|---|
0 | -9 | 2013-08-14 | SCREENING 1 | 1 |
0 | -9 | 2013-08-14 | SCREENING 1 | 1 |
0 | -9 | 2013-08-14 | SCREENING 1 | 1 |
0 | -9 | 2013-08-14 | SCREENING 1 | 1 |
0 | -9 | 2013-08-14 | SCREENING 1 | 1 |
0 | -9 | 2013-08-14 | SCREENING 1 | 1 |
PARAM | PARAMCD | PARAMN | PARCAT1 | AVAL | BASE | CHG |
---|---|---|---|---|---|---|
Sodium (mmol/L) | SODIUM | 18 | CHEM | 139 | 139 | NA |
Potassium (mmol/L) | K | 19 | CHEM | 4 | 4 | NA |
Chloride (mmol/L) | CL | 20 | CHEM | 109 | 109 | NA |
Bilirubin (umol/L) | BILI | 21 | CHEM | 8.55 | 8.55 | NA |
Alkaline Phosphatase (U/L) | ALP | 22 | CHEM | 88 | 88 | NA |
Gamma Glutamyl Transferase (U/L) | GGT | 23 | CHEM | 43 | 43 | NA |
A1LO | A1HI | R2A1LO | R2A1HI | BR2A1LO | BR2A1HI | ANL01FL | ALBTRVAL |
---|---|---|---|---|---|---|---|
132 | 147 | 1.053 | 0.9456 | 1.053 | 0.9456 |  | 81.5 |
3.4 | 5.4 | 1.176 | 0.7407 | 1.176 | 0.7407 |  | 4.1 |
94 | 112 | 1.16 | 0.9732 | 1.16 | 0.9732 |  | 62 |
3 | 21 | 2.85 | 0.4071 | 2.85 | 0.4071 |  | 22.95 |
31 | 110 | 2.839 | 0.8 | 2.839 | 0.8 |  | 77 |
10 | 61 | 4.3 | 0.7049 | 4.3 | 0.7049 |  | 48.5 |
ANRIND | BNRIND | ABLFL | AENTMTFL | LBSEQ | LBNRIND | LBSTRESN | DATASET |
---|---|---|---|---|---|---|---|
N | N | Y |  | 26 | NORMAL | 139 | ADLBC |
N | N | Y |  | 19 | NORMAL | 4 | ADLBC |
N | N | Y |  | 11 | NORMAL | 109 | ADLBC |
N | N | Y |  | 6 | NORMAL | 8.55 | ADLBC |
N | N | Y |  | 2 | NORMAL | 88 | ADLBC |
N | N | Y |  | 15 | NORMAL | 43 | ADLBC |
ETHNIC | ARM |
---|---|
NOT HISPANIC OR LATINO | Xanomeline High Dose |
NOT HISPANIC OR LATINO | Xanomeline High Dose |
NOT HISPANIC OR LATINO | Xanomeline High Dose |
NOT HISPANIC OR LATINO | Xanomeline High Dose |
NOT HISPANIC OR LATINO | Xanomeline High Dose |
NOT HISPANIC OR LATINO | Xanomeline High Dose |
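The annotation step can be thought of as a left join on the subject identifier: every laboratory record is kept, and the selected demographic variables are added to it. The following base-R sketch illustrates that idea on toy data. It is an analogy under that assumption, not the implementation of `annotateData`; the objects `labsToy`, `demoToy` and `varsAnnot` are hypothetical.

```r
# Toy laboratory data: several records per subject (hypothetical values)
labsToy <- data.frame(
  USUBJID = c("01", "01", "02", "03"),
  PARAMCD = c("ALT", "BILI", "ALT", "ALT"),
  AVAL = c(34, 8.55, 41, 29),
  stringsAsFactors = FALSE
)

# Toy demographics data: one record per subject; subject "03" is missing
demoToy <- data.frame(
  USUBJID = c("01", "02"),
  ETHNIC = c("NOT HISPANIC OR LATINO", "HISPANIC OR LATINO"),
  ARM = c("Placebo", "Xanomeline High Dose"),
  AGE = c(57, 63),
  stringsAsFactors = FALSE
)

# Variables to bring over from the demographics data
varsAnnot <- c("ETHNIC", "ARM")

# Left join: keep all laboratory records, add only the requested variables
labsAnnotToy <- merge(
  x = labsToy,
  y = demoToy[, c("USUBJID", varsAnnot)],
  by = "USUBJID",
  all.x = TRUE,
  sort = FALSE
)
```

In this sketch, the number of laboratory records is unchanged, and a subject without demographics information (here `"03"`) gets missing values for the added variables.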
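The `filterData` function filters a dataset.
In the example below, the placebo records are excluded: with `rev = TRUE`, the records matching the condition are removed rather than kept, as the resulting table shows.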
```r
dataLBAnnotTreatment <- filterData(
    data = dataLBAnnot,
    filters = list(var = "ARM", value = "Placebo", rev = TRUE),
    verbose = TRUE
)
```

```
## 354 records with ARM ('ARM') %in% 'Placebo' are filtered in data.
```

```r
pander(
    unique(dataLBAnnotTreatment[, c("USUBJID", "ARM")]),
    caption = paste(
        "Subset of laboratory parameters",
        "excluding placebo patients"
    )
)
```
&nbsp; | USUBJID | ARM |
---|---|---|
1 | 01-701-1148 | Xanomeline High Dose |
397 | 01-701-1192 | Xanomeline Low Dose |
793 | 01-701-1211 | Xanomeline Low Dose |
1363 | 01-718-1371 | Xanomeline High Dose |
1615 | 01-718-1427 | Xanomeline High Dose |
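The effect of reversing a filter can be sketched in a few lines of base R. This is a simplified illustration of the selection logic seen in the output above (matching records are dropped when the condition is reversed), not the code of `filterData`; the helper `filterToy`, its default for `rev` and the toy data are hypothetical.

```r
# Toy data: one record per subject (hypothetical values)
labsToy <- data.frame(
  USUBJID = c("01", "02", "03", "04"),
  ARM = c("Placebo", "Xanomeline High Dose", "Placebo", "Xanomeline Low Dose"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: keep the records matching 'value' in 'var',
# or the records NOT matching when 'rev' is TRUE
filterToy <- function(data, var, value, rev = FALSE) {
  isMatch <- data[[var]] %in% value
  keep <- if (rev) !isMatch else isMatch
  data[keep, , drop = FALSE]
}

# Reversed condition: placebo records are removed
labsTreated <- filterToy(labsToy, var = "ARM", value = "Placebo", rev = TRUE)

# Plain condition: only placebo records are retained
labsPlacebo <- filterToy(labsToy, var = "ARM", value = "Placebo")
```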
The `transformData` function converts data to a different format.
For example, the laboratory data is converted from a long format, containing one record per endpoint * visit * subject, to a wide format, containing one record per visit * subject, with the endpoints included in different columns.
```r
eDishData <- transformData(
    data = subset(dataLB, PARAMCD %in% c("ALT", "BILI")),
    transformations = list(
        type = "pivot_wider",
        varsID = c("USUBJID", "VISIT"),
        varsValue = c("LBSTRESN", "LBNRIND"),
        varPivot = "PARAMCD"
    ),
    verbose = TRUE,
    labelVars = labelVars
)
```

```
## Warning in reshapeWide(data, idvar = idvar, timevar = timevar, varying = varying, : some constant variables
## (AVISIT,AVISITN,PARAM,PARAMN,AVAL,BASE,CHG,A1LO,A1HI,R2A1LO,R2A1HI,BR2A1LO,BR2A1HI,ANL01FL,ALBTRVAL,LBSEQ) are really varying
## Warning in reshapeWide(data, idvar = idvar, timevar = timevar, varying = varying, : multiple rows match for PARAMCD=BILI: first taken
## Warning in reshapeWide(data, idvar = idvar, timevar = timevar, varying = varying, : multiple rows match for PARAMCD=ALT: first taken
## Data is converted to a wide format with variables: 'LBSTRESN', 'LBNRIND' for different: 'PARAMCD' by 'Unique Subject Identifier', 'Visit Name' pivoted to different columns.
```
&nbsp; | STUDYID | SUBJID | USUBJID | TRTP | TRTPN |
---|---|---|---|---|---|
4 | CDISCPILOT01 | 1148 | 01-701-1148 | Xanomeline High Dose | 81 |
40 | CDISCPILOT01 | 1148 | 01-701-1148 | Xanomeline High Dose | 81 |
76 | CDISCPILOT01 | 1148 | 01-701-1148 | Xanomeline High Dose | 81 |
112 | CDISCPILOT01 | 1148 | 01-701-1148 | Xanomeline High Dose | 81 |
148 | CDISCPILOT01 | 1148 | 01-701-1148 | Xanomeline High Dose | 81 |
184 | CDISCPILOT01 | 1148 | 01-701-1148 | Xanomeline High Dose | 81 |
&nbsp; | TRTA | TRTAN | TRTSDT | TRTEDT | AGE |
---|---|---|---|---|---|
4 | Xanomeline High Dose | 81 | 2013-08-23 | 2014-02-20 | 57 |
40 | Xanomeline High Dose | 81 | 2013-08-23 | 2014-02-20 | 57 |
76 | Xanomeline High Dose | 81 | 2013-08-23 | 2014-02-20 | 57 |
112 | Xanomeline High Dose | 81 | 2013-08-23 | 2014-02-20 | 57 |
148 | Xanomeline High Dose | 81 | 2013-08-23 | 2014-02-20 | 57 |
184 | Xanomeline High Dose | 81 | 2013-08-23 | 2014-02-20 | 57 |
&nbsp; | AGEGR1 | AGEGR1N | RACE | RACEN | SEX | COMP24FL | DSRAEFL | SAFFL |
---|---|---|---|---|---|---|---|---|
4 | <65 | 1 | WHITE | 1 | M | Y |  | Y |
40 | <65 | 1 | WHITE | 1 | M | Y |  | Y |
76 | <65 | 1 | WHITE | 1 | M | Y |  | Y |
112 | <65 | 1 | WHITE | 1 | M | Y |  | Y |
148 | <65 | 1 | WHITE | 1 | M | Y |  | Y |
184 | <65 | 1 | WHITE | 1 | M | Y |  | Y |
&nbsp; | AVISIT | AVISITN | ADY | ADT | VISIT | VISITNUM |
---|---|---|---|---|---|---|
4 | Baseline | 0 | -9 | 2013-08-14 | SCREENING 1 | 1 |
40 | Week 2 | 2 | 14 | 2013-09-05 | WEEK 2 | 4 |
76 | Week 4 | 4 | 28 | 2013-09-19 | WEEK 4 | 5 |
112 | Week 6 | 6 | 42 | 2013-10-03 | WEEK 6 | 7 |
148 | Week 8 | 8 | 57 | 2013-10-18 | WEEK 8 | 8 |
184 | Week 12 | 12 | 87 | 2013-11-17 | WEEK 12 | 9 |
&nbsp; | PARAM | PARAMN | PARCAT1 | AVAL | BASE | CHG | A1LO |
---|---|---|---|---|---|---|---|
4 | Bilirubin (umol/L) | 21 | CHEM | 8.55 | 8.55 | NA | 3 |
40 | Bilirubin (umol/L) | 21 | CHEM | 8.55 | 8.55 | 0 | 3 |
76 | Bilirubin (umol/L) | 21 | CHEM | 8.55 | 8.55 | 0 | 3 |
112 | Bilirubin (umol/L) | 21 | CHEM | 8.55 | 8.55 | 0 | 3 |
148 | Bilirubin (umol/L) | 21 | CHEM | 8.55 | 8.55 | 0 | 3 |
184 | Bilirubin (umol/L) | 21 | CHEM | 6.84 | 8.55 | -1.71 | 3 |
&nbsp; | A1HI | R2A1LO | R2A1HI | BR2A1LO | BR2A1HI | ANL01FL | ALBTRVAL |
---|---|---|---|---|---|---|---|
4 | 21 | 2.85 | 0.4071 | 2.85 | 0.4071 |  | 22.95 |
40 | 21 | 2.85 | 0.4071 | 2.85 | 0.4071 |  | 22.95 |
76 | 21 | 2.85 | 0.4071 | 2.85 | 0.4071 |  | 22.95 |
112 | 21 | 2.85 | 0.4071 | 2.85 | 0.4071 |  | 22.95 |
148 | 21 | 2.85 | 0.4071 | 2.85 | 0.4071 |  | 22.95 |
184 | 21 | 2.28 | 0.3257 | 2.85 | 0.4071 | Y | 24.66 |
&nbsp; | ANRIND | BNRIND | ABLFL | AENTMTFL | LBSEQ | DATASET | LBSTRESN.BILI |
---|---|---|---|---|---|---|---|
4 | N | N | Y |  | 6 | ADLBC | 8.55 |
40 | N | N |  |  | 43 | ADLBC | 8.55 |
76 | N | N |  |  | 78 | ADLBC | 8.55 |
112 | N | N |  |  | 108 | ADLBC | 8.55 |
148 | N | N |  |  | 138 | ADLBC | 8.55 |
184 | N | N |  |  | 168 | ADLBC | 6.84 |
&nbsp; | LBNRIND.BILI | LBSTRESN.ALT | LBNRIND.ALT |
---|---|---|---|
4 | NORMAL | 34 | NORMAL |
40 | NORMAL | 41 | NORMAL |
76 | NORMAL | 35 | NORMAL |
112 | NORMAL | 31 | NORMAL |
148 | NORMAL | 31 | NORMAL |
184 | NORMAL | 39 | NORMAL |
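The warnings above come from the long-to-wide reshaping itself: columns that are neither identifiers nor pivoted values, yet differ between the records of one subject and visit (for example `LBSEQ`), are reported as "really varying", and only the first of several matching records is kept. The sketch below uses `stats::reshape` directly on toy data and restricts the input to the identifier, pivot and value columns beforehand, which is one way to obtain a clean wide table. It is an illustration under these assumptions, not a description of what `transformData` does internally; the objects `labsLong`, `colsKeep` and `labsWide` are hypothetical.

```r
# Toy long-format data: one record per subject * visit * parameter
labsLong <- data.frame(
  USUBJID = rep(c("01", "02"), each = 4),
  VISIT = rep(rep(c("SCREENING 1", "WEEK 2"), each = 2), times = 2),
  PARAMCD = rep(c("ALT", "BILI"), times = 4),
  LBSTRESN = c(34, 8.55, 41, 8.55, 20, 5.13, 22, 6.84),
  LBNRIND = "NORMAL",
  LBSEQ = 1:8,  # record-specific column, differs within subject * visit
  stringsAsFactors = FALSE
)

# Keep only identifier, pivot and value columns before reshaping,
# so that no record-specific column is carried along
colsKeep <- c("USUBJID", "VISIT", "PARAMCD", "LBSTRESN", "LBNRIND")

# One record per subject * visit, one set of value columns per parameter
labsWide <- reshape(
  data = labsLong[, colsKeep],
  idvar = c("USUBJID", "VISIT"),
  timevar = "PARAMCD",
  v.names = c("LBSTRESN", "LBNRIND"),
  direction = "wide"
)

names(labsWide)
```

The value columns of the wide table are named after the value variable and the parameter code (e.g. `LBSTRESN.ALT`), matching the naming seen in the table above.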
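The `processData` function executes all the pre-processing steps described in the previous sections at once.
The steps are specified as a list and, as the comparison below confirms, give the same result as calling `annotateData` and `filterData` one after the other.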
```r
dataLBAnnotTreatment2 <- processData(
    data = dataLB,
    processing = list(
        list(annotate = list(data = dataDM, vars = c("ETHNIC", "ARM"))),
        list(filter = list(var = "ARM", value = "Placebo", rev = TRUE))
    ),
    verbose = TRUE
)
identical(dataLBAnnotTreatment, dataLBAnnotTreatment2)
```

```
[1] TRUE
```
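Chaining steps that are declared as a list can be illustrated with `Reduce`, which feeds the output of each step into the next one. The sketch below is a toy version of that idea in base R and makes no claim about how `processData` is implemented; the helpers `annotateToy`, `filterToy` and `applyStepToy`, as well as the toy data, are hypothetical.

```r
# Toy data (hypothetical values)
labsToy <- data.frame(
  USUBJID = c("01", "01", "02", "03"),
  AVAL = c(34, 8.55, 41, 29),
  stringsAsFactors = FALSE
)
demoToy <- data.frame(
  USUBJID = c("01", "02", "03"),
  ARM = c("Placebo", "Xanomeline High Dose", "Xanomeline Low Dose"),
  stringsAsFactors = FALSE
)

# Hypothetical step: add variables from another dataset by subject
annotateToy <- function(data, params) {
  merge(
    data, params$data[, c("USUBJID", params$vars), drop = FALSE],
    by = "USUBJID", all.x = TRUE, sort = FALSE
  )
}

# Hypothetical step: keep (or drop, if 'rev' is TRUE) matching records
filterToy <- function(data, params) {
  isMatch <- data[[params$var]] %in% params$value
  keep <- if (isTRUE(params$rev)) !isMatch else isMatch
  data[keep, , drop = FALSE]
}

# Dispatch one step, given as a one-element named list, to its function
applyStepToy <- function(data, step) {
  switch(
    names(step),
    annotate = annotateToy(data, step[[1]]),
    filter = filterToy(data, step[[1]]),
    stop("Unknown processing step: ", names(step))
  )
}

# Steps declared in the same nested-list shape as in the example above
stepsToy <- list(
  list(annotate = list(data = demoToy, vars = "ARM")),
  list(filter = list(var = "ARM", value = "Placebo", rev = TRUE))
)

# Apply the steps in order, starting from the raw data
labsProcessed <- Reduce(applyStepToy, stepsToy, labsToy)
```

Declaring the steps as data rather than as successive function calls is what makes it possible to store them in a configuration file.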
R version 4.1.2 (2021-11-01)
Platform: x86_64-pc-linux-gnu (64-bit)
locale: LC_CTYPE=en_US.UTF-8, LC_NUMERIC=C, LC_TIME=en_US.UTF-8, LC_COLLATE=C, LC_MONETARY=en_US.UTF-8, LC_MESSAGES=en_US.UTF-8, LC_PAPER=en_US.UTF-8, LC_NAME=C, LC_ADDRESS=C, LC_TELEPHONE=C, LC_MEASUREMENT=en_US.UTF-8 and LC_IDENTIFICATION=C
attached base packages: stats, graphics, grDevices, utils, datasets, methods and base
other attached packages: clinUtils(v.0.1.1), clinDataReview(v.1.2.2), pander(v.0.6.4) and knitr(v.1.37)
loaded via a namespace (and not attached): tidyselect(v.1.1.1), xfun(v.0.29), purrr(v.0.3.4), haven(v.2.4.3), colorspace(v.2.0-3), vctrs(v.0.3.8), generics(v.0.1.2), htmltools(v.0.5.2), viridisLite(v.0.4.0), yaml(v.2.3.5), utf8(v.1.2.2), plotly(v.4.10.0), rlang(v.1.0.1), pillar(v.1.7.0), glue(v.1.6.1), lifecycle(v.1.0.1), plyr(v.1.8.6), stringr(v.1.4.0), munsell(v.0.5.0), gtable(v.0.3.0), htmlwidgets(v.1.5.4), evaluate(v.0.15), forcats(v.0.5.1), fastmap(v.1.1.0), crosstalk(v.1.2.0), fansi(v.1.0.2), Rcpp(v.1.0.8), scales(v.1.1.1), DT(v.0.20), jsonvalidate(v.1.3.2), jsonlite(v.1.7.3), ggplot2(v.3.3.5), hms(v.1.1.1), digest(v.0.6.29), stringi(v.1.7.6), bookdown(v.0.24), dplyr(v.1.0.8), grid(v.4.1.2), cli(v.3.2.0), tools(v.4.1.2), magrittr(v.2.0.2), lazyeval(v.0.2.2), tibble(v.3.1.6), crayon(v.1.5.0), tidyr(v.1.2.0), pkgconfig(v.2.0.3), ellipsis(v.0.3.2), data.table(v.1.14.2), rmarkdown(v.2.11), httr(v.1.4.2), R6(v.2.5.1) and compiler(v.4.1.2)