In this vignette we create a
using the rtables
layout facility. That is, we demonstrate how the layout based tabulation framework can specify the structure and relations that are commonly found when analyzing clinical trials data.
Note that all the data is created using random number generators. All ex_*
data which is currently attached to the rtables
package were created using random.cdisc.data
another R package that we intend to release as open source soon.
The packages used in this vignette are:
library(rtables)
library(tibble)
library(dplyr)
Demographic tables summarize the variables content for different population subsets (encoded in the columns).
One feature of analyze
that we have not introduced in the previous vignette is that the analysis function afun
can specify multiple rows with the in_rows
function:
<- ex_adsl # Example ADSL dataset
ADSL
basic_table() %>%
split_cols_by("ARM") %>%
analyze(vars = "AGE", afun = function(x) {
in_rows(
"Mean (sd)" = rcell(c(mean(x), sd(x)), format = "xx.xx (xx.xx)"),
"Range" = rcell(range(x), format = "xx.xx - xx.xx")
)%>%
}) build_table(ADSL)
A: Drug X B: Placebo C: Combination
——————————————————————————————————————————————————————————
Mean (sd) 33.77 (6.55) 35.43 (7.90) 35.43 (7.72)
Range 21.00 - 50.00 21.00 - 62.00 20.00 - 69.00
Multiple variables can be analyzed in one analyze
call:
basic_table() %>%
split_cols_by("ARM") %>%
analyze(vars = c("AGE", "BMRKR1"), afun = function(x) {
in_rows(
"Mean (sd)" = rcell(c(mean(x), sd(x)), format = "xx.xx (xx.xx)"),
"Range" = rcell(range(x), format = "xx.xx - xx.xx")
)%>%
}) build_table(ADSL)
A: Drug X B: Placebo C: Combination
————————————————————————————————————————————————————————————
AGE
Mean (sd) 33.77 (6.55) 35.43 (7.90) 35.43 (7.72)
Range 21.00 - 50.00 21.00 - 62.00 20.00 - 69.00
BMRKR1
Mean (sd) 5.97 (3.55) 5.70 (3.31) 5.62 (3.49)
Range 0.41 - 17.67 0.65 - 14.24 0.17 - 21.39
Hence, if afun
can process different data vector types (i.e. variables selected from the data) then we are fairly close to a standard demographic table. Here is a function that either creates a count table or some number summary if the argument x
is a factor or numeric, respectively:
<- function(x) {
s_summary if (is.numeric(x)) {
in_rows(
"n" = rcell(sum(!is.na(x)), format = "xx"),
"Mean (sd)" = rcell(c(mean(x, na.rm = TRUE), sd(x, na.rm = TRUE)), format = "xx.xx (xx.xx)"),
"IQR" = rcell(IQR(x, na.rm = TRUE), format = "xx.xx"),
"min - max" = rcell(range(x, na.rm = TRUE), format = "xx.xx - xx.xx")
)else if (is.factor(x)) {
}
<- as.list(table(x))
vs do.call(in_rows, lapply(vs, rcell, format = "xx"))
else (
} stop("type not supported")
) }
Note we use rcell
s to wrap the results in order to add formatting instructions for rtables
. We can use s_summary
outside the context of tabulation:
s_summary(ADSL$AGE)
RowsVerticalSection (in_rows) object print method:
----------------------------
row_name formatted_cell indent_mod row_label
1 n 400 0 n
2 Mean (sd) 34.88 (7.44) 0 Mean (sd)
3 IQR 10.00 0 IQR
4 min - max 20.00 - 69.00 0 min - max
and
s_summary(ADSL$SEX)
RowsVerticalSection (in_rows) object print method:
----------------------------
row_name formatted_cell indent_mod row_label
1 F 222 0 F
2 M 166 0 M
3 U 9 0 U
4 UNDIFFERENTIATED 3 0 UNDIFFERENTIATED
We can now create a commonly used variant of the demographic table:
<- basic_table() %>%
lyt split_cols_by(var = "ARM") %>%
analyze(c("AGE", "SEX"), afun = s_summary)
<- build_table(lyt, ADSL)
tbl tbl
A: Drug X B: Placebo C: Combination
———————————————————————————————————————————————————————————————————
AGE
n 134 134 132
Mean (sd) 33.77 (6.55) 35.43 (7.90) 35.43 (7.72)
IQR 11.00 10.00 10.00
min - max 21.00 - 50.00 21.00 - 62.00 20.00 - 69.00
SEX
F 79 77 66
M 51 55 60
U 3 2 4
UNDIFFERENTIATED 1 0 2
Note that analyze
can also be called multiple times in sequence:
<- basic_table() %>%
tbl2 split_cols_by(var = "ARM") %>%
analyze("AGE", s_summary) %>%
analyze("SEX", s_summary) %>%
build_table(ADSL)
tbl2
A: Drug X B: Placebo C: Combination
———————————————————————————————————————————————————————————————————
AGE
n 134 134 132
Mean (sd) 33.77 (6.55) 35.43 (7.90) 35.43 (7.72)
IQR 11.00 10.00 10.00
min - max 21.00 - 50.00 21.00 - 62.00 20.00 - 69.00
SEX
F 79 77 66
M 51 55 60
U 3 2 4
UNDIFFERENTIATED 1 0 2
which leads to the identical table as tbl
:
identical(tbl, tbl2)
[1] TRUE
In clinical trials analyses the number of patients per column is often referred to as N
(rather than the overall population which outside of clinical trials is commonly referred to as N
). Column N
s are added using the add_colcounts
function:
basic_table() %>%
split_cols_by(var = "ARMCD") %>%
add_colcounts() %>%
analyze(c("AGE", "SEX"), s_summary) %>%
build_table(ADSL)
ARM A ARM B ARM C
(N=134) (N=134) (N=132)
——————————————————————————————————————————————————————————————————
AGE
n 134 134 132
Mean (sd) 33.77 (6.55) 35.43 (7.90) 35.43 (7.72)
IQR 11.00 10.00 10.00
min - max 21.00 - 50.00 21.00 - 62.00 20.00 - 69.00
SEX
F 79 77 66
M 51 55 60
U 3 2 4
UNDIFFERENTIATED 1 0 2
We will now show a couple of variations of the demographic table that we developed above. These variations are in structure and not in analysis, hence they don’t require a modification to the s_summary
function.
We will start with a standard table analyzing the variables AGE
and BMRKR2
variables:
basic_table() %>%
split_cols_by(var = "ARM") %>%
add_colcounts() %>%
analyze(c("AGE", "BMRKR2"), s_summary) %>%
build_table(ADSL)
A: Drug X B: Placebo C: Combination
(N=134) (N=134) (N=132)
————————————————————————————————————————————————————————————
AGE
n 134 134 132
Mean (sd) 33.77 (6.55) 35.43 (7.90) 35.43 (7.72)
IQR 11.00 10.00 10.00
min - max 21.00 - 50.00 21.00 - 62.00 20.00 - 69.00
BMRKR2
LOW 50 45 40
MEDIUM 37 56 42
HIGH 47 33 50
Assume we would like to have this analysis carried out per gender encoded in the row space:
basic_table() %>%
split_cols_by(var = "ARM") %>%
add_colcounts() %>%
split_rows_by("SEX") %>%
analyze(c("AGE", "BMRKR2"), s_summary) %>%
build_table(ADSL)
A: Drug X B: Placebo C: Combination
(N=134) (N=134) (N=132)
—————————————————————————————————————————————————————————————————
F
AGE
n 79 77 66
Mean (sd) 32.76 (6.09) 34.12 (7.06) 35.20 (7.43)
IQR 9.00 8.00 6.75
min - max 21.00 - 47.00 23.00 - 58.00 21.00 - 64.00
BMRKR2
LOW 26 21 26
MEDIUM 21 38 17
HIGH 32 18 23
M
AGE
n 51 55 60
Mean (sd) 35.57 (7.08) 37.44 (8.69) 35.38 (8.24)
IQR 11.00 9.00 11.00
min - max 23.00 - 50.00 21.00 - 62.00 20.00 - 69.00
BMRKR2
LOW 21 23 11
MEDIUM 15 18 23
HIGH 15 14 26
U
AGE
n 3 2 4
Mean (sd) 31.67 (3.21) 31.00 (5.66) 35.25 (3.10)
IQR 3.00 4.00 3.25
min - max 28.00 - 34.00 27.00 - 35.00 31.00 - 38.00
BMRKR2
LOW 2 1 1
MEDIUM 1 0 2
HIGH 0 1 1
UNDIFFERENTIATED
AGE
n 1 0 2
Mean (sd) 28.00 (NA) NA (NA) 45.00 (1.41)
IQR 0.00 NA 1.00
min - max 28.00 - 28.00 Inf - -Inf 44.00 - 46.00
BMRKR2
LOW 1 0 2
MEDIUM 0 0 0
HIGH 0 0 0
We will now subset ADSL
to include only males and females in the analysis in order to reduces the number of rows in the table:
<- filter(ADSL, SEX %in% c("M", "F"))
ADSL_M_F
basic_table() %>%
split_cols_by(var = "ARM") %>%
add_colcounts() %>%
split_rows_by("SEX") %>%
analyze(c("AGE", "BMRKR2"), s_summary) %>%
build_table(ADSL_M_F)
A: Drug X B: Placebo C: Combination
(N=130) (N=132) (N=126)
—————————————————————————————————————————————————————————————————
F
AGE
n 79 77 66
Mean (sd) 32.76 (6.09) 34.12 (7.06) 35.20 (7.43)
IQR 9.00 8.00 6.75
min - max 21.00 - 47.00 23.00 - 58.00 21.00 - 64.00
BMRKR2
LOW 26 21 26
MEDIUM 21 38 17
HIGH 32 18 23
M
AGE
n 51 55 60
Mean (sd) 35.57 (7.08) 37.44 (8.69) 35.38 (8.24)
IQR 11.00 9.00 11.00
min - max 23.00 - 50.00 21.00 - 62.00 20.00 - 69.00
BMRKR2
LOW 21 23 11
MEDIUM 15 18 23
HIGH 15 14 26
U
AGE
n 0 0 0
Mean (sd) NA (NA) NA (NA) NA (NA)
IQR NA NA NA
min - max Inf - -Inf Inf - -Inf Inf - -Inf
BMRKR2
LOW 0 0 0
MEDIUM 0 0 0
HIGH 0 0 0
UNDIFFERENTIATED
AGE
n 0 0 0
Mean (sd) NA (NA) NA (NA) NA (NA)
IQR NA NA NA
min - max Inf - -Inf Inf - -Inf Inf - -Inf
BMRKR2
LOW 0 0 0
MEDIUM 0 0 0
HIGH 0 0 0
Note that the UNDIFFERENTIATED
and U
levels still show up in the table. This is because tabulation respects the factor levels and level order, exactly as the split
and table
function do. If empty levels should be dropped then rtables
needs to know that at splitting time via the split_fun
argument in split_rows_by
. There are a number of predefined functions. For this example drop_split_levels
is required to drop the empty levels at splitting time. Splitting is a big topic and will be eventually addressed in a specific package vignette.
basic_table() %>%
split_cols_by(var = "ARM") %>%
add_colcounts() %>%
split_rows_by("SEX", split_fun = drop_split_levels, child_labels = "visible") %>%
analyze(c("AGE", "BMRKR2"), s_summary) %>%
build_table(ADSL_M_F)
A: Drug X B: Placebo C: Combination
(N=130) (N=132) (N=126)
——————————————————————————————————————————————————————————————
F
AGE
n 79 77 66
Mean (sd) 32.76 (6.09) 34.12 (7.06) 35.20 (7.43)
IQR 9.00 8.00 6.75
min - max 21.00 - 47.00 23.00 - 58.00 21.00 - 64.00
BMRKR2
LOW 26 21 26
MEDIUM 21 38 17
HIGH 32 18 23
M
AGE
n 51 55 60
Mean (sd) 35.57 (7.08) 37.44 (8.69) 35.38 (8.24)
IQR 11.00 9.00 11.00
min - max 23.00 - 50.00 21.00 - 62.00 20.00 - 69.00
BMRKR2
LOW 21 23 11
MEDIUM 15 18 23
HIGH 15 14 26
In the table above the labels M
and F
are not very descriptive. You can add the full label as follows:
<- ADSL_M_F %>%
ADSL_M_F_l mutate(lbl_sex = case_when(
== "M" ~ "Male",
SEX == "F" ~ "Female",
SEX == "U" ~ "Unknown",
SEX == "UNDIFFERENTIATED" ~ "Undifferentiated"
SEX
))
basic_table() %>%
split_cols_by(var = "ARM") %>%
add_colcounts() %>%
split_rows_by("SEX", labels_var = "lbl_sex", split_fun = drop_split_levels, child_labels = "visible") %>%
analyze(c("AGE", "BMRKR2"), s_summary) %>%
build_table(ADSL_M_F_l)
A: Drug X B: Placebo C: Combination
(N=130) (N=132) (N=126)
——————————————————————————————————————————————————————————————
Female
AGE
n 79 77 66
Mean (sd) 32.76 (6.09) 34.12 (7.06) 35.20 (7.43)
IQR 9.00 8.00 6.75
min - max 21.00 - 47.00 23.00 - 58.00 21.00 - 64.00
BMRKR2
LOW 26 21 26
MEDIUM 21 38 17
HIGH 32 18 23
Male
AGE
n 51 55 60
Mean (sd) 35.57 (7.08) 37.44 (8.69) 35.38 (8.24)
IQR 11.00 9.00 11.00
min - max 23.00 - 50.00 21.00 - 62.00 20.00 - 69.00
BMRKR2
LOW 21 23 11
MEDIUM 15 18 23
HIGH 15 14 26
For the next table variation we only stratify by gender for the AGE
analysis. To do this the nested
argument has to be set to FALSE
in analyze
call:
basic_table() %>%
split_cols_by(var = "ARM") %>%
add_colcounts() %>%
split_rows_by("SEX", labels_var = "lbl_sex", split_fun = drop_split_levels, child_labels = "visible") %>%
analyze("AGE", s_summary, show_labels = "visible") %>%
analyze("BMRKR2", s_summary, nested = FALSE, show_labels = "visible") %>%
build_table(ADSL_M_F_l)
A: Drug X B: Placebo C: Combination
(N=130) (N=132) (N=126)
——————————————————————————————————————————————————————————————
Female
AGE
n 79 77 66
Mean (sd) 32.76 (6.09) 34.12 (7.06) 35.20 (7.43)
IQR 9.00 8.00 6.75
min - max 21.00 - 47.00 23.00 - 58.00 21.00 - 64.00
Male
AGE
n 51 55 60
Mean (sd) 35.57 (7.08) 37.44 (8.69) 35.38 (8.24)
IQR 11.00 9.00 11.00
min - max 23.00 - 50.00 21.00 - 62.00 20.00 - 69.00
BMRKR2
LOW 47 44 37
MEDIUM 36 56 40
HIGH 47 32 49
Once we split the rows into groups (Male
and Female
here) one might want to summarize groups: usually by showing count and column percentages. This is especially important if we have missing data. For example if we create the above table but add missing data to the AGE
variable:
<- function(x) {
insert_NAs sample(c(TRUE, FALSE), length(x), TRUE, prob = c(0.2, 0.8))] <- NA
x[
x
}
set.seed(1)
<- ADSL_M_F_l %>%
ADSL_NA mutate(AGE = insert_NAs(AGE))
basic_table() %>%
split_cols_by(var = "ARM") %>%
add_colcounts() %>%
split_rows_by("SEX", labels_var = "lbl_sex", split_fun = drop_split_levels, child_labels = "visible") %>%
analyze("AGE", s_summary) %>%
analyze("BMRKR2", s_summary, nested = FALSE, show_labels = "visible") %>%
build_table(filter(ADSL_NA, SEX %in% c("M", "F")))
A: Drug X B: Placebo C: Combination
(N=130) (N=132) (N=126)
————————————————————————————————————————————————————————————
Female
n 65 61 54
Mean (sd) 32.71 (6.07) 34.33 (7.31) 34.61 (6.78)
IQR 9.00 10.00 6.75
min - max 21.00 - 47.00 23.00 - 58.00 21.00 - 54.00
Male
n 44 44 50
Mean (sd) 35.66 (6.78) 36.93 (8.18) 35.64 (8.42)
IQR 10.50 8.25 10.75
min - max 24.00 - 48.00 21.00 - 58.00 20.00 - 69.00
BMRKR2
LOW 47 44 37
MEDIUM 36 56 40
HIGH 47 32 49
Here it is not easy to see how many females and males there are in each arm as n
represents the number of non-missing data elements in the variables. Groups within rows that are defined by splitting can be summarized with summarize_row_groups
, for example:
basic_table() %>%
split_cols_by(var = "ARM") %>%
add_colcounts() %>%
split_rows_by("SEX", labels_var = "lbl_sex", split_fun = drop_split_levels) %>%
summarize_row_groups() %>%
analyze("AGE", s_summary) %>%
analyze("BMRKR2", afun = s_summary, nested = FALSE, show_labels = "visible") %>%
build_table(filter(ADSL_NA, SEX %in% c("M", "F")))
A: Drug X B: Placebo C: Combination
(N=130) (N=132) (N=126)
————————————————————————————————————————————————————————————
Female 79 (60.8%) 77 (58.3%) 66 (52.4%)
n 65 61 54
Mean (sd) 32.71 (6.07) 34.33 (7.31) 34.61 (6.78)
IQR 9.00 10.00 6.75
min - max 21.00 - 47.00 23.00 - 58.00 21.00 - 54.00
Male 51 (39.2%) 55 (41.7%) 60 (47.6%)
n 44 44 50
Mean (sd) 35.66 (6.78) 36.93 (8.18) 35.64 (8.42)
IQR 10.50 8.25 10.75
min - max 24.00 - 48.00 21.00 - 58.00 20.00 - 69.00
BMRKR2
LOW 47 44 37
MEDIUM 36 56 40
HIGH 47 32 49
There are a couple of things to note here.
summarize_row_groups
).We can recreate this default behavior (count percentage) by defining a cfun
for illustrative purposes here as it results in the same table as above:
basic_table() %>%
split_cols_by(var = "ARM") %>%
add_colcounts() %>%
split_rows_by("SEX", labels_var = "lbl_sex", split_fun = drop_split_levels) %>%
summarize_row_groups(cfun = function(df, labelstr, .N_col, ...) {
in_rows(
rcell(nrow(df) * c(1, 1/.N_col), format = "xx (xx.xx%)"),
.labels = labelstr
)%>%
}) analyze("AGE", s_summary) %>%
analyze("BEP01FL", afun = s_summary, nested = FALSE, show_labels = "visible") %>%
build_table(filter(ADSL_NA, SEX %in% c("M", "F")))
A: Drug X B: Placebo C: Combination
(N=130) (N=132) (N=126)
————————————————————————————————————————————————————————————
Female 79 (60.77%) 77 (58.33%) 66 (52.38%)
n 65 61 54
Mean (sd) 32.71 (6.07) 34.33 (7.31) 34.61 (6.78)
IQR 9.00 10.00 6.75
min - max 21.00 - 47.00 23.00 - 58.00 21.00 - 54.00
Male 51 (39.23%) 55 (41.67%) 60 (47.62%)
n 44 44 50
Mean (sd) 35.66 (6.78) 36.93 (8.18) 35.64 (8.42)
IQR 10.50 8.25 10.75
min - max 24.00 - 48.00 21.00 - 58.00 20.00 - 69.00
BEP01FL
Y 67 63 65
N 63 69 61
Note that cfun
differs from afun
(which is used in analyze
) in that cfun
does not operate on variables but rather on data.frame
s or tibble
s which are passed via the df
argument (afun
can optionally request df
too). Further, cfun
gives the default group label (factor level from splitting) as an argument to labelstr
and hence it could be modified:
basic_table() %>%
split_cols_by(var = "ARM") %>%
split_rows_by("SEX", labels_var = "lbl_sex", split_fun = drop_split_levels, child_labels = "hidden") %>%
summarize_row_groups(cfun = function(df, labelstr, .N_col, ...) {
in_rows(
rcell(nrow(df) * c(1, 1/.N_col), format = "xx (xx.xx%)"),
.labels = paste0(labelstr, ": count (perc.)")
)%>%
}) analyze("AGE", s_summary) %>%
analyze("BEP01FL", s_summary, nested = FALSE, show_labels = "visible") %>%
build_table(filter(ADSL_NA, SEX %in% c("M", "F")))
A: Drug X B: Placebo C: Combination
——————————————————————————————————————————————————————————————————————
Female: count (perc.) 79 (60.77%) 77 (58.33%) 66 (52.38%)
n 65 61 54
Mean (sd) 32.71 (6.07) 34.33 (7.31) 34.61 (6.78)
IQR 9.00 10.00 6.75
min - max 21.00 - 47.00 23.00 - 58.00 21.00 - 54.00
Male: count (perc.) 51 (39.23%) 55 (41.67%) 60 (47.62%)
n 44 44 50
Mean (sd) 35.66 (6.78) 36.93 (8.18) 35.64 (8.42)
IQR 10.50 8.25 10.75
min - max 24.00 - 48.00 21.00 - 58.00 20.00 - 69.00
BEP01FL
Y 67 63 65
N 63 69 61
Layouts have a couple of advantages over tabulating the tables directly:
Here is an example that demonstrates the reusability of layouts:
<- basic_table() %>%
lyt split_cols_by("ARM") %>%
add_colcounts() %>%
analyze(c("AGE", "SEX"), afun = s_summary)
lyt
A Pre-data Table Layout
Column-Split Structure:
ARM (lvls)
Row-Split Structure:
AGE:SEX (** multivar analysis **)
We can now build a table for ADSL
build_table(lyt, ADSL)
A: Drug X B: Placebo C: Combination
(N=134) (N=134) (N=132)
———————————————————————————————————————————————————————————————————
AGE
n 134 134 132
Mean (sd) 33.77 (6.55) 35.43 (7.90) 35.43 (7.72)
IQR 11.00 10.00 10.00
min - max 21.00 - 50.00 21.00 - 62.00 20.00 - 69.00
SEX
F 79 77 66
M 51 55 60
U 3 2 4
UNDIFFERENTIATED 1 0 2
or for all patients that are older than 18:
build_table(lyt, ADSL %>% filter(AGE > 18))
A: Drug X B: Placebo C: Combination
(N=134) (N=134) (N=132)
———————————————————————————————————————————————————————————————————
AGE
n 134 134 132
Mean (sd) 33.77 (6.55) 35.43 (7.90) 35.43 (7.72)
IQR 11.00 10.00 10.00
min - max 21.00 - 50.00 21.00 - 62.00 20.00 - 69.00
SEX
F 79 77 66
M 51 55 60
U 3 2 4
UNDIFFERENTIATED 1 0 2
There are a number of different adverse event tables. We will now present two tables that show adverse events by id and then by grade and by id.
This time we won’t use the ADAE
dataset from random.cdisc.data
but rather generate a dataset on the fly (see Adrian’s 2016 Phuse paper):
set.seed(1)
<- tribble(
lookup ~AEDECOD, ~AEBODSYS, ~AETOXGR,
'HEADACHE', "NERVOUS SYSTEM DISORDERS", "5",
'BACK PAIN', "MUSCULOSKELETAL AND CONNECTIVE TISSUE DISORDERS", "2",
'GINGIVAL BLEEDING', "GASTROINTESTINAL DISORDERS", "1",
'HYPOTENSION', "VASCULAR DISORDERS", "3",
'FAECES SOFT', "GASTROINTESTINAL DISORDERS", "2",
'ABDOMINAL DISCOMFORT', "GASTROINTESTINAL DISORDERS", "1",
'DIARRHEA', "GASTROINTESTINAL DISORDERS", "1",
'ABDOMINAL FULLNESS DUE TO GAS', "GASTROINTESTINAL DISORDERS", "1",
'NAUSEA (INTERMITTENT)', "GASTROINTESTINAL DISORDERS", "2",
'WEAKNESS', "MUSCULOSKELETAL AND CONNECTIVE TISSUE DISORDERS", "3",
'ORTHOSTATIC HYPOTENSION', "VASCULAR DISORDERS", "4"
)
<- function(x) x/sum(x)
normalize <- normalize(c(0.1, dlnorm(seq(0, 5, length.out = 25), meanlog = 3)))
weightsA <- normalize(c(0.2, dlnorm(seq(0, 5, length.out = 25))))
weightsB
<- 300
N_pop <- data.frame(
ADSL2 USUBJID = seq(1, N_pop, by = 1),
ARM = sample(c('ARM A', 'ARM B'), N_pop, TRUE),
SEX = sample(c('F', 'M'), N_pop, TRUE),
AGE = 20 + rbinom(N_pop, size=40, prob=0.7)
)
<- mapply(ADSL2$USUBJID, ADSL2$ARM, ADSL2$SEX, ADSL2$AGE, FUN = function(id, arm, sex, age) {
l.adae <- sample(0:25, 1, prob = if (arm == "ARM A") weightsA else weightsB)
n_ae <- sample(1:nrow(lookup), size = n_ae, replace = TRUE, prob = c(6, rep(1, 10))/16)
i %>%
lookup[i, ] mutate(
AESEQ = seq_len(n()),
USUBJID = id, ARM = arm, SEX = sex, AGE = age
)SIMPLIFY = FALSE)
},
<- do.call(rbind, l.adae)
ADAE2 <- ADAE2 %>%
ADAE2 mutate(
ARM = factor(ARM, levels = c("ARM A", "ARM B")),
AEDECOD = as.factor(AEDECOD),
AEBODSYS = as.factor(AEBODSYS),
AETOXGR = factor(AETOXGR, levels = as.character(1:5))
%>%
) select(USUBJID, ARM, AGE, SEX, AESEQ, AEDECOD, AEBODSYS, AETOXGR)
ADAE2
# A tibble: 3,118 × 8
USUBJID ARM AGE SEX AESEQ AEDECOD AEBODSYS AETOXGR
<dbl> <fct> <dbl> <chr> <int> <fct> <fct> <fct>
1 1 ARM A 45 F 1 NAUSEA (INTERMITTENT) GASTROINTESTIN… 2
2 1 ARM A 45 F 2 HEADACHE NERVOUS SYSTEM… 5
3 1 ARM A 45 F 3 HEADACHE NERVOUS SYSTEM… 5
4 1 ARM A 45 F 4 HEADACHE NERVOUS SYSTEM… 5
5 1 ARM A 45 F 5 HEADACHE NERVOUS SYSTEM… 5
6 1 ARM A 45 F 6 HEADACHE NERVOUS SYSTEM… 5
7 1 ARM A 45 F 7 HEADACHE NERVOUS SYSTEM… 5
8 1 ARM A 45 F 8 HEADACHE NERVOUS SYSTEM… 5
9 1 ARM A 45 F 9 HEADACHE NERVOUS SYSTEM… 5
10 1 ARM A 45 F 10 FAECES SOFT GASTROINTESTIN… 2
# … with 3,108 more rows
We start by defining an events summary function:
<- function(x, labelstr, .N_col) {
s_events_patients in_rows(
"Total number of patients with at least one event" =
rcell(length(unique(x)) * c(1, 1/.N_col), format = "xx (xx.xx%)"),
"Total number of events" = rcell(length(x), format = "xx")
) }
So, for a population of 5
patients where
we would get the following summary:
s_events_patients(x = c("id 1", "id 1", "id 2"), .N_col = 5)
RowsVerticalSection (in_rows) object print method:
----------------------------
row_name formatted_cell indent_mod
1 Total number of patients with at least one event 2 (40.00%) 0
2 Total number of events 3 0
row_label
1 Total number of patients with at least one event
2 Total number of events
The .N_col
argument is a special keyword argument which build_table
passes the population size for each respective column. For a list of keyword arguments for the functions passed to afun
in analyze
refer to the documentation with ?analyze
.
We now use the s_events_patients
summary function in a tabulation:
basic_table() %>%
split_cols_by("ARM") %>%
add_colcounts() %>%
analyze("USUBJID", s_events_patients) %>%
build_table(ADAE2)
ARM A ARM B
(N=2060) (N=1058)
—————————————————————————————————————————————————————————————————————————————
Total number of patients with at least one event 114 (5.53%) 150 (14.18%)
Total number of events 2060 1058
Note that the column N
’s are wrong as by default they are set to the number of rows per group (i.e. number of AEs per arm here). This also affects the percentages. For this table we are interested in the number of patients per column/arm which is usually taken from ADSL
(variable ADSL2
here):
<- table(ADSL2$ARM)
N_per_arm N_per_arm
ARM A ARM B
146 154
Since this information is not “pre-data” it needs to go to the table creation function build_table
:
basic_table() %>%
split_cols_by("ARM") %>%
add_colcounts() %>%
analyze("USUBJID", s_events_patients) %>%
build_table(ADAE2, col_counts = N_per_arm)
ARM A ARM B
(N=146) (N=154)
——————————————————————————————————————————————————————————————————————————————
Total number of patients with at least one event 114 (78.08%) 150 (97.40%)
Total number of events 2060 1058
We next calculate this information per system organ class:
<- basic_table() %>%
l split_cols_by("ARM") %>%
add_colcounts() %>%
analyze("USUBJID", s_events_patients) %>%
split_rows_by("AEBODSYS", child_labels = "visible", nested = FALSE) %>%
summarize_row_groups("USUBJID", cfun = s_events_patients)
build_table(l, ADAE2, col_counts = N_per_arm)
ARM A ARM B
(N=146) (N=154)
————————————————————————————————————————————————————————————————————————————————
Total number of patients with at least one event 114 (78.08%) 150 (97.40%)
Total number of events 2060 1058
GASTROINTESTINAL DISORDERS
Total number of patients with at least one event 114 (78.08%) 130 (84.42%)
Total number of events 760 374
MUSCULOSKELETAL AND CONNECTIVE TISSUE DISORDERS
Total number of patients with at least one event 98 (67.12%) 81 (52.60%)
Total number of events 273 142
NERVOUS SYSTEM DISORDERS
Total number of patients with at least one event 113 (77.40%) 133 (86.36%)
Total number of events 787 420
VASCULAR DISORDERS
Total number of patients with at least one event 93 (63.70%) 75 (48.70%)
Total number of events 240 122
We now have to the add a count table of AEDECOD
for each AEBODSYS
. The default analyze
behavior for a factor is to create the count table per level (using rtab_inner
):
<- basic_table() %>%
tbl1 split_cols_by("ARM") %>%
add_colcounts() %>%
split_rows_by("AEBODSYS", child_labels = "visible", indent_mod = 1) %>%
summarize_row_groups("USUBJID", cfun = s_events_patients) %>%
analyze("AEDECOD", indent_mod = -1) %>%
build_table(ADAE2, col_counts = N_per_arm)
tbl1
ARM A ARM B
(N=146) (N=154)
——————————————————————————————————————————————————————————————————————————————————
GASTROINTESTINAL DISORDERS
Total number of patients with at least one event 114 (78.08%) 130 (84.42%)
Total number of events 760 374
ABDOMINAL DISCOMFORT 113 65
ABDOMINAL FULLNESS DUE TO GAS 119 65
BACK PAIN 0 0
DIARRHEA 107 53
FAECES SOFT 122 58
GINGIVAL BLEEDING 147 71
HEADACHE 0 0
HYPOTENSION 0 0
NAUSEA (INTERMITTENT) 152 62
ORTHOSTATIC HYPOTENSION 0 0
WEAKNESS 0 0
MUSCULOSKELETAL AND CONNECTIVE TISSUE DISORDERS
Total number of patients with at least one event 98 (67.12%) 81 (52.60%)
Total number of events 273 142
ABDOMINAL DISCOMFORT 0 0
ABDOMINAL FULLNESS DUE TO GAS 0 0
BACK PAIN 135 75
DIARRHEA 0 0
FAECES SOFT 0 0
GINGIVAL BLEEDING 0 0
HEADACHE 0 0
HYPOTENSION 0 0
NAUSEA (INTERMITTENT) 0 0
ORTHOSTATIC HYPOTENSION 0 0
WEAKNESS 138 67
NERVOUS SYSTEM DISORDERS
Total number of patients with at least one event 113 (77.40%) 133 (86.36%)
Total number of events 787 420
ABDOMINAL DISCOMFORT 0 0
ABDOMINAL FULLNESS DUE TO GAS 0 0
BACK PAIN 0 0
DIARRHEA 0 0
FAECES SOFT 0 0
GINGIVAL BLEEDING 0 0
HEADACHE 787 420
HYPOTENSION 0 0
NAUSEA (INTERMITTENT) 0 0
ORTHOSTATIC HYPOTENSION 0 0
WEAKNESS 0 0
VASCULAR DISORDERS
Total number of patients with at least one event 93 (63.70%) 75 (48.70%)
Total number of events 240 122
ABDOMINAL DISCOMFORT 0 0
ABDOMINAL FULLNESS DUE TO GAS 0 0
BACK PAIN 0 0
DIARRHEA 0 0
FAECES SOFT 0 0
GINGIVAL BLEEDING 0 0
HEADACHE 0 0
HYPOTENSION 104 58
NAUSEA (INTERMITTENT) 0 0
ORTHOSTATIC HYPOTENSION 136 64
WEAKNESS 0 0
The indent_mod
argument enables relative indenting changes if the tree structure of the table does not result in the desired indentation by default.
This table so far is however not the usual adverse event table as it counts the total number of events and not the number of subjects one or more events for a particular term. To get the correct table we need to write a custom analysis function:
<- function(df, termvar = "AEDECOD", idvar = "USUBJID") {
table_count_once_per_id
<- df[[termvar]]
x <- df[[idvar]]
id
<- table(x[!duplicated(id)])
counts
in_rows(
.list = as.vector(counts),
.labels = names(counts)
)
}
table_count_once_per_id(ADAE2)
RowsVerticalSection (in_rows) object print method:
----------------------------
row_name formatted_cell indent_mod
1 ABDOMINAL DISCOMFORT 23 0
2 ABDOMINAL FULLNESS DUE TO GAS 21 0
3 BACK PAIN 20 0
4 DIARRHEA 7 0
5 FAECES SOFT 11 0
6 GINGIVAL BLEEDING 15 0
7 HEADACHE 100 0
8 HYPOTENSION 16 0
9 NAUSEA (INTERMITTENT) 21 0
10 ORTHOSTATIC HYPOTENSION 14 0
11 WEAKNESS 16 0
row_label
1 ABDOMINAL DISCOMFORT
2 ABDOMINAL FULLNESS DUE TO GAS
3 BACK PAIN
4 DIARRHEA
5 FAECES SOFT
6 GINGIVAL BLEEDING
7 HEADACHE
8 HYPOTENSION
9 NAUSEA (INTERMITTENT)
10 ORTHOSTATIC HYPOTENSION
11 WEAKNESS
So the desired AE table is:
basic_table() %>%
split_cols_by("ARM") %>%
add_colcounts() %>%
split_rows_by("AEBODSYS", child_labels = "visible", indent_mod = 1) %>%
summarize_row_groups("USUBJID", cfun = s_events_patients) %>%
analyze("AEDECOD", afun = table_count_once_per_id, show_labels = "hidden", indent_mod = -1) %>%
build_table(ADAE2, col_counts = N_per_arm)
ARM A ARM B
(N=146) (N=154)
——————————————————————————————————————————————————————————————————————————————————
GASTROINTESTINAL DISORDERS
Total number of patients with at least one event 114 (78.08%) 130 (84.42%)
Total number of events 760 374
ABDOMINAL DISCOMFORT 24 28
ABDOMINAL FULLNESS DUE TO GAS 18 26
BACK PAIN 0 0
DIARRHEA 17 17
FAECES SOFT 17 14
GINGIVAL BLEEDING 18 25
HEADACHE 0 0
HYPOTENSION 0 0
NAUSEA (INTERMITTENT) 20 20
ORTHOSTATIC HYPOTENSION 0 0
WEAKNESS 0 0
MUSCULOSKELETAL AND CONNECTIVE TISSUE DISORDERS
Total number of patients with at least one event 98 (67.12%) 81 (52.60%)
Total number of events 273 142
ABDOMINAL DISCOMFORT 0 0
ABDOMINAL FULLNESS DUE TO GAS 0 0
BACK PAIN 58 45
DIARRHEA 0 0
FAECES SOFT 0 0
GINGIVAL BLEEDING 0 0
HEADACHE 0 0
HYPOTENSION 0 0
NAUSEA (INTERMITTENT) 0 0
ORTHOSTATIC HYPOTENSION 0 0
WEAKNESS 40 36
NERVOUS SYSTEM DISORDERS
Total number of patients with at least one event 113 (77.40%) 133 (86.36%)
Total number of events 787 420
ABDOMINAL DISCOMFORT 0 0
ABDOMINAL FULLNESS DUE TO GAS 0 0
BACK PAIN 0 0
DIARRHEA 0 0
FAECES SOFT 0 0
GINGIVAL BLEEDING 0 0
HEADACHE 113 133
HYPOTENSION 0 0
NAUSEA (INTERMITTENT) 0 0
ORTHOSTATIC HYPOTENSION 0 0
WEAKNESS 0 0
VASCULAR DISORDERS
Total number of patients with at least one event 93 (63.70%) 75 (48.70%)
Total number of events 240 122
ABDOMINAL DISCOMFORT 0 0
ABDOMINAL FULLNESS DUE TO GAS 0 0
BACK PAIN 0 0
DIARRHEA 0 0
FAECES SOFT 0 0
GINGIVAL BLEEDING 0 0
HEADACHE 0 0
HYPOTENSION 44 31
NAUSEA (INTERMITTENT) 0 0
ORTHOSTATIC HYPOTENSION 49 44
WEAKNESS 0 0
Note that we are missing the overall summary in the first two rows. This can be added with another analyze
call and then setting nested
to FALSE
in the subsequent summarize_row_groups
call:
<- basic_table() %>%
tbl split_cols_by("ARM") %>%
add_colcounts() %>%
analyze("USUBJID", afun = s_events_patients) %>%
split_rows_by("AEBODSYS", child_labels = "visible", nested = FALSE, indent_mod = 1) %>%
summarize_row_groups("USUBJID", cfun = s_events_patients) %>%
analyze("AEDECOD", table_count_once_per_id, show_labels = "hidden", indent_mod = -1) %>%
build_table(ADAE2, col_counts = N_per_arm)
tbl
ARM A ARM B
(N=146) (N=154)
——————————————————————————————————————————————————————————————————————————————————
Total number of patients with at least one event 114 (78.08%) 150 (97.40%)
Total number of events 2060 1058
GASTROINTESTINAL DISORDERS
Total number of patients with at least one event 114 (78.08%) 130 (84.42%)
Total number of events 760 374
ABDOMINAL DISCOMFORT 24 28
ABDOMINAL FULLNESS DUE TO GAS 18 26
BACK PAIN 0 0
DIARRHEA 17 17
FAECES SOFT 17 14
GINGIVAL BLEEDING 18 25
HEADACHE 0 0
HYPOTENSION 0 0
NAUSEA (INTERMITTENT) 20 20
ORTHOSTATIC HYPOTENSION 0 0
WEAKNESS 0 0
MUSCULOSKELETAL AND CONNECTIVE TISSUE DISORDERS
Total number of patients with at least one event 98 (67.12%) 81 (52.60%)
Total number of events 273 142
ABDOMINAL DISCOMFORT 0 0
ABDOMINAL FULLNESS DUE TO GAS 0 0
BACK PAIN 58 45
DIARRHEA 0 0
FAECES SOFT 0 0
GINGIVAL BLEEDING 0 0
HEADACHE 0 0
HYPOTENSION 0 0
NAUSEA (INTERMITTENT) 0 0
ORTHOSTATIC HYPOTENSION 0 0
WEAKNESS 40 36
NERVOUS SYSTEM DISORDERS
Total number of patients with at least one event 113 (77.40%) 133 (86.36%)
Total number of events 787 420
ABDOMINAL DISCOMFORT 0 0
ABDOMINAL FULLNESS DUE TO GAS 0 0
BACK PAIN 0 0
DIARRHEA 0 0
FAECES SOFT 0 0
GINGIVAL BLEEDING 0 0
HEADACHE 113 133
HYPOTENSION 0 0
NAUSEA (INTERMITTENT) 0 0
ORTHOSTATIC HYPOTENSION 0 0
WEAKNESS 0 0
VASCULAR DISORDERS
Total number of patients with at least one event 93 (63.70%) 75 (48.70%)
Total number of events 240 122
ABDOMINAL DISCOMFORT 0 0
ABDOMINAL FULLNESS DUE TO GAS 0 0
BACK PAIN 0 0
DIARRHEA 0 0
FAECES SOFT 0 0
GINGIVAL BLEEDING 0 0
HEADACHE 0 0
HYPOTENSION 44 31
NAUSEA (INTERMITTENT) 0 0
ORTHOSTATIC HYPOTENSION 49 44
WEAKNESS 0 0
Finally, if we wanted to prune the 0 counts row we can do that with the trim_rows
function:
trim_rows(tbl)
ARM A ARM B
(N=146) (N=154)
——————————————————————————————————————————————————————————————————————————————————
Total number of patients with at least one event 114 (78.08%) 150 (97.40%)
Total number of events 2060 1058
GASTROINTESTINAL DISORDERS
Total number of patients with at least one event 114 (78.08%) 130 (84.42%)
Total number of events 760 374
ABDOMINAL DISCOMFORT 24 28
ABDOMINAL FULLNESS DUE TO GAS 18 26
DIARRHEA 17 17
FAECES SOFT 17 14
GINGIVAL BLEEDING 18 25
NAUSEA (INTERMITTENT) 20 20
MUSCULOSKELETAL AND CONNECTIVE TISSUE DISORDERS
Total number of patients with at least one event 98 (67.12%) 81 (52.60%)
Total number of events 273 142
BACK PAIN 58 45
WEAKNESS 40 36
NERVOUS SYSTEM DISORDERS
Total number of patients with at least one event 113 (77.40%) 133 (86.36%)
Total number of events 787 420
HEADACHE 113 133
VASCULAR DISORDERS
Total number of patients with at least one event 93 (63.70%) 75 (48.70%)
Total number of events 240 122
HYPOTENSION 44 31
ORTHOSTATIC HYPOTENSION 49 44
Pruning is a larger topic with a separate rtables
package vignette.
The adverse events table by ID and by grade shows how many patients had at least one adverse event per grade for different subsets of the data (e.g. defined by system organ class).
For this table we do not show the zero count grades. Note that we add the “overall” groups with a custom split function.
<- function(df, labelstr = "", gradevar = "AETOXGR", idvar = "USUBJID", grade_levels = NULL) {
table_count_grade_once_per_id
<- df[[idvar]]
id <- df[[gradevar]]
grade
if (!is.null(grade_levels)) {
stopifnot(all(grade %in% grade_levels))
<- factor(grade, levels = grade_levels)
grade
}
<- !duplicated(id)
id_sel
in_rows(
"--Any Grade--" = sum(id_sel),
.list = as.list(table(grade[id_sel]))
)
}
table_count_grade_once_per_id(ex_adae, grade_levels = 1:5)
RowsVerticalSection (in_rows) object print method:
----------------------------
row_name formatted_cell indent_mod row_label
1 --Any Grade-- 365 0 --Any Grade--
2 1 131 0 1
3 2 70 0 2
4 3 74 0 3
5 4 25 0 4
6 5 65 0 5
All of the layouting concepts needed to create this table have already been introduced so far:
basic_table() %>%
split_cols_by("ARM") %>%
add_colcounts() %>%
analyze("AETOXGR",
afun = table_count_grade_once_per_id,
extra_args = list(grade_levels = 1:5),
var_labels = "- Any adverse events -", show_labels = "visible") %>%
split_rows_by("AEBODSYS", child_labels = "visible", nested = FALSE, indent_mod = 1) %>%
summarize_row_groups(cfun = table_count_grade_once_per_id, format = "xx", indent_mod = 1) %>%
split_rows_by("AEDECOD", child_labels = "visible", indent_mod = -2) %>%
analyze("AETOXGR",
afun = table_count_grade_once_per_id,
extra_args = list(grade_levels = 1:5), show_labels = "hidden") %>%
build_table(ADAE2, col_counts = N_per_arm)
ARM A ARM B
(N=146) (N=154)
—————————————————————————————————————————————————————————————————————
- Any adverse events -
--Any Grade-- 114 150
1 32 34
2 22 30
3 11 21
4 8 6
5 41 59
GASTROINTESTINAL DISORDERS
--Any Grade-- 114 130
1 77 96
2 37 34
3 0 0
4 0 0
5 0 0
ABDOMINAL DISCOMFORT
--Any Grade-- 68 49
1 68 49
2 0 0
3 0 0
4 0 0
5 0 0
ABDOMINAL FULLNESS DUE TO GAS
--Any Grade-- 73 51
1 73 51
2 0 0
3 0 0
4 0 0
5 0 0
BACK PAIN
--Any Grade-- 0 0
1 0 0
2 0 0
3 0 0
4 0 0
5 0 0
DIARRHEA
--Any Grade-- 68 40
1 68 40
2 0 0
3 0 0
4 0 0
5 0 0
FAECES SOFT
--Any Grade-- 76 44
1 0 0
2 76 44
3 0 0
4 0 0
5 0 0
GINGIVAL BLEEDING
--Any Grade-- 80 52
1 80 52
2 0 0
3 0 0
4 0 0
5 0 0
HEADACHE
--Any Grade-- 0 0
1 0 0
2 0 0
3 0 0
4 0 0
5 0 0
HYPOTENSION
--Any Grade-- 0 0
1 0 0
2 0 0
3 0 0
4 0 0
5 0 0
NAUSEA (INTERMITTENT)
--Any Grade-- 83 50
1 0 0
2 83 50
3 0 0
4 0 0
5 0 0
ORTHOSTATIC HYPOTENSION
--Any Grade-- 0 0
1 0 0
2 0 0
3 0 0
4 0 0
5 0 0
WEAKNESS
--Any Grade-- 0 0
1 0 0
2 0 0
3 0 0
4 0 0
5 0 0
MUSCULOSKELETAL AND CONNECTIVE TISSUE DISORDERS
--Any Grade-- 98 81
1 0 0
2 58 45
3 40 36
4 0 0
5 0 0
ABDOMINAL DISCOMFORT
--Any Grade-- 0 0
1 0 0
2 0 0
3 0 0
4 0 0
5 0 0
ABDOMINAL FULLNESS DUE TO GAS
--Any Grade-- 0 0
1 0 0
2 0 0
3 0 0
4 0 0
5 0 0
BACK PAIN
--Any Grade-- 79 62
1 0 0
2 79 62
3 0 0
4 0 0
5 0 0
DIARRHEA
--Any Grade-- 0 0
1 0 0
2 0 0
3 0 0
4 0 0
5 0 0
FAECES SOFT
--Any Grade-- 0 0
1 0 0
2 0 0
3 0 0
4 0 0
5 0 0
GINGIVAL BLEEDING
--Any Grade-- 0 0
1 0 0
2 0 0
3 0 0
4 0 0
5 0 0
HEADACHE
--Any Grade-- 0 0
1 0 0
2 0 0
3 0 0
4 0 0
5 0 0
HYPOTENSION
--Any Grade-- 0 0
1 0 0
2 0 0
3 0 0
4 0 0
5 0 0
NAUSEA (INTERMITTENT)
--Any Grade-- 0 0
1 0 0
2 0 0
3 0 0
4 0 0
5 0 0
ORTHOSTATIC HYPOTENSION
--Any Grade-- 0 0
1 0 0
2 0 0
3 0 0
4 0 0
5 0 0
WEAKNESS
--Any Grade-- 73 43
1 0 0
2 0 0
3 73 43
4 0 0
5 0 0
NERVOUS SYSTEM DISORDERS
--Any Grade-- 113 133
1 0 0
2 0 0
3 0 0
4 0 0
5 113 133
ABDOMINAL DISCOMFORT
--Any Grade-- 0 0
1 0 0
2 0 0
3 0 0
4 0 0
5 0 0
ABDOMINAL FULLNESS DUE TO GAS
--Any Grade-- 0 0
1 0 0
2 0 0
3 0 0
4 0 0
5 0 0
BACK PAIN
--Any Grade-- 0 0
1 0 0
2 0 0
3 0 0
4 0 0
5 0 0
DIARRHEA
--Any Grade-- 0 0
1 0 0
2 0 0
3 0 0
4 0 0
5 0 0
FAECES SOFT
--Any Grade-- 0 0
1 0 0
2 0 0
3 0 0
4 0 0
5 0 0
GINGIVAL BLEEDING
--Any Grade-- 0 0
1 0 0
2 0 0
3 0 0
4 0 0
5 0 0
HEADACHE
--Any Grade-- 113 133
1 0 0
2 0 0
3 0 0
4 0 0
5 113 133
HYPOTENSION
--Any Grade-- 0 0
1 0 0
2 0 0
3 0 0
4 0 0
5 0 0
NAUSEA (INTERMITTENT)
--Any Grade-- 0 0
1 0 0
2 0 0
3 0 0
4 0 0
5 0 0
ORTHOSTATIC HYPOTENSION
--Any Grade-- 0 0
1 0 0
2 0 0
3 0 0
4 0 0
5 0 0
WEAKNESS
--Any Grade-- 0 0
1 0 0
2 0 0
3 0 0
4 0 0
5 0 0
VASCULAR DISORDERS
--Any Grade-- 93 75
1 0 0
2 0 0
3 44 31
4 49 44
5 0 0
ABDOMINAL DISCOMFORT
--Any Grade-- 0 0
1 0 0
2 0 0
3 0 0
4 0 0
5 0 0
ABDOMINAL FULLNESS DUE TO GAS
--Any Grade-- 0 0
1 0 0
2 0 0
3 0 0
4 0 0
5 0 0
BACK PAIN
--Any Grade-- 0 0
1 0 0
2 0 0
3 0 0
4 0 0
5 0 0
DIARRHEA
--Any Grade-- 0 0
1 0 0
2 0 0
3 0 0
4 0 0
5 0 0
FAECES SOFT
--Any Grade-- 0 0
1 0 0
2 0 0
3 0 0
4 0 0
5 0 0
GINGIVAL BLEEDING
--Any Grade-- 0 0
1 0 0
2 0 0
3 0 0
4 0 0
5 0 0
HEADACHE
--Any Grade-- 0 0
1 0 0
2 0 0
3 0 0
4 0 0
5 0 0
HYPOTENSION
--Any Grade-- 66 43
1 0 0
2 0 0
3 66 43
4 0 0
5 0 0
NAUSEA (INTERMITTENT)
--Any Grade-- 0 0
1 0 0
2 0 0
3 0 0
4 0 0
5 0 0
ORTHOSTATIC HYPOTENSION
--Any Grade-- 70 54
1 0 0
2 0 0
3 0 0
4 70 54
5 0 0
WEAKNESS
--Any Grade-- 0 0
1 0 0
2 0 0
3 0 0
4 0 0
5 0 0
The response table that we will create here is composed of 3 parts:
Let’s start with the first part which is fairly simple to derive:
<- ex_adrs %>%
ADRS_BESRSPI filter(PARAMCD == "BESRSPI") %>%
mutate(
rsp = factor(AVALC %in% c("CR", "PR"), levels = c(TRUE, FALSE), labels = c("Responders", "Non-Responders")),
is_rsp = (rsp == "Responders")
)
<- function(x, .N_col) {
s_proportion in_rows(.list = lapply(as.list(table(x)), function(xi) rcell(xi * c(1, 1/.N_col), format = "xx.xx (xx.xx%)")))
}
basic_table() %>%
split_cols_by("ARMCD", ref_group = "ARM A") %>%
add_colcounts() %>%
analyze("rsp", s_proportion, show_labels = "hidden") %>%
build_table(ADRS_BESRSPI)
ARM A ARM B ARM C
(N=134) (N=134) (N=132)
———————————————————————————————————————————————————————————————————
Responders 114.00 (85.07%) 90.00 (67.16%) 120.00 (90.91%)
Non-Responders 20.00 (14.93%) 44.00 (32.84%) 12.00 (9.09%)
Note that we did set the ref_group
argument in split_cols_by
which for the current table had no effect as we only use the cell data for the responder and non-responder counting. The ref_group
argument is needed for the part 2. and 3. of the table.
We will now look the implementation of part “2. Unstratified analysis comparison vs. control group.” Let’s start with the analysis function:
<- function(x, .ref_group, .in_ref_col) {
s_unstratified_response_analysis
if (.in_ref_col) {
return(in_rows(
"Difference in Response Rates (%)" = rcell(numeric(0)),
"95% CI (Wald, with correction)" = rcell(numeric(0)),
"p-value (Chi-Squared Test)" = rcell(numeric(0)),
"Odds Ratio (95% CI)" = rcell(numeric(0))
))
}
<- stats::prop.test(
fit x = c(sum(x), sum(.ref_group)),
n = c(length(x), length(.ref_group)),
correct = FALSE
)
<- stats::glm(
fit_glm formula = rsp ~ group,
data = data.frame(
rsp = c(.ref_group, x),
group = factor(rep(c("ref", "x"), times = c(length(.ref_group), length(x))), levels = c("ref", "x"))
),family = binomial(link = "logit")
)
in_rows(
"Difference in Response Rates (%)" = non_ref_rcell((mean(x) - mean(.ref_group)) * 100,
format = "xx.xx") ,
.in_ref_col, "95% CI (Wald, with correction)" = non_ref_rcell(fit$conf.int * 100,
format = "(xx.xx, xx.xx)"),
.in_ref_col, "p-value (Chi-Squared Test)" = non_ref_rcell(fit$p.value,
format = "x.xxxx | (<0.0001)"),
.in_ref_col, "Odds Ratio (95% CI)" = non_ref_rcell(c(
exp(stats::coef(fit_glm)[-1]),
exp(stats::confint.default(fit_glm, level = .95)[-1, , drop = FALSE])
),format = "xx.xx (xx.xx - xx.xx)")
.in_ref_col,
)
}
s_unstratified_response_analysis(
x = ADRS_BESRSPI %>% filter(ARM == "A: Drug X") %>% pull(is_rsp),
.ref_group = ADRS_BESRSPI %>% filter(ARM == "B: Placebo") %>% pull(is_rsp),
.in_ref_col = FALSE
)
RowsVerticalSection (in_rows) object print method:
----------------------------
row_name formatted_cell indent_mod
1 Difference in Response Rates (%) 17.91 0
2 95% CI (Wald, with correction) (7.93, 27.89) 0
3 p-value (Chi-Squared Test) 0.0006 0
4 Odds Ratio (95% CI) 2.79 (1.53 - 5.06) 0
row_label
1 Difference in Response Rates (%)
2 95% CI (Wald, with correction)
3 p-value (Chi-Squared Test)
4 Odds Ratio (95% CI)
Hence we can now add the next vignette to the table:
basic_table() %>%
split_cols_by("ARMCD", ref_group = "ARM A") %>%
add_colcounts() %>%
analyze("rsp", s_proportion, show_labels = "hidden") %>%
analyze("is_rsp", s_unstratified_response_analysis, show_labels = "visible", var_labels = "Unstratified Response Analysis") %>%
build_table(ADRS_BESRSPI)
ARM A ARM B ARM C
(N=134) (N=134) (N=132)
——————————————————————————————————————————————————————————————————————————————————————————————
Responders 114.00 (85.07%) 90.00 (67.16%) 120.00 (90.91%)
Non-Responders 20.00 (14.93%) 44.00 (32.84%) 12.00 (9.09%)
Unstratified Response Analysis
Difference in Response Rates (%) -17.91 5.83
95% CI (Wald, with correction) (-27.89, -7.93) (-1.94, 13.61)
p-value (Chi-Squared Test) 0.0006 0.1436
Odds Ratio (95% CI) 0.36 (0.20 - 0.65) 1.75 (0.82 - 3.75)
Next we will add part 3. the “multinomial response table”. To do so, we are adding a row-split by response level, and then doing the same thing as we did for the binary response table above.
<- function(df, .N_col) {
s_prop in_rows(
"95% CI (Wald, with correction)" = rcell(binom.test(nrow(df), .N_col)$conf.int * 100, format = "(xx.xx, xx.xx)")
)
}
s_prop(
df = ADRS_BESRSPI %>% filter(ARM == "A: Drug X", AVALC == "CR"),
.N_col = sum(ADRS_BESRSPI$ARM == "A: Drug X")
)
RowsVerticalSection (in_rows) object print method:
----------------------------
row_name formatted_cell indent_mod
1 95% CI (Wald, with correction) (49.38, 66.67) 0
row_label
1 95% CI (Wald, with correction)
We can now create the final response table with all three parts:
basic_table() %>%
split_cols_by("ARMCD", ref_group = "ARM A") %>%
add_colcounts() %>%
analyze("rsp", s_proportion, show_labels = "hidden") %>%
analyze("is_rsp", s_unstratified_response_analysis,
show_labels = "visible", var_labels = "Unstratified Response Analysis") %>%
split_rows_by(
var = "AVALC",
split_fun = reorder_split_levels(neworder = c("CR", "PR", "SD", "NON CR/PD", "PD", "NE"), drlevels = TRUE),
nested = FALSE
%>%
) summarize_row_groups() %>%
analyze("AVALC", afun = s_prop) %>%
build_table(ADRS_BESRSPI)
ARM A ARM B ARM C
(N=134) (N=134) (N=132)
——————————————————————————————————————————————————————————————————————————————————————————————
Responders 114.00 (85.07%) 90.00 (67.16%) 120.00 (90.91%)
Non-Responders 20.00 (14.93%) 44.00 (32.84%) 12.00 (9.09%)
Unstratified Response Analysis
Difference in Response Rates (%) -17.91 5.83
95% CI (Wald, with correction) (-27.89, -7.93) (-1.94, 13.61)
p-value (Chi-Squared Test) 0.0006 0.1436
Odds Ratio (95% CI) 0.36 (0.20 - 0.65) 1.75 (0.82 - 3.75)
CR 78 (58.2%) 55 (41.0%) 97 (73.5%)
95% CI (Wald, with correction) (49.38, 66.67) (32.63, 49.87) (65.10, 80.79)
PR 36 (26.9%) 35 (26.1%) 23 (17.4%)
95% CI (Wald, with correction) (19.58, 35.20) (18.92, 34.41) (11.38, 24.99)
SD 20 (14.9%) 44 (32.8%) 12 (9.1%)
95% CI (Wald, with correction) (9.36, 22.11) (24.97, 41.47) (4.79, 15.34)
In case the we wanted to rename the levels of AVALC
and remove the CI for NE
we could do that as follows:
<- function(x) {
rsp_label <- c(
rsp_full_label CR = "Complete Response (CR)",
PR = "Partial Response (PR)",
SD = "Stable Disease (SD)",
`NON CR/PD` = "Non-CR or Non-PD (NON CR/PD)",
PD = "Progressive Disease (PD)",
NE = "Not Evaluable (NE)",
Missing = "Missing",
`NE/Missing` = "Missing or unevaluable"
)stopifnot(all(x %in% names(rsp_full_label)))
rsp_full_label[x]
}
<- basic_table() %>%
tbl split_cols_by("ARMCD", ref_group = "ARM A") %>%
add_colcounts() %>%
analyze("rsp", s_proportion, show_labels = "hidden") %>%
analyze("is_rsp", s_unstratified_response_analysis,
show_labels = "visible", var_labels = "Unstratified Response Analysis") %>%
split_rows_by(
var = "AVALC",
split_fun = keep_split_levels(c("CR", "PR", "SD", "PD"), reorder = TRUE),
nested = FALSE
%>%
) summarize_row_groups(cfun = function(df, labelstr, .N_col) {
in_rows(nrow(df) * c(1, 1/.N_col), .formats = "xx (xx.xx%)", .labels = rsp_label(labelstr))
%>%
}) analyze("AVALC", afun = s_prop) %>%
analyze("AVALC", afun = function(x, .N_col) {
in_rows(rcell(sum(x == "NE") * c(1, 1/.N_col), format = "xx.xx (xx.xx%)"), .labels = rsp_label("NE"))
nested = FALSE) %>%
}, build_table(ADRS_BESRSPI)
tbl
ARM A ARM B ARM C
(N=134) (N=134) (N=132)
——————————————————————————————————————————————————————————————————————————————————————————————
Responders 114.00 (85.07%) 90.00 (67.16%) 120.00 (90.91%)
Non-Responders 20.00 (14.93%) 44.00 (32.84%) 12.00 (9.09%)
Unstratified Response Analysis
Difference in Response Rates (%) -17.91 5.83
95% CI (Wald, with correction) (-27.89, -7.93) (-1.94, 13.61)
p-value (Chi-Squared Test) 0.0006 0.1436
Odds Ratio (95% CI) 0.36 (0.20 - 0.65) 1.75 (0.82 - 3.75)
Complete Response (CR) 78 (58.21%) 55 (41.04%) 97 (73.48%)
95% CI (Wald, with correction) (49.38, 66.67) (32.63, 49.87) (65.10, 80.79)
Partial Response (PR) 36 (26.87%) 35 (26.12%) 23 (17.42%)
95% CI (Wald, with correction) (19.58, 35.20) (18.92, 34.41) (11.38, 24.99)
Stable Disease (SD) 20 (14.93%) 44 (32.84%) 12 (9.09%)
95% CI (Wald, with correction) (9.36, 22.11) (24.97, 41.47) (4.79, 15.34)
Progressive Disease (PD) 0 (0.00%) 0 (0.00%) 0 (0.00%)
95% CI (Wald, with correction) (0.00, 2.72) (0.00, 2.72) (0.00, 2.76)
Not Evaluable (NE) 0.00 (0.00%) 0.00 (0.00%) 0.00 (0.00%)
Note that the table is missing the rows gaps to make it more readable. The row spacing feature is on the rtables
roadmap and will be implemented in future.