This vignette covers the the less-typical uses of REDCapR to interact with REDCap through its API.
There is some information that is specific to a REDCap project, as opposed to an individual operation. This includes the (1) uri of the server, and the (2) token for the user’s project. This is hosted on a machine used in REDCapR’s public test suite, so you can run this example from any computer. Unless tests are running.
Other than PHI-free demos, we strongly suggest storing tokens
securely and avoiding hard-coding them like below. Our recommendation is
to store tokens in
a database. If that is not feasible for your institution, consider
storing them in a secured csv and retrieving with REDCapR::retrieve_credential_local()
.
library(REDCapR) #Load the package into the current R session.
<- "https://bbmc.ouhsc.edu/redcap/api/"
uri <- "9A81268476645C4E5F03428B8AC3AA7B"
token_simple <- "0434F0E9CF53ED0587847AB6E51DE762" token_longitudinal
Disclaimer: Occasionally we’re asked for a longitudinal dataset to be converted from a “long/tall format” (where typically each row is one observation for a participant) to a “wide format” (where each row is on participant). Usually we advise against it. Besides all the database benefits of a long structure, a wide structure restricts your options with the stat routine. No modern longitudinal analysis procedures (e.g., growth curve models or multilevel/hierarchical models) accept wide. You’re pretty much stuck with repeated measures anova, which is very inflexible for real-world medical-ish analyses. It requires a patient to have a measurement at every time point; otherwise the anova excludes the patient entirely.
However we like going wide to produce visual tables for publications, and here’s one way to do it in R. First retrieve the dataset from REDCap.
library(magrittr)
suppressPackageStartupMessages(requireNamespace("dplyr"))
suppressPackageStartupMessages(requireNamespace("tidyr"))
<- c("dose_1_arm_1", "visit_1_arm_1", "dose_2_arm_1", "visit_2_arm_1")
events_to_retain
<- REDCapR::redcap_read_oneshot(redcap_uri = uri, token = token_longitudinal)$data ds_long
#> 18 records and 125 columns were read from REDCap in 0.1 seconds. The http status code was 200.
%>%
ds_long ::select(study_id, redcap_event_name, pmq1, pmq2, pmq3, pmq4) dplyr
#> study_id redcap_event_name pmq1 pmq2 pmq3 pmq4
#> 1 100 enrollment_arm_1 NA NA NA NA
#> 2 100 dose_1_arm_1 2 2 1 1
#> 3 100 visit_1_arm_1 1 0 0 0
#> 4 100 dose_2_arm_1 3 1 0 0
#> 5 100 visit_2_arm_1 0 1 0 0
#> 6 100 final_visit_arm_1 NA NA NA NA
#> 7 220 enrollment_arm_1 NA NA NA NA
#> 8 220 dose_1_arm_1 0 1 0 2
#> 9 220 visit_1_arm_1 0 3 1 0
#> 10 220 dose_2_arm_1 1 2 0 1
#> 11 220 visit_2_arm_1 3 4 1 0
#> 12 220 final_visit_arm_1 NA NA NA NA
#> 13 304 enrollment_arm_2 NA NA NA NA
#> 14 304 deadline_to_opt_ou_arm_2 NA NA NA NA
#> 15 304 first_dose_arm_2 0 1 0 0
#> 16 304 first_visit_arm_2 2 0 0 0
#> 17 304 final_visit_arm_2 NA NA NA NA
#> 18 304 deadline_to_return_arm_2 NA NA NA NA
When widening only one variable (e.g., pmq1
),
the code’s pretty simple:
<-
ds_wide %>%
ds_long ::select(study_id, redcap_event_name, pmq1) %>%
dplyr::filter(redcap_event_name %in% events_to_retain) %>%
dplyr::pivot_wider(
tidyrid_cols = study_id,
names_from = redcap_event_name,
values_from = pmq1
)# For old versions of tidyr that predate `pivot_wider()`:
# tidyr::spread(key=redcap_event_name, value=pmq1)
ds_wide
#> # A tibble: 2 × 5
#> study_id dose_1_arm_1 visit_1_arm_1 dose_2_arm_1 visit_2_arm_1
#> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 100 2 1 3 0
#> 2 220 0 0 1 3
In some scenarios, multiple variables (e.g.,
pmq1
- pmq4
) can be widened in a single
tidyr::pivot_wider()
operation. This example contains the
additional wrinkle that the REDCap event names “first_dose” and
“first_visit” are renamed “dose_1” and “visit_1”, which will help all
the values be dose and visit values be proper numbers.
<- "^(\\w+?)_arm_(\\d)$"
pattern <-
ds_wide %>%
ds_long ::select(study_id, redcap_event_name, pmq1, pmq2, pmq3, pmq4) %>%
dplyr::mutate(
dplyrevent = sub(pattern, "\\1", redcap_event_name),
event = dplyr::recode(event, "first_dose"="dose_1", "first_visit"="visit_1"),
arm = as.integer(sub(pattern, "\\2", redcap_event_name))
%>%
) ::select(study_id, event, arm, pmq1, pmq2, pmq3, pmq4) %>%
dplyr::filter(!(event %in% c(
dplyr"enrollment", "final_visit", "deadline_to_return", "deadline_to_opt_ou")
%>%
)) ::pivot_wider(
tidyrid_cols = c(study_id, arm),
names_from = event,
values_from = c(pmq1, pmq2, pmq3, pmq4)
)
ds_wide
#> # A tibble: 3 × 18
#> study_id arm pmq1_…¹ pmq1_…² pmq1_…³ pmq1_…⁴ pmq2_…⁵ pmq2_…⁶ pmq2_…⁷ pmq2_…⁸
#> <dbl> <int> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 100 1 2 1 3 0 2 0 1 1
#> 2 220 1 0 0 1 3 1 3 2 4
#> 3 304 2 0 2 NA NA 1 0 NA NA
#> # … with 8 more variables: pmq3_dose_1 <dbl>, pmq3_visit_1 <dbl>,
#> # pmq3_dose_2 <dbl>, pmq3_visit_2 <dbl>, pmq4_dose_1 <dbl>,
#> # pmq4_visit_1 <dbl>, pmq4_dose_2 <dbl>, pmq4_visit_2 <dbl>, and abbreviated
#> # variable names ¹pmq1_dose_1, ²pmq1_visit_1, ³pmq1_dose_2, ⁴pmq1_visit_2,
#> # ⁵pmq2_dose_1, ⁶pmq2_visit_1, ⁷pmq2_dose_2, ⁸pmq2_visit_2
#> # ℹ Use `colnames()` to see all variable names
However, in other widening scenarios, it can be easier to go even
longer/taller (e.g., ds_eav
) before reversing
direction and going wide.
<-
ds_eav %>%
ds_long ::select(study_id, redcap_event_name, pmq1, pmq2, pmq3, pmq4) %>%
dplyr::mutate(
dplyrevent = sub(pattern, "\\1", redcap_event_name),
event = dplyr::recode(event, "first_dose" = "dose_1", "first_visit" = "visit_1"),
arm = as.integer(sub(pattern, "\\2", redcap_event_name))
%>%
) ::select(study_id, event, arm, pmq1, pmq2, pmq3, pmq4) %>%
dplyr::pivot_longer(
tidyrcols = c(pmq1, pmq2, pmq3, pmq4),
names_to = "key",
values_to = "value"
%>%
) # For old versions of tidyr that predate `pivot_wider()`:
# tidyr::gather(key=key, value=value, pmq1, pmq2, pmq3, pmq4) %>%
::filter(!(event %in% c(
dplyr"enrollment", "final_visit", "deadline_to_return", "deadline_to_opt_ou")
%>%
)) ::mutate( # Simulate correcting for mismatched names across arms:
dplyrkey = paste0(key, "_", event)
%>%
) ::select(-event)
dplyr
# Show the first 10 rows of the EAV table.
%>%
ds_eav head(10)
#> # A tibble: 10 × 4
#> study_id arm key value
#> <dbl> <int> <chr> <dbl>
#> 1 100 1 pmq1_dose_1 2
#> 2 100 1 pmq2_dose_1 2
#> 3 100 1 pmq3_dose_1 1
#> 4 100 1 pmq4_dose_1 1
#> 5 100 1 pmq1_visit_1 1
#> 6 100 1 pmq2_visit_1 0
#> 7 100 1 pmq3_visit_1 0
#> 8 100 1 pmq4_visit_1 0
#> 9 100 1 pmq1_dose_2 3
#> 10 100 1 pmq2_dose_2 1
# Spread the EAV to wide.
<-
ds_wide_2 %>%
ds_eav ::pivot_wider(
tidyrid_cols = c(study_id, arm),
names_from = key,
values_from = value
)# For old versions of tidyr that predate `pivot_wider()`:
# tidyr::spread(key=key, value=value)
ds_wide_2
#> # A tibble: 3 × 18
#> study_id arm pmq1_…¹ pmq2_…² pmq3_…³ pmq4_…⁴ pmq1_…⁵ pmq2_…⁶ pmq3_…⁷ pmq4_…⁸
#> <dbl> <int> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 100 1 2 2 1 1 1 0 0 0
#> 2 220 1 0 1 0 2 0 3 1 0
#> 3 304 2 0 1 0 0 2 0 0 0
#> # … with 8 more variables: pmq1_dose_2 <dbl>, pmq2_dose_2 <dbl>,
#> # pmq3_dose_2 <dbl>, pmq4_dose_2 <dbl>, pmq1_visit_2 <dbl>,
#> # pmq2_visit_2 <dbl>, pmq3_visit_2 <dbl>, pmq4_visit_2 <dbl>, and abbreviated
#> # variable names ¹pmq1_dose_1, ²pmq2_dose_1, ³pmq3_dose_1, ⁴pmq4_dose_1,
#> # ⁵pmq1_visit_1, ⁶pmq2_visit_1, ⁷pmq3_visit_1, ⁸pmq4_visit_1
#> # ℹ Use `colnames()` to see all variable names
Lots of packages and documentation exist. Our current preference is the tidyverse approach to pivoting, but the data.table approach is worth considering if you’re comfortable with that package. This Stack Overflow post describes several ways. We recommend against the reshape and reshape2 packages, because their developers have replaced them with the tidyr functions described above.
If you require a feature that is not available from your instance’s API, first upgrade your institution’s REDCap instance and see if the feature has been added recently. Second, check if someone has released the desired API-like features as an REDCap External Module.
Third, you may need to query the database underneath REDCap’s web server. The Transfer Credentials section of the Security Database Vignette provides a complete example of using R to query the MySQL database through odbc.
We find it’s best to develop the query in MySQL Workbench,
then copy the code to R (or alternatively, use OuhscMunge::execute_sql_file()
).
Here is an example that retrieves the first_submit_time
,
which is helpful if you need a timestamp from surveys that were not
marked as completed. Replace ‘444’ with your pid, and 1001 through 1003
with the desired events.
SELECT
as participant_survey_id
p.participant_id record as record_id
,r.
,p.event_idas event_name
,e.descrip
,r.first_submit_time
,r.completion_time
-- ,p.*
-- ,r.*
FROM redcapv3.redcap_surveys_participants as p
left join redcapv3.redcap_surveys_response as r on p.participant_id = r.participant_id
left join redcapv3.redcap_events_metadata as e on p.event_id = e.event_id
where
= 444
p.survey_id and
in (
p.event_id 1001, -- start of the year
1002, -- mid term
1003 -- end of year
)
The official cURL site discusses the process of using SSL to verify the server being connected to.
Use the SSL cert file that come with the openssl
package.
<- system.file("cacert.pem", package = "openssl")
cert_location if (file.exists(cert_location)) {
<- list(cainfo = cert_location)
config_options <- redcap_read_oneshot(
ds_different_cert_file redcap_uri = uri,
token = token_simple,
config_options = config_options
$data
) }
#> 5 records and 25 columns were read from REDCap in 0.1 seconds. The http status code was 200.
Force the connection to use SSL=3 (which is not preferred, and possibly insecure).
<- list(sslversion = 3)
config_options <- redcap_read_oneshot(
ds_ssl_3 redcap_uri = uri,
token = token_simple,
config_options = config_options
$data )
#> 5 records and 25 columns were read from REDCap in 0.1 seconds. The http status code was 200.
<- list(ssl.verifypeer = FALSE)
config_options <- redcap_read_oneshot(
ds_no_ssl redcap_uri = uri,
token = token_simple,
config_options = config_options
$data )
#> 5 records and 25 columns were read from REDCap in 0.1 seconds. The http status code was 200.
The solution https://stackoverflow.com/a/51013678/1082435 converts levels specified in SPSS output like
SEX 0 Male
1 Female
LANGUAGE 1 English
2 Spanish
3 Other
6 Unknown
to a dropdown choices in a REDCap data dictionary like
Variable Values
SEX 0, Male | 1, Female
LANGUAGE 1, English | 2, Spanish | 3, Other | 6, Unknown
For the sake of documentation and reproducibility, the current report was rendered in the following environment. Click the line below to expand.
#> ─ Session info ───────────────────────────────────────────────────────────────
#> setting value
#> version R version 4.2.0 (2022-04-22)
#> os Ubuntu 22.04.1 LTS
#> system x86_64, linux-gnu
#> ui X11
#> language (EN)
#> collate C
#> ctype en_US.UTF-8
#> tz America/Chicago
#> date 2022-08-10
#> pandoc 2.18 @ /usr/lib/rstudio/bin/quarto/bin/tools/ (via rmarkdown)
#>
#> ─ Packages ───────────────────────────────────────────────────────────────────
#> package * version date (UTC) lib source
#> archive 1.1.5 2022-05-06 [3] CRAN (R 4.2.0)
#> assertthat 0.2.1 2019-03-21 [3] CRAN (R 4.2.0)
#> backports 1.4.1 2021-12-13 [3] CRAN (R 4.2.0)
#> bit 4.0.4 2020-08-04 [3] CRAN (R 4.2.0)
#> bit64 4.0.5 2020-08-30 [3] CRAN (R 4.2.0)
#> bslib 0.4.0 2022-07-16 [3] CRAN (R 4.2.0)
#> cachem 1.0.6 2021-08-19 [3] CRAN (R 4.2.0)
#> checkmate 2.1.0 2022-04-21 [3] CRAN (R 4.2.0)
#> cli 3.3.0 2022-04-25 [3] CRAN (R 4.2.0)
#> colorspace 2.0-3 2022-02-21 [3] CRAN (R 4.2.0)
#> crayon 1.5.1 2022-03-26 [3] CRAN (R 4.2.0)
#> curl 4.3.2 2021-06-23 [3] CRAN (R 4.2.0)
#> DBI 1.1.3 2022-06-18 [3] CRAN (R 4.2.0)
#> digest 0.6.29 2021-12-01 [3] CRAN (R 4.2.0)
#> dplyr 1.0.9 2022-04-28 [3] CRAN (R 4.2.0)
#> ellipsis 0.3.2 2021-04-29 [3] CRAN (R 4.2.0)
#> evaluate 0.16 2022-08-09 [3] CRAN (R 4.2.0)
#> fansi 1.0.3 2022-03-24 [3] CRAN (R 4.2.0)
#> fastmap 1.1.0 2021-01-25 [3] CRAN (R 4.2.0)
#> generics 0.1.3 2022-07-05 [3] CRAN (R 4.2.0)
#> glue 1.6.2 2022-02-24 [3] CRAN (R 4.2.0)
#> highr 0.9 2021-04-16 [3] CRAN (R 4.2.0)
#> hms 1.1.1 2021-09-26 [3] CRAN (R 4.2.0)
#> htmltools 0.5.3 2022-07-18 [3] CRAN (R 4.2.0)
#> httr 1.4.3 2022-05-04 [3] CRAN (R 4.2.0)
#> jquerylib 0.1.4 2021-04-26 [3] CRAN (R 4.2.0)
#> jsonlite 1.8.0 2022-02-22 [3] CRAN (R 4.2.0)
#> kableExtra 1.3.4 2021-02-20 [3] CRAN (R 4.2.0)
#> knitr * 1.39 2022-04-26 [3] CRAN (R 4.2.0)
#> lifecycle 1.0.1 2021-09-24 [3] CRAN (R 4.2.0)
#> magrittr * 2.0.3 2022-03-30 [3] CRAN (R 4.2.0)
#> munsell 0.5.0 2018-06-12 [3] CRAN (R 4.2.0)
#> pillar 1.8.0 2022-07-18 [3] CRAN (R 4.2.0)
#> pkgconfig 2.0.3 2019-09-22 [3] CRAN (R 4.2.0)
#> png 0.1-7 2013-12-03 [3] CRAN (R 4.2.0)
#> purrr 0.3.4 2020-04-17 [3] CRAN (R 4.2.0)
#> R6 2.5.1 2021-08-19 [3] CRAN (R 4.2.0)
#> readr 2.1.2 2022-01-30 [3] CRAN (R 4.2.0)
#> REDCapR * 1.1.0 2022-08-10 [1] local
#> rlang 1.0.4 2022-07-12 [3] CRAN (R 4.2.0)
#> rmarkdown 2.14 2022-04-25 [3] CRAN (R 4.2.0)
#> rstudioapi 0.13 2020-11-12 [3] CRAN (R 4.2.0)
#> rvest 1.0.2 2021-10-16 [3] CRAN (R 4.2.0)
#> sass 0.4.2 2022-07-16 [3] CRAN (R 4.2.0)
#> scales 1.2.0 2022-04-13 [3] CRAN (R 4.2.0)
#> sessioninfo 1.2.2 2021-12-06 [3] CRAN (R 4.2.0)
#> stringi 1.7.8 2022-07-11 [3] CRAN (R 4.2.0)
#> stringr 1.4.0 2019-02-10 [3] CRAN (R 4.2.0)
#> svglite 2.1.0 2022-02-03 [3] CRAN (R 4.2.0)
#> systemfonts 1.0.4 2022-02-11 [3] CRAN (R 4.2.0)
#> tibble 3.1.8 2022-07-22 [3] CRAN (R 4.2.0)
#> tidyr 1.2.0 2022-02-01 [3] CRAN (R 4.2.0)
#> tidyselect 1.1.2 2022-02-21 [3] CRAN (R 4.2.0)
#> tzdb 0.3.0 2022-03-28 [3] CRAN (R 4.2.0)
#> utf8 1.2.2 2021-07-24 [3] CRAN (R 4.2.0)
#> vctrs 0.4.1 2022-04-13 [3] CRAN (R 4.2.0)
#> viridisLite 0.4.0 2021-04-13 [3] CRAN (R 4.2.0)
#> vroom 1.5.7 2021-11-30 [3] CRAN (R 4.2.0)
#> webshot 0.5.3 2022-04-14 [3] CRAN (R 4.2.0)
#> withr 2.5.0 2022-03-03 [3] CRAN (R 4.2.0)
#> xfun 0.32 2022-08-10 [3] CRAN (R 4.2.0)
#> xml2 1.3.3 2021-11-30 [3] CRAN (R 4.2.0)
#> yaml 2.3.5 2022-02-21 [3] CRAN (R 4.2.0)
#>
#> [1] /tmp/RtmpSYJzex/Rinst29b36238315a1
#> [2] /tmp/RtmplvH3JR/temp_libpath1796625c20c6
#> [3] /home/wibeasley/R/x86_64-pc-linux-gnu-library/4.2
#> [4] /usr/local/lib/R/site-library
#> [5] /usr/lib/R/site-library
#> [6] /usr/lib/R/library
#>
#> ──────────────────────────────────────────────────────────────────────────────
Report rendered by wibeasley at 2022-08-10, 09:25 -0500 in 2 seconds.