The pull_data_synapse() function accesses the specified version of the clinical and genomic GENIE BPC data from Synapse and reads it into the R environment.
This vignette walks through the pull_data_synapse() function.
Before beginning this tutorial, be sure to have a Synapse account. If you do not yet have a Synapse account, please follow the instructions below:
Note: Please allow up to a week to review and grant access.
Users must log in to Synapse to access the data successfully.
To set your Synapse credentials during each session, call:
set_synapse_credentials(username = "your_username", password = "your_password")
To store authentication information in your environment variables, add the following to your .Renviron file (tip: you can use usethis::edit_r_environ() to open and edit this file):
SYNAPSE_USERNAME = <your-username>
SYNAPSE_PASSWORD = <your-password>
Alternatively, you can pass your username and password to each individual data pull function if preferred, although it is recommended that you manage your passwords outside of your scripts for security purposes.
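Before pulling data, it can help to confirm that the two variables are visible to the current R session. The following is a minimal sketch in base R; the helper name has_synapse_env() is hypothetical and not part of the package, and it only checks that the variables are non-empty, not that the credentials are valid.

```r
# Hypothetical helper (not part of genieBPC): report whether both
# Synapse variables are set, without printing their values.
has_synapse_env <- function() {
  vars <- c("SYNAPSE_USERNAME", "SYNAPSE_PASSWORD")
  values <- Sys.getenv(vars, unset = "")
  all(nzchar(values))
}

if (!has_synapse_env()) {
  message("SYNAPSE_USERNAME and/or SYNAPSE_PASSWORD is not set; check .Renviron")
}
```

Note that changes to .Renviron take effect only after R is restarted.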
Let’s start by reviewing the pull_data_synapse() arguments.
Argument | Description |
---|---|
`cohort` | Vector or list specifying the cohort(s) of interest. Each value must be one of 'NSCLC' (Non-Small Cell Lung Cancer), 'CRC' (Colorectal Cancer), or 'BrCa' (Breast Cancer). |
`version` | Vector specifying the version of the data. Each value must be one of 'v1.1-consortium', 'v1.2-consortium', 'v2.1-consortium', or 'v2.0-public'. When multiple cohorts are given, versions are matched to cohorts by position, so the two vectors must be in the same order to pull the correct data. |
`download_location` | If `NULL` (default), data are returned as a list of data frames, with each requested dataset as a list item. Otherwise, specify a folder path to have the data downloaded there automatically. When a path is specified, data are not read into the R environment. |
`username` | Synapse username |
`password` | Synapse password |
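Because cohorts and versions are matched by position, a quick check before calling pull_data_synapse() can catch a mismatch early. The following is a minimal sketch in base R; check_cohort_version() is a hypothetical helper, not a package function. Its allowed values are copied from the argument table above, and it does not verify that a given version actually exists for a given cohort.

```r
# Hypothetical pre-flight check (not part of genieBPC). The allowed
# values are copied from the argument table above.
check_cohort_version <- function(cohort, version) {
  valid_cohorts  <- c("NSCLC", "CRC", "BrCa")
  valid_versions <- c("v1.1-consortium", "v1.2-consortium",
                      "v2.1-consortium", "v2.0-public")
  cohort <- unlist(cohort)  # accept either a vector or a list
  stopifnot(
    length(cohort) == length(version),
    all(cohort %in% valid_cohorts),
    all(version %in% valid_versions)
  )
  # Pair each cohort with the version in the same position
  data.frame(cohort = cohort, version = version)
}

check_cohort_version(c("NSCLC", "CRC"),
                     c("v2.1-consortium", "v1.1-consortium"))
```

The returned data frame shows which version would be paired with which cohort, which makes an accidental swap easy to spot.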
Pull version 2.0-public of the NSCLC data from Synapse and store in the local environment.
nsclc_2_0 = pull_data_synapse("NSCLC", version = "v2.0-public")
The resulting nsclc_2_0 object is a list in which each element is a dataset.
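A list of datasets like this can be explored with base R. The sketch below uses a small stand-in list so that it runs without Synapse access. The element names and contents are placeholders, not the real GENIE BPC datasets, and the actual names depend on the cohort and version pulled.

```r
# Stand-in for the object returned by pull_data_synapse(); the element
# names and contents are placeholders, not real GENIE BPC datasets.
example_pull <- list(
  dataset_a = data.frame(id = 1:3),
  dataset_b = data.frame(id = 1:2, value = c("x", "y"))
)

names(example_pull)                 # names of the datasets in the list
sapply(example_pull, nrow)          # number of rows in each dataset
first_dataset <- example_pull[[1]]  # extract one dataset as a data frame
```

The same calls can be applied to nsclc_2_0 once the data have been pulled.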
Pull version 2.1-consortium of the NSCLC data and version 1.1-consortium of the CRC data.
pull_data_synapse(c("NSCLC", "CRC"),
                  version = c("v2.1-consortium", "v1.1-consortium"))
Pull version 1.2-consortium of the NSCLC data and version 1.1-consortium of the CRC data.
pull_data_synapse(c("NSCLC", "CRC"),
                  version = c("v1.2-consortium", "v1.1-consortium"))