MetaboLights is a database for metabolomics experiments and associated information. The database allows users to deposit raw data, sample information, analysis protocols and metabolite annotation data.
metabolighteR provides easy access to publicly available MetaboLights studies via the MetaboLights RESTful API. Only API methods which retrieve data (GET) are supported by metabolighteR.
metabolighteR can be installed from CRAN or, for the latest development version, directly from GitHub using the remotes package.
# Release version from CRAN
install.packages('metabolighteR')
# Development version from GitHub
remotes::install_github('wilsontom/metabolighteR')
library(metabolighteR)
A list of all public study identification codes can be easily retrieved.
all_study_ids <- get_studies()
studies <- as.vector(all_study_ids$study)
head(studies)
#> [1] "MTBLS155" "MTBLS391" "MTBLS102" "MTBLS129" "MTBLS143" "MTBLS165"
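The identifiers come back as plain character strings, so they can be reordered or filtered with base R. The following is a minimal sketch that needs no network access; it reuses the six identifiers printed above, and the variable names study_numbers and studies_sorted are illustrative only.

```r
# Sample of identifiers copied from the output above.
studies <- c("MTBLS155", "MTBLS391", "MTBLS102", "MTBLS129", "MTBLS143", "MTBLS165")

# Strip the "MTBLS" prefix and convert the remainder to an integer.
study_numbers <- as.integer(sub("^MTBLS", "", studies))

# Reorder the identifiers by their numeric part.
studies_sorted <- studies[order(study_numbers)]
studies_sorted
#> [1] "MTBLS102" "MTBLS129" "MTBLS143" "MTBLS155" "MTBLS165" "MTBLS391"
```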
Generate a summary table containing the study ID, study title and study technology for publicly available studies.
# For the first five studies
# (%>% is the magrittr pipe; run library(magrittr) first if it is not already available)
study_titles <- purrr::map(studies[1:5], get_study_title)
names(study_titles) <- studies[1:5]
# Reshape the named list into a two-column table of study ID and title
study_titles <- tibble::as_tibble(study_titles) %>% tidyr::gather()
names(study_titles) <- c('STUDY', 'Title')
study_tech <- get_study_tech()
study_tech_filter <- study_tech %>% dplyr::filter(STUDY %in% studies[1:5])
StudyInfoTable <- dplyr::left_join(study_titles, study_tech_filter, by = 'STUDY')
StudyInfoTable
#> # A tibble: 5 x 3
#> STUDY Title TECH
#> <chr> <chr> <chr>
#> 1 MTBLS1… Release of Ecologically Relevant Metabolite… UPLC-FT-ICR-MS;UPLC-LTQ-…
#> 2 MTBLS3… Lipid Data Analyzer: Discrimination of isob… HPLC-LTQ-MS;UPLC-LTQ-MS;…
#> 3 MTBLS1… Comparative analysis of the adaptation of S… NMR
#> 4 MTBLS1… Coordinate Regulation of Metabolite Glycosy… UPLC-TOF-MS;UPLC-LTQ-MS
#> 5 MTBLS1… Comprehensive systems biology analysis of a… UPLC-LTQ-MS
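In the table above, the TECH column packs several platforms into one string separated by semicolons. A minimal sketch of splitting such values with base R follows; the three strings are copied from the untruncated TECH entries above, and the names tech_list, uses_nmr and n_platforms are illustrative, not part of metabolighteR.

```r
# TECH values copied from the untruncated rows of the table above.
tech <- c("NMR", "UPLC-TOF-MS;UPLC-LTQ-MS", "UPLC-LTQ-MS")

# Split each entry on the semicolon separator.
tech_list <- strsplit(tech, ";", fixed = TRUE)

# Flag entries that include NMR and count the platforms per entry.
uses_nmr <- vapply(tech_list, function(x) "NMR" %in% x, logical(1))
n_platforms <- lengths(tech_list)

uses_nmr
#> [1]  TRUE FALSE FALSE
n_platforms
#> [1] 1 2 1
```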
A list of all available files can be generated using the get_study_files function.
studyFileList <- get_study_files('MTBLS264')
studyFileList
#> # A tibble: 6 x 6
#> createdAt directory file status timestamp type
#> <chr> <lgl> <chr> <chr> <chr> <chr>
#> 1 March 20 2017 … FALSE a_mtbls264_NEG_mass_… active 2017032014… metadata_a…
#> 2 March 20 2017 … FALSE a_mtbls264_POS_mass_… active 2017032014… metadata_a…
#> 3 January 13 202… FALSE i_Investigation.txt active 2020011313… metadata_i…
#> 4 March 20 2017 … FALSE m_mtbls264_NEG_mass_… active 2017032014… metadata_m…
#> 5 March 20 2017 … FALSE m_mtbls264_POS_mass_… active 2017032014… metadata_m…
#> 6 March 20 2017 … FALSE s_MTBLS264.txt active 2017032014… metadata_s…
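The file names in this listing start with a one-letter prefix (a_, i_, m_, s_), which appears to follow the ISA-Tab convention for assay, investigation, metabolite assignment and sample files. A rough sketch of grouping files by that prefix is shown below. It runs offline on a small character vector; the names a_example_assay.txt and m_example_maf.tsv are hypothetical stand-ins for the truncated entries above, and the labels are an assumption rather than values returned by the API.

```r
# File names: two are hypothetical placeholders, two are taken from the listing above.
files <- c("a_example_assay.txt", "i_Investigation.txt",
           "m_example_maf.tsv", "s_MTBLS264.txt")

# Lookup from the one-letter prefix to a readable label (labels are illustrative).
prefix_labels <- c(a = "assay", i = "investigation",
                   m = "metabolite assignment", s = "sample")

# The first character of each file name is the prefix.
file_prefix <- substr(files, 1, 1)
file_kind <- unname(prefix_labels[file_prefix])

# Side-by-side view of each file and its inferred kind.
data.frame(file = files, kind = file_kind)

# Keep only the metabolite assignment files.
maf_files <- files[file_kind == "metabolite assignment"]
maf_files
```

With a real listing, the same subsetting could be applied to studyFileList$file before passing a name to the download function.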
The contents of these files can then be downloaded using the download_study_file function.
# First file in the listing: the negative-mode assay table
fileContents_A <- download_study_file('MTBLS264', studyFileList$file[1])
#> No encoding supplied: defaulting to UTF-8.
head(fileContents_A)
#> # A tibble: 6 x 35
#> Sample.Name Protocol.REF Parameter.Value.Po… Parameter.Value.D… Extract.Name
#> <chr> <chr> <lgl> <lgl> <lgl>
#> 1 Volunteer1_b… Extraction NA NA NA
#> 2 Volunteer1_p… Extraction NA NA NA
#> 3 Volunteer1_R… Extraction NA NA NA
#> 4 Volunteer2_b… Extraction NA NA NA
#> 5 Volunteer2_p… Extraction NA NA NA
#> 6 Volunteer2_R… Extraction NA NA NA
#> # … with 30 more variables: Protocol.REF.1 <chr>,
#> # Parameter.Value.Chromatography.Instrument. <chr>, Term.Source.REF <lgl>,
#> # Term.Accession.Number <lgl>, Parameter.Value.Column.model. <chr>,
#> # Parameter.Value.Column.type. <chr>, Labeled.Extract.Name <lgl>,
#> # Label <lgl>, Term.Source.REF.1 <lgl>, Term.Accession.Number.1 <lgl>,
#> # Protocol.REF.2 <chr>, Parameter.Value.Scan.polarity. <chr>,
#> # Parameter.Value.Scan.m.z.range. <chr>, Parameter.Value.Instrument. <chr>,
#> # Term.Source.REF.2 <lgl>, Term.Accession.Number.2 <lgl>,
#> # Parameter.Value.Ion.source. <chr>, Term.Source.REF.3 <chr>,
#> # Term.Accession.Number.3 <chr>, Parameter.Value.Mass.analyzer. <chr>,
#> # Term.Source.REF.4 <chr>, Term.Accession.Number.4 <chr>,
#> # MS.Assay.Name <chr>, Raw.Spectral.Data.File <chr>, Protocol.REF.3 <chr>,
#> # Normalization.Name <lgl>, Derived.Spectral.Data.File <lgl>,
#> # Protocol.REF.4 <chr>, Data.Transformation.Name <lgl>,
#> # Metabolite.Assignment.File <chr>
# Fourth file in the listing: the negative-mode metabolite assignment table
fileContents_B <- download_study_file('MTBLS264', studyFileList$file[4])
#> No encoding supplied: defaulting to UTF-8.
head(fileContents_B)
#> # A tibble: 6 x 69
#> database_identi… chemical_formula smiles inchi metabolite_iden… mass_to_charge
#> <chr> <chr> <chr> <chr> <chr> <dbl>
#> 1 CHEBI:17552 C10H15N5O11P2 Nc1nc… InCh… GDP 442.
#> 2 CHEBI:17345 C10H14N5O8P Nc1nc… InCh… GMP 362.
#> 3 CHEBI:16695 C9H13N2O9P O[C@@… InCh… UMP 323.
#> 4 CHEBI:15713 C9H15N2O15P3 O[C@@… InCh… UTP 483.
#> 5 CHEBI:17368 C5H4N4O O=c1[… InCh… Hypoxanthine 135.
#> 6 CHEBI:17775 C5H4N4O3 O=c1[… InCh… Urate 167.
#> # … with 63 more variables: fragmentation <lgl>, modifications <chr>,
#> # charge <chr>, retention_time <dbl>, taxid <chr>, species <chr>,
#> # database <lgl>, database_version <lgl>, reliability <chr>, uri <lgl>,
#> # search_engine <lgl>, search_engine_score <lgl>,
#> # smallmolecule_abundance_sub <lgl>, smallmolecule_abundance_stdev_sub <lgl>,
#> # smallmolecule_abundance_std_error_sub <lgl>, Volunteer1_blood_0h_NEG <dbl>,
#> # Volunteer1_plasma_0h_NEG <dbl>, Volunteer1_RBC_0h_NEG <dbl>,
#> # Volunteer2_blood_0h_NEG <dbl>, Volunteer2_plasma_0h_NEG <dbl>,
#> # Volunteer2_RBC_0h_NEG <dbl>, Volunteer3_blood_0h_NEG <dbl>,
#> # Volunteer3_plasma_0h_NEG <dbl>, Volunteer3_RBC_0h_NEG <dbl>,
#> # Volunteer4_blood_0h_NEG <dbl>, Volunteer4_plasma_0h_NEG <dbl>,
#> # Volunteer4_RBC_0h_NEG <dbl>, Volunteer1_blood_1h_NEG <dbl>,
#> # Volunteer1_plasma_1h_NEG <dbl>, Volunteer1_RBC_1h_NEG <dbl>,
#> # Volunteer2_blood_1h_NEG <dbl>, Volunteer2_plasma_1h_NEG <dbl>,
#> # Volunteer2_RBC_1h_NEG <dbl>, Volunteer3_blood_1h_NEG <dbl>,
#> # Volunteer3_plasma_1h_NEG <dbl>, Volunteer3_RBC_1h_NEG <dbl>,
#> # Volunteer4_blood_1h_NEG <dbl>, Volunteer4_plasma_1h_NEG <dbl>,
#> # Volunteer4_RBC_1h_NEG <dbl>, Volunteer1_blood_4h_NEG <dbl>,
#> # Volunteer1_plasma_4h_NEG <dbl>, Volunteer1_RBC_4h_NEG <dbl>,
#> # Volunteer2_blood_4h_NEG <dbl>, Volunteer2_plasma_4h_NEG <dbl>,
#> # Volunteer2_RBC_4h_NEG <dbl>, Volunteer3_blood_4h_NEG <dbl>,
#> # Volunteer3_plasma_4h_NEG <dbl>, Volunteer3_RBC_4h_NEG <dbl>,
#> # Volunteer4_blood_4h_NEG <dbl>, Volunteer4_plasma_4h_NEG <dbl>,
#> # Volunteer4_RBC_4h_NEG <dbl>, Volunteer1_blood_24h_NEG <dbl>,
#> # Volunteer1_plasma_24h_NEG <dbl>, Volunteer1_RBC_24h_NEG <dbl>,
#> # Volunteer2_blood_24h_NEG <dbl>, Volunteer2_plasma_24h_NEG <dbl>,
#> # Volunteer2_RBC_24h_NEG <dbl>, Volunteer3_blood_24h_NEG <dbl>,
#> # Volunteer3_plasma_24h_NEG <dbl>, Volunteer3_RBC_24h_NEG <dbl>,
#> # Volunteer4_blood_24h_NEG <dbl>, Volunteer4_plasma_24h_NEG <dbl>,
#> # Volunteer4_RBC_24h_NEG <dbl>
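The sample columns in this table encode volunteer, sample matrix and time point in their names (for example Volunteer1_blood_0h_NEG). A minimal sketch of recovering those fields with base R follows. It uses a toy data frame in place of the downloaded file; the intensity values are made up, and the names maf, sample_cols and sample_info are illustrative only.

```r
# Toy stand-in for the metabolite assignment table; intensity values are made up.
maf <- data.frame(
  metabolite_identification = c("GDP", "GMP"),
  Volunteer1_blood_0h_NEG = c(10, 20),
  Volunteer1_plasma_0h_NEG = c(30, 40),
  stringsAsFactors = FALSE
)

# Select the sample columns by their naming pattern.
sample_cols <- grep("^Volunteer[0-9]+_.*_NEG$", names(maf), value = TRUE)

# Split each column name on underscores into its four components.
parts <- do.call(rbind, strsplit(sample_cols, "_"))

# Collect the components into a sample annotation table.
sample_info <- data.frame(
  sample = sample_cols,
  volunteer = parts[, 1],
  matrix = parts[, 2],
  timepoint = parts[, 3],
  stringsAsFactors = FALSE
)
sample_info
```

The resulting annotation table can be joined back to a long-format version of the intensities when grouping by matrix or time point.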