This short vignette shows how to correct overloaded signals in (i) an artificial test case and (ii) a provided real data set. To achieve this, we need to load the package functions as well as a small data example in xcmsRaw format.
library(CorrectOverloadedPeaks)
data("mzXML_data")
Let’s model a typical overloaded signal, as frequently occurs in GC-APCI-MS, using the provided function ModelGaussPeak.
pk <- CorrectOverloadedPeaks::ModelGaussPeak(height=10^7, width=3, scan_rate=10, e=0, ds=8*10^6, base_line=10^2)
plot(pk, main="Gaussian peak of true intensity 10^7 but cut off at 8*10^6")
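To make the shape of such a signal concrete, the following base R sketch builds a Gaussian peak and clips it at a detector saturation limit. It is only an illustration under stated assumptions: `model_clipped_peak` and its parameters (`sigma`, `rt_max`, `saturation`) are hypothetical names, and the code is not the implementation of ModelGaussPeak.

```r
# Hypothetical stand-in for an overloaded detector signal.
# This is NOT the package's ModelGaussPeak; all parameter names are assumptions.
model_clipped_peak <- function(height = 1e7, sigma = 1, rt_max = 0,
                               scan_rate = 10, saturation = 8e6,
                               base_line = 1e2) {
  # Sample retention times at `scan_rate` points per time unit over +/- 5 sigma
  rt <- seq(rt_max - 5 * sigma, rt_max + 5 * sigma, by = 1 / scan_rate)
  # Undistorted Gaussian signal on top of a constant baseline
  true_int <- base_line + height * exp(-(rt - rt_max)^2 / (2 * sigma^2))
  # The detector cannot report more than `saturation`, so the apex is cut off
  int <- pmin(true_int, saturation)
  data.frame(rt = rt, int = int, overloaded = true_int > saturation)
}

sim <- model_clipped_peak()
plot(sim$rt, sim$int, type = "b", xlab = "RT", ylab = "Intensity")
```

The flat top in the resulting plot corresponds to the scans flagged in `sim$overloaded`; these are the data points a correction has to reconstruct.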
Now we roughly estimate the peak borders before applying the provided function FitGaussPeak to correct the peak data.
idx <- pk[,"int"]>0.005 * max(pk[,"int"])
tmp <- CorrectOverloadedPeaks::FitGaussPeak(x=pk[idx,"rt"], y=pk[idx,"int"], silent=FALSE, xlab="RT", ylab="Intensity")
## [1] "Number of converging sollutions: 10, keeping 1"
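The idea behind this step, estimating rough peak borders and then fitting a Gaussian to the non-saturated scans only, can be sketched in base R. The sketch below swaps in a simple technique, a least-squares parabola fit on the log scale via `lm`, and uses simulated, noise-free data without a baseline. It is an assumption-laden illustration and does not reproduce the optimizer used by FitGaussPeak.

```r
# Simulated overloaded peak (all values are illustrative assumptions)
rt <- seq(-5, 5, by = 0.1)
int <- pmin(1e7 * exp(-rt^2 / 2), 8e6)  # true height 1e7, clipped at 8e6

# Rough peak borders: keep scans above 0.5% of the observed maximum
in_peak <- int > 0.005 * max(int)
# Scans sitting at the observed maximum are treated as saturated
saturated <- int >= max(int)
d <- data.frame(rt = rt[in_peak & !saturated], int = int[in_peak & !saturated])

# A Gaussian is a parabola on the log scale: log(y) = a + b*x + c*x^2
fit <- lm(log(int) ~ rt + I(rt^2), data = d)
co <- unname(coef(fit))

# Convert the parabola coefficients back to Gaussian parameters
sigma_hat  <- sqrt(-1 / (2 * co[3]))              # peak width (standard deviation)
rt_hat     <- -co[2] / (2 * co[3])                # apex position
height_hat <- exp(co[1] - co[2]^2 / (4 * co[3]))  # reconstructed apex intensity
```

On real data the log transform amplifies noise in the low-intensity flanks, so a weighted or non-linear fit is usually the more robust choice.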
The QC plot generated by FitGaussPeak shows the optimal solution found (green line), the substituted intensity values (grey circles) and the obtained parameters (blue text), including the probable peak height (max_int=9.7*10^6), which is very close to the true peak height (10^7).
Now let’s extend this simplified process to peaks from a real data set. The following function call will generate (i) a PDF in the working directory with QC plots for 10 peaks from 5 chromatographic regions, (ii) processing information printed to the console and (iii) a new file “cor_df_all.RData” in the working directory containing all extracted but non-corrected mass traces.
tmp <- CorrectOverloadedPeaks::CorrectOverloadedPeaks(data=mzXML_data, method="EMG", testing=TRUE)
##
## Processing... mzXML_data
##
## Trying to correct 1 overloaded regions.
## [1] "Processing Region/Mass: 1 / 1"
## [1] "Number of converging sollutions: 192, keeping 153"
## [1] "Processing Region/Mass: 1 / 2"
## [1] "Number of converging sollutions: 194, keeping 36"
## [1] "Storing non-corrected data information in 'cor_df_all.RData'"
Let’s load these non-corrected mass traces to further illustrate the package capabilities. For instance, we can reprocess peak 2 from region 4 using the isotopic ratio approach:
load("cor_df_all.RData")
head(cor_df_all[[1]][[1]])
## Scan RT mz0 int0 mz1 int1 modified
## 169 169 258.771 176.0924 6411 177.0939 1027 FALSE
## 170 170 258.880 176.0923 17631 177.0941 3353 FALSE
## 171 171 258.989 176.0920 53682 177.0938 8674 FALSE
## 172 172 259.098 176.0919 141884 177.0935 23987 FALSE
## 173 173 259.208 176.0921 335177 177.0937 52495 FALSE
## 174 174 259.316 176.0924 673549 177.0938 111346 FALSE
tmp <- CorrectOverloadedPeaks::FitPeakByIsotopicRatio(cor_df=cor_df_all[[1]][[1]], silent=FALSE)
The extracted data contain RT and intensity information for the overloaded mass trace (mz=350.164) as well as isotopes of this mz up to the first isotope which is not itself overloaded (M+2, green triangles). This isotope is evaluated with respect to its ratio to M+0 in the peak front (15.9%), and this ratio in turn is used to scale up the overloaded data points of M+0 (grey circles), as indicated by the black line. The data could of course alternatively be processed using the Gauss method, as shown previously for the artificial data.
tmp <- CorrectOverloadedPeaks::FitGaussPeak(x=cor_df_all[[1]][[1]][,"RT"], y=cor_df_all[[1]][[1]][,"int0"], silent=FALSE, xlab="RT", ylab="Intensity")
## [1] "Number of converging sollutions: 10, keeping 7"
## [1] TRUE
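For comparison with the Gauss fit, the isotopic ratio correction described earlier can also be sketched in base R. The example uses simulated traces whose names `int0` and `int1` merely mirror the columns printed by `head()` above. The 15.9% ratio is borrowed from the text as an illustrative value, and the use of the median is an assumption. The code does not reproduce FitPeakByIsotopicRatio.

```r
# Simulated traces (illustrative assumptions, not package data)
rt <- seq(-5, 5, by = 0.1)
true_m0 <- 1e7 * exp(-rt^2 / 2)  # undistorted M+0 signal
int1 <- 0.159 * true_m0          # isotope trace, stays below saturation
int0 <- pmin(true_m0, 8e6)       # M+0 trace, clipped at 8e6

# Scans sitting at the observed maximum of M+0 are treated as overloaded
overloaded <- int0 >= max(int0)

# Peak front: scans before the overloaded region with a usable M+0 signal
front <- rt < min(rt[overloaded]) & int0 > 0.005 * max(int0)

# Isotope-to-M+0 ratio estimated from the peak front only
ratio <- median(int1[front] / int0[front])

# Replace overloaded M+0 intensities by the scaled-up isotope trace
int0_cor <- int0
int0_cor[overloaded] <- int1[overloaded] / ratio
```

Restricting the ratio estimate to the peak front keeps it independent of the saturated scans. On real data, scans with very low isotope intensity would make the ratio noisy, which is why a minimum intensity threshold is applied here.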