Identification of disease-related tissues/cell-types

2022-06-08

library(data.table)
library(xQTLbiolinks)
library(stringr)

To distinguish whether an eQTL is tissue/cell-type-specific or shared across multiple tissues/cell types, xQTLbiolinks provides the function xQTLanalyze_propensity. For a given gene-variant eQTL pair, the function executes the following five steps in order:

  1. Retrieve the linkage disequilibrium (LD) R2 between the specified variant and the variants around it.

  2. Fetch the eQTL associations of the specified gene for the LD-associated variants above, across multiple tissues/cell types in all studies (default).

  3. Normalize the -log10 p-values of the eQTLs with the min-max method for each tissue.

  4. Calculate the Pearson correlation coefficient between R2 and the normalized -log10 p-value.

  5. Fit linear models to carry out regression with the formula: normalized -log10 p-value ~ R2.
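Steps 3 to 5 can be illustrated on toy data. The following is a minimal sketch, not the internal code of xQTLanalyze_propensity: the table `toy`, its column names (`tissue`, `R2`, `pValue`), and the helper `min_max` are hypothetical, and the values are simulated rather than retrieved from any database.

```r
library(data.table)

# Simulated toy data (NOT real xQTLbiolinks output): for two hypothetical
# tissues, the LD R2 of 50 nearby variants and their eQTL p-values.
set.seed(1)
toy <- data.table(
  tissue = rep(c("tissueA", "tissueB"), each = 50),
  R2     = rep(seq(0.02, 1, by = 0.02), times = 2)
)
# tissueA: significance rises with LD; tissueB: no relationship with LD.
toy[tissue == "tissueA", pValue := 10^(-(8 * R2 + runif(.N)))]
toy[tissue == "tissueB", pValue := 10^(-runif(.N, 0, 2))]

# Step 3: min-max normalization of -log10(p) within each tissue.
min_max <- function(x) (x - min(x)) / (max(x) - min(x))
toy[, logP := -log10(pValue)]
toy[, normLogP := min_max(logP), by = tissue]

# Steps 4 and 5: Pearson correlation and regression slope per tissue.
res <- toy[, .(
  corCoef = cor(R2, normLogP, method = "pearson"),
  slope   = coef(lm(normLogP ~ R2, data = .SD))[["R2"]]
), by = tissue]
print(res)
```

In this simulation, the tissue whose eQTL significance tracks LD with the lead variant should show the larger correlation coefficient and slope.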

After these steps, the function outputs a “correlation coefficient” and a “slope” for each tissue, which show how eQTL significance changes with the degree of LD. A greater value of “correlation coefficient” or “slope” suggests a more tissue/cell-type-specific pattern of the eQTL.

Here, we take the eQTL pair “MMP7”-“rs11568818” as an example.

propensityRes <- xQTLanalyze_propensity(gene="MMP7", variantName="rs11568818", study="")

The output contains data.table objects, including: “snpLD” for LD details of the specified SNP; “assoAllLd” for eQTL details of LD-associated SNPs; “lm_R2_logP” for linear regression results; and “cor_R2_logP” for correlation outputs. The returned list also has a fifth element, “tissuePropensity”.

names(propensityRes)
#> [1] "snpLD"            "tissuePropensity" "cor_R2_logP"      "lm_R2_logP"      
#> [5] "assoAllLd"

To visualize eQTL significance across the degree of LD, the function xQTLvisual_qtlPropensity is provided, and two plot methods are available: heatmap and regression.

For the heatmap, all SNPs in LD with the specified SNP are divided into four (default) R2 bins, (0, 0.25], (0.25, 0.5], (0.5, 0.75], and (0.75, 1], according to their R2 values; the SNP with the smallest p-value in each bin is then displayed across the different tissues/cell types. Note: p-values are log10-transformed and then min-max normalized.
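The binning can be sketched for a single tissue as follows. This is an illustration under stated assumptions, not the plotting code of xQTLvisual_qtlPropensity: the table `snps` and its columns are hypothetical, the values are simulated, and normalizing before selecting the top SNP per bin is an assumed order of operations.

```r
library(data.table)

# Simulated LD-associated SNPs for one tissue (toy values, not package output).
set.seed(2)
snps <- data.table(
  snpId  = paste0("snp", 1:40),
  R2     = runif(40, 0.01, 1),
  pValue = 10^(-runif(40, 0, 10))
)

# -log10 transform, then min-max normalization across all SNPs in the tissue.
snps[, logP := -log10(pValue)]
snps[, normLogP := (logP - min(logP)) / (max(logP) - min(logP))]

# Four right-closed R2 bins: (0, 0.25], (0.25, 0.5], (0.5, 0.75], (0.75, 1].
snps[, R2bin := cut(R2, breaks = seq(0, 1, by = 0.25))]

# Keep the SNP with the smallest p-value in each bin.
topSnps <- snps[, .SD[which.min(pValue)], by = R2bin]
setorder(topSnps, R2bin)
print(topSnps)
```

Repeating this per tissue yields one normalized value per bin and tissue, which is the kind of matrix a heatmap displays.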

xQTLvisual_qtlPropensity(propensityRes)