This document lists the input parameters accepted in the CALANGO definition files (or, alternatively, in the defs list).
Type: character string
Description: path to the directory where annotation files are located
Required: YES
Default: none
Type: character string
Description: path to the output directory where results should be saved
Required: YES
Default: none
Type: character string
Description: path to a file containing the genome metadata. It must be a tab-separated value file with no column headers and should contain at least, for each genome: (1) the path to the annotation data; (2) phenotype data (numeric); (3) normalization data (numeric).
Required: YES
Default: none
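As an illustration of the file layout described above, the following minimal sketch parses a header-less tab-separated metadata file and extracts columns by index. It is written in Python purely for illustration and is not CALANGO's own code. The function name `read_metadata`, the file paths, the assumption that the annotation path sits in the first column, and the assumption that column indices are 1-based are all hypothetical.

```python
import csv
import io


def read_metadata(handle, phenotype_col, normalization_col=None):
    """Parse a header-less TSV and pull out columns by 1-based index (assumed)."""
    rows = []
    for fields in csv.reader(handle, delimiter="\t"):
        if not fields:
            continue  # skip blank lines
        record = {
            # Assumption: the annotation path is stored in the first column.
            "annotation_path": fields[0],
            "phenotype": float(fields[phenotype_col - 1]),
        }
        if normalization_col is not None:
            record["normalization"] = float(fields[normalization_col - 1])
        rows.append(record)
    return rows


# Hypothetical two-genome metadata file: path, phenotype, normalization.
example = "genomes/a.tsv\t12.5\t3000\ngenomes/b.tsv\t7.0\t2500\n"
metadata = read_metadata(io.StringIO(example), phenotype_col=2, normalization_col=3)
```

Because the file has no header row, every column has to be addressed by position, which is why the column parameters that follow take an index rather than a name.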
Type: integer/numeric
Description: index of the column in the file specified in dataset.info that contains the phenotype data, which will be used to sort the genomes and find annotation terms associated with that phenotype.
Required: YES
Default: none
Type: integer/numeric
Description: index of the column in the file specified in dataset.info that contains the short names for species/lineages to be used when plotting data.
Required: YES
Default: none
Type: integer/numeric
Description: index of the column in the file specified in dataset.info that contains the group to be used for coloring the heatmaps.
Required: YES
Default: none
Type: character string
Description: dictionary data type to use. Accepts “GO” or “other”.
Required: YES
Default: none
Type: character string
Description: path to the dictionary file (a two-column tab-separated value file containing annotation IDs and their descriptions). Not needed if ontology = "GO".
Required: NO
Default: none
Type: character string
Description: the name of the column in the annotation file that should be used.
Required: YES
Default: none
Type: integer/numeric
Description: index of the column in the file specified in dataset.info that contains the normalization data.
Required: NO
Default: none
Type: character string
Description: path to the tree file.
Required: YES
Default: none
Type: character string
Description: tree file type. Accepts “nexus” or “newick”. Case-sensitive.
Required: YES
Default: none
Type: character string
Description: type of analysis to perform. Currently accepts only “correlation”
Required: YES
Default: none
Type: character string
Description: type of multiple hypothesis testing correction to apply. Accepts all methods listed in stats::p.adjust.methods.
Required: NO
Default: “BH”
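To make the default correction concrete, here is a minimal sketch of the Benjamini-Hochberg adjustment, which is the method that “BH” selects in R's stats::p.adjust. It is written in Python purely for illustration and is not CALANGO's own code. The function name `bh_adjust` and the example p-values are hypothetical.

```python
def bh_adjust(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    n = len(p_values)
    # Visit p-values from largest to smallest.
    order = sorted(range(n), key=lambda i: p_values[i], reverse=True)
    adjusted = [0.0] * n
    running_min = 1.0  # adjusted values are capped at 1
    for position, idx in enumerate(order):
        rank = n - position  # rank of this p-value in ascending order
        # Scale by n / rank, then enforce monotonicity with a running minimum.
        running_min = min(running_min, p_values[idx] * n / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical p-values for four annotation terms.
q_values = bh_adjust([0.01, 0.04, 0.03, 0.005])
```

The resulting values play the role of the q-values that the cutoffs described later are compared against.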
Type: integer/numeric
Description: Number of cores to use. Must be a positive integer.
Required: NO
Default: 1
Cutoffs are used to regulate how much graphical output is produced by CALANGO. The tab-separated value files that are generated at the end of the analysis (and saved in the output.dir) will always contain all, unfiltered results.
Q-value cutoffs are used for correlation and phylogeny-aware linear models. Only entries with q-values smaller than these cutoffs will be shown.
Type: numeric between 0 and 1
Required: NO
Default: 1
Type: numeric between 0 and 1
Required: NO
Default: 1
Type: numeric between 0 and 1
Required: NO
Default: 1
Type: numeric between 0 and 1
Required: NO
Default: 1
Correlation cutoffs are used to establish thresholds of positive/negative correlation values for the graphical output. Important: these parameters are a bit counter-intuitive. Please check the examples below for clarity.
Type: numeric values between -1 and 1
Description: thresholds for Spearman correlation values. The selection criterion is: (Spearman correlation < lower.cutoff) OR (Spearman correlation > upper.cutoff)
Required: NO
Defaults: spearman.cor.upper.cutoff = -1; spearman.cor.lower.cutoff = 1 (i.e., no filtering)
Example 1: If you set spearman.cor.upper.cutoff = 0.8 and spearman.cor.lower.cutoff = -0.8, only pairs with Spearman correlation values smaller than -0.8 OR greater than 0.8 will be shown.
Example 2: If you set spearman.cor.upper.cutoff = 0 and spearman.cor.lower.cutoff = -1, pairs with Spearman correlation values smaller than -1 OR greater than 0 will be shown. Since the Spearman correlation cannot be smaller than -1, this means that only positively correlated pairs will be shown.
Example 3: If you set any values such that spearman.cor.upper.cutoff < spearman.cor.lower.cutoff, all pairs are shown (no filtering is performed).
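The three examples can be checked with a short sketch of the selection criterion stated in the description. It is written in Python purely for illustration and is not CALANGO's own code. The function name `passes_correlation_filter` and the sample correlation values are hypothetical.

```python
def passes_correlation_filter(value, lower_cutoff, upper_cutoff):
    """Selection criterion: value < lower_cutoff OR value > upper_cutoff."""
    return value < lower_cutoff or value > upper_cutoff


# Hypothetical correlation values for five annotation terms.
values = [-0.95, -0.5, 0.0, 0.3, 0.9]

# Example 1: lower = -0.8, upper = 0.8 keeps only strong correlations.
example_1 = [v for v in values if passes_correlation_filter(v, -0.8, 0.8)]

# Example 2: lower = -1, upper = 0 keeps only positive correlations.
example_2 = [v for v in values if passes_correlation_filter(v, -1, 0)]

# Example 3: upper < lower (here the defaults) keeps everything.
example_3 = [v for v in values if passes_correlation_filter(v, 1, -1)]
```

The counter-intuitive part is that the two conditions are joined by OR: once the upper cutoff is below the lower cutoff, every value satisfies at least one of them, so nothing is filtered out.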
Type: numeric values between -1 and 1
Description: thresholds for Pearson correlation values. The selection criterion is: (Pearson correlation < lower.cutoff) OR (Pearson correlation > upper.cutoff)
Required: NO
Defaults: pearson.cor.upper.cutoff = -1; pearson.cor.lower.cutoff = 1 (i.e., no filtering)
Type: numeric values between -1 and 1
Description: thresholds for Kendall correlation values. The selection criterion is: (Kendall correlation < lower.cutoff) OR (Kendall correlation > upper.cutoff)
Required: NO
Defaults: kendall.cor.upper.cutoff = -1; kendall.cor.lower.cutoff = 1 (i.e., no filtering)
Standard deviation and coefficient of variation cutoffs (only values greater than the cutoff will be shown).
Type: non-negative numeric value
Required: NO
Default: 0
Type: non-negative numeric value
Required: NO
Default: 0
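For reference, the two statistics compared against these cutoffs can be sketched as follows. It is written in Python purely for illustration and is not CALANGO's own code. The sketch assumes that both statistics are computed over one annotation term's values across lineages and that the sample standard deviation (n - 1 denominator, as in R's `sd`) is used; the function name `sd_and_cv` and the counts are hypothetical.

```python
import statistics


def sd_and_cv(counts):
    """Return (standard deviation, coefficient of variation) for one term."""
    # Sample standard deviation; needs at least two values.
    sd = statistics.stdev(counts)
    mean = statistics.mean(counts)
    # The coefficient of variation is undefined when the mean is zero.
    cv = sd / mean if mean != 0 else float("nan")
    return sd, cv


# Hypothetical counts of one annotation term in eight lineages.
sd, cv = sd_and_cv([2, 4, 4, 4, 5, 5, 7, 9])

# A term with identical counts everywhere has sd == 0 and cv == 0.
constant_sd, constant_cv = sd_and_cv([3, 3, 3])
```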
Sum of annotation terms cutoff (only values greater than the cutoff will be shown).
Type: non-negative integer/numeric value
Required: NO
Default: 0
Prevalence and heterogeneity cutoffs (only values greater than the cutoff will be shown). Prevalence is defined as the fraction of lineages where the annotation term was observed at least once. Heterogeneity is defined as the fraction of lineages where the annotation term count is different from the median.
Type: numeric value between 0 and 1
Required: NO
Default: 0
Type: numeric value between 0 and 1
Required: NO
Default: 0
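The two definitions above can be sketched directly. It is written in Python purely for illustration and is not CALANGO's own code. Both quantities are expressed as fractions between 0 and 1 to match the parameter type; the function names and the counts are hypothetical.

```python
import statistics


def prevalence(counts):
    """Fraction of lineages where the term was observed at least once."""
    return sum(1 for c in counts if c > 0) / len(counts)


def heterogeneity(counts):
    """Fraction of lineages whose count differs from the median count."""
    median = statistics.median(counts)
    return sum(1 for c in counts if c != median) / len(counts)


# Hypothetical counts of one annotation term in five lineages.
counts = [0, 1, 1, 1, 3]
term_prevalence = prevalence(counts)        # 4 of 5 lineages have the term
term_heterogeneity = heterogeneity(counts)  # 2 of 5 differ from the median (1)
```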
Type: character string. Accepts “TRUE” or “FALSE”
Description: If “TRUE”, all annotation terms whose raw values (before normalization) have a standard deviation of zero are removed. This filter is used to remove the (quite common) bias that arises when QPAL (phenotype) and normalizing factors are strongly associated by chance.
Required: YES
Default: “TRUE”
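A minimal sketch of this filter is given below. It is written in Python purely for illustration and is not CALANGO's own code. It relies on the fact that a standard deviation of zero means all raw values are identical; the function name `drop_constant_terms`, the term identifiers, and the counts are hypothetical.

```python
def drop_constant_terms(raw_counts):
    """Keep only terms whose raw counts vary across lineages.

    A zero standard deviation is equivalent to all values being equal,
    so comparing the minimum and maximum avoids floating-point issues.
    """
    return {
        term: counts
        for term, counts in raw_counts.items()
        if counts and max(counts) != min(counts)
    }


# Hypothetical raw (pre-normalization) counts for three terms in three lineages.
raw_counts = {
    "term_A": [2, 2, 2],  # constant, removed
    "term_B": [1, 3, 2],  # varies, kept
    "term_C": [0, 0, 0],  # constant, removed
}
filtered = drop_constant_terms(raw_counts)
```

A term with constant raw counts carries no signal of its own, so any association it shows after normalization comes entirely from the normalizing factor.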