General:
+ Added junctionAlignment, which counts the number of nucleotides in the reference germline not present in the alignment, and the number of V and J nucleotides in the CDR3.
Gene Usage:
+ Fixed a bug in getFamily where temporary designation gene names were not being correctly subset to the cluster (family) level.
Lineage:
+ Fixed a bug in runPhylip which was causing buildPhylipLineage to fail when run on Windows.
General:
+ Added readFastqDb, which reads a repertoire’s .fastq file and imports the sequencing quality scores for sequence_alignment. Added maskPositionsByQuality, which masks positions that have a sequencing quality score lower than the specified threshold. The convenience function getPositionQuality will create a data.frame with quality scores per position.
+ Updated dplyr dependency to v1.0.
+ In padSeqEnds, the argument mod3=TRUE has been added so that sequences are padded to a length that is a multiple of 3.
+ Fixed a bug in translateDNA where NA values weren’t being translated properly.
Amino Acid Analysis:
+ aminoAcidProperties will now default to nt=TRUE.
Diversity:
+ Added a parameter to countClones (remove_na) that removes all rows with NA values in the clone column if TRUE (default) and issues a warning stating how many were removed. If FALSE, those rows are kept instead.
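The remove_na behaviour described for countClones can be illustrated with a small base R sketch. This is not alakazam's implementation: the helper count_clones_sketch, the toy data frame and the output column names are hypothetical, and only the remove_na idea (drop NA rows with a warning, or keep them) is taken from the entry above.

```r
# Hypothetical toy data; the column names are assumptions for illustration.
db <- data.frame(
  sequence_id = c("s1", "s2", "s3", "s4"),
  clone_id = c("1", "1", NA, "2"),
  stringsAsFactors = FALSE
)

# Base R sketch of the remove_na idea (not the package's own code).
count_clones_sketch <- function(data, clone = "clone_id", remove_na = TRUE) {
  is_na <- is.na(data[[clone]])
  if (remove_na && any(is_na)) {
    # Report how many rows are dropped, then drop them
    warning(sum(is_na), " row(s) with NA in '", clone, "' were removed.")
    data <- data[!is_na, , drop = FALSE]
  }
  # useNA = "ifany" keeps an NA group when the NA rows were not removed
  counts <- table(data[[clone]], useNA = "ifany")
  data.frame(
    clone = names(counts),
    seq_count = as.integer(counts),
    stringsAsFactors = FALSE
  )
}

# Keep the NA rows: they are counted as their own group
kept <- count_clones_sketch(db, remove_na = FALSE)

# Drop the NA rows: a warning is issued (suppressed here) and they are excluded
dropped <- suppressWarnings(count_clones_sketch(db, remove_na = TRUE))
```

With alakazam installed, the corresponding call would presumably look like countClones(db, clone="clone_id", remove_na=FALSE); check the function's help page for the exact signature in your installed version.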
Gene Usage:
+ Added getLocus to extract the locus information from the segment call.
+ Added getChain to define the chain from the segment or locus call.
+ Changed countGenes to give a warning instead of an error so as not to disrupt running workflows.
+ Fixed a bug in getSegment where filtering of non-localized genes was not being applied when called from getFamily, because the “NL” part of the name was removed before the filtering step.
+ Updated getAllele, getGene, getFamily and getLocus to parse constant region gene names correctly.
+ Updated getSegment to parse constant region gene names correctly and not remove the “D” from “IGHD” when strip_d=TRUE.
Lineage:
+ Added the branch_length argument to buildPhylipLineage, and augmented graphToPhylo and phyloToGraph to track intermediate sequences in nodes for phylo objects.
+ Added a parameter to countGenes (remove_na) that removes all rows with NA values in the gene column if TRUE (default) and issues a warning stating how many were removed. If FALSE, those rows are kept instead.
Diversity:
+ Fixed a bug in plotDiversityTest that caused all values of q to appear on the plot rather than just the specified one.
Gene Usage:
+ groupGenes where the v_call j_call column for J gene grouping.
+ groupGenes.
+ Renamed the only_igh argument of groupGenes to only_heavy.
Backwards Incompatible Changes:
+ Where functions used V_CALL (Change-O) as the default to identify the field that stored the V gene calls, they now use v_call (AIRR). This means that scripts that relied on default values (previously v_call="V_CALL") will now fail unless the function calls are updated to match the data. If the data are in the Change-O format, the current default v_call="v_call" will fail to identify the column with the V gene calls, because the column v_call doesn’t exist. In this case, v_call="V_CALL" must be specified in the function call.
+ ExampleDb converted to the AIRR Rearrangement standard and examples updated accordingly. The legacy Change-O version is available as ExampleDbChangeo.
+ GRAVY to gravy);
+ countGenes, countClones (e.g., SEQ_COUNT to seq_count)
+ estimateAbundance (e.g., RANK to rank)
+ groupGenes (e.g., VJ_GROUP to vj_group)
+ collapseDuplicates and makeChangeoClone (e.g., SEQUENCE_ID to sequence_id, COLLAPSE_COUNT to collapse_count)
+ summarizeTrees, getPathLengths, getMRCA, tableEdges and testEdges also return columns in lower case (e.g., parent, child, outdegree, steps, annotation, pvalue)
+ IG_COLOR names converted to official C region identifiers (IGHA, IGHD, IGHE, IGHG, IGHM, IGHK, IGHL).
General:
+ The baseTheme look is now consistent across sizing options.
+ cpuCount will now return 1 if the core count cannot be determined.
+ Fixed a bug in padSeqEnds wherein the pad_char argument was being ignored.
Diversity:
+ estimateAbundance slot clone_by now contains the name of the column with the clonal group identifier, as specified in the function call. For example, if the function was called with clone="clone_id", then the clone_by slot will be clone_id.
Lineage:
+ Renamed the buildPhylipLineage arguments vcall, jcall and dnapars_exec to v_call, j_call and phylip_exec, respectively.
Deprecated:
+ rarefyDiversity is deprecated in favor of alphaDiversity, which includes the same functionality.
+ testDiversity is deprecated. The test calculations have been added to the normal output of alphaDiversity.
General:
+ ape and tibble dependencies.
Lineage:
+ Added readIgphyml to read in IgPhyML output and combineIgphyml to combine parameter estimates across samples.
+ Added graphToPhylo and phyloToGraph to allow conversion between graph and phylo formats.
Diversity:
+ Fixed a bug in estimateAbundance where setting the clone column to a non-default value produced an error.
+ estimateAbundance through the min_n, max_n, and uniform arguments.
+ estimateAbundance. alphaDiversity will call estimateAbundance for bootstrapping if not provided an existing AbundanceCurve object.
+ DiversityCurve and AbundanceCurve objects to accommodate the new diversity methods.
Gene Usage:
+ groupGenes now supports grouping by V gene, J gene, and junction length (junc_len), in addition to grouping by V gene and J gene without junction length. Also added support for single-cell input data through the new arguments cell_id, locus, and only_igh.
General:
+ Added the nonsquareDist function to calculate the non-square distance matrix of sequences.
+ progressBar, baseTheme, checkColumns and cpuCount.
Diversity:
+ estimateAbundance and plotAbundanceCurve will now allow group=NULL to be specified to perform abundance calculations on ungrouped data.
Gene Usage:
+ Added the fill argument to countGenes. When set to TRUE, this adds zeroes to the group pairs that do not exist in the data.
+ Added groupGenes to group sequences sharing the same V and J gene.
Topology Analysis:
+ indirect=TRUE.
+ makeChangeoClone will now issue an error and terminate, instead of continuing with a warning, when the sequences are not all the same length.
General:
+ Fixed a bug in IUPAC_AA wherein X was not properly matching against Q.
+ Changed getAAMatrix to treat * (stop codon) as a mismatch.
General:
+ readChangeoDb.
+ Added the padSeqEnds function which pads sequences with Ns to make them equal in length.
+ collapseDuplicates.
Diversity:
+ Added the uniform argument to rarefyDiversity allowing users to toggle uniform vs non-uniform sampling.
+ Renamed plotAbundance to plotAbundanceCurve.
+ Changed the estimateAbundance return object from a data.frame to a new AbundanceCurve custom class.
+ plot call for AbundanceCurve to plotAbundanceCurve.
+ annotate argument from plotDiversityCurve to plotAbundanceCurve.
+ Added the score argument to plotDiversityCurve to toggle between plotting diversity or evenness.
+ Added plotDiversityTest to generate a simple plot of DiversityTest object summaries.
Gene Usage:
+ Added the omit_nl argument to getAllele, getGene and getFamily to allow optional filtering of non-localized (NL) genes.
Lineage:
+ Fixed a bug in makeChangeoClone preventing it from interpreting the id argument correctly.
+ Added the pad_end argument to makeChangeoClone to allow automatic padding of ends to make sequences the same length.
General:
+ Added the dry argument to collapseDuplicates which will annotate duplicate sequences but not remove them when set to TRUE.
+ Fixed a bug where collapseDuplicates was returning one sequence if all sequences were considered ambiguous.
Lineage:
+ makeChangeoClone and buildPhylipLineage for purposes of (optionally) treating indels as mismatches.
+ Fixed a bug in buildPhylipLineage when PHYLIP doesn’t generate inferred sequences and has only one block.
General:
+ Fixed a bug in readChangeoDb causing the select argument to do nothing.
Gene Usage:
+ countGenes when the clone argument is specified to CLONE_COUNT/CLONE_FREQ.
General:
+ readChangeoDb and writeChangeoDb.
General:
+ Fixed a bug in seqDist() wherein distance was not properly calculated in some sequences containing gap characters.
+ getAAMatrix() return matrix.
General:
+ Changed readChangeoDb() to wrap data.table::fread() instead of utils::read.table() if the input file is not compressed.
+ Ported testSeqEqual(), getSeqDistance() and getSeqMatrix() to C++ to improve performance of collapseDuplicates() and other dependent functions.
+ Renamed testSeqEqual(), getSeqDistance() and getSeqMatrix() to seqEqual(), seqDist() and pairwiseDist(), respectively.
+ Added pairwiseEqual() which creates a logical sequence distance matrix; TRUE if sequences are identical, FALSE if not, excluding Ns and gaps.
+ X in translateDNA().
+ Fixed a bug in collapseDuplicates() wherein the input data type sanity check would cause the vignette to fail to build under R 3.3.
+ Replaced the ExampleDb.gz file with a larger, more clonal, ExampleDb data object.
+ Replaced ExampleTrees with a larger set of trees.
+ Renamed multiggplot() to gridPlot().
Amino Acid Analysis:
+ normalize=FALSE for charge calculations to be more consistent with previously published repertoire sequencing results.
Diversity Analysis:
+ Added a progress argument to rarefyDiversity() and testDiversity() to enable the (previously default) progress bar.
+ Fixed a bug in estimateAbundance() where the function would fail if there was only a single input sequence per group.
+ data and summary slots of DiversityTest to uppercase for consistency with other tools.
+ plot to plotDiversityCurve for DiversityCurve objects.
Gene Usage:
+ Added the sortGenes() function to sort V(D)J genes by name or locus position.
+ Added the clone argument to countGenes() to allow restriction of gene abundance to one gene per clone.
Topology Analysis:
General:
+ base::nchar().
General:
Amino Acid Analysis:
+ Fixed a bug wherein arguments for the aliphatic() function were not being passed through the ellipsis argument of aminoAcidProperties().
+ aminoAcidProperties().
+ Renamed AA_TRANS to ABBREV_AA.
Diversity:
+ rarefyDiversity() output.
Lineage:
+ Added ExampleTrees data with example output from buildPhylipLineage().
General:
+ Renamed getDNADistMatrix() and getAADistMatrix() to getDNAMatrix() and getAAMatrix(), respectively.
+ Added getSeqMatrix() which calculates a pairwise distance matrix for a set of sequences.
+ Added the multiggplot() function for performing multiple panel plots.
Amino Acid Analysis:
+ gravy(), bulk(), aliphatic(), polar(), charge(), countPatterns() and aminoAcidProperties().
Annotation:
+ getSegment(), getAllele(), getGene() and getFamily(). May be disabled by providing the argument strip_d=FALSE.
+ Added countGenes() to tabulate V(D)J allele, gene and family usage.
Diversity:
+ countClones(), estimateAbundance() and plotAbundance().
+ Renamed resampleDiversity() to rarefyDiversity() and changed many of the internals. Bootstrapping is now performed on an inferred complete relative abundance distribution.
+ rarefyDiversity() and testDiversity().
+ rarefyDiversity() and testDiversity() are now calculated using the mean and standard deviation of the bootstrap realizations, rather than the median and upper/lower quantiles.
+ plotDiversityCurve().
Initial public release.
General:
+ citation("alakazam") command.
Lineage:
+ buildPhylipLineage().
Lineage:
+ Fixed a bug wherein buildPhylipLineage() would hang on R 3.2 due to R change request PR#15508.
Prerelease for review.