Calculating the average nLTT plot of multiple phylogenies is not a trivial task.
The function get_nltt_values
collects the nLTT values of
a collection of phylogenies as tidy data.
This allows for a good interplay with ggplot2.
Create two easy trees:
newick1 <- "((A:1,B:1):2,C:3);"
newick2 <- "((A:2,B:2):1,C:3);"
phylogeny1 <- ape::read.tree(text = newick1)
phylogeny2 <- ape::read.tree(text = newick2)
phylogenies <- c(phylogeny1, phylogeny2)
They are very similar. phylogeny1
has short tips:
ape::plot.phylo(phylogeny1)
ape::add.scale.bar() #nolint
This can be observed in the nLTT plot:
nLTT::nltt_plot(phylogeny1, ylim = c(0, 1))
As a collection of timepoints:
t <- nLTT::get_phylogeny_nltt_matrix(phylogeny1)
knitr::kable(t)
time | N |
---|---|
0.0000000 | 0.3333333 |
0.6666667 | 0.6666667 |
1.0000000 | 1.0000000 |
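The timepoints in the table above can be reproduced by hand from the branching times of the tree. The base R sketch below assumes, judging from the table, that the lineage count starts at 1 at the crown age, increases by one at each branching event, and ends at the number of tips. The helper `nltt_points` is hypothetical and not part of the nLTT package.

```r
# Hypothetical helper (not part of nLTT): nLTT timepoints from the branching
# times of an ultrametric, binary tree. Assumes the lineage count starts at 1
# at the crown age and reaches the number of tips at the present.
nltt_points <- function(branching_times) {
  bt <- sort(branching_times, decreasing = TRUE)
  time <- c(max(bt) - bt, max(bt)) / max(bt) # 0 = crown age, 1 = present
  n <- seq_along(time) / length(time)        # normalized number of lineages
  data.frame(time = time, N = n)
}

# ((A:1,B:1):2,C:3); has branching times 3 (crown) and 1 (split of A and B)
points_1 <- nltt_points(c(3, 1))
# ((A:2,B:2):1,C:3); has branching times 3 (crown) and 2 (split of A and B)
points_2 <- nltt_points(c(3, 2))
points_1
points_2
```

Under these assumptions, the only difference between the two trees is the normalized time of the second branching event: later for the tree with short tips, earlier for the tree with longer tips.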
Plotting those timepoints:
df <- as.data.frame(nLTT::get_phylogeny_nltt_matrix(phylogeny1))
ggplot2::qplot(
  time, N, data = df, geom = "step", ylim = c(0, 1), direction = "vh",
  main = "NLTT plot of phylogeny 1"
)
phylogeny2
has longer tips:
ape::plot.phylo(phylogeny2)
ape::add.scale.bar() #nolint
This can also be observed in the nLTT plot:
nLTT::nltt_plot(phylogeny2, ylim = c(0, 1))
As a collection of timepoints:
t <- nLTT::get_phylogeny_nltt_matrix(phylogeny2)
knitr::kable(t)
time | N |
---|---|
0.0000000 | 0.3333333 |
0.3333333 | 0.6666667 |
1.0000000 | 1.0000000 |
Plotting those timepoints:
df <- as.data.frame(nLTT::get_phylogeny_nltt_matrix(phylogeny2))
ggplot2::qplot(
  time, N, data = df, geom = "step", ylim = c(0, 1), direction = "vh",
  main = "NLTT plot of phylogeny 2"
)
The average nLTT plot should be somewhere in the middle.
It is constructed from stretched nLTT matrices.
Here is the stretched nLTT matrix of the first phylogeny:
t <- nLTT::stretch_nltt_matrix(
  nLTT::get_phylogeny_nltt_matrix(phylogeny1), dt = 0.20, step_type = "upper"
)
knitr::kable(t)
0.0 | 0.6666667 |
0.2 | 0.6666667 |
0.4 | 0.6666667 |
0.6 | 0.6666667 |
0.8 | 1.0000000 |
1.0 | 1.0000000 |
Here is the stretched nLTT matrix of the second phylogeny:
t <- nLTT::stretch_nltt_matrix(
  nLTT::get_phylogeny_nltt_matrix(phylogeny2), dt = 0.20, step_type = "upper"
)
knitr::kable(t)
0.0 | 0.6666667 |
0.2 | 0.6666667 |
0.4 | 1.0000000 |
0.6 | 1.0000000 |
0.8 | 1.0000000 |
1.0 | 1.0000000 |
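The stretched values in the two tables above are consistent with a simple rule: at each sampled time, take the N of the first timepoint that lies strictly later, and fall back to the last N when there is none. The base R sketch below implements that rule. It is an illustration inferred from the tables, not the implementation of `nLTT::stretch_nltt_matrix`, and `stretch_upper` is a hypothetical name.

```r
# Hypothetical helper (not the nLTT implementation): sample an nLTT matrix at
# regular intervals. With the "upper" rule assumed here, each sampled time
# takes the N of the first timepoint that lies strictly later.
stretch_upper <- function(time, n, dt) {
  t_new <- seq(0, 1, by = dt)
  n_new <- vapply(
    t_new,
    function(t) {
      later <- which(time > t)
      if (length(later) == 0) n[length(n)] else n[later[1]]
    },
    numeric(1)
  )
  data.frame(t = t_new, nltt = n_new)
}

# Timepoints of phylogeny1 and phylogeny2, copied from the earlier tables
stretched_1 <- stretch_upper(c(0, 2 / 3, 1), c(1 / 3, 2 / 3, 1), dt = 0.2)
stretched_2 <- stretch_upper(c(0, 1 / 3, 1), c(1 / 3, 2 / 3, 1), dt = 0.2)
stretched_1
stretched_2
```

Stretching puts every phylogeny on the same time grid, which is what makes a row-by-row comparison possible.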
Here is the average nLTT matrix of both phylogenies:
t <- nLTT::get_average_nltt_matrix(phylogenies, dt = 0.20)
knitr::kable(t)
0.0 | 0.6666667 |
0.2 | 0.6666667 |
0.4 | 0.8333333 |
0.6 | 0.8333333 |
0.8 | 1.0000000 |
1.0 | 1.0000000 |
Observe how the numbers get averaged.
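The averaging can be checked by hand. The base R sketch below copies the stretched nLTT values of both phylogenies from the tables above and takes their element-wise mean; the variable names are chosen for illustration only.

```r
# Stretched nLTT values of both phylogenies at t = 0.0, 0.2, ..., 1.0,
# copied from the tables above
t <- seq(0, 1, by = 0.2)
nltt_1 <- c(2 / 3, 2 / 3, 2 / 3, 2 / 3, 1, 1)
nltt_2 <- c(2 / 3, 2 / 3, 1, 1, 1, 1)

# The average nLTT is the element-wise mean over the phylogenies
average <- rowMeans(cbind(nltt_1, nltt_2))
data.frame(t = t, nltt = average)
```

At t = 0.4 and t = 0.6 the two phylogenies disagree (2/3 versus 1), so the average there is 5/6, or 0.8333333.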
The same, now shown as a plot:
nLTT::nltts_plot(phylogenies, dt = 0.20, plot_nltts = TRUE)
Here is a demo of how the new function, get_nltt_values, works:
t <- nLTT::get_nltt_values(c(phylogeny1, phylogeny2), dt = 0.2)
knitr::kable(t)
id | t | nltt |
---|---|---|
1 | 0.0 | 0.6666667 |
1 | 0.2 | 0.6666667 |
1 | 0.4 | 0.6666667 |
1 | 0.6 | 0.6666667 |
1 | 0.8 | 1.0000000 |
1 | 1.0 | 1.0000000 |
2 | 0.0 | 0.6666667 |
2 | 0.2 | 0.6666667 |
2 | 0.4 | 1.0000000 |
2 | 0.6 | 1.0000000 |
2 | 0.8 | 1.0000000 |
2 | 1.0 | 1.0000000 |
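The tidy layout has one row per combination of phylogeny and timepoint, which makes grouped summaries short. The base R sketch below rebuilds the table above by hand, with values copied from it, and recovers the average nLTT with `aggregate`. It only illustrates the data layout and does not call the nLTT package.

```r
# Stretched nLTT values per phylogeny, copied from the table above
t <- seq(0, 1, by = 0.2)
tidy <- rbind(
  data.frame(id = 1, t = t, nltt = c(2 / 3, 2 / 3, 2 / 3, 2 / 3, 1, 1)),
  data.frame(id = 2, t = t, nltt = c(2 / 3, 2 / 3, 1, 1, 1, 1))
)

# One row per (phylogeny, timepoint), so the mean per timepoint is a one-liner
averages <- aggregate(nltt ~ t, data = tidy, FUN = mean)
averages
```

The same layout is what lets a plotting layer group, color, or summarize by `id` and `t` without reshaping the data first.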
Plotting options. First, create a data frame:
df <- nLTT::get_nltt_values(c(phylogeny1, phylogeny2), dt = 0.01)
Here we see an averaged nLTT plot, where the original nLTT values are still visible:
ggplot2::qplot(
  t, nltt, data = df, geom = "point", ylim = c(0, 1),
  main = "Average nLTT plot of phylogenies", color = id, size = I(0.1)
) + ggplot2::stat_summary(
  fun.data = "mean_cl_boot", color = "red", geom = "smooth"
)
Here we see an averaged nLTT plot, with the original nLTT values omitted:
ggplot2::qplot(t, nltt, data = df, geom = "blank", ylim = c(0, 1),
  main = "Average nLTT plot of phylogenies"
) + ggplot2::stat_summary(
  fun.data = "mean_cl_boot", color = "red", geom = "smooth"
)
Create two harder trees:
newick1 <- "((A:1,B:1):1,(C:1,D:1):1);"
newick2 <- paste0("((((XD:1,ZD:1):1,CE:2):1,(FE:2,EE:2):1):4,((AE:1,BE:1):1,",
  "(WD:1,YD:1):1):5);"
)
phylogeny1 <- ape::read.tree(text = newick1)
phylogeny2 <- ape::read.tree(text = newick2)
phylogenies <- c(phylogeny1, phylogeny2)
They are different. phylogeny1
is relatively simple,
with two branching events happening at the same time:
ape::plot.phylo(phylogeny1)
ape::add.scale.bar() #nolint
This can be observed in the nLTT plot:
nLTT::nltt_plot(phylogeny1, ylim = c(0, 1))
As a collection of timepoints:
t <- nLTT::get_phylogeny_nltt_matrix(phylogeny1)
knitr::kable(t)
time | N |
---|---|
0.0 | 0.25 |
0.5 | 0.50 |
0.5 | 0.75 |
1.0 | 1.00 |
phylogeny2
is more elaborate:
ape::plot.phylo(phylogeny2)
ape::add.scale.bar() #nolint
This can also be observed in the nLTT plot:
nLTT::nltt_plot(phylogeny2, ylim = c(0, 1))
As a collection of timepoints:
t <- nLTT::get_phylogeny_nltt_matrix(phylogeny2)
knitr::kable(t)
time | N |
---|---|
0.0000000 | 0.1111111 |
0.5714286 | 0.2222222 |
0.7142857 | 0.3333333 |
0.7142857 | 0.4444444 |
0.7142857 | 0.5555556 |
0.8571429 | 0.6666667 |
0.8571429 | 0.7777778 |
0.8571429 | 0.8888889 |
1.0000000 | 1.0000000 |
The average nLTT plot should be somewhere in the middle.
It is constructed from stretched nLTT matrices.
Here is the stretched nLTT matrix of the first phylogeny:
t <- nLTT::stretch_nltt_matrix(
  nLTT::get_phylogeny_nltt_matrix(phylogeny1), dt = 0.20, step_type = "upper"
)
knitr::kable(t)
0.0 | 0.5 |
0.2 | 0.5 |
0.4 | 0.5 |
0.6 | 1.0 |
0.8 | 1.0 |
1.0 | 1.0 |
Here is the stretched nLTT matrix of the second phylogeny:
t <- nLTT::stretch_nltt_matrix(
  nLTT::get_phylogeny_nltt_matrix(phylogeny2), dt = 0.20, step_type = "upper"
)
knitr::kable(t)
0.0 | 0.2222222 |
0.2 | 0.2222222 |
0.4 | 0.2222222 |
0.6 | 0.3333333 |
0.8 | 0.6666667 |
1.0 | 1.0000000 |
Here is the average nLTT matrix of both phylogenies:
t <- nLTT::get_average_nltt_matrix(phylogenies, dt = 0.20)
knitr::kable(t)
0.0 | 0.3611111 |
0.2 | 0.3611111 |
0.4 | 0.3611111 |
0.6 | 0.6666667 |
0.8 | 0.8333333 |
1.0 | 1.0000000 |
Observe how the numbers get averaged.
Here is a demo of how the new function, get_nltt_values, works:
t <- nLTT::get_nltt_values(c(phylogeny1, phylogeny2), dt = 0.2)
knitr::kable(t)
id | t | nltt |
---|---|---|
1 | 0.0 | 0.5000000 |
1 | 0.2 | 0.5000000 |
1 | 0.4 | 0.5000000 |
1 | 0.6 | 1.0000000 |
1 | 0.8 | 1.0000000 |
1 | 1.0 | 1.0000000 |
2 | 0.0 | 0.2222222 |
2 | 0.2 | 0.2222222 |
2 | 0.4 | 0.2222222 |
2 | 0.6 | 0.3333333 |
2 | 0.8 | 0.6666667 |
2 | 1.0 | 1.0000000 |
Plotting options. First, create a data frame:
df <- nLTT::get_nltt_values(c(phylogeny1, phylogeny2), dt = 0.01)
Here we see an averaged nLTT plot, where the original nLTT values are still visible:
ggplot2::qplot(
  t, nltt, data = df, geom = "point", ylim = c(0, 1),
  main = "Average nLTT plot of phylogenies", color = id, size = I(0.1)
) + ggplot2::stat_summary(
  fun.data = "mean_cl_boot", color = "red", geom = "smooth"
)
Here we see an averaged nLTT plot, with the original nLTT values omitted:
ggplot2::qplot(t, nltt, data = df, geom = "blank", ylim = c(0, 1),
  main = "Average nLTT plot of phylogenies"
) + ggplot2::stat_summary(
  fun.data = "mean_cl_boot", color = "red", geom = "smooth"
)
Create seven random trees:
set.seed(42)
phylogeny1 <- ape::rcoal(10)
phylogeny2 <- ape::rcoal(20)
phylogeny3 <- ape::rcoal(30)
phylogeny4 <- ape::rcoal(40)
phylogeny5 <- ape::rcoal(50)
phylogeny6 <- ape::rcoal(60)
phylogeny7 <- ape::rcoal(70)
phylogenies <- c(
  phylogeny1, phylogeny2, phylogeny3,
  phylogeny4, phylogeny5, phylogeny6, phylogeny7
)
Here is a demo of how the new function, get_nltt_values, works:
t <- nLTT::get_nltt_values(phylogenies, dt = 0.2)
knitr::kable(t)
id | t | nltt |
---|---|---|
1 | 0.0 | 0.2000000 |
1 | 0.2 | 0.2000000 |
1 | 0.4 | 0.2000000 |
1 | 0.6 | 0.2000000 |
1 | 0.8 | 0.3000000 |
1 | 1.0 | 1.0000000 |
2 | 0.0 | 0.1000000 |
2 | 0.2 | 0.1000000 |
2 | 0.4 | 0.1000000 |
2 | 0.6 | 0.1000000 |
2 | 0.8 | 0.2000000 |
2 | 1.0 | 1.0000000 |
3 | 0.0 | 0.0666667 |
3 | 0.2 | 0.0666667 |
3 | 0.4 | 0.1000000 |
3 | 0.6 | 0.1333333 |
3 | 0.8 | 0.2333333 |
3 | 1.0 | 1.0000000 |
4 | 0.0 | 0.0500000 |
4 | 0.2 | 0.0500000 |
4 | 0.4 | 0.0500000 |
4 | 0.6 | 0.1000000 |
4 | 0.8 | 0.2750000 |
4 | 1.0 | 1.0000000 |
5 | 0.0 | 0.0400000 |
5 | 0.2 | 0.0600000 |
5 | 0.4 | 0.0600000 |
5 | 0.6 | 0.0600000 |
5 | 0.8 | 0.1000000 |
5 | 1.0 | 1.0000000 |
6 | 0.0 | 0.0333333 |
6 | 0.2 | 0.0333333 |
6 | 0.4 | 0.0666667 |
6 | 0.6 | 0.0666667 |
6 | 0.8 | 0.0833333 |
6 | 1.0 | 1.0000000 |
7 | 0.0 | 0.0285714 |
7 | 0.2 | 0.0285714 |
7 | 0.4 | 0.0285714 |
7 | 0.6 | 0.0428571 |
7 | 0.8 | 0.1000000 |
7 | 1.0 | 1.0000000 |
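With seven phylogenies, the average at each timepoint is the mean over seven nLTT values. The base R sketch below copies the values from the table above into a tidy data frame and computes that mean per timepoint, together with the spread. The name `nltt_values` is chosen for illustration, and the sketch does not call the nLTT package.

```r
# nLTT values of the seven phylogenies at t = 0.0, 0.2, ..., 1.0,
# copied row by row from the table above
t <- seq(0, 1, by = 0.2)
nltt_values <- data.frame(
  id = rep(1:7, each = length(t)),
  t = rep(t, times = 7),
  nltt = c(
    0.2000000, 0.2000000, 0.2000000, 0.2000000, 0.3000000, 1,
    0.1000000, 0.1000000, 0.1000000, 0.1000000, 0.2000000, 1,
    0.0666667, 0.0666667, 0.1000000, 0.1333333, 0.2333333, 1,
    0.0500000, 0.0500000, 0.0500000, 0.1000000, 0.2750000, 1,
    0.0400000, 0.0600000, 0.0600000, 0.0600000, 0.1000000, 1,
    0.0333333, 0.0333333, 0.0666667, 0.0666667, 0.0833333, 1,
    0.0285714, 0.0285714, 0.0285714, 0.0428571, 0.1000000, 1
  )
)

# Mean and range of the nLTT values per timepoint, over all phylogenies
means <- tapply(nltt_values$nltt, nltt_values$t, mean)
lowest <- tapply(nltt_values$nltt, nltt_values$t, min)
highest <- tapply(nltt_values$nltt, nltt_values$t, max)
data.frame(t = t, mean = unname(means), min = unname(lowest), max = unname(highest))
```

The larger trees start at a lower nLTT value because a single lineage is a smaller fraction of their tips, which is why the mean stays low until the last interval.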
Here we see an averaged nLTT plot, where the original nLTT values are still visible.
The data frame is recreated first, so that the plot uses the seven random trees:
df <- nLTT::get_nltt_values(phylogenies, dt = 0.01)
ggplot2::qplot(t, nltt, data = df, geom = "point", ylim = c(0, 1),
  main = "Average nLTT plot of phylogenies", color = id, size = I(0.1)
) + ggplot2::stat_summary(
  fun.data = "mean_cl_boot", color = "red", geom = "smooth"
)
Here we see an averaged nLTT plot, with the original nLTT values omitted:
ggplot2::qplot(t, nltt, data = df, geom = "blank", ylim = c(0, 1),
  main = "Average nLTT plot of phylogenies"
) + ggplot2::stat_summary(
  fun.data = "mean_cl_boot", color = "red", geom = "smooth"
)