Does brain size scale as the 2/3 power of body size?
How should we analyse the data to answer this research question?
We are primarily interested in the slope coefficient, and do not have a predictor and a response, rather we have two response variables. So we want to avoid regression to the mean and should look at this as a multivariate problem, and use principal components analysis (or related methods).
How steep is the slope of the line representing the leaf economics spectrum? It is steeper than one? Does the steepness of the line vary across environments?
What method should [Ian] use to do this?
We are primarily interested in the slope coefficients, and do not have a predictor and a response, rather we have two response variables. So we want to avoid regression to the mean and should look at this as a multivariate problem, and use principal components analysis (or related methods) and generalisations designed to compare slopes of several such axes.
library(MASS)
data(Animals)
=lm(log(brain)~log(body),data=Animals)
ftBrainBodyconfint(ftBrainBody)
#> 2.5 % 97.5 %
#> (Intercept) 1.7056829 3.4041133
#> log(body) 0.3353152 0.6566742
This confidence interval does not cover 2/3
so suggests
the data do not fit the 2/3 power law.
=lm(log(body)~log(brain),data=Animals)
ftBodyBrainconfint(ftBodyBrain)
#> 2.5 % 97.5 %
#> (Intercept) -3.6396580 0.3396307
#> log(brain) 0.8281789 1.6218881
Flipping x
and y
axes, this confidence
interval does cover 3/2
, which suggests that the
data do fit the 2/3 power law!
library(smatr)
= sma(brain~body, data=Animals,log="xy",slope.test=2/3)
sma_brainBody
sma_brainBody#> Call: sma(formula = brain ~ body, data = Animals, log = "xy", slope.test = 2/3)
#>
#> Fit using Standardized Major Axis
#>
#> These variables were log-transformed before fitting: xy
#>
#> Confidence intervals (CI) are at 95%
#>
#> ------------------------------------------------------------
#> Coefficients:
#> elevation slope
#> estimate 0.8797718 0.6363038
#> lower limit 0.4999123 0.4955982
#> upper limit 1.2596314 0.8169572
#>
#> H0 : variables uncorrelated
#> R-squared : 0.6076101
#> P-value : 1.0169e-06
#>
#> ------------------------------------------------------------
#> H0 : slope not different from 0.6666667
#> Test statistic : r= -0.07424 with 26 degrees of freedom under H0
#> P-value : 0.70734
Here we got \(r=-0.07\), \(P=0.71\) and conclude there is no evidence
against the 2/3 power law. Note that 2/3
is towards the
centre of the confidence interval for the SMA slope.
Reversing the axes:
sma(body~brain, data=Animals,log="xy",slope.test=3/2)
#> Call: sma(formula = body ~ brain, data = Animals, log = "xy", slope.test = 3/2)
#>
#> Fit using Standardized Major Axis
#>
#> These variables were log-transformed before fitting: xy
#>
#> Confidence intervals (CI) are at 95%
#>
#> ------------------------------------------------------------
#> Coefficients:
#> elevation slope
#> estimate -1.3826286 1.571576
#> lower limit -2.2584635 1.224054
#> upper limit -0.5067936 2.017763
#>
#> H0 : variables uncorrelated
#> R-squared : 0.6076101
#> P-value : 1.0169e-06
#>
#> ------------------------------------------------------------
#> H0 : slope not different from 1.5
#> Test statistic : r= 0.07424 with 26 degrees of freedom under H0
#> P-value : 0.70734
We get exactly the same test results: \(r=-0.07\), \(P=0.71\) and reach the same conclusion.
Note that 3/2
is towards the centre of the confidence
interval for the SMA slope.
Is this what you would have expected? Is there evidence against the 2/3 power law?
This is as expected, (S)MA is invariant under flipping of
x
and y
. As above there is no evidence against
the 2/3 power law.
smatr
data(leaflife)
= sma(longev~lma*site, log="xy", data=leaflife)
leafSlopes summary(leafSlopes)
#> Call: sma(formula = longev ~ lma * site, data = leaflife, log = "xy")
#>
#> Fit using Standardized Major Axis
#>
#> These variables were log-transformed before fitting: xy
#>
#> Confidence intervals (CI) are at 95%
#>
#> ------------------------------------------------------------
#> Results of comparing lines among groups.
#>
#> H0 : slopes are equal.
#> Likelihood ratio statistic : 9.382 with 3 degrees of freedom
#> P-value : 0.02462
#> ------------------------------------------------------------
#>
#> Coefficients by group in variable "site"
#>
#> Group: 1
#> elevation slope
#> estimate -4.218236 2.119823
#> lower limit -5.903527 1.451816
#> upper limit -2.532946 3.095192
#>
#> H0 : variables uncorrelated.
#> R-squared : 0.5039152
#> P-value : 0.0014113
#>
#> Group: 2
#> elevation slope
#> estimate -2.321737 1.1768878
#> lower limit -3.475559 0.7631512
#> upper limit -1.167915 1.8149286
#>
#> H0 : variables uncorrelated.
#> R-squared : 0.3407371
#> P-value : 0.013891
#>
#> Group: 3
#> elevation slope
#> estimate -2.523634 1.1825381
#> lower limit -3.135266 0.9398479
#> upper limit -1.912003 1.4878965
#>
#> H0 : variables uncorrelated.
#> R-squared : 0.7392636
#> P-value : 1.462e-07
#>
#> Group: 4
#> elevation slope
#> estimate -3.837710 1.786551
#> lower limit -5.291926 1.257257
#> upper limit -2.383495 2.538672
#>
#> H0 : variables uncorrelated.
#> R-squared : 0.80651
#> P-value : 0.00041709
plot(leafSlopes)
smatr
= subset(leaflife, soilp == "low")
leaf_low_soilp = sma(longev~lma+rain, log="xy", data=leaf_low_soilp)
leafElev
leafElev#> Call: sma(formula = longev ~ lma + rain, data = leaf_low_soilp, log = "xy")
#>
#> Fit using Standardized Major Axis
#>
#> These variables were log-transformed before fitting: xy
#>
#> Confidence intervals (CI) are at 95%
#>
#> ------------------------------------------------------------
#> Results of comparing lines among groups.
#>
#> H0 : slopes are equal.
#> Likelihood ratio statistic : 2.367 with 1 degrees of freedom
#> P-value : 0.12395
#> ------------------------------------------------------------
#>
#> H0 : no difference in elevation.
#> Wald statistic: 6.566 with 1 degrees of freedom
#> P-value : 0.010393
#> ------------------------------------------------------------
#>
#> Use the summary() function to print coefficients by group.
par(mfrow=c(1,2),mar=c(3,3,1,1),mgp=c(1.75,0.75,0))
plot(sma_brainBody,which="residual") # residual plot
{abline(a=0,b=0,col="red")}
qqnorm(residuals(sma_brainBody))
qqline(residuals(sma_brainBody), col="red")
= sma(brain~body, data=Animals,log="xy",
sma_brainBodyRobust slope.test=2/3,robust=TRUE)
sma_brainBodyRobust#> Call: sma(formula = brain ~ body, data = Animals, log = "xy", slope.test = 2/3,
#> robust = TRUE)
#>
#> Fit using Standardized Major Axis
#>
#> These variables were log-transformed before fitting: xy
#>
#> Confidence intervals (CI) are at 95%
#>
#> ------------------------------------------------------------
#> Coefficients:
#> elevation slope
#> estimate 0.8367912 0.7541820
#> lower limit 0.6056688 0.6452409
#> upper limit 1.0679136 0.8815164
#>
#> H0 : variables uncorrelated
#> R-squared : 0.6076101
#> P-value : 1.0169e-06
#>
#> ------------------------------------------------------------
#> H0 : slope not different from 0.6666667
#> Test statistic : r= 0.3458 with 26 degrees of freedom under H0
#> P-value : 0.11673
plot(brain~body,data=Animals,log="xy")
abline(sma_brainBody, col="red")
abline(sma_brainBodyRobust, col="blue")
Notice that this line is slightly steeper than previously, because it is less sensitive to the outliers below the line, also note that the confidence interval is narrower. Is this what you expected to happen?
Yes – the outliers were towards the bottom-right, so giving them less weight in analysis should drag the line up towards the rest of the data. It makes sense that CIs are narrower because outliers make variances and covariances inefficient estimators, so by using a method that can better handle outliers we can get more precise estimates.
Repeat the analyses of the brain-body mass data, in Code Boxes
13.2 and 13.6, excluding the three dinosaur species from analysis by
using data=AnimalsSnipped[-c(6,16,26),]
.
=Animals[-c(6,16,26),]
AnimalsSnipped= sma(brain~body, data=AnimalsSnipped,log="xy",slope.test=2/3)
sma_brainBody
sma_brainBody#> Call: sma(formula = brain ~ body, data = AnimalsSnipped, log = "xy",
#> slope.test = 2/3)
#>
#> Fit using Standardized Major Axis
#>
#> These variables were log-transformed before fitting: xy
#>
#> Confidence intervals (CI) are at 95%
#>
#> ------------------------------------------------------------
#> Coefficients:
#> elevation slope
#> estimate 0.8927447 0.7835628
#> lower limit 0.7115619 0.6946736
#> upper limit 1.0739275 0.8838260
#>
#> H0 : variables uncorrelated
#> R-squared : 0.9216991
#> P-value : 3.2428e-14
#>
#> ------------------------------------------------------------
#> H0 : slope not different from 0.6666667
#> Test statistic : r= 0.5016 with 23 degrees of freedom under H0
#> P-value : 0.010623
This suddenly gives a significant effect (\(r=0.50\), \(P=0.01\)), with the slope having jumped up
from 0.64
to 0.78
. Using robust methods:
= sma(brain~body, data=AnimalsSnipped,log="xy",
sma_brainBodyRobust slope.test=2/3,robust=TRUE)
sma_brainBodyRobust#> Call: sma(formula = brain ~ body, data = AnimalsSnipped, log = "xy",
#> slope.test = 2/3, robust = TRUE)
#>
#> Fit using Standardized Major Axis
#>
#> These variables were log-transformed before fitting: xy
#>
#> Confidence intervals (CI) are at 95%
#>
#> ------------------------------------------------------------
#> Coefficients:
#> elevation slope
#> estimate 0.8844792 0.7709489
#> lower limit 0.7222772 0.6990277
#> upper limit 1.0466812 0.8502700
#>
#> H0 : variables uncorrelated
#> R-squared : 0.9216991
#> P-value : 3.2428e-14
#>
#> ------------------------------------------------------------
#> H0 : slope not different from 0.6666667
#> Test statistic : r= 0.5261 with 23 degrees of freedom under H0
#> P-value : 0.005345
Is robust SMA less sensitive to the dinosaur outliers? Is this what you expected?
The results changed in both cases, but in a less dramatic way when
using robust SMA. The robust SMA slope only changed from
0.75
to 0.77
on outlier removal, whereas the
regular SMA slope changed from 0.64
to 0.78
.
In both cases there is significant evidence against the 2/3 power law
after outlier removal, but in the robust case, this change was less
dramatic. It is expected that robust SMA would be less sensitive to
outliers.
Repeat the analysis of Ian’s leaf economics data, as in Code Boxes 13.3-13.4, using robust=TRUE. Do results work out differently?
= sma(longev~lma*site, log="xy", data=leaflife, robust=TRUE)
leafSlopes summary(leafSlopes)
#> Call: sma(formula = longev ~ lma * site, data = leaflife, log = "xy",
#> robust = TRUE)
#>
#> Fit using Standardized Major Axis
#>
#> These variables were log-transformed before fitting: xy
#>
#> Confidence intervals (CI) are at 95%
#>
#> ------------------------------------------------------------
#> Results of comparing lines among groups.
#>
#> H0 : slopes are equal.
#> Likelihood ratio statistic : 15.03 with 3 degrees of freedom
#> P-value : 0.001789
#> ------------------------------------------------------------
#>
#> Coefficients by group in variable "site"
#>
#> Group: 1
#> elevation slope
#> estimate -4.811259 2.411612
#> lower limit -6.589002 1.694694
#> upper limit -3.033515 3.431812
#>
#> H0 : variables uncorrelated.
#> R-squared : 0.5039152
#> P-value : 0.0014113
#>
#> Group: 2
#> elevation slope
#> estimate -2.824595 1.4019760
#> lower limit -4.040935 0.9552956
#> upper limit -1.608255 2.0575167
#>
#> H0 : variables uncorrelated.
#> R-squared : 0.3407371
#> P-value : 0.013891
#>
#> Group: 3
#> elevation slope
#> estimate -2.448110 1.1510462
#> lower limit -2.979219 0.9373065
#> upper limit -1.917001 1.4135261
#>
#> H0 : variables uncorrelated.
#> R-squared : 0.7392636
#> P-value : 1.462e-07
#>
#> Group: 4
#> elevation slope
#> estimate -4.142043 1.925544
#> lower limit -5.680918 1.361460
#> upper limit -2.603167 2.723342
#>
#> H0 : variables uncorrelated.
#> R-squared : 0.80651
#> P-value : 0.00041709
This result is fairly similar, although there is stronger evidence now that the slopes are not the same, perhaps because the slope for group 1 is now noticeably steeper.
= subset(leaflife, soilp == "low")
leaf_low_soilp = sma(longev~lma+rain, log="xy", data=leaf_low_soilp, robust=TRUE)
leafElev
leafElev#> Call: sma(formula = longev ~ lma + rain, data = leaf_low_soilp, log = "xy",
#> robust = TRUE)
#>
#> Fit using Standardized Major Axis
#>
#> These variables were log-transformed before fitting: xy
#>
#> Confidence intervals (CI) are at 95%
#>
#> ------------------------------------------------------------
#> Results of comparing lines among groups.
#>
#> H0 : slopes are equal.
#> Likelihood ratio statistic : 1.84 with 1 degrees of freedom
#> P-value : 0.17498
#> ------------------------------------------------------------
#>
#> H0 : no difference in elevation.
#> Wald statistic: 6.017 with 1 degrees of freedom
#> P-value : 0.014166
#> ------------------------------------------------------------
#>
#> Use the summary() function to print coefficients by group.
When comparing elevation of lines fitted to low soil nutrient sites, there does not seem to be evidence of a difference due to rainfall.