cvCovEst
Cross-Validated Covariance Matrix Estimation
Authors: Philippe Boileau, Brian Collica, and Nima Hejazi
cvCovEst implements an efficient cross-validated procedure for covariance
matrix estimation, particularly useful in high-dimensional settings. The
general methodology uses cross-validation to data-adaptively identify the
optimal estimator of the covariance matrix from a prespecified set of
candidate estimators. An overview of the framework is provided in the package
vignette. For a more detailed description, see Boileau et al. (2021). A
suite of plotting and diagnostic tools is also included.
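To make the idea of cross-validated estimator selection concrete, here is a minimal base R sketch. It is not the cvCovEst implementation: the three candidate estimators, the simulated data, and the loss (squared Frobenius distance to the validation-fold sample covariance) are all simplifying assumptions chosen for illustration.

```r
# Conceptual sketch only: NOT how cvCovEst is implemented internally.
set.seed(1584)
n <- 60
p <- 10
Sigma <- matrix(0.5, nrow = p, ncol = p) + diag(0.5, nrow = p)

# Simulate n draws from N(0, Sigma) via the Cholesky factor (base R only).
dat <- matrix(rnorm(n * p), nrow = n, ncol = p) %*% chol(Sigma)

# Hypothetical candidates: each maps a data matrix to a p x p estimate.
candidates <- list(
  sample_cov = function(x) cov(x),
  diagonal = function(x) diag(diag(cov(x))),
  shrink_half = function(x) {
    s <- cov(x)
    0.5 * s + 0.5 * diag(diag(s))
  }
)

# Randomly assign each observation to one of V folds.
v_folds <- 5
fold_id <- sample(rep(seq_len(v_folds), length.out = n))

# For each fold, fit every candidate on the training rows and score it by the
# squared Frobenius distance to the validation-fold sample covariance.
fold_losses <- sapply(seq_len(v_folds), function(v) {
  train <- dat[fold_id != v, , drop = FALSE]
  valid <- dat[fold_id == v, , drop = FALSE]
  valid_cov <- cov(valid)
  sapply(candidates, function(est) norm(est(train) - valid_cov, type = "F")^2)
})

# Average over folds to get a cross-validated risk per candidate, then select
# the candidate with the smallest estimated risk.
cv_risk <- rowMeans(fold_losses)
selected <- names(which.min(cv_risk))
cv_risk
selected
```

The package automates the same loop over a much richer library of estimators and hyperparameter grids, as the toy example further down shows.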
For standard use, install cvCovEst from CRAN:
install.packages("cvCovEst")
The development version of the package may be installed from GitHub using
remotes:
remotes::install_github("PhilBoileau/cvCovEst")
To illustrate how cvCovEst may be used to select an optimal covariance matrix
estimator via cross-validation, consider the following toy example:
library(MASS)
library(cvCovEst)
set.seed(1584)

# generate a 50x50 covariance matrix with unit variances and off-diagonal
# elements equal to 0.5
Sigma <- matrix(0.5, nrow = 50, ncol = 50) + diag(0.5, nrow = 50)

# sample 50 observations from multivariate normal with mean = 0, var = Sigma
dat <- mvrnorm(n = 50, mu = rep(0, 50), Sigma = Sigma)

# run CV-selector
cv_cov_est_out <- cvCovEst(
  dat = dat,
  estimators = c(linearShrinkLWEst, denseLinearShrinkEst,
                 thresholdingEst, poetEst, sampleCovEst),
  estimator_params = list(
    thresholdingEst = list(gamma = c(0.2, 2)),
    poetEst = list(lambda = c(0.1, 0.2), k = c(1L, 2L))
  ),
  cv_loss = cvMatrixFrobeniusLoss,
  cv_scheme = "v_fold",
  v_folds = 5
)

# print the table of risk estimates
# NOTE: the estimated covariance matrix is accessible via the `$estimate` slot
cv_cov_est_out$risk_df
#> # A tibble: 9 × 3
#>   estimator            hyperparameters      cv_risk
#>   <chr>                <chr>                  <dbl>
#> 1 linearShrinkLWEst    hyperparameters = NA    357.
#> 2 poetEst              lambda = 0.2, k = 1     369.
#> 3 poetEst              lambda = 0.2, k = 2     372.
#> 4 poetEst              lambda = 0.1, k = 2     375.
#> 5 poetEst              lambda = 0.1, k = 1     376.
#> 6 denseLinearShrinkEst hyperparameters = NA    379.
#> 7 sampleCovEst         hyperparameters = NA    379.
#> 8 thresholdingEst      gamma = 0.2             384.
#> 9 thresholdingEst      gamma = 2               826.
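Because the toy example simulates data from a known covariance matrix, an estimate can also be compared directly against the truth. The sketch below is an illustration under assumptions: `frobenius_error` is a hypothetical helper, not a cvCovEst function, and the data are simulated with base R (via a Cholesky factor) rather than `MASS::mvrnorm`, so the block runs without either package.

```r
# Hypothetical helper (not part of cvCovEst): Frobenius-norm distance between
# a covariance estimate and a known true covariance matrix.
frobenius_error <- function(estimate, truth) {
  norm(estimate - truth, type = "F")
}

set.seed(1584)
n <- 50
p <- 50

# Same structure as the toy example: unit variances, 0.5 off the diagonal.
Sigma <- matrix(0.5, nrow = p, ncol = p) + diag(0.5, nrow = p)

# Draw n observations from N(0, Sigma) using the Cholesky factor of Sigma.
dat <- matrix(rnorm(n * p), nrow = n, ncol = p) %*% chol(Sigma)

# Baseline error of the plain sample covariance matrix.
sample_error <- frobenius_error(cov(dat), Sigma)
sample_error

# With a fitted cvCovEst object from the example above, the selected estimate
# could be scored the same way:
# frobenius_error(cv_cov_est_out$estimate, Sigma)
```

In real analyses the true covariance matrix is unknown, which is why the cross-validated risk estimates in the table above serve as the selection criterion.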
If you encounter any bugs or have any specific feature requests, please file an issue.
Contributions are very welcome. Interested contributors should consult our contribution guidelines prior to submitting a pull request.
Please cite the following paper when using the cvCovEst R software package.
@article{cvCovEst2021,
doi = {10.21105/joss.03273},
url = {https://doi.org/10.21105/joss.03273},
year = {2021},
publisher = {The Open Journal},
volume = {6},
number = {63},
pages = {3273},
author = {Philippe Boileau and Nima S. Hejazi and Brian Collica and Mark J. van der Laan and Sandrine Dudoit},
title = {cvCovEst: Cross-validated covariance matrix estimator selection and evaluation in `R`},
journal = {Journal of Open Source Software}
}
When describing or discussing the theory underlying the cvCovEst method, or
simply using the method, please cite the pre-print below.
@misc{boileau2021,
title={Cross-Validated Loss-Based Covariance Matrix Estimator Selection in High Dimensions},
author={Philippe Boileau and Nima S. Hejazi and Mark J. van der Laan and Sandrine Dudoit},
year={2021},
eprint={2102.09715},
archivePrefix={arXiv},
primaryClass={stat.ME}
}
© 2020-2022 Philippe Boileau
The contents of this repository are distributed under the MIT license. See
file LICENSE.md for details.