Title: | Estimation of Large Block Covariance Matrices |
---|---|
Description: | Computation of large covariance matrices having a block structure up to a permutation of their columns and rows from a small number of samples with respect to the dimension of the matrix. The method is described in the paper Perrot-Dockès et al. (2019) <arXiv:1806.10093>. |
Authors: | M. Perrot-Dock\`es, C. Lévy-Leduc |
Maintainer: | Marie Perrot-Dockès <[email protected]> |
License: | GPL (>= 2) |
Version: | 0.1.1 |
Built: | 2024-11-08 03:42:03 UTC |
Source: | https://github.com/cran/BlockCov |
Title
cv_bl(E, v_ord, N)
cv_bl(E, v_ord, N)
E |
the observation matrix such that each of its row has a block structure correlation matrix Sigma to estimate up to a permutation of its columns and rows. |
v_ord |
the absolute value of the upper triangular part matrix |
N |
number of replication in the "cross-validation" |
In order to get the treshold one must do rev(v_ord)[cv_bl(E, v_ord, N=N)]
the number of non null values selected for the estimation of the covariance matrix
n <- 30 q <- 100 Sigma <- Simu_Sigma(q = q, diag = FALSE, equal = TRUE) Matrix::image(Sigma) E <- matrix(rnorm(n * q), ncol = q) %*% chol(as.matrix(Sigma)) k <- 5 v_up <- est_up(E, k = k) a_vup <- abs(v_up) ord_vup <- order(a_vup) v_ord <- a_vup[ord_vup] N <- 10 nb_nn0 <- cv_bl(E, v_ord, N=N) tresh <- rev(v_ord)[nb_nn0]
n <- 30 q <- 100 Sigma <- Simu_Sigma(q = q, diag = FALSE, equal = TRUE) Matrix::image(Sigma) E <- matrix(rnorm(n * q), ncol = q) %*% chol(as.matrix(Sigma)) k <- 5 v_up <- est_up(E, k = k) a_vup <- abs(v_up) ord_vup <- order(a_vup) v_ord <- a_vup[ord_vup] N <- 10 nb_nn0 <- cv_bl(E, v_ord, N=N) tresh <- rev(v_ord)[nb_nn0]
Title
est_up(E, k = 5)
est_up(E, k = 5)
E |
the observation matrix such that each of its row has a block structure correlation matrix Sigma wich has a low rank once its diagonal is removed. |
k |
the rank of the correlation matrix of |
an approximation of the correlation matrix of E
with its diagonal removed
n <- 30 q <- 100 Sigma <- Simu_Sigma(q = q, diag = FALSE, equal = TRUE) Matrix::image(Sigma) E <- matrix(rnorm(n * q), ncol = q) %*% chol(as.matrix(Sigma)) k <- 5 v_up <- est_up(E, k = k)
n <- 30 q <- 100 Sigma <- Simu_Sigma(q = q, diag = FALSE, equal = TRUE) Matrix::image(Sigma) E <- matrix(rnorm(n * q), ncol = q) %*% chol(as.matrix(Sigma)) k <- 5 v_up <- est_up(E, k = k)
Title
PA(E, times = 10)
PA(E, times = 10)
E |
the observation matrix such that each of its row has a block structure correlation matrix Sigma wich has a low rank once its diagonal is removed. |
times |
number of random sampling |
the mean of the eigen values of the times
sampled matrix
n <- 30 q <- 100 Sigma <- Simu_Sigma(q = q, diag = FALSE, equal = TRUE) Matrix::image(Sigma) E <- matrix(rnorm(n * q), ncol = q) %*% chol(as.matrix(Sigma)) random_eigen <- PA(E, times = 10)
n <- 30 q <- 100 Sigma <- Simu_Sigma(q = q, diag = FALSE, equal = TRUE) Matrix::image(Sigma) E <- matrix(rnorm(n * q), ncol = q) %*% chol(as.matrix(Sigma)) random_eigen <- PA(E, times = 10)
This function computes an estimator of the covariance matrix and the square root of its inverse and permutes its rows and columns if it is necessary to make the block structure appear.
Sigma_estimation(E, k = NULL, nb_nn0 = NULL, big = FALSE, reorder = FALSE, inv_12 = FALSE, method_k = "Cattell", times = 10, method_0 = "Elbow", N = 10)
Sigma_estimation(E, k = NULL, nb_nn0 = NULL, big = FALSE, reorder = FALSE, inv_12 = FALSE, method_k = "Cattell", times = 10, method_0 = "Elbow", N = 10)
E |
the observation matrix such that each of its row has a block structure correlation matrix Sigma to estimate up to a permutation of its columns and rows. |
k |
numerical or NULL, the rank for the low rank approximation. If NULL the rank is computed using the slope_change function applied on the eigenvalues of the low rank part of Sigma. Default to NULL. |
nb_nn0 |
numerical or NULL, corresponds to the number of non null values to keep in the estimation of the covariance matrix. If NULL the number of non null values is computed using the slope_change function to the Frobenius norm of the difference between the empirical correlation matrix and its estimation with nb_nn0 non null values. Default to NULL. |
big |
logical, default to FALSE. If the dataset is too big the empirical correlation is calculated by crossprod(E) * 1 / n to fasten the computation |
reorder |
logical, default to FALSE. Whether or not the columns of E are permuted. If TRUE a hierarchical clustering is first performed and the columns are permuted according to it. |
inv_12 |
logical, default to FALSE Whether or not computing the square root of the inverse of the covariance matrix. |
method_k |
character if "Cattell" (the default) then the Cattell criterion (Cattell 1966) is performed on the singular values of the covariance matrix. to estimate the number of rank use in the low rank approximation, while "PA" use the parrallel analysis (Horn 1965) wich can be more accurate if the number of rows of E is not to small but which is much slower. |
times |
numeric the number of resampling done for the "PA" method, ignored if metod_k is different from "PA". |
method_0 |
character if "Elbow" (the default) then the Elbow criterion (see Perrot-Dockès et al. (2018) for details) is performed to estimate the number of rank use in the low rank approximation, while "BL" use the approach proposed in Bickel and Levina (2008) based on cross-validation wich can be more accurate if the number of rows of E is not to small but which is much slower. |
N |
numeric the number of fold used for the "BL" method. Ignored if method_0 is different from "BL" |
A list with the elements
Sigma_est |
estimator of the covariance matrix |
k |
rank of the low rank part of the covariance matrix |
nb_nn0 |
number of non null values of the upper triangular part of the covariance matrix |
S_inv_12 |
square root of the inverse of the estimated covariance matrix |
order |
permutation to apply to the rows and the columns of the covariance to make the block structure appear |
Bickel PJ, Levina E (2008).
“Covariance regularization by thresholding.”
Ann. Statist., 36(6), 2577–2604.
doi:10.1214/08-AOS600.
Cattell RB (1966).
“The scree test for the number of factors.”
Multivariate behavioral research, 1(2), 245-276.
Horn JL (1965).
“A rationale and test for the number of factors in factor analysis.”
Psychometrika, 30(2), 179–185.
ISSN 1860-0980, doi:10.1007/BF02289447.
Perrot-Dockès M, Lévy-Leduc C, Rajjou L (2018).
“Estimation of large block structured covariance matrices: Application to "multi-omic" approaches to study seed quality.”
arXiv:1806.10093.
n <- 30 q <- 100 Sigma <- Simu_Sigma(q = q, diag = FALSE, equal = TRUE) Matrix::image(Sigma) E <- matrix(rnorm(n * q), ncol = q) %*% chol(as.matrix(Sigma)) res <- Sigma_estimation(E, inv_12 = TRUE) Matrix::image(res$Sigma_est) Matrix::image(res$S_inv_12)
n <- 30 q <- 100 Sigma <- Simu_Sigma(q = q, diag = FALSE, equal = TRUE) Matrix::image(Sigma) E <- matrix(rnorm(n * q), ncol = q) %*% chol(as.matrix(Sigma)) res <- Sigma_estimation(E, inv_12 = TRUE) Matrix::image(res$Sigma_est) Matrix::image(res$S_inv_12)
This function generates a block structured symmetric positive definite matrix to test the BlockCov methodology.
Simu_Sigma(q, diag = TRUE, equal = TRUE)
Simu_Sigma(q, diag = TRUE, equal = TRUE)
q |
integer corresponding to the size of the covariance matrix. |
diag |
logical, whether or not the covariance matrix is block-diagonal. |
equal |
logical, whether or not the values in the blocks are equal. |
Sigma a correlation matrix to test the BlockCov methodology.
Sigma <- Simu_Sigma(q = 100, diag = FALSE, equal = TRUE) Matrix::image(Sigma)
Sigma <- Simu_Sigma(q = 100, diag = FALSE, equal = TRUE) Matrix::image(Sigma)
This function fits to a numerical vector sorted in the non decreasing order two simple linear regressions and returns the index corresponding to the estimated change between the two regression models.
slope_change(Y)
slope_change(Y)
Y |
numerical vector sorted in the non decreasing order. |
K the index corresponding to the estimated change between the two linear regression models.
n <- 30 q <- 100 Sigma <- Simu_Sigma(q = q, diag = FALSE, equal = TRUE) Matrix::image(Sigma) E <- matrix(rnorm(n * q), ncol = q) %*% chol(as.matrix(Sigma)) corE <- cor(as.matrix(E)) vec_up_emp <- corE[upper.tri(corE)] G <- matrix(0, ncol = (q - 1), nrow = (q - 1)) G[upper.tri(G, diag = TRUE)] <- vec_up_emp G[lower.tri(G)] <- t(as.matrix(G))[lower.tri(t(as.matrix(G)))] res_svd <- svd(G) vp <- res_svd$d slope_change(vp)
n <- 30 q <- 100 Sigma <- Simu_Sigma(q = q, diag = FALSE, equal = TRUE) Matrix::image(Sigma) E <- matrix(rnorm(n * q), ncol = q) %*% chol(as.matrix(Sigma)) corE <- cor(as.matrix(E)) vec_up_emp <- corE[upper.tri(corE)] G <- matrix(0, ncol = (q - 1), nrow = (q - 1)) G[upper.tri(G, diag = TRUE)] <- vec_up_emp G[lower.tri(G)] <- t(as.matrix(G))[lower.tri(t(as.matrix(G)))] res_svd <- svd(G) vp <- res_svd$d slope_change(vp)