RESEARCH PRODUCT

How to simulate normal data sets with the desired correlation structure

Alberto Ferrer, Francisco Arteaga

subject

Mathematical optimization; Covariance function; Covariance matrix; Process Chemistry and Technology; MathematicsofComputing_NUMERICALANALYSIS; Multivariate normal distribution; Covariance; Computer Science Applications; Analytical Chemistry; Estimation of covariance matrices; Scatter matrix; Matrix normal distribution; CMA-ES; Algorithm; Computer Science::Databases; Spectroscopy; Software; Mathematics

description

The Cholesky decomposition is a widely used method to draw samples from a multivariate normal distribution with a non-singular covariance matrix. In this work we introduce a simple method based on the singular value decomposition (SVD) to simulate multivariate normal data even if the covariance matrix is singular, which is often the case in chemometric problems. The covariance matrix can be specified by the user or generated by specifying a subset of its eigenvalues. The latter option can be an advantage when simulating data sets with a particular latent structure. Such simulated data can be useful for testing the performance of chemometric methods on data sets that match the theoretical conditions for their applicability, for checking their robustness when the hypothesized properties fail, or for generating data from multi-stage or multi-phase processes.
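The idea described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names (`svd_factor`, `simulate_mvn_svd`, `covariance_from_eigenvalues`) are hypothetical, and two choices are assumptions made only for illustration, namely that unspecified eigenvalues are set to zero and that the eigenvectors are a random orthonormal basis. The sketch relies on the fact that, for a symmetric positive semi-definite matrix with SVD Σ = U S Uᵀ, the factor A = U S^(1/2) satisfies A Aᵀ = Σ even when Σ is singular.

```python
import numpy as np


def svd_factor(cov):
    """Return A with A @ A.T == cov for a symmetric PSD matrix cov."""
    cov = np.asarray(cov, dtype=float)
    # Singular values are non-negative, so the square root is always defined,
    # including when some of them are zero (singular covariance).
    u, s, _ = np.linalg.svd(cov)
    return u * np.sqrt(s)  # scales column j of U by sqrt(s[j])


def simulate_mvn_svd(cov, n_samples, mean=None, seed=None):
    """Draw n_samples rows from N(mean, cov) using the SVD factor of cov."""
    a = svd_factor(cov)
    p = a.shape[0]
    mean = np.zeros(p) if mean is None else np.asarray(mean, dtype=float)
    rng = np.random.default_rng(seed)
    z = rng.standard_normal((n_samples, p))  # independent N(0, 1) draws
    return mean + z @ a.T                    # each row has covariance A A^T


def covariance_from_eigenvalues(eigenvalues, p, seed=None):
    """Build a p x p covariance matrix from a subset of its eigenvalues.

    Assumption (illustrative only): the remaining eigenvalues are zero and
    the eigenvectors form a random orthonormal basis.
    """
    eig = np.zeros(p)
    eig[: len(eigenvalues)] = eigenvalues
    rng = np.random.default_rng(seed)
    q, _ = np.linalg.qr(rng.standard_normal((p, p)))  # random orthogonal Q
    return (q * eig) @ q.T                            # Q diag(eig) Q^T


# Usage: a rank-2 covariance in 5 dimensions, i.e. two latent components.
cov = covariance_from_eigenvalues([4.0, 1.0], p=5, seed=0)
x = simulate_mvn_svd(cov, n_samples=1000, seed=1)
```

Because `cov` in the usage example has rank 2, a Cholesky-based sampler would generally not be applicable to it, whereas the SVD factor is still well defined.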

https://doi.org/10.1016/j.chemolab.2009.12.003