RESEARCH PRODUCT

Asymptotic efficiency of the calibration estimator in a high-dimensional data setting

Camelia Goga, Guillaume Chauvet

subject

Statistics and Probability; Variance inflation factor; Auxiliary variables; Variable (computer science); Calibration (statistics); Applied Mathematics; Statistics; Estimator; Variance (accounting); Statistics, Probability and Uncertainty; Population sampling; Mathematics

description

Abstract: In a finite population sampling survey, auxiliary information is commonly used to improve Horvitz-Thompson estimators, and calibration has been used extensively by national statistical agencies over the last decades for that purpose. This method makes estimators consistent with known totals of auxiliary variables and reduces variance if the calibration variables are explanatory for the variable of interest. High-dimensional auxiliary data sets are no longer unusual, and adding too many calibration variables may increase the variance of calibration estimators. In this paper we study the asymptotic efficiency of the calibration estimator with high-dimensional auxiliary data sets, and we prove that it may suffer from additional variability that cannot be neglected under certain conditions. We suggest a bootstrap criterion for choosing calibration variables. A short simulation study shows that the proposed method may lead to a more parsimonious set of calibration variables, with associated weights of smaller variation and no variance inflation.
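
To illustrate what "consistent with known totals" means, here is a minimal sketch of linear (chi-square distance) calibration, which is one common choice and not necessarily the setup used in the paper. The function name `linear_calibration_weights`, the simulated population, and the simple random sampling design are all assumptions made for illustration only.

```python
import numpy as np


def linear_calibration_weights(d, X, totals):
    """Linear calibration: w_k = d_k * (1 + x_k' lam), with lam chosen so
    that the weighted sample totals of X match the known population totals.
    (Hypothetical helper, not taken from the paper.)"""
    d = np.asarray(d, dtype=float)
    X = np.asarray(X, dtype=float)
    totals = np.asarray(totals, dtype=float)
    ht_totals = X.T @ d                # Horvitz-Thompson totals of the auxiliaries
    M = X.T @ (d[:, None] * X)         # sum over the sample of d_k * x_k x_k'
    lam = np.linalg.solve(M, totals - ht_totals)
    return d * (1.0 + X @ lam)


# Simulated population (assumption): intercept plus two auxiliary variables.
rng = np.random.default_rng(0)
N, n, p = 1000, 100, 3
X_pop = np.column_stack([np.ones(N), rng.normal(size=(N, p - 1))])
y_pop = X_pop @ np.array([5.0, 2.0, -1.0]) + rng.normal(size=N)

# Simple random sample without replacement, so every design weight is N / n.
sample = rng.choice(N, size=n, replace=False)
d = np.full(n, N / n)
X_s, y_s = X_pop[sample], y_pop[sample]

totals = X_pop.sum(axis=0)             # known population totals of the auxiliaries
w = linear_calibration_weights(d, X_s, totals)

ht_estimate = d @ y_s                  # Horvitz-Thompson estimate of the total of y
cal_estimate = w @ y_s                 # calibration estimate of the total of y
```

By construction, the calibrated weights `w` reproduce `totals` exactly when applied to `X_s`. With many columns in `X` relative to `n`, the matrix `M` becomes poorly conditioned and the weights more dispersed, which is the kind of situation the abstract is concerned with.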

https://doi.org/10.1016/j.jspi.2021.07.011