RESEARCH PRODUCT

Regression with Imputed Covariates: A Generalized Missing Indicator Approach

Valentino Dardanoni, Salvatore Modica, Franco Peracchi

subject

Set (abstract data type); Reduction (complexity); Relation (database); Bias of an estimator; Statistics; Covariate; Settore SECS-P/05 - Econometria; Statistics::Methodology; Regression analysis; Missing data; Regression; Mathematics

description

A common problem in applied regression analysis is that covariate values may be missing for some observations but imputed values may be available. This situation generates a trade-off between bias and precision: the complete cases are often disarmingly few, but replacing the missing observations with the imputed values to gain precision may lead to bias. In this paper we formalize this trade-off by showing that one can augment the regression model with a set of auxiliary variables so as to obtain, under weak assumptions about the imputations, the same unbiased estimator of the parameters of interest as complete-case analysis. Given this augmented model, the bias-precision trade-off may then be tackled by either model reduction procedures or model averaging methods. We illustrate our approach by considering the problem of estimating the relation between income and the body mass index (BMI) using survey data affected by item non-response, where the missing values on the main covariates are filled in by imputations.
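
The equivalence claimed above can be checked numerically. The abstract does not spell out which auxiliary variables are used, so the sketch below assumes one simple construction: a missingness indicator interacted with every regressor, including the constant. The simulated data, the deliberately noisy imputation, and all function names (`ols`, `simulate`, `compare_estimators`) are hypothetical and chosen only for illustration.

```python
import numpy as np


def ols(X, y):
    """Least-squares coefficients via numpy's lstsq."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


def simulate(n=500, seed=0):
    """Simulated data: x2 is missing for about 30% of rows and filled in."""
    rng = np.random.default_rng(seed)
    x1 = rng.normal(size=n)
    x2 = 0.5 * x1 + rng.normal(size=n)
    y = 1.0 + 2.0 * x1 - 1.0 * x2 + rng.normal(size=n)
    missing = rng.random(n) < 0.3
    # Assumption: a crude, noisy imputation that ignores the true x2.
    imputed = 0.5 * x1 + rng.normal(size=n)
    x2_filled = np.where(missing, imputed, x2)
    return y, x1, x2_filled, missing


def compare_estimators(y, x1, x2_filled, missing):
    """Return complete-case, filled-in, and augmented-model estimates."""
    X = np.column_stack([np.ones_like(x1), x1, x2_filled])
    d = missing.astype(float)

    # Complete-case analysis: drop every row with an imputed value.
    beta_cc = ols(X[~missing], y[~missing])

    # Filled-in analysis: treat imputed values as if they were observed.
    beta_fill = ols(X, y)

    # Augmented model: add the indicator times each regressor, so the
    # imputed rows are absorbed by their own set of coefficients.
    X_aug = np.column_stack([X, d[:, None] * X])
    beta_aug = ols(X_aug, y)[:3]

    return beta_cc, beta_fill, beta_aug


if __name__ == "__main__":
    beta_cc, beta_fill, beta_aug = compare_estimators(*simulate())
    print("complete case:", beta_cc)
    print("filled-in    :", beta_fill)
    print("augmented    :", beta_aug)
```

In this construction the first three coefficients of the augmented regression coincide with the complete-case estimates up to floating-point error, while the filled-in regression generally differs. Dropping some or all of the auxiliary columns moves the estimator toward the filled-in regression, which is where model reduction or model averaging would enter.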

https://doi.org/10.2139/ssrn.1485547