RESEARCH PRODUCT

The continuous sample of working lives: Improving its representativeness

Juan Manuel Pérez-Salamero González, Marta Regúlez-Castillo, Carlos Vidal-Meliá

subject

J26; Subsample selection; Computer science; Chi-square test; Continuous Sample of Working Lives; Population; Microdata (statistics); Sample (statistics); p value; Representativeness heuristic; Pensions; Goodness of fit; 0502 economics and business; Econometrics; ddc:330; 050207 economics; H55; education; 050205 econometrics; education.field_of_study; Pension; Public pension system; 05 social sciences; Stratified sampling; Social security; C81; General Economics, Econometrics and Finance

description

This paper studies the representativeness of the Continuous Sample of Working Lives (CSWL), a set of anonymized microdata containing information on individuals from Spanish Social Security records. We examine several CSWL waves (2005–2013) and show that the sample is not representative of the population receiving a pension income. We then develop a methodology to draw a large dataset from the CSWL that is much more representative of the retired population in terms of pension type, gender and age. The procedure also lets users trade off goodness of fit against subsample size. To illustrate the practical significance of our methodology, the paper also contains an application in which we generate a large subsample distribution from the 2010 CSWL. The results are striking: with a very small reduction in the size of the original CSWL, we significantly reduce errors in estimating pension expenditure for 2010, with a p value greater than or equal to 0.999.
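As a rough illustration of the kind of goodness-of-fit check the abstract refers to, the sketch below compares stratum counts in a sample with known population shares using a chi-square statistic and its p value. It is not the authors' procedure: the three strata, the population shares, and all counts are hypothetical, and the function names are invented for this example.

```python
import math


def chi_square_statistic(observed, population_shares):
    """Pearson chi-square statistic of observed stratum counts
    against the counts expected under the population shares."""
    n = sum(observed)
    expected = [n * share for share in population_shares]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def chi_square_p_value(statistic, dof, terms=200):
    """Upper-tail probability of a chi-square variable with `dof` degrees
    of freedom, via the series for the regularized lower incomplete gamma."""
    a, x = dof / 2.0, statistic / 2.0
    if x <= 0:
        return 1.0
    term = 1.0 / a
    total = term
    for n in range(1, terms):
        term *= x / (a + n)
        total += term
    lower = total * math.exp(-x + a * math.log(x) - math.lgamma(a))
    return max(0.0, 1.0 - lower)


# Hypothetical population shares for three strata (e.g. pension types).
population_shares = [0.5, 0.3, 0.2]

# Hypothetical stratum counts: a full sample and a slightly smaller subsample.
full_sample = [400, 350, 250]
subsample = [451, 269, 180]

dof = len(population_shares) - 1
stat_full = chi_square_statistic(full_sample, population_shares)
stat_sub = chi_square_statistic(subsample, population_shares)
p_full = chi_square_p_value(stat_full, dof)
p_sub = chi_square_p_value(stat_sub, dof)

print(f"full sample: chi2={stat_full:.3f}, p={p_full:.3g}")
print(f"subsample:   chi2={stat_sub:.4f}, p={p_sub:.4f}")
```

In this toy setting, dropping 100 of 1,000 records so that the stratum counts track the population shares moves the p value from near zero to close to one, which is the size-versus-fit trade-off the abstract describes in general terms.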

10.1007/s13209-017-0154-0
https://hdl.handle.net/10419/195287