Search results for "IMPUTATION"


Regression imputation for Space-Time datasets with missing values

2009

Data consisting of repeated observations on a series of fixed units are very common in different contexts, such as the biological, environmental and social sciences, and different terminology is often used to indicate this kind of data: panel data, longitudinal data, time series-cross section data (TSCS), spatio-temporal data. Missing information is inevitable in longitudinal studies and can produce biased estimates and loss of power. The aim of this paper is to propose a new regression (single) imputation method that, considering the particular structure and characteristics of the data set, creates a “complete” data set that can be analyzed by any researcher on different occasions and using diff…

Cross-sectional data; Space time; Missing data; computer.software_genre; Regression; Terminology; Geography; Statistics; Space-time data imputation; Performance indicator; Imputation (statistics); Data mining; Settore SECS-S/01 - Statistica; computer; Panel data
researchProduct

Single imputation method of missing values in environmental pollution data sets

2006

Missing data represent a general problem in many scientific fields, above all in environmental research. Several methods have been proposed in the literature for handling missing data, and the choice of an appropriate method depends, among other things, on the missing-data pattern and on the missing-data mechanism. One approach to the problem is to impute the missing values to yield a complete data set. The goal of this paper is to propose a new single imputation method and to compare its performance to other single and multiple imputation methods known in the literature. Considering a data set of PM10 concentration measured every 2 h by eight monitoring stations distributed over the metropolitan area of Paler…

Data set; Atmospheric Science; Correlation coefficient; Statistics; Environmental pollution; Imputation (statistics); Performance indicator; Time series; Missing data; Root-mean-square deviation; General Environmental Science; Mathematics; Atmospheric Environment
researchProduct

Regression with imputed covariates: A generalized missing-indicator approach

2011

A common problem in applied regression analysis is that covariate values may be missing for some observations but imputed values may be available. This situation generates a trade-off between bias and precision: the complete cases are often disarmingly few, but replacing the missing observations with the imputed values to gain precision may lead to bias. In this paper, we formalize this trade-off by showing that one can augment the regression model with a set of auxiliary variables so as to obtain, under weak assumptions about the imputations, the same unbiased estimator of the parameters of interest as complete-case analysis. Given this augmented model, the bias-precision trade-off may the…

Economics and Econometrics; Applied Mathematics; Regression analysis; Missing data; Regression; Set (abstract data type); Reduction (complexity); Economic data; Bias of an estimator; Statistics; Covariate; Missing covariates Imputations; Bias precision trade-off Model reduction Model averaging BMI and income; Econometrics; Statistics::Methodology; C12; C13; C19; Missing covariates; Imputations; Bias-precision trade-off; Model reduction; Model averaging; BMI and income; Mathematics
researchProduct
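
The claim in the abstract of "Regression with imputed covariates" above, that augmenting the regression with auxiliary variables reproduces the complete-case estimator, can be checked numerically. The following is a minimal sketch, not the authors' construction: it assumes one covariate (x2) with imputed values and takes the auxiliary set to be a missing indicator d plus its interactions with the regressors. The simulated data, the noisy-proxy imputation, and the helper ols are all hypothetical.

```python
import random

def ols(X, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(X, y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        for r in range(k):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    return [A[i][k] / A[i][i] for i in range(k)]

random.seed(0)
n = 200
x1 = [random.gauss(0, 1) for _ in range(n)]
x2 = [0.5 * a + random.gauss(0, 1) for a in x1]
y = [1.0 + 2.0 * a - 1.5 * b + random.gauss(0, 1) for a, b in zip(x1, x2)]

# d = 1 marks rows where x2 is treated as missing and replaced by an imputation.
d = [1 if random.random() < 0.3 else 0 for _ in range(n)]
# Hypothetical imputation: a noisy proxy built from x1 (an assumption of this sketch).
x2_filled = [0.5 * a + random.gauss(0, 1) if m else b
             for a, b, m in zip(x1, x2, d)]

# Complete-case regression: only rows with observed x2.
X_cc = [[1.0, a, b] for a, b, m in zip(x1, x2, d) if m == 0]
y_cc = [yi for yi, m in zip(y, d) if m == 0]
beta_cc = ols(X_cc, y_cc)

# Augmented regression on all rows: regressors plus d and its interactions.
X_aug = [[1.0, a, b, m, m * a, m * b] for a, b, m in zip(x1, x2_filled, d)]
beta_aug = ols(X_aug, y)[:3]  # coefficients on (1, x1, x2)

print("complete-case:", beta_cc)
print("augmented:    ", beta_aug)
```

Because the indicator is fully interacted, the rows with imputed values are absorbed by the auxiliary coefficients, so the first three coefficients agree with the complete-case fit up to rounding. Dropping some auxiliary variables is what would trade bias for precision.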

Imputation Procedures in Surveys Using Nonparametric and Machine Learning Methods: An Empirical Comparison

2020

Nonparametric and machine learning methods are flexible methods for obtaining accurate predictions. Nowadays, data sets with a large number of predictors and complex structures are fairly common. In the presence of item nonresponse, nonparametric and machine learning procedures may thus provide a useful alternative to traditional imputation procedures for deriving a set of imputed values, which are then used to estimate study parameters defined as the solution of a population estimating equation. In this paper, we conduct an extensive empirical investigation that compares a number of imputation procedures in terms of bias and efficiency in a wide variety of settings, including high-dimens…

FOS: Computer and information sciences; Statistics and Probability; Statistics::Applications; Empirical comparison; business.industry; Computer science; Applied Mathematics; Nonparametric statistics; Machine learning; computer.software_genre; Statistics - Computation; Variety (cybernetics); Methodology (stat.ME); Set (abstract data type); Statistics::Methodology; Imputation (statistics); Artificial intelligence; Statistics Probability and Uncertainty; business; computer; Statistics - Methodology; Computation (stat.CO); Social Sciences (miscellaneous); Journal of Survey Statistics and Methodology
researchProduct

Comparing Spatial and Spatio-temporal FPCA to Impute Large Continuous Gaps in Space

2018

Multivariate spatio-temporal data analysis methods usually assume fairly complete data, whereas gaps often occur in time or in space. In air quality data, long gaps may be due to instrument malfunctions; moreover, not all the pollutants of interest are measured at all the monitoring stations of a network. In the literature, many statistical methods have been proposed for imputing short sequences of missing values, but most of them are not valid when the fraction of missing values is high. Furthermore, a limitation of the commonly used methods is that they exploit only the temporal, or only the spatial, correlation of the data. The objective of this paper is to provide an approach based …

Functional principal component analysis; Complete data; Multivariate statistics; Long gap; Computer science; computer.software_genre; Missing data; Correlation; FDA FPCA GAM P-splines; Data analysis; Data mining; Imputation (statistics); Settore SECS-S/01 - Statistica; computer
researchProduct

2015

Hearing loss and individual differences in normal hearing both have a substantial genetic basis. Although many new genes contributing to deafness have been identified, very little is known about genes/variants modulating the normal range of hearing ability. To fill this gap, we performed a two-stage meta-analysis on hearing thresholds (tested at 0.25, 0.5, 1, 2, 4, 8 kHz) and on pure-tone averages (low-, medium- and high-frequency thresholds grouped) in several isolated populations from Italy and Central Asia (total N = 2636). Here, we detected two genome-wide significant loci close to PCDH20 and SLC28A3 (top hits: rs78043697, P = 4.71E-10 and rs7032430, P = 2.39E-09, respectively). For bot…

Genetics; 0303 health sciences; Sequence analysis; Hearing loss; Genome-wide association study; Single-nucleotide polymorphism; General Medicine; Biology; Genome; 03 medical and health sciences; 0302 clinical medicine; Genotype; otorhinolaryngologic diseases; Genetics; medicine; medicine.symptom; Molecular Biology; Gene; 030217 neurology & neurosurgery; Genetics (clinical); Imputation (genetics); 030304 developmental biology; Human Molecular Genetics
researchProduct

Meta-analysis and imputation refines the association of 15q25 with smoking quantity.

2010

Smoking is a leading global cause of disease and mortality(1). We established the Oxford-GlaxoSmithKline study (Ox-GSK) to perform a genome-wide meta-analysis of SNP association with smoking-related behavioral traits. Our final data set included 41,150 individuals drawn from 20 disease, population and control cohorts. Our analysis confirmed an effect on smoking quantity at a locus on 15q25 (P = 9.45 × 10^-19) that includes CHRNA5, CHRNA3 and CHRNB4, three genes encoding neuronal nicotinic acetylcholine receptor subunits. We used data from the 1000 Genomes project to investigate the region using imputation, which allowed for analysis of virtually all common SNPs in the region and offered a …

Genetics; 0303 health sciences; education.field_of_study; /dk/atira/pure/subjectarea/asjc/1300/1311; Population; Single-nucleotide polymorphism; Genome-wide association study; Locus (genetics); Biology; Article; 3. Good health; 03 medical and health sciences; 0302 clinical medicine; Genome-Wide Association; Nicotine Dependence; Lung-Cancer; Susceptibility Locus; Risk-Factors; Disease; Genes; SNPS; Colaus Study; Genetics; SNP; 1000 Genomes Project; Allele; education; 030217 neurology & neurosurgery; Imputation (genetics); genome-wide association study; smoking initiation; smoking quantity; 030304 developmental biology; Nature genetics
researchProduct

Sedentary behaviours and cognitive function among community dwelling adults aged 50+ years: Results from the Irish longitudinal study of ageing

2020

Background: Sedentary behaviours (SB) are risk factors for poor cardiovascular health and all-cause mortality. However, their role in cognitive health in older adults is unclear. A few studies have examined associations between sedentary behaviours and cognition, but are limited by heterogeneity and insufficient longitudinal analyses. Therefore more robust studies, which would address identified limitations, are needed to accurately determine associations. Method: This study analysed data collected from participants aged 50+ years of The Irish Longitudinal Study of Ageing (TILDA). We conducted cross-sectional linear regression with multivariate imputation analyses of ba…

Gerontology; medicine.medical_specialty; Longitudinal study; Loneline; Depression; Physical activity; Public health; Cognition; 030229 sport sciences; Sitting; 030227 psychiatry; 03 medical and health sciences; Psychiatry and Mental health; Social isolation.; 0302 clinical medicine; medicine; Verbal fluency test; Imputation (statistics); Verbal memory; Association (psychology); Psychology; Applied Psychology; Sitting
researchProduct

The effect of women’s participation in the labour market on the postponement of first childbirth: a comparison of Italy and Hungary

2014

This paper analyses the effect of increasing female participation in the labour market on the transition to first childbirth. Regional perspectives are considered to help us understand how postponement behaviour is changing over time and at different paces in each region. The analysis is based on the first wave of the Generations and Gender Survey of Italy and Hungary. We use a multilevel event history random intercept model to examine the effect of individuals’ positions in the labour market on the transition to motherhood, controlling for differences in macrolevel factors related to regional backgrounds in the two countries. The regional data for Italy came from the Italian National Stati…

Labour economics; low fertility postponement first job education multilevel event history models; Postponement; Control (management); Protective factor; Economics; Childbirth; Imputation (statistics); Risk factor (finance); Minor (academic); Settore SECS-S/04 - Demografia; Random intercept; Demography
researchProduct

Imputation of posterior linkage probability relations reveals a significant influence of structural 3D constraints on linkage disequilibrium

2018

Genetic association studies have become increasingly important in unraveling the genetics of diseases or complex traits. Despite their value for modern genetics, conflicting conclusions often arise through the difficulty of confirming and replicating experimental results. We argue that this problem is largely based on the application of statistical relation measures that are not appropriate for genomic data analysis and demonstrate that the standard measures used for genome-wide association studies or genomic linkage analysis bear a statistical bias. This may come from the violation of underlying assumptions (such as independence or stationarity) as well as from other conceptual limitations …

Linkage disequilibrium; Computer science; Posterior probability; Econometrics; Genomics; Imputation (statistics); Latent variable; Categorical variable; Statistic; Imputation (genetics); Genetic association
researchProduct