Search results for "missing"

Showing 10 of 174 documents

Gap Filling of Biophysical Parameter Time Series with Multi-Output Gaussian Processes

2018

In this work we evaluate multi-output (MO) Gaussian Process (GP) models based on the linear model of coregionalization (LMC) for estimation of biophysical parameter variables under a gap filling setup. In particular, we focus on LAI and fAPAR over rice areas. We show how this problem cannot be solved with standard single-output (SO) GP models, and how the proposed MO-GP models are able to successfully predict these variables even in high missing data regimes, by implicitly performing an across-domain information transfer.

FOS: Computer and information sciences; Computer Science - Machine Learning; 010504 meteorology & atmospheric sciences; 0211 other engineering and technologies; FOS: Physical sciences; Machine Learning (stat.ML); 02 engineering and technology; 01 natural sciences; Quantitative Biology - Quantitative Methods; Machine Learning (cs.LG); Data modeling; symbols.namesake; Statistics - Machine Learning; Applied mathematics; Time series; Gaussian process; Quantitative Methods (q-bio.QM); 021101 geological & geomatics engineering; 0105 earth and related environmental sciences; Mathematics; Series (mathematics); Linear model; Probability and statistics; Missing data; FOS: Biological sciences; Physics - Data Analysis Statistics and Probability; symbols; Focus (optics); Data Analysis Statistics and Probability (physics.data-an)

researchProduct
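
The across-output information transfer described in the abstract above can be sketched with a single-latent-function special case of the LMC (the intrinsic coregionalization model). This is a minimal illustration under stated assumptions, not the authors' implementation: the toy signals, the hand-picked coregionalization matrix `B`, the fixed lengthscale, and the helper names `rbf`, `lmc_kernel`, and `gp_mean` are all hypothetical.

```python
import numpy as np

def rbf(a, b, lengthscale=1.0):
    """Squared-exponential kernel between two 1-D arrays of time stamps."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / lengthscale) ** 2)

def lmc_kernel(t1, out1, t2, out2, B, lengthscale=1.0):
    """Covariance between (time, output) pairs: B[i, j] * k(t, t')."""
    return B[np.ix_(out1, out2)] * rbf(t1, t2, lengthscale)

def gp_mean(t_tr, out_tr, y_tr, t_te, out_te, B, noise=1e-2):
    """GP posterior mean at the test (time, output) pairs."""
    K = lmc_kernel(t_tr, out_tr, t_tr, out_tr, B) + noise * np.eye(len(t_tr))
    K_star = lmc_kernel(t_te, out_te, t_tr, out_tr, B)
    return K_star @ np.linalg.solve(K, y_tr)

# Toy data (hypothetical): two outputs driven by the same latent signal.
t = np.linspace(0.0, 10.0, 41)
signal = np.sin(t)
gap = (t > 2.0) & (t < 8.0)  # output 0 is unobserved inside this window

# Training set: output 0 outside the gap, output 1 everywhere.
t_tr = np.concatenate([t[~gap], t])
out_tr = np.concatenate([np.zeros((~gap).sum(), dtype=int),
                         np.ones(len(t), dtype=int)])
y_tr = np.concatenate([signal[~gap], 0.8 * signal])

# Coregionalization matrix B = w w^T + diag(kappa), chosen by hand, not fitted.
w = np.array([1.0, 0.8])
B = np.outer(w, w) + 0.01 * np.eye(2)

# Fill the gap in output 0 using what output 1 observed there.
filled = gp_mean(t_tr, out_tr, y_tr, t[gap], np.zeros(gap.sum(), dtype=int), B)
```

The off-diagonal entries of `B` are what let observations of output 1 inform output 0 inside the gap. With a diagonal `B` the two outputs decouple, and the model reduces to independent single-output GPs that have no data in the window.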

Do-search -- a tool for causal inference and study design with multiple data sources

2020

Epidemiologic evidence is based on multiple data sources including clinical trials, cohort studies, surveys, registries, and expert opinions. Merging information from different sources opens up new possibilities for the estimation of causal effects. We show how causal effects can be identified and estimated by combining experiments and observations in real and realistic scenarios. As a new tool, we present do-search, a recently developed algorithmic approach that can determine the identifiability of a causal effect. The approach is based on do-calculus, and it can utilize data with nontrivial missing data and selection bias mechanisms. When the effect is identifiable, do-search outputs an i…

FOS: Computer and information sciences; Epidemiology; Computer science; media_common.quotation_subject; Information Storage and Retrieval; Machine learning; computer.software_genre; 01 natural sciences; Statistics - Applications; Methodology (stat.ME); 010104 statistics & probability; 03 medical and health sciences; 0302 clinical medicine; Humans; Applications (stat.AP); 030212 general & internal medicine; 0101 mathematics; Salt intake; Statistics - Methodology; media_common; Selection bias; business.industry; Nutrition Surveys; Missing data; Causality; Research Design; Causal inference; Meta-analysis; Survey data collection; Identifiability; Artificial intelligence; business; computer

researchProduct

Causal Effect Identification from Multiple Incomplete Data Sources: A General Search-Based Approach

2021

Causal effect identification considers whether an interventional probability distribution can be uniquely determined without parametric assumptions from measured source distributions and structural knowledge on the generating system. While complete graphical criteria and procedures exist for many identification problems, there are still challenging but important extensions that have not been considered in the literature. To tackle these new settings, we present a search algorithm directly over the rules of do-calculus. Due to the generality of do-calculus, the search is capable of taking more advanced data-generating mechanisms into account along with an arbitrary type of both observational and…

FOS: Computer and information sciences; Statistics and Probability; Computer Science - Machine Learning; causality; Computer Science - Artificial Intelligence; Heuristic (computer science); Computer science; education; Machine Learning (stat.ML); transportability; computer.software_genre; 01 natural sciences; Machine Learning (cs.LG); R-kieli; missing data; QA76.75-76.765; QA273-280; 010104 statistics & probability; do-calculus; Statistics - Machine Learning; Search algorithm; selection bias; 0101 mathematics; Parametric statistics; päättely; meta-analyysi; case-control design; hakualgoritmit; 113 Computer and information sciences; Missing data; meta-analysis; Identification (information); Artificial Intelligence (cs.AI); Causal inference; kausaliteetti; Identifiability; Probability distribution; Data mining; Statistics Probability and Uncertainty; computer; Software; Journal of Statistical Software

researchProduct

Estimating with kernel smoothers the mean of functional data in a finite population setting. A note on variance estimation in presence of partially o…

2014

In the near future, millions of load curves measuring the electricity consumption of French households in small time grids (probably half hours) will be available. All these collected load curves represent a huge amount of information which could be exploited using survey sampling techniques. In particular, the total consumption of a specific customer group (for example all the customers of an electricity supplier) could be estimated using unequal probability random sampling methods. Unfortunately, data collection may undergo technical problems resulting in missing values. In this paper we study a new estimation method for the mean curve in the presence of missing values which consists in…

FOS: Computer and information sciences; Statistics and Probability; Population; Ratio estimator; Linearization; 01 natural sciences; Survey sampling; Horvitz–Thompson estimator; Methodology (stat.ME); 010104 statistics & probability; Hájek estimator; 0502 economics and business; Applied mathematics; Missing values; 0101 mathematics; education; Statistics - Methodology; 050205 econometrics; Mathematics; Pointwise; education.field_of_study; [STAT.ME] Statistics [stat]/Methodology [stat.ME]; 05 social sciences; Nonparametric statistics; Estimator; 16. Peace & justice; Missing data; Functional data; Kernel (statistics); Statistics Probability and Uncertainty; Nonparametric estimation

researchProduct

Study design in causal models

2012

The causal assumptions, the study design and the data are the elements required for scientific inference in empirical research. The research is adequately communicated only if all of these elements and their relations are described precisely. Causal models with design describe the study design and the missing data mechanism together with the causal structure and allow the direct application of causal calculus in the estimation of the causal effects. The flow of the study is visualized by ordering the nodes of the causal diagram in two dimensions by their causal order and the time of the observation. Conclusions whether a causal or observational relationship can be estimated from the collect…

FOS: Computer and information sciences; design; structural equation model; G.3; 62A01 62-09 62F99 62D05 62P10 62K99 68T30; graphical model; Machine Learning (stat.ML); G.2.2; Statistics - Applications; Methodology (stat.ME); missing data; Statistics - Machine Learning; kausaliteetti; Applications (stat.AP); epidemiologia; Statistics - Methodology

researchProduct

Comparing Spatial and Spatio-temporal FPCA to Impute Large Continuous Gaps in Space

2018

Multivariate spatio-temporal data analysis methods usually assume fairly complete data, while a number of gaps often occur along time or in space. In air quality data, long gaps may be due to instrument malfunctions; moreover, not all the pollutants of interest are measured at all the monitoring stations of a network. In the literature, many statistical methods have been proposed for imputing short sequences of missing values, but most of them are not valid when the fraction of missing values is high. Furthermore, a limitation of the commonly used methods is that they exploit only the temporal, or only the spatial, correlation of the data. The objective of this paper is to provide an approach based …

Functional principal component analysis; Complete data; Multivariate statistics; Long gap; Computer science; computer.software_genre; Missing data; Correlation; FDA FPCA GAM P-splines; Data analysis; Data mining; Imputation (statistics); Settore SECS-S/01 - Statistica; computer

researchProduct

Model averaging estimation of generalized linear models with imputed covariates

2015

We address the problem of estimating generalized linear models when some covariate values are missing but imputations are available to fill in the missing values. This situation generates a bias-precision trade-off in the estimation of the model parameters. Extending the generalized missing-indicator method proposed by Dardanoni et al. (2011) for linear regression, we handle this trade-off as a problem of model uncertainty using Bayesian averaging of classical maximum likelihood estimators (BAML). We also propose a block model averaging strategy that incorporates information on the missing-data patterns and is computationally simple. An empirical application illustrates our…

Generalized linear model; Economics and Econometrics; Applied Mathematics; Settore SECS-P/05 - Econometria; Estimator; Missing data; Generalized linear mixed model; Model averaging; Bayesian averaging of maximum likelihood estimators; Generalized linear models; Missing covariates; Generalized missing-indicator method; share; Hierarchical generalized linear model; Statistics; Linear regression; Covariate; Applied mathematics; Generalized estimating equation; Mathematics

researchProduct
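
To make the bias-precision trade-off in the abstract above concrete, here is a minimal sketch of the classical, simple missing-indicator idea for linear regression: fill the missing covariate with an imputed value and add a dummy that flags the filled rows. It is a deliberate simplification, not the generalized method of Dardanoni et al. (2011) or the BAML estimator. The function name `missing_indicator_ols`, the constant fill value, and the toy data are assumptions made for illustration.

```python
import numpy as np

def missing_indicator_ols(y, x, fill_value=0.0):
    """OLS of y on [1, x_filled, m], where m flags originally missing x."""
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    m = np.isnan(x).astype(float)                     # 1 where x was missing
    x_filled = np.where(np.isnan(x), fill_value, x)   # crude imputation
    X = np.column_stack([np.ones_like(x_filled), x_filled, m])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta  # [intercept, slope, indicator coefficient]

# Toy data (hypothetical): y = 1 + 2x exactly; the last two x values are lost.
x_true = np.array([0.0, 1.0, 2.0, 4.0, 3.0, 3.0])
y = 1.0 + 2.0 * x_true
x_obs = x_true.copy()
x_obs[4:] = np.nan

beta = missing_indicator_ols(y, x_obs, fill_value=0.0)
```

In this contrived case the indicator coefficient absorbs the gap between the imputed and the true covariate contribution, so the slope on the observed rows is left undistorted. Dropping the indicator would save a parameter (precision) at the risk of bias from the imputation, which is the trade-off the abstract treats as a model-uncertainty problem.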

Impact of missing data mechanism on the estimate of change: a case study on cognitive function and polypharmacy among older persons

2015

Piia Lavikainen,1,2 Esko Leskinen,3 Sirpa Hartikainen,1,2 Jyrki Möttönen,4 Raimo Sulkava,5 Maarit J Korhonen6 1Kuopio Research Centre of Geriatric Care, University of Eastern Finland, Kuopio, Finland; 2School of Pharmacy, Faculty of Health Sciences, University of Eastern Finland, Kuopio, Finland; 3Department of Mathematics and Statistics, University of Jyväskylä, Jyväskylä, Finland; 4Department of Social Research, University of Helsinki, Helsinki, Finland; 5Department of Geriatrics, Institute of Public Health and Clinical Nutrition, Faculty of Health Sciences, University of Eastern Finland, Kuopio, Finland; 6Department of Pharmacology, D…

Gerontology; attrition; longitudinal; Epidemiology; 01 natural sciences; 010104 statistics & probability; 0504 sociology; number of drugs; Medicine; Clinical Epidemiology; Attrition; 0101 mathematics; Cognitive decline; Latent variable model; Original Research; Polypharmacy; ta112; Mini–Mental State Examination; medicine.diagnostic_test; business.industry; Mechanism (biology); 05 social sciences; 050401 social sciences methods; Cognition; ta3142; medicine.disease; Missing data; Data science; 3. Good health; latent variable modeling; older persons; Mini-Mental State Examination; business

researchProduct

Comparative analysis of different techniques for spatial interpolation of rainfall data to create a serially complete monthly time series of precipit…

2011

The availability of good and reliable rainfall data is fundamental for most hydrological analyses and for the design and management of water resources systems. However, in practice, precipitation records often suffer from missing data values, mainly due to rain gauge malfunctions during specific time periods. This is an important issue in practical hydrology because it affects the continuity of rainfall data and ultimately influences the results of hydrologic studies which use rainfall as input. Many methods to estimate missing rainfall data have been proposed in the literature and, among these, most are based on spatial interpolation algorithms. In this paper different spatial interpo…

Global and Planetary Change; Settore ICAR/02 - Costruzioni Idrauliche E Marittime E Idrologia; DEM; Interpolation methods; Geostatistics; Precipitation; Management Monitoring Policy and Law; Missing data; Multivariate interpolation; Geography; Kriging; Geostatistic; Inverse distance weighting; Statistics; Computers in Earth Sciences; Spatial dependence; Simple linear regression; Earth-Surface Processes; Interpolation

researchProduct
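
Inverse distance weighting, one of the interpolation techniques listed in the keywords of the entry above, can be sketched as follows. This is a minimal illustration, not the procedure evaluated in the paper: the function name `idw_fill`, the default power of 2, and the gauge coordinates and values are assumptions.

```python
import numpy as np

def idw_fill(target_xy, gauge_xy, gauge_values, power=2.0):
    """Estimate a missing value at target_xy from neighbouring gauges by IDW."""
    gauge_xy = np.asarray(gauge_xy, dtype=float)
    gauge_values = np.asarray(gauge_values, dtype=float)
    # Use only gauges that actually reported a value for this time step.
    ok = ~np.isnan(gauge_values)
    if not ok.any():
        return float("nan")
    d = np.linalg.norm(gauge_xy[ok] - np.asarray(target_xy, dtype=float), axis=1)
    # A gauge located exactly at the target determines the estimate.
    if np.any(d == 0):
        return float(gauge_values[ok][d == 0][0])
    w = 1.0 / d ** power
    return float(np.sum(w * gauge_values[ok]) / np.sum(w))

# Hypothetical monthly totals (mm) at three gauges; the third did not report.
estimate = idw_fill((0.0, 0.0), [(1.0, 0.0), (-1.0, 0.0), (0.0, 2.0)],
                    [10.0, 20.0, np.nan])
```

Here the two reporting gauges are equidistant from the target, so the estimate is their plain average. Larger values of `power` concentrate the weight on the nearest gauges.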

Robust Principal Component Analysis of Data with Missing Values

2015

Principal component analysis is one of the most popular machine learning and data mining techniques. Having its origins in statistics, principal component analysis is used in numerous applications. However, there seems to have been little systematic testing and assessment of principal component analysis for cases with erroneous and incomplete data. The purpose of this article is to propose multiple robust approaches for carrying out principal component analysis and, especially, to estimate the relative importances of the principal components to explain the data variability. Computational experiments are first focused on carefully designed simulated tests where the ground truth is known and can b…

Ground truth; PCA; Computer science; Robust statistics; Missing data; computer.software_genre; Set (abstract data type); missing data; Multiple correspondence analysis; robust statistics; Principal component analysis; Data mining; computer; Robust principal component analysis

researchProduct