Search results for "missing data"

Showing 10 items of 83 documents

Physics-aware Gaussian processes in remote sensing

2018

Earth observation from satellite sensory data poses challenging problems, where machine learning is currently a key player. In recent years, Gaussian Process (GP) regression has excelled in biophysical parameter estimation tasks from airborne and satellite observations. GP regression is based on solid Bayesian statistics, and generally yields efficient and accurate parameter estimates. However, GPs are typically used for inverse modeling based on concurrent observations and in situ measurements only. Very often, though, a forward model encoding the well-understood physical relations between the state vector and the radiance observations is available and could be useful to improve pre…

Signal Processing (eess.SP); FOS: Computer and information sciences; 010504 meteorology & atmospheric sciences; 0211 other engineering and technologies; 02 engineering and technology; Statistics - Applications; 01 natural sciences; symbols.namesake; FOS: Electrical engineering electronic engineering information engineering; Applications (stat.AP); Electrical Engineering and Systems Science - Signal Processing; Gaussian process; Gaussian process emulator; 021101 geological & geomatics engineering; 0105 earth and related environmental sciences; business.industry; Estimation theory; Bayesian optimization; State vector; Missing data; Bayesian statistics; symbols; Global Positioning System; business; Algorithm; Software; Applied Soft Computing
researchProduct

Missing value imputation in proximity extension assay-based targeted proteomics data

2020

Targeted proteomics utilizing antibody-based proximity extension assays provides sensitive and highly specific quantifications of plasma protein levels. Multivariate analysis of this data is hampered by frequent missing values (random or left censored), calling for imputation approaches. While appropriate missing-value imputation methods exist, benchmarks of their performance in targeted proteomics data are lacking. Here, we assessed the performance of two methods for imputation of values missing completely at random, the previously top-benchmarked ‘missForest’ and the recently published ‘GSimp’ method. Evaluation was accomplished by comparing imputed with remeasured relative concentrations…
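The evaluation idea in this abstract, comparing imputed values with remeasured ones, can be illustrated with a minimal sketch. It does not implement missForest or GSimp: a plain column-mean imputer stands in for them, and the function names, the toy matrix, and the choice of RMSE as the error measure are all assumptions made for illustration.

```python
import math


def column_mean_impute(rows):
    """Replace None in each column with the mean of that column's observed values.

    Stand-in imputer for illustration only (not missForest or GSimp).
    """
    n_cols = len(rows[0])
    means = []
    for j in range(n_cols):
        observed = [r[j] for r in rows if r[j] is not None]
        means.append(sum(observed) / len(observed))
    return [[means[j] if r[j] is None else r[j] for j in range(n_cols)] for r in rows]


def rmse_on_missing(imputed, remeasured, original):
    """RMSE between imputed and remeasured values at originally missing cells only."""
    squared_errors = [
        (imputed[i][j] - remeasured[i][j]) ** 2
        for i, row in enumerate(original)
        for j, value in enumerate(row)
        if value is None
    ]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical toy data: rows are samples, columns are proteins, None marks a missing value.
original = [[1.0, 2.0], [None, 4.0], [3.0, None]]
# Hypothetical remeasured values for the same samples, with no gaps.
remeasured = [[1.0, 2.0], [2.0, 4.0], [3.0, 5.0]]

imputed = column_mean_impute(original)
score = rmse_on_missing(imputed, remeasured, original)
print(score)
```

Restricting the error to the originally missing cells keeps the observed values, which any imputer reproduces exactly, from flattering the score.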

Proteomics; Male; Multivariate analysis; Protein Expression; Biochemistry; Protein expression; Database and Informatics Methods; Limit of Detection; Statistics; Medicine and Health Sciences; Biochemical Simulations; Imputation (statistics); Immune Response; Mathematics; Multidisciplinary; Proteomic Databases; QR; Eukaryota; Blood Proteins; Venous Thromboembolism; Plants; Middle Aged; Legumes; Targeted proteomics; symbols; Engineering and Technology; Medicine; Female; Algorithms; Research Article; Quality Control; Adult; Science; Immunology; Research and Analysis Methods; symbols.namesake; Signs and Symptoms; Bias; Industrial Engineering; Protein Concentration Assays; Gene Expression and Vector Techniques; Missing value imputation; Humans; Molecular Biology Techniques; Molecular Biology; Aged; Inflammation; Molecular Biology Assays and Analysis Techniques; Interleukin-6; Organisms; Peas; Biology and Life Sciences; Computational Biology; Missing data; Pearson product-moment correlation coefficient; Biological Databases; Multivariate Analysis; Clinical Medicine; Venous thromboembolism; PLOS ONE
researchProduct

Can Deliberately Incomplete Gene Sample Augmentation Improve a Phylogeny Estimate for the Advanced Moths and Butterflies (Hexapoda: Lepidoptera)?

2011

This paper addresses the question of whether one can economically improve the robustness of a molecular phylogeny estimate by increasing gene sampling in only a subset of taxa, without having the analysis invalidated by artifacts arising from large blocks of missing data. Our case study stems from an ongoing effort to resolve poorly understood deeper relationships in the large clade Ditrysia (> 150,000 species) of the insect order Lepidoptera (butterflies and moths). Seeking to remedy the overall weak support for deeper divergences in an initial study based on five nuclear genes (6.6 kb) in 123 exemplars, we nearly tripled the total gene sample (to 26 genes, 18.4 kb) but only in a…

0106 biological sciences; Nonsynonymous substitution; Nuclear gene; taxon sampling; Statistics as Topic; Genes Insect; 010603 evolutionary biology; 01 natural sciences; molecular phylogenetics; Genetic Heterogeneity; missing data; 03 medical and health sciences; Ditrysia; Genetics; Animals; Gelechioidea; Phylogeny; Ecology Evolution Behavior and Systematics; 030304 developmental biology; Genetics; 0303 health sciences; biology; Nucleotides; Hexapoda; Classification; nuclear genes; biology.organism_classification; Missing data; Lepidoptera; gene sampling; Taxon; Macrolepidoptera; Evolutionary biology; Molecular phylogenetics; Ditrysia; Regular Articles; Systematic Biology
researchProduct

Comparing Spatial and Spatio-temporal FPCA to Impute Large Continuous Gaps in Space

2018

Multivariate spatio-temporal data analysis methods usually assume fairly complete data, while a number of gaps often occur along time or in space. In air quality data, long gaps may be due to instrument malfunctions; moreover, not all the pollutants of interest are measured at all the monitoring stations of a network. In the literature, many statistical methods have been proposed for imputing short sequences of missing values, but most of them are not valid when the fraction of missing values is high. Furthermore, commonly used methods are limited in that they exploit only the temporal, or only the spatial, correlation of the data. The objective of this paper is to provide an approach based …

Functional principal component analysis; Complete data; Multivariate statistics; Long gap; Computer science; computer.software_genre; Missing data; Correlation; FDA FPCA GAM P-splines; Data analysis; Data mining; Imputation (statistics); Settore SECS-S/01 - Statistica; computer
researchProduct

2014

This paper considers parameter estimation for linear time-invariant (LTI) systems in an input-output setting with an output error (OE) time-delay model structure. Missing data are commonly encountered in industry due to irregular sampling, sensor failure, data deletion in preprocessing, network transmission faults, and so forth. To identify LTI systems with time delay from incomplete data, the generalized expectation-maximization (GEM) algorithm is adopted to estimate the model parameters and the time delay simultaneously. Numerical examples are provided to demonstrate the effectiveness of the proposed method.

Identification (information); Transmission (telecommunications); Estimation theory; Computer science; Control theory; General Mathematics; General Engineering; Structure (category theory); Sampling (statistics); Data pre-processing; Missing data; Fault (power engineering); Algorithm; Mathematical Problems in Engineering
researchProduct

Effectiveness of the physical activity intervention program in the PREDIMED-Plus study: a randomized controlled trial

2018

[Background] The development and implementation of effective physical activity (PA) intervention programs is challenging, particularly in older adults. After the first year of the intervention program used in the ongoing PREvención con DIeta MEDiterránea (PREDIMED)-Plus trial, we assessed the initial effectiveness of the PA component.

0301 basic medicine; Male; Mediterranean diet; humanos; restricción calórica; Myocardial Infarction; Medicine (miscellaneous); physical activity; cardiovascular-disease; ejercicio físico; Diet Mediterranean; Persones grans; law.invention; missing data; 0302 clinical medicine; Clinical trials; Randomized controlled trial; prevention; law; Surveys and Questionnaires; Clinical endpoint; 030212 general & internal medicine; older-adults; Stroke; Generalized estimating equation; lcsh:RC620-627; mediana edad; older adults; Body mass index; 2. Zero hunger; anciano; Nutrition and Dietetics; sobrepeso; dieta; resultado del tratamiento; lcsh:Public aspects of medicine; health; Middle Aged; waist circumference; 3. Good health; lcsh:Nutritional diseases. Deficiency diseases; Treatment Outcome; estilo de vida; Older adults; Waist circumference; Female; women; pérdida de peso; metaanalysis; Randomized control trial; medicine.medical_specialty; Waist; Pes corporal; Physical Therapy Sports Therapy and Rehabilitation; Exercici; body mass index; Clinical nutrition; 03 medical and health sciences; Weight Loss; medicine; Humans; Intervention program; Obesity; Life Style; obesidad; Exercise; infarto de miocardio; Aged; Caloric Restriction; 030109 nutrition & dietetics; perímetro abdominal; business.industry; behavior; Physical activity; Research; índice de masa corporal; lcsh:RA1-1270; Overweight; Body weight; medicine.disease; randomized control trial; Diet; tamaño de la muestra; intervention program; Sample Size; Physical therapy; Older people; business; Body mass index; Assaigs clínics; International Journal of Behavioral Nutrition and Physical Activity
researchProduct

Impact of the terrestrial reference frame on the determination of the celestial reference frame.

2022

Currently three up-to-date Terrestrial Reference Frames (TRF) are available: the ITRF2014 from IGN, the DTRF2014 from DGFI-TUM, and the JTRF2014 from JPL. All use identical input data of space-geodetic station positions and Earth orientation parameters, but the concept of combining these data is fundamentally different. The IGN approach is based on the combination of technique solutions, while the DGFI approach combines the normal equation systems. Both yield reference-epoch coordinates and velocities for a global set of stations. JPL uses a Kalman filter approach, realizing a TRF through weekly time series of geocentric coordinates. As the determination of the CRF is not independent of the T…

lcsh:QB275-343; 010504 meteorology & atmospheric sciences; Epoch (astronomy); lcsh:Geodesy; lcsh:QC801-809; Kalman filter; 010502 geochemistry & geophysics; Geodesy; Missing data; 01 natural sciences; Geocentric coordinates; lcsh:Geophysics. Cosmic physics; Geophysics; Position (vector); Computers in Earth Sciences; Terrestrial reference frame; Linear least squares; 0105 earth and related environmental sciences; Earth-Surface Processes; Reference frame; Mathematics; Geodesy and geodynamics
researchProduct

Psychosocial Problems, Indoor Air-Related Symptoms, and Perceived Indoor Air Quality among Students in Schools without Indoor Air Problems: A Longitu…

2018

The effect of students’…

Male; Longitudinal study; STRESS; Health Toxicology and Mutagenesis; lcsh:Medicine; 010501 environmental sciences; 01 natural sciences; 0302 clinical medicine; Indoor air quality; DIFFICULTIES QUESTIONNAIRE; ADOLESCENTS; sosioemotionaaliset ongelmat; Longitudinal Studies; 030212 general & internal medicine; Child; ta515; Schools; Socioemotional selectivity theory; 4. Education; sisäilman laatu; laatu; ASSOCIATION; 3142 Public health care science environmental and occupational health; ALLERGIC RHINITIS; teacher–student relations; indoor air problems; Air Pollution Indoor; STRENGTHS; Female; HEALTH; ongelmat; Psychology; Psychosocial; socioemotional difficulties; indoor air quality; Clinical psychology; Adolescent; Kansanterveystiede ympäristö ja työterveys - Public health care science environmental and occupational health; Indoor air; Longitudinal data; school; education; Child Behavior Disorders; Article; 03 medical and health sciences; MISSING DATA; sisäilmaongelmat; Humans; lower secondary schools; Students; opettaja-oppilassuhde; 0105 earth and related environmental sciences; Psykologia - Psychology; sisäilma; lcsh:R; Public Health Environmental and Occupational Health; indoor air-related symptoms; WORK ENVIRONMENTS; Socioeconomic Factors; psychosocial problems; koulu; Perception; teacher student relations; oireet; psykososiaaliset ongelmat; STRUCTURAL EQUATION ANALYSIS; International Journal of Environmental Research and Public Health
researchProduct

DATimeS: A machine learning time series GUI toolbox for gap-filling and vegetation phenology trends detection

2020

Optical remotely sensed data are typically discontinuous, with missing values due to cloud cover. Consequently, gap-filling solutions are needed for accurate crop phenology characterization. The Decomposition and Analysis of Time Series software (DATimeS) presented here expands established time series interpolation methods with a diversity of advanced machine learning fitting algorithms (e.g., Gaussian Process Regression: GPR) that are particularly effective for the reconstruction of multiple-season vegetation temporal patterns. DATimeS is freely available as a powerful image time series software that generates cloud-free composite maps and captures seasonal vegetation dynamics from regula…
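As a rough illustration of the simplest kind of gap-filling this abstract alludes to, the sketch below fills cloud-induced gaps in a single pixel's time series by linear interpolation. It is not DATimeS code and does not use GPR; the function name, the NaN convention for missing observations, and the NDVI-like sample values are assumptions made for illustration.

```python
import math


def fill_gaps_linear(times, values):
    """Fill NaN entries by linear interpolation between the nearest valid neighbours.

    Assumes `times` is sorted in ascending order. Gaps at the start or end of the
    series are filled with the nearest valid value (no extrapolation).
    """
    valid = [(t, v) for t, v in zip(times, values) if not math.isnan(v)]
    if not valid:
        raise ValueError("no valid observations to interpolate from")

    filled = []
    for t, v in zip(times, values):
        if not math.isnan(v):
            filled.append(v)
            continue
        before = [(tv, vv) for tv, vv in valid if tv < t]
        after = [(tv, vv) for tv, vv in valid if tv > t]
        if not before:
            filled.append(after[0][1])      # leading gap: copy first valid value
        elif not after:
            filled.append(before[-1][1])    # trailing gap: copy last valid value
        else:
            (t0, v0), (t1, v1) = before[-1], after[0]
            filled.append(v0 + (v1 - v0) * (t - t0) / (t1 - t0))
    return filled


# Hypothetical vegetation-index series; NaN marks cloud-covered acquisitions.
days = [0, 10, 20, 30, 40]
ndvi = [0.2, float("nan"), float("nan"), 0.8, float("nan")]

filled = fill_gaps_linear(days, ndvi)
print(filled)
```

Linear interpolation serves as a baseline here; fitting methods such as GPR additionally provide smooth curves and uncertainty estimates, at higher computational cost.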

Environmental Engineering; 010504 meteorology & atmospheric sciences; Computer science; 0211 other engineering and technologies; 02 engineering and technology; Machine learning; computer.software_genre; 01 natural sciences; Article; Software; Kriging; Time series; Leaf area index; 021101 geological & geomatics engineering; 0105 earth and related environmental sciences; Series (mathematics); business.industry; Ecological Modeling; Vegetation; 15. Life on land; Missing data; Artificial intelligence; business; computer; Software; Interpolation; Environmental Modelling & Software
researchProduct

Causal Effect Identification from Multiple Incomplete Data Sources: A General Search-Based Approach

2021

Causal effect identification considers whether an interventional probability distribution can be uniquely determined without parametric assumptions from measured source distributions and structural knowledge on the generating system. While complete graphical criteria and procedures exist for many identification problems, there are still challenging but important extensions that have not been considered in the literature. To tackle these new settings, we present a search algorithm directly over the rules of do-calculus. Due to generality of do-calculus, the search is capable of taking more advanced data-generating mechanisms into account along with an arbitrary type of both observational and…

FOS: Computer and information sciences; Statistics and Probability; Computer Science - Machine Learning; causality; Computer Science - Artificial Intelligence; Heuristic (computer science); Computer science; education; Machine Learning (stat.ML); transportability; computer.software_genre; 01 natural sciences; Machine Learning (cs.LG); R-kieli; missing data; QA76.75-76.765; QA273-280; 010104 statistics & probability; do-calculus; causality; do-calculus; selection bias; transportability; missing data; case-control design; meta-analysis; Statistics - Machine Learning; Search algorithm; selection bias; 0101 mathematics; Parametric statistics; päättely; meta-analyysi; case-control design; hakualgoritmit; 113 Computer and information sciences; Missing data; meta-analysis; Identification (information); Artificial Intelligence (cs.AI); Causal inference; kausaliteetti; Identifiability; Probability distribution; Data mining; Statistics Probability and Uncertainty; computer; Software; Journal of Statistical Software
researchProduct