Search results for "missing data"
showing 10 items of 83 documents
Physics-aware Gaussian processes in remote sensing
2018
Earth observation from satellite sensory data poses challenging problems, where machine learning is currently a key player. In recent years, Gaussian Process (GP) regression has excelled in biophysical parameter estimation tasks from airborne and satellite observations. GP regression is based on solid Bayesian statistics, and generally yields efficient and accurate parameter estimates. However, GPs are typically used for inverse modeling based on concurrent observations and in situ measurements only. Very often, though, a forward model encoding the well-understood physical relations between the state vector and the radiance observations is available and could be useful to improve pre…
Missing value imputation in proximity extension assay-based targeted proteomics data
2020
Targeted proteomics utilizing antibody-based proximity extension assays provides sensitive and highly specific quantifications of plasma protein levels. Multivariate analysis of these data is hampered by frequent missing values (random or left censored), calling for imputation approaches. While appropriate missing-value imputation methods exist, benchmarks of their performance in targeted proteomics data are lacking. Here, we assessed the performance of two methods for imputation of values missing completely at random, the previously top-benchmarked ‘missForest’ and the recently published ‘GSimp’ method. Evaluation was accomplished by comparing imputed with remeasured relative concentrations…
Can Deliberately Incomplete Gene Sample Augmentation Improve a Phylogeny Estimate for the Advanced Moths and Butterflies (Hexapoda: Lepidoptera)?
2011
This paper addresses the question of whether one can economically improve the robustness of a molecular phylogeny estimate by increasing gene sampling in only a subset of taxa, without having the analysis invalidated by artifacts arising from large blocks of missing data. Our case study stems from an ongoing effort to resolve poorly understood deeper relationships in the large clade Ditrysia (>150,000 species) of the insect order Lepidoptera (butterflies and moths). Seeking to remedy the overall weak support for deeper divergences in an initial study based on five nuclear genes (6.6 kb) in 123 exemplars, we nearly tripled the total gene sample (to 26 genes, 18.4 kb) but only in a…
Comparing Spatial and Spatio-temporal FPCA to Impute Large Continuous Gaps in Space
2018
Multivariate spatio-temporal data analysis methods usually assume fairly complete data, yet gaps often occur in time or in space. In air quality data, long gaps may be due to instrument malfunctions; moreover, not all the pollutants of interest are measured at all the monitoring stations of a network. In the literature, many statistical methods have been proposed for imputing short sequences of missing values, but most of them are not valid when the fraction of missing values is high. Furthermore, a limitation of the commonly used methods is that they exploit only the temporal, or only the spatial, correlation of the data. The objective of this paper is to provide an approach based …
2014
This paper considers parameter estimation for linear time-invariant (LTI) systems in an input-output setting with an output error (OE) time-delay model structure. Missing data are commonly encountered in industry due to irregular sampling, sensor failure, data deletion during preprocessing, network transmission faults, and so forth. To identify LTI systems with time delay from incomplete data, the generalized expectation-maximization (GEM) algorithm is adopted to estimate the model parameters and the time delay simultaneously. Numerical examples are provided to demonstrate the effectiveness of the proposed method.
Effectiveness of the physical activity intervention program in the PREDIMED-Plus study: a randomized controlled trial
2018
[Background] The development and implementation of effective physical activity (PA) intervention programs is challenging, particularly in older adults. After the first year of the intervention program used in the ongoing PREvención con DIeta MEDiterránea (PREDIMED)-Plus trial, we assessed the initial effectiveness of the PA component.
Impact of the terrestrial reference frame on the determination of the celestial reference frame.
2022
Currently, three up-to-date Terrestrial Reference Frames (TRF) are available: the ITRF2014 from IGN, the DTRF2014 from DGFI-TUM, and the JTRF2014 from JPL. All use identical input data of space-geodetic station positions and Earth orientation parameters, but the concepts for combining these data are fundamentally different. The IGN approach is based on the combination of technique solutions, while the DGFI approach combines the normal equation systems. Both yield reference epoch coordinates and velocities for a global set of stations. JPL uses a Kalman filter approach, realizing a TRF through weekly time series of geocentric coordinates. As the determination of the CRF is not independent of the T…
Psychosocial Problems, Indoor Air-Related Symptoms, and Perceived Indoor Air Quality among Students in Schools without Indoor Air Problems: A Longitu…
2018
The effect of students’…
DATimeS: A machine learning time series GUI toolbox for gap-filling and vegetation phenology trends detection
2020
Optical remotely sensed data are typically discontinuous, with missing values due to cloud cover. Consequently, gap-filling solutions are needed for accurate crop phenology characterization. The Decomposition and Analysis of Time Series software (DATimeS) presented here expands established time series interpolation methods with a diversity of advanced machine learning fitting algorithms (e.g., Gaussian Process Regression: GPR) particularly effective for the reconstruction of multi-season vegetation temporal patterns. DATimeS is freely available as a powerful image time series software that generates cloud-free composite maps and captures seasonal vegetation dynamics from regula…
Causal Effect Identification from Multiple Incomplete Data Sources: A General Search-Based Approach
2021
Causal effect identification considers whether an interventional probability distribution can be uniquely determined without parametric assumptions from measured source distributions and structural knowledge of the generating system. While complete graphical criteria and procedures exist for many identification problems, there are still challenging but important extensions that have not been considered in the literature. To tackle these new settings, we present a search algorithm directly over the rules of do-calculus. Due to the generality of do-calculus, the search is capable of taking more advanced data-generating mechanisms into account along with an arbitrary type of both observational and…