Search results for "MISSING DATA"
showing 10 items of 83 documents
A new methodology based on functional principal component analysis to study postural stability post-stroke
2018
[EN] Background. A major goal in stroke rehabilitation is the establishment of more effective physical therapy techniques to recover postural stability. Functional Principal Component Analysis provides greater insight into recovery trends. However, when missing values exist, obtaining functional data presents some difficulties. The purpose of this study was to reveal an alternative technique for obtaining the Functional Principal Components without requiring the conversion to functional data beforehand and to investigate this methodology to determine the effect of specific physical therapy techniques in balance recovery trends in elderly subjects with hemiplegia post-stroke. Methods: A rand…
Adjusting for selective non-participation with re-contact data in the FINRISK 2012 survey
2018
Aims: A common objective of epidemiological surveys is to provide population-level estimates of health indicators. Survey results tend to be biased under selective non-participation. One approach to bias reduction is to collect information about non-participants by contacting them again and asking them to fill in a questionnaire. This information is called re-contact data, and it allows the estimates to be adjusted for non-participation. Methods: We analyse data from the FINRISK 2012 survey, where re-contact data were collected. We assume that the respondents of the re-contact survey are similar to the remaining non-participants with respect to the health given their available background informa…
Psychosocial Problems, Indoor Air-Related Symptoms, and Perceived Indoor Air Quality among Students in Schools without Indoor Air Problems: A Longitu…
2018
The effect of students’…
Treating missing data in a clinical neuropsychological dataset--data imputation.
2001
Missing data frequently reduce the applicability of clinically collected data in research requiring multivariate statistics. In data imputation, missing values are replaced by predicted values obtained from models based on auxiliary information. Our aim was to complete a clinical child neuropsychological data set containing 5.2% of missing observations. This was to be used in research requiring multivariate statistics. We compared four data imputation methods by artificially deleting some data. A real-donor imputation method which preserved the parameter estimates and which predicted the observed values with acceptable accuracy was used to complete the data set. In addressing the lack of st…
Evaluation of the performance of Dutch Lipid Clinic Network score in an Italian FH population: The LIPIGEN study
2018
Background and aims: Familial hypercholesterolemia (FH) is an inherited disorder characterized by high levels of blood cholesterol from birth and premature coronary heart disease. Thus, the identification of FH patients is crucial to prevent or delay the onset of cardiovascular events, and the availability of a tool helping with the diagnosis in the setting of general medicine is essential to improve FH patient identification. Methods: This study evaluated the performance of the Dutch Lipid Clinic Network (DLCN) score in FH patients enrolled in the LIPIGEN study, an Italian integrated network aimed at improving the identification of patients with genetic dyslipidaemias, including FH.…
Physics-Aware Gaussian Processes for Earth Observation
2017
Earth observation from satellite sensory data poses challenging problems, where machine learning is currently a key player. In recent years, Gaussian Process (GP) regression and other kernel methods have excelled in biophysical parameter estimation tasks from space. GP regression is based on solid Bayesian statistics, and generally yields efficient and accurate parameter estimates. However, GPs are typically used for inverse modeling based on concurrent observations and in situ measurements only. Very often a forward model encoding the well-understood physical relations is available though. In this work, we review three GP models that respect and learn the physics of the underlying processes …
Multiple imputation of rainfall missing data in the Iberian Mediterranean context
2017
Given the increasing need for complete rainfall data networks, diverse methods for filling gaps in observed precipitation series have been proposed in recent years, progressively more advanced than the traditional approaches to the problem. The present study validated 10 methods (6 linear, 2 non-linear and 2 hybrid) that allow multiple imputation, i.e., filling the missing data of multiple incomplete series at the same time in a dense network of neighboring stations. These were applied to daily and monthly rainfall in two sectors of the Jucar River Basin Authority (east Iberian Peninsula), which is characterized by a high spatial irregularity and difficulty of rainfa…
2021
Data collected in criminal investigations may suffer from issues like: (i) incompleteness, due to the covert nature of criminal organizations; (ii) incorrectness, caused by either unintentional data collection errors or intentional deception by criminals; (iii) inconsistency, when the same information is collected into law enforcement databases multiple times, or in different formats. In this paper we analyze nine real criminal networks of different nature (i.e., Mafia networks, criminal street gangs and terrorist organizations) in order to quantify the impact of incomplete data, and to determine which network type is most affected by it. The networks are first pruned using two specific m…
Imputation Strategies for Missing Data in Environmental Time Series for An Unlucky Situation
2005
After a detailed review of the main specific solutions for treatment of missing data in environmental time series, this paper deals with the unlucky situation in which, in an hourly series, missing data immediately follow an absolutely anomalous period, for which we do not have any similar period to use for imputation. A tentative multivariate and multiple imputation is put forward and evaluated; it is based on the possibility, typical of environmental time series, to resort to correlations or physical laws that characterize relationships between air pollutants.
Comparison of different predictive models for nutrient estimation in a sequencing batch reactor for wastewater treatment
2006
In this paper different predictive models for nutrient estimation in a sequencing batch reactor (SBR) for wastewater treatment are compared: principal component regression (PCR), partial least squares (PLS), and artificial neural networks (ANNs). Two unfolding procedures were used: batch-wise and variable-wise. For the latter unfolding method, X and Y matrix augmentation with lagged variables was used in some models to incorporate process dynamics. The results have shown that batch-wise unfolding PLS models outperform the other approaches. The ANN models are good predictive models, but in this particular case-study, they do not outperform those multivariate projection models that …