Search results for "missing"

showing 10 items of 174 documents

Online Principal Component Analysis in High Dimension: Which Algorithm to Choose?

2017

Principal component analysis (PCA) is a method of choice for dimension reduction. In the current context of data explosion, online techniques that do not require storing all data in memory are indispensable to perform the PCA of streaming data and/or massive data. Despite the wide availability of recursive algorithms that can efficiently update the PCA when new data are observed, the literature offers little guidance on how to select a suitable algorithm for a given application. This paper reviews the main approaches to online PCA, namely, perturbation techniques, incremental methods and stochastic optimisation, and compares the most widely employed techniques in terms of statistical a…

Statistics and Probability; Computer science; Computation; Dimensionality reduction; Incremental methods; 02 engineering and technology; Missing data; 01 natural sciences; 010104 statistics & probability; Data explosion; Streaming data; Principal component analysis; 0202 electrical engineering electronic engineering information engineering; 020201 artificial intelligence & image processing; 0101 mathematics; Statistics Probability and Uncertainty; Algorithm; Eigendecomposition of a matrix; International Statistical Review

researchProduct

Correction: Correcting for non-ignorable missingness in smoking trends

2017

Statistics and Probability; Computer science; Statistics; Statistics Probability and Uncertainty; Missing data; Stat

researchProduct

Extending graphical models for applications: on covariates, missingness and normality

2021

The authors of the paper “Bayesian Graphical Models for Modern Biological Applications” have put forward an important framework for making graphical models more useful in applied settings. In this discussion paper, we give a number of suggestions for making this framework even more suitable for practical scenarios. Firstly, we show that an alternative and simplified definition of covariate might make the framework more manageable in high-dimensional settings. Secondly, we point out that the inclusion of missing variables is important for practical data analysis. Finally, we comment on the effect that the Gaussianity assumption has in identifying the underlying conditional independence graph…

Statistics and Probability; Computer science; media_common.quotation_subject; Missing data; Conditional graphical models; Copula graphical models; Covariate; Econometrics; Sparse inference; Graphical model; Statistics Probability and Uncertainty; Normality; media_common

researchProduct

Study Design in Causal Models

2014

The causal assumptions, the study design and the data are the elements required for scientific inference in empirical research. The research is adequately communicated only if all of these elements and their relations are described precisely. Causal models with design describe the study design and the missing-data mechanism together with the causal structure and allow the direct application of causal calculus in the estimation of the causal effects. The flow of the study is visualized by ordering the nodes of the causal diagram in two dimensions by their causal order and the time of the observation. Conclusions on whether a causal or observational relationship can be estimated from the coll…

Statistics and Probability; Empirical research; Theoretical computer science; Graph (abstract data type); Graphical model; Statistics Probability and Uncertainty; Causal structure; Missing data; Causality; Structural equation modeling; Causal model; Mathematics; Scandinavian Journal of Statistics

researchProduct

Bayesian joint modeling for assessing the progression of chronic kidney disease in children.

2016

Joint models are rich and flexible models for analyzing longitudinal data with nonignorable missing data mechanisms. This article proposes a Bayesian random-effects joint model to assess the evolution of a longitudinal process in terms of a linear mixed-effects model that accounts for heterogeneity between the subjects, serial correlation, and measurement error. Dropout is modeled in terms of a survival model with competing risks and left truncation. The model is applied to data coming from ReVaPIR, a project involving children with chronic kidney disease whose evolution is mainly assessed through longitudinal measurements of glomerular filtration rate.

Statistics and Probability; Epidemiology; Computer science; Bayesian probability; 030232 urology & nephrology; Renal function; 01 natural sciences; 010104 statistics & probability; 03 medical and health sciences; 0302 clinical medicine; Health Information Management; Statistics; Econometrics; medicine; Humans; 0101 mathematics; Renal Insufficiency Chronic; Child; Joint (geology); Dropout (neural networks); Survival analysis; Autocorrelation; Bayes Theorem; medicine.disease; Missing data; Survival Analysis; Child Preschool; Disease Progression; Kidney disease; Statistical methods in medical research

researchProduct

Bayesian models for data missing not at random in health examination surveys

2018

In epidemiological surveys, data missing not at random (MNAR) due to survey nonresponse may potentially lead to a bias in the risk factor estimates. We propose an approach based on Bayesian data augmentation and survival modelling to reduce the nonresponse bias. The approach requires additional information based on follow-up data. We present a case study of smoking prevalence using FINRISK data collected between 1972 and 2007 with a follow-up to the end of 2012 and compare it to other commonly applied missing at random (MAR) imputation approaches. A simulation experiment is carried out to study the validity of the approaches. Our approach appears to reduce the nonresponse bias substantially…

Statistics and Probability; FOS: Computer and information sciences; medicine.medical_specialty; multiple imputation; Computer science; Bayesian probability; 01 natural sciences; Statistics - Applications; survival analysis; follow-up data; Methodology (stat.ME); 010104 statistics & probability; 03 medical and health sciences; Health examination; 0302 clinical medicine; Epidemiology; Statistics; medicine; Applications (stat.AP); 030212 general & internal medicine; 0101 mathematics; Survival analysis; Statistics - Methodology; Bayes estimator; ta112; elinaika-analyysi; Risk factor (computing); Bayesian estimation; 3. Good health; health examination surveys; Statistics Probability and Uncertainty; Missing not at random; data augmentation

researchProduct

A hierarchical Bayesian birth cohort analysis from incomplete registry data: evaluating the trends in the age of onset of insulin-dependent diabetes …

2005

Childhood diabetes is one of the major non-communicable diseases in children under 15 years of age. It requires life-long insulin treatment and may lead to serious complications. Along with the worldwide increase in incidence, several countries have recently reported a decreasing trend in the age of onset of the disease. The aim of this study is to analyse long-term data on the incidence of childhood diabetes in Finland from the birth cohort perspective. The annual incidence data were available for the period 1965–1996, which translates into the 1951–1996 birth cohorts. Hence the data consist of completely and partially observed cohorts. Bayesian modelling was employed in the analysi…

Statistics and Probability; Male; Adolescent; Epidemiology; medicine.medical_treatment; Disease; Cohort Studies; Diabetes mellitus; Medicine; Humans; Age of Onset; Child; Finland; Models Statistical; business.industry; Insulin; Incidence (epidemiology); Bayes Theorem; medicine.disease; Missing data; Markov Chains; Diabetes Mellitus Type 1; Child Preschool; Cohort; Female; Age of onset; business; Monte Carlo Method; Cohort study; Demography; Statistics in medicine

researchProduct

Multiple Comparisons of Treatments with Stable Multivariate Tests in a Two‐Stage Adaptive Design, Including a Test for Non‐Inferiority

2000

The application of stabilized multivariate tests is demonstrated in the analysis of a two-stage adaptive clinical trial with three treatment arms. Due to the clinical problem, the multiple comparisons include tests of superiority as well as a test for non-inferiority, where non-inferiority is (because of missing absolute tolerance limits) expressed as a linear contrast of the three treatments. Special emphasis is placed on the combination of the three sources of multiplicity: multiple endpoints, multiple treatments, and two stages of the adaptive design. In particular, the adaptation after the first stage comprises a change of the a priori order of hypotheses.

Statistics and Probability; Multivariate statistics; Adaptive clinical trial; Multivariate analysis; Multiple comparisons problem; Statistics; Contrast (statistics); Regression analysis; General Medicine; Statistics Probability and Uncertainty; Missing data; Statistical hypothesis testing; Mathematics; Biometrical Journal

researchProduct

Estimating Mean Lifetime from Partially Observed Events in Nuclear Physics

2022

The mean lifetime is an important characteristic of particles to be identified in nuclear physics. State-of-the-art particle detectors can identify the arrivals of single radioactive nuclei as well as their subsequent radioactive decays (departures). Challenges arise when the arrivals and departures are unmatched and the departures are only partially observed. An inefficient solution is to run experiments where the arrival rate is set very low to allow for the matching of arrivals and departures. We propose an estimation method that works for a wide range of arrival rates. The method combines an initial estimator and a numerical bias correction technique. Simulations and examples b…

Statistics and Probability; Physics; Nuclear physics; design of experiments; missing data; noisy binary search; radioactive decay; Poisson process; Statistics Probability and Uncertainty; ydinfysiikka; tilastolliset mallit; estimointi; radioaktiivisuus; Journal of the Royal Statistical Society Series C: Applied Statistics

researchProduct

cglasso: An R Package for Conditional Graphical Lasso Inference with Censored and Missing Values

2023

Sparse graphical models have revolutionized multivariate inference. With the advent of high-dimensional multivariate data in many applied fields, these methods are able to detect a much lower-dimensional structure, often represented via a sparse conditional independence graph. There have been numerous extensions of such methods in the past decade. Many practical applications have additional covariates or suffer from missing or censored data. Despite the development of these extensions of sparse inference methods for graphical models, there have so far been no implementations for, e.g., conditional graphical models. Here we present the general-purpose package cglasso for estimating sparse co…

Statistics and Probability; conditional Gaussian graphical models; cglasso; glasso; sparsity; high-dimensionality; missing data; censoring; Statistics Probability and Uncertainty; Settore SECS-S/01 - Statistica; Software

researchProduct