Search results for "Probability and Uncertainty"

showing 10 items of 578 documents

Panel Data Analysis via Mechanistic Models

2018

Panel data, also known as longitudinal data, consist of a collection of time series. Each time series, which could itself be multivariate, comprises a sequence of measurements taken on a distinct unit. Mechanistic modeling involves writing down scientifically motivated equations describing the collection of dynamic systems giving rise to the observations on each unit. A defining characteristic of panel systems is that the dynamic interaction between units should be negligible. Panel models therefore consist of a collection of independent stochastic processes, generally linked through shared parameters while also having unit-specific parameters. To give the scientist flexibility in model spe…

FOS: Computer and information sciences; Statistics and Probability; Multivariate statistics; Series (mathematics); Longitudinal data; Computer science; 05 social sciences; 01 natural sciences; Methodology (stat.ME); 010104 statistics & probability; Nonlinear system; 0502 economics and business; 0101 mathematics; Statistics Probability and Uncertainty; Particle filter; Algorithm; Statistics - Methodology; 050205 econometrics; Panel data; Sequence (medicine); Journal of the American Statistical Association

researchProduct

Mixture Hidden Markov Models for Sequence Data: The seqHMM Package in R

2019

Sequence analysis is being more and more widely used for the analysis of social sequences and other multivariate categorical time series data. However, it is often complex to describe, visualize, and compare large sequence data, especially when there are multiple parallel sequences per subject. Hidden (latent) Markov models (HMMs) are able to detect underlying latent structures and they can be used in various longitudinal settings: to account for measurement error, to detect unobservable states, or to compress information across several types of observations. Extending to mixture hidden Markov models (MHMMs) allows clustering data into homogeneous subsets, with or without external covariate…

FOS: Computer and information sciences; Statistics and Probability; Multivariate statistics; sequence analysis; aikasarjat; Computer science; R; Markov model; Statistics - Computation; Statistics - Applications; 01 natural sciences; Unobservable; categorical time series; R-kieli; 010104 statistics & probability; multi-channel sequences; visualizing sequence data; visualizing models; latent Markov models; latent class models; Covariate; Applications (stat.AP); Sannolikhetsteori och statistik; Computer software; 0101 mathematics; Time series; Probability Theory and Statistics; Hidden Markov model; Cluster analysis; lcsh:Statistics; lcsh:HA1-4737; Categorical variable; Computation (stat.CO); ta112; business.industry; R (programming languages); Pattern recognition; sekvenssianalyysi; Artificial intelligence; Statistics Probability and Uncertainty; business; Software; Journal of Statistical Software

researchProduct

Estimating with kernel smoothers the mean of functional data in a finite population setting. A note on variance estimation in presence of partially o…

2014

In the near future, millions of load curves measuring the electricity consumption of French households in small time grids (probably half hours) will be available. All these collected load curves represent a huge amount of information which could be exploited using survey sampling techniques. In particular, the total consumption of a specific customer group (for example all the customers of an electricity supplier) could be estimated using unequal probability random sampling methods. Unfortunately, data collection may undergo technical problems resulting in missing values. In this paper we study a new estimation method for the mean curve in the presence of missing values which consists in…

FOS: Computer and information sciences; Statistics and Probability; Population; Ratio estimator; Linearization; 01 natural sciences; Survey sampling; Horvitz–Thompson estimator; Methodology (stat.ME); 010104 statistics & probability; Hájek estimator; 0502 economics and business; Applied mathematics; Missing values; 0101 mathematics; education; Statistics - Methodology; 050205 econometrics; Mathematics; Pointwise; education.field_of_study; [STAT.ME] Statistics [stat]/Methodology [stat.ME]; 05 social sciences; Nonparametric statistics; Estimator; 16. Peace & justice; Missing data; Functional data; Kernel (statistics); Statistics Probability and Uncertainty; Nonparametric estimation

researchProduct

Conditional Bias Robust Estimation of the Total of Curve Data by Sampling in a Finite Population: An Illustration on Electricity Load Curves

2020

For marketing or power grid management purposes, many studies based on the analysis of total electricity consumption curves of groups of customers are now carried out by electricity companies. Aggregated totals or mean load curves are estimated using individual curves measured at fine time grid and collected according to some sampling design. Due to the skewness of the distribution of electricity consumptions, these samples often contain outlying curves which may have an important impact on the usual estimation procedures. We introduce several robust estimators of the total consumption curve which are not sensitive to such outlying curves. These estimators are based on the conditio…

FOS: Computer and information sciences; Statistics and Probability; Population; Wavelets; Statistics - Applications; 01 natural sciences; Survey sampling; Methodology (stat.ME); 010104 statistics & probability; Kokic and bell method; Conditional bias; 0502 economics and business; Statistics; Applications (stat.AP); 0101 mathematics; [MATH] Mathematics [math]; education; Statistics - Methodology; 050205 econometrics; Mathematics; Estimation; education.field_of_study; Modified band depth; business.industry; Applied Mathematics; 05 social sciences; Sampling (statistics); Functional data; Bootstrap; Electricity; Statistics Probability and Uncertainty; business; asymptotic confidence bands; Social Sciences (miscellaneous); Spherical principal component analysis

researchProduct

Asymptotic and bootstrap tests for subspace dimension

2022

Most linear dimension reduction methods proposed in the literature can be formulated using an appropriate pair of scatter matrices, see e.g. Ye and Weiss (2003), Tyler et al. (2009), Bura and Yang (2011), Liski et al. (2014) and Luo and Li (2016). The eigen-decomposition of one scatter matrix with respect to another is then often used to determine the dimension of the signal subspace and to separate signal and noise parts of the data. Three popular dimension reduction methods, namely principal component analysis (PCA), fourth order blind identification (FOBI) and sliced inverse regression (SIR) are considered in detail and the first two moments of subsets of the eigenvalues are used to test…

FOS: Computer and information sciences; Statistics and Probability; Principal component analysis; Mathematics - Statistics Theory; Statistics Theory (math.ST); 01 natural sciences; Methodology (stat.ME); 010104 statistics & probability; Dimension (vector space); Scatter matrix; Sliced inverse regression; 0502 economics and business; FOS: Mathematics; Applied mathematics; 0101 mathematics; Eigenvalues and eigenvectors; Statistics - Methodology; 050205 econometrics; Mathematics; estimointi; Numerical Analysis; Order determination; Dimensionality reduction; 05 social sciences; riippumattomien komponenttien analyysi; monimuuttujamenetelmät; Statistics Probability and Uncertainty; Subspace topology; Signal subspace

researchProduct

Imputation Procedures in Surveys Using Nonparametric and Machine Learning Methods: An Empirical Comparison

2020

Nonparametric and machine learning methods are flexible methods for obtaining accurate predictions. Nowadays, data sets with a large number of predictors and complex structures are fairly common. In the presence of item nonresponse, nonparametric and machine learning procedures may thus provide a useful alternative to traditional imputation procedures for deriving a set of imputed values used next for the estimation of study parameters defined as solution of population estimating equation. In this paper, we conduct an extensive empirical investigation that compares a number of imputation procedures in terms of bias and efficiency in a wide variety of settings, including high-dimens…

FOS: Computer and information sciences; Statistics and Probability; Statistics::Applications; Empirical comparison; business.industry; Computer science; Applied Mathematics; Nonparametric statistics; Machine learning; computer.software_genre; Statistics - Computation; Variety (cybernetics); Methodology (stat.ME); Set (abstract data type); Statistics::Methodology; Imputation (statistics); Artificial intelligence; Statistics Probability and Uncertainty; business; computer; Statistics - Methodology; Computation (stat.CO); Social Sciences (miscellaneous); Journal of Survey Statistics and Methodology

researchProduct

An ensemble approach to short-term forecast of COVID-19 intensive care occupancy in Italian Regions

2020

The availability of intensive care beds during the COVID‐19 epidemic is crucial to guarantee the best possible treatment to severely affected patients. In this work we show a simple strategy for short‐term prediction of COVID‐19 intensive care unit (ICU) beds, that has proved very effective during the Italian outbreak in February to May 2020. Our approach is based on an optimal ensemble of two simple methods: a generalized linear mixed regression model, which pools information over different areas, and an area‐specific nonstationary integer autoregressive methodology. Optimal weights are estimated using a leave‐last‐out rationale. The approach has been set up and validated during t…
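
A hedged sketch of how a two-model ensemble weight could be chosen with a leave-last-out idea: each model is fitted without the most recent observation, and the weight that best predicts that held-out value is kept. The squared-error criterion, the grid search, and every name below are assumptions made for illustration; the abstract does not specify them.

```python
def leave_last_out_weight(pred_a, pred_b, held_out, grid_size=101):
    """Return the weight w in [0, 1] on model A that minimises squared error.

    pred_a, pred_b: predictions of the held-out last observation from two
        models fitted WITHOUT that observation (one entry per area).
    held_out: the observed last values, in the same order.
    All three arguments are hypothetical inputs for this sketch.
    """
    best_w, best_err = 0.0, float("inf")
    for i in range(grid_size):
        w = i / (grid_size - 1)  # evenly spaced candidate weights
        err = sum(
            (w * a + (1 - w) * b - y) ** 2
            for a, b, y in zip(pred_a, pred_b, held_out)
        )
        if err < best_err:  # strict comparison keeps the first minimiser
            best_w, best_err = w, err
    return best_w


def ensemble_forecast(w, forecast_a, forecast_b):
    """Combine two forecasts with weight w on model A and 1 - w on model B."""
    return w * forecast_a + (1 - w) * forecast_b


# Toy numbers, not data from the paper.
w = leave_last_out_weight([8.0], [12.0], [10.0])
combined = ensemble_forecast(w, 8.0, 12.0)
```

A grid search is used here only because it is easy to read; with a squared-error criterion the weight also has a closed form.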

FOS: Computer and information sciences; Statistics and Probability; Time Factors; Occupancy; Coronavirus disease 2019 (COVID-19); Computer science; 01 natural sciences; Generalized linear mixed model; SARS-CoV-2; law.invention; clustered data; COVID-19; integer autoregressive; integer autoregressive model; panel data; weighted ensemble; Methodology (stat.ME); 010104 statistics & probability; 03 medical and health sciences; 0302 clinical medicine; law; Intensive care; Econometrics; Humans; 030212 general & internal medicine; 0101 mathematics; Pandemics; Statistics - Methodology; Reproducibility of Results; General Medicine; Intensive care unit; Research Papers; Term (time); Intensive Care Units; Autoregressive model; Italy; Nonlinear Dynamics; Forecasting; Statistics Probability and Uncertainty; Settore SECS-S/01; Settore SECS-S/01 - Statistica; Research Paper

researchProduct

KFAS : Exponential Family State Space Models in R

2017

State space modelling is an efficient and flexible method for statistical inference of a broad class of time series and other data. This paper describes an R package KFAS for state space modelling with the observations from an exponential family, namely Gaussian, Poisson, binomial, negative binomial and gamma distributions. After introducing the basic theory behind Gaussian and non-Gaussian state space models, an illustrative example of Poisson time series forecasting is provided. Finally, a comparison to alternative R packages suitable for non-Gaussian time series modelling is presented.

FOS: Computer and information sciences; Statistics and Probability; aikasarjat; Gaussian; Negative binomial distribution; forecasting; Poisson distribution; 01 natural sciences; Statistics - Computation; Methodology (stat.ME); 010104 statistics & probability; 03 medical and health sciences; symbols.namesake; 0302 clinical medicine; Exponential family; Gamma distribution; Statistical inference; State space; Applied mathematics; Sannolikhetsteori och statistik; 030212 general & internal medicine; 0101 mathematics; Probability Theory and Statistics; lcsh:Statistics; lcsh:HA1-4737; Computation (stat.CO); Statistics - Methodology; Mathematics; R; state space models; time series; dynamic linear models; ta112; Series (mathematics); Statistics; Computer software; symbols; Statistics Probability and Uncertainty; Software

researchProduct

Community characterization of heterogeneous complex systems

2011

We introduce an analytical statistical method to characterize the communities detected in heterogeneous complex systems. By posing a suitable null hypothesis, our method makes use of the hypergeometric distribution to assess the probability that a given property is over-expressed in the elements of a community with respect to all the elements of the investigated set. We apply our method to two specific complex networks, namely a network of world movies and a network of physics preprints. The characterization of the elements and of the communities is done in terms of languages and countries for the movie network and of journals and subject categories for papers. We find that our method is ab…
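
A minimal sketch of the kind of test this abstract describes, assuming the over-expression p-value is the upper tail of a hypergeometric distribution: the probability of seeing at least k elements with a property in a community of size n, when K of the N elements in the whole set have it. The function name and the example counts are illustrative and not taken from the paper.

```python
from math import comb


def overexpression_p_value(N, K, n, k):
    """Upper-tail hypergeometric probability P(X >= k).

    N: number of elements in the whole investigated set
    K: number of elements in the set that have the property
    n: number of elements in the community
    k: number of community elements that have the property
    """
    total = comb(N, n)  # number of ways to draw a community of size n
    upper = min(K, n)   # largest possible count of elements with the property
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


# Toy numbers: 100 elements, 20 with the property, community of 10,
# of which 6 have the property.
p = overexpression_p_value(100, 20, 10, 6)
```

Under the null hypothesis that community membership is unrelated to the property, a small value of `p` suggests the property is over-expressed in the community.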

FOS: Computer and information sciences; Statistics and Probability; random graphs networks statistical inference socio-economic networks; Physics - Physics and Society; Theoretical computer science; Property (programming); Complex system; FOS: Physical sciences; Physics and Society (physics.soc-ph); socio-economic networks; Statistical inference; Social and Information Networks (cs.SI); Random graph; Computer Science - Social and Information Networks; Statistical and Nonlinear Physics; Probability and statistics; Complex network; Settore FIS/07 - Fisica Applicata (Beni Culturali Ambientali Biol.e Medicin); Hypergeometric distribution; Physics - Data Analysis Statistics and Probability; network; Statistics Probability and Uncertainty; Null hypothesis; Data Analysis Statistics and Probability (physics.data-an); Journal of Statistical Mechanics: Theory and Experiment

researchProduct

A Unified SVM Framework for Signal Estimation

2013

This paper presents a unified framework to tackle estimation problems in Digital Signal Processing (DSP) using Support Vector Machines (SVMs). The use of SVMs in estimation problems has been traditionally limited to its mere use as a black-box model. Noting such limitations in the literature, we take advantage of several properties of Mercer's kernels and functional analysis to develop a family of SVM methods for estimation in DSP. Three types of signal model equations are analyzed. First, when a specific time-signal structure is assumed to model the underlying system that generated the data, the linear signal model (the so-called Primal Signal Model formulation) is stated and analyzed. T…

FOS: Computer and information sciences; business.industry; Noise (signal processing); Computer science; Applied Mathematics; Spectral density estimation; Array processing; Pattern recognition; Machine Learning (stat.ML); Statistics - Applications; Support vector machine; Kernel (linear algebra); Kernel method; Computational Theory and Mathematics; Statistics - Machine Learning; Artificial Intelligence; Signal Processing; Applications (stat.AP); Computer Vision and Pattern Recognition; Electrical and Electronic Engineering; Statistics Probability and Uncertainty; business; Digital signal processing; Reproducing kernel Hilbert space

researchProduct