Search results for "statistics"

showing 10 items of 1891 documents

Unbiased Estimators and Multilevel Monte Carlo

2018

Multilevel Monte Carlo (MLMC) and unbiased estimators recently proposed by McLeish (Monte Carlo Methods Appl., 2011) and Rhee and Glynn (Oper. Res., 2015) are closely related. This connection is elaborated by presenting a new general class of unbiased estimators, which admits previous debiasing schemes as special cases. New lower variance estimators are proposed, which are stratified versions of earlier unbiased schemes. Under general conditions, essentially when MLMC admits the canonical square root Monte Carlo error rate, the proposed new schemes are shown to be asymptotically as efficient as MLMC, both in terms of variance and cost. The experiments demonstrate that the variance reduction…
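As a rough illustration of the debiasing idea behind the McLeish and Rhee-Glynn estimators mentioned in this abstract, the Python sketch below implements a single-term randomized estimator on a toy problem. The toy target (E[exp(U)] for U uniform on [0, 1], approximated by Taylor polynomials), the geometric level distribution, and all function names are assumptions chosen for illustration; none of them come from the paper, and the stratified schemes it proposes are not shown.

```python
import math
import random


def single_term_estimate(rng, r=0.5):
    """One draw of a single-term unbiased estimator (toy setting, assumed).

    Level n approximates Y = exp(U) by its degree-n Taylor polynomial, so the
    level correction is Delta_n = U**n / n!. A random level N is drawn with
    P(N = n) = (1 - r) * r**n, and the estimator returns Delta_N / P(N = N).
    """
    n = 0
    while rng.random() < r:  # geometric level on {0, 1, 2, ...}
        n += 1
    p_n = (1 - r) * r ** n
    u = rng.random()
    delta_n = u ** n / math.factorial(n)
    return delta_n / p_n


def unbiased_mean(num_samples, seed=0):
    """Average of independent single-term estimates."""
    rng = random.Random(seed)
    total = sum(single_term_estimate(rng) for _ in range(num_samples))
    return total / num_samples
```

Because the expectation of each draw is the sum of the expected level corrections, the average should approach E[exp(U)] = e - 1 without any truncation bias, provided the level probabilities decay slowly enough for the variance to stay finite.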

FOS: Computer and information sciences, Monte Carlo method, Word error rate, 010103 numerical & computational mathematics, stochastic differential equation, Management Science and Operations Research, Statistics - Computation, 01 natural sciences, 010104 statistics & probability, Stochastic differential equation, stratification, Square root, FOS: Mathematics, Applied mathematics, 0101 mathematics, Computation (stat.CO), stokastiset prosessit, Mathematics, Probability (math.PR), ta111, Estimator, Variance (accounting), unbiased estimators, Computer Science Applications, Monte Carlo -menetelmät, 65C05 (Primary) 65C30 (Secondary), efficiency, kerrostuneisuus, Variance reduction, unbiase, multilevel Monte Carlo, differentiaaliyhtälöt, Mathematics - Probability, Operations Research
researchProduct

Pattern statistics in faro words and permutations

2021

We study the distribution and the popularity of some patterns in $k$-ary faro words, i.e. words over the alphabet $\{1, 2, \ldots, k\}$ obtained by interlacing the letters of two nondecreasing words of lengths differing by at most one. We present a bijection between these words and dispersed Dyck paths (i.e. Motzkin paths with all level steps on the $x$-axis) with a given number of peaks. We show how the bijection maps statistics of consecutive patterns of faro words into linear combinations of other pattern statistics on paths. Then, we deduce enumerative results by providing multivariate generating functions for the distribution and the popularity of patterns of length at most three. Fina…
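The definition of a $k$-ary faro word given in this abstract can be made concrete with a small enumeration. The sketch below is only an illustration: the function name is hypothetical, and it assumes that when the two nondecreasing words differ in length, the longer one supplies the first letter.

```python
from itertools import combinations_with_replacement, zip_longest
from math import comb


def faro_words(n, k):
    """List all k-ary faro words of length n as tuples.

    A faro word interlaces two nondecreasing words u and v over {1, ..., k}.
    Assumption: u has length ceil(n/2), v has length floor(n/2), and the
    interlacing starts with u.
    """
    long_len, short_len = (n + 1) // 2, n // 2
    alphabet = range(1, k + 1)
    words = set()
    # combinations_with_replacement over a sorted alphabet yields exactly
    # the nondecreasing words of the requested length.
    for u in combinations_with_replacement(alphabet, long_len):
        for v in combinations_with_replacement(alphabet, short_len):
            word = tuple(
                x for pair in zip_longest(u, v) for x in pair if x is not None
            )
            words.add(word)
    return sorted(words)
```

Under this convention, u and v can be read back from the odd and even positions, so the number of faro words is the product of two multiset counts, comb(k + long_len - 1, long_len) * comb(k + short_len - 1, short_len).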

FOS: Computer and information sciences, Multivariate statistics, Distribution (number theory), Discrete Mathematics (cs.DM), Interlacing, 0102 computer and information sciences, 02 engineering and technology, [INFO.INFO-DM]Computer Science [cs]/Discrete Mathematics [cs.DM], 01 natural sciences, Theoretical Computer Science, Combinatorics, Statistics, [MATH.MATH-CO]Mathematics [math]/Combinatorics [math.CO], 05A05 (Primary) 05A15 05A19 68R15 (Secondary), 0202 electrical engineering electronic engineering information engineering, FOS: Mathematics, Discrete Mathematics and Combinatorics, Mathematics - Combinatorics, Linear combination, Mathematics, Discrete mathematics, Mathematics::Combinatorics, 020206 networking & telecommunications, Computer Science::Computation and Language (Computational Linguistics and Natural Language and Speech Processing), Derangement, 010201 computation theory & mathematics, Bijection, Combinatorics (math.CO), Alphabet, Computer Science::Formal Languages and Automata Theory, Computer Science - Discrete Mathematics
researchProduct

Gaussianizing the Earth: Multidimensional Information Measures for Earth Data Analysis

2021

Information theory is an excellent framework for analyzing Earth system data because it allows us to characterize uncertainty and redundancy, and is universally interpretable. However, accurately estimating information content is challenging because spatio-temporal data is high-dimensional, heterogeneous and has non-linear characteristics. In this paper, we apply multivariate Gaussianization for probability density estimation which is robust to dimensionality, comes with statistical guarantees, and is easy to apply. In addition, this methodology allows us to estimate information-theoretic measures to characterize multivariate densities: information, entropy, total correlation, and mutual in…

FOS: Computer and information sciences, Multivariate statistics, General Computer Science, Computer science, Machine Learning (stat.ML), Mutual information, Information theory, computer.software_genre, Statistics - Applications, Earth system science, Redundancy (information theory), 13. Climate action, Statistics - Machine Learning, General Earth and Planetary Sciences, Entropy (information theory), Applications (stat.AP), Total correlation, Data mining, Electrical and Electronic Engineering, Instrumentation, computer, Curse of dimensionality
researchProduct

Fractional Spectral Moments for Digital Simulation of Multivariate Wind Velocity Fields

2012

In this paper, a method for the digital simulation of wind velocity fields by means of the Fractional Spectral Moment function is proposed. It is shown that by constructing a digital filter whose coefficients are the fractional spectral moments, it is possible to simulate samples of the target process as a superposition of Riesz fractional derivatives of a Gaussian white noise process. The key to this simulation technique is the generalized Taylor expansion proposed by the authors. The method is extended to multivariate processes, and practical issues in the implementation of the method are reported.

FOS: Computer and information sciences, Multivariate wind velocity field, Multivariate statistics, Statistical Mechanics (cond-mat.stat-mech), Fractional spectral moment, Renewable Energy Sustainability and the Environment, Mechanical Engineering, Mathematical analysis, FOS: Physical sciences, Generalized Taylor form, White noise, Function (mathematics), Digital simulation of Gaussian stationary processe, Fractional calculu, Statistics - Computation, Transfer function, Wind speed, Fractional calculus, Superposition principle, Settore ICAR/08 - Scienza Delle Costruzioni, Computation (stat.CO), Condensed Matter - Statistical Mechanics, Linear filter, Civil and Structural Engineering, Mathematics
researchProduct

Gradients of O-information: Low-order descriptors of high-order dependencies

2023

O-information is an information-theoretic metric that captures the overall balance between redundant and synergistic information shared by groups of three or more variables. To complement the global assessment provided by this metric, here we propose the gradients of the O-information as low-order descriptors that can characterise how high-order effects are localised across a system of interest. We illustrate the capabilities of the proposed framework by revealing the role of specific spins in Ising models with frustration, and in a practical analysis of US macroeconomic data. Our theoretical and empirical analyses demonstrate the potential of these gradients to highlight the contributio…
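To make the quantities in this abstract concrete, the sketch below computes the O-information of a small discrete system from its joint probability table, using the usual definition Ω(X) = (n − 2) H(X) + Σ_j [H(X_j) − H(X_{−j})], with positive values indicating redundancy and negative values synergy. The gradient with respect to variable i is assumed here to be the first-order difference Ω(X) − Ω(X_{−i}). The function names and the dictionary representation of the distribution are illustrative choices, not the authors' code.

```python
from collections import defaultdict
from math import log2


def marginal(pmf, keep):
    """Marginalise a joint pmf {state tuple: prob} onto the indices in keep."""
    out = defaultdict(float)
    for state, p in pmf.items():
        out[tuple(state[i] for i in keep)] += p
    return dict(out)


def entropy(pmf):
    """Shannon entropy in bits."""
    return -sum(p * log2(p) for p in pmf.values() if p > 0)


def o_information(pmf):
    """Omega = (n - 2) H(X) + sum_j [H(X_j) - H(X_{-j})]."""
    n = len(next(iter(pmf)))
    total = (n - 2) * entropy(pmf)
    for j in range(n):
        rest = [i for i in range(n) if i != j]
        total += entropy(marginal(pmf, [j])) - entropy(marginal(pmf, rest))
    return total


def o_information_gradient(pmf, i):
    """Assumed first-order gradient: Omega(X) - Omega(X without variable i)."""
    n = len(next(iter(pmf)))
    rest = [k for k in range(n) if k != i]
    return o_information(pmf) - o_information(marginal(pmf, rest))


# Toy system: X3 = X1 xor X2 (synergistic triple) plus an independent bit X4.
xor_plus_noise = {
    (a, b, a ^ b, c): 1 / 8 for a in (0, 1) for b in (0, 1) for c in (0, 1)
}
```

In this toy system the gradient is zero for the independent bit and nonzero for the three variables taking part in the XOR relation, which is the kind of localisation the abstract describes.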

FOS: Computer and information sciences, Physics and Astronomy, Information Theory (cs.IT), Computer Science - Information Theory, Physics - Data Analysis Statistics and Probability, Settore ING-INF/06 - Bioingegneria Elettronica E Informatica, FOS: Physical sciences, General Physics and Astronomy, complex systems information theory dynamical systems econophysics, Data Analysis Statistics and Probability (physics.data-an), Physical Review Research
researchProduct

Simulation-based marginal likelihood for cluster strong lensing cosmology

2015

Comparisons between observed and predicted strong lensing properties of galaxy clusters have been routinely used to claim either tension or consistency with $\Lambda$CDM cosmology. However, standard approaches to such cosmological tests are unable to quantify the preference for one cosmology over another. We advocate approximating the relevant Bayes factor using a marginal likelihood that is based on the following summary statistic: the posterior probability distribution function for the parameters of the scaling relation between Einstein radii and cluster mass, $\alpha$ and $\beta$. We demonstrate, for the first time, a method of estimating the marginal likelihood using the X-ray selected …

FOS: Computer and information sciences, STATISTICAL [METHODS], Cold dark matter, Cosmology and Nongalactic Astrophysics (astro-ph.CO), NUMERICAL [METHODS], Ciencias Físicas, Posterior probability, FOS: Physical sciences, Astrophysics::Cosmology and Extragalactic Astrophysics, 01 natural sciences, Statistics - Applications, Cosmology, methods: numerical, //purl.org/becyt/ford/1 [https], cosmology: theory, 0103 physical sciences, Cluster (physics), Applications (stat.AP), Statistical physics, 010303 astronomy & astrophysics, Instrumentation and Methods for Astrophysics (astro-ph.IM), Galaxy cluster, Physics, methods: statistical, gravitational lensing: strong; methods: numerical; methods: statistical; galaxies: clusters: general; cosmology: theory, 010308 nuclear & particles physics, gravitational lensing: strong, Astronomy and Astrophysics, Bayes factor, //purl.org/becyt/ford/1.3 [https], STRONG [GRAVITATIONAL LENSING], Redshift, Marginal likelihood, Astronomía, THEORY [COSMOLOGY], Space and Planetary Science, galaxies: clusters: general, Physics - Data Analysis Statistics and Probability, CLUSTERS: GENERAL [GALAXIES], Astrophysics - Instrumentation and Methods for Astrophysics, Data Analysis Statistics and Probability (physics.data-an), CIENCIAS NATURALES Y EXACTAS, Astrophysics - Cosmology and Nongalactic Astrophysics
researchProduct

Pattern Recovery in Penalized and Thresholded Estimation and its Geometry

2023

We consider the framework of penalized estimation where the penalty term is given by a real-valued polyhedral gauge, which encompasses methods such as LASSO (and many variants thereof such as the generalized LASSO), SLOPE, OSCAR, PACS and others. Each of these estimators can uncover a different structure or "pattern" of the unknown parameter vector. We define a general notion of patterns based on subdifferentials and formalize an approach to measure their complexity. For pattern recovery, we provide a minimal condition for a particular pattern to be detected by the procedure with positive probability, the so-called accessibility condition. Using our approach, we also introduce the stronge…
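For the simplest estimator named in this abstract, the idea of a recovered pattern can be sketched in a few lines. The example assumes an orthonormal design, in which case the LASSO solution reduces to componentwise soft-thresholding of the least-squares coefficients, and it takes the sign vector of the estimate as the LASSO pattern. Both the orthonormal simplification and the function names are assumptions made for illustration; the general subdifferential-based definition from the paper is not implemented here.

```python
def soft_threshold(y, lam):
    """Componentwise soft-thresholding: sign(y_i) * max(|y_i| - lam, 0).

    Under the assumed orthonormal design, this is the LASSO estimate when y
    holds the least-squares coefficients and lam is the penalty level.
    """
    out = []
    for value in y:
        shrunk = max(abs(value) - lam, 0.0)
        out.append(shrunk if value >= 0 else -shrunk)
    return out


def sign_pattern(beta):
    """Sign vector of an estimate, used here as the (assumed) LASSO pattern."""
    return [(b > 0) - (b < 0) for b in beta]
```

Coefficients whose magnitude does not exceed the penalty level are set exactly to zero, so the pattern records both which coordinates are selected and the sign of each selected coordinate.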

FOS: Computer and information sciences, Statistics - Machine Learning, FOS: Mathematics, Mathematics - Statistics Theory, Machine Learning (stat.ML), [MATH] Mathematics [math], Statistics Theory (math.ST)
researchProduct

Bayesian Modeling and MCMC Computation in Linear Logistic Regression for Presence-only Data

2013

Presence-only data refer to situations in which, given a censoring mechanism, a binary response can be observed only with respect to one outcome, usually called presence. In this work we present a Bayesian approach to the problem of presence-only data based on a two-level scheme. A probability law and a case-control design are combined to handle the double source of uncertainty: one due to the censoring and one due to the sampling. We propose a new formalization for the logistic model with presence-only data that allows further insight into inferential issues related to the model. We concentrate on the case of the linear logistic regression and, in order to make inference on…

FOS: Computer and information sciences, Statistics - Other Statistics, Other Statistics (stat.OT), Statistics - Computation, Computation (stat.CO)
researchProduct

BayesVarSel: Bayesian Testing, Variable Selection and model averaging in Linear Models using R

2016

This paper introduces the R package BayesVarSel which implements objective Bayesian methodology for hypothesis testing and variable selection in linear models. The package computes posterior probabilities of the competing hypotheses/models and provides a suite of tools, specifically proposed in the literature, to properly summarize the results. Additionally, BayesVarSel is armed with functions to compute several types of model averaging estimations and predictions with weights given by the posterior probabilities. BayesVarSel contains exact algorithms to perform fast computations in problems of small to moderate size and heuristic sampling methods to solve large problems. The software is inte…

FOS: Computer and information sciences, Statistics - Other Statistics, Other Statistics (stat.OT), bepress|Physical Sciences and Mathematics|Statistics and Probability
researchProduct

Sparse and Smooth: improved guarantees for Spectral Clustering in the Dynamic Stochastic Block Model

2020

In this paper, we analyse classical variants of the Spectral Clustering (SC) algorithm in the Dynamic Stochastic Block Model (DSBM). Existing results show that, in the relatively sparse case where the expected degree grows logarithmically with the number of nodes, guarantees in the static case can be extended to the dynamic case and yield improved error bounds when the DSBM is sufficiently smooth in time, that is, the communities do not change too much between two time steps. We improve over these results by drawing a new link between the sparsity and the smoothness of the DSBM: the more regular the DSBM is, the more sparse it can be, while still guaranteeing consistent recovery. In particu…

FOS: Computer and information sciences, Statistics and Probability, Computer Science - Machine Learning, [STAT.ML]Statistics [stat]/Machine Learning [stat.ML], Statistics - Machine Learning, FOS: Mathematics, Machine Learning (stat.ML), Mathematics - Statistics Theory, Statistics Theory (math.ST), Statistics Probability and Uncertainty, [STAT.ML] Statistics [stat]/Machine Learning [stat.ML], Machine Learning (cs.LG)
researchProduct