Search results for "statistics"
showing 10 items of 1891 documents
Unbiased Estimators and Multilevel Monte Carlo
2018
Multilevel Monte Carlo (MLMC) and unbiased estimators recently proposed by McLeish (Monte Carlo Methods Appl., 2011) and Rhee and Glynn (Oper. Res., 2015) are closely related. This connection is elaborated by presenting a new general class of unbiased estimators, which admits previous debiasing schemes as special cases. New lower variance estimators are proposed, which are stratified versions of earlier unbiased schemes. Under general conditions, essentially when MLMC admits the canonical square root Monte Carlo error rate, the proposed new schemes are shown to be asymptotically as efficient as MLMC, both in terms of variance and cost. The experiments demonstrate that the variance reduction…
Pattern statistics in faro words and permutations
2021
We study the distribution and the popularity of some patterns in $k$-ary faro words, i.e. words over the alphabet $\{1, 2, \ldots, k\}$ obtained by interlacing the letters of two nondecreasing words of lengths differing by at most one. We present a bijection between these words and dispersed Dyck paths (i.e. Motzkin paths with all level steps on the $x$-axis) with a given number of peaks. We show how the bijection maps statistics of consecutive patterns of faro words into linear combinations of other pattern statistics on paths. Then, we deduce enumerative results by providing multivariate generating functions for the distribution and the popularity of patterns of length at most three. Fina…
Gaussianizing the Earth: Multidimensional Information Measures for Earth Data Analysis
2021
Information theory is an excellent framework for analyzing Earth system data because it allows us to characterize uncertainty and redundancy, and is universally interpretable. However, accurately estimating information content is challenging because spatio-temporal data is high-dimensional, heterogeneous and has non-linear characteristics. In this paper, we apply multivariate Gaussianization for probability density estimation which is robust to dimensionality, comes with statistical guarantees, and is easy to apply. In addition, this methodology allows us to estimate information-theoretic measures to characterize multivariate densities: information, entropy, total correlation, and mutual in…
Fractional Spectral Moments for Digital Simulation of Multivariate Wind Velocity Fields
2012
In this paper, a method for the digital simulation of wind velocity fields by means of the Fractional Spectral Moment function is proposed. It is shown that by constructing a digital filter whose coefficients are the fractional spectral moments, it is possible to simulate samples of the target process as a superposition of Riesz fractional derivatives of Gaussian white noise processes. The key to this simulation technique is the generalized Taylor expansion proposed by the authors. The method is extended to multivariate processes, and practical issues in the implementation of the method are reported.
Gradients of O-information: Low-order descriptors of high-order dependencies
2023
O-information is an information-theoretic metric that captures the overall balance between redundant and synergistic information shared by groups of three or more variables. To complement the global assessment provided by this metric, here we propose the gradients of the O-information as low-order descriptors that can characterise how high-order effects are localised across a system of interest. We illustrate the capabilities of the proposed framework by revealing the role of specific spins in Ising models with frustration, and through a practical analysis of US macroeconomic data. Our theoretical and empirical analyses demonstrate the potential of these gradients to highlight the contributio…
Simulation-based marginal likelihood for cluster strong lensing cosmology
2015
Comparisons between observed and predicted strong lensing properties of galaxy clusters have been routinely used to claim either tension or consistency with $\Lambda$CDM cosmology. However, standard approaches to such cosmological tests are unable to quantify the preference for one cosmology over another. We advocate approximating the relevant Bayes factor using a marginal likelihood that is based on the following summary statistic: the posterior probability distribution function for the parameters of the scaling relation between Einstein radii and cluster mass, $\alpha$ and $\beta$. We demonstrate, for the first time, a method of estimating the marginal likelihood using the X-ray selected …
Pattern Recovery in Penalized and Thresholded Estimation and its Geometry
2023
We consider the framework of penalized estimation where the penalty term is given by a real-valued polyhedral gauge, which encompasses methods such as LASSO (and many variants thereof such as the generalized LASSO), SLOPE, OSCAR, PACS and others. Each of these estimators can uncover a different structure or ``pattern'' of the unknown parameter vector. We define a general notion of patterns based on subdifferentials and formalize an approach to measure their complexity. For pattern recovery, we provide a minimal condition for a particular pattern to be detected by the procedure with positive probability, the so-called accessibility condition. Using our approach, we also introduce the stronge…
Bayesian Modeling and MCMC Computation in Linear Logistic Regression for Presence-only Data
2013
Presence-only data refer to situations in which, given a censoring mechanism, a binary response can be observed only with respect to one outcome, usually called \textit{presence}. In this work we present a Bayesian approach to the problem of presence-only data based on a two-level scheme. A probability law and a case-control design are combined to handle the double source of uncertainty: one due to the censoring and one due to the sampling. We propose a new formalization for the logistic model with presence-only data that allows further insight into inferential issues related to the model. We concentrate on the case of the linear logistic regression and, in order to make inference on…
BayesVarSel: Bayesian Testing, Variable Selection and model averaging in Linear Models using R
2016
This paper introduces the R package BayesVarSel, which implements objective Bayesian methodology for hypothesis testing and variable selection in linear models. The package computes posterior probabilities of the competing hypotheses/models and provides a suite of tools, specifically proposed in the literature, to properly summarize the results. Additionally, BayesVarSel is armed with functions to compute several types of model averaging estimations and predictions with weights given by the posterior probabilities. BayesVarSel contains exact algorithms to perform fast computations in problems of small to moderate size and heuristic sampling methods to solve large problems. The software is inte…
Sparse and Smooth: improved guarantees for Spectral Clustering in the Dynamic Stochastic Block Model
2020
In this paper, we analyse classical variants of the Spectral Clustering (SC) algorithm in the Dynamic Stochastic Block Model (DSBM). Existing results show that, in the relatively sparse case where the expected degree grows logarithmically with the number of nodes, guarantees in the static case can be extended to the dynamic case and yield improved error bounds when the DSBM is sufficiently smooth in time, that is, the communities do not change too much between two time steps. We improve over these results by drawing a new link between the sparsity and the smoothness of the DSBM: the more regular the DSBM is, the more sparse it can be, while still guaranteeing consistent recovery. In particu…