Search results for "uncertainty."
showing 10 items of 972 documents
Graphical User Interfaces for R
2012
Since R was first launched, it has gained the support of an ever-increasing percentage of academic and professional statisticians. However, the spread of its use among novice and occasional users of statistics has not progressed at the same pace, which can be attributed partially to the lack of a graphical user interface (GUI). Nevertheless, this situation has changed in recent years, and there are currently several projects that have added GUIs to R. This article briefly discusses the history of GUIs for data analysis and then introduces the papers submitted to a special issue of the Journal of Statistical Software on GUIs for R.
On delocalization of eigenvectors of random non-Hermitian matrices
2019
We study delocalization of null vectors and eigenvectors of random matrices with i.i.d. entries. Let $A$ be an $n\times n$ random matrix with i.i.d. real subgaussian entries of zero mean and unit variance. We show that with probability at least $1-e^{-\log^{2} n}$ $$ \min\limits_{I\subset[n],\,|I|= m}\|{\bf v}_I\| \geq \frac{m^{3/2}}{n^{3/2}\log^Cn}\|{\bf v}\| $$ for any real eigenvector ${\bf v}$ and any $m\in[\log^C n,n]$, where ${\bf v}_I$ denotes the restriction of ${\bf v}$ to $I$. Further, when the entries of $A$ are complex, with i.i.d. real and imaginary parts, we show that with probability at least $1-e^{-\log^{2} n}$ all eigenvectors of $A$ are delocalized in the sense that $$ \min\l…
A Phase Transition for Large Values of Bifurcating Autoregressive Models
2019
We describe the asymptotic behavior of the number $Z_n[a_n,\infty)$ of individuals with a large value in a stable bifurcating autoregressive process, where $a_n\rightarrow \infty$. The study of the associated first moment is equivalent to the annealed large deviation problem of an autoregressive process in a random environment. The trajectorial behavior of $Z_n[a_n,\infty)$ is obtained by the study of the ancestral paths corresponding to the large deviation event together with the environment of the process. This study of large deviations of autoregressive processes in random environment is of independent interest and is carried out first. The estimates for bifurcating autoregressive pr…
Bayesian measures of surprise for outlier detection
2003
From a Bayesian point of view, testing whether an observation is an outlier is usually reduced to a testing problem concerning a parameter of a contaminating distribution. This requires elicitation of both (i) the contaminating distribution that generates the outlier and (ii) prior distributions on its parameters. However, very little information is typically available about how the possible outlier could have been generated. Thus easy, preliminary checks in which these assessments can often be avoided may prove useful. Several such measures of surprise are derived for outlier detection in normal models. Results are applied to several examples. Default Bayes factors, where the contaminating…
Electricity consumption prediction with functional linear regression using spline estimators
2010
A functional linear regression model linking observations of a functional response variable with measurements of an explanatory functional variable is considered. This model serves to analyse a real data set describing electricity consumption in Sardinia. The interest lies in predicting either oncoming weekends’ or oncoming weekdays’ consumption, provided actual weekdays’ consumption is known. A B-spline estimator of the functional parameter is used. Selected computational issues are addressed as well.
What subject matter questions motivate the use of machine learning approaches compared to statistical models for probability prediction?
2014
This is a discussion of the following papers: "Probability estimation with machine learning methods for dichotomous and multicategory outcome: Theory" by Jochen Kruppa, Yufeng Liu, Gerard Biau, Michael Kohler, Inke R. Konig, James D. Malley, and Andreas Ziegler; and "Probability estimation with machine learning methods for dichotomous and multicategory outcome: Applications" by Jochen Kruppa, Yufeng Liu, Hans-Christian Diener, Theresa Holste, Christian Weimar, Inke R. Konig, and Andreas Ziegler.
cglasso: An R Package for Conditional Graphical Lasso Inference with Censored and Missing Values
2023
Sparse graphical models have revolutionized multivariate inference. With the advent of high-dimensional multivariate data in many applied fields, these methods are able to detect a much lower-dimensional structure, often represented via a sparse conditional independence graph. There have been numerous extensions of such methods in the past decade. Many practical applications have additional covariates or suffer from missing or censored data. Despite the development of these extensions of sparse inference methods for graphical models, there have been so far no implementations for, e.g., conditional graphical models. Here we present the general-purpose package cglasso for estimating sparse co…
Estimation of total electricity consumption curves by sampling in a finite population when some trajectories are partially unobserved
2019
Millions of smart meters that are able to collect individual load curves, that is, electricity consumption time series, of residential and business customers at fine scale time grids are now deployed by electricity companies all around the world. It may be complex and costly to transmit and exploit such a large quantity of information; it can therefore be relevant to use survey sampling techniques to estimate mean load curves of specific groups of customers. Data collection, like every mass process, may undergo technical problems at every point of the metering and collection chain, resulting in missing values. We consider imputation approaches (linear interpolation, k…
Basing the Analysis of Comparative Bioavailability Trials on an Individualized Statistical Definition of Equivalence
1993
The conventional definition of bioequivalence in terms of population means only is criticized for lacking relevance to the individual subject. Both approaches to bioequivalence assessment proposed here for avoiding this shortcoming focus on the probability of an event induced by the response of a randomly selected subject to two formulations of a given active agent. The first approach leads to converting the basic idea underlying the well-known 75-rule into an exact statistical procedure. The second approach is of a parametric nature. It reduces bioequivalence assessment to testing against the alternative hypothesis that the standardized expected value of a Gaussian distribution is contai…
A Bayesian comparison of cluster, strata, and random samples
1999
When sampling from finite populations, simple random sampling (SRS) is rarely used in practice, due to either high cost or information to be gained from more efficient designs. Bayesian hierarchical models are a natural framework to model the non-randomness in the sample. This paper concentrates on the effects that the design has on inference about characteristics of the finite population, and makes a critical comparison among some common designs.