Search results for "probability"
showing 10 items of 3417 documents
Sparse relative risk regression models
2020
Summary: Clinical studies in which patients are routinely screened for many genomic features are becoming more common. In principle, this holds the promise of finding genomic signatures for a particular disease. In particular, cancer survival is thought to be closely linked to the genomic constitution of the tumor. Discovering such signatures will be useful in the diagnosis of the patient, may be used for treatment decisions and, perhaps, even the development of new treatments. However, genomic data are typically noisy and high-dimensional, with the number of features often outstripping the number of patients included in the study. Regularized survival models have been proposed to deal with such scenarios…
A fast and recursive algorithm for clustering large datasets with k-medians
2012
Clustering large samples of high-dimensional data with fast algorithms is an important challenge in computational statistics. Borrowing ideas from MacQueen (1967), who introduced a sequential version of the $k$-means algorithm, a new class of recursive stochastic gradient algorithms designed for the $k$-medians loss criterion is proposed. By their recursive nature, these algorithms are very fast and are well adapted to deal with large samples of data that are allowed to arrive sequentially. It is proved that the stochastic gradient algorithm converges almost surely to the set of stationary points of the underlying loss criterion. Particular attention is paid to the averaged versions, which…
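The sequential update described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `sequential_k_medians`, the step-size rule `step / n_j**power`, and the way the running average is kept are all assumptions made for the example.

```python
import math

def sequential_k_medians(points, init_centers, step=1.0, power=0.75):
    """One pass of a stochastic-gradient k-medians update (illustrative sketch).

    Each arriving point moves only its nearest center, by a step of length
    step / n_j**power along the unit vector from that center to the point,
    where n_j counts the points assigned to center j so far (assumed rule).
    Returns (centers, averaged_centers).
    """
    centers = [list(c) for c in init_centers]
    averaged = [list(c) for c in init_centers]
    counts = [0] * len(centers)
    for x in points:
        # Index of the nearest center in Euclidean distance.
        j = min(range(len(centers)), key=lambda i: math.dist(x, centers[i]))
        d = math.dist(x, centers[j])
        counts[j] += 1
        if d > 0.0:
            a = step / counts[j] ** power
            # Gradient step for the k-medians loss: move along (x - c) / ||x - c||.
            centers[j] = [c + a * (xi - c) / d for c, xi in zip(centers[j], x)]
        # Running average of the iterates of center j (the "averaged" version).
        averaged[j] = [m + (c - m) / counts[j]
                       for m, c in zip(averaged[j], centers[j])]
    return centers, averaged
```

Because each point is processed once and then discarded, the sketch works on data that arrive one at a time; unlike a $k$-means update, the step length does not grow with the distance to the point, which is what makes the criterion median-like.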
A Comment on the Coefficient of Determination for Binary Responses
1992
Abstract: Linear logistic or probit regression can be closely approximated by an unweighted least squares analysis of the regression linear in the conditional probabilities, provided that these probabilities for success and failure are not too extreme. It is shown how this restriction on the probabilities translates into a restriction on the range of the coefficient of determination $R^2$ so that, as a consequence, $R^2$ is not suitable to judge the effectiveness of linear regressions with binary responses even if an important relation is present.
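A small numerical illustration of why bounded probabilities cap $R^2$, using the law of total variance for a binary response: $\mathrm{Var}(Y) = \mathrm{Var}(p(X)) + E[p(X)(1-p(X))]$. The function name `population_r2` and the two-point design below are assumptions chosen for the example, not taken from the paper.

```python
def population_r2(probs, weights):
    """Population R^2 of the true regression E[Y | X] = p(X) for binary Y.

    probs:   success probabilities p(x) at each design point
    weights: probability mass of each design point (should sum to 1)
    """
    mean_p = sum(w * p for p, w in zip(probs, weights))
    # Variance explained by the regression function p(X).
    var_between = sum(w * (p - mean_p) ** 2 for p, w in zip(probs, weights))
    # Irreducible Bernoulli variance E[p(X) (1 - p(X))].
    var_within = sum(w * p * (1.0 - p) for p, w in zip(probs, weights))
    return var_between / (var_between + var_within)
```

For a hypothetical design with success probabilities 0.2 and 0.8, each with mass 1/2, `population_r2([0.2, 0.8], [0.5, 0.5])` gives about 0.36, even though the regression function is known exactly and the two groups differ substantially.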
Correlated randomness and switching phenomena
2010
One challenge of biology, medicine, and economics is that the systems treated by these serious scientific disciplines have no perfect metronome in time and no perfect spatial architecture—crystalline or otherwise. Nonetheless, as if by magic, out of nothing but randomness one finds remarkably fine-tuned processes in time and remarkably fine-tuned structures in space. Further, many of these processes and structures have the remarkable feature of “switching” from one behavior to another as if by magic. The past century has, philosophically, been concerned with placing aside the human tendency to see the universe as a fine-tuned machine. Here we will address the challenge of uncovering how, th…
An interest rates cluster analysis
2004
An empirical analysis of interest rates in money and capital markets is performed. We investigate a set of 34 different weekly interest rate time series over a period of 16 years between 1982 and 1997. Our study focuses on the collective behavior of the stochastic fluctuations of these time series, which is investigated using a clustering linkage procedure. Without any a priori assumption, we identify a meaningful separation into 6 main clusters organized in a hierarchical structure.
Bayesian subset selection for additive and linear loss function
1979
Given k independent samples of common size n from k populations π1,…,πk with distribution the problem is to select a non-empty subset from {π1,…,πk}, which is associated with "good" (large) θ-values. We consider this problem from a Bayesian approach. By choosing additive and especially linear loss functions we try to fill a gap lying in between the results of Deely and Gupta (1968) and more recent papers due to Goel and Rubin (1977), Gupta and Hsu (1978) and other authors. It is shown that under a certain "normal model" Seal's procedure turns out to be Bayes w.r.t. an unrealistic loss function, whereas Gupta's maximum means procedure turns out to be (for large n) asymptotically Bayes w.r. …
The asymptotic covariance matrix of the Oja median
2003
The Oja median, based on a sample of multivariate data, is an affine equivariant estimate of the centre of the distribution. It reduces to the sample median in one dimension and has several nice robustness and efficiency properties. We develop different representations of its asymptotic variance and discuss ways to estimate this quantity. We consider symmetric multivariate models and also the more narrow elliptical models. A small simulation study is included to compare finite sample results to the asymptotic formulas.
Random Logistic Maps II. The Critical Case
2003
Let $(X_n)_0^\infty$ be a Markov chain with state space $S=[0,1]$ generated by the iteration of i.i.d. random logistic maps, i.e., $X_{n+1}=C_{n+1}X_n(1-X_n)$, $n\ge 0$, where $(C_n)_1^\infty$ are i.i.d. random variables with values in $[0,4]$ and independent of $X_0$. In the critical case, i.e., when $E(\log C_1)=0$, Athreya and Dai (2) have shown that $X_n \xrightarrow{P} 0$. In this paper it is shown that if $P(C_1=1)<1$ and $E(\log C_1)=0$ then (i) $X_n$ does not go to zero with probability one (w.p.1) and in fact, there exists a $0<\beta<1$ and a countable set $\Delta\subset(0,1)$ such that for all $x\in A := (0,1)\setminus\Delta$, $P_x(X_n\ge\beta \text{ for infinitely many } n\ge 1)=1$, where $P_x$ stands for the probability distribution of $(X_n)_0^\infty$ with $X_0=x$ w.p.1. $A$ is a closed set for ($X_n$…
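The iteration in this abstract is easy to simulate. The sketch below assumes one particular critical-case law for illustration only: $C$ equals 1/2 or 2 with equal probability, so that $E(\log C_1)=0$ and $P(C_1=1)<1$. The function names, the starting point 0.3, and the threshold 0.1 are hypothetical and are not the $\beta$ of the paper.

```python
import random

def simulate_random_logistic(x0, n_steps, draw_c, rng):
    """Iterate X_{n+1} = C_{n+1} * X_n * (1 - X_n); return [X_0, ..., X_n]."""
    path = [x0]
    x = x0
    for _ in range(n_steps):
        c = draw_c(rng)          # fresh i.i.d. draw of C_{n+1}
        x = c * x * (1.0 - x)    # stays in [0, 1] because 0 <= C <= 4
        path.append(x)
    return path

def draw_critical_c(rng):
    """Assumed critical-case law: C = 1/2 or 2 with equal probability."""
    return rng.choice((0.5, 2.0))

rng = random.Random(0)
path = simulate_random_logistic(0.3, 1000, draw_critical_c, rng)
# Count how often the path sits at or above an arbitrary threshold.
visits = sum(1 for x in path if x >= 0.1)
print(visits)
```

A single finite path cannot confirm an "infinitely often" statement; the count only gives a feel for how the chain repeatedly returns from small values under this assumed law.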
A GALTON-WATSON BRANCHING PROCESS IN VARYING ENVIRONMENTS WITH ESSENTIALLY CONSTANT OFFSPRING MEANS AND TWO RATES OF GROWTH
1983
Summary: A Galton-Watson process in varying environments $(Z_n)$, with essentially constant offspring means, i.e. $E(Z_n)/m^n \to \alpha \in (0,\infty)$, and exactly two rates of growth is constructed. The underlying sample space $\Omega$ can be decomposed into parts $A$ and $B$ such that $(Z_n)_n$ grows like $2^n$ on $A$ and like $m^n$ on $B$ ($m>4$).
A Unified Approach to Likelihood Inference on Stochastic Orderings in a Nonparametric Context
1998
Abstract: For data in a two-way contingency table with ordered margins, we consider various hypotheses of stochastic orders among the conditional distributions considered by rows and show that each is equivalent to requiring that an invertible transformation of the vectors of conditional row probabilities satisfies an appropriate set of linear inequalities. This leads to the construction of a general algorithm for maximum likelihood estimation under multinomial sampling and provides a simple framework for deriving the asymptotic distribution of log-likelihood ratio tests. The usual stochastic ordering and the so-called uniform and likelihood ratio orderings are considered as special cases. I…