Search results for "Matrix"
showing 10 items of 3205 documents
Online Principal Component Analysis in High Dimension: Which Algorithm to Choose?
2017
Principal component analysis (PCA) is a method of choice for dimension reduction. In the current context of data explosion, online techniques that do not require storing all data in memory are indispensable to perform the PCA of streaming data and/or massive data. Despite the wide availability of recursive algorithms that can efficiently update the PCA when new data are observed, the literature offers little guidance on how to select a suitable algorithm for a given application. This paper reviews the main approaches to online PCA, namely, perturbation techniques, incremental methods and stochastic optimisation, and compares the most widely employed techniques in terms of statistical a…
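To make the "stochastic optimisation" family concrete, here is a minimal sketch of one classical member, Oja's rule, for the leading principal direction only. It is not taken from the paper: the function names, the 1/n step size and the toy data stream are all assumptions chosen for illustration, and the data are assumed to be centred already.

```python
import math
import random


def oja_first_component(stream, dim, step=lambda n: 1.0 / n):
    """Estimate the leading principal direction of a centred data stream.

    Each observation is used once and then discarded, so memory use is O(dim).
    The step-size schedule is an illustrative assumption.
    """
    w = [1.0 / math.sqrt(dim)] * dim  # arbitrary unit-norm starting vector
    for n, x in enumerate(stream, start=1):
        proj = sum(xi * wi for xi, wi in zip(x, w))  # scalar projection x^T w
        gamma = step(n)
        # Stochastic gradient step: w <- w + gamma * (x x^T) w
        w = [wi + gamma * proj * xi for wi, xi in zip(w, x)]
        norm = math.sqrt(sum(wi * wi for wi in w))
        w = [wi / norm for wi in w]  # renormalise to unit length
    return w


def toy_stream(n_obs, seed=0):
    """Hypothetical centred 2-D stream with most variance on the first axis."""
    rng = random.Random(seed)
    for _ in range(n_obs):
        yield [rng.gauss(0.0, 3.0), rng.gauss(0.0, 0.5)]


w = oja_first_component(toy_stream(2000), dim=2)
```

On this toy stream the estimate should align, up to sign, with the first coordinate axis. Perturbation and incremental methods update an eigendecomposition instead and are not shown here.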
Blind Source Separation Based on Joint Diagonalization in R: The Packages JADE and BSSasymp
2017
Blind source separation (BSS) is a well-known signal processing tool which is used to solve practical data analysis problems in various fields of science. In BSS, we assume that the observed data consists of linear mixtures of latent variables. The mixing system and the distributions of the latent variables are unknown. The aim is to find an estimate of an unmixing matrix which then transforms the observed data back to latent sources. In this paper we present the R packages JADE and BSSasymp. The package JADE offers several BSS methods which are based on joint diagonalization. Package BSSasymp contains functions for computing the asymptotic covariance matrices as well as their data-based es…
Fast Estimation of the Median Covariation Matrix with Application to Online Robust Principal Components Analysis
2017
The geometric median covariation matrix is a robust multivariate indicator of dispersion which can be extended without any difficulty to functional data. We define estimators, based on recursive algorithms, that can be simply updated at each new observation and are able to deal rapidly with large samples of high dimensional data without being obliged to store all the data in memory. Asymptotic convergence properties of the recursive algorithms are studied under weak conditions. The computation of the principal components can also be performed online and this approach can be useful for online outlier detection. A simulation study clearly shows that this robust indicat…
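The idea of an estimator that is "simply updated at each new observation" can be sketched on a simpler object than the median covariation matrix, namely the geometric median of vectors. The averaged stochastic gradient recursion below is an assumption made for illustration. The abstract does not spell out the paper's estimators, and the function name, the step constants c and alpha, and the contaminated toy stream are all hypothetical.

```python
import math
import random


def recursive_median(stream, dim, c=1.0, alpha=0.66):
    """Averaged stochastic gradient estimate of the geometric median.

    Only the current iterate and its running average are stored.
    The step size c / n**alpha is an illustrative assumption.
    """
    m = [0.0] * dim      # current stochastic gradient iterate
    m_bar = [0.0] * dim  # running average of the iterates
    for n, x in enumerate(stream, start=1):
        diff = [xi - mi for xi, mi in zip(x, m)]
        dist = math.sqrt(sum(d * d for d in diff))
        if dist > 0.0:
            gamma = c / n ** alpha
            # Move a distance gamma towards x: the step length is bounded,
            # so a single outlier cannot drag the estimate far.
            m = [mi + gamma * d / dist for mi, d in zip(m, diff)]
        m_bar = [b + (mi - b) / n for b, mi in zip(m_bar, m)]
    return m_bar


def contaminated_stream(n_obs, seed=1):
    """Hypothetical stream centred at (1, -2) with about 5% gross outliers."""
    rng = random.Random(seed)
    for _ in range(n_obs):
        if rng.random() < 0.05:
            yield [50.0, 50.0]
        else:
            yield [rng.gauss(1.0, 1.0), rng.gauss(-2.0, 1.0)]


median_estimate = recursive_median(contaminated_stream(5000), dim=2)
```

On this stream the estimate should stay close to (1, -2), whereas the ordinary mean would be pulled noticeably towards the outliers.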
The rank of random regular digraphs of constant degree
2018
Let d be a (large) integer. Given n ≥ 2d, let A_n be the adjacency matrix of a random directed d-regular graph on n vertices, with the uniform distribution. We show that the rank of A_n is at least n − 1 with probability going to one as n grows to infinity. The proof combines the well-known method of simple switchings and a recent result of the authors on delocalization of eigenvectors of A_n.
Archetypoids: A new approach to define representative archetypal data
2015
The new concept archetypoids is introduced. Archetypoid analysis represents each observation in a dataset as a mixture of actual observations in the dataset, which are pure type or archetypoids. Unlike archetype analysis, archetypoids are real observations, not a mixture of observations. This is relevant when existing archetypal observations are needed, rather than fictitious ones. An algorithm is proposed to find them and some of their theoretical properties are introduced. It is also shown how they can be obtained when only dissimilarities between observations are known (features are unavailable). Archetypoid analysis is illustrated in two design problems and several examples, compar…
Sign and rank covariance matrices
2000
The robust estimation of multivariate location and shape is one of the most challenging problems in statistics and crucial in many application areas. The objective is to find highly efficient, robust, computable and affine equivariant location and covariance matrix estimates. In this paper, three different concepts of multivariate sign and rank are considered and their ability to carry information about the geometry of the underlying distribution (or data cloud) is discussed. New techniques for robust covariance matrix estimation based on different sign and rank concepts are proposed and algorithms for computing them outlined. In addition, new tools for evaluating the qualitative and quant…
The affine equivariant sign covariance matrix: asymptotic behavior and efficiencies
2003
We consider the affine equivariant sign covariance matrix (SCM) introduced by Visuri et al. (J. Statist. Plann. Inference 91 (2000) 557). The population SCM is shown to be proportional to the inverse of the regular covariance matrix. The eigenvectors and standardized eigenvalues of the covariance matrix can thus be derived from the SCM. We also construct an estimate of the covariance and correlation matrix based on the SCM. The influence functions and limiting distributions of the SCM and its eigenvectors and eigenvalues are found. Limiting efficiencies are given in multivariate normal and t-distribution cases. The estimates are highly efficient in the multivariate normal case and perform …
Inference based on the affine invariant multivariate Mann–Whitney–Wilcoxon statistic
2003
A new affine invariant multivariate analogue of the two-sample Mann–Whitney–Wilcoxon test based on the Oja criterion function is introduced. The associated affine equivariant estimate of shift, the multivariate Hodges-Lehmann estimate, is also considered. Asymptotic theory is developed to provide approximations for the null distribution as well as for a sequence of contiguous alternatives to consider limiting efficiencies of the test and estimate. The theory is illustrated by an example. Hettmansperger et al. [9] considered alternative, slightly different affine invariant extensions also based on the Oja criterion. The methods proposed in this paper are computationally more intensive, but surpri…
Cotas inferiores para el QAP-Arbol
1985
The Tree-QAP is a special case of the Quadratic Assignment Problem where the nonzero flows form a tree. No condition is required for the distance matrix. In this paper we present an integer programming formulation for the Tree-QAP. We use this formulation to construct four Lagrangean relaxations that produce several lower bounds for this problem. To solve one of the relaxed problems we present a Dynamic Programming algorithm which is a generalization of the algorithm of this type that gives a lower bound for the Travelling Salesman Problem. A comparison is given between the lower bounds obtained by each relaxation for examples of size 12 to 25.
Symmetrised M-estimators of multivariate scatter
2007
In this paper we introduce a family of symmetrised M-estimators of multivariate scatter. These are defined to be M-estimators only computed on pairwise differences of the observed multivariate data. Symmetrised Huber's M-estimator and Dümbgen's estimator serve as our examples. The influence functions of the symmetrised M-functionals are derived and the limiting distributions of the estimators are discussed in the multivariate elliptical case to consider the robustness and efficiency properties of estimators. The symmetrised M-estimators have the important independence property; they can therefore be used to find the independent components in the independent component analysis (ICA).
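What "computed on pairwise differences" means can be illustrated with the simplest scatter functional, the ordinary covariance matrix, rather than with Huber's or Dümbgen's estimator. The sketch below is an illustration under that assumption, with hypothetical function names. It relies on the identity that the average of (x_i − x_j)(x_i − x_j)^T over all pairs i < j equals twice the unbiased sample covariance, so no location estimate is needed.

```python
from itertools import combinations
import random


def pairwise_scatter(data):
    """Scatter matrix computed from pairwise differences only (no mean needed)."""
    dim = len(data[0])
    total = [[0.0] * dim for _ in range(dim)]
    n_pairs = 0
    for x, y in combinations(data, 2):
        d = [a - b for a, b in zip(x, y)]
        for i in range(dim):
            for j in range(dim):
                total[i][j] += d[i] * d[j]
        n_pairs += 1
    # Dividing by 2 * n_pairs recovers the unbiased sample covariance.
    return [[v / (2.0 * n_pairs) for v in row] for row in total]


def sample_covariance(data):
    """Usual unbiased sample covariance, centred at the sample mean."""
    n, dim = len(data), len(data[0])
    mean = [sum(x[i] for x in data) / n for i in range(dim)]
    return [
        [sum((x[i] - mean[i]) * (x[j] - mean[j]) for x in data) / (n - 1)
         for j in range(dim)]
        for i in range(dim)
    ]


rng = random.Random(0)
data = [[rng.gauss(0.0, 1.0), rng.gauss(5.0, 2.0)] for _ in range(50)]
sym = pairwise_scatter(data)
cov = sample_covariance(data)
```

The two matrices agree up to rounding error. A symmetrised M-estimator replaces the plain average over pairs with a weighted, iteratively solved one, which is not shown here.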