
RESEARCH PRODUCT

STATIS and DISTATIS: optimum multitable principal component analysis and three way metric multidimensional scaling

Dominique Valentin, Hervé Abdi, Mohammed Bennani-Dosse, Lynne J. Williams

subject

Statistics and Probability; Mathematical optimization; Similarity (geometry); [STAT.TH] Statistics [stat]/Statistics Theory [stat.TH]; Linear discriminant analysis; computer.software_genre; 01 natural sciences; Correspondence analysis; Set (abstract data type); 010104 statistics & probability; 03 medical and health sciences; 0302 clinical medicine; [MATH.MATH-ST] Mathematics [math]/Statistics [math.ST]; Multiple factor analysis; Principal component analysis; Metric (mathematics); Data mining; Multidimensional scaling; 0101 mathematics; computer; 030217 neurology & neurosurgery; ComputingMilieux_MISCELLANEOUS; Mathematics

description

STATIS is an extension of principal component analysis (PCA) tailored to handle multiple data tables that measure sets of variables collected on the same observations or, in a variant called dual-STATIS, multiple data tables in which the same variables are measured on different sets of observations. STATIS proceeds in two steps. First, it analyzes the similarity structure between the data tables and derives from this analysis an optimal set of weights; these weights are used to compute a linear combination of the data tables, called the compromise, that best represents the information common to the different tables. Second, the PCA of this compromise gives an optimal map of the observations. Each data table also provides a map of the observations in the same space as the optimum compromise map. In this article, we present STATIS, explain the criteria that it optimizes, review the recent inferential extensions to STATIS, and illustrate it with a detailed example.
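The two steps described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: it assumes the commonly used formulation in which each table is column-centered, turned into a cross-product matrix, and scaled to unit Frobenius norm, so that the between-table similarity matrix holds RV coefficients. The function name `statis`, the parameter `n_components`, and the random demo tables are hypothetical and introduced only for this example.

```python
import numpy as np

def statis(tables, n_components=2):
    """Illustrative STATIS sketch (assumed RV-coefficient formulation)."""
    # Step 0: one cross-product matrix per table (observations x observations).
    cross = []
    for X in tables:
        X = np.asarray(X, dtype=float)
        X = X - X.mean(axis=0)          # center each variable
        S = X @ X.T
        S = S / np.linalg.norm(S)       # unit Frobenius norm (an assumption)
        cross.append(S)

    # Step 1: between-table similarity and weights.
    # With unit-norm S_k, sum(S_k * S_l) = trace(S_k S_l) is the RV coefficient.
    C = np.array([[np.sum(a * b) for b in cross] for a in cross])
    _, evecs = np.linalg.eigh(C)        # eigenvalues in ascending order
    u = evecs[:, -1]                    # first (dominant) eigenvector
    u = u * np.sign(u.sum())            # fix the arbitrary sign
    alpha = u / u.sum()                 # weights rescaled to sum to 1

    # The compromise is the weighted sum of the cross-product matrices.
    compromise = sum(a * S for a, S in zip(alpha, cross))

    # Step 2: PCA (eigendecomposition) of the compromise.
    lam, V = np.linalg.eigh(compromise)
    order = np.argsort(lam)[::-1][:n_components]
    lam, V = lam[order], V[:, order]
    F = V * np.sqrt(lam)                # compromise map of the observations

    # Each table projected into the compromise space (partial maps).
    partial = [S @ V / np.sqrt(lam) for S in cross]
    return alpha, F, partial

# Hypothetical demo: three tables describing the same 6 observations.
rng = np.random.default_rng(0)
demo_tables = [rng.normal(size=(6, p)) for p in (3, 4, 2)]
alpha, F, partial = statis(demo_tables)
```

Under these assumptions, the compromise map `F` is the weighted average of the partial maps, with the same weights `alpha` used to build the compromise, which is why each table's map can be read in the same space as the compromise map.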

10.1002/wics.198

https://hal.science/hal-00955809