
RESEARCH PRODUCT

Differential geometric least angle regression: a differential geometric approach to sparse generalized linear models

Luigi Augugliaro, Ernst Wit, Angelo Mineo

subject

Statistics and Probability; Generalized linear models; Sparse models; Mathematical optimization; Variable selection; Path-following algorithms; Lasso (statistics); Dantzig selector; Exponential family; Differential geometry; Information geometry; Coordinate descent; Fisher information; Least angle regression; Generalized degrees of freedom; Shrinkage; Statistics, Probability and Uncertainty; Simple linear regression; Settore SECS-S/01 - Statistica; Algorithms; Covariance penalty theory

description

Sparsity is an essential feature of many contemporary data problems. Remote sensing, various forms of automated screening and other high-throughput measurement devices collect a large amount of information, typically on relatively few independent statistical subjects or units. In certain cases it is reasonable to assume that the underlying process generating the data is itself sparse, in the sense that only a few of the measured variables are involved in the process. We propose an explicit method of monotonically decreasing sparsity for outcomes that can be modelled by an exponential family, generalizing the equiangular condition of least angle regression to the generalized linear model setting. Although the geometry involves the Fisher information in a way that is not obvious in the simple regression setting, the equiangular condition turns out to be equivalent to an intuitive condition imposed on the Rao score test statistics. In certain special cases the method can be tweaked to obtain L1-penalized generalized linear model solution paths, but the method itself defines sparsity more directly. Although the computation of the solution paths is not trivial, the method compares favourably with other path-following algorithms.
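The abstract states that the generalized equiangular condition is equivalent to an intuitive condition on the Rao score test statistics: along the path, the variables in the active set share a common absolute score statistic that bounds those of the inactive variables. The following minimal sketch (assuming a logistic GLM with canonical link; the function name `rao_score_stats` and the simulated data are illustrative, not the authors' implementation, which is available in the R package `dglars`) shows how the signed Rao score statistics can be computed and used to pick the first variable entering the path at the null model:

```python
import numpy as np

def rao_score_stats(X, y, mu):
    """Signed Rao score statistic of each candidate predictor for a
    logistic GLM with canonical link, given current fitted means mu.
    Illustrative helper, not the dgLARS algorithm itself."""
    w = mu * (1.0 - mu)                       # Bernoulli variance function
    u = X.T @ (y - mu)                        # score vector (log-likelihood gradient)
    info = np.einsum('ij,i,ij->j', X, w, X)   # diagonal of the Fisher information
    return u / np.sqrt(info)

# Simulated data: only predictors 0 and 3 are truly in the model.
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 5))
beta_true = np.array([2.0, 0.0, 0.0, -1.5, 0.0])
p = 1.0 / (1.0 + np.exp(-X @ beta_true))
y = (rng.uniform(size=200) < p).astype(float)

# At the intercept-only model mu is constant; the path starts with the
# predictor attaining the largest absolute Rao score statistic.
mu0 = np.full(200, y.mean())
r = rao_score_stats(X, y, mu0)
first_in = int(np.argmax(np.abs(r)))
```

The path-following algorithm then decreases this common score level continuously, admitting a new variable whenever its absolute score statistic reaches that of the active set.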

doi: 10.1111/rssb.12000
http://hdl.handle.net/10447/77933