Sparse relative risk regression models

Hassan Pazira, Ernst Wit, Fentaw Abegaz, Javier González, Luigi Augugliaro

subject

Statistics and Probability; Clustering high-dimensional data; Computer science; dgLARS; Inference; Scale (descriptive set theory); Biostatistics; Machine learning; Risk Assessment; 01 natural sciences; Regularization (mathematics); Relative risk regression model; 010104 statistics & probability; 03 medical and health sciences; Neoplasms; Covariate; Humans; Computer Simulation; 0101 mathematics; Online Only Articles; Survival analysis; 030304 developmental biology; 0303 health sciences; Models, Statistical; Least-angle regression; Regression analysis; General Medicine; High-dimensional data; Gene expression data; Artificial intelligence; Statistics, Probability and Uncertainty; Settore SECS-S/01 - Statistica; Sparsity

description

Summary: Clinical studies in which patients are routinely screened for many genomic features are becoming more common. In principle, this holds the promise of finding genomic signatures for a particular disease. In particular, cancer survival is thought to be closely linked to the genomic constitution of the tumor. Discovering such signatures would be useful in diagnosing the patient, could inform treatment decisions and, perhaps, could even aid the development of new treatments. However, genomic data are typically noisy and high-dimensional, with the number of features often exceeding the number of patients included in the study. Regularized survival models have been proposed to deal with such scenarios. These methods typically induce sparsity by means of a coincidental match of the geometry of the convex likelihood and a (near) non-convex regularizer. Such methods have three disadvantages: they are typically not invariant to scale changes of the covariates, they struggle with highly correlated covariates, and they pose the practical problem of choosing the amount of regularization. In this article, we propose an extension of the differential geometric least angle regression method for sparse inference in relative risk regression models. A software implementation of our method is available on GitHub (https://github.com/LuigiAugugliaro/dgcox).
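
The scale non-invariance mentioned in the summary can be illustrated with a minimal sketch. This is not the dgLARS method proposed in the article, and it does not use the dgcox package; it assumes an ordinary L1-penalized Cox partial likelihood with no tied event times, simulated data, and hypothetical function names chosen for illustration. Rescaling a covariate (for example, changing its units) while rescaling its coefficient inversely leaves the partial likelihood unchanged, but it changes the L1 penalty, so the penalized fit depends on the units chosen.

```python
import numpy as np

def cox_neg_log_partial_likelihood(beta, X, time, event):
    """Negative Cox partial log-likelihood (assumes no tied event times)."""
    eta = X @ beta
    order = np.argsort(-time)            # sort by decreasing follow-up time
    eta, event = eta[order], event[order]
    # Running log-sum-exp: entry i covers every subject still at risk at time_i.
    log_risk = np.logaddexp.accumulate(eta)
    return -np.sum((eta - log_risk)[event == 1])

def lasso_objective(beta, X, time, event, lam):
    """Hypothetical helper: partial likelihood plus an L1 penalty."""
    return cox_neg_log_partial_likelihood(beta, X, time, event) + lam * np.sum(np.abs(beta))

# Simulated data, for illustration only.
rng = np.random.default_rng(0)
n, p = 50, 3
X = rng.normal(size=(n, p))
time = rng.exponential(size=n)
event = rng.integers(0, 2, size=n)       # 1 = event observed, 0 = censored
beta = np.array([0.5, -1.0, 0.0])

# Change the units of the first covariate by a factor of 10.
scale = np.array([10.0, 1.0, 1.0])
X_rescaled = X * scale
beta_rescaled = beta / scale             # same linear predictor as before

nll_a = cox_neg_log_partial_likelihood(beta, X, time, event)
nll_b = cox_neg_log_partial_likelihood(beta_rescaled, X_rescaled, time, event)
obj_a = lasso_objective(beta, X, time, event, lam=1.0)
obj_b = lasso_objective(beta_rescaled, X_rescaled, time, event, lam=1.0)

print("partial likelihood unchanged:", np.isclose(nll_a, nll_b))
print("penalized objective difference:", obj_a - obj_b)
```

In practice this is why covariates are usually standardized before fitting such penalized models; the article instead motivates a method that does not depend on the covariate scale.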

year	journal
2020-04-01	Biostatistics

10.1093/biostatistics/kxy060 https://hdl.handle.net/11370/1175bca5-a60f-4e63-b256-58210fb10e5d