Search results for "Variable selection"
showing 10 items of 11 documents
A graphical model selection tool for mixed models
2017
Model selection can be defined as the task of estimating the performance of different models in order to choose the most parsimonious one among a potentially very large set of candidate statistical models. We propose a graphical representation that extends to the class of mixed models the deviance plot proposed in the literature for classical and generalized linear models. Once a reduced number of models has been selected, this graphical representation makes it possible to identify important covariates by focusing only on the fixed-effects component, assuming the random part is properly specified. Nevertheless, we suggest also a standalone figure representing th…
Differential geometric LARS via cyclic coordinate descent method
2012
We address the problem of how to compute the coefficient path implicitly defined by the differential geometric LARS (dgLARS) method in a high-dimensional setting. Although the geometrical theory developed to define the dgLARS method does not require the definition of a penalty function, we show that it is possible to develop a cyclic coordinate descent algorithm to compute the solution curve in a high-dimensional setting. Simulation studies show that the proposed algorithm is significantly faster than the predictor-corrector algorithm originally developed to compute the dgLARS solution curve.
Using the dglars Package to Estimate a Sparse Generalized Linear Model
2015
dglars is a publicly available R package that implements the method proposed in Augugliaro et al. (J. R. Statist. Soc. B 75(3), 471-498, 2013) developed to study the sparse structure of a generalized linear model (GLM). This method, called dgLARS, is based on a differential geometrical extension of the least angle regression method. The core of the dglars package consists of two algorithms implemented in Fortran 90 to efficiently compute the solution curve.
Applying differential geometric LARS algorithm to ultra-high dimensional feature space
2009
Variable selection is fundamental in high-dimensional statistical modeling. Many techniques to select relevant variables in generalized linear models are based on a penalized likelihood approach. In a recent paper, Fan and Lv (2008) proposed a sure independence screening (SIS) method to select relevant variables in a linear regression model defined on an ultrahigh-dimensional feature space. The aim of this paper is to define a generalization of the SIS method for generalized linear models based on a differential geometric approach.
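As a rough illustration of the screening idea mentioned in this abstract, the sketch below ranks predictors by their absolute marginal correlation with the response and keeps the top d, which is the basic form of SIS for a linear model. It is not the differential geometric generalization proposed in the paper, and the function names (marginal_correlations, sis_screen) and the toy data are hypothetical, chosen only for this example.

```python
import math


def marginal_correlations(X, y):
    """Absolute Pearson correlation between each column of X and y.

    X is a list of rows (lists of floats); y is a list of floats.
    Constant columns (or a constant response) get a score of 0.0.
    """
    n = len(y)
    y_mean = sum(y) / n
    y_centered = [v - y_mean for v in y]
    y_norm = math.sqrt(sum(v * v for v in y_centered))
    scores = []
    for j in range(len(X[0])):
        col = [row[j] for row in X]
        c_mean = sum(col) / n
        c_centered = [v - c_mean for v in col]
        c_norm = math.sqrt(sum(v * v for v in c_centered))
        if c_norm == 0 or y_norm == 0:
            scores.append(0.0)
            continue
        dot = sum(a * b for a, b in zip(c_centered, y_centered))
        scores.append(abs(dot / (c_norm * y_norm)))
    return scores


def sis_screen(X, y, d):
    """Return the (sorted) indices of the d predictors with the largest
    absolute marginal correlation with the response."""
    scores = marginal_correlations(X, y)
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:d])


# Toy data (made up): the response depends only on the first column.
X = [[1, 5, 0], [2, 3, 1], [3, 6, 0], [4, 2, 1], [5, 4, 0]]
y = [2, 4, 6, 8, 10]
selected = sis_screen(X, y, d=1)
```

In practice the screened set would then be passed to a more refined selection method; the choice of d here is arbitrary.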
Induced smoothing in LASSO regression
The thesis is being carried out with the National Research Council at the Institute of Biomedicine and Molecular Immunology "Alberto Monroy" of Palermo, where I am a fellow, under the supervision of MD Stefania La Grutta. Our research unit is focused on clinical research in allergic respiratory problems in children. In particular, we are interested in assessing the determinants of impaired lung function in a sample of outpatient asthmatic children aged between 5 and 17 years enrolled from 2011 to 2017. Our dataset is composed of n = 529 children and several covariates regarding host and environmental factors. This thesis focuses on hypothesis testing in lasso regression, when one is interes…
Using differential LARS algorithm to study the expression profile of a sample of patients with latex-fruit syndrome
2010
Natural rubber latex IgE-mediated hypersensitivity is one of the most important health problems in allergy during recent years. The prevalence of individuals allergic to latex shows an associated hypersensitivity to some plant-derived foods, especially freshly consumed fruit. This association of latex allergy and allergy to plant-derived foods is called latex-fruit syndrome. The aim of this study is to use the differential geometric generalization of the LARS algorithm to identify candidate genes that may be associated with the pathogenesis of allergy to latex or vegetable food.
A new tuning parameter selector in lasso regression
2019
Penalized regression models are popularly used in high-dimensional data analysis to carry out variable selection and model fitting simultaneously. Whereas success has been widely reported in the literature, their performance largely depends on the tuning parameter that balances the trade-off between model fitting and sparsity. In this work we introduce a new tuning parameter selection criterion based on the maximization of the signal-to-noise ratio. To prove its effectiveness we applied it to a real dataset on prostate cancer.
Using differential geometric LARS algorithm to study the expression profile of a sample of patients with latex-fruit syndrome
2011
Natural rubber latex IgE-mediated hypersensitivity is one of the most important health problems in allergy during recent years. The prevalence of individuals allergic to latex shows an associated hypersensitivity to some plant-derived foods, especially freshly consumed fruit. This association of latex allergy and allergy to plant-derived foods is called latex-fruit syndrome. The aim of this study is to use the differential geometric generalization of the LARS algorithm to identify candidate genes that may be associated with the pathogenesis of allergy to latex or vegetable.
dglars: An R Package to Estimate Sparse Generalized Linear Models
2014
dglars is a publicly available R package that implements the method proposed in Augugliaro, Mineo, and Wit (2013), developed to study the sparse structure of a generalized linear model. This method, called dgLARS, is based on a differential geometrical extension of the least angle regression method proposed in Efron, Hastie, Johnstone, and Tibshirani (2004). The core of the dglars package consists of two algorithms implemented in Fortran 90 to efficiently compute the solution curve: a predictor-corrector algorithm, proposed in Augugliaro et al. (2013), and a cyclic coordinate descent algorithm, proposed in Augugliaro, Mineo, and Wit (2012). The latter algorithm, as shown here, is significan…
Clusters of effects curves in quantile regression models
2018
In this paper, we propose a new method for finding similarity of effects based on quantile regression models. Clustering of effects curves (CEC) techniques are applied to quantile regression coefficients, which are one-to-one functions of the order of the quantile. We adopt the quantile regression coefficients modeling (QRCM) framework to describe the functional form of the coefficient functions by means of parametric models. The proposed method can be utilized to cluster the effect of covariates with a univariate response variable, or to cluster a multivariate outcome. We report simulation results, comparing our approach with the existing techniques. The idea of combining CEC with QRCM per…