Search results for "Variable selection."

showing 10 items of 21 documents

Using differential geometric LARS algorithm to study the expression profile of a sample of patients with latex-fruit syndrome

2011

Natural rubber latex IgE-mediated hypersensitivity has been one of the most important health problems in allergy in recent years. Individuals allergic to latex often show an associated hypersensitivity to some plant-derived foods, especially freshly consumed fruit. This association of latex allergy and allergy to plant-derived foods is called latex-fruit syndrome. The aim of this study is to use the differential geometric generalization of the LARS algorithm to identify candidate genes that may be associated with the pathogenesis of allergy to latex or vegetables.

Settore SECS-S/01 - Statistica; Latex-fruit syndrome; variable selection; penalized regression; high dimensional; LARS
researchProduct

dglars: An R Package to Estimate Sparse Generalized Linear Models

2014

dglars is a publicly available R package that implements the method proposed in Augugliaro, Mineo, and Wit (2013), developed to study the sparse structure of a generalized linear model. This method, called dgLARS, is based on a differential geometrical extension of the least angle regression method proposed in Efron, Hastie, Johnstone, and Tibshirani (2004). The core of the dglars package consists of two algorithms implemented in Fortran 90 to efficiently compute the solution curve: a predictor-corrector algorithm, proposed in Augugliaro et al. (2013), and a cyclic coordinate descent algorithm, proposed in Augugliaro, Mineo, and Wit (2012). The latter algorithm, as shown here, is significan…

Statistics and Probability; Generalized linear model; EXPRESSION; Mathematical optimization; TISSUES; Fortran; cyclic coordinate descent algorithm; dgLARS; Feature selection; DANTZIG SELECTOR; predictor-corrector algorithm; LIKELIHOOD; LEAST ANGLE REGRESSION; sparse models; Differential (infinitesimal); differential geometry; lcsh:Statistics; lcsh:HA1-4737; computer.programming_language; Mathematics; Least-angle regression; Extension (predicate logic); Expression (computer science); generalized linear models; BREAST-CANCER RISK; VARIABLE SELECTION; Differential geometry; differential geometry generalized linear models dgLARS predictor-corrector algorithm cyclic coordinate descent algorithm sparse models variable selection; MARKER; SHRINKAGE; Statistics Probability and Uncertainty; HAPLOTYPES; Settore SECS-S/01 - Statistica; computer; Algorithm; Software
researchProduct
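
The dglars abstract above names a cyclic coordinate descent algorithm without describing its updates. As a rough illustration of the cyclic update pattern only, the sketch below applies coordinate descent to the ordinary lasso in a linear model. This is a swapped-in, simpler problem: it is not the dgLARS method and not the package's Fortran code, and the function names `soft_threshold` and `lasso_cd` are hypothetical.

```python
def soft_threshold(z, t):
    """Shrink z toward zero by t; values within [-t, t] become exactly 0."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def lasso_cd(X, y, lam, n_sweeps=200, tol=1e-10):
    """Minimise (1/(2n)) * ||y - X b||^2 + lam * ||b||_1 by cyclic coordinate descent.

    Assumption: plain lasso for a linear model, used here only to show the
    one-coefficient-at-a-time update; it is not the dgLARS algorithm.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    residual = list(y)  # equals y - X beta while beta is all zeros
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_sweeps):
        max_change = 0.0
        for j in range(p):  # visit the coordinates in a fixed cyclic order
            if col_sq[j] == 0.0:
                continue
            # Covariance of column j with the residual that excludes column j.
            rho = sum(X[i][j] * residual[i] for i in range(n)) / n + col_sq[j] * beta[j]
            new_value = soft_threshold(rho, lam) / col_sq[j]
            delta = new_value - beta[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= X[i][j] * delta
                beta[j] = new_value
            max_change = max(max_change, abs(delta))
        if max_change < tol:  # stop once a full sweep changes nothing
            break
    return beta


# Toy orthogonal design: the first column carries signal, the second barely any.
X_toy = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]]
y_toy = [2.0, 0.1, 2.0, 0.1]
beta_toy = lasso_cd(X_toy, y_toy, lam=0.2)
print(beta_toy)
```

On this toy design the second coefficient is set exactly to zero while the first is shrunk, which is the sparsity behaviour the listed papers are concerned with.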

Extended differential geometric LARS for high-dimensional GLMs with general dispersion parameter

2018

A large class of modeling and prediction problems involves outcomes that belong to an exponential family distribution. Generalized linear models (GLMs) are a standard way of dealing with such situations. Even in high-dimensional feature spaces GLMs can be extended to deal with such situations. Penalized inference approaches, such as the $\ell_1$ penalty or SCAD, or extensions of least angle regression, such as dgLARS, have been proposed to deal with GLMs with high-dimensional feature spaces. Although the theory underlying these methods is in principle generic, the implementation has remained restricted to dispersion-free models, such as the Poisson and logistic regression models. The aim of this…

Statistics and Probability; Generalized linear model; Mathematical optimization; Generalized linear models; Predictor–corrector algorithm; Generalized linear model; 02 engineering and technology; Poisson distribution; DANTZIG SELECTOR; 01 natural sciences; Cross-validation; High-dimensional inference; Theoretical Computer Science; 010104 statistics & probability; symbols.namesake; Exponential family; LEAST ANGLE REGRESSION; 0202 electrical engineering electronic engineering information engineering; Applied mathematics; Statistics::Methodology; 0101 mathematics; CROSS-VALIDATION; Mathematics; Least-angle regression; Linear model; 020206 networking & telecommunications; Probability and statistics; VARIABLE SELECTION; Efficient estimator; Predictor-corrector algorithm; Computational Theory and Mathematics; Dispersion parameter; LINEAR-MODELS; symbols; SHRINKAGE; Statistics Probability and Uncertainty; Settore SECS-S/01 - Statistica; Statistics and Computing
researchProduct

Differential geometric least angle regression: a differential geometric approach to sparse generalized linear models

2013

Summary: Sparsity is an essential feature of many contemporary data problems. Remote sensing, various forms of automated screening and other high throughput measurement devices collect a large amount of information, typically about few independent statistical subjects or units. In certain cases it is reasonable to assume that the underlying process generating the data is itself sparse, in the sense that only a few of the measured variables are involved in the process. We propose an explicit method of monotonically decreasing sparsity for outcomes that can be modelled by an exponential family. In our approach we generalize the equiangular condition in a generalized linear model. Although the …

Statistics and Probability; Generalized linear model; Sparse model; Mathematical optimization; Generalized linear models; Variable selection; Path following algorithm; Equiangular polygon; Generalized linear model; LASSO; DANTZIG SELECTOR; symbols.namesake; Exponential family; Lasso (statistics); Sparse models; Differential geometry; Information geometry; COORDINATE DESCENT; Fisher information; ERROR; Mathematics; Least-angle regression; Least angle regression; Generalized degrees of freedom; symbols; SHRINKAGE; Statistics Probability and Uncertainty; Simple linear regression; Information geometry; Settore SECS-S/01 - Statistica; Algorithm; Covariance penalty theory
researchProduct

Criteria for Bayesian model choice with application to variable selection

2012

In objective Bayesian model selection, no single criterion has emerged as dominant in defining objective prior distributions. Indeed, many criteria have been separately proposed and utilized to propose differing prior choices. We first formalize the most general and compelling of the various criteria that have been suggested, together with a new criterion. We then illustrate the potential of these criteria in determining objective model selection priors by considering their application to the problem of variable selection in normal linear models. This results in a new model selection objective prior with a number of compelling properties.

Statistics and Probability; Mathematical optimization; 62C10; Model selection; g-prior; Linear model; Mathematics - Statistics Theory; Feature selection; Statistics Theory (math.ST); Model selection; Bayesian inference; Objective model; 62J05; Prior probability; 62J15; FOS: Mathematics; Statistics Probability and Uncertainty; objective Bayes; Selection (genetic algorithm); variable selection; Mathematics; The Annals of Statistics
researchProduct

Clusters of effects curves in quantile regression models

2018

In this paper, we propose a new method for finding similarity of effects based on quantile regression models. Clustering of effects curves (CEC) techniques are applied to quantile regression coefficients, which are one-to-one functions of the order of the quantile. We adopt the quantile regression coefficients modeling (QRCM) framework to describe the functional form of the coefficient functions by means of parametric models. The proposed method can be utilized to cluster the effect of covariates with a univariate response variable, or to cluster a multivariate outcome. We report simulation results, comparing our approach with the existing techniques. The idea of combining CEC with QRCM per…

Statistics and Probability; Statistics::Theory; Multivariate statistics; 05 social sciences; Univariate; Functional data analysis; 01 natural sciences; Quantile regression; Quantile regression coefficients modeling Multivariate analysis Functional data analysis Curves clustering Variable selection; 010104 statistics & probability; Computational Mathematics; 0502 economics and business; Parametric model; Covariate; Statistics::Methodology; Applied mathematics; 0101 mathematics; Statistics Probability and Uncertainty; Cluster analysis; Settore SECS-S/01 - Statistica; 050205 econometrics; Mathematics; Quantile
researchProduct

Variable selection with unbiased estimation: the CDF penalty

2022

We propose a new SCAD-type penalty in general regression models. The new penalty can be considered a competitor of the LASSO, SCAD or MCP penalties, as it guarantees sparse variable selection, i.e., null regression coefficient estimates, while attenuating bias for the non-null estimates. In this work, the method is discussed, and some comparisons are presented.

Variable selection; L1-type penalty; LASSO; SCAD; MCP
researchProduct
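
The abstract above positions the new penalty against LASSO, SCAD and MCP but does not give its form. The sketch below therefore covers only the three established penalties, using their standard published definitions (SCAD from Fan and Li, 2001; MCP from Zhang, 2010). The function names and the default `gamma=3.0` are illustrative assumptions; `a=3.7` is the value suggested by Fan and Li. The CDF penalty itself is not reproduced here.

```python
def lasso_penalty(beta, lam):
    """L1 (LASSO) penalty: grows linearly in |beta| without bound."""
    return lam * abs(beta)


def scad_penalty(beta, lam, a=3.7):
    """SCAD penalty; requires a > 2. Linear near zero, constant beyond a * lam."""
    t = abs(beta)
    if t <= lam:
        return lam * t
    if t <= a * lam:
        return (2 * a * lam * t - t * t - lam * lam) / (2 * (a - 1))
    return lam * lam * (a + 1) / 2


def mcp_penalty(beta, lam, gamma=3.0):
    """MCP penalty; requires gamma > 1. Constant beyond gamma * lam."""
    t = abs(beta)
    if t <= gamma * lam:
        return lam * t - t * t / (2 * gamma)
    return gamma * lam * lam / 2


# Tabulate the three penalties on a small grid with lam = 1.
for beta in (0.0, 0.5, 1.0, 2.0, 5.0, 10.0):
    print(beta, lasso_penalty(beta, 1.0), scad_penalty(beta, 1.0), mcp_penalty(beta, 1.0))
```

All three penalties behave like `lam * |beta|` near zero, which is what produces exact zeros. SCAD and MCP then level off for large coefficients, so large effects are shrunk less than under the LASSO; this is the bias attenuation the abstract refers to.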

Variable Selection with Quasi-Unbiased Estimation: the CDF Penalty

2022

We propose a new non-convex penalty in linear regression models. The new penalty function can be considered a competitor of the LASSO, SCAD or MCP penalties, as it guarantees sparse variable selection while reducing bias for the non-null estimates. We introduce the methodology and present some comparisons among different approaches.

Variable selection; non-convex penalty function; LASSO; SCAD; MCP
researchProduct

Evaluation of the effect of chance correlations on variable selection using Partial Least Squares -Discriminant Analysis

2013

Variable subset selection is often mandatory in high-throughput metabolomics and proteomics. However, depending on the variable-to-sample ratio, variable selection is significantly susceptible to chance correlations. The evaluation of the predictive capabilities of PLSDA models estimated by cross-validation after feature selection provides overly optimistic results if the selection is performed on the entire set and no external validation set is available. In this work, a simulation of the statistical null hypothesis is proposed to test whether the discrimination capability of a PLSDA model after variable selection estimated by cross-validation is statistically higher than t…

Variable selection; ESTADISTICA E INVESTIGACION OPERATIVA; Feature selection; Chance correlations; Analytical Chemistry; Set (abstract data type); Resampling; Partial least squares regression; Statistics; Humans; Metabolomics; Least-Squares Analysis; Selection (genetic algorithm); Probability; Gaucher Disease; Models Statistical; Chemistry; Discriminant Analysis; Reproducibility of Results; Partial Least Squares-Discriminant Analysis (PLSDA); Linear discriminant analysis; Variable (computer science); Null hypothesis; Algorithms; Software
researchProduct
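
The chance-correlation problem described in the abstract above can be made concrete with a small simulation. The sketch below is an illustration under stated assumptions, not the authors' procedure: a nearest-centroid classifier stands in for PLSDA, features are ranked by the absolute difference of class means, and all function names are hypothetical. The data are pure noise, so any apparent discrimination comes from selecting variables on the full set before cross-validating; repeating the same pipeline on permuted labels gives a null distribution to compare against.

```python
import random


def select_top_k(X, y, k):
    """Rank features by absolute difference of class means; keep the top k."""
    scores = []
    for j in range(len(X[0])):
        g0 = [row[j] for row, label in zip(X, y) if label == 0]
        g1 = [row[j] for row, label in zip(X, y) if label == 1]
        scores.append((abs(sum(g0) / len(g0) - sum(g1) / len(g1)), j))
    return [j for _, j in sorted(scores, reverse=True)[:k]]


def loo_accuracy(X, y, features):
    """Leave-one-out accuracy of a nearest-centroid classifier on `features`."""
    correct = 0
    for i in range(len(X)):
        dist = {}
        for label in (0, 1):
            rows = [X[m] for m in range(len(X)) if m != i and y[m] == label]
            centroid = [sum(r[j] for r in rows) / len(rows) for j in features]
            dist[label] = sum((X[i][j] - c) ** 2 for j, c in zip(features, centroid))
        predicted = 0 if dist[0] <= dist[1] else 1
        correct += predicted == y[i]
    return correct / len(X)


def selection_then_cv(X, y, k):
    """The optimistic pipeline: select on ALL samples, then cross-validate."""
    return loo_accuracy(X, y, select_top_k(X, y, k))


def permutation_p_value(X, y, k, n_perm, rng):
    """Compare the observed score with scores obtained on label-permuted data."""
    observed = selection_then_cv(X, y, k)
    null_scores = []
    for _ in range(n_perm):
        y_perm = y[:]
        rng.shuffle(y_perm)
        null_scores.append(selection_then_cv(X, y_perm, k))
    exceed = sum(score >= observed for score in null_scores)
    return observed, null_scores, (exceed + 1) / (n_perm + 1)


rng = random.Random(0)
n_samples, n_features = 20, 200
X = [[rng.gauss(0.0, 1.0) for _ in range(n_features)] for _ in range(n_samples)]
y = [0] * 10 + [1] * 10  # labels are unrelated to X by construction
observed, null_scores, p_value = permutation_p_value(X, y, k=5, n_perm=100, rng=rng)
print("observed LOO accuracy:", observed)
print("mean accuracy under permutation:", sum(null_scores) / len(null_scores))
print("permutation p-value:", p_value)
```

Because the permuted runs go through the same selection step, they inherit the same optimism, so the comparison asks whether the observed accuracy exceeds what selection on noise alone would produce, rather than whether it exceeds 50%.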

Analyses spectrale et texturale de données haute résolution pour la détection automatique des maladies de la vigne

2019

‘Flavescence dorée’ is a contagious and incurable disease present on vine leaves. The DAMAV project (Automatic Detection of Vine Diseases) aims to develop a solution for automated detection of vine diseases using a micro-drone. The goal is to offer a turnkey solution for wine growers. This tool will make it possible to search for potential foci and, more generally, for any type of vine disease detectable on the foliage. To enable this diagnosis, the project proposes to study the foliage using a dedicated high-resolution multispectral camera. The objective of this PhD thesis in the context of DAMAV is to participate in the design and implementation of a Multi-Spectral (MS) image acquisition system and …

Variable selection; [INFO.INFO-TI] Computer Science [cs]/Image Processing [eess.IV]; Maladies de la Vigne; Spectral analysis; Analyse de texture; Sélection de variables; Flavescence Dorée; Classification de données; Data classification; Grapevine diseases; Textural analysis; Analyse spectrale
researchProduct