Search results for "Feature selection"

showing 10 items of 139 documents

An Automatic System for the Analysis and Classification of Human Atrial Fibrillation Patterns from Intracardiac Electrograms

2008

This paper presents an automatic system for the analysis and classification of atrial fibrillation (AF) patterns from bipolar intracardiac signals. The system is made up of: 1) a feature-extraction module that defines and extracts a set of measures potentially useful for characterizing AF types on the basis of their degree of organization; 2) a feature-selection module (based on the Jeffries-Matusita distance and a branch and bound search algorithm) identifying the best subset of features for discriminating different AF types; and 3) a support vector machine technique-based classification module that automatically discriminates the AF types according to the Wells' criteria. The automatic s…

Signal processing; Computer science; Feature extraction; Biomedical Engineering; Feature extraction and selection; Feature selection; Sensitivity and Specificity; Intracardiac injection; Pattern Recognition Automated; Artificial Intelligence; Search algorithm; Atrial Fibrillation; medicine; Humans; Diagnosis Computer-Assisted; Intracardiac Electrogram; Arrhythmia organization; medicine.diagnostic_test; business.industry; Support vector machines (SVMs); Reproducibility of Results; Pattern recognition; Atrial fibrillation; Human atrial fibrillation; medicine.disease; Support vector machine; Settore ING-INF/06 - Bioingegneria Elettronica E Informatica; Automatic classification; Artificial intelligence; Intracardiac electrogram; business; Electrocardiography; Algorithms; IEEE Transactions on Biomedical Engineering
researchProduct

Optimal band selection for future satellite sensor dedicated to soil science

2009

Hyperspectral imaging systems could be used to identify different soil types from satellites. However, detecting the reflectance of the soils at all wavelengths requires a large number of high-accuracy sensors and also creates a problem in transmitting the data to earth stations for processing. Current sensors can reach a bandwidth of 20 nm; hence, the reflectance obtained using the sensors is the integration of the reflectance at each of the wavelengths present in the spectral band. Moreover, not all spectral bands contribute equally to classification and hence, identifying the bands necessary to have a good classification is necessary to reduce …

Statistical classification; Contextual image classification; Computer science; Bandwidth (signal processing); Hyperspectral imaging; Satellite; Feature selection; Spectral bands; Data transmission; Remote sensing; 2009 First Workshop on Hyperspectral Image and Signal Processing: Evolution in Remote Sensing
researchProduct

Breaking the curse of dimensionality in quadratic discriminant analysis models with a novel variant of a Bayes classifier enhances automated taxa ide…

2013

Macroinvertebrate samples are commonly used in biomonitoring to study changes in aquatic ecosystems. Traditionally, specimens are identified manually to taxa by human experts, which is time-consuming and cost-intensive. Using the image data of 35 taxa and 64 features, we propose a novel variant of the quadratic discriminant analysis for breaking the curse of dimensionality in quadratic discriminant analysis models. Our variant, called a random Bayes array (RBA), uses bagging and random feature selection similar to random forest. We explore several variations of RBA. We consider three classification (i.e., taxa identification) decisions: majority vote, averaged posterior probabilities, and a novel…

Statistics and Probability; Bayes' theorem; Ecological Modeling; Bayesian probability; Statistics; Posterior probability; Feature selection; Context (language use); Bayes classifier; Quadratic classifier; Mathematics; Random forest; Environmetrics
researchProduct

Automatic variable selection for exposure-driven propensity score matching with unmeasured confounders.

2020

Multivariable model building for propensity score modeling approaches is challenging. A common propensity score approach is exposure-driven propensity score matching, where the best model selection strategy is still unclear. In particular, the situation may require variable selection, while it is still unclear if variables included in the propensity score should be associated with the exposure and the outcome, with either the exposure or the outcome, with at least the exposure or with at least the outcome. Unmeasured confounders, complex correlation structures, and non-normal covariate distributions further complicate matters. We consider the performance of different modeling strategies in …

Statistics and Probability; Biometry; Models Statistical; Computer science; Model selection; Feature selection; General Medicine; Variance (accounting); 01 natural sciences; Outcome (game theory); Correlation; 010104 statistics & probability; 03 medical and health sciences; Automation; 0302 clinical medicine; Covariate; Propensity score matching; Statistics; Multivariate Analysis; 030212 general & internal medicine; 0101 mathematics; Statistics Probability and Uncertainty; Propensity Score; Counterexample; Biometrical journal. Biometrische Zeitschrift
researchProduct

Cluster-Localized Sparse Logistic Regression for SNP Data

2012

The task of analyzing high-dimensional single nucleotide polymorphism (SNP) data in a case-control design using multivariable techniques has only recently been tackled. While many available approaches investigate only main effects in a high-dimensional setting, we propose a more flexible technique, cluster-localized regression (CLR), based on localized logistic regression models, that allows different SNPs to have an effect for different groups of individuals. Separate multivariable regression models are fitted for the different groups of individuals by incorporating weights into componentwise boosting, which provides simultaneous variable selection, hence sparse fits. For model fitting, th…

Statistics and Probability; Boosting (machine learning); Computer science; Multivariable calculus; Computational Biology; High-Throughput Nucleotide Sequencing; Feature selection; Regression analysis; Models Theoretical; Logistic regression; computer.software_genre; Polymorphism Single Nucleotide; Regression; Computational Mathematics; Logistic Models; Data Interpretation Statistical; Genetics; Cluster Analysis; Humans; Data mining; Cluster analysis; Molecular Biology; Unit-weighted regression; computer; Genome-Wide Association Study; Statistical Applications in Genetics and Molecular Biology
researchProduct

Sample size planning for survival prediction with focus on high-dimensional data

2011

Sample size planning should reflect the primary objective of a trial. If the primary objective is prediction, the sample size determination should focus on prediction accuracy instead of power. We present formulas for the determination of training set sample size for survival prediction. Sample size is chosen to control the difference between optimal and expected prediction error. Prediction is carried out by Cox proportional hazards models. The general approach considers censoring as well as low-dimensional and high-dimensional explanatory variables. For dimension reduction in the high-dimensional setting, a variable selection step is inserted. If not all informative variables are included…

Statistics and Probability; Clustering high-dimensional data; Clinical Trials as Topic; Lung Neoplasms; Models Statistical; Kaplan-Meier Estimate; Epidemiology; Proportional hazards model; Dimensionality reduction; Gene Expression; Feature selection; Biostatistics; Prognosis; Brier score; Sample size determination; Carcinoma Non-Small-Cell Lung; Sample Size; Censoring (clinical trials); Statistics; Humans; Proportional Hazards Models; Mathematics; Statistics in Medicine
researchProduct

Tailoring sparse multivariable regression techniques for prognostic single-nucleotide polymorphism signatures.

2011

When seeking prognostic information for patients, modern technologies provide a huge amount of genomic measurements as a starting point. For single-nucleotide polymorphisms (SNPs), there may be more than one million covariates that need to be simultaneously considered with respect to a clinical endpoint. Although the underlying biological problem cannot be solved on the basis of clinical cohorts of only modest size, some important SNPs might still be identified. Sparse multivariable regression techniques have recently become available for automatically identifying prognostic molecular signatures that comprise relatively few covariates and provide reasonable prediction performance. For illus…

Statistics and Probability; Epidemiology; Computer science; Feature selection; Biostatistics; computer.software_genre; Polymorphism Single Nucleotide; Lasso (statistics); Gene Frequency; Resampling; Covariate; Humans; Likelihood Functions; Models Statistical; Multivariable calculus; Regression analysis; Genomics; Prognosis; Regression; Minor allele frequency; Leukemia Myeloid Acute; Multivariate Analysis; Regression Analysis; Data mining; computer; Algorithms; Statistics in medicine
researchProduct

Methods and Tools for Bayesian Variable Selection and Model Averaging in Normal Linear Regression

2018

In this paper, we briefly review the main methodological aspects concerned with the application of the Bayesian approach to model choice and model averaging in the context of variable selection in regression models. This includes prior elicitation, summaries of the posterior distribution and computational strategies. We then examine and compare various publicly available R-packages, summarizing and explaining the differences between packages and giving recommendations for applied users. We find that all packages reviewed (can) lead to very similar results, but there are potentially important differences in flexibility and efficiency of the packages.

Statistics and Probability; General linear model; Proper linear model; business.industry; Computer science; 05 social sciences; Posterior probability; Regression analysis; Feature selection; Machine learning; computer.software_genre; 01 natural sciences; 010104 statistics & probability; Bayesian multivariate linear regression; 0502 economics and business; Linear regression; Econometrics; Artificial intelligence; 050207 economics; 0101 mathematics; Statistics Probability and Uncertainty; Bayesian linear regression; business; computer; International Statistical Review
researchProduct

dglars: An R Package to Estimate Sparse Generalized Linear Models

2014

dglars is a publicly available R package that implements the method proposed in Augugliaro, Mineo, and Wit (2013), developed to study the sparse structure of a generalized linear model. This method, called dgLARS, is based on a differential geometrical extension of the least angle regression method proposed in Efron, Hastie, Johnstone, and Tibshirani (2004). The core of the dglars package consists of two algorithms implemented in Fortran 90 to efficiently compute the solution curve: a predictor-corrector algorithm, proposed in Augugliaro et al. (2013), and a cyclic coordinate descent algorithm, proposed in Augugliaro, Mineo, and Wit (2012). The latter algorithm, as shown here, is significan…

Statistics and Probability; Generalized linear model; EXPRESSION; Mathematical optimization; TISSUES; Fortran; cyclic coordinate descent algorithm; dgLARS; Feature selection; DANTZIG SELECTOR; predictor-corrector algorithm; LIKELIHOOD; LEAST ANGLE REGRESSION; sparse models; Differential (infinitesimal); differential geometry; lcsh:Statistics; lcsh:HA1-4737; computer.programming_language; Mathematics; Least-angle regression; Extension (predicate logic); Expression (computer science); generalized linear models; BREAST-CANCER RISK; VARIABLE SELECTION; Differential geometry; MARKER; SHRINKAGE; Statistics Probability and Uncertainty; HAPLOTYPES; Settore SECS-S/01 - Statistica; computer; Algorithm; Software
researchProduct

Coupled variable selection for regression modeling of complex treatment patterns in a clinical cancer registry.

2013

For determining a manageable set of covariates potentially influential with respect to a time-to-event endpoint, Cox proportional hazards models can be combined with variable selection techniques, such as stepwise forward selection or backward elimination based on p-values, or regularized regression techniques such as component-wise boosting. Cox regression models have also been adapted for dealing with more complex event patterns, for example, for competing risks settings with separate, cause-specific hazard models for each event type, or for determining the prognostic effect pattern of a variable over different landmark times, with one conditional survival model for each landmark. Motivat…

Statistics and Probability; Male; Niacinamide; Boosting (machine learning); Carcinoma Hepatocellular; Epidemiology; Computer science; Score; Feature selection; Antineoplastic Agents; computer.software_genre; Decision Support Techniques; Neoplasms; Covariate; Humans; Registries; Aged; Proportional Hazards Models; Proportional hazards model; Phenylurea Compounds; Liver Neoplasms; Regression analysis; Confounding Factors Epidemiologic; Middle Aged; Sorafenib; Prognosis; Regression; Cancer registry; Data Interpretation Statistical; Regression Analysis; Data mining; computer; Statistics in medicine
researchProduct