Search results for "machine learning."

showing 10 items of 1455 documents

Classification trees for multivariate ordinal response: an application to Student Evaluation Teaching

2016

Data from multiple items on an ordinal scale are commonly collected when qualitative variables, such as feelings, attitudes and many other behavioral and health-related variables are observed. In this paper we introduce a method to derive a distance-based tree for multivariate ordinal response that allows, when subject-specific characteristics are available, to derive common profiles for respondents giving the same/similar multivariate ratings. Special attention will be paid to the performance comparison in terms of AUC, for three different distances used as splitting criteria. Simulated data and a dataset from a Student Evaluation of Teaching survey will be used as illustrative examples. Th…

Statistics and Probability; Ordinal data; Multivariate statistics; Computer science; business.industry; Ordinal Scale; Decision tree; General Social Sciences; Decision tree Ordinal response Student Evaluation of Teaching Distances; 02 engineering and technology; Machine learning; computer.software_genre; 01 natural sciences; Ordinal regression; 010104 statistics & probability; Statistics; 0202 electrical engineering electronic engineering information engineering; Profiling (information science); 020201 artificial intelligence & image processing; Tree (set theory); Artificial intelligence; 0101 mathematics; business; computer; Ordinal response

researchProduct

A Hooke's law-based approach to protein folding rate

2014

Kinetics is a key aspect of the renowned protein folding problem. Here, we propose a comprehensive approach to folding kinetics where a polypeptide chain is assumed to behave as an elastic material described by the Hooke's law. A novel parameter called elastic-folding constant results from our model and is suggested to distinguish between protein with two-state and multi-state folding pathways. A contact-free descriptor, named folding degree, is introduced as a suitable structural feature to study protein-folding kinetics. This approach generalizes the observed correlations between varieties of structural descriptors with the folding rate constant. Additionally several comparisons am…

Statistics and Probability; PROTDCAL; Structure analysis; General Biochemistry Genetics and Molecular Biology; Article; Protein Structure Secondary; Amino acid sequence; symbols.namesake; Protein structure; Energetics; Feature (machine learning); Statistical physics; Protein folding; Theoretical model; Protein secondary structure; Reaction kinetics; General Immunology and Microbiology; Chemical model; Applied Mathematics; Protein; Hooke's law; Modeling; Proteins; General Medicine; DNA; Computer simulation; Elasticity; Folding degree; Folding (chemistry); Chemistry; Kinetics; Models Chemical; Modeling and Simulation; Peptide; symbols; Protein structure; Elastic folding constant; Physical chemistry; Protein secondary structure; Thermodynamics; Protein folding; Downhill folding; Polypeptide; General Agricultural and Biological Sciences; Constant (mathematics); Folding kinetics

researchProduct

Varying-coefficient functional linear regression models

2008

This article considers a generalization of the functional linear regression in which an additional real variable influences smoothly the functional coefficient. We thus define a varying-coefficient regression model for functional data. We propose two estimators based, respectively, on conditional functional principal regression and on local penalized regression splines and prove their pointwise consistency. We check, with the prediction one day ahead of ozone concentration in the city of Toulouse, the ability of such nonlinear functional approaches to produce competitive estimations.

Statistics and Probability; Polynomial regression; Statistics::Theory; Proper linear model; Multivariate adaptive regression splines; 010504 meteorology & atmospheric sciences; Local regression; 01 natural sciences; 62G05 (62G20 62M20); Statistics::Computation; Nonparametric regression; Statistics::Machine Learning; 010104 statistics & probability; Linear regression; Statistics; Statistics::Methodology; 0101 mathematics; Segmented regression; Regression diagnostic; ComputingMilieux_MISCELLANEOUS; 0105 earth and related environmental sciences; Mathematics

researchProduct

Powerful short-cuts for multiple testing procedures with special reference to gatekeeping strategies.

2007

In this paper we present a general testing principle for a class of multiple testing problems based on weighted hypotheses. Under moderate conditions, this principle leads to powerful consonant multiple testing procedures. Furthermore, short-cut versions can be derived, which simplify substantially the implementation and interpretation of the related test procedures. It is shown that many well-known multiple test procedures turn out to be special cases of this general principle. Important examples include gatekeeping procedures, which are often applied in clinical trials when primary and secondary objectives are investigated, and multiple test procedures based on hypotheses which are comple…

Statistics and Probability; Research design; Class (computer programming); Clinical Trials as Topic; Gatekeeping; Interpretation (logic); Models Statistical; Epidemiology; business.industry; Test procedures; Machine learning; computer.software_genre; Gatekeeping; Europe; symbols.namesake; Bonferroni correction; Research Design; Multiple comparisons problem; symbols; Humans; Artificial intelligence; business; Algorithm; computer; Mathematics; Statistics in medicine

researchProduct

DRUDIT: Web-based DRUgs DIscovery Tools to design small molecules as modulators of biological targets

2019

Abstract Motivation New in silico tools to predict biological affinities for input structures are presented. The tools are implemented in the DRUDIT (DRUgs DIscovery Tools) web service. The DRUDIT biological finder module is based on molecular descriptors that are calculated by the MOLDESTO (MOLecular DEScriptors TOol) software module developed by the same authors, which is able to calculate more than one thousand molecular descriptors. At this stage, DRUDIT includes 250 biological targets, but new external targets can be added. This feature extends the application scope of DRUDIT to several fields. Moreover, two more functions are implemented: the multi- and on/off-target tasks. These tool…

Statistics and Probability; Service (systems architecture); Polypharmacology; Computer science; In silico; Machine learning; computer.software_genre; 01 natural sciences; Biochemistry; biological target finder; drug discovery; Molecular descriptors; 03 medical and health sciences; Molecular descriptor; Settore BIO/10 - Biochimica; Web application; Computer Simulation; Polypharmacology; Molecular Biology; 030304 developmental biology; Settore ING-INF/05 - Sistemi Di Elaborazione Delle Informazioni; Internet; 0303 health sciences; business.industry; Small molecule; Settore CHIM/08 - Chimica Farmaceutica; 0104 chemical sciences; Computer Science Applications; 010404 medicinal & biomolecular chemistry; Computational Mathematics; Computational Theory and Mathematics; Biological target; The Internet; Artificial intelligence; business; computer; Software

researchProduct

The Induced Smoothed lasso: A practical framework for hypothesis testing in high dimensional regression.

2020

This paper focuses on hypothesis testing in lasso regression, when one is interested in judging statistical significance for the regression coefficients in the regression equation involving a lot of covariates. To get reliable p-values, we propose a new lasso-type estimator relying on the idea of induced smoothing which allows to obtain appropriate covariance matrix and Wald statistic relatively easily. Some simulation experiments reveal that our approach exhibits good performance when contrasted with the recent inferential tools in the lasso framework. Two real data analyses are presented to illustrate the proposed framework in practice.

Statistics and Probability; Statistics::Theory; Induced smoothing; Epidemiology; Computer science; Feature selection; Wald test; 01 natural sciences; asthma research; Statistics::Machine Learning; 010104 statistics & probability; 03 medical and health sciences; Health Information Management; Lasso (statistics); Linear regression; sparse models; Statistics::Methodology; Computer Simulation; 0101 mathematics; sandwich formula; 030304 developmental biology; Statistical hypothesis testing; 0303 health sciences; Covariance matrix; lung function; Regression analysis; Statistics::Computation; sparse model; Research Design; Algorithm; Smoothing; variable selection; Statistical methods in medical research

researchProduct

Selecting the tuning parameter in penalized Gaussian graphical models

2019

Penalized inference of Gaussian graphical models is a way to assess the conditional independence structure in multivariate problems. In this setting, the conditional independence structure, corresponding to a graph, is related to the choice of the tuning parameter, which determines the model complexity or degrees of freedom. There has been little research on the degrees of freedom for penalized Gaussian graphical models. In this paper, we propose an estimator of the degrees of freedom in $\ell_1$-penalized Gaussian graphical models. Specifically, we derive an estimator inspired by the generalized information criterion and propose to use this estimator as the bias term for two informatio…

Statistics and Probability; Statistics::Theory; Kullback–Leibler divergence; Kullback-Leibler divergence; Computer science; Gaussian; Information Criteria; 010103 numerical & computational mathematics; Model complexity; Model selection; 01 natural sciences; Theoretical Computer Science; 010104 statistics & probability; symbols.namesake; Statistics::Machine Learning; Generalized information criterion; Entropy (information theory); Statistics::Methodology; Graphical model; 0101 mathematics; Penalized Likelihood Kullback-Leibler Divergence Model Complexity Model Selection Generalized Information Criterion.; Model selection; Estimator; Statistics::Computation; Computational Theory and Mathematics; Conditional independence; symbols; Penalized likelihood; Statistics Probability and Uncertainty; Settore SECS-S/01 - Statistica; Algorithm; Statistics and Computing

researchProduct

Test and power considerations for multiple endpoint analyses using sequentially rejective graphical procedures

2009

A variety of powerful test procedures are available for the analysis of clinical trials addressing multiple objectives, such as comparing several treatments with a control, assessing the benefit of a new drug for more than one endpoint, etc. However, some of these procedures have reached a level of complexity that makes it difficult to communicate the underlying test strategies to clinical teams. Graphical approaches have been proposed instead that facilitate the derivation and communication of Bonferroni-based closed test procedures. In this paper we give a coherent description of the methodology and illustrate it with a real clinical trial example. We further discuss suitable power measur…

Statistics and Probability; Test strategy; Endpoint Determination; Epidemiology; Computer science; Control (management); Analysis of clinical trials; Machine learning; computer.software_genre; symbols.namesake; Drug Therapy; Computer Graphics; Confidence Intervals; Humans; Multicenter Studies as Topic; Randomized Controlled Trials as Topic; business.industry; Variety (cybernetics); Test (assessment); Clinical trial; Bonferroni correction; Clinical Trials Phase III as Topic; Data Interpretation Statistical; Multiple comparisons problem; symbols; Artificial intelligence; business; Algorithm; computer; Statistics in Medicine

researchProduct

What subject matter questions motivate the use of machine learning approaches compared to statistical models for probability prediction?

2014

This is a discussion of the following papers: "Probability estimation with machine learning methods for dichotomous and multicategory outcome: Theory" by Jochen Kruppa, Yufeng Liu, Gerard Biau, Michael Kohler, Inke R. Konig, James D. Malley, and Andreas Ziegler; and "Probability estimation with machine learning methods for dichotomous and multicategory outcome: Applications" by Jochen Kruppa, Yufeng Liu, Hans-Christian Diener, Theresa Holste, Christian Weimar, Inke R. Konig, and Andreas Ziegler.

Statistics and Probability; business.industry; Probability estimation; Statistical model; General Medicine; Machine learning; computer.software_genre; Logistic regression; Multicategory; Outcome (probability); Subject matter; Diener; Econometrics; Artificial intelligence; Statistics Probability and Uncertainty; business; computer; Mathematics; Biometrical Journal

researchProduct

Contributed discussion on article by Pratola

2016

The author should be commended for his outstanding contribution to the literature on Bayesian regression tree models. The author introduces three innovative sampling approaches which allow for efficient traversal of the model space. In this response, we add a fourth alternative.

Statistics and Probability; model selection; Markov Chain Monte Carlo (MCMC); Bayesian regression tree; Computer science; Big data; Bayesian regression tree (BRT) models; ComputingMilieux_LEGALASPECTSOFCOMPUTING; birth–death process; Machine learning; computer.software_genre; Sequential Monte Carlo methods; 01 natural sciences; population Markov chain Monte Carlo; 010104 statistics & probability; symbols.namesake; big data; 0502 economics and business; Bayesian Regression Trees (BART); 0101 mathematics; 050205 econometrics; Bayesian treed regression; Multiple Try Metropolis algorithms; INFERÊNCIA ESTATÍSTICA; business.industry; Applied Mathematics; Model selection; 05 social sciences; Rejection sampling; Data science; Variable-order Bayesian network; Tree (data structure); Tree traversal; Markov chain Monte Carlo; continuous time Markov process; symbols; Artificial intelligence; business; Bayesian linear regression; communication-free; computer; Gibbs sampling; Bayesian Analysis

researchProduct