Search results for "Machine learning"

Showing 10 of 1,464 documents

Varying-coefficient functional linear regression models

2008

This article considers a generalization of the functional linear regression in which an additional real variable smoothly influences the functional coefficient. We thus define a varying-coefficient regression model for functional data. We propose two estimators, based respectively on conditional functional principal regression and on local penalized regression splines, and prove their pointwise consistency. Using one-day-ahead prediction of ozone concentration in the city of Toulouse, we demonstrate the ability of such nonlinear functional approaches to produce competitive estimates.
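The varying-coefficient idea can be illustrated with a scalar toy model, y = β(u)·x + noise, fitted by kernel-weighted least squares. This is a minimal sketch under assumed simplifications (scalar covariate rather than functional, Gaussian kernel, synthetic data), not the estimators proposed in the article:

```python
import numpy as np

def varying_coef_fit(u0, u, x, y, h=0.05):
    """Kernel-weighted least-squares estimate of beta(u0) in the toy
    varying-coefficient model y = beta(u) * x + noise."""
    w = np.exp(-0.5 * ((u - u0) / h) ** 2)        # Gaussian kernel weights
    return np.sum(w * x * y) / np.sum(w * x * x)  # local weighted LS slope

rng = np.random.default_rng(0)
n = 2000
u = rng.uniform(0, 1, n)                          # smoothing variable
x = rng.normal(size=n)                            # scalar covariate
y = np.sin(2 * np.pi * u) * x + 0.1 * rng.normal(size=n)

# beta(0.25) = sin(pi/2) = 1; the local fit should land near that value
print(varying_coef_fit(0.25, u, x, y))
```

The bandwidth h trades bias against variance, playing the same role as the penalty in the spline-based estimator.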

Subjects: Statistics and Probability; Polynomial regression; Proper linear model; Multivariate adaptive regression splines; Local regression; Nonparametric regression; Linear regression; Segmented regression; Regression diagnostic; Statistics::Theory; Statistics::Computation; Statistics::Machine Learning; Statistics::Methodology; 62G05 (62G20, 62M20); Meteorology & atmospheric sciences; Earth and related environmental sciences; Statistics; Mathematics

Powerful short-cuts for multiple testing procedures with special reference to gatekeeping strategies.

2007

In this paper we present a general testing principle for a class of multiple testing problems based on weighted hypotheses. Under moderate conditions, this principle leads to powerful consonant multiple testing procedures. Furthermore, short-cut versions can be derived, which substantially simplify the implementation and interpretation of the related test procedures. It is shown that many well-known multiple test procedures turn out to be special cases of this general principle. Important examples include gatekeeping procedures, which are often applied in clinical trials when primary and secondary objectives are investigated, and multiple test procedures based on hypotheses which are comple…
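As one concrete special case, a fixed-sequence (serial) gatekeeping procedure tests ordered hypotheses each at full level α and stops at the first non-rejection. The sketch below is illustrative only and not taken from the paper:

```python
def fixed_sequence_gatekeeper(pvals, alpha=0.05):
    """Serial gatekeeping: test hypotheses in a pre-specified order,
    each at level alpha, stopping at the first non-rejection.
    All later hypotheses are then automatically retained."""
    rejected = []
    for p in pvals:
        if p <= alpha:
            rejected.append(True)
        else:
            rejected.extend([False] * (len(pvals) - len(rejected)))
            break
    return rejected

# primary endpoint significant, so the secondary may be tested
print(fixed_sequence_gatekeeper([0.01, 0.04, 0.20]))  # [True, True, False]
```

Note that the ordering must be fixed before seeing the data; this is what makes the full level α reusable at each step.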

Subjects: Statistics and Probability; Research design; Class (computer programming); Clinical Trials as Topic; Gatekeeping; Interpretation (logic); Models, Statistical; Epidemiology; Test procedures; Machine learning; Europe; Bonferroni correction; Multiple comparisons problem; Humans; Artificial intelligence; Algorithm; Mathematics
Published in: Statistics in Medicine

DRUDIT: Web-based DRUgs DIscovery Tools to design small molecules as modulators of biological targets

2019

Motivation: New in silico tools to predict biological affinities for input structures are presented. The tools are implemented in the DRUDIT (DRUgs DIscovery Tools) web service. The DRUDIT biological finder module is based on molecular descriptors computed by the MOLDESTO (MOLecular DEScriptors TOol) software module, developed by the same authors, which can calculate more than one thousand molecular descriptors. At this stage, DRUDIT includes 250 biological targets, but new external targets can be added. This feature extends the application scope of DRUDIT to several fields. Moreover, two more functions are implemented: the multi- and on/off-target tasks. These tool…

Subjects: Statistics and Probability; Service (systems architecture); Polypharmacology; Computer science; In silico; Machine learning; Biochemistry; Biological target finder; Drug discovery; Molecular descriptors; Web application; Computer Simulation; Molecular Biology; Internet; Small molecule; Computer Science Applications; Computational Mathematics; Computational Theory and Mathematics; Biological target; Artificial intelligence; Software; Settore BIO/10 - Biochimica; Settore ING-INF/05 - Sistemi Di Elaborazione Delle Informazioni; Settore CHIM/08 - Chimica Farmaceutica

The Induced Smoothed lasso: A practical framework for hypothesis testing in high dimensional regression.

2020

This paper focuses on hypothesis testing in lasso regression, when one wishes to judge the statistical significance of the regression coefficients in a regression equation involving many covariates. To obtain reliable p-values, we propose a new lasso-type estimator relying on the idea of induced smoothing, which makes it possible to obtain an appropriate covariance matrix and Wald statistics relatively easily. Simulation experiments show that our approach performs well when contrasted with recent inferential tools in the lasso framework. Two real data analyses are presented to illustrate the proposed framework in practice.
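For background, the lasso point estimate itself can be computed with plain cyclic coordinate descent. The sketch below (synthetic data, illustrative penalty value) shows only the estimation step; the paper's contribution, valid Wald-type inference via induced smoothing, is not implemented here:

```python
import numpy as np

def lasso_cd(X, y, lam, n_iter=200):
    """Cyclic coordinate descent for 0.5*||y - X b||^2 + lam*||b||_1."""
    n, p = X.shape
    b = np.zeros(p)
    col_ss = (X ** 2).sum(axis=0)                 # per-column sum of squares
    for _ in range(n_iter):
        for j in range(p):
            r = y - X @ b + X[:, j] * b[j]        # partial residual for coord j
            z = X[:, j] @ r
            b[j] = np.sign(z) * max(abs(z) - lam, 0.0) / col_ss[j]  # soft threshold
    return b

rng = np.random.default_rng(1)
n, p = 200, 30
X = rng.normal(size=(n, p))
beta = np.zeros(p)
beta[:3] = [2.0, -1.5, 1.0]                       # sparse ground truth
y = X @ beta + rng.normal(size=n)

b_hat = lasso_cd(X, y, lam=30.0)
print(np.flatnonzero(np.abs(b_hat) > 1e-8))       # recovered support
```

Because the soft-threshold map is non-differentiable at zero, naive standard errors for b_hat are unreliable, which is precisely the difficulty the induced-smoothing approach addresses.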

Subjects: Statistics and Probability; Induced smoothing; Epidemiology; Computer science; Feature selection; Wald test; Asthma research; Health Information Management; Lasso (statistics); Linear regression; Sparse models; Computer Simulation; Sandwich formula; Statistical hypothesis testing; Covariance matrix; Lung function; Regression analysis; Research Design; Algorithm; Smoothing; Variable selection; Statistics::Theory; Statistics::Machine Learning; Statistics::Methodology; Statistics::Computation
Published in: Statistical Methods in Medical Research

Selecting the tuning parameter in penalized Gaussian graphical models

2019

Penalized inference of Gaussian graphical models is a way to assess the conditional independence structure in multivariate problems. In this setting, the conditional independence structure, corresponding to a graph, is related to the choice of the tuning parameter, which determines the model complexity or degrees of freedom. There has been little research on the degrees of freedom for penalized Gaussian graphical models. In this paper, we propose an estimator of the degrees of freedom in $$\ell_1$$-penalized Gaussian graphical models. Specifically, we derive an estimator inspired by the generalized information criterion and propose to use this estimator as the bias term for two informatio…
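Tuning-parameter selection for the graphical lasso can be sketched with scikit-learn by scanning a penalty grid and scoring each fit with a naive BIC-style criterion that counts nonzero precision entries as degrees of freedom. That naive count is exactly the quantity the paper refines; the grid, data, and criterion below are illustrative assumptions, not the authors' estimator:

```python
import numpy as np
from sklearn.covariance import GraphicalLasso

# toy data from a sparse precision matrix
rng = np.random.default_rng(0)
p = 5
prec = np.eye(p)
prec[0, 1] = prec[1, 0] = 0.4                     # one conditional dependence
cov = np.linalg.inv(prec)
X = rng.multivariate_normal(np.zeros(p), cov, size=500)

n = X.shape[0]
S = np.cov(X.T)                                   # sample covariance
best = None
for alpha in [0.01, 0.05, 0.1, 0.2]:
    K = GraphicalLasso(alpha=alpha).fit(X).precision_
    loglik = n / 2 * (np.linalg.slogdet(K)[1] - np.trace(S @ K))
    df = np.count_nonzero(np.triu(K, 1)) + p      # naive df: nonzero entries
    bic = -2 * loglik + np.log(n) * df
    if best is None or bic < best[0]:
        best = (bic, alpha)
print(best[1])                                    # selected tuning parameter
```

Swapping the naive df for a better estimator changes only one line of this loop, which is why an accurate degrees-of-freedom estimate matters for model selection.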

Subjects: Statistics and Probability; Kullback–Leibler divergence; Computer science; Gaussian; Information criteria; Model complexity; Model selection; Theoretical Computer Science; Generalized information criterion; Entropy (information theory); Graphical model; Penalized likelihood; Estimator; Computational Theory and Mathematics; Conditional independence; Statistics, Probability and Uncertainty; Statistics::Theory; Statistics::Machine Learning; Statistics::Methodology; Statistics::Computation; Settore SECS-S/01 - Statistica
Published in: Statistics and Computing

Test and power considerations for multiple endpoint analyses using sequentially rejective graphical procedures

2009

A variety of powerful test procedures are available for the analysis of clinical trials addressing multiple objectives, such as comparing several treatments with a control, assessing the benefit of a new drug for more than one endpoint, etc. However, some of these procedures have reached a level of complexity that makes it difficult to communicate the underlying test strategies to clinical teams. Graphical approaches have been proposed instead that facilitate the derivation and communication of Bonferroni-based closed test procedures. In this paper we give a coherent description of the methodology and illustrate it with a real clinical trial example. We further discuss suitable power measur…
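The simplest sequentially rejective Bonferroni-based procedure is Holm's, which the graphical approach generalizes by letting rejected hypotheses pass their significance level along a directed graph. A small sketch of Holm's procedure (illustrative, not taken from the paper):

```python
def holm(pvals, alpha=0.05):
    """Holm's sequentially rejective Bonferroni procedure: test p-values
    in increasing order against alpha/m, alpha/(m-1), ..., alpha, and
    stop at the first failure."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    rejected = [False] * m
    for k, i in enumerate(order):
        if pvals[i] <= alpha / (m - k):
            rejected[i] = True
        else:
            break
    return rejected

print(holm([0.011, 0.02, 0.03]))  # thresholds alpha/3, alpha/2, alpha
```

In the graphical formulation, this corresponds to equal initial weights and uniform redistribution of the level after each rejection.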

Subjects: Statistics and Probability; Test strategy; Endpoint Determination; Epidemiology; Computer science; Control (management); Analysis of clinical trials; Machine learning; Drug Therapy; Computer Graphics; Confidence Intervals; Humans; Multicenter Studies as Topic; Randomized Controlled Trials as Topic; Variety (cybernetics); Test (assessment); Clinical trial; Bonferroni correction; Clinical Trials, Phase III as Topic; Data Interpretation, Statistical; Multiple comparisons problem; Artificial intelligence; Algorithm
Published in: Statistics in Medicine

What subject matter questions motivate the use of machine learning approaches compared to statistical models for probability prediction?

2014

This is a discussion of the following papers: "Probability estimation with machine learning methods for dichotomous and multicategory outcome: Theory" by Jochen Kruppa, Yufeng Liu, Gérard Biau, Michael Kohler, Inke R. König, James D. Malley, and Andreas Ziegler; and "Probability estimation with machine learning methods for dichotomous and multicategory outcome: Applications" by Jochen Kruppa, Yufeng Liu, Hans-Christian Diener, Theresa Holste, Christian Weimar, Inke R. König, and Andreas Ziegler.

Subjects: Statistics and Probability; Probability estimation; Statistical model; General Medicine; Machine learning; Logistic regression; Multicategory; Outcome (probability); Subject matter; Econometrics; Artificial intelligence; Statistics, Probability and Uncertainty; Mathematics
Published in: Biometrical Journal

Contributed discussion on article by Pratola

2016

The author should be commended for his outstanding contribution to the literature on Bayesian regression tree models. The author introduces three innovative sampling approaches which allow for efficient traversal of the model space. In this response, we add a fourth alternative.

Subjects: Statistics and Probability; Model selection; Markov chain Monte Carlo (MCMC); Bayesian regression tree (BRT) models; Computer science; Big data; Birth–death process; Machine learning; Sequential Monte Carlo methods; Population Markov chain Monte Carlo; Bayesian Regression Trees (BART); Bayesian treed regression; Multiple Try Metropolis algorithms; Statistical inference; Applied Mathematics; Rejection sampling; Data science; Variable-order Bayesian network; Tree (data structure); Tree traversal; Continuous-time Markov process; Artificial intelligence; Bayesian linear regression; Communication-free; Gibbs sampling
Published in: Bayesian Analysis

On Limiting Fréchet ε-Subdifferentials

1998

This paper presents an ε-subdifferential calculus for nonconvex and nonsmooth functions. We extend the previous work by Jofré et al. to the case where the functions are lower semicontinuous instead of locally Lipschitz.
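For context, the Fréchet ε-subdifferential is usually defined as follows (a standard definition from the variational-analysis literature, quoted for orientation rather than taken from the paper):

```latex
\hat{\partial}_{\varepsilon} f(\bar{x}) =
  \left\{ x^{*} \in X^{*} \;:\;
    \liminf_{x \to \bar{x}}
    \frac{f(x) - f(\bar{x}) - \langle x^{*},\, x - \bar{x} \rangle}
         {\lVert x - \bar{x} \rVert}
    \ge -\varepsilon \right\}
```

The limiting subdifferential is then obtained as a sequential outer limit of these sets as x → x̄ with f(x) → f(x̄) and ε ↓ 0, which is why lower semicontinuity of f suffices for the calculus.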

Subjects: Pure mathematics; Work (thermodynamics); Tangent cone; Differential calculus; Limiting; Lipschitz continuity; Statistics::Machine Learning; Mathematics::Optimization and Control; Mathematics

On the Ambiguous Consequences of Omitting Variables

2015

This paper studies what happens when we move from a short regression to a long regression (or vice versa), when the long regression is shorter than the data-generation process. In the special case where the long regression equals the data-generation process, the least-squares estimators have smaller bias (in fact zero bias) but larger variances in the long regression than in the short regression. But if the long regression is also misspecified, the bias may not be smaller. We provide bias and mean squared error comparisons and study the dependence of the differences on the misspecification parameter.
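The bias-variance trade-off described above can be checked with a small Monte Carlo simulation in the special case where the long regression equals the data-generation process (all parameter values below are illustrative assumptions):

```python
import numpy as np

# Compare the short regression (x1 only) with the long regression (x1, x2)
# when the truth is y = b1*x1 + b2*x2 + e and x1, x2 are correlated.
rng = np.random.default_rng(0)
b1, b2, rho = 1.0, 0.5, 0.8
reps, n = 2000, 100
short_est, long_est = [], []
for _ in range(reps):
    x1 = rng.normal(size=n)
    x2 = rho * x1 + np.sqrt(1 - rho**2) * rng.normal(size=n)
    y = b1 * x1 + b2 * x2 + rng.normal(size=n)
    short_est.append((x1 @ y) / (x1 @ x1))        # OLS of y on x1 alone
    X = np.column_stack([x1, x2])
    long_est.append(np.linalg.lstsq(X, y, rcond=None)[0][0])
short_est, long_est = np.array(short_est), np.array(long_est)

print(short_est.mean())                  # biased toward b1 + rho*b2 = 1.4
print(long_est.mean())                   # approximately unbiased: 1.0
print(short_est.var() < long_est.var())  # short regression: smaller variance
```

The short-regression estimator concentrates around b1 + ρ·b2 rather than b1, while the long regression is unbiased but noisier, matching the zero-bias-versus-larger-variance statement in the abstract.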

Subjects: Bias; Misspecification; Least-squares estimators; Mean squared error; Omitted variables; JEL: C13, C51, C52; ddc:330; Statistics::Machine Learning; Statistics::Theory; Statistics::Methodology; Statistics::Computation