Search results for "Hyperparameter"
showing 10 items of 22 documents
Implicit differentiation for fast hyperparameter selection in non-smooth convex learning
2022
Finding the optimal hyperparameters of a model can be cast as a bilevel optimization problem, typically solved using zero-order techniques. In this work we study first-order methods when the inner optimization problem is convex but non-smooth. We show that the forward-mode differentiation of proximal gradient descent and proximal coordinate descent yields sequences of Jacobians converging toward the exact Jacobian. Using implicit differentiation, we show it is possible to leverage the non-smoothness of the inner problem to speed up the computation. Finally, we provide a bound on the error made on the hypergradient when the inner optimization problem is solved approxim…
Towards Model-Based Reinforcement Learning for Industry-Near Environments
2019
Deep reinforcement learning has over the past few years shown great potential in learning near-optimal control in complex simulated environments with little visible information. Rainbow (Q-Learning) and PPO (Policy Optimisation) have shown outstanding performance in a variety of tasks, including Atari 2600, MuJoCo, and the Roboschool test suite. Although these algorithms are fundamentally different, both suffer from high variance, low sample efficiency, and hyperparameter sensitivity that, in practice, make them unsuitable for critical operations in industry.
Online Hyperparameter Search Interleaved with Proximal Parameter Updates
2021
There is a clear need for efficient hyperparameter optimization (HO) algorithms for statistical learning, since commonly applied search methods (such as grid search with N-fold cross-validation) are inefficient and/or approximate. Previously existing gradient-based HO algorithms that rely on the smoothness of the cost function cannot be applied in problems such as Lasso regression. In this contribution, we develop an HO method that relies on the structure of proximal gradient methods and does not require a smooth cost function. The method is applied to leave-one-out (LOO)-validated Lasso and Group Lasso, and an online variant is proposed. Numerical experiments corroborate the convergence …
Passive millimeter wave image classification with large scale Gaussian processes
2017
Passive Millimeter Wave Images (PMMWIs) are being increasingly used to identify and localize objects concealed under clothing. Taking into account the quality of these images and the unknown position, shape, and size of the hidden objects, large data sets are required to build successful classification/detection systems. Kernel methods, in particular Gaussian Processes (GPs), are sound, flexible, and popular techniques to address supervised learning problems. Unfortunately, their computational cost is known to be prohibitive for large scale applications. In this work, we present a novel approach to PMMWI classification based on the use of Gaussian Processes for large data sets. The proposed…
A Tsetlin Machine with Multigranular Clauses
2019
The recently introduced Tsetlin Machine (TM) has provided competitive pattern recognition accuracy in several benchmarks; however, it requires a 3-dimensional hyperparameter search. In this paper, we introduce the Multigranular Tsetlin Machine (MTM). The MTM eliminates the specificity hyperparameter, used by the TM to control the granularity of the conjunctive clauses that it produces for recognizing patterns. Instead of using a fixed global specificity, we encode varying specificity as part of the clauses, rendering the clauses multigranular. This makes it easier to configure the TM because the dimensionality of the hyperparameter search space is reduced to only two dimensions. Indeed, it tur…
Geographical variation in pharmacological prescription
2009
Promoting rational drug administration in treatments is one of the most important issues in Public Health. Bayesian hierarchical models are a very useful tool for incorporating geographical information into the analysis of pharmacological prescription data. They allow the mapping of spatial components which express the trend of geographical variation. In addition, these models are able to deal with uncertainty in a sequential way through prior distributions on parameters and hyperparameters. Bayes' theorem combines all types of information and provides the posterior distribution which is computed through Markov Chain Monte Carlo (MCMC) simulation methods. Simulated data for pharmacological …
A Novel System for Multi-level Crohn’s Disease Classification and Grading Based on a Multiclass Support Vector Machine
2020
Crohn’s disease (CD) is a chronic inflammatory condition of the gastrointestinal tract that can substantially alter patients’ quality of life. Diagnostic imaging, such as Enterography Magnetic Resonance Imaging (E-MRI), provides crucial information for CD activity assessment. Automatic learning methods play a fundamental role in the classification of CD and make it possible to avoid the long and expensive manual classification process by radiologists. This paper presents a novel classification method that uses a multiclass Support Vector Machine (SVM) based on a Radial Basis Function (RBF) kernel for the grading of CD inflammatory activity. To validate the system, we have used a dataset composed of 800 E-MRI…
Biophysical parameter estimation with adaptive Gaussian Processes
2009
We evaluate Gaussian Processes (GPs) for the estimation of biophysical parameters from acquired multispectral data. The standard GP formulation is used, and all hyperparameters (kernel parameters and noise variance) are optimized by maximizing the marginal likelihood. This gives rise to a GP that adapts fully to the data characteristics, in terms of both signal and noise properties. The good numerical results in the estimation of oceanic chlorophyll concentration and leaf membrane state confirm GPs as adequate, alternative non-parametric methods for biophysical parameter estimation. GPs are also analyzed by scrutinizing the predictive variance, the estimated noise variance, and the relevance of ea…
A deep learning approach for the segmentation of myocardial diseases
2021
Cardiac left ventricular (LV) segmentation is an essential step for both diagnosis and treatment of cardiac pathologies such as ischemia, myocardial infarction, arrhythmia and myocarditis. However, this segmentation is challenging due to high variability across patients and the potential lack of contrast between structures. In this work, we propose and evaluate a (2.5D) SegU-Net model based on the fusion of two deep learning segmentation techniques (U-Net and Seg-Net) for automated LGE-MRI (late gadolinium-enhanced magnetic resonance imaging) myocardial disease (infarct core and no-reflow region) quantification in a new multifield expert-annotated dataset. Given that the scar tissu…
A novel approach to quantifying the sensitivity of current and future cosmological datasets to the neutrino mass ordering through Bayesian hierarchic…
2017
We present a novel approach to derive constraints on neutrino masses from cosmological data, while taking into account our ignorance of the neutrino mass ordering. We derive constraints from a combination of current and future cosmological datasets on the total neutrino mass $M_\nu$ and on the mass fractions carried by each of the mass eigenstates, after marginalizing over the (unknown) neutrino mass ordering, either normal (NH) or inverted (IH). The bounds therefore take into account the uncertainty related to our ignorance of the mass hierarchy. This novel approach is carried out in the framework of Bayesian analysis of a typical hierarchical problem. In this context, the choice of the ne…