
AUTHOR

Samuel Vaiter

From optimization to algorithmic differentiation: a graph detour

This manuscript highlights the work of the author since his appointment as "Chargé de Recherche" (research scientist) at the Centre national de la recherche scientifique (CNRS) in 2015. In particular, the author shows a thematic and chronological evolution of his research interests:
- The first part, following his post-doctoral work, is concerned with the development of new algorithms for non-smooth optimization.
- The second part is the heart of his research in 2020. It is focused on the analysis of machine learning methods for graph (signal) processing.
- Finally, the third and last part, oriented towards the future, is concerned with the (automatic or not) differentiation of algorithms for learnin…

research product

Model identification and local linear convergence of coordinate descent

For composite nonsmooth optimization problems, the Forward-Backward algorithm achieves model identification (e.g., support identification for the Lasso) after a finite number of iterations, provided the objective function is regular enough. Results concerning coordinate descent are scarcer and model identification has only been shown for specific estimators, such as the support-vector machine. In this work, we show that cyclic coordinate descent achieves model identification in finite time for a wide class of functions. In addition, we prove explicit local linear convergence rates for coordinate descent. Extensive experiments on various estimators and on real datasets demonstrate that thes…
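
The following is a minimal numerical sketch (not the authors' code; the solver, data and hyperparameters are illustrative) of cyclic coordinate descent on the Lasso, tracking the first pass after which the support no longer changes:

```python
# Cyclic coordinate descent for  min_b 0.5*||y - X b||^2 + lam*||b||_1,
# recording the support ("model") after each full pass over the coordinates.
import numpy as np

def soft_threshold(x, tau):
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

def cd_lasso(X, y, lam, n_passes=200):
    n, p = X.shape
    beta = np.zeros(p)
    lipschitz = (X ** 2).sum(axis=0)            # per-coordinate Lipschitz constants
    residual = y - X @ beta
    supports = []
    for _ in range(n_passes):
        for j in range(p):
            old = beta[j]
            grad_j = -X[:, j] @ residual        # partial derivative of the smooth part
            beta[j] = soft_threshold(old - grad_j / lipschitz[j], lam / lipschitz[j])
            if beta[j] != old:
                residual += X[:, j] * (old - beta[j])
        supports.append(frozenset(np.flatnonzero(beta)))
    return beta, supports

rng = np.random.default_rng(0)
X = rng.standard_normal((50, 100))
y = X[:, :5] @ rng.standard_normal(5) + 0.1 * rng.standard_normal(50)
beta, supports = cd_lasso(X, y, lam=5.0)
# first pass whose support already equals the final one
first_stable = next(k for k in range(len(supports)) if supports[k] == supports[-1])
print("support identified after pass", first_stable)
```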

research product

Refitting Solutions Promoted by $\ell_{12}$ Sparse Analysis Regularizations with Block Penalties

In inverse problems, the use of an $\ell_{12}$ analysis regularizer induces a bias in the estimated solution. We propose a general refitting framework for removing this artifact while keeping information of interest contained in the biased solution. This is done through the use of refitting block penalties that only act on the co-support of the estimation. Based on an analysis of related works in the literature, we propose a new penalty that is well suited for refitting purposes. We also present an efficient algorithmic method to obtain the refitted solution along with the original (biased) solution for any convex refitting block penalty. Experiments illustrate the good be…
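
As a rough sketch of the general refitting idea (the notation below is illustrative, not the paper's exact formulation), write $\hat{x}$ for the biased analysis estimate, $D$ for the analysis operator and $\hat{J} = \{ j : (D\hat{x})_j = 0 \}$ for its co-support. A hard-constrained refitting reads

$$ \hat{x}^{\mathrm{refit}} \in \operatorname*{arg\,min}_{x} \ \tfrac{1}{2}\|y - \Phi x\|_2^2 \quad \text{subject to} \quad (Dx)_{\hat{J}} = 0, $$

and refitting block penalties replace the hard constraint by a convex penalty acting only on the co-support entries $(Dx)_{\hat{J}}$, which keeps the structure of the biased solution while re-estimating the amplitudes.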

research product

Exploiting regularity in sparse Generalized Linear Models

Generalized Linear Models (GLM) are a wide class of regression and classification models, where the predicted variable is obtained from a linear combination of the input variables. For statistical inference in high dimensions, sparsity inducing regularizations have proven useful while offering statistical guarantees. However, solving the resulting optimization problems can be challenging: even for popular iterative algorithms such as coordinate descent, one needs to loop over a large number of variables. To mitigate this, techniques known as screening rules and working sets diminish the size of the optimization problem at hand, either by progressively removing variables, or by …
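
A minimal sketch of the generic working-set idea for the Lasso (my own toy implementation; the screening rules and working-set policies analysed in the paper are more sophisticated):

```python
# Solve the Lasso restricted to a small feature set, then grow the set with the
# features that most violate the optimality conditions, until none violate them.
import numpy as np

def soft_threshold(x, tau):
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

def ista_restricted(X, y, lam, n_iter=2000):
    # plain proximal gradient on a (small) restricted design matrix
    step = 1.0 / np.linalg.norm(X, 2) ** 2
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        beta = soft_threshold(beta + step * X.T @ (y - X @ beta), step * lam)
    return beta

def working_set_lasso(X, y, lam, ws_size=10, n_outer=10, tol=1e-3):
    p = X.shape[1]
    beta = np.zeros(p)
    ws = np.array([], dtype=int)
    for _ in range(n_outer):
        residual = y - X @ beta
        if np.max(np.abs(X.T @ residual)) <= lam + tol:
            break                                   # optimality conditions hold globally
        violations = np.abs(X.T @ residual) - lam   # KKT violation scores
        violations[ws] = -np.inf                    # already in the working set
        new = np.argsort(violations)[-ws_size:]     # most violating features
        ws = np.union1d(ws, new)
        beta = np.zeros(p)
        beta[ws] = ista_restricted(X[:, ws], y, lam)
    return beta

rng = np.random.default_rng(0)
X = rng.standard_normal((100, 500))
y = X[:, :3] @ np.array([3.0, -2.0, 1.5]) + 0.1 * rng.standard_normal(100)
beta = working_set_lasso(X, y, lam=10.0)
print("selected features:", np.flatnonzero(beta))
```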

research product

Convergence and Stability of Graph Convolutional Networks on Large Random Graphs

We study properties of Graph Convolutional Networks (GCNs) by analyzing their behavior on standard models of random graphs, where nodes are represented by random latent variables and edges are drawn according to a similarity kernel. This allows us to overcome the difficulties of dealing with discrete notions such as isomorphisms on very large graphs, by considering instead more natural geometric aspects. We first study the convergence of GCNs to their continuous counterpart as the number of nodes grows. Our results are fully non-asymptotic and are valid for relatively sparse graphs with an average degree that grows logarithmically with the number of nodes. We then an…
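
A minimal sketch of the setting (the model, kernel and layer sizes below are illustrative assumptions): nodes carry latent positions, edges are Bernoulli draws from a similarity kernel, and one normalized graph-convolution layer is applied.

```python
# Latent-position random graph + a single graph-convolution step.
import numpy as np

rng = np.random.default_rng(0)
n = 500
latent = rng.uniform(-1, 1, size=(n, 2))                               # latent node positions
kernel = np.exp(-np.square(latent[:, None] - latent[None]).sum(-1))    # similarity kernel in (0, 1]
np.fill_diagonal(kernel, 0.0)
adj = (rng.uniform(size=(n, n)) < kernel).astype(float)
adj = np.triu(adj, 1)
adj = adj + adj.T                                                      # symmetric, no self-loops

deg = adj.sum(1)
norm_adj = adj / np.sqrt(np.outer(deg, deg))                           # D^{-1/2} A D^{-1/2}

features = latent                                                      # use positions as input features
weights = rng.standard_normal((2, 4)) / np.sqrt(2)
hidden = np.maximum(norm_adj @ features @ weights, 0.0)                # one GCN layer with ReLU
print(hidden.shape)
```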

research product

Characterizing the maximum parameter of the total-variation denoising through the pseudo-inverse of the divergence

We focus on the maximum regularization parameter for anisotropic total-variation denoising. It corresponds to the minimum value of the regularization parameter above which the solution remains constant. While this value is well known for the Lasso, such a critical value has not been investigated in detail for the total-variation. Yet, it is important when tuning the regularization parameter, as it allows fixing an upper bound on the grid over which the optimal parameter is sought. We establish a closed form expression for the one-dimensional case, as well as an upper bound for the two-dimensional case, that appears reasonably tight in practice. This problem is d…
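
For the one-dimensional case, here is a minimal numerical sketch of the closed form (notation mine: D is the finite-difference operator, so that its transpose plays the role of the divergence; in 2D the same expression only gives an upper bound, as stated above).

```python
# For  min_x 0.5*||x - y||^2 + lam*||D x||_1  with D the 1D finite-difference operator,
# the solution is constant (equal to mean(y)) as soon as
#   lam >= lam_max = || pinv(D^T) (y - mean(y)) ||_inf,
# since D^T has full column rank, so the dual certificate is unique.
import numpy as np

rng = np.random.default_rng(0)
n = 50
y = np.cumsum(rng.standard_normal(n))      # a random 1D signal

D = np.diff(np.eye(n), axis=0)             # (n-1) x n finite-difference matrix
lam_max = np.max(np.abs(np.linalg.pinv(D.T) @ (y - y.mean())))
print("lambda_max =", lam_max)
```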

research product

On the Universality of Graph Neural Networks on Large Random Graphs

We study the approximation power of Graph Neural Networks (GNNs) on latent position random graphs. In the large graph limit, GNNs are known to converge to certain "continuous" models known as c-GNNs, which directly enables a study of their approximation power on random graph models. In the absence of input node features however, just as GNNs are limited by the Weisfeiler-Lehman isomorphism test, c-GNNs will be severely limited on simple random graph models. For instance, they will fail to distinguish the communities of a well-separated Stochastic Block Model (SBM) with constant degree function. Thus, we consider recently proposed architectures that augment GNNs with …
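
A toy illustration (mine, not from the paper) of a two-community SBM in which every node has the same expected degree, the situation where featureless architectures carry no community signal:

```python
# Symmetric two-block SBM: within-probability p_in, across-probability p_out.
# Expected degree is the same for every node, regardless of its community.
import numpy as np

rng = np.random.default_rng(0)
n, p_in, p_out = 1000, 0.06, 0.02
labels = np.repeat([0, 1], n // 2)
probs = np.where(labels[:, None] == labels[None], p_in, p_out)
adj = np.triu((rng.uniform(size=(n, n)) < probs).astype(float), 1)
adj = adj + adj.T

deg = adj.sum(1)
print("mean degree, community 0:", deg[labels == 0].mean())
print("mean degree, community 1:", deg[labels == 1].mean())   # nearly identical by construction
```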

research product

Dual Extrapolation for Sparse Generalized Linear Models

Generalized Linear Models (GLM) form a wide class of regression and classification models, where prediction is a function of a linear combination of the input variables. For statistical inference in high dimension, sparsity inducing regularizations have proven to be useful while offering statistical guarantees. However, solving the resulting optimization problems can be challenging: even for popular iterative algorithms such as coordinate descent, one needs to loop over a large number of variables. To mitigate this, techniques known as screening rules and working sets diminish the size of the optimization problem at hand, either by progressively removing variables, o…
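
A minimal sketch of the standard Lasso duality gap that such screening and working-set policies typically monitor (the dual point is obtained by rescaling the residual; this is not the extrapolation scheme of the paper):

```python
# Duality gap for  min_b 0.5*||y - X b||^2 + lam*||b||_1:
# the dual is  max_theta 0.5*||y||^2 - 0.5*||y - theta||^2  s.t.  ||X^T theta||_inf <= lam.
import numpy as np

def lasso_duality_gap(X, y, beta, lam):
    residual = y - X @ beta
    primal = 0.5 * residual @ residual + lam * np.abs(beta).sum()
    # rescale the residual so that ||X^T theta||_inf <= lam (dual feasibility)
    theta = residual / max(1.0, np.max(np.abs(X.T @ residual)) / lam)
    dual = 0.5 * (y @ y) - 0.5 * np.sum((y - theta) ** 2)
    return primal - dual

rng = np.random.default_rng(0)
X = rng.standard_normal((100, 300))
y = X[:, :4] @ rng.standard_normal(4) + 0.1 * rng.standard_normal(100)
beta = np.zeros(300)                       # crude candidate solution
print("duality gap at beta = 0:", lasso_duality_gap(X, y, beta, lam=20.0))
```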

research product

Refitting solutions promoted by $\ell_{12}$ sparse analysis regularization with block penalties

In inverse problems, the use of an $\ell_{12}$ analysis regularizer induces a bias in the estimated solution. We propose a general refitting framework for removing this artifact while keeping information of interest contained in the biased solution. This is done through the use of refitting block penalties that only act on the co-support of the estimation. Based on an analysis of related works in the literature, we propose a new penalty that is well suited for refitting purposes. We also present an efficient algorithmic method to obtain the refitted solution along with the original (biased) solution for any convex refitting block penalty. Experiments illustrate the g…

research product

CLEAR: Covariant LEAst-Square Refitting with Applications to Image Restoration

In this paper, we propose a new framework to remove parts of the systematic errors affecting popular restoration algorithms, with a special focus on image processing tasks. Generalizing ideas that emerged for $\ell_1$ regularization, we develop an approach re-fitting the results of standard methods towards the input data. Total variation regularizations and non-local means are special cases of interest. We identify important covariant information that should be preserved by the re-fitting method, and emphasize the importance of preserving the Jacobian (w.r.t. the observed signal) of the original estimator. Then, we provide an approach that has a ``twicing'' flavor a…
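
The ``twicing'' idea mentioned above can be illustrated with a deliberately simple denoiser (this is classical Tukey twicing, not the CLEAR estimator itself):

```python
# Run the estimator once, then run it again on the residual and add the two,
# re-fitting the over-smoothed first pass towards the input data.
import numpy as np

def moving_average(x, width=9):
    kernel = np.ones(width) / width
    return np.convolve(x, kernel, mode="same")

rng = np.random.default_rng(0)
t = np.linspace(0, 1, 400)
signal = np.sin(6 * np.pi * t)
y = signal + 0.3 * rng.standard_normal(t.size)

first_pass = moving_average(y)
twiced = first_pass + moving_average(y - first_pass)   # add back the smoothed residual
print("MSE first pass:", np.mean((first_pass - signal) ** 2))
print("MSE twicing   :", np.mean((twiced - signal) ** 2))
```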

research product

Implicit differentiation of Lasso-type models for hyperparameter optimization

Setting regularization parameters for Lasso-type estimators is notoriously difficult, though crucial in practice. The most popular hyperparameter optimization approach is grid-search using held-out validation data. Grid-search however requires choosing a predefined grid for each parameter, which scales exponentially in the number of parameters. Another approach is to cast hyperparameter optimization as a bi-level optimization problem that one can solve by gradient descent. The key challenge for these methods is the estimation of the gradient w.r.t. the hyperparameters. Computing this gradient via forward or backward automatic differentiation is possible yet usually s…
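
A minimal sketch of the implicit-differentiation closed form for the Lasso (toy solver and data are mine): on the support S of the solution of min_b 0.5*||y - X b||^2 + lam*||b||_1, one has d b_S / d lam = -(X_S^T X_S)^{-1} sign(b_S), which yields the gradient of a held-out loss with respect to lam.

```python
import numpy as np

def soft_threshold(x, tau):
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

def ista(X, y, lam, n_iter=5000):
    step = 1.0 / np.linalg.norm(X, 2) ** 2
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        beta = soft_threshold(beta + step * X.T @ (y - X @ beta), step * lam)
    return beta

rng = np.random.default_rng(0)
X, X_val = rng.standard_normal((100, 200)), rng.standard_normal((50, 200))
true = np.zeros(200); true[:5] = rng.standard_normal(5)
y = X @ true + 0.1 * rng.standard_normal(100)
y_val = X_val @ true + 0.1 * rng.standard_normal(50)

lam = 5.0
beta = ista(X, y, lam)
S = np.flatnonzero(beta)
dbeta_dlam = np.zeros(200)
dbeta_dlam[S] = -np.linalg.solve(X[:, S].T @ X[:, S], np.sign(beta[S]))  # implicit differentiation
# gradient of the validation loss 0.5*mean((X_val b - y_val)^2) w.r.t. lam
hypergrad = (X_val @ beta - y_val) @ (X_val @ dbeta_dlam) / len(y_val)
print("hypergradient of validation loss w.r.t. lambda:", hypergrad)
```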

research product

Sparse and Smooth: improved guarantees for Spectral Clustering in the Dynamic Stochastic Block Model

In this paper, we analyse classical variants of the Spectral Clustering (SC) algorithm in the Dynamic Stochastic Block Model (DSBM). Existing results show that, in the relatively sparse case where the expected degree grows logarithmically with the number of nodes, guarantees in the static case can be extended to the dynamic case and yield improved error bounds when the DSBM is sufficiently smooth in time, that is, the communities do not change too much between two time steps. We improve over these results by drawing a new link between the sparsity and the smoothness of the DSBM: the more regular the DSBM is, the more sparse it can be, while still guaranteeing consistent recovery. In particu…
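
A minimal sketch of spectral clustering on a static two-community SBM (the dynamic, smoothed variants analysed in the paper are not reproduced here):

```python
# Two-block SBM, then cluster by the sign of the second leading eigenvector of the adjacency.
import numpy as np

rng = np.random.default_rng(0)
n, p_in, p_out = 600, 0.08, 0.02
labels = np.repeat([0, 1], n // 2)
probs = np.where(labels[:, None] == labels[None], p_in, p_out)
adj = np.triu((rng.uniform(size=(n, n)) < probs).astype(float), 1)
adj = adj + adj.T

eigvals, eigvecs = np.linalg.eigh(adj)            # ascending eigenvalues
predicted = (eigvecs[:, -2] > 0).astype(int)      # sign split on the 2nd leading eigenvector
accuracy = max(np.mean(predicted == labels), np.mean(predicted != labels))
print("clustering accuracy:", accuracy)
```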

research product

Implicit differentiation for fast hyperparameter selection in non-smooth convex learning

Finding the optimal hyperparameters of a model can be cast as a bilevel optimization problem, typically solved using zero-order techniques. In this work we study first-order methods when the inner optimization problem is convex but non-smooth. We show that the forward-mode differentiation of proximal gradient descent and proximal coordinate descent yield sequences of Jacobians converging toward the exact Jacobian. Using implicit differentiation, we show it is possible to leverage the non-smoothness of the inner problem to speed up the computation. Finally, we provide a bound on the error made on the hypergradient when the inner optimization problem is solved approxim…
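
A minimal sketch of forward-mode (iterative) differentiation of proximal gradient descent for the Lasso with respect to lambda (the toy problem and notation are mine): the Jacobian d beta_k / d lambda is propagated alongside the iterates.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((100, 200))
true = np.zeros(200); true[:5] = rng.standard_normal(5)
y = X @ true + 0.1 * rng.standard_normal(100)
lam = 5.0
step = 1.0 / np.linalg.norm(X, 2) ** 2

beta = np.zeros(200)
jac = np.zeros(200)                                 # d beta / d lambda
for _ in range(2000):
    z = beta + step * X.T @ (y - X @ beta)          # gradient step
    dz = jac - step * (X.T @ (X @ jac))             # its derivative w.r.t. lambda
    active = np.abs(z) > step * lam                 # soft-thresholding pattern
    beta = np.where(active, z - step * lam * np.sign(z), 0.0)
    jac = np.where(active, dz - step * np.sign(z), 0.0)
print("nonzero entries of the Jacobian:", np.flatnonzero(jac))
```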

research product