Search results for "High Dimension"
showing 9 items of 19 documents
On Shimura subvarieties of the Prym locus
2018
We show that families of Pryms of abelian Galois covers of $\mathbb{P}^1$ in $A_{g-1}$ (resp. $A_g$) do not give rise to high-dimensional Shimura subvarieties.
From optimization to algorithmic differentiation: a graph detour
2021
This manuscript highlights the work of the author since he was nominated as "Chargé de Recherche" (research scientist) at the Centre national de la recherche scientifique (CNRS) in 2015. In particular, the author shows a thematic and chronological evolution of his research interests. The first part, following his post-doctoral work, is concerned with the development of new algorithms for non-smooth optimization. The second part is the heart of his research in 2020. It is focused on the analysis of machine learning methods for graph (signal) processing. Finally, the third and last part, oriented towards the future, is concerned with (automatic or not) differentiation of algorithms for learnin…
A fast and recursive algorithm for clustering large datasets with k-medians
2012
Clustering large samples of high-dimensional data with fast algorithms is an important challenge in computational statistics. Borrowing ideas from MacQueen (1967), who introduced a sequential version of the $k$-means algorithm, a new class of recursive stochastic gradient algorithms designed for the $k$-medians loss criterion is proposed. By their recursive nature, these algorithms are very fast and are well adapted to deal with large samples of data that are allowed to arrive sequentially. It is proved that the stochastic gradient algorithm converges almost surely to the set of stationary points of the underlying loss criterion. Particular attention is paid to the averaged versions, which…
On the empirical spectral distribution for certain models related to sample covariance matrices with different correlations
2021
Given [Formula: see text], we study two classes of large random matrices of the form [Formula: see text] where for every [Formula: see text], [Formula: see text] are iid copies of a random variable [Formula: see text], [Formula: see text], [Formula: see text] are two (not necessarily independent) sets of independent random vectors having different covariance matrices and generating well concentrated bilinear forms. We consider two main asymptotic regimes as [Formula: see text]: a standard one, where [Formula: see text], and a slightly modified one, where [Formula: see text] and [Formula: see text] while [Formula: see text] for some [Formula: see text]. Assuming that vectors [Formula: see t…
Stochastic algorithms for robust statistics in high dimension
2016
This thesis focuses on stochastic algorithms in high dimension as well as their application in robust statistics. In what follows, the expression high dimension may be used when the size of the studied sample is large or when the variables we consider take values in high-dimensional spaces (not necessarily finite). In order to analyze this kind of data, it can be interesting to consider algorithms which are fast, which do not need to store all the data, and which allow the estimates to be updated easily. In large samples of high-dimensional data, outlier detection is often complicated. Nevertheless, these outliers, even if they are not many, can strongly disturb simple indicators like the me…
Feature Selection Approach based on Mutual Information and Partial Least Squares
2014
Feature selection can improve modeling accuracy and reduce model complexity, especially for high-dimensional spectral data. To address this problem, a feature selection approach based on mutual information (MI) and partial least squares (PLS) is proposed in this paper. MI values between the features and the response variable are calculated, and the threshold value used to select the final features is optimally selected based on the PLS algorithm. The number of latent variables of the PLS and the threshold value of MI are selected simultaneously according to the modeling performance. The experimental results based on the near-infrared spectrum show that the proposed approach has better perfor…
Saliency in spectral images
2011
Even though the study of saliency for color images has been thoroughly investigated in the past, very little attention has been given to datasets that cannot be displayed on traditional computer screens such as spectral images. Nevertheless, more than a means to predict human gaze, the study of saliency primarily allows for measuring informative content. Thus, we propose a novel approach for the computation of saliency maps for spectral images. Based on the Itti model, it involves the extraction of both spatial and spectral features, suitable for high-dimensionality images. As an application, we present a comparison framework to evaluate how dimensionality reduct…
The Three Steps of Clustering in the Post-Genomic Era: A Synopsis
2011
Clustering is one of the best-known activities in scientific investigation and the object of research in many disciplines, ranging from Statistics to Computer Science. Following Handl et al., it can be summarized as a three-step process: (a) choice of a distance function; (b) choice of a clustering algorithm; (c) choice of a validation method. Although such a purist approach to clustering is hardly seen in many areas of science, genomic data require that level of attention if inferences made from cluster analysis are to be of some relevance to biomedical research. Unfortunately, the high dimensionality of the data and their noisy nature make cluster analysis of genomic data particul…
Sparse relative risk survival modelling
2016
Cancer survival is thought to be closely linked to the genomic constitution of the tumour. Discovering such signatures will be useful in the diagnosis of the patient and may be used for treatment decisions and perhaps even the development of new treatments. However, genomic data are typically noisy and high-dimensional, with the number of variables often outstripping the number of subjects included in the study. Regularized survival models have been proposed to deal with such scenarios. These methods typically induce sparsity by means of a coincidental match of the geometry of the convex likelihood and (near) non-convex regularizer.