Search results for "High dimensional"
showing 8 items of 18 documents
Image retrieval system for citizen services using penalized logistic regression models
2020
This paper describes a procedure to deal with large image collections obtained by smart city services based on interaction with citizens providing pictures. The semantic gap between the low-level image features and represented concepts and situations has been addressed using image retrieval techniques. A relevance feedback procedure is proposed for Content-Based Image Retrieval (CBIR) based on the modelling of user responses. One of the novelties of the proposal is that the feedback learning procedure can use the information that citizens themselves can provide when using these services. The proposed algorithm considers the probability of an image belonging to the set of those sought by the …
Variability of Classification Results in Data with High Dimensionality and Small Sample Size
2021
The study focuses on the analysis of biological data containing information on the number of genome sequences of intestinal microbiome bacteria before and after antibiotic use. The data have high dimensionality (bacterial taxa) and a small number of records, which is typical of bioinformatics data. Classification models induced on data sets like this usually are not stable and the accuracy metrics have high variance. The aim of the study is to create a preprocessing workflow and a classification model that can perform the most accurate classification of the microbiome into groups before and after the use of antibiotics and lessen the variability of accuracy measures of the classifier. To ev…
Saliency in spectral images
2011
Even though the study of saliency for color images has been thoroughly investigated in the past, very little attention has been given to datasets that cannot be displayed on traditional computer screens such as spectral images. Nevertheless, more than a means to predict human gaze, the study of saliency primarily allows for measuring informative content. Thus, we propose a novel approach for the computation of saliency maps for spectral images. Based on the Itti model, it involves the extraction of both spatial and spectral features, suitable for high dimensionality images. As an application, we present a comparison framework to evaluate how dimensionality reduct…
The Three Steps of Clustering in the Post-Genomic Era: A Synopsis
2011
Clustering is one of the most well known activities in scientific investigation and the object of research in many disciplines, ranging from Statistics to Computer Science. Following Handl et al., it can be summarized as a three step process: (a) choice of a distance function; (b) choice of a clustering algorithm; (c) choice of a validation method. Although such a purist approach to clustering is hardly seen in many areas of science, genomic data require that level of attention, if inferences made from cluster analysis have to be of some relevance to biomedical research. Unfortunately, the high dimensionality of the data and their noisy nature make cluster analysis of genomic data particul…
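The three steps named in this abstract can be sketched in a few lines of Python. This is only an illustration under assumptions that do not come from the paper: Euclidean distance for step (a), single-linkage agglomerative clustering for step (b), and the mean silhouette width for step (c). The function names and the toy points are hypothetical.

```python
import math
from itertools import combinations


def euclidean(a, b):
    """Step (a): a distance function (Euclidean is an assumed choice)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def single_linkage(points, k, dist):
    """Step (b): merge the two closest clusters until only k remain."""
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > k:
        best = None
        for i, j in combinations(range(len(clusters)), 2):
            # Single linkage: cluster distance is the closest pair of members.
            d = min(dist(points[p], points[q])
                    for p in clusters[i] for q in clusters[j])
            if best is None or d < best[0]:
                best = (d, i, j)
        _, i, j = best
        clusters[i].extend(clusters[j])
        del clusters[j]  # j > i, so index i stays valid
    labels = [0] * len(points)
    for label, members in enumerate(clusters):
        for p in members:
            labels[p] = label
    return labels


def silhouette(points, labels, dist):
    """Step (c): internal validation via the mean silhouette width."""
    scores = []
    for i, p in enumerate(points):
        same = [dist(p, points[j]) for j in range(len(points))
                if j != i and labels[j] == labels[i]]
        others = {}
        for j, q in enumerate(points):
            if labels[j] != labels[i]:
                others.setdefault(labels[j], []).append(dist(p, q))
        if not same or not others:
            scores.append(0.0)  # singleton cluster or only one cluster
            continue
        a = sum(same) / len(same)
        b = min(sum(v) / len(v) for v in others.values())
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / len(scores)


# Toy data (hypothetical): two well-separated groups in the plane.
points = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1),
          (5.0, 5.0), (5.1, 4.9), (4.9, 5.2)]
labels = single_linkage(points, k=2, dist=euclidean)
score = silhouette(points, labels, euclidean)
```

Each step is a separate function, so the distance, the algorithm, or the validation index can be swapped independently. That independence is the point of treating clustering as a three-step process.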
Feature Selection Approach based on Mutual Information and Partial Least Squares
2014
Feature selection can improve modeling accuracy and reduce model complexity, especially for high-dimensional spectral data. To address this problem, a feature selection approach based on mutual information (MI) and partial least squares (PLS) is proposed in this paper. MI values between the features and the response variable are calculated, and the threshold used to select the final features is chosen optimally based on the PLS algorithm. The number of PLS latent variables and the MI threshold are selected simultaneously according to the modeling performance. The experimental results based on the near-infrared spectrum show that the proposed approach has better perfor…
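The MI filtering step described in this abstract can be illustrated with a minimal Python sketch. It makes several assumptions that are not in the paper: the features are already discretized, MI is estimated from empirical counts, and the threshold is a fixed, hypothetical value rather than one tuned through PLS model performance. The PLS stage is omitted entirely.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Empirical mutual information (in bits) between two discrete sequences."""
    n = len(xs)
    px, py = Counter(xs), Counter(ys)
    pxy = Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), c in pxy.items():
        # p(x,y) * log2( p(x,y) / (p(x) p(y)) ), written with raw counts
        mi += (c / n) * math.log2(c * n / (px[x] * py[y]))
    return mi


def select_features(columns, target, threshold):
    """Keep the features whose MI with the target reaches the threshold."""
    scores = {name: mutual_information(values, target)
              for name, values in columns.items()}
    kept = [name for name, s in scores.items() if s >= threshold]
    return kept, scores


# Toy data (hypothetical): one feature copies the target, one is independent.
target = [0, 0, 1, 1, 0, 1, 0, 1]
columns = {
    "informative": [0, 0, 1, 1, 0, 1, 0, 1],
    "noise": [0, 1, 0, 1, 0, 1, 1, 0],
}
# The threshold is fixed by hand here; the abstract instead tunes it via PLS.
kept, scores = select_features(columns, target, threshold=0.5)
```

In the approach the abstract describes, the fixed `threshold` would be replaced by a search over candidate thresholds, each scored by the performance of a PLS model built on the surviving features.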
Using differential geometric LARS algorithm to study the expression profile of a sample of patients with latex-fruit syndrome
2011
Natural rubber latex IgE-mediated hypersensitivity is one of the most important health problems in allergy during recent years. The prevalence of individuals allergic to latex shows an associated hypersensitivity to some plant-derived foods, especially freshly consumed fruit. This association of latex allergy and allergy to plant-derived foods is called latex-fruit syndrome. The aim of this study is to use the differential geometric generalization of the LARS algorithm to identify candidate genes that may be associated with the pathogenesis of allergy to latex or vegetable.
LipidMS: An R Package for Lipid Annotation in Untargeted Liquid Chromatography-Data Independent Acquisition-Mass Spectrometry Lipidomics.
2018
High resolution LC-MS untargeted lipidomics using data independent acquisition (DIA) has the potential to increase lipidome coverage, as it enables the continuous and unbiased acquisition of all eluting ions. However, the loss of the link between the precursor and the product ions combined with the high dimensionality of DIA data sets hinder accurate feature annotation. Here, we present LipidMS, an R package aimed to confidently identify lipid species in untargeted LC-DIA-MS. To this end, LipidMS combines a coelution score, which links precursor and fragment ions with fragmentation and intensity rules. Depending on the MS evidence reached by the identification function survey, LipidMS provi…
Approximation of functions over manifolds : A Moving Least-Squares approach
2021
We present an algorithm for approximating a function defined over a $d$-dimensional manifold utilizing only noisy function values at locations sampled from the manifold with noise. To produce the approximation we do not require any knowledge regarding the manifold other than its dimension $d$. We use the Manifold Moving Least-Squares approach of (Sober and Levin 2016) to reconstruct the atlas of charts and the approximation is built on top of those charts. The resulting approximant is shown to be a function defined over a neighborhood of a manifold, approximating the originally sampled manifold. In other words, given a new point, located near the manifold, the approximation can be evaluated…