Search results for " recognition."
showing 10 items of 3189 documents
A clustering package for nucleotide sequences using Laplacian Eigenmaps and Gaussian Mixture Model.
2018
In this article, a new Python package for nucleotide sequence clustering is proposed. This package, freely available online, implements a Laplacian eigenmap embedding and a Gaussian Mixture Model for DNA clustering. It takes nucleotide sequences as input and produces the optimal number of clusters along with a relevant visualization. Although we did not optimise the computational speed, our method still performs reasonably well in practice. Our focus was mainly on data analytics and accuracy and, as a result, our approach outperforms the state of the art, even in the case of divergent sequences. Furthermore, an a priori knowledge on the number of clust…
Modeling Chronic Toxicity: A Comparison of Experimental Variability With (Q)SAR/Read-Across Predictions
2018
This study compares the accuracy of (Q)SAR/read-across predictions with the experimental variability of chronic lowest-observed-adverse-effect levels (LOAELs) from in vivo experiments. We demonstrate that predictions of the lazy structure-activity relationships (lazar) algorithm within the applicability domain of the training data have the same variability as the experimental training data. Predictions with a lower similarity threshold (i.e., a larger distance from the applicability domain) are also significantly better than random guessing, but the expected errors are higher and manual inspection of the prediction results is highly recommended.
Contribution of the commensal microbiota to atherosclerosis and arterial thrombosis
2018
The commensal gut microbiota is an environmental factor that has been implicated in the development of cardiovascular disease. The development of atherosclerotic lesions is largely influenced not only by the microbial-associated molecular patterns of the gut microbiota but also by the meta-organismal trimethylamine N-oxide pathway. Recent studies have described a role for the gut microbiota in platelet activation and arterial thrombosis. This review summarizes the results from gnotobiotic mouse models and clinical data that linked microbiota-induced pattern recognition receptor signalling with atherogenesis. Based on recent insights, we here provide an overview of how the gut microbiota cou…
Establishing and validating a new source analysis method using phase.
2017
The electroencephalogram (EEG) measures the brain's oscillatory activity non-invasively. The localization of deep brain generators of the electric fields is essential for understanding neuronal function in healthy humans and for unmasking specific regions that cause abnormal activity in patients with neurological disorders. The aim of this study was to test whether the phase estimation from scalp data can be reliably used to identify the number of dipoles in source analyses. The steps performed included: i) modeling different phasic oscillatory signals using auto-regressive processes at a particular frequency, ii) simulation of two different types of noise, namely white and colored noise, having differen…
Mutations in the GLA Gene and LysoGb3: Is It Really Anderson-Fabry Disease?
2018
Anderson-Fabry disease (FD) is a rare, progressive, multisystem storage disorder caused by the partial or total deficit of the lysosomal enzyme α…
Newly Digitized Database Reveals the Lives and Families of Forced Migrants from Finnish Karelia
2017
Studies on displaced persons often suffer from a lack of data on the long-term effects of forced migration. A register created during the 1960s and published as a book series ‘Siirtokarjalaisten tie’ in 1970 documented the lives of individuals who fled the southern Karelian district of Finland after its first and second occupation by the Soviet Union in 1940 and 1944. To realize the potential value of these data for scientific research, we have recently scanned the register using optical character recognition (OCR) software, and developed proprietary computer code to extract these data. Here we outline the steps involved in the digitization process, and present an overview of the Migration Kare…
Variable Ranking Feature Selection for the Identification of Nucleosome Related Sequences
2018
Several recent works have shown that the k-mer representation of a DNA sequence can be used for classification or identification of nucleosome positioning related sequences. This representation can be computationally expensive as k grows, because the dimension of the feature space increases exponentially with k. This issue significantly affects the classification task computed by a general machine learning algorithm used for the purpose of sequence classification. In this paper, we investigate the advantage offered by the so-called Variable Ranking Feature Selection method to select the most informative k-mers associated with a set of DNA sequences, for the final purpose of nucleosome/linker classifi…
Clustering of low-correlated spatial gene expression patterns in the mouse brain in the Allen Brain Atlas
2018
In this paper, clustering techniques are applied to spatial gene expression patterns with a low genomic correlation between the sagittal and coronal projections. The data analysed here are hosted in the publicly available Allen Brain Atlas (ABA) database. The results are compared to those obtained by Bohland et al. on the complementary dataset (high correlation values). We show that, by analysing a reduced dataset, hence reducing the computational burden, we obtain the same accuracy in highlighting different neuroanatomical regions.
Discovering discriminative graph patterns from gene expression data
2016
We consider the problem of mining gene expression data in order to single out interesting features characterizing healthy/unhealthy samples of an input dataset. We present an approach based on a network model of the input gene expression data, where there is a labelled graph for each sample. To the best of our knowledge, this is the first attempt to build a different graph for each sample and, then, to have a database of graphs for representing a sample set. Our main goal is that of singling out interesting differences between healthy and unhealthy samples, through the extraction of "discriminative patterns" among graphs belonging to the two different sample sets. Differently from the other…
The intrinsic combinatorial organization and information theoretic content of a sequence are correlated to the DNA encoded nucleosome organization of…
2015
Motivation: Thanks to research spanning nearly 30 years, two major models have emerged that account for nucleosome organization in chromatin: statistical and sequence specific. The first is based on elegant, easy to compute, closed-form mathematical formulas that make no assumptions about the physical and chemical properties of the underlying DNA sequence. Moreover, they need no training on the data for their computation. The latter is based on some sequence regularities but, as opposed to the statistical model, it lacks the same type of closed-form formulas that, in this case, should be based on the DNA sequence only. Results: We contribute to closing this important methodological gap …