Search results for "classification"
showing 10 items of 1043 documents
Recurrent Deep Neural Networks for Nucleosome Classification
2020
Nucleosomes are the fundamental repeating unit of chromatin. A nucleosome is a complex of eight histone proteins to which approximately 147–150 base pairs of DNA bind. Several biological studies have clearly stated that the regulation of cell type-specific gene activities is influenced by nucleosome positioning. Bioinformatic studies have refined those results, showing evidence of sequence specificity in the nucleosomes’ DNA fragments. In this work, we present a recurrent neural network that uses a feature representation of nucleosome sequences for their classification. In particular, we implement an architecture which stacks convolutional and long short-term memory layers, with the main purpose to avoid t…
LEGO-based generalized set of two linear algebraic 3D bio-macro-molecular descriptors: Theory and validation by QSARs
2019
Novel 3D protein descriptors based on bilinear, quadratic and linear algebraic maps in R^n are proposed. The latter employs the kth 2-tuple (dis)similarity matrix to codify information related to covalent and non-covalent interactions in these biopolymers. The calculation of the inter-amino acid distances is generalized by using several dissimilarity coefficients, where normalization procedures based on the simple stochastic and mutual probability schemes are applied. A new local-fragment approach based on amino acid-types and amino acid-groups is proposed to characterize regions of interest in proteins. Topological and geometric macromolecular cutoffs are defined using local and…
Deep learning models for bacteria taxonomic classification of metagenomic data
2018
Background: An open challenge in translational bioinformatics is the analysis of sequenced metagenomes from various environmental samples. Several studies have demonstrated that the 16S ribosomal RNA can be considered a barcode for bacteria classification at the genus level, but it remains hard to identify the correct composition of metagenomic data from RNA-seq short-read data. 16S short-read data are generated using two next-generation sequencing technologies, i.e. whole genome shotgun (WGS) and amplicon (AMP); typically, the former is filtered to obtain short reads belonging to a 16S shotgun (SG), whereas the latter takes into account only some specific 16S hypervariable regions.…
Evaluation of DNA Methylation Episignatures for Diagnosis and Phenotype Correlations in 42 Mendelian Neurodevelopmental Disorders
2020
Genetic syndromes frequently present with overlapping clinical features and inconclusive or ambiguous genetic findings which can confound accurate diagnosis and clinical management. An expanding number of genetic syndromes have been shown to have unique genomic DNA methylation patterns (called "episignatures"). Peripheral blood episignatures can be used for diagnostic testing as well as for the interpretation of ambiguous genetic test results. We present here an approach to episignature mapping in 42 genetic syndromes, which has allowed the identification of 34 robust disease-specific episignatures. We examine emerging pa…
ICTV Virus Taxonomy Profile: Finnlakeviridae
2020
Finnlakeviridae is a family of icosahedral, internal membrane-containing bacterial viruses with circular, single-stranded DNA genomes. The family includes the genus, Finnlakevirus, with the species, Flavobacterium virus FLiP. Flavobacterium phage FLiP was isolated with its Gram-negative host bacterium from a boreal freshwater habitat in Central Finland in 2010. It is the first described single-stranded DNA virus with an internal membrane and shares minimal sequence similarity with other known viruses. The virion organization (pseudo T=21 dextro) and major capsid protein fold (double-β-barrel) resemble those of Pseudoalteromonas phage PM2 (family Corticoviridae), which has a double-stranded…
Taxonomic Classification for Living Organisms Using Convolutional Neural Networks
2017
Taxonomic classification has a wide range of applications, such as finding out more about evolutionary history. Compared to the estimated number of organisms that nature harbors, humanity does not have a thorough comprehension of which specific classes they belong to. The classification of living organisms can be done with many machine learning techniques. In this study, it is performed using convolutional neural networks. Moreover, a DNA encoding technique is incorporated in the algorithm to increase performance and avoid misclassifications. The proposed algorithm outperformed the state-of-the-art algorithms in terms of accuracy and sensitivity, which illustrates a high potential f…
Machine learning–XGBoost analysis of language networks to classify patients with epilepsy
2017
Our goal was to apply a statistical approach to allow the identification of atypical language patterns and to differentiate patients with epilepsy from healthy subjects, based on their cerebral activity, as assessed by functional MRI (fMRI). Patients with focal epilepsy show reorganization or plasticity of brain networks involved in cognitive functions, inducing ‘atypical’ (compared to ‘typical’ in healthy people) brain profiles. Moreover, some of these patients suffer from drug-resistant epilepsy, and they undergo surgery to stop seizures. The neurosurgeon should only remove the zone generating seizures and must preserve cognitive functions to avoid deficits. To preserve functions, one sho…
Bacteria classification using minimal absent words
2017
Bacteria classification has been deeply investigated with different tools for many purposes, such as early diagnosis, metagenomics, and phylogenetics. Classification methods based on ribosomal DNA sequences are considered a reference in this area. We present a new classifier for bacteria species based on a dissimilarity measure of purely combinatorial nature. This measure is based on the notion of Minimal Absent Words, a combinatorial definition that recently found applications in bioinformatics. We can therefore incorporate this measure into a probabilistic neural network in order to classify bacteria species. Our approach is motivated by the fact that there is a vast literature on the com…
ICTV Virus Taxonomy Profile: Solinviviridae
2019
Solinviviridae is a family of picorna/calici-like viruses with non-segmented, linear, positive-sense RNA genomes of approximately 10-11 kb. Unusually, their capsid proteins are encoded towards the 3'-end of the genome where they can be expressed both from a subgenomic RNA and as an extension of the replication (picorna-like helicase-protease-polymerase) polyprotein. Members of two species within the family infect ants, but related unclassified virus sequences derive from a large variety of insects and other arthropods. This is a summary of the International Committee on Taxonomy of Viruses (ICTV) Report on the Solinviviridae, which is available at www.ictv.global/report/solinviviridae.
Low-cost scalable discretization, prediction and feature selection for complex systems
2019
The introduced data-driven tool allows simultaneous feature selection, model inference, and marked cost and quality gains.