Search results for "DATA MINING"
showing 10 items of 907 documents
Gene-based and semantic structure of the Gene Ontology as a complex network
2012
The last decade has seen the advent and consolidation of ontology-based tools for the identification and biological interpretation of classes of genes, such as the Gene Ontology. The information accumulated over time and included in the GO is encoded in the definition of terms and in the setting up of semantic relations amongst terms. This approach might be usefully complemented by a bottom-up approach based on the knowledge of relationships amongst genes. To this end, we investigate the Gene Ontology from a complex network perspective. We consider the semantic network of terms naturally associated with the semantic relationships provided by the Gene Ontology consortium and a gene-based …
Model selection for factorial Gaussian graphical models with an application to dynamic regulatory networks.
2016
Abstract. Factorial Gaussian graphical models (fGGMs) have recently been proposed for inferring dynamic gene regulatory networks from genomic high-throughput data. In the search for true regulatory relationships amongst the vast space of possible networks, these models allow the imposition of certain restrictions on the dynamic nature of these relationships, such as Markov dependencies of low order (some entries of the precision matrix are a priori zeros) or equal dependency strengths across time lags (some entries of the precision matrix are assumed to be equal). The precision matrix is then estimated by ℓ1-penalized maximum likelihood, imposing a further constraint on the absolute value…
Reverse screening on indicaxanthin from Opuntia ficus-indica as natural chemoactive and chemopreventive agent
2018
Indicaxanthin is a bioactive and bioavailable betalain pigment extracted from Opuntia ficus-indica fruits. Indicaxanthin has pharmacokinetic properties rarely found in other phytochemicals, and it has been demonstrated to provide a broad spectrum of pharmaceutical activity, exerting anti-proliferative, anti-inflammatory, and neuromodulator effects. The discovery of the physiological targets of indicaxanthin plays an important role in understanding the biochemical mechanism. In this study, combined reverse pharmacophore mapping, reverse docking, and text-based database search identified Inositol Trisphosphate 3-Kinase (ITP3K-A), Glutamate carboxypeptidase II (GCPII), Leukotriene-A4 hydr…
Reference genome assessment from a population scale perspective: an accurate profile of variability and noise.
2017
Abstract. Motivation: Current plant and animal genomic studies are often based on newly assembled genomes that have not been properly consolidated. In this scenario, misassembled regions can easily lead to false-positive findings. Although quality control scores are included within genotyping protocols, they are usually employed to evaluate individual sample quality rather than reference sequence reliability. We propose a statistical model that combines quality control scores across samples in order to detect incongruent patterns at every genomic region. Our model is inherently robust since common artifact signals are expected to be shared between independent samples over misassembled regions …
dAPE: a web server to detect homorepeats and follow their evolution.
2016
Abstract. Summary: Homorepeats are low complexity regions consisting of repetitions of a single amino acid residue. There is no current consensus on the minimum number of residues needed to define a functional homorepeat, nor even on whether mismatches are allowed. Here we present dAPE, a web server that helps follow the evolution of homorepeats based on orthology information, using a sensitive but tunable cutoff to help in the identification of emerging homorepeats. Availability and Implementation: dAPE can be accessed from http://cbdm-01.zdv.uni-mainz.de/~munoz/polyx. Supplementary information: Supplementary data are available at Bioinformatics online.
MSAProbs-MPI: parallel multiple sequence aligner for distributed-memory systems
2016
This is a pre-copyedited, author-produced version of an article accepted for publication in Bioinformatics following peer review. The version of record Jorge González-Domínguez, Yongchao Liu, Juan Touriño, Bertil Schmidt; MSAProbs-MPI: parallel multiple sequence aligner for distributed-memory systems, Bioinformatics, Volume 32, Issue 24, 15 December 2016, Pages 3826–3828, https://doi.org/10.1093/bioinformatics/btw558 is available online at: https://doi.org/10.1093/bioinformatics/btw558 [Abstracts] MSAProbs is a state-of-the-art protein multiple sequence alignment tool based on hidden Markov models. It can achieve high alignment accuracy at the expense of relatively long runtimes for large-sca…
How to deal with Haplotype data: An Extension to the Conceptual Schema of the Human Genome
2016
[EN] The goal of this work is to describe the advantages of applying Conceptual Modeling (CM) in complex domains, such as genomics. Nowadays, the study and comprehension of the human genome is a major challenge due to its high level of complexity. The constant evolution of the genomic domain contributes to the generation of ever larger amounts of new data, which means that if we do not manage it correctly, data quality could be compromised (i.e., problems related to heterogeneity and inconsistent data). In this paper, we propose the use of a Conceptual Schema of the Human Genome (CSHG), designed to understand and improve our ontological commitment to the domain and also extend (e…
Methods for RNA Modification Mapping Using Deep Sequencing: Established and New Emerging Technologies
2019
New analytics of post-transcriptional RNA modifications have paved the way for a tremendous upswing of the biological and biomedical research in this field. This especially applies to methods that include RNA-Seq techniques, and which typically result in what is termed global scale modification mapping. In this process, positions inside a cell's transcriptome receive a status of potential modification sites (so-called modification calling), typically based on a score of some kind that issues from the particular method applied. The resulting data are thought to represent information that goes beyond what is contained in typical transcriptome data, and hence the field has taken to use …
Text mining and expert curation to develop a database on psychiatric diseases and their genes
2017
Psychiatric disorders constitute one of the main causes of disability worldwide. In recent years, considerable research has been conducted on the genetic architecture of such diseases, although little understanding of their etiology has been achieved. The difficulty of accessing up-to-date, relevant genotype-phenotype information has hampered the application of this wealth of knowledge to translational research and clinical practice in order to improve diagnosis and treatment of psychiatric patients. PsyGeNET (http://www.psygenet.org/) has been developed with the aim of supporting research on the genetic architecture of psychiatric diseases, by providing integrated and structured accessi…
Editorial: Protein Interaction Networks in Health and Disease
2016
The identification and annotation of protein-protein interactions (PPIs) is of great importance in systems biology. Big data produced from experimental or computational approaches allow not only the construction of large protein interaction maps but also expand our knowledge on how proteins build up molecular complexes to perform sophisticated tasks inside a cell. However, if we want to accurately understand the functionality of these complexes, we need to go beyond the simple identification of PPIs. We need to know when and where an interaction happens in the cell and also understand the flow of information through a protein interaction network. Another perspective of the research on PPI n…