Search results for "Annotation"
showing 10 items of 161 documents
In Search of Pathogens: Transcriptome-Based Identification of Viral Sequences from the Pine Processionary Moth (Thaumetopoea pityocampa)
2015
Thaumetopoea pityocampa (pine processionary moth) is one of the most important pine pests in the forests of Mediterranean countries, Central Europe, the Middle East and North Africa. Apart from causing significant damage to pinewoods, T. pityocampa occurrence is also an issue for public and animal health, as it is responsible for dermatological reactions in humans and animals by contact with its irritating hairs. High throughput sequencing technologies have allowed the fast and cost-effective generation of genetic information of interest to understand different biological aspects of non-model organisms as well as the identification of potential pathogens. Using these technologies, we have o…
The design and implementation of Neuma, a collaborative Digital Scores Library - Requirements, architecture, and models
2012
This paper presents the design and implementation of the Neuma platform, a digital library devoted to the preservation and dissemination of symbolic music content (scores). Neuma is open to musicologists, musicians, and music publishers. It consists of a repository dedicated to the storage of large collections of digital scores, where users/applications can upload their documents. It also proposes services to publish, annotate, query, transform, and analyze scores. The long-term goal of the project is to enable an open and collaborative space where musician communities will be able to share music in symbolic notation. The project is organized around the French IRPMF institute (BnF–CNRS) whi…
Data from: Chironomus riparius (Diptera) genome sequencing reveals the impact of minisatellite transposable elements on population divergence
2017
Active transposable elements (TEs) may result in divergent genomic insertion and abundance patterns among conspecific populations. Upon secondary contact, such divergent genetic backgrounds can theoretically give rise to classical Dobzhansky-Muller incompatibilities (DMI), thus contributing to the evolution of endogenous genetic barriers and eventually causing population divergence. We investigated differential TE abundance among conspecific populations of the non-biting midge Chironomus riparius and evaluated their potential role in causing endogenous genetic incompatibilities between these populations. We focussed on a Chironomus-specific TE, the minisatellite-like Cla-element, whose activi…
CandidaDB: a genome database for Candida albicans pathogenomics.
2004
CandidaDB is accessible at http://genolist.pasteur.fr/CandidaDB. CandidaDB is a database dedicated to the genome of the most prevalent systemic fungal pathogen of humans, Candida albicans. CandidaDB is based on an annotation of the Stanford Genome Technology Center C.albicans genome sequence data by the European Galar Fungail Consortium. CandidaDB Release 2.0 (June 2004) contains information pertaining to Assembly 19 of the genome of C.albicans strain SC5314. The current release contains 6244 annotated entries corresponding to 130 tRNA genes and 5917 protein-coding genes. For these, it provides tentative functional assignments along with numerous pre-run analyses th…
Bioinformatic flowchart and database to investigate the origins and diversity of Clan AA peptidases
2009
Background: Clan AA of aspartic peptidases relates the family of pepsin monomers evolutionarily with all dimeric peptidases encoded by eukaryotic LTR retroelements. Recent findings describing various pools of single-domain nonviral host peptidases, in prokaryotes and eukaryotes, indicate that the diversity of clan AA is larger than previously thought. The ensuing approach to investigate this enzyme group is by studying its phylogeny. However, clan AA is a difficult case to study due to the low similarity and different rates of evolution. This work is an ongoing attempt to investigate the different clan AA families to understand the cause of their diversity. Results: In this paper, we…
Toward completion of the Earth’s proteome: an update a decade later
2017
Protein databases are steadily growing, driven by the spread of new, more efficient sequencing techniques. This growth is dominated by an increase in redundancy (homologous proteins with various degrees of sequence similarity) and by the inability to process and curate sequence entries as fast as they are created. To understand these trends and aid bioinformatic resources that might be compromised by the increasing size of the protein sequence databases, we have created a less-redundant protein data set. In parallel, we analyzed the evolution of protein sequence databases in terms of size and redundancy. While the SwissProt database has decelerated its growth mostly because of a focus on i…
Using Deep Learning to Extrapolate Protein Expression Measurements
2020
Mass spectrometry (MS)-based quantitative proteomics experiments typically assay a subset of up to 60% of the ≈20 000 human protein coding genes. Computational methods for imputing the missing values using RNA expression data usually allow only for imputations of proteins measured in at least some of the samples. In silico methods for comprehensively estimating abundances across all proteins are still missing. Here, a novel method is proposed using deep learning to extrapolate the observed protein expression values in label-free MS experiments to all proteins, leveraging gene functional annotations and RNA measurements as key predictive attributes. This method is tested on four datasets, in…
Surfing transcriptomic landscapes. A step beyond the annotation of chromosome 16 proteome
2013
All participating laboratories are members of ProteoRed-ISCIII.
General Statistical Framework for Quantitative Proteomics by Stable Isotope Labeling
2014
Pedro J. Navarro et al.
CiliaCarta: An integrated and validated compendium of ciliary genes
2019
The cilium is an essential organelle at the surface of mammalian cells whose dysfunction causes a wide range of genetic diseases collectively called ciliopathies. The current rate at which new ciliopathy genes are identified suggests that many ciliary components remain undiscovered. We generated and rigorously analyzed genomic, proteomic, transcriptomic and evolutionary data and systematically integrated these using Bayesian statistics into a predictive score for ciliary function. This resulted in 285 candidate ciliary genes. We generated independent experimental evidence of ciliary associations for 24 out of 36 analyzed candidate proteins using multiple cell and animal model systems (mouse…