Search results for "UniProt"
showing 6 items of 6 documents
Proteins as Functional Units of Biocalcification – An Overview
2016
High-throughput approaches such as genomics, transcriptomics and proteomics have led to the discovery of a larger set of biomineralization genes than previously foreseen. These gene lists are often difficult to decode in light of the current models of calcification. Here we give an overview of the proteins available in UniProt (Universal Protein Resource) that were identified directly in metazoan calcium carbonate mineralized structures or are known to have direct key functions in calcification processes. Functional annotation of the protein datasets using Gene Ontology reveals that functions like carbohydrate binding, structural and catalytic activities (e.g. hydrolase) are commonly represented across t…
CRISPR sequences are sometimes erroneously translated and can contaminate public databases with spurious proteins containing spaced repeats
2020
Toward completion of the Earth’s proteome: an update a decade later
2017
Protein databases are steadily growing, driven by the spread of new, more efficient sequencing techniques. This growth is dominated by an increase in redundancy (homologous proteins with various degrees of sequence similarity) and by the inability to process and curate sequence entries as fast as they are created. To understand these trends and aid bioinformatic resources that might be compromised by the increasing size of the protein sequence databases, we have created a less-redundant protein data set. In parallel, we analyzed the evolution of protein sequence databases in terms of size and redundancy. While the SwissProt database has decelerated its growth mostly because of a focus on i…
Using Deep Learning to Extrapolate Protein Expression Measurements
2020
Mass spectrometry (MS)-based quantitative proteomics experiments typically assay a subset of up to 60% of the ≈20 000 human protein-coding genes. Computational methods for imputing the missing values using RNA expression data usually allow imputation only for proteins measured in at least some of the samples. In silico methods for comprehensively estimating abundances across all proteins are still missing. Here, a novel method is proposed using deep learning to extrapolate the observed protein expression values in label-free MS experiments to all proteins, leveraging gene functional annotations and RNA measurements as key predictive attributes. This method is tested on four datasets, in…
Proteomic fingerprinting of apple fruit, juice, and cider via combinatorial peptide ligand libraries and MS analysis
2018
Combinatorial peptide ligand libraries coupled to MS were applied to extensively map the proteome of apple fruit and to detect its presence in commercial apple juice and cider, in order to evaluate their authenticity and genuineness. Using the Uniprot_Malus database, 96 proteins were detected in apples, among which 30 proteins were specifically captured via combinatorial peptide ligand libraries. Next, three proteins previously recognized in fruits were found in apple juice; these are involved in the cellular metabolism of fruit maturation and in allergenic reactions. On the other hand, only one Malus allergen was identified in cider beads eluate, demonstrating that the industrial processes did not pr…
REP2: A Web Server to Detect Common Tandem Repeats in Protein Sequences
2020
Ensembles of tandem repeats (TRs) in protein sequences expand rapidly to form domains well suited for interactions with proteins. For this reason, they are relatively frequent. Some TRs have known structures and therefore it is advantageous to predict their presence in a protein sequence. However, since most TRs diverge quickly, their detection by classical sequence comparison algorithms is not very accurate. Previously, we developed a method and a web server that used curated profiles and thresholds for the detection of 11 common TRs. Here we present a new web server (REP2) that allows the analysis of TRs in both individual and aligned sequences. We provide currently precomputed analyses f…