RESEARCH PRODUCT
Bioinformatic flowchart and database to investigate the origins and diversity of Clan AA peptidases
Andrés Moya, Carlos Llorens, Ricardo Futami, Gabriel Renaud
subject
Protein family; Sequence analysis; Immunology; Protein domain; Molecular Sequence Data; Biology; computer.software_genre; General Biochemistry Genetics and Molecular Biology; Protein Structure Secondary; Phylogenetics; Sequence Analysis Protein; Software Design; Consensus Sequence; Consensus sequence; Aspartic Acid Endopeptidases; Clan; Amino Acid Sequence; Databases Protein; Peptide sequence; lcsh:QH301-705.5; Ecology Evolution Behavior and Systematics; Phylogeny; Database; Agricultural and Biological Sciences(all); Biochemistry Genetics and Molecular Biology(all); Applied Mathematics; Research; Computational Biology; Genetic Variation; Gene Annotation; Templates Genetic; Markov Chains; Protein Structure Tertiary; lcsh:Biology (General); Modeling and Simulation; General Agricultural and Biological Sciences; computer
description
Abstract

Background: Clan AA of aspartic peptidases relates the family of pepsin monomers evolutionarily to all dimeric peptidases encoded by eukaryotic LTR retroelements. Recent findings describing various pools of single-domain nonviral host peptidases, in both prokaryotes and eukaryotes, indicate that the diversity of clan AA is larger than previously thought. The natural way to investigate this enzyme group is to study its phylogeny. However, clan AA is a difficult case because its members show low sequence similarity and different rates of evolution. This work is an ongoing attempt to investigate the different clan AA families and to understand the cause of their diversity.

Results: In this paper, we describe an in-progress database and bioinformatic flowchart designed to characterize the clan AA protein domain across all possible protein families through ancestral reconstructions, sequence logos, and hidden Markov models (HMMs). The flowchart includes the characterization of a major consensus sequence based on six amino acid patterns that correspond to Andreeva's model, the structural template describing the clan AA peptidase fold. The set of tools is a work in progress that we have organized in a database within the GyDB project, referred to as the Clan AA Reference Database: http://gydb.uv.es/gydb/phylogeny.php?tree=caard

Conclusion: The pre-existing classification, combined with the evolutionary history of LTR retroelements, permits a consistent taxonomical collection of sequence logos and HMMs. This set is useful for gene annotation and also serves as a reference for evaluating the diversity of, and the relationships among, the different families. Comparisons among HMMs suggest a common ancestor for all dimeric clan AA peptidases that is halfway between single-domain nonviral peptidases and those encoded by Ty3/Gypsy LTR retroelements. Sequence logos reveal that all clan AA families follow a similar protein domain architecture related to the peptidase fold. In particular, each family nucleates a particular consensus motif at the sequence position related to the flap. The different motifs constitute a network in which an alanine-asparagine-like variable motif predominates, instead of the canonical flap of the HIV-1 peptidase and its closer relatives.

Reviewers: This article was reviewed by Daniel H. Haft, Vladimir Kapitonov (nominated by Jerry Jurka), and Ben M. Dunn (nominated by Claus Wilke).
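As a rough illustration of the consensus sequences and sequence logos mentioned in the abstract, the sketch below computes a per-column consensus and the information content (in bits) that determines letter-stack heights in a logo. It is not the authors' pipeline: the toy alignment, the function names, and the choice to ignore gaps and omit small-sample correction are all assumptions made for this example.

```python
import math
from collections import Counter

# Hypothetical toy alignment (NOT taken from the Clan AA Reference Database).
# '-' would mark a gap; all sequences must have equal length.
ALIGNMENT = [
    "DTGAD",
    "DTGAS",
    "DSGAD",
    "DTGSD",
]


def column_counts(alignment):
    """Count residues in each alignment column, ignoring gap characters."""
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    return [
        Counter(seq[i] for seq in alignment if seq[i] != "-")
        for i in range(length)
    ]


def consensus(alignment):
    """Return the most frequent residue per column ('-' for all-gap columns)."""
    return "".join(
        counts.most_common(1)[0][0] if counts else "-"
        for counts in column_counts(alignment)
    )


def information_content(alignment, alphabet_size=20):
    """Per-column information in bits: log2(alphabet size) minus Shannon entropy.

    Assumption: no small-sample correction is applied, unlike some logo tools.
    """
    bits = []
    for counts in column_counts(alignment):
        total = sum(counts.values())
        if total == 0:
            bits.append(0.0)
            continue
        entropy = -sum(
            (n / total) * math.log2(n / total) for n in counts.values()
        )
        bits.append(math.log2(alphabet_size) - entropy)
    return bits


if __name__ == "__main__":
    print(consensus(ALIGNMENT))
    print([round(b, 2) for b in information_content(ALIGNMENT)])
```

Fully conserved columns reach the maximum of log2(20) bits for a 20-letter amino acid alphabet, while variable columns score lower, which is why conserved motifs stand out visually in a logo.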
year | journal | country | edition | language |
---|---|---|---|---|
2009-01-01 | Biology Direct | | | |