RESEARCH PRODUCT

subject

0301 basic medicine; chemistry.chemical_classification; 030102 biochemistry & molecular biology; biology; Protein family; Chemistry; Protein domain; Sequence Feature; A domain; General Medicine; Computational biology; biology.organism_classification; General Biochemistry Genetics and Molecular Biology; Amino acid; 03 medical and health sciences; 030104 developmental biology; Amino acid composition; Proteome; Archaea

description

Amino acid composition is a sequence feature that has been used extensively to characterize the proteomes of many species and protein families. Yet the amino acid composition of protein domains, and of the linkers connecting them, has received less attention. Here, we perform both a comprehensive full-proteome amino acid composition analysis and a similar analysis focused on domains and linkers, to uncover domain- or linker-specific differences in amino acid usage. The amino acid composition of the 38 proteomes studied shows greater variability among archaeal and bacterial species than among eukaryotes. When focusing on domains and linkers, we describe the preferential use of polar residues in linkers and of hydrophobic residues in domains. To let any user perform this analysis on a given domain (or set of domains), we developed a dedicated R script called RACCOON, which is easy to use and can provide insights into the compositional differences between a domain and its surrounding linkers.
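The domain-versus-linker comparison described above can be illustrated with a minimal Python sketch. This is not RACCOON (which is an R script whose interface is not described here); the function names, the 1-based inclusive domain coordinates, and the simplification that every residue outside an annotated domain counts as linker (including N- and C-terminal tails) are all assumptions made for illustration.

```python
from collections import Counter

# The 20 standard amino acids (one-letter codes); other symbols are ignored.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition(seq):
    """Return the fraction of each standard amino acid in seq."""
    counts = Counter(ch for ch in seq.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values())
    return {aa: (counts[aa] / total if total else 0.0) for aa in AMINO_ACIDS}


def split_domains_linkers(seq, domains):
    """Split seq into concatenated domain residues and linker residues.

    domains: list of (start, end) pairs, 1-based and inclusive (assumed format).
    Assumption: any residue not covered by a domain is treated as linker.
    """
    in_domain = [False] * len(seq)
    for start, end in domains:
        for i in range(max(start, 1) - 1, min(end, len(seq))):
            in_domain[i] = True
    domain_seq = "".join(c for c, d in zip(seq, in_domain) if d)
    linker_seq = "".join(c for c, d in zip(seq, in_domain) if not d)
    return domain_seq, linker_seq


def differential_usage(seq, domains):
    """Per-residue difference: domain fraction minus linker fraction.

    Positive values mean the residue is more frequent in domains,
    negative values mean it is more frequent in linkers.
    """
    domain_seq, linker_seq = split_domains_linkers(seq, domains)
    dom, link = composition(domain_seq), composition(linker_seq)
    return {aa: dom[aa] - link[aa] for aa in AMINO_ACIDS}


if __name__ == "__main__":
    # Toy sequence: a hydrophobic stretch (positions 3-8) flanked by serines.
    diff = differential_usage("SSLLVVLLSS", [(3, 8)])
    for aa in sorted(diff, key=diff.get, reverse=True):
        if diff[aa] != 0.0:
            print(f"{aa}\t{diff[aa]:+.3f}")
```

In the toy sequence, leucine and valine come out enriched in the domain and serine in the linker, which mirrors the hydrophobic-versus-polar trend reported above. A real analysis would take its domain boundaries from an annotation resource and aggregate over many proteins.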