
RESEARCH PRODUCT

Distinctive attributes for predicted secondary structures at terminal sequences of non-classically secreted proteins from proteobacteria

Peteris Zikmanis, Inara Kampenusa

subject

terminal sequences; Multiple discriminant analysis; General Immunology and Microbiology; biology; QH301-705.5; General Neuroscience; secondary structure; Computational biology; Linear discriminant analysis; biology.organism_classification; Bioinformatics; discriminant analysis; General Biochemistry, Genetics and Molecular Biology; Cross-validation; Secretory protein; Discriminant; protein secretion; Secretion; Proteobacteria; Biology (General); General Agricultural and Biological Sciences; Protein secondary structure; proteobacteria

description

Abstract: C- and N-terminal sequences (64 amino acid residues each) of 89 non-classically secreted type I, type III and type IV proteins (Swiss-Prot/TrEMBL) from proteobacteria were transformed into predicted secondary structures. Multivariate analysis of variance (MANOVA) confirmed that location (C- or N-terminus) and secretion type are significant factors with respect to the quantitative representation of structured (α-helices, β-strands) and unstructured (coils) elements. The secondary structure profiles were encoded using unequal property values for helices, strands and coils, and the corresponding numerical vectors (independent variables) were subjected to multiple discriminant analysis with the type of secreted protein as the dependent variable. The set of strong predictor variables (21 property values located in the region of residues 2–49 from the C-termini) was able to classify all three types of non-classically secreted proteins with an accuracy of 93.3% for the original grouped cases and 89.9% for the cross-validated (leave-one-out procedure) grouped cases. The average error rate (0.137 ± 0.015) of k-fold (k = 3, 4, 6, 8, 10, 89) cross-validation confirmed an acceptable prediction accuracy of the defined discriminant functions with regard to the types of non-classically secreted proteins. The proposed prediction tool could be used to identify secretome proteins from genomic sequences as well as to assess the compatibility between secretion pathways and secretion substrates of proteobacteria.
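The encoding and cross-validation steps described in the abstract can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration: the property values in `PROPERTY_VALUES`, the toy profiles in `TOY`, and the function names are hypothetical, and a simple nearest-centroid classifier stands in for the multiple discriminant analysis, which this sketch does not implement. Only the overall shape (encode H/E/C profiles as numbers, then estimate the error rate by k-fold or leave-one-out cross-validation) follows the text.

```python
from statistics import mean

# Hypothetical "unequal property values"; the abstract does not give the real ones.
PROPERTY_VALUES = {"H": 1.0, "E": 2.0, "C": 3.0}  # helix, strand, coil


def encode(profile, values=PROPERTY_VALUES):
    """Turn a predicted secondary-structure string into a numeric vector."""
    return [values[state] for state in profile]


def fit_centroids(X, y):
    """Per-class mean vectors (a stand-in for fitted discriminant functions)."""
    centroids = {}
    for label in set(y):
        rows = [x for x, lab in zip(X, y) if lab == label]
        centroids[label] = [mean(col) for col in zip(*rows)]
    return centroids


def predict(centroids, x):
    """Assign x to the class with the nearest centroid (squared Euclidean)."""
    def dist2(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(sorted(centroids), key=lambda lab: dist2(centroids[lab]))


def kfold_error(X, y, k):
    """Misclassified fraction over k interleaved folds; k == len(X) is leave-one-out."""
    n = len(X)
    errors = 0
    for fold in range(k):
        test_idx = set(range(fold, n, k))
        train_X = [X[i] for i in range(n) if i not in test_idx]
        train_y = [y[i] for i in range(n) if i not in test_idx]
        model = fit_centroids(train_X, train_y)
        errors += sum(predict(model, X[i]) != y[i] for i in test_idx)
    return errors / n


# Invented toy profiles (8 residues each), labelled by secretion type.
TOY = [
    ("HHHHHHCC", "I"), ("HHHHHCCC", "I"), ("HHHHHHHC", "I"),
    ("EEEECCCC", "III"), ("EEECCCCC", "III"), ("EEEEECCC", "III"),
    ("CCCCHHHH", "IV"), ("CCCHHHHH", "IV"), ("CCCCCHHH", "IV"),
]
X = [encode(profile) for profile, _ in TOY]
y = [label for _, label in TOY]

loo_error = kfold_error(X, y, k=len(X))  # leave-one-out
avg_error = mean(kfold_error(X, y, k) for k in (3, len(X)))
```

Averaging `kfold_error` over several values of k mirrors how the abstract reports a mean error rate across k = 3, 4, 6, 8, 10 and 89; with real data the 64-residue terminal profiles and a proper discriminant analysis would replace the toy inputs and the centroid classifier.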

10.2478/s11535-008-0026-5
https://doaj.org/article/133a146691ac42149d1fae66814b636c