0000000001212298
AUTHOR
Gaetano T. Montelione
Quality assessment of protein NMR structures.
Biomolecular NMR structures are now routinely used in biology, chemistry, and bioinformatics. Methods and metrics for assessing the accuracy and precision of protein NMR structures are beginning to be standardized across the biological NMR community. These include both knowledge-based assessment metrics, parameterized from the database of protein structures, and model versus data assessment metrics. Online servers are available that provide comprehensive protein structure quality assessment reports, and efforts are in progress by the Worldwide Protein Data Bank (wwPDB) to develop a biomolecular NMR structure quality assessment pipeline as part of the structure deposition process. These qu…
High-resolution solution NMR structure of the Z domain of staphylococcal protein A
Staphylococcal protein A (SpA) is a cell-wall-bound pathogenicity factor from the bacterium Staphylococcus aureus. Because of their small size and immunoglobulin (IgG)-binding activities, domains of protein A are targets for protein engineering efforts and for the development of computational approaches for de novo protein folding. The NMR solution structure of an engineered IgG-binding domain of SpA, the Z domain (an analog of the B domain of SpA), has been determined by simulated annealing with restrained molecular dynamics on the basis of 671 conformational constraints. The Z domain contains three well-defined alpha-helices corresponding to polypeptide segments Lys7 to Leu17 (helix 1), G…
Simulated annealing with restrained molecular dynamics using a flexible restraint potential: Theory and evaluation with simulated NMR constraints
A new functional representation of NMR-derived distance constraints, the flexible restraint potential, has been implemented in the program CONGEN (Bruccoleri RE, Karplus M, 1987, Biopolymers 26:137-168) for molecular structure generation. In addition, flat-bottomed restraint potentials for representing dihedral angle and vicinal scalar coupling constraints have been introduced into CONGEN. An effective simulated annealing (SA) protocol that combines both weight annealing and temperature annealing is described. Calculations have been performed using ideal simulated NMR constraints, in order to evaluate the use of restrained molecular dynamics (MD) with these target functions as implemented i…
Homology modeling using simulated annealing of restrained molecular dynamics and conformational search calculations with CONGEN: application in predicting the three-dimensional structure of murine homeodomain Msx-1.
We have developed an automatic approach for homology modeling using restrained molecular dynamics and simulated annealing procedures, together with conformational search algorithms available in the molecular mechanics program CONGEN (Bruccoleri RE, Karplus M, 1987, Biopolymers 26:137-168). The accuracy of the method is validated by "predicting" structures of two homeodomain proteins with known three-dimensional structures, and then applied to predict the three-dimensional structure of the homeodomain of the murine Msx-1 transcription factor. Regions of the unknown protein structure that are highly homologous to the known template structure are constrained by "homology distance constraints,"…
Assessing model accuracy using the homology modeling automatically software
Homology modeling is a powerful technique that greatly increases the value of experimental structure determination by using the structural information of one protein to predict the structures of homologous proteins. We have previously described a method of homology modeling by satisfaction of spatial restraints (Li et al., Protein Sci 1997;6:956-970). The Homology Modeling Automatically (HOMA) web site, http://www-nmr.cabm.rutgers.edu/HOMA, is a new tool, using this method to predict the 3D structure of a target protein based on the sequence alignment of the target protein to a template protein and the structure coordinates of the template. The user is presented with the resulting models, togeth…
The mechanism of binding staphylococcal protein A to immunoglobin G does not involve helix unwinding.
Structural changes in staphylococcal protein A (SpA) upon its binding to the constant region (Fc) of immunoglobulin G (IgG) have been studied by nuclear magnetic resonance and circular dichroism (CD) spectroscopy. The NMR solution structure of the engineered IgG-binding domain of SpA, the Z domain (an analogue of the B domain of SpA), has been determined by simulated annealing with molecular dynamics, using 599 distance and dihedral angle constraints. Domain Z contains three alpha-helices in the polypeptide segments Lys7 to His18 (helix 1), Glu25 to Asp36 (helix 2), and Ser41 to Ala54 (helix 3). The overall chain fold is an antiparallel three-helical bundle. This is in contrast to the previ…
A topology-constrained distance network algorithm for protein structure determination from NOESY data.
This article formulates the multidimensional nuclear Overhauser effect spectroscopy (NOESY) interpretation problem using graph theory and presents a novel, bottom-up, topology-constrained distance network analysis algorithm for NOESY cross peak interpretation using assigned resonances. AutoStructure is a software suite that implements this topology-constrained distance network analysis algorithm and iteratively generates structures using the three-dimensional (3D) protein structure calculation programs XPLOR/CNS or DYANA. The minimum input for AutoStructure includes the amino acid sequence, a list of resonance assignments, and lists of 2D, 3D, and/or 4D-NOESY cross peaks. AutoStru…
Analysis of the structural quality of the CASD-NMR 2013 entries
We performed a comprehensive structure validation of both automated and manually generated structures of the 10 targets of the CASD-NMR-2013 effort. We established that automated structure determination protocols are capable of reliably producing structures of comparable accuracy and quality to those generated by a skilled researcher, at least for small, single domain proteins such as the ten targets tested. The most robust results appear to be obtained when NOESY peak lists are used either as the primary input data or to augment chemical shift data without the need to manually filter such lists. A detailed analysis of the long-range NOE restraints generated by the different programs from t…
Simulated annealing with restrained molecular dynamics using CONGEN: Energy refinement of the NMR solution structures of epidermal and type-α transforming growth factors
The new functionality of the program CONGEN (Bruccoleri RE, Karplus M, 1987, Biopolymers 26:137-168; Bassolino-Klimas D et al., 1996, Protein Sci 5:593-603) has been applied for energy refinement of two previously determined solution NMR structures, murine epidermal growth factor (mEGF) and human type-alpha transforming growth factor (hTGF alpha). A summary of considerations used in converting experimental NMR data into distance constraints for CONGEN is presented. A general protocol for simulated annealing with restrained molecular dynamics is applied to generate NMR solution structures using CONGEN together with real experimental NMR data. A total of 730 NMR-derived constraints for mEGF a…
Homology modeling of an RNP domain from a human RNA-binding protein: Homology-constrained energy optimization provides a criterion for distinguishing potential sequence alignments
We have recently described an automated approach for homology modeling using restrained molecular dynamics and simulated annealing procedures (Li et al., Protein Sci., 6:956-970, 1997). We have employed this approach for constructing a homology model of the putative RNA-binding domain of the human RNA-binding protein with multiple splice sites (RBP-MS). The regions of RBP-MS which are homologous to the template protein snRNP U1A were constrained by "homology distance constraints," while the conformations of the non-homologous regions were defined only by a potential energy function. A full energy function without explicit solvent was employed to ensure that the calculated structures have good …
Total Correlation Spectroscopy (TOCSY) of Proteins Using Coaddition of Spectra Recorded with Several Mixing Times
Protein NMR Structures Refined with Rosetta Have Higher Accuracy Relative to Corresponding X-ray Crystal Structures
We have found that refinement of protein NMR structures using Rosetta with experimental NMR restraints yields more accurate protein NMR structures than those that have been deposited in the PDB using standard refinement protocols. Using 40 pairs of NMR and X-ray crystal structures determined by the Northeast Structural Genomics Consortium, for proteins ranging in size from 5-22 kDa, restrained Rosetta refined structures fit better to the raw experimental data, are in better agreement with their X-ray counterparts, and have better phasing power compared to conventionally determined NMR structures. For 37 proteins for which NMR ensembles were available and which had similar structures in solu…
A novel RNA-binding motif in influenza A virus non-structural protein 1.
The solution NMR structure of the RNA-binding domain from influenza virus non-structural protein 1 exhibits a novel dimeric six-helical protein fold. Distributions of basic residues and conserved salt bridges of dimeric NS1(1-73) suggest that the face containing antiparallel helices 2 and 2′ forms a novel arginine-rich nucleic acid binding motif.
Combined use of 13C chemical shift and 1H alpha-13C alpha heteronuclear NOE data in monitoring a protein NMR structure refinement.
A large portion of the 13C resonance assignments for murine epidermal growth factor (mEGF) at pH 3.1 and 28 degrees C has been determined at natural isotope abundance. Sequence-specific 13C assignments are reported for 100% of the assignable C alpha, 96% of the C beta, 86% of the aromatic and 70% of the remaining peripheral aliphatic resonances of mEGF. A good correlation was observed between experimental and back-calculated C alpha chemical shifts for regions of regular beta-sheet structure. These assignments also provide the basis for interpreting 1H alpha-13C alpha heteronuclear NOE (HNOE) values in mEGF at natural isotope abundance. Some of the backbone polypeptide segments with high in…
Evaluating protein structures determined by structural genomics consortia.
Structural genomics projects are providing large quantities of new 3D structural data for proteins. To monitor the quality of these data, we have developed the protein structure validation software suite (PSVS), for assessment of protein structures generated by NMR or X-ray crystallographic methods. PSVS is broadly applicable for structure quality assessment in structural biology projects. The software integrates under a single interface analyses from several widely-used structure quality evaluation tools, including PROCHECK (Laskowski et al., J Appl Crystallog 1993;26:283-291), MolProbity (Lovell et al., Proteins 2003;50:437-450), Verify3D (Luthy et al., Nature 1992;356:83-85), ProsaII (Si…
High Resolution Solution NMR Structure of the Z Domain of Staphylococcal Protein A. Analysis of Secondary Structure for Free Z Domain and Bounded to IgG Antibody
Staphylococcal protein A (SpA) is a cell-wall-bound pathogenicity factor from the bacterium Staphylococcus aureus. It exhibits tight binding to many IgG, IgA and IgM molecules at site(s) different from the antigen-combining site. Because of their small size and immunoglobulin (IgG)-binding activities, domains of protein A are important targets for protein engineering efforts and for the development of computational approaches for de novo protein folding.
The second round of Critical Assessment of Automated Structure Determination of Proteins by NMR: CASD-NMR-2013
The second round of the community-wide initiative Critical Assessment of automated Structure Determination of Proteins by NMR (CASD-NMR-2013) comprised ten blind target datasets, consisting of unprocessed spectral data, assigned chemical shift lists and unassigned NOESY peak and RDC lists, that were made available in both curated (i.e. manually refined) and un-curated (i.e. automatically generated) form. Ten structure calculation programs, using fully automated protocols only, generated a total of 164 three-dimensional structures (entries) for the ten targets, sometimes using both curated and un-curated lists to generate multiple entries for a single target. The accuracy of the entries could…
A community resource of experimental data for NMR / X-ray crystal structure pairs
We have developed an online NMR / X-ray Structure Pair Data Repository. The NIGMS Protein Structure Initiative (PSI) has provided many valuable reagents, 3D structures, and technologies for structural biology. The Northeast Structural Genomics Consortium was one of several PSI centers. NESG used both X-ray crystallography and NMR spectroscopy for protein structure determination. A key goal of the PSI was to provide experimental structures for at least one representative of each of hundreds of targeted protein domain families. In some cases, structures for identical (or nearly identical) constructs were determined by both NMR and X-ray crystallography. NMR spectroscopy and X-ray diffraction …
Protein structure prediction assisted with sparse NMR data in CASP13
CASP13 has investigated the impact of sparse NMR data on the accuracy of protein structure prediction. NOESY and 15N-1H residual dipolar coupling data, typical of that obtained for 15N,13C-enriched, perdeuterated proteins up to about 40 kDa, were simulated for 11 CASP13 targets ranging in size from 80 to 326 residues. For several targets, two prediction groups generated models that are more accurate than those produced using baseline methods. Real NMR data collected for a de novo designed protein were also provided to predictors, including one data set in which only backbone resonance assignments were available. Some NMR-assisted prediction groups also did very well with these data. CAS…
NMR Exchange Format: a unified and open standard for representation of NMR restraint data