The potential applications are wide-ranging and include protein design, all-atom model reconstruction, and crystallographic refinement. We have a prototype of Probik (Protein Backbone motion by Inverse Kinematics), a tool for flexible and intuitive manipulation of protein backbone segments using inverse kinematics. We have also developed a Monte Carlo algorithm that uses Probik to manipulate longer backbone segments. The inverse kinematics solver is based on the research of M. Raghavan and B. Roth and on an implementation of their algorithm with improved numerical properties by D. Manocha and J. Canny. Because it is an exact solver, it allows us to examine all possible reconstructions for each potential change to a backbone bond angle, whereas numerical inverse kinematics solvers generally calculate a single solution. It also gives us more information from which to analyze the properties of a backbone's possible motions. We have used Probik to estimate the derivatives of backbone motion with respect to individual bond angles. We have also demonstrated, using principal component analysis, that the number of bond angles needed to manipulate short protein backbones can be reduced to four. These two results appeared at WAFR 2004 in July 2004. We have also investigated how Probik’s ability to manipulate longer backbone segments might be used to close loops or gaps in protein backbones; these results will be reported at the end of summer 2004.
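The idea of a derivative of backbone motion with respect to a single angle can be illustrated with a small sketch. It is not Probik's method: it assumes an open chain with no loop-closure constraint, treats the angle as a rotation about one bond axis, and uses a central finite difference. The function names and the toy chain are hypothetical.

```python
import math


def rotate_about_axis(point, origin, axis, angle):
    """Rotate `point` by `angle` radians about the line through `origin`
    along `axis`, using Rodrigues' rotation formula."""
    norm = math.sqrt(sum(a * a for a in axis))
    k = [a / norm for a in axis]
    v = [p - o for p, o in zip(point, origin)]
    cos_t, sin_t = math.cos(angle), math.sin(angle)
    k_cross_v = [
        k[1] * v[2] - k[2] * v[1],
        k[2] * v[0] - k[0] * v[2],
        k[0] * v[1] - k[1] * v[0],
    ]
    k_dot_v = sum(a * b for a, b in zip(k, v))
    return [
        origin[i] + v[i] * cos_t + k_cross_v[i] * sin_t
        + k[i] * k_dot_v * (1.0 - cos_t)
        for i in range(3)
    ]


def apply_torsion(chain, bond_index, angle):
    """Rotate every atom after the bond (bond_index, bond_index + 1)
    about that bond; atoms up to and including the bond stay fixed."""
    a, b = chain[bond_index], chain[bond_index + 1]
    axis = [b[i] - a[i] for i in range(3)]
    moved = [rotate_about_axis(p, b, axis, angle)
             for p in chain[bond_index + 2:]]
    return list(chain[:bond_index + 2]) + moved


def position_derivative(chain, bond_index, atom_index, step=1e-5):
    """Central finite-difference estimate of d(position)/d(angle)
    for one atom and one rotatable bond."""
    plus = apply_torsion(chain, bond_index, step)[atom_index]
    minus = apply_torsion(chain, bond_index, -step)[atom_index]
    return [(p - m) / (2.0 * step) for p, m in zip(plus, minus)]


# Toy four-atom chain (hypothetical coordinates, arbitrary units).
chain = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [1.0, 1.0, 0.0], [2.0, 1.0, 0.0]]
derivative = position_derivative(chain, bond_index=0, atom_index=3)
```

In a closed segment the remaining angles would have to change as well to keep the ends fixed, which is the part an exact inverse kinematics solver supplies and which this sketch leaves out.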
We are concluding work on protein C-alpha chain building using four-body statistical potentials. Tropsha et al. have developed a four-body potential based on the observed occurrence of four-residue clusters in a large training set of known structures. This potential has been used by Krishnamoorthy and Tropsha to discriminate folded structures, and by Gan and Tropsha to evaluate decoy structures built with the lattice chain growth algorithm using a simpler two-body potential. We have improved the computation of the four-body potential using incremental Delaunay triangulation, which allows it to serve as the energy function for the chain growth algorithm, and we have moved to an off-lattice model. Together these changes produce decoys with better average properties. These results will be presented at the Second CGAL User Workshop in June 2004. We have developed a method to filter the decoys by calculating, for each decoy, the matrix of pair-wise distances between C-alpha atoms. Pande et al. have demonstrated that the average of these matrices over a set of decoys is usually closer to the native structure’s matrix than the majority of the individual decoy matrices. We use this fact to select a subset of the decoys, from which we calculate a new averaged matrix. Finally, we reconstruct C-alpha atom coordinates from the averaged matrix of pair-wise distances to produce final structures. For most proteins of up to 75 residues, we are able to produce structures from a set of 1000 decoys that are comparable to the structures Gan produced with millions of decoys. Completion of the work and submission of results are scheduled for June 1, 2004.
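The decoy filtering step can be sketched as follows. The selection rule is an assumption, since the text does not state one: here decoys are ranked by the root-mean-square difference between their distance matrix and the average over all decoys, and the closest fraction is kept. The function names and the `keep_fraction` parameter are hypothetical, and the final reconstruction of coordinates from the averaged matrix is not shown.

```python
import math


def distance_matrix(coords):
    """Pair-wise Euclidean distances between C-alpha positions."""
    n = len(coords)
    return [[math.dist(coords[i], coords[j]) for j in range(n)]
            for i in range(n)]


def average_matrix(matrices):
    """Element-wise mean of equally sized square matrices."""
    n = len(matrices[0])
    count = len(matrices)
    return [[sum(m[i][j] for m in matrices) / count for j in range(n)]
            for i in range(n)]


def matrix_rms_difference(a, b):
    """Root-mean-square difference between two distance matrices."""
    n = len(a)
    total = sum((a[i][j] - b[i][j]) ** 2
                for i in range(n) for j in range(n))
    return math.sqrt(total / (n * n))


def filtered_average(decoys, keep_fraction=0.5):
    """Average the distance matrices of the decoys closest to the
    overall mean matrix (assumed selection rule, see lead-in)."""
    matrices = [distance_matrix(d) for d in decoys]
    mean = average_matrix(matrices)
    ranked = sorted(matrices, key=lambda m: matrix_rms_difference(m, mean))
    keep = max(1, int(len(ranked) * keep_fraction))
    return average_matrix(ranked[:keep])
```

Working on distance matrices rather than coordinates means the decoys do not have to be superimposed before they are compared or averaged.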
Many biochemistry applications use simplified models of protein structures, such as the common C-alpha only model. Often the simple models must be turned into all-atom models for further analysis. This conversion can require significant human input. We have developed a new algorithm that robustly converts a C-alpha only representation of a protein backbone into an all-atom model with little or no human input. Our algorithm extends research by R. Kolodny that rebuilds C-alpha chains from known fragments and demonstrates that libraries of small fragments can model protein structure well. We select an all-atom representation for each C-alpha fragment with a simple optimization search and resolve the discrepancies in overlapping fragments with the Probik manipulation tools. Work on this project is ongoing, with an expected submission of results by Sept. 1, 2004.