| Alpha-complex potentials for proteins |
| (Guibas; Koehl, Zomorodian) |
| The traditional approach to knowledge-based prediction
of protein structures derives an energy potential on atomic interactions
as a function of statistical information about distances between the atoms.
Last year, as part of this grant, Carter et al. examined four-body
likelihood potentials based on a tetrahedralization of the atom centers.
Multiple researchers have observed that the distance potential does not contain information about long-range interactions. Based on this observation, we examine defining new energy potentials on alpha-complexes. (An alpha-complex is a subcomplex of the Delaunay complex; the alpha parameter controls the degree to which the complex captures local geometry.) Our approach extends naturally from two-body potentials (edges) to three-body (triangles) and four-body (tetrahedra) potentials, although the number of possible simplex types increases, so we will need larger protein databases to obtain good data. Preliminary experiments show that our two-body potential performs as well as the all-pairs potential while using about a tenth of the data. We plan to design better methods for identifying significant simplices for the potential computation.
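To make the edge selection concrete, here is a minimal sketch in Python. It is not the construction used in this project. It assumes that alpha is a radius in angstroms, and it keeps only those pairs whose smallest enclosing (diametral) sphere has radius at most alpha and contains no other point. Such pairs always belong to the alpha-complex, but a full construction would also keep edges of larger Delaunay simplices whose circumradius is at most alpha, so the result is a subset. The function name `gabriel_alpha_edges` and the brute-force emptiness test are illustrative assumptions.

```python
from itertools import combinations


def gabriel_alpha_edges(points, alpha):
    """Return index pairs (i, j) whose diametral sphere has radius <= alpha
    and contains no other input point.

    This is a brute-force sketch (cubic in the number of points) and yields
    only a subset of the alpha-complex edges; see the accompanying text.
    """
    max_sq_length = (2.0 * alpha) ** 2  # edge length <= 2 * alpha
    edges = []
    for i, j in combinations(range(len(points)), 2):
        p, q = points[i], points[j]
        sq_length = sum((a - b) ** 2 for a, b in zip(p, q))
        if sq_length > max_sq_length:
            continue
        is_empty = True
        for k, r in enumerate(points):
            if k == i or k == j:
                continue
            # r lies strictly inside the sphere with diameter pq
            # exactly when (r - p) . (r - q) < 0.
            dot = sum((rc - pc) * (rc - qc) for rc, pc, qc in zip(r, p, q))
            if dot < 0:
                is_empty = False
                break
        if is_empty:
            edges.append((i, j))
    return edges


# Three collinear points: the long pair (0, 2) is blocked by point 1.
pts = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
print(gabriel_alpha_edges(pts, alpha=10.0))  # [(0, 1), (1, 2)]
```

For whole proteins, a Delaunay-based implementation from a computational geometry library would be the practical choice; the brute-force loop is only meant to show the selection criterion.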
The above figure shows the distance histogram for alanine-lysine alpha-carbon pairs. The graph displays both the all-pairs calculation, for distances less than 20 angstroms, and the calculation restricted to pairs that occur as edges in the alpha-complex, with alpha equal to 10 angstroms. The database consists of 2,145 protein folds from SCOP. We use only the alpha carbons to compute the pairs. Note that the alpha-complex pairs accurately capture low-distance interactions.
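A distance histogram of this kind is commonly converted into a potential of mean force by Boltzmann inversion. The sketch below is illustrative only: the bin width, the reference distribution (for example, distances pooled over all residue-type pairs), the pseudocount, and the function names are assumptions, not details taken from this project.

```python
import math


def distance_histogram(distances, bin_width=1.0, max_dist=20.0):
    """Count distances (in angstroms) falling in [0, max_dist) per bin."""
    n_bins = int(math.ceil(max_dist / bin_width))
    counts = [0] * n_bins
    for d in distances:
        if 0.0 <= d < max_dist:
            # Guard against floating-point rounding at the last bin edge.
            index = min(int(d // bin_width), n_bins - 1)
            counts[index] += 1
    return counts


def potential_of_mean_force(pair_counts, reference_counts, kT=1.0, pseudocount=1.0):
    """Boltzmann inversion: E(bin) = -kT * ln(P_pair(bin) / P_ref(bin)).

    A pseudocount is added to every bin so that empty bins do not
    produce log(0) or a division by zero.
    """
    n_bins = len(pair_counts)
    pair_total = sum(pair_counts) + pseudocount * n_bins
    ref_total = sum(reference_counts) + pseudocount * n_bins
    energies = []
    for c_pair, c_ref in zip(pair_counts, reference_counts):
        p_pair = (c_pair + pseudocount) / pair_total
        p_ref = (c_ref + pseudocount) / ref_total
        energies.append(-kT * math.log(p_pair / p_ref))
    return energies


# Bins where the residue pair is over-represented relative to the
# reference get negative (favorable) energies, and vice versa.
energies = potential_of_mean_force([3, 1], [2, 2])
```

The same two functions apply unchanged whether the input distances come from all pairs or only from alpha-complex edges; only the list of distances passed to `distance_histogram` differs.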
These figures show scatterplots of cRMS versus potential of
mean force for two databases. The first database consists
of all pairs of alpha-carbon atoms that are less than 20 angstroms
apart. The second uses alpha-carbon pairs that occur as edges in the
alpha-complex of the protein, with alpha set to 10 angstroms. The three
graphs show potential calculations using all pairs from the first database,
and all pairs or alpha-complex pairs from the second. Note that the alpha-complex
query on the first database is competitive with the all-pairs method,
although it uses only about a tenth of the pairs. The alpha-complex database
shows degradation at higher cRMS values. |