Researchers at IBM’s Computational Biology Center have discovered a pattern in the hydrophobicity of soluble, globular proteins that they said could help validate computer simulations of protein structures.
In last week’s issue of the Proceedings of the National Academy of Sciences, IBM researcher David Silverman and his colleagues described the pattern, which appears in the transition from hydrophobic amino acids on the inside of a protein to hydrophilic amino acids on the outside.
Using a mathematical construct called a second-order moment, Silverman first calculated the change in distribution of hydrophobicity for three different globular proteins. He then found that a ratio of two well-defined quantities was approximately the same for all three.
“I think at that point lightning struck,” said Silverman. “I had discovered what some mathematicians or physicists might call an invariant.”
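For readers unfamiliar with the term, a second-order moment weights each residue's hydrophobicity by the square of its distance from the protein's center. The Python sketch below is a generic illustration only, not the IBM team's published procedure: the function name, the zero-mean shift of the hydrophobicity values, the spherical distance measure, and the toy coordinates are all assumptions made for the example.

```python
def second_order_moment(coords, hydrophobicity, radius):
    """Sum of (shifted hydrophobicity * squared distance) over residues
    lying within `radius` of the structure's centroid.

    Assumptions for illustration: one (x, y, z) point per residue,
    hydrophobicity shifted to zero mean, plain Euclidean distance.
    """
    n = len(coords)
    centroid = [sum(c[k] for c in coords) / n for k in range(3)]
    mean_h = sum(hydrophobicity) / n

    moment = 0.0
    for xyz, h in zip(coords, hydrophobicity):
        d2 = sum((xyz[k] - centroid[k]) ** 2 for k in range(3))
        if d2 <= radius ** 2:
            moment += (h - mean_h) * d2
    return moment


# Toy "protein": four hydrophobic residues near the center,
# four hydrophilic residues farther out (made-up numbers).
toy_coords = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0),
              (3, 0, 0), (-3, 0, 0), (0, 3, 0), (0, -3, 0)]
toy_hydrophobicity = [1, 1, 1, 1, -1, -1, -1, -1]

print(second_order_moment(toy_coords, toy_hydrophobicity, 1.5))  # core only: 4.0
print(second_order_moment(toy_coords, toy_hydrophobicity, 3.5))  # whole structure: -32.0
```

In this toy case the moment is positive while only the hydrophobic core is counted and turns negative once the hydrophilic shell is included, which is the kind of inside-to-outside transition the article describes.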
The hydrophobic ratio, as Silverman termed it, turned out to be 0.75 with a standard deviation of 0.045 for 30 soluble, globular structures selected from the Protein Data Bank. Silverman also performed the calculation on 14 “decoys” to see if the method could discriminate bogus structures. He found that the decoys were so different from the native structures that hydrophobic ratios could not even be assigned to them.
The IBM team believes that such hydrophobicity profiling could serve as a relatively sensitive scoring function to judge the accuracy of protein structure prediction algorithms.
While there are a number of methods to grade structure prediction, such as those used in the biennial CASP (Critical Assessment of Techniques for Protein Structure Prediction) competition at Lawrence Livermore National Laboratory, Silverman said that most of them require some knowledge of the native structure.
The hydrophobic ratio, Silverman said, could be used to verify the accuracy of a predicted structure whose native structure is not known. The computation itself takes only minutes per protein.
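One simple way such a check might work is to compare a predicted structure's ratio with the reported mean of 0.75 and standard deviation of 0.045. The Python sketch below is a hypothetical illustration, not the IBM team's scoring function: the function names and the two-standard-deviation cutoff are assumptions, and a structure for which no ratio can be assigned (as with the decoys) is simply treated as failing.

```python
# Values reported in the article for 30 soluble, globular PDB structures.
REPORTED_MEAN = 0.75
REPORTED_SD = 0.045


def ratio_z_score(ratio):
    """Number of standard deviations between `ratio` and the reported mean."""
    return (ratio - REPORTED_MEAN) / REPORTED_SD


def looks_native_like(ratio, max_abs_z=2.0):
    """Hypothetical screen: pass if the ratio lies within `max_abs_z`
    standard deviations of the reported mean.

    `ratio=None` stands for a structure with no assignable ratio.
    The default cutoff of 2.0 is an arbitrary choice for this example.
    """
    if ratio is None:
        return False
    return abs(ratio_z_score(ratio)) <= max_abs_z


for candidate in (0.76, 0.95, None):
    print(candidate, looks_native_like(candidate))
```

Because the check needs only the predicted structure's own ratio, it requires no knowledge of the native structure, which is the advantage Silverman describes.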
“This has enormous relevance in the field of proteomics,” said Ajay Royyuru, who heads up structural biology research at IBM Research. “Predicting protein structures is a key post-genomic activity because there are a lot of sequences for which we do not know the structure. Improving our ability to predict or even judge our predictions is certainly a very key activity.”
The work is complementary to IBM’s Blue Gene protein-folding project. Royyuru said the findings could guide the project in designing better functions or assessing whether a particular trajectory of folding is progressing correctly. He said that the apparent uniformity in arrangement of the amino acids could help researchers understand how and why a particular sequence folds into its final shape, but cautioned that the research is still in its early stages and it’s too soon to predict its full implications.
The team is now exploring how the ratio behaves as a function of molecular dynamics and folding, and how well it performs as a scoring function compared with existing ones. They also intend to perform a complete characterization of all the protein structures in the PDB.
Royyuru said that the team intends to make the PDB characterization information available “once we get to the point where we have it available for lots of proteins in whatever manner is most appropriate, but it’s too early for us to say how that will be.”
Silverman has calculated the hydrophobic ratio for 47 globular proteins so far and is currently applying the method to transmembrane and ribosomal proteins to see how well the pattern holds up in other types of proteins.
“We have to check whether it works for all structures,” Silverman said. “I say, don’t worry. It’s going to work for everything. I haven’t seen one exception.”