A new, three-dimensional map of protein folds might make it easier for structural genomics researchers to navigate the vast space of almost 20,000 known protein structures.
The map, published in last week’s Proceedings of the National Academy of Sciences, is the creation of Sung-Hou Kim, a professor at the University of California, Berkeley. “This gives us a global view of what the protein structure universe looks like,” he said.
Kim and his colleagues chose about 500 of the most common protein folds, building blocks that make up about 80 percent of the known protein structures, and used a program called DALI to calculate similarities between pairs of them. They then arranged the folds in three-dimensional space so that similar folds lie close together and less similar folds lie farther apart.
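For readers who want a feel for how such a layout can be computed, here is a minimal sketch using classical multidimensional scaling, one standard way to turn pairwise dissimilarities into coordinates. The article does not name the technique the group used, so the choice of method is an assumption, as are the function name `classical_mds` and the toy numbers, which are invented for illustration and are not DALI output. Similarity scores would first have to be converted into dissimilarities, a step not shown here.

```python
import numpy as np

def classical_mds(dist, dims=3):
    """Place points in `dims` dimensions from a symmetric dissimilarity matrix.

    Hypothetical helper for illustration; not the method confirmed by the article.
    """
    n = dist.shape[0]
    # Double-center the squared dissimilarities to obtain an inner-product matrix.
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ (dist ** 2) @ centering
    # eigh returns eigenvalues in ascending order; keep the `dims` largest.
    eigvals, eigvecs = np.linalg.eigh(gram)
    top = np.argsort(eigvals)[::-1][:dims]
    # Clip small negative eigenvalues that arise from rounding or non-Euclidean input.
    scale = np.sqrt(np.clip(eigvals[top], 0.0, None))
    return eigvecs[:, top] * scale

# Made-up dissimilarities between four hypothetical folds: 0 and 1 are alike,
# 2 and 3 are alike, and the two pairs differ strongly from each other.
toy_dist = np.array([
    [0.0, 1.0, 4.0, 4.2],
    [1.0, 0.0, 4.1, 4.0],
    [4.0, 4.1, 0.0, 1.2],
    [4.2, 4.0, 1.2, 0.0],
])
coords = classical_mds(toy_dist)  # one row of x, y, z coordinates per fold
```

In the resulting coordinates, the two similar pairs should sit near each other while the pairs themselves land far apart, which is the property the map relies on.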
This is not the first crack at protein fold cartography, Kim said, but previous maps have been two-dimensional rather than three-dimensional projections. Kim’s group also calculated the pairwise similarities somewhat differently than other groups had, he added. The new map “not only emphasizes the important role of secondary structure classes and chain topologies in the partitioning of the fold space, but also reveals the size of protein domain as an important factor in setting the overall distribution of folds,” Kim and co-authors write in their article.
The map also points to evolutionary relationships between protein folds, “so that the evolutionarily early proteins are close to the origin, and the ones that developed later seem to be coming out further away from the origin,” Kim said.
The researchers restricted themselves to the 500 most common folds because the calculations are computationally intensive. In a future version of the map, they want to include another 500 folds, which represent the remaining 20 percent of known protein structures, but “we have to go to another supercomputer to do it because of the scale of the computation,” Kim said.
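A rough way to see why doubling the number of folds is so costly: if every fold is compared once with every other fold (an assumption, since the article does not say exactly which pairs are compared), the number of comparisons grows roughly with the square of the number of folds. The function name below is hypothetical.

```python
from math import comb

def pairwise_comparisons(n_folds):
    """Number of unordered pairs among n_folds items, assuming all-against-all comparison."""
    return comb(n_folds, 2)

current = pairwise_comparisons(500)    # 124750 pairs for the present map
expanded = pairwise_comparisons(1000)  # 499500 pairs for the planned 1,000-fold map
growth = expanded / current            # roughly a fourfold increase
```

Under that assumption, doubling the fold count roughly quadruples the number of structure comparisons, before counting the cost of each individual comparison.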
But even this expanded map might not cover the universe of protein folds completely, since structural genomics projects are expected to discover novel folds over the coming years. In fact, the NIGMS Protein Structure Initiative, which funds nine structural genomics centers across the US (including the Berkeley Structural Genomics Center that Kim heads), aims to increase the number of different folds rather than to determine structures of members of the same protein family. “It will be a way of filling in this map,” said John Norvell, director of the Protein Structure Initiative.
“It’s almost as if you are looking at a map of the solar system, and you know where the planets are that you know about, but now, when a new planet shows up, you know where to put it in the solar system,” said Judd Berman, senior vice president for chemistry at Affinium Pharmaceuticals, which determines protein structures by x-ray crystallography and NMR to help prioritize drug targets.
Nevertheless, the map is unlikely to facilitate any immediate advances in the development of structure-based drugs. “It is setting up this inventory of protein structures that will be useful for both understanding the basic biology and eventually for more medical purposes,” said Norvell, but “I don’t think it has any direct applications to drug design.”
While other tools already exist to analyze similarities between closely related proteins, “this lets us put a relationship between proteins that we might not have previously understood were related,” said Molly Schmid, senior vice president for clinical programs at Affinium. “And that certainly could be important to us to evaluate which proteins would make good targets for therapeutic intervention.”
Although the new map does not give researchers a picture of individual protein families granular enough to reveal drug interactions that could produce side effects, its global view echoes the recent idea in drug development of considering more than one potential target at a time, according to Kim. “It’s a kind of systems view of the protein space,” he said.