By Aaron J. Sender
Speed is tough enough when the road is as straight as a linear genomic sequence. Even more so when the path is a protein structure that can theoretically buckle into an infinite number of contortions.
But that hasn’t stopped Gaetano Montelione of the Center for Advanced Biotechnology and Medicine and Rutgers University in New Jersey from trying to take 3-D protein structure determination into the fast lane.
He and his colleagues have developed software that, operating on a 40-processor Linux cluster, shortens the journey of nuclear magnetic resonance (NMR) data to protein structure from months to hours. And if Montelione gets the 128-processor cluster he wants, then “it would be done in tens of minutes,” he says.
The team’s method of choice is NMR, which is often preferred when the target protein refuses to crystallize but works only on small proteins.
To the uninitiated, the lab’s two $1 million-apiece magnets, one 500 MHz and one 600 MHz, look like huge steel vats better suited to brewing beer than collecting NMR spectra.
Once raw data are distilled, Montelione’s AutoAssign tool automatically links each resonance peak with the particular atom in the target protein that generated it. It does so by integrating an object-oriented database of amino acid structures with constraint propagation methods.
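For readers curious about what constraint propagation means in practice, here is a minimal Python sketch of the general idea. It is not AutoAssign's code: the peak labels, atom names, and the `propagate` function are all hypothetical, and the only rule assumed is that each atom can account for at most one peak.

```python
def propagate(candidates):
    """Elimination-style constraint propagation (illustrative only).

    `candidates` maps each peak label to the set of atoms that could have
    produced it. Whenever a peak is down to a single atom, that atom is
    removed from every other peak, and the pass repeats until nothing changes.
    """
    domains = {peak: set(atoms) for peak, atoms in candidates.items()}
    changed = True
    while changed:
        changed = False
        for peak, atoms in domains.items():
            if len(atoms) != 1:
                continue
            (fixed,) = atoms  # the one atom this peak must belong to
            for other, other_atoms in domains.items():
                if other != peak and fixed in other_atoms:
                    other_atoms.discard(fixed)
                    changed = True
    return domains


# Hypothetical toy data: three peaks, three backbone amide protons.
candidates = {
    "peak_1": {"Ala3.HN"},
    "peak_2": {"Ala3.HN", "Gly7.HN"},
    "peak_3": {"Ala3.HN", "Gly7.HN", "Ser9.HN"},
}
assignments = propagate(candidates)
```

In this toy case, fixing the first peak narrows the second, which in turn narrows the third, so every peak ends up with exactly one atom. Real spectra are noisier and would likely need a search step when propagation alone leaves ambiguity.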
NMR resonance assignments are valuable even with a crystal structure available, says Montelione. “If I know the structure and I know the resonance assignments, I can screen small molecules and identify those that bind and where they bind,” he says. “You can screen hundreds to thousands of compounds per day with this method.”
Getting from assignments to structure is the compute-intensive part: The resonance assignments, combined with nuclear Overhauser effect and other data, are fed into the lab’s second program, AutoStructure. Parallel processors sift through endless possible protein shapes, finally generating 3-D structures based on calculated constraints.
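The flavor of that sifting can be sketched in a few lines of Python. This is an illustration under stated assumptions, not AutoStructure's algorithm: it assumes each nuclear Overhauser effect measurement has already been turned into an upper bound on the distance between two atoms, and the function names, coordinates, and 5-angstrom bounds are invented for the example. Threads stand in for the cluster's processors to keep the sketch portable.

```python
import math
from concurrent.futures import ThreadPoolExecutor


def violation(coords, restraints):
    """Sum how far each atom pair exceeds its upper distance bound (angstroms)."""
    total = 0.0
    for a, b, upper in restraints:
        distance = math.dist(coords[a], coords[b])
        total += max(0.0, distance - upper)
    return total


def best_candidate(candidates, restraints, workers=4):
    """Score every candidate shape concurrently; return (best index, all scores)."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        scores = list(pool.map(lambda c: violation(c, restraints), candidates))
    best = min(range(len(candidates)), key=scores.__getitem__)
    return best, scores


# Hypothetical restraints: (atom, atom, upper bound in angstroms).
restraints = [("A", "B", 5.0), ("B", "C", 5.0)]

# Hypothetical candidate shapes, each mapping atom name to (x, y, z).
candidates = [
    {"A": (0, 0, 0), "B": (10, 0, 0), "C": (20, 0, 0)},  # stretched out
    {"A": (0, 0, 0), "B": (3, 0, 0), "C": (3, 4, 0)},    # satisfies both bounds
    {"A": (0, 0, 0), "B": (6, 0, 0), "C": (6, 3, 0)},    # slightly too long
]
best, scores = best_candidate(candidates, restraints)
```

Because each candidate is scored independently, the work divides cleanly among processors, which is why adding processors to a cluster can shorten this stage.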
The software, for which Montelione is negotiating a commercial licensing deal while keeping academic access open, is just one leg of his attempt to create a high-throughput protein structure factory.
Montelione heads a consortium that recently received $25 million from the NIH to automate protein structure determination. Together, the consortium’s labs are exploring how crystallography and spectroscopy can complement each other. The goal: solve 180 structures annually by the fifth year.
But proteins resist speed. Experts put the number of distinct protein families at 10,000 to 20,000. Despite recent advances in X-ray crystallography and NMR spectroscopy, the complete silhouette of the human proteome is likely decades away.
With the NMR data-analysis roadblock removed, “now the bottleneck is data collection,” says Montelione. “To collect the data for a 3-D structure determination takes about five or six weeks,” he says. “That’s too slow.”
So Montelione is experimenting with Bruker NMR cryogenic probes to reduce thermal noise and is considering getting an 800 MHz spectrometer. He expects to cut the data collection time to three or four days.
“Although 3-D protein structure is a lot more work and it’s not high-throughput the way sequencing is, it has a lot more value, particularly if you can tie it in with function,” Montelione says.