Amidst a flurry of quarterly earnings reports and Lab Automation product releases (see table, p. 6), structural proteomics has graced the newswires in the past several weeks with its own set of announcements in the area of X-ray crystallography automation. Last week, a pair of California scientists published a paper describing a computer program that entirely automates X-ray diffraction image analysis, and Stanford unveiled its newest synchrotron light source for X-ray analysis, the SPEAR3. These events were sandwiched between an announcement from the NSF that it would provide another $30 million over the next five years to support the Protein Data Bank, which currently contains the 3D structures of 24,000 proteins, and an announcement this week that Fluidigm, a crystallization-on-a-chip company, had licensed patented submicroliter crystallization technology from San Diego-based Syrrx.
ProteoMonitor talked this week with some of the players behind last week’s X-ray crystallography technology developments — and some of the leaders of the large structural proteomics initiatives that will likely put the technology to use — about how these developments might help reach the goal of making X-ray crystallography a truly high-throughput proteomic technique, and what challenges remain.
Calling Santa’s Little Helpers
X-ray crystallography is a notoriously slow and expensive process. Even the best high-throughput centers can currently hope to solve at most about 100 structures per year, and the Protein Structure Initiative, sponsored by the National Institute of General Medical Sciences, estimates the current average cost per structure to be about $240,000. There are several widely acknowledged bottlenecks: producing enough of the desired protein, producing viable crystals from that protein, arranging for time at one of a few powerful light sources such as the Advanced Photon Source at Argonne National Laboratory, and doing back-end analyses of diffraction data and final structure validation. None of these steps is as fast as other types of proteomics techniques, so automation is key.
James Holton, who now operates the Advanced Light Source at Lawrence Berkeley National Laboratory, thinks he has found an automation solution to much of the back-end backlog in Elves. The software, which Holton and his University of California, Berkeley colleague Thomas Alber described in a PNAS paper published online last week, “is the first expert system for protein crystallography that can do everything from image processing to model building all by itself,” Holton said. “It’s got enough of the rules right that it can actually succeed in the whole process without human intervention.” Holton said that the sample structure he described in the PNAS paper took him six months to solve going through all the data processing, while “Elves can do it in under an hour.” This speed increase, Holton said, was due entirely to the Elves automation. He acknowledged, however, that this rate is not exactly typical: speed varies from structure to structure depending on parameters like unit cell volume, and the average time required to complete the process is about 3 to 4 days. To date, Holton’s record for fastest time is 19 minutes.
Elves works by combining existing data analysis programs such as Wedger and Scaler, which use various parameters to help build a model, and essentially automating the steps in between with the help of an English-language user interface. “It’s running the exact same programs and doing it in the exact same way: it’s just not making any mistakes,” he said.
Holton believes that when it comes to crystal structure determination, human mistakes are the biggest problem holding up the march to high-throughput capabilities. “I’ve found that the preponderance of delays in modern structure determination is because of human error — that’s where the bottleneck comes in,” he said. By eliminating that error, he said, speed of analysis would be limited only by the algorithms and computer speed.
Two companies have already bought into his idea. Plexxikon, which announced a deal in May to co-crystallize targets and inhibitors for Genentech (see PM 5-26-03), and Sugen, now part of Pfizer, have licensed the software for an undisclosed fee, Holton said. Now that the PNAS paper has been published, Elves will be made freely available to non-commercial groups, while companies will still pay a licensing fee. Holton said he has been unofficially distributing the software for free for “the last couple of years,” but that it will soon be available for official web download once the setup “meets the requirements of Berkeley’s lawyers.”
Funding for Elves came from the TB Structural Genomics Consortium, one of the nine centers funded by the NIGMS PSI.
Shining a Brighter, Faster Light
Now that Holton has worked out the back-end, he has turned his attention to the data collection bottleneck, the point at which the light is shone on the protein crystal to get an X-ray diffraction pattern for analysis. “I think the bottleneck for [the] data collection part of things is that synchrotron visits are too long and too infrequent. You have to wait about a year to get time, and then you have 24 hours to do everything,” he said. Holton’s proposed solution: a system he called “FedEx crystallography,” in which someone can send in a small, 10-micron crystal sample for a “micro-shift,” get a result back, and then have time to think before proceeding to the lengthy and costly step of producing a larger crystal. “Once you’ve got these crystals, you need to know if they’re any good. And you need to know if they’re good sooner rather than later,” he said, noting that small sample crystals make it hard to obtain coveted synchrotron time. “You can burn close to $1 million turning a 10 micron crystal into 300. So if it turns out to be salt, well, everyone has a story.”
Holton pointed to Fluidigm’s technology as another potential solution to the problem of large crystal requirements. “I think technologies like these nano-drops that Fluidigm is doing where you can do experiments with less material are probably going to be important,” he said.
Until FedEx takes an interest in crystallography, or Fluidigm’s nano-vision comes to fruition, more time and faster processing can help ease the long synchrotron lines, says Keith Hodgson, director of the Stanford Synchrotron Radiation Laboratory. Hodgson thinks that his lab’s newest offering — the SPEAR3 synchrotron — will improve throughput significantly.
This system, an upgrade on the SPEAR2, “puts SPEAR in the class of being competitive with the brightest light sources in the world,” such as Berkeley’s ALS and Argonne’s APS, according to Hodgson. “The number of photons that we can deliver to a sample … goes up on beamlines from a factor of 5 to a factor of 200,” he said. “That means we can collect data faster, and the average throughput will go up.” Hodgson said that the first of the beamlines to be set up on the new machine would be for macromolecular crystallography — an activity that he said takes up 30 to 50 percent of beamline time in the US. The synchrotron is jointly funded by the NIH and US Department of Energy to the tune of $28 million for each system.
Faster structures are not the only advantage the new machine will bring to crystallographers, according to Hodgson. The stronger beams help solve “complex molecular assemblies at increasingly high resolution … of 3 angstroms or below,” he said. In addition, the beamlines are integrated with software and hardware control accessible from remote locations, as well as a lot of automation — the combination of which Hodgson believes “will herald a new area of accessibility of macromolecular crystallography to a broader range of scientists.” He said the automated process will ultimately take about 3.5 minutes per sample screened.
In addition to operating the synchrotron, Hodgson is also deeply involved with the Joint Center for Structural Genomics based at the Scripps Research Institute, another of the nine pilot centers funded by the NIGMS PSI. The stated long-range goal of the PSI is “to make the three-dimensional atomic-level structures of most proteins easily available from knowledge of their corresponding DNA sequences.” The goals of the pilot phase are somewhat less ambitious. “The idea of pilots is, of course, we don’t quite know how to put it all together, and there need to be different things tried,” said John Norvell of NIGMS, who directs the program.
The PSI pilot phase, which began in 1999, is funded with $65 million per year and will run through the end of this year. The production phase, set to begin in 2005, will likely boost funding to $75 million per year, although existing centers will have to reapply to be considered for continued participation, Norvell said. Regardless of the status of JCSG in the production phase, however, both Norvell and Hodgson predicted that the new synchrotron would play a significant role. “That’s part of the reason the NIH was willing to spend $28 million of its valuable funds,” Hodgson said. “They are looking to the future to see that the PSI will increasingly demand synchrotron resources, and the better they are, the more successful the science will be that comes from that initiative.”
Said Norvell, “Certainly synchrotrons like the upgrade at Stanford are crucial to this process — if you’re going to automate the process you have to have a devoted instrument.”
Hodgson said that the first SPEAR3 beamline will be up and running within a month, with 30 more beamlines going up over the next several months.
The Final Bottleneck: Getting the Crystal
As much as automation in the middle and back-ends of the X-ray structure determination process may help things, the ultimate bottleneck for many proteins is forming a crystal in the first place. X-ray crystallography by nature “is tedious and it’s kind of a black box,” said Cheryl Arrowsmith, chief scientist of the Canadian branch of the Structural Genomics Consortium (see pioneer, p. 7). Norvell, Hodgson, and Holton all named membrane proteins as particular challenges. Norvell also noted that the PSI production phase would include a focus on this obstacle, and that the NIH Roadmap initiative to form Protein Production Facilities (see PM 10-3-03) would focus on overcoming obstacles to production of these proteins. Such obstacles must be overcome, the scientists said, before X-ray crystallography can truly be a high-throughput proteomic technique. “Today, if we wanted to feed this pipeline with membrane-bound proteins, the trickle would be silence at the end of the pipeline,” said Hodgson. “We’re very well-poised to take advantage of the methodologies. … [but] this is an example of where the front-end has to develop new technologies to feed that pipeline.”
See next week’s issue for a sneak peek at the PSI’s upcoming production phase.