The National Institute of General Medical Sciences’ Protein Structure Initiative is preparing to kick-start its production phase with the release of two new RFAs in the next month or two, and a proposed $75 million annual budget, PSI director John Norvell told ProteoMonitor.
This new push follows the PSI’s 5-year pilot phase, which has involved the funding of nine structural genomics centers across the US with $65 million in FY2003 and in FY2004, as well as about 25 to 30 smaller research grants funded through traditional NIGMS funding mechanisms.
The PSI’s stated goal is to “make the 3-dimensional atomic level structures of most proteins easily available from knowledge of their corresponding DNA sequences.” The program’s pilot phase officially expires at the end of this fiscal year, and “the current [nine] centers wouldn’t exist beyond that,” Norvell said, noting that the centers could reapply to participate in the production phase slated to begin in 2005. He emphasized, however, that the new applications will be judged in an open process, and that previous centers might wish to “reorganize” to fit into the layout of the next phase. The nine current pilot centers are headquartered at: Lawrence Berkeley National Laboratory, the University of Wisconsin, Scripps Research Institute, Argonne National Laboratory, Structural Genomix, Rutgers University, the University of Georgia, the University of Washington, and Los Alamos National Laboratory.
The first new RFA will lay out the funding details for the establishment of three to five large-scale centers for high-throughput structure production, each of which will be able to ask for up to $12 million in total costs per year. That RFA is “awfully close to being complete; there [are] just a few more people to sign off on it,” Norvell said.
The mandate of each large-scale center will be to produce at least 200 unique, non-redundant protein structures per year, and to further develop the accompanying technologies and methodologies that the pilot phase pioneered. At least one other RFA will fund a total of three to six small-scale centers at a proposed budget of up to $4.5 million in total costs per center. Two types of small-scale centers are proposed: One type will focus on developing tactics to solve the structures of difficult-to-crack proteins such as membrane and higher eukaryotic proteins, and the other will focus on structures related to particular disease states. The latter category of small centers will be jointly funded by specific disease institutes at the NIH, Norvell added. “All these together would be part of this whole PSI network that would begin in 2005,” he said.
The emphasis of the PSI in both phases is to produce unique, non-redundant protein structures: that is, to divide the proteome into sequence-based families whose members share greater than 30 percent sequence identity, and then solve only one representative structure per family. “[E]ven if the success rate for any one is relatively low, the fact that you have several different targets from each family gives you several to choose from,” Norvell said. The idea is that once a representative structure is available, in silico comparative modeling can predict the structures of the rest of the proteins in that family.
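As a rough illustration of the family-based target selection described above, the following minimal Python sketch groups sequences into families using the 30 percent identity cutoff and treats the first member of each family as its representative. The greedy grouping scheme, the position-by-position identity measure on pre-aligned sequences, the function names, and the toy sequences are all assumptions made for illustration; the article does not describe how the PSI centers actually cluster sequences or choose targets, and real pipelines would rely on proper sequence alignment.

```python
from typing import Dict, List


def percent_identity(a: str, b: str) -> float:
    """Percent of identical positions in two pre-aligned, equal-length sequences.

    Assumption: sequences are already aligned; gap characters ('-') never count
    as matches. This is a simplification, not a real alignment method.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be pre-aligned to equal length")
    if not a:
        return 0.0
    matches = sum(1 for x, y in zip(a, b) if x == y and x != "-")
    return 100.0 * matches / len(a)


def group_into_families(
    seqs: Dict[str, str], threshold: float = 30.0
) -> Dict[str, List[str]]:
    """Greedy grouping (hypothetical scheme): each sequence joins the first
    family whose representative it matches above the threshold; otherwise it
    founds a new family and becomes that family's representative."""
    families: Dict[str, List[str]] = {}
    for name, seq in seqs.items():
        for rep in families:
            if percent_identity(seq, seqs[rep]) > threshold:
                families[rep].append(name)
                break
        else:
            # No existing representative is similar enough: start a new family.
            families[name] = [name]
    return families


# Toy, made-up sequences used only to exercise the functions.
toy = {
    "protA": "MKTAYIAKQR",
    "protB": "MKTAYIAKQL",
    "protC": "GGGGGGGGGG",
    "protD": "GGGGGGGGGA",
}
families = group_into_families(toy)
representatives = list(families)  # one structure to solve per family
```

In this sketch, each key of `families` stands for the one structure that would be solved experimentally, while the other members of the family are the candidates left to comparative modeling.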
The pilot phase showed that solving representative structures could be done. In the next stage, Norvell said, “the idea is to emphasize larger families which will then have a greater benefit to the scientific community because there will be more bang for your buck.” This greater yield would be accomplished through comparative modeling, which would be done in a “separate [NIGMS] effort.”
A driving force in this exercise is the desire to improve upon the current efficiency of the Protein Data Bank, the public repository of 24,000 3D structures into which the PSI’s structures are also entered. “[M]ost of the structures in the PDB have a higher degree of redundancy — only about 10 percent are really new as defined by their sequence identity,” Norvell said. The PDB received a $30 million boost last month from the NSF (see PM 2-6-04).
The PSI pilot phase also demonstrated the ability to “put together a pipeline and show you could construct it and scale it up,” said Norvell. “That’s taken a major advancement in robotic instruments, laboratory and data management systems, putting together staff, developing SOPs, and … really learning to make the whole pipeline work in an efficient way.” As examples of interesting pilot phase technologies, Norvell pointed to a solubility engineering technique developed at the Los Alamos center; a cell-free synthesis technique developed at the Wisconsin center and at a collaborating Japanese company; and an NMR data analysis technique developed at Rutgers.
The pilot phase “has developed a tremendous amount of technology that is going to solve a number of structures,” said Keith Hodgson, director of the Stanford Synchrotron Radiation Laboratory and head of the structure determination core at the Scripps-based PSI center. Hodgson indicated his intention to be involved in later phases of the PSI, at least through the auspices of the SSRL.
But a number of challenges remain, according to Norvell. In particular, he said, there is the need to get everyone together in a more organized and focused way. “Management and administration from the central level is a big challenge; all the scientists running these [projects] have been trained to look at [the problem] in a different way,” he said. “We need a different level of organization.”
This organization will eventually extend beyond the PSI. As part of its $2.1 billion roadmap initiative, the NIH is planning to fund a set of protein production facilities for the expression and purification of membrane proteins (see PM 10-3-03). Norvell, who is involved in the development of this program, said that there will be interactions between the two programs.
In addition, the PSI is beginning to think more internationally: The initiative’s directors have talked in particular with the Wellcome Trust in the UK about developing policy statements, and an international meeting of scientists involved in structure initiatives will take place this November, Norvell said. “I think it’s not really at the point of collaboration at this point, but as these pilots around the world progress, there will probably be reasons to think about additional discussions about target selection and maybe even sharing of facilities,” he said.
In the meantime, Norvell has his hands full preparing for the next phase of the PSI and working on some complementary projects. These projects include a pair of open databases designed as supplements to the PDB. One contains lists of the targets that the PSI centers are considering attacking, updated weekly; the other (nicknamed PEPC, for protein expression, purification, and crystallization, and now under construction) will house, in a searchable format, all the information involved in these steps that “is typically lost except for the one that works,” Norvell said. In addition, the PSI will fund $50,000 mini-projects to conduct functional studies on proteins that have solved structures.