Scientists from the Institute for Systems Biology and the Swiss Federal Institute of Technology (ETH Zurich) announced this week at the Human Proteome Organization's annual meeting in Sydney that they have completed an initial mass spectrometry-based map of the entire human proteome.
The map comprises more than 170,000 selected reaction monitoring (SRM) assays – one each for at least five proteotypic peptides for each of the 20,300 human genes currently annotated as protein-encoding – and represents the first "complete, comprehensive set of assays" covering the human proteome, Robert Moritz, director of proteomics at ISB and the project's co-leader along with ETH Zurich's Ruedi Aebersold, told ProteoMonitor.
The mapping project launched in October 2009 with funding from the National Human Genome Research Institute of the National Institutes of Health, which provided $2.7 million in direct funds under the American Recovery and Reinvestment Act. It also received €2.7 million ($4.1 million) from the European Research Council (PM 10/23/2009).
The aim of the work is to build a set of standardized SRM assays that proteomics researchers can use to more easily investigate proteins of interest. Building SRM assays from scratch is a time-consuming process in which researchers must identify the peptides on which to base assays for a given protein; determine the best transitions to monitor by mass spec; and develop and optimize methods for performing the separation and the assay.
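To make the notion of a "transition" concrete, the following is a minimal sketch, not the ISB/ETH pipeline, of how candidate precursor/fragment m/z pairs can be enumerated for a peptide from standard monoisotopic residue masses. The function names and the toy sequence "PEPTIDE" are illustrative assumptions; in practice, the best transitions are chosen from measured fragmentation spectra rather than computed lists.

```python
# Monoisotopic residue masses (Da) for the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056    # mass of H2O added to the residue sum
PROTON = 1.007276   # mass of one charge-carrying proton


def precursor_mz(peptide, charge=2):
    """m/z of the intact, unmodified peptide at the given charge state."""
    neutral = sum(RESIDUE_MASS[aa] for aa in peptide) + WATER
    return (neutral + charge * PROTON) / charge


def y_ion_mz(peptide, n):
    """m/z of the singly charged y-ion formed by the last n residues."""
    return sum(RESIDUE_MASS[aa] for aa in peptide[-n:]) + WATER + PROTON


def candidate_transitions(peptide, charge=2):
    """Hypothetical helper: (precursor m/z, fragment label, fragment m/z)
    for y1 through y(n-1). Ranking by intensity is not attempted here."""
    q1 = precursor_mz(peptide, charge)
    return [(q1, f"y{n}", y_ion_mz(peptide, n))
            for n in range(1, len(peptide))]


if __name__ == "__main__":
    # "PEPTIDE" is a toy sequence, not a proteotypic peptide from the atlas.
    for q1, label, q3 in candidate_transitions("PEPTIDE"):
        print(f"{q1:.4f} -> {label} {q3:.4f}")
```

A list like this is only the starting point: choosing which of the candidates actually give strong, specific signal on a given instrument is the optimization work that a prebuilt assay resource is meant to spare researchers.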
The ISB/ETH SRMAtlas essentially takes care of these steps for proteomics researchers, allowing them to detect and quantify proteins important to their research without going through the process of developing mass spec assays for each protein.
"This is the first time that any researcher has the capability of choosing any set of proteins they are interested in and having complete, comprehensive [SRM assay] coverage of the proteins of interest," Moritz said.
He also noted that the database will provide researchers with a consistent, standardized resource for SRM assay development, and that the proteomics community, as represented at this year's HUPO meeting, had embraced it as such.
"This is a clear, comprehensive database, and the HUPO conference has fully endorsed this as the pathway going forward," he said.
When the project launched last year, ISB researchers set a goal of developing assays for the four best proteotypic peptides for each of the known protein-coding genes in humans. A year later, they've managed to surpass that target, developing assays for at least five proteotypic peptides for each protein-coding gene and, in some cases, Moritz noted, developing additional assays to account for larger proteins and some post-translational modifications like N-glycosylation, phosphorylation, and acetylation.
"Some of the proteins have up to 30 peptide assays per protein," he said. "When we started looking at very large proteins we covered an extra peptide for every 10 kDa of molecular weight. We also extended the coverage of membrane proteins and secreted proteins to provide for glycosylation sites."
The researchers also developed assays targeting protein changes related to SNPs represented in the human population at frequencies greater than 30 percent.
The SRM assays were built at ISB using Agilent's 6460 triple-quadrupole LC-MS/MS platform and at ETH Zurich using QTRAP instruments from AB Sciex. The ISB researchers also used Agilent's 6530 quadrupole time-of-flight LC-MS platform to obtain full-scan spectral data for the peptides. Thermo Fisher supplied peptides used in the work, and OriGene provided the project with full-length purified human proteins (PM 05/14/2010).
Although the assays were built using Agilent and AB Sciex instruments, Moritz said they should be easily transferable to other platforms.
"We've done a study on ion transportability, and they can be transported across a number of different platforms," he said. "The fragmentation pattern that was derived is quite comparable to many other manufacturers', as well. It uses standard quadrupole fragmentation."
Moritz also noted that because the data was collected in high resolution, it will be applicable to assay development on high-resolution instruments that may be released in the future.
"This future-proofs the data. It extends normal quadrupole technology, and as Agilent and others manufacture [new] high-resolution techniques, this data will be immediately applicable to that type of technology because we already have it in very high resolution," he said.
The researchers are currently doing the bioinformatics work necessary to place the assay data on the SRMAtlas website, and are preparing a paper based on the work. Once the paper has been published, the full datasets will be made available to the public, Moritz said, adding that he expected that would happen within the next four to six months.
Agilent is also working to incorporate the data into its suite of mass spec informatics products, he said.
The project will now embark on its second phase, still operating under the original $2.7 million ARRA grant. The aim of this phase, which will last an additional 12 months and will again use Agilent and AB Sciex equipment along with peptides and proteins provided by Thermo Fisher and OriGene, is to increase the number of available assays, Moritz said.
"The total number of assays as defined by proteotypic peptides of the human proteome is on the order of 465,000. We currently have 170,000 of those. [The goal is] expanding that out to 250,000 to 300,000 and then also applying [the assays] directly to human tissue to define as many proteins as possible with them," he said.
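For a sense of scale, the figures Moritz cites can be turned into rough coverage percentages. The short sketch below uses only the numbers quoted above; the variable and function names are illustrative, and the percentages are simple ratios rather than figures reported by the project.

```python
# Figures quoted by Moritz (approximate, "on the order of").
TOTAL_POSSIBLE_ASSAYS = 465_000
CURRENT_ASSAYS = 170_000
PHASE_TWO_GOALS = (250_000, 300_000)


def coverage_percent(assays, total=TOTAL_POSSIBLE_ASSAYS):
    """Share of the estimated total assay space, as a percentage."""
    return round(100 * assays / total, 1)


if __name__ == "__main__":
    print(f"current: {coverage_percent(CURRENT_ASSAYS)}%")   # 36.6%
    for goal in PHASE_TWO_GOALS:
        print(f"goal {goal:,}: {coverage_percent(goal)}%")   # 53.8%, 64.5%
```

On those numbers, the existing assays cover a little over a third of the estimated total, and the second phase would bring that to roughly half to two-thirds.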