This story originally ran on May 18 and has been updated.
A study published this week finds that mass spectrometry technology is robust and can achieve reproducible results, while suggesting that the skills of the researchers using the machines may not always match the technical abilities of the instruments.
In the study published May 17 in the online edition of Nature Methods, members of the Human Proteome Organization report that out of 27 research facilities and mass-spec vendors that participated in a test-sample study, just seven were able to identify all 20 proteins in the sample, and only one identified all 22 tryptic peptides with a mass of 1,250 Daltons.
But rather than blaming the mass specs used by the participants, the study's authors attributed the dismal success rate to other factors. In a centralized analysis of the raw data deposited by the participants, the authors found that all 20 proteins and most of the 1,250 Da peptides had in fact been detected, indicating that the technology was robust. Missed identifications, they said, stemmed from problems such as false negatives, environmental contamination, database matching, and curation of protein identifications.
"What it says [about] the technology is, if you sum up everyone's data, the results are just stunningly overwhelming," said John Bergeron, a professor of anatomy and cell biology at McGill University and a co-author of the Nature Methods report. "You basically have complete coverage of these proteins."
Even though the success rate among the participating labs was low, he said that the mistakes were "trivial" and that the steps necessary to get every lab to 100 percent success were easily achievable.
According to his colleagues at HUPO, the value of the test-sample study is two-fold: first, it takes a major step in demonstrating the reproducibility of proteomics; and second, it highlights the need for education and training for the technology.
According to Gil Omenn, the "big news is that the findings were there in the spectra." Omenn is a vice president of HUPO and a professor of human genetics and public health at the University of Michigan Medical School, where he is also director of the Center for Computational Medicine and Bioinformatics.
Fishing for Proteins
The report is the culmination of an effort first proposed almost three years ago by the Human Proteome Organization to develop a protein-standard mixture to benchmark instrument platforms and to show the research community that reproducibility could be achieved in proteomics [see PM 07/20/06]. Preliminary results from the sample-test study were first reported by ProteoMonitor in late 2007 [see PM 09/06/07]. The results published this week provide greater insight into the state of mass spec-based proteomics methods.
For the study, HUPO chose 20 proteins with molecular weights between 32 kDa and 110 kDa from the open reading frame collection and the mammalian gene collection.
The proteins selected had a purity of at least 95 percent, unique tryptic peptide sequences, and at least one tryptic peptide of 1,250 Da plus-or-minus 5 Da. The proteins were expressed in E. coli and purified by ion exchange and reverse-phase chromatography or by preparative electrophoresis purification from inclusion bodies.
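The selection criterion of at least one tryptic peptide at 1,250 Da plus-or-minus 5 Da can be illustrated with a small in-silico digest. The sketch below is illustrative only, assuming standard monoisotopic residue masses and the common trypsin rule (cleave after lysine or arginine unless followed by proline); it is not the study's actual selection pipeline, and the example sequences are made up.

```python
# Hedged sketch: in-silico tryptic digest and selection of peptides
# near 1,250 Da. Residue masses are standard monoisotopic values;
# the example sequences are invented, not the study's 20 proteins.

WATER = 18.010565  # monoisotopic mass of H2O added to the residue sum

MONO = {
    'G': 57.02146, 'A': 71.03711, 'S': 87.03203, 'P': 97.05276,
    'V': 99.06841, 'T': 101.04768, 'C': 103.00919, 'L': 113.08406,
    'I': 113.08406, 'N': 114.04293, 'D': 115.02694, 'Q': 128.05858,
    'K': 128.09496, 'E': 129.04259, 'M': 131.04049, 'H': 137.05891,
    'F': 147.06841, 'R': 156.10111, 'Y': 163.06333, 'W': 186.07931,
}

def tryptic_peptides(seq):
    """Cleave after K or R, except when the next residue is proline."""
    peptides, start = [], 0
    for i, aa in enumerate(seq):
        if aa in 'KR' and (i + 1 == len(seq) or seq[i + 1] != 'P'):
            peptides.append(seq[start:i + 1])
            start = i + 1
    if start < len(seq):
        peptides.append(seq[start:])
    return peptides

def peptide_mass(pep):
    """Monoisotopic peptide mass: residue masses plus one water."""
    return sum(MONO[aa] for aa in pep) + WATER

def near_1250(seq, tol=5.0):
    """Tryptic peptides whose mass falls within 1,250 +/- tol Da."""
    return [p for p in tryptic_peptides(seq)
            if abs(peptide_mass(p) - 1250.0) <= tol]
```

A protein would qualify under this criterion if `near_1250` returns at least one peptide for its sequence.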
Dried samples containing 5 picomoles of each protein were shipped on dry ice along with detailed examples of LC-MS proteomics analyses to the participating labs.
Each of the labs was told to use the National Center for Biotechnology Information nonredundant human protein database of Nov. 27, 2006, which contained exact matches for all 20 test-sample proteins.
All participants were also asked to deposit their raw data, methodology, peak lists, peptide statistics, and protein identifications into Tranche. That information was then submitted to the Proteomics Identifications Database, or PRIDE.
The 27 participants — 41 research facilities and six mass-spec vendors were originally asked to participate — were free to use any mass spec platform and procedures of their choosing.
While seven labs identified all 20 proteins, seven other labs encountered naming errors, and six experienced naming errors, false positives and redundant identifications. A final group of seven labs experienced multiple problems including trypsinization, undersampling, incomplete matching of MS/MS spectra resulting from acrylamide alkylation, and database search errors.
In addition to protein identification, participants were also asked to identify 22 tryptic peptides with a mass of 1,250 Da, six of which had cysteine residues "whose mass increased as a consequence of reduction and alkylation as routinely used before protein trypsinization," the authors wrote.
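The mass increase the authors describe is a fixed modification on cysteine, and its size depends on the alkylating reagent. As a hedged illustration only: carbamidomethylation by iodoacetamide adds roughly 57.02146 Da per cysteine, while the propionamide adduct from acrylamide (the reagent mentioned among the participants' problems) adds roughly 71.03711 Da. Which reagent a given lab used is an assumption here, not something the article specifies.

```python
# Hedged sketch: how a fixed cysteine alkylation shifts a peptide's mass.
# The per-cysteine delta depends on the reagent; both values below are
# standard monoisotopic shifts, but which applies to any particular
# lab's workflow is an assumption.

CYS_SHIFTS = {
    'carbamidomethyl': 57.02146,  # iodoacetamide alkylation
    'propionamide':    71.03711,  # acrylamide alkylation
}

def alkylated_mass(unmodified_mass, peptide, reagent='carbamidomethyl'):
    """Add the per-cysteine mass shift for the chosen alkylation reagent."""
    return unmodified_mass + peptide.count('C') * CYS_SHIFTS[reagent]
```

A search engine configured for the wrong reagent would look for peptide masses off by roughly 14 Da per cysteine, which is one plausible way alkylation choices can cause missed matches.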
While only one lab was able to identify the 22 peptides, three others reported detecting peptides that contained cysteines.
Other groups, though, incorrectly reported 1,250 Da peptides derived from contaminated proteins, and several groups reported incorrect peptides as a result of a single missed trypsin cleavage.
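A missed cleavage leaves two adjacent tryptic fragments joined into one longer peptide, which can land near a target mass by coincidence. The sketch below, a simplified illustration rather than any group's actual digest model, shows how allowing missed cleavages expands the candidate peptide list.

```python
# Hedged sketch: enumerating tryptic peptides with up to a given number
# of missed cleavages. Cleavage rule: after K or R, unless followed by P.
# Example sequences are invented for illustration.

def peptides_with_missed_cleavages(seq, max_missed=1):
    # First compute the fully cleaved fragments.
    frags, start = [], 0
    for i, aa in enumerate(seq):
        if aa in 'KR' and (i + 1 == len(seq) or seq[i + 1] != 'P'):
            frags.append(seq[start:i + 1])
            start = i + 1
    if start < len(seq):
        frags.append(seq[start:])
    # Then join runs of up to (max_missed + 1) adjacent fragments.
    out = []
    for i in range(len(frags)):
        for j in range(i, min(i + max_missed + 1, len(frags))):
            out.append(''.join(frags[i:j + 1]))
    return out
```

With `max_missed=1`, every joined pair of adjacent fragments is also a candidate, so a single missed cleavage is enough to produce a peptide whose mass differs substantially from any fully cleaved fragment.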
Even after intervention by the test-sample study's organizers, just one additional lab reported all 1,250 Da peptides in their dataset.
Next, the study's organizers analyzed the raw data from the participants. They used a uniform protocol of database searching with X! Tandem and post-processing with the Trans Proteomic Pipeline "to assign probabilities to all identifications and global false discovery rates as well as to determine the total number of tandem MS spectra assigned, number of distinct peptides, and amino acid sequence coverage," they said in the Nature Methods paper.
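The idea of a global false discovery rate can be sketched with the common target-decoy convention: count how many decoy (known-wrong) hits score above a threshold relative to target hits. This is a deliberately simplified illustration; the Trans Proteomic Pipeline's actual probability models (PeptideProphet and ProteinProphet) are considerably more involved.

```python
# Hedged sketch: a simplified target-decoy FDR estimate, not the
# Trans Proteomic Pipeline's actual statistical model.

def global_fdr(scores, is_decoy, threshold):
    """Estimate FDR above a score threshold as (decoy hits / target hits)."""
    targets = sum(1 for s, d in zip(scores, is_decoy)
                  if s >= threshold and not d)
    decoys = sum(1 for s, d in zip(scores, is_decoy)
                 if s >= threshold and d)
    return decoys / targets if targets else 0.0
```

Raising the threshold trades sensitivity for a lower estimated FDR, which is the basic dial any such reanalysis turns when deciding which identifications to accept.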
What they found was that the majority of participants had generated enough raw data that they could have identified all 20 proteins and most of the 22 1,250 Da peptides.
Across all 27 labs, the majority of MS/MS spectra, 79 percent, were assigned to the 20 recombinant human proteins, while the remaining 21 percent were assigned to contaminants that included E. coli proteins, trypsin, keratins, and other proteins, the authors wrote.
In a few cases, Bergeron said, human albumin was reported, but upon analysis by the study's organizers, they discovered it was bovine serum albumin, "which was not in the [sent] sample… but because two of the tryptic peptides of bovine are sequence-identical to human and since they knew they had human proteins, they thought it was human albumin."
For a field in which reproducibility of experimental data has been a continual challenge, the sample-test study "shows a lot of progress in" demonstrating that the technology is capable of achieving reproducible results across different labs, said the University of Michigan's Omenn.
In a commentary accompanying the study, Ruedi Aebersold, a co-founder of the Institute for Systems Biology and a professor of molecular systems biology at the Swiss Federal Institute of Technology (ETH) Zurich and the University of Zurich, said that the study should "attract a high level of interest because it confirms and dispels some popular, largely unsubstantiated notions in the field of proteomics."
It clearly demonstrated that training and skill level was crucial for success, he said. It also showed that different mass spec platforms were able to identify all 20 proteins in the test sample. The experimentation also introduced a "considerable" number of contaminants, which, nonetheless, were identifiable. And by analyzing the results of the individual labs and combined datasets, the study "provided important insights that are not apparent from the individual datasets."
The study, "therefore, shows that with sufficient experimental care and technical training," all 27 participants could have reproducibly identified the proteins in the test sample. Though real-world proteomics analysis is a highly complex procedure and the issues of reproducibility and quality control should be separated conceptually and by experimental strategy, high-quality data, Aebersold said, are essential for the discovery of a proteome, and the Nature Methods paper "suggest[s] that this can be accomplished with currently available methods."
But that can happen only when the researchers know how to use the methods and know what may and may not work. The individual results suggest that many still don't. Indeed, the participants included some of the top facilities in the world doing proteomics work. That only about one-fourth of them were able to successfully complete an academic exercise was, as Omenn said, "unspectacular."
But members of HUPO also said that the failure of most of the labs should not reflect poorly on the proteomics community.
According to Young-Ki Paik, president of HUPO and director of the Yonsei Proteome Research Center at Yonsei University in Seoul, the fact that individual labs did so poorly was anticipated, as other similar exercises, including studies conducted by the Association of Biomolecular Resource Facilities, have shown similar results [see PM 02/23/06].
"What is reassuring is the minimal education and training needed to effect 100% success for the 20 proteins," Paik told ProteoMonitor in an e-mail. "We may be able to set up a sort of SOP for MS-based test sample analysis in the future."
Pierre LeGrain, secretary of HUPO and director of life sciences at the French Commissariat à l’Energie Atomique, said in an e-mail that there has always been skepticism about the reproducibility of proteomics results. "Biologists have to be convinced that large-scale biology using sophisticated technological processes requires specific protocols, standards, and practices," he said. "This is true even for 'good' labs!"
Added Bergeron, there were no fatal flaws among the 20 labs that didn't initially identify all 20 proteins in the test sample and "the amount of work that it took for these 20 labs to now get to 100 percent was really trivial."
"The reason I think they didn't get it [right on their own] is they had no way of knowing whether what they were doing was fail-safe or not," he said, adding the test-sample study now provides such a mechanism. "The community is at such a level of sophistication that it takes very little for them to recognize how to be pretty … fail-safe."
One of the biggest bottlenecks identified by the study was the quality of the informatics tools that labs used. "A major contributing factor to erroneous reporting resides at the level of database and search engines used and once corrected for, provided an almost perfect score for most participants," the authors wrote. Added Bergeron, while the community had strong suspicions about such problems, before this, "they could really just gossip about it."
Finally, he said that as HUPO tries to get funding for its proposed $1 billion, 10-year Human Proteome Project [see PM 05/01/08 and 08/21/08], the study supports HUPO's argument for such an ambitious undertaking.
"It demonstrates exactly what we've been proposing: that a Human Proteome Project, in my opinion, should be a global initiative employing as many labs as possible and accessing as much data as possible," Bergeron said. "To me, it simply indicates exactly as we had speculated: the technology is fabulous, the folks around the world can be guided using that technology in a pretty easy way. … You can use the mass specs that are available to you, and you're going to generate high-quality data whose value is increased tremendously when it's added to the high-quality data that other folks are generating."