Spurred by an editorial in Nature calling for better standards in proteomics, the Human Proteome Organization organized a test-sample study to identify common errors affecting the reproducibility of proteomics experiments. Of the 27 labs that participated in the study, just seven correctly identified all 20 proteins in the mix they were given, and only one lab identified all 22 tryptic peptides in that sample with a mass of 1,250 Daltons.
The proteins in the HUPO-designed samples given to the 27 labs contained no tags and represented a variety of molecular weights. Each protein also contained at least one tryptic peptide with a mass of 1,250 Daltons. After each participating lab analyzed the samples and determined what was in them, using a database specified by HUPO, it deposited its data into Tranche and PRIDE.
While some labs were successful, most encountered problems. Seven labs ran into serious problems identifying the proteins; three of them had to be sent fresh material, while the others could work with the data they had. Thirteen labs needed minimal help from the organizers to glean the correct answers from their data. "What we found is the data that the mass spec generated were fabulous and there was more than enough data there to identify the 20 proteins," says McGill University's John Bergeron, the senior author of the June Nature Methods paper describing this study. "Then the folks had some difficulty in terms of ensuring that they could identify all of the 20 proteins the way we wanted them to."
The second part of the study was a bit trickier, Bergeron says. Lead author Alexander Bell, also from McGill, adds that only four groups had enough data to identify all 22 peptides with a mass of 1,250 Daltons, and only one of them reported the peptides properly. "The other three groups didn't interpret the data well enough," he says.
However, when the researchers studied the centralized data, they could easily identify the peptides, despite the individual labs' difficulties. "We see very clear identification of the proteins' sequences very unambiguously," says McGill's Catherine Au, who adds that this result helps make the case for pooling data.
Based on the centralized analysis, the researchers conclude that proteomics techniques are reproducible. "That value has been … demonstrated through the centralized analysis, where the reproducibility and the accuracy is now overwhelmingly demonstrated," Bergeron says.