NEW YORK (GenomeWeb) – The development of more streamlined, higher-throughput mass spec workflows has allowed proteomics researchers to run significantly larger and more complicated experiments than were previously feasible.
But as proteomic experiments grow in size and complexity, harmonizing data generated on different days, different instruments, and in different labs has become a more pressing challenge.
"We are beginning to think about experiments in terms of hundreds of samples, which is a little different than what we used to think about," said Michael MacCoss, professor of genome sciences at the University of Washington. "We used to think about doing experiments within the context of our batch, but now we would really like to have experiments that can be joined with other people's datasets."
This requires controlling for experimental variability across large numbers of mass spec runs. In a study published last week in Analytical Chemistry, MacCoss and his colleagues argued for calibration using external reference materials. They suggested this could help researchers better account for variability in sample prep and other steps upstream of LC-MS analysis, which are the largest contributors to variation across proteomic experiments and labs.
MacCoss noted that proteomics researchers have typically used internal standards — isotope-labeled peptides, specifically — for calibrating their experiments. This approach has several downsides, though. Because these standards are added into the sample post-digestion, they don't account for the variability of the sample prep process. Additionally, while they are suitable for targeted experiments, generating isotope-labeled peptides for the thousands of proteins analyzed in a typical shotgun proteomic experiment is prohibitively expensive.
Researchers have also explored using peptide standards with amino acid "wings" added to their ends, which allows them to undergo digestion, but, the authors noted, such "wings do not accurately capture the digestion conditions of the native protein sequence." Another option is using intact labeled protein standards, but, in addition to being quite expensive, these intact proteins are also not guaranteed to be a good proxy for the actual endogenous protein targets as they may not have the same conformation or post-translational modifications.
An external calibrator, on the other hand, consists of a reference material that is representative of the sample being analyzed and that can be run through all the same processing and analysis steps as the actual sample.
For instance, the authors noted, "an experiment measuring analytes in human cell lysates would use a pool of human cell culture, or in plasma would use a pool of plasma… The reference material is prepared alongside experimental samples in each sample processing batch, capturing all the conditions that the experimental samples experience from protein extraction, to digestion kinetics, to instrument variation."
MacCoss noted that Andy Hoofnagle, his co-author and University of Washington colleague, has for some time been promoting external calibration for targeted mass spec experiments. In the Analytical Chemistry study, the researchers suggest applying the approach to shotgun proteomics work as well.
To demonstrate the usefulness of the approach, they used a yeast lysate as an external calibrator and reproduced an experiment in which yeast were either grown unperturbed or treated with 0.4 M NaCl. In the first instance, they looked at whether calibration improved the correlation of DIA mass spec runs done on a Thermo Fisher Scientific Q Exactive HF instrument on identical samples that they noted were "prepared on different days by the same operator at the same site using the same instrument and acquisition method."
Given the similarity of preparation, the expectation was that addition of an external calibrator would do little to improve the correlation of the measures, which proved to be the case.
They followed this with a pair of targeted selected-reaction monitoring experiments done on the same day using identical equipment (a Thermo Fisher Altis triple quadrupole), again finding that external calibration did little to improve the correlation between the two runs.
Comparing the data generated on the Q Exactive to that from the Altis, however, they noted that while data from both instruments followed the same general trend, they were not well correlated because the signal generated by the Q Exactive was higher than that generated by the triple quad. Calibrating using the yeast reference material significantly improved the correlation.
They also looked at data from samples prepared by different researchers and run using DIA mass spec on different instruments (a Sciex 5600 TripleTOF and the Q Exactive). For the uncalibrated experiments, the correlation coefficient was 0.63. For the calibrated experiments, it was 0.92, which was the same level of correlation as the initial DIA experiments run on samples prepared by the same researchers and run in the same lab using the same instrumentation.
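The logic behind that improvement can be illustrated with a small simulation. The Python sketch below is not the authors' code or data. It assumes a simple single-point calibration in which each peptide's signal in a sample is divided by the signal for the same peptide in a reference material run in the same batch. The function names, peptide count, distributions, and noise levels are all hypothetical, chosen only to show why platform-specific signal differences cancel out of the ratio.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def calibrate(sample, reference):
    """Express each peptide signal relative to the same peptide in the
    reference material measured in the same batch (assumed single-point
    ratio; peptides missing from the reference are dropped)."""
    return {
        pep: sample[pep] / reference[pep]
        for pep in sample
        if reference.get(pep, 0) > 0
    }


def log_correlation(a, b):
    """Correlate log-transformed values for peptides present in both runs."""
    shared = sorted(set(a) & set(b))
    return pearson([math.log(a[p]) for p in shared],
                   [math.log(b[p]) for p in shared])


random.seed(0)  # fixed seed so the illustration is repeatable
peptides = [f"pep{i}" for i in range(200)]

# Hypothetical true abundances in the experimental sample and in the
# shared reference material (for example, a pooled lysate).
true_sample = {p: random.lognormvariate(0, 1) for p in peptides}
true_reference = {p: random.lognormvariate(0, 1) for p in peptides}


def run_on_platform(noise_sd=0.05):
    """Simulate one lab/instrument: every peptide gets its own response
    factor, which affects the sample and the reference identically."""
    response = {p: random.lognormvariate(0, 1.5) for p in peptides}

    def measure(truth):
        return {p: truth[p] * response[p] * random.lognormvariate(0, noise_sd)
                for p in peptides}

    return measure(true_sample), measure(true_reference)


sample_a, reference_a = run_on_platform()
sample_b, reference_b = run_on_platform()

r_raw = log_correlation(sample_a, sample_b)
r_cal = log_correlation(calibrate(sample_a, reference_a),
                        calibrate(sample_b, reference_b))

print(f"uncalibrated correlation: {r_raw:.2f}")
print(f"calibrated correlation:   {r_cal:.2f}")
```

In this toy setup the response factor multiplies both the sample and the reference signal on a given platform, so it drops out of the ratio and only the small measurement noise remains. Real data would be messier, since a reference material may not contain every peptide within the measurable range, a point MacCoss raises below.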
"If you make measurements relative to something that's common, and you get to the point where everybody's making those measurements relative to the same thing, it ends up being a pretty powerful measurement," MacCoss said.
He noted, though, that researchers still need to consider what the optimal reference standard might be for different experiments.
"You want to have a reference material that has the peptides [of interest] within your range of measurement," he said. "When you're talking about thousands or tens of thousands of peptides, it's going to be hard to get an optimal reference material for all of those peptides."
"We want to do the greatest good for the greatest number of peptides," he added. "Sometimes that will mean making pools of samples and so on in order to be able to get the best chance of having a representative matrix for you to measure."
In terms of spreading adoption of the idea, MacCoss suggested that large research efforts like the National Cancer Institute's Clinical Proteomic Tumor Analysis Consortium (CPTAC) could help incentivize researchers.
"Say, for instance, the CPTAC program had a reference material for all their samples, and I was another investigator performing some sort of a tumor analysis," he said. "I might want to be able to compare my results relative to the CPTAC program and to measure my samples relative to the reference material that they've [established]."
"I think initially we should be thinking about coming up with reference materials that a lab uses for its own projects, and then [down the road] you could start thinking about these things in larger [terms]," like the CPTAC example, he said. "And eventually people who are performing similar analysis on similar types of samples will [be able] to access that common reference and make their data comparable to data from these other larger programs."