NEW YORK (GenomeWeb) – Researchers at the State University of New York at Buffalo have developed a shotgun mass spec workflow that offers highly reproducible label-free quantitation across large sample cohorts.
Described in a study published this month in the Journal of Proteome Research, the approach combines optimization of sample prep and liquid chromatography with high-resolution MS1 measurements to enable what Jun Qu, professor of pharmaceutical sciences at SUNY-Buffalo and senior author on the study, said are experiments quantifying more than 6,000 proteins across hundreds of samples with high throughput and reproducibility.
In the JPR study, Qu and his colleagues used the workflow, which they have termed IonStar, to analyze a set of 20 samples of combined human and Escherichia coli lysates, finding that they were able to quantify a total of 6,273 proteins across 6.5 orders of magnitude and with median intragroup coefficients of variation of between 6 percent and 9 percent. Of the total 6,273 quantified proteins, 6,234, or 99.4 percent, were quantified in all 20 samples.
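The two figures reported here, the share of proteins quantified in every sample and the median intragroup coefficient of variation, can be illustrated with a short sketch. This is not the authors' code; the function names and the dictionary layout (protein ID mapped to a list of per-sample intensities, with None marking a missing value) are assumptions made for illustration.

```python
import statistics

def fraction_complete(matrix):
    """Fraction of proteins that have a value (not None) in every sample."""
    complete = sum(
        1 for row in matrix.values() if all(v is not None for v in row)
    )
    return complete / len(matrix)

def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean, as a percentage."""
    return 100.0 * statistics.stdev(values) / statistics.mean(values)

def median_intragroup_cv(matrix, group_indices):
    """Median CV across proteins, using only the samples in one group.

    group_indices is a list of column positions belonging to the group
    (a hypothetical layout, not taken from the study).
    """
    cvs = []
    for row in matrix.values():
        vals = [row[i] for i in group_indices if row[i] is not None]
        if len(vals) >= 2:  # a CV needs at least two measurements
            cvs.append(coefficient_of_variation(vals))
    return statistics.median(cvs)

# The completeness figure quoted in the article: 6,234 of 6,273 proteins.
print(round(100 * 6234 / 6273, 1))  # 99.4
```

A protein with even one missing sample counts against completeness here, which is the strict definition Qu's criticism of other datasets implies.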
While the total number of quantified proteins is less than in some other published proteomics experiments, the IonStar workflow stands out for its capacity to quantify proteins across large numbers of samples with extremely low levels of missing data and high quantitative quality, Qu suggested.
"You see people publish papers saying they have quantified 10,000 proteins in 15 patients, but when you look at the data you see that only 2,000 proteins are quantified in every one of the 15 patients, and around 4,000 of the proteins are only quantified in two or three patients," he said. "So, this type of data is very misleading."
What is needed for clinical and pharmaceutical research is "to reproducibly measure proteins in many biological replicates with high data quality," Qu said.
This has been a challenge for shotgun proteomics techniques due in large part to the stochastic nature of such methods. In a typical shotgun proteomics approach, known as data-dependent acquisition (DDA), the instrument performs an initial scan of precursor ions entering the instrument and selects a sampling of those ions for fragmentation and generation of MS/MS spectra. However, because instruments can't scan quickly enough to acquire all the precursors entering at a given moment, many ions, particularly low-abundance ions, are never selected for MS/MS fragmentation and so are not detected.
The fact that not all ions are fragmented and measured in every run makes high-quality quantification across samples challenging for DDA methods, especially in the case of low-abundance molecules, which are most likely to be missed in a particular analysis.
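The sampling problem can be shown with a toy model. The sketch below assumes a simple "top N" rule in which the instrument fragments only the N most intense precursors from a survey scan; real acquisition methods add features such as dynamic exclusion that are not modeled here, and the peptide names and intensities are invented for illustration.

```python
def select_top_n(precursors, n):
    """Return the IDs of the n most intense precursors in one survey scan."""
    ranked = sorted(precursors.items(), key=lambda kv: kv[1], reverse=True)
    return {pid for pid, _ in ranked[:n]}

# Two hypothetical runs of the same sample with small intensity fluctuations.
run_a = {"pep1": 9.0e6, "pep2": 4.0e6, "pep3": 1.1e5, "pep4": 1.0e5}
run_b = {"pep1": 8.5e6, "pep2": 4.2e6, "pep3": 0.9e5, "pep4": 1.2e5}

picked_a = select_top_n(run_a, n=3)
picked_b = select_top_n(run_b, n=3)

# Only precursors fragmented in both runs yield MS/MS data in both.
quantified_in_both = picked_a & picked_b
print(sorted(quantified_in_both))  # ['pep1', 'pep2']
```

The two abundant peptides are picked in both runs, while each low-abundance peptide is picked in only one, which produces a missing value for it in the other run.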
Data-independent acquisition (DIA) mass spec methods have arisen in part in response to this challenge. In DIA mass spec, the instrument selects broad m/z windows and fragments all precursors in that window, allowing the machine to collect MS/MS spectra on all ions in a sample. Because the method collects data on all ions in a sample, DIA, unlike DDA, offers consistent protein quantitation across runs, allowing for robust comparisons of, for instance, protein expression levels in different samples.
Another approach is to use precursor-level (MS1) data, as opposed to MS/MS-level data, for quantification. This avoids the stochastic sampling issue associated with MS/MS-based quantitation. In addition, Qu noted, current cutting-edge mass spec instruments like the Thermo Fisher Scientific Orbitrap Fusion Lumos, on which he and his colleagues did their experiments, offer improved MS1 resolution, scan speed, and sensitivity, which enable deeper quantitative coverage and higher reproducibility across samples.
In the JPR study, the SUNY-Buffalo researchers set out to thoroughly optimize their MS1-based quantitative IonStar workflow for use on a high-resolution Orbitrap instrument, looking at steps ranging from sample preparation and trypsin digestion to LC gradient time. The resulting conditions, Qu said, were aimed not at measuring the largest number of proteins in a small set of samples but at measuring the largest number of proteins possible with high reproducibility across large numbers of samples.
Compared to workflows optimized for lower-field Orbitrap instruments, the researchers found that their method benefitted from more extensive protein digestion to take advantage of the approach's improved depth of coverage, as well as a shorter LC gradient length running on a longer LC column, which they said was due to the high-res instrument's higher scan speed and sensitivity. They also found that a higher sample loading capacity of around 4 μg of peptides provided optimal sensitivity.
Also key are the workflow's data processing steps, Qu said. For instance, because the method uses MS1 chromatographic peaks for quantification, it's essential to make sure these peaks are accurately aligned across different runs and samples. He cited data from a presentation he gave at this year's Pittcon demonstrating that his group's chromatogram alignment algorithm reduced retention time variations across runs by more than 97 percent.
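The article does not describe how the group's alignment algorithm works. As a rough illustration of the general idea only, the sketch below corrects a run by the median retention-time offset of the peptides it shares with a reference run; the function name, the single constant shift, and the example values are all assumptions, and production tools typically fit more flexible, non-linear corrections.

```python
import statistics

def align_by_median_shift(reference, run):
    """Shift every retention time in `run` by the median offset
    observed for peptides shared with `reference` (times in minutes)."""
    shared = [p for p in run if p in reference]
    if not shared:
        return dict(run)  # nothing to anchor on; leave the run unchanged
    shift = statistics.median(run[p] - reference[p] for p in shared)
    return {p: rt - shift for p, rt in run.items()}

# Hypothetical example: the second run elutes 0.5 minutes late throughout.
reference = {"a": 10.0, "b": 20.0, "c": 30.0}
late_run = {"a": 10.5, "b": 20.5, "c": 30.5, "d": 40.5}

aligned = align_by_median_shift(reference, late_run)
print(aligned["d"])  # 40.0
```

Peptide "d" is absent from the reference but still benefits from the correction, which is what allows its MS1 peak to be matched across runs.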
Also important is the method's quantitative feature extraction, Qu said, noting that the IonStar approach employs a sensitive feature extraction method followed by post-feature-generation quality control. This, he said, contributes to more sensitive and reproducible generation of high-quality features compared to other MS1 methods.
In the same Pittcon presentation, Qu showed data from an experiment in which he and his colleagues profiled the proteomes of 100 rat brains, quantifying more than 7,000 proteins across these samples. There were no missing values in any of the 100 samples for 99.5 percent of these 7,000-plus proteins.
Qu and his colleagues have applied the method to a variety of experimental questions, including studying the effects of drug treatments on pancreatic cancer cell lines and on xenograft mouse models of various cancers.
In the JPR paper, they used the approach to look at the response of pancreatic cancer cell lines to combination treatment with the chemotherapeutic gemcitabine and the FGFR inhibitor BGJ398. An analysis of 39 samples identified more than 6,000 proteins, 99.5 percent of which were quantified in all samples. Applying a threshold of 1.4-fold change, the researchers identified 1,302 proteins that were altered after treatment, including a number of molecules associated with cell cycle, apoptosis, and cell migration and adhesion functions.
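A fold-change filter of this kind is straightforward to express. The sketch below assumes the 1.4-fold threshold is applied symmetrically, flagging proteins whose treated-to-control ratio is at least 1.4 or at most 1/1.4; the article does not say whether the authors did this or combined the cutoff with a statistical test, and the function name and inputs are hypothetical.

```python
def altered_proteins(treated, control, threshold=1.4):
    """Return {protein: ratio} for proteins changed by at least
    `threshold`-fold in either direction (treated / control)."""
    hits = {}
    for protein, t in treated.items():
        c = control.get(protein)
        if c is None or c <= 0 or t <= 0:
            continue  # skip proteins without a usable pair of values
        ratio = t / c
        if ratio >= threshold or ratio <= 1.0 / threshold:
            hits[protein] = ratio
    return hits

# Invented mean abundances for three proteins.
treated = {"A": 150.0, "B": 105.0, "C": 60.0}
control = {"A": 100.0, "B": 100.0, "C": 100.0}

print(sorted(altered_proteins(treated, control)))  # ['A', 'C']
```

Protein B, at a ratio of 1.05, falls inside the band between 1/1.4 and 1.4 and is not flagged.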
"Biological validation [of these findings] is ongoing," the authors wrote.