NEW YORK – A team led by researchers at the Federal Institute of Technology (ETH) Zurich and Masaryk University has used data-independent acquisition mass spec to classify patient breast cancer samples based on their proteomic profiles.
Described in a paper published this week in Cell Reports, the study used SWATH mass spec to analyze 96 breast cancer samples representing the five conventional breast cancer subtypes, finding that proteomic-based classification largely matched traditional subtyping, though heterogeneity at the protein level was also apparent within subtypes.
While the effort itself was not powered to provide actionable clinical information, it demonstrates that mass spec-based proteomics can be practically applied to clinical research, suggested Ruedi Aebersold, professor at ETH Zurich and senior author on the study.
"Essentially what we tried to show is that with a technique that is very fast, relatively cheap, and quite simple, we can recapitulate [conventional classification] information in a way that has really not been possible in proteomics before," he said, calling it a step towards the "commoditizing or democratization of proteomics."
Aebersold acknowledged that proteomics researchers have previously used mass spec to classify clinical tumor samples. Most notably, the National Cancer Institute's Clinical Proteomic Tumor Analysis Consortium (CPTAC) has published studies presenting proteomic-based subtypes for a number of cancer types.
However, Aebersold said that to date such work has largely been the purview of large, well-funded groups like CPTAC.
"The CPTAC papers are very nice papers, but each one of them was quite a gigantic effort from several groups," he said.
The SWATH workflow presented in the Cell Reports paper, on the other hand, can be done by a relatively small proteomics lab and at a pace of around 20 samples per day, he said.
For the study, the researchers developed a spectral library based on samples from five classical breast cancer subtypes: luminal A, luminal B HER2-, luminal B HER2+, HER2-enriched, and triple-negative. They then ran the 96 samples using the SWATH DIA approach, collecting quantitative data in every sample for a set of 2,842 proteins.
The researchers found their proteomic profiles largely recapitulated the five traditional subtypes, with the highest intra-group correlations seen within the luminal A samples and the luminal B HER2- and luminal B HER2+ subtypes.
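As a rough illustration of what an intra-group correlation measures, the sketch below averages the pairwise Pearson correlation between samples that share a subtype label. It is not the authors' pipeline: the function names, the choice of Pearson correlation, and the idea of representing each sample as a plain list of protein abundances are all assumptions made for illustration.

```python
from itertools import combinations
from statistics import mean


def pearson(x, y):
    """Pearson correlation between two equal-length abundance profiles."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def intra_group_correlation(profiles, labels):
    """Mean pairwise correlation among samples sharing a label.

    profiles: list of per-sample protein abundance lists (hypothetical input)
    labels:   subtype label for each sample, in the same order
    Groups with fewer than two samples are skipped.
    """
    groups = {}
    for profile, label in zip(profiles, labels):
        groups.setdefault(label, []).append(profile)
    return {
        label: mean(pearson(a, b) for a, b in combinations(members, 2))
        for label, members in groups.items()
        if len(members) >= 2
    }
```

A higher value for a subtype means its samples look more alike at the protein level; a lower value points to the kind of within-subtype heterogeneity the researchers reported.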
The proteomic profiles also identified a high correlation between some members of the luminal B HER2+ and HER2-enriched subtypes as well as higher levels of tumor heterogeneity at the protein level within the triple-negative subtype.
Aebersold noted that the study was not designed to generate clinical insights based on the proteomic data collected, but the finding does echo previous research identifying heterogeneity among triple-negative breast cancers at the proteomic and phosphoproteomic level. For instance, last year researchers with the I-SPY 2 TRIAL identified a set of triple-negative patients who exhibited significant levels of HER2 phosphorylation despite being ostensibly negative for the protein.
Aebersold and his colleagues also identified from their data a set of three key proteins, type II inositol 3,4-bisphosphate 4-phosphatase (INPP4B), cyclin-dependent kinase 1 (CDK1), and receptor tyrosine-protein kinase erbB-2 (ERBB2), that was able to correctly assign 84 percent of the 96 tumor samples to their conventional subtype.
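The article does not say how the three-protein assignment was computed. As one way such a small panel could be turned into a subtype call, here is a minimal nearest-centroid sketch: each sample is a three-value vector (standing in for INPP4B, CDK1, and ERBB2 abundances), each subtype is summarized by the mean vector of its samples, and a sample is assigned to the closest mean. The classifier choice, the function names, and the input layout are assumptions, not the study's method.

```python
from statistics import mean


def fit_centroids(samples, labels):
    """Average the three-protein vectors of each subtype into one centroid."""
    grouped = {}
    for sample, label in zip(samples, labels):
        grouped.setdefault(label, []).append(sample)
    return {
        label: [mean(column) for column in zip(*members)]
        for label, members in grouped.items()
    }


def predict(centroids, sample):
    """Return the label whose centroid is closest (squared Euclidean distance)."""
    def squared_distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(sample, centroid))

    return min(centroids, key=lambda label: squared_distance(centroids[label]))


def fraction_correct(centroids, samples, labels):
    """Share of samples whose predicted label matches the given label."""
    hits = sum(predict(centroids, s) == l for s, l in zip(samples, labels))
    return hits / len(labels)
```

Scoring a model on the same samples used to fit it overstates how well it will do on new tumors, so a realistic evaluation would hold out samples or use cross-validation.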
With its focus on throughput and workflow simplicity, the study fits within the larger trend in proteomics toward analyzing large numbers of samples, which is needed to obtain statistically meaningful results when clinically validating protein biomarkers and proteomic measurements.
To this end, Aebersold said the work was intended to communicate to scientists outside the proteomics community that mass spec technology had reached a point where it was a practical tool for clinical research.
"The proteomics field has made enormous progress, but what we have not managed to do I think is to convey the message that this is now a technology that has reached a state where it is accessible," he said.
Aebersold cited his experience speaking to clinicians at a workshop sponsored by the Human Proteome Organization (HUPO).
"We had a discussion where we asked them what was keeping people like them — technologically progressive clinical scientists — from using proteomics," he said. "And virtually all of them said accessibility. They said they realized that it was powerful but that they simply couldn't do [proteomics], because if they went to their local lab and said they were coming with a cohort of 100 samples, [the lab] would say, 'Well, we can't process that, or it will take forever.'"
"What we would like to say [with this study] is that now, if you have a cohort of 100 or 200 or 300 samples and you have access to certain instruments, many of which are now very common, that you can do [proteomics] with similar effort as [required for] transcriptomic measurements, which of course many, many clinical groups do," he said.
The Cell Reports study is one of several efforts suggesting proteomics has reached that level of accessibility. At the recent American Society for Mass Spectrometry (ASMS) annual meeting, Thermo Fisher Scientific cited experiments on the company's new Orbitrap Exploris 480 in which researchers used DIA workflows to identify around 3,000 proteins in a five-minute run.
While the field's emphasis on throughput has in large part revolved around improvements in DIA workflows, Thermo Fisher also introduced at the ASMS meeting a new isobaric tagging approach that its developer, Harvard University professor Steven Gygi, said could allow researchers to reproducibly quantify samples at a depth of 8,000 to 10,000 proteins in around an hour per sample.