A recent investigation by the National Cancer Institute's Cancer Genome Atlas consortium into the genetic underpinnings of breast tumors also offers a glimpse of the benefits of including proteomic data in such analyses.
In a study published this week in Nature, TCGA researchers analyzed primary breast tumors from a total of 825 patients across five platforms covering DNA copy number, DNA methylation, exome sequencing, messenger RNA expression, and microRNA sequencing.
Additionally, they ran 403 of the samples on reverse-phase protein arrays, measuring levels of 171 cancer-related proteins and phosphoproteins. They found generally good concordance between the genomic, transcriptomic, and proteomic data, corresponding author Charles Perou, a genetics researcher at the University of North Carolina at Chapel Hill, told ProteoMonitor.
The DNA- and RNA-based profiles generated in the study clustered into groups that largely corresponded with four main classes of breast cancer, and, Perou noted, when the researchers "did unbiased class discovery using just the protein expression, it did largely map to [these] gene expression subtypes."
However, Perou added, the "proteomics, and particularly the phosphoproteomics, really informed us in some ways beyond what the gene expression could tell us."
Specifically, he said, the proteomic data suggested the existence of two distinct phosphoproteomic-based subtypes within the larger gene expression-based HER2 subtype – one exhibiting high HER2 and HER1 signaling activity and the other exhibiting lower levels of such activity.
The other example "where the protein data really provided a lot of interest and headscratching," Perou said, was the group's analysis of PI3 kinase signaling, in which they found a disconnect between the PI3K signaling data obtained via the RPPA analysis and their PI3K mutation data.
"When we did a pathway-based analysis of the PI3K signaling pathway we could see that what are believed to be protein and phosphoproteomic signatures of PI3K activation didn't correlate with PI3K mutations, but did correlate with the loss of negative regulators of that pathway, like loss of INPP4B or loss of PTEN," he said. "So there we're somewhat left with a disconnect between the mutation information and the phosphoproteomics."
This particular disagreement has been observed in previous studies, Perou noted, including in work by MD Anderson researcher Gordon Mills, who led the RPPA portion of the Nature study.
Given the discrepancy, Perou said, the "challenge now is to figure out which of these many different genetic events or protein signatures are going to be biomarkers" of responsiveness to drugs like PI3K or mTOR inhibitors.
This work is now ongoing, he said, noting that the researchers are currently reanalyzing the genetic data based upon protein and phosphoproteomic endpoints.
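A reanalysis of this kind starts from a protein endpoint and works back toward the genetics. The minimal Python sketch below uses invented toy values and a hypothetical PTEN-loss flag rather than any TCGA data. It bins samples into low, medium, and high phospho-Akt groups and checks whether the genetic event tracks with the bins. The function names, the rank-based tertile cutoffs, and the choice of a Pearson chi-square statistic are illustrative assumptions, not the consortium's method.

```python
from collections import Counter


def tertile_labels(values):
    """Label each value 'low', 'medium', or 'high' by rank-based tertiles."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    labels = [None] * n
    for rank, idx in enumerate(order):
        if rank < n / 3:
            labels[idx] = "low"
        elif rank < 2 * n / 3:
            labels[idx] = "medium"
        else:
            labels[idx] = "high"
    return labels


def event_rate_by_group(labels, has_event):
    """Fraction of samples carrying the genetic event within each group."""
    totals = Counter(labels)
    hits = Counter(lab for lab, e in zip(labels, has_event) if e)
    return {group: hits[group] / totals[group] for group in totals}


def chi_square_statistic(labels, has_event):
    """Pearson chi-square statistic for a group x (event, no event) table."""
    n = len(labels)
    n_event = sum(1 for e in has_event if e)
    stat = 0.0
    for group in sorted(set(labels)):
        in_group = [bool(e) for lab, e in zip(labels, has_event) if lab == group]
        for status, column_total in ((True, n_event), (False, n - n_event)):
            observed = sum(1 for e in in_group if e == status)
            expected = len(in_group) * column_total / n
            if expected > 0:
                stat += (observed - expected) ** 2 / expected
    return stat


# Toy, invented values: phospho-Akt signal and a hypothetical PTEN-loss flag.
phospho_akt = [0.1, 0.2, 0.3, 1.0, 1.1, 1.2, 2.5, 2.6, 2.7]
pten_loss = [False, False, False, False, True, False, True, True, True]

groups = tertile_labels(phospho_akt)
rates = event_rate_by_group(groups, pten_loss)
statistic = chi_square_statistic(groups, pten_loss)
print(rates, statistic)
```

In this toy set the PTEN-loss rate rises from the low to the high phospho-Akt group. A real analysis would repeat the test across many genetic features and would need proper significance testing and multiple-comparison correction.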
"For example, you can classify samples into high, medium, and low expression of phospho-Akt and ask where there are mutations or gene expression or DNA copy number changes that correlate with that distinction," Perou said. "And ideally, if that works, we could discover … some of the genetic drivers of those protein signatures."
"That's the richness of this multiplatform TCGA data set," he added. "There are so many ways to look at the data. We're just at the beginning."
George Mason University researcher Emanuel Petricoin, who invented the RPPA technique along with GMU colleague Lance Liotta, is involved in a number of tumor profiling efforts using the technology (PM 8/13/2010). Petricoin applauded the TCGA study for its work combining genomic and proteomic data, noting that observations like the disconnect between the PI3K mutation and signaling data demonstrate the value of such an approach.
He suggested, however, that by not using laser capture microdissection – of which Liotta, again, was an inventor – the researchers included stromal tissue in their analysis that likely masked lower-strength proteomic signatures.
Such masking "is always an issue with all genomic or proteomic analysis that's done on grossly dissected or macrodissected tumor materials," Perou said. "You have a mix of cell types, and so you have to keep that in mind when interpreting the data, and if you have a low-level signal and lots of stroma, it will get swamped out."
He said, however, that he believed that "the stroma is affecting the tumor, and I sort of prefer to see the complete picture with the caveat that there's a mixture [of cell types] more than trying to just microdissect part of it, because then you're missing part of the picture."
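The swamping effect Perou describes can be illustrated with a simple linear mixing assumption, in which the measured bulk level is a weighted average of the tumor and stromal levels. This is a simplification for illustration only. The function name and all of the numbers below are hypothetical and are not drawn from the study.

```python
def bulk_signal(tumor_level, stroma_level, tumor_fraction):
    """Measured level in a macrodissected sample under linear mixing.

    tumor_fraction is the share of the sample made up of tumor cells (0 to 1);
    the remainder is assumed to be stroma.
    """
    if not 0.0 <= tumor_fraction <= 1.0:
        raise ValueError("tumor_fraction must be between 0 and 1")
    return tumor_fraction * tumor_level + (1.0 - tumor_fraction) * stroma_level


# Hypothetical sample that is 30 percent tumor and 70 percent stroma.
TUMOR_FRACTION = 0.3
STROMA_BASELINE = 1.0

# A strong marker (50-fold over stroma in tumor cells) remains obvious in bulk.
strong_bulk = bulk_signal(50.0, STROMA_BASELINE, TUMOR_FRACTION)

# A weak marker (1.5-fold over stroma in tumor cells) shrinks toward baseline.
weak_bulk = bulk_signal(1.5, STROMA_BASELINE, TUMOR_FRACTION)

print(strong_bulk, weak_bulk)
```

Under these assumed numbers the strong marker still stands well above the stromal baseline in the mixed sample, while the weak marker ends up close to it, which is the pattern both researchers describe.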
Yiling Lu, director of the Reverse Phase Protein Array Shared Resource at MD Anderson, told ProteoMonitor that in their analyses the researchers also included antibodies to certain stromal cells in order to help control for this content. And, Perou noted, even without microdissection, the TCGA researchers were able to identify the two HER2 signatures.
Petricoin said, however, that the HER2 subtype was likely a particularly strong signal given the low amount of HER2 in stromal cells.
"Your stromal cells don't have a lot of HER2, and it's such a strong signal that … the amount of contamination there [doesn't matter]," he said. Petricoin noted, though, that in breast tumor profiling work his lab has done as part of the Biomarkers Consortium's I-SPY2 project (PM 3/26/2010), they have found much less concordance between proteomic and genetic breast cancer subtypes.
He attributed this divergence to the use of the microdissected samples, adding that his team is "seeing a lot of different subtypes emerge, a much more detailed signaling architecture."
Beyond issues of dissection technique, use of the TCGA samples also raises the question of their appropriateness for phosphoproteomic work. As has been noted in discussions of the NCI's Clinical Proteomic Tumor Analysis Consortium's use of TCGA tissue, given that the samples were not collected with phosphoproteomic research in mind, they might not have been frozen quickly enough to avoid post-resection changes in phosphorylation levels.
"That is a concern and of course a caveat to all of these studies and [RPPA] results," Perou said. He noted, however, that the group's HER2 data suggested that phosphoproteomic signatures might be more durable than some have worried.
"We very much saw this HER2 protein activation signature, which included phospho HER2, and it was so highly correlated with the underlying DNA copy number changes and gene expression changes that I would say for that particular phosphorylation event it was validated by all the other technologies," he said. "So I think we can say that there are some protein and phosphoprotein data that is good and reliable from the TCGA samples, but we're literally going to have to go through it almost one protein and phosphoprotein at a time to figure out which are trustworthy and which are not."
Efforts to assess the effect of post-excision delay on phosphoproteomic samples are ongoing, Perou noted. Indeed, at the Human Proteome Organization's 11th annual meeting this month, Broad Institute researcher Steve Carr reported on an effort by the CPTAC initiative to ascertain the effects of time to freezing on the phosphoproteome of the TCGA samples (PM 9/21/2012).
In that study, the CPTAC researchers found that certain phosphosites showed changes as early as one minute post-excision, but they found no significant changes in total proteome content across time points and observed general stability across the phosphoproteome.
Investigations by MD Anderson researchers have found similar phosphoproteome stability, Lu said.
"Of course we'd love to have every sample flash-frozen within five minutes out of the patient," Perou said. "But in the clinical world that's just not practical. We have to learn to work with samples that have an ischemia time of 30 minutes, an hour, two hours."