Researchers ID Pan-Cancer Molecular Subtypes Using Proteomic, Genomic Analyses of TCGA Samples


NEW YORK (GenomeWeb) – An international research team has completed a multi-omic pan-cancer analysis of data on 3,527 tumors spanning 12 different cancer types generated by the National Cancer Institute's Cancer Genome Atlas consortium.

The study, which was published this week in Cell, identified 11 major molecular subtypes into which the different cancers could be placed. And while the analysis found that tissue of origin remained the dominant factor for identification of tumor subtypes, roughly 10 percent of tumors were reclassified based on their molecular data.

Also notable given past multi-omic analyses of TCGA samples was the fact that the various levels of molecular data were in good agreement regarding the subtypes and tumor classification. Previous studies by TCGA researchers and by researchers from NCI's Clinical Proteomic Tumor Analysis Consortium have found that when looking within a single tumor type, different types of molecular data have divided the tumors into different subtypes.

For instance, last month a CPTAC team led by Vanderbilt University researcher Daniel Liebler published in Nature a mass spec analysis of 95 colorectal tumors previously analyzed at the genomic level by TCGA. That analysis identified several proteomic subtypes of the disease, including subtypes not apparent in the genomic data. It also demonstrated that – as previous studies have similarly suggested – mRNA levels are not reliable predictors of protein expression and that gene copy number variations are not, broadly speaking, predictive of protein expression, either.

Similarly, reverse phase protein array-based proteomic analyses of TCGA samples by MD Anderson researcher Gordon Mills – also an author on the Cell paper – have typically divided tumor populations into somewhat different sets of subtypes than have genomic analyses of the same samples. For instance, in a 2012 TCGA breast cancer study, proteomic data suggested the existence of two distinct phosphoproteomic-based subtypes within the larger gene expression-based HER2 subtype – one exhibiting high HER2 and HER1 signaling activity and the other exhibiting lower levels of such activity.

In this week's Cell study, on the other hand, the authors found that each level of molecular data they used – whole-exome DNA sequencing, DNA copy-number variation, DNA methylation, genome-wide mRNA levels, microRNA levels, and protein expression for 131 proteins as measured by RPPA – identified comparable subtypes, University of California, Santa Cruz researcher Joshua Stuart, senior author on the paper, told ProteoMonitor.

In large part, this reflected the fact that the molecular classifications to a significant extent recapitulated the tumors' tissues of origin, he said. "Ninety percent of the time, it's the tissue driving that categorization."

However, Stuart said, even in the 10 percent of tumors whose classification diverged from their tissue of origin, the various types of molecular data demonstrated good agreement.

"The platforms do agree, and they agree that these are divergent cases," he said. "The proteomic data say, [for instance], that a bladder cancer looks like a squamous [cell cancer], and the RNA says the same thing and the copy number changes say the same thing. So they agree on the divergence and also on which tumors match their tissue of origin."

In a sense, in the Cell paper, Stuart and his colleagues were working in the opposite direction from the previous multi-omic analyses of TCGA samples. While those past studies investigated individual tumor types with the goal of identifying molecularly defined subtypes within each cancer, the recent effort sought to identify subtypes across the 12 cancer types.

To an extent, the two approaches demonstrated consistency with each other, Stuart said, noting that, for instance, estrogen receptor positive and estrogen receptor negative breast cancers were grouped into two different classes in their pan-cancer analysis.

"That is kind of the first dominant way in which breast cancers differ, and you can see that in our map they are very different," he said. "In fact, they look just as different as [cancers from] two different tissues."

Bladder cancers were the most diverse of the cancers the group examined, splitting into seven pan-cancer subtypes, with the majority falling into one of three subtypes: a C1-LUAD enriched subtype, a C2-squamous-like subtype, and a C8-BLCA subtype. According to the authors, patients with the C1-LUAD enriched and C2-squamous-like subtypes "showed significantly worse overall survival" than those with the C8-BLCA subtype.

Some cancers – acute myelogenous leukemia, for instance – tracked entirely with their tissue of origin, with the molecular data doing no reclassification of any samples. Stuart said, though, that he expected that as the group continues to add data from additional tumor types, it will see more reclassification based on the molecular data.

"Some of these tumors didn't divide up at all, but if we had a higher diversity of tumor types, maybe it would start sorting out [for instance] the AMLs because there would be another tumor type that some subset of the AMLs might group to better than just sticking with the [other] AMLs," he said. "So, as we add more diversity into the map, it will reclassify more."

The researchers are now planning to expand their analysis to 21 tumor types, an effort that Stuart said will begin this fall. They also hope to add metastases to the project, he noted.

For proteomic data they will continue to rely on Mills' RPPA analyses, but, Stuart said, the group is interested in potentially adding other proteomic data sources like mass spec, as well. He said that he has been in touch with some researchers from CPTAC about possibly expanding the analysis to include that group's data. CPTAC has taken on analysis of three tumor types – breast, colorectal, and ovarian – with the aim of profiling around 100 samples of each.

"We could consider a subproject where we look at these [CPTAC] tumor types and do an integrated analysis including the mass spec [data]," Stuart said. "That could be interesting."