NEW YORK (GenomeWeb) – In a study published today in Cancer Cell, researchers from South Korea, the Netherlands, and the US National Institutes of Health reported their new proteogenomic analysis of diffuse gastric cancers (GCs) in young populations, identifying four subtypes of the disease.
In previous studies of GCs, genomic and transcriptomic analyses have identified molecular signatures associated with phenotypes of the disease, such as patient subtypes and survival. The Cancer Genome Atlas (TCGA) identified four GC subtypes and associated molecular signatures: Epstein-Barr virus (EBV)-positive tumors with recurrent PIK3CA mutations, DNA hypermethylation, and amplification of JAK2, CD274, and PDCD1LG2; tumors with microsatellite instability with high mutation rates; genomically stable tumors enriched for diffuse histological variants and mutations in RHOA; and tumors with chromosomal instability showing aneuploidy and amplifications of genes encoding receptor tyrosine kinases, the researchers said.
But the TCGA's protein-level analysis, which was performed using reverse-phase protein arrays, is limited by antibody availability. For a more comprehensive and unbiased proteomic characterization, the Clinical Proteomic Tumor Analysis Consortium performed mass spectrometry-based proteomic and/or phosphoproteomic analyses of tumor tissues from patients with colon, breast, and ovarian cancers. Those analyses demonstrated that proteomic signatures can provide information that further stratifies patients.
As the impetus for this study, the researchers noted that the incidence of diffuse-type GCs (one of two major histological subtypes of GC) has increased in the US, as have rates of GC in patients under 40. GCs diagnosed in young patients, called early-onset GCs (EOGCs), are strongly enriched for diffuse histology, occur predominantly in women, and are highly metastatic. But relatively few genome-wide studies have been performed on diffuse GCs, especially in young populations, compared with intestinal GCs.
The researchers collected paired tumor and adjacent normal tissues, as well as blood samples, from 80 EOGC patients under 45 years of age. These 80 tumors included 74 diffuse, three intestinal, two mixed type, and one inflammatory myoblastic tumor. For each patient, the team performed exome sequencing of the tumors and peripheral blood mononuclear cells, as well as mRNA sequencing of the paired tumor and adjacent normal tissues. The researchers also performed global proteome, phosphoproteome, and N-glycoproteome profiling of paired tumor and adjacent normal tissues using liquid chromatography-tandem MS analyses.
Using exome sequencing data, they identified 56,502 nonsynonymous single-nucleotide variants and 3,598 frameshift indels. In the mRNA data, an average of 11,938 genes were expressed in the tumor and adjacent normal samples. The researchers then used these variants and expressed transcripts from each patient to build a sample-specific database and identified 156,135 peptides, 28,944 phosphopeptides, and 4,376 N-glycopeptides from the global proteomes, phosphoproteomes, and N-glycoproteomes, respectively. These peptides were mapped to 10,295 protein-coding genes, on average.
The researchers also compared their own data with previous genomic analyses of TCGA and Hong Kong (HK) GC cohorts, which mostly included late-onset GCs (LOGCs), and found that the EOGCs showed a different mutation landscape from that of the LOGCs.
Importantly, the investigators were able to stratify the EOGCs into four distinct subtypes. They identified two clusters using the mRNA data, four clusters using the global proteome data, three clusters using the phosphoproteome data, and three clusters using the N-glycoproteome data. They then performed an integrative clustering analysis using all four types of data and identified four subtypes.
In comparing their own GC subtypes to GC subgroups in TCGA data, the researchers found that subtype 2- and 4-like TCGA subgroups showed the best and worst survival, respectively. GCs with microsatellite instability (MSI) and EBV-positive GCs were significantly enriched in the subtype 2-like subgroup, while genomically stable GCs were enriched in the subtype 4-like subgroup.
They further defined subtype 1 as representing cell proliferation-related processes (cell cycle, DNA replication, RNA processing, translation, and protein degradation). They defined subtype 2 as representing immune response-related processes (antigen presentation, BCR/TNF/Toll-like receptor signaling, TCR signaling, and phagosome). Subtype 3 uniquely represented metabolism-related processes (oxidative phosphorylation, fatty acid β-oxidation, and citrate cycle). And subtype 4 mainly represented invasion-related processes (actin cytoskeleton and MAPK, PI3K-AKT, WNT, RHOA, and cadherin signaling).
"Based on these data, subtypes 2 and 4 can be characterized as immunogenic and invasive tumors with possibly good and poor survival rates, respectively, similar to subtypes 2- and 4-like subgroups in TCGA cohort," the authors suggested. They noted that the tumors in subtype 2 show strong immune activity that may contribute to a good prognosis and that the tumors in subtype 4 show strong invasion potential that may contribute to a poor prognosis.
"These data suggest that proteogenomic analysis can provide potential associations of somatic kinase fusions with cellular signaling," the authors concluded. "These data suggest that, when a cohort includes both EOGC and LOGC, mRNA expression patterns can be analyzed using the whole cohort, but somatic mutations should be analyzed separately for young and old patients. Our proteogenomic analyses provide signaling pathways correlated with somatic mutations, oncogene and tumor suppressor candidates identified from mRNA-protein correlation, four GC subtypes, and mRNA/protein signatures defining the subtypes."