With proteomics researchers from around the globe gathered in Boston last week for the Human Proteome Organization's 11th annual meeting, scientists and officials from the National Cancer Institute's Clinical Proteomic Tumor Analysis Consortium offered an update on the progress of the initiative's second phase.
This update coincided with the launch of the CPTAC Data Portal, a resource hosting all the data being produced by the second phase of the project along with certain data from its first phase, CPTAC 1.
As Chris Kinsinger, program manager in NCI's Office of Cancer Clinical Proteomics Research, noted during CPTAC's HUPO presentation, the portal currently offers more than 800 raw mass spectrometry files comprising more than 500 gigabytes of data, some of it generated on newer mass spec instruments such as Thermo Fisher Scientific's Q Exactive and AB Sciex's TripleTOF 5600.
Launched in August of last year, CPTAC 2 – a five-year, $75 million to $120 million project – aims to combine protein biomarker discovery and verification studies in tumor tissue samples with genomic characterizations of those same samples done by the NCI-funded Cancer Genome Atlas (PM 8/26/2011).
The work builds on the initial five-year, $104 million CPTC initiative launched in 2006, which worked to build a foundation of technologies and standards to advance the application of proteomics to cancer research. That project established five multidisciplinary, multi-institution research centers and developed collaborations with more than 60 public and private institutions around the world.
The second phase of the program established research centers at eight institutions including Washington University in St. Louis, the University of North Carolina, Boise State University, Pacific Northwest National Laboratory, the Broad Institute, Fred Hutchinson Cancer Research Center, Johns Hopkins University, and Vanderbilt University.
The groups have begun analysis of three tumor types – breast, colorectal, and ovarian – with the aim of profiling around 100 samples of each, Kinsinger said. He added that the project is currently focused on the discovery stage, profiling the tumor proteomes as well as a range of post-translational modifications.
The tumor samples' quality, particularly with regard to analysis of post-translational modifications – and phosphorylation, specifically – has proved a key area of concern for the project. As Kinsinger noted, although the TCGA samples were snap frozen within 60 minutes of resection, this delay in freezing could still potentially affect the PTM content of the proteins.
In a separate HUPO presentation, CPTAC 2 participant and Broad Institute researcher Steve Carr reported on an effort to ascertain the effects of time to freezing on the phosphoproteome of the project's samples.
In that study, CPTAC 2 researchers sampled tumors at 0, 5, 30, and 60 minutes post-excision and profiled their total proteome contents along with their phosphoproteomes and glycoproteomes. While they found no significant changes in the total proteome content across time points and general stability across the phosphoproteome, they did find that certain phosphosites showed changes as early as five minutes post-excision.
During his presentation, Kinsinger also noted the relatively small amount of TCGA tumor samples available to the CPTAC 2 researchers. In the past, this issue has raised concerns among several outside researchers who have suggested it would mean that samples for the discovery and verification phases of the project would likely have to come from two different sets of patients (PM 8/27/2010).
These potential disadvantages, however, are balanced by the opportunity the project offers to integrate tumor proteomic and genomic data on a large scale.
"I think the novel part of CPTAC is to do the proteogenomic integration and finally incorporate all of that data into networks and hypotheses and to test these hypotheses," Kinsinger said. "Looking at the proteogenomic integration, we want to in the first step … align the protein information we have with other [genomic] information that's available on the [University of California, Santa Cruz] Genome Browser."
He added that this portion of the project's informatics workflow is currently under development, with much of that effort being driven by Boise State researcher Morgan Giddings.
Kinsinger also highlighted some data from the TCGA's analysis of its ovarian cancer samples that, he suggested, demonstrated the rationale for the CPTAC 2 project and protein biomarker discovery efforts in general.
"With ovarian cancer, what we know is that 96 percent of the serous ovarian cancer tumors that TCGA analyzed have a mutation in p53, which is not surprising," he said. "But then after that the next most prevalent mutations were BRCA 1 and 2, which have mutation rates of 11 and 12 percent. And then [all other mutation rates] were less than that."
This, Kinsinger said, means that while ovarian cancer "is driven by the p53 mutation, there aren't other mutations that seem to really be driving that cancer – it's just very specific … So we have to have some other way of interpreting the data. I think proteins could have something to contribute."
In addition to Kinsinger, several CPTAC 2 participants presented at the session, with Johns Hopkins researchers Hui Zhang and Heng Zhu reporting on efforts to analyze post-translational modifications and Vanderbilt's Rob Slebos providing an overview of the mass spec workflows his team is using for its discovery efforts in colorectal cancer.
The Vanderbilt discovery pipeline, which was developed as part of the CPTAC 1 initiative, consists of digestion followed by basic reverse-phase HPLC, Slebos said. That step separates the peptides into 15 fractions, each of which is analyzed on a 90-minute gradient on a Thermo Fisher Scientific Orbitrap Velos instrument.
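As a rough illustration of what that workflow implies for instrument time, the sketch below multiplies out the figures Slebos gave (15 fractions per sample, a 90-minute gradient per fraction). It counts gradient time only; column equilibration, sample loading, and quality-control runs are ignored, and the function name is hypothetical rather than anything from the Vanderbilt pipeline.

```python
def gradient_hours(samples: int, fractions: int = 15, gradient_min: int = 90) -> float:
    """Total LC gradient time in hours, ignoring any overhead between runs."""
    return samples * fractions * gradient_min / 60


per_sample = gradient_hours(1)   # one tumor sample
cohort = gradient_hours(95)      # the 95 colorectal samples the team has in hand
print(f"{per_sample:.1f} h per sample, {cohort / 24:.0f} days for 95 samples")
```

On these assumptions a single sample takes 22.5 hours of gradient time, and the 95 colorectal samples come to roughly 89 days of continuous acquisition on one instrument.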
Slebos added that the researchers have used the RNA-seq data provided by the TCGA analysis of the tumor samples to generate a personalized search database for each tumor for making protein IDs, a process, he said, that can "expand protein identification levels by five to 10 percent."
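Slebos did not detail how those databases are assembled. One common approach, sketched here purely as an assumption, is to append tumor-specific variant sequences to the reference protein set before the spectra are searched. The function name, the variant tuple layout, and the toy sequences are all hypothetical, and only single amino acid substitutions are handled.

```python
from typing import Dict, List, Tuple

# (protein_id, 1-based position, reference residue, alternate residue)
Variant = Tuple[str, int, str, str]


def personalized_fasta(reference: Dict[str, str], variants: List[Variant]) -> str:
    """Return FASTA text: every reference protein plus one entry per usable variant."""
    entries = list(reference.items())
    for pid, pos, ref, alt in variants:
        seq = reference.get(pid)
        # Skip calls that do not match the reference sequence.
        if seq is None or not 1 <= pos <= len(seq) or seq[pos - 1] != ref:
            continue
        mutated = seq[:pos - 1] + alt + seq[pos:]
        entries.append((f"{pid}_{ref}{pos}{alt}", mutated))
    return "".join(f">{name}\n{seq}\n" for name, seq in entries)


# Toy data, invented for illustration only.
reference = {"PROT1": "MKTAYIAK", "PROT2": "MSEQLR"}
variants = [("PROT1", 3, "T", "A"), ("PROT2", 2, "K", "R")]  # second call mismatches
print(personalized_fasta(reference, variants))
```

Because the variant entries are added to the reference proteins rather than replacing them, peptides that span a tumor-specific substitution become identifiable without losing matches to the unmodified sequence.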
The Vanderbilt researchers are also building multiple-reaction monitoring assays for sets of proteins of particular interest – a panel of metabolic enzymes, for instance, as well as panels of tyrosine kinases.
They are currently working their way through 95 colorectal tumor samples provided by TCGA, with the ultimate goal of analyzing around 150 total samples, Slebos said, noting that he expected they would be able to complete this work in around a year.