NCI's CPTAC Project, Entering 16th Year, Has Left Clear Mark on Cancer Proteomics World


NEW YORK – Now entering its fourth stage and 16th year, the National Cancer Institute's Clinical Proteomic Tumor Analysis Consortium (CPTAC) project is one of the longest-running and best-funded initiatives in the proteomics field.

According to researchers inside and outside the program, it has produced a number of technologies and methods and a wealth of experimental data, helping to bolster proteomics' credibility as a tool for research into cancer and life science more generally.

"It's a tour de force, the analysis that they have done," said Pedro Cutillas, professor of cell signaling and proteomics at the Cancer Research UK Barts Centre. Cutillas is not a CPTAC participant but has used the project's data in his work. "They have produced very valuable datasets for the cancer research community," he said.

The project began with the 2006 launch of the NCI's Clinical Proteomic Technologies for Cancer (CPTC) initiative, a five-year, $104 million effort focused primarily on developing and evaluating proteomic tools and workflows. That was followed by the second stage of the initiative, another five-year, roughly $100 million endeavor that aimed to take the tools and methods developed and validated during the CPTC project and apply them to the analysis of clinical cancer samples.

Notably, the CPTAC researchers were to generate proteomic and phosphoproteomic profiles of tumor samples that had already undergone genomic analysis via the NCI's Cancer Genome Atlas (TCGA) program, making the project one of the first large-scale efforts to take a proteogenomic approach to cancer research.

The third edition of CPTAC continued and deepened this proteogenomic approach, adding new emphasis on translational work as researchers aimed to use genomics and proteomics to better understand patient drug response and the development of resistance. The fourth stage of the project, which launched this year and is slated to run for an additional five years at funding levels of around $11 million per year, will continue in this vein. Researchers will focus on specific cancers, including melanoma, multiple myeloma, acute myeloid leukemia, and non-small cell lung cancer, and will collaborate with NCI-sponsored clinical trials to apply proteogenomics to questions of drug response and resistance.

While the peer-reviewed papers and datasets detailing proteogenomic profiles of different tumor types are perhaps the highest-profile products of the CPTAC project, its efforts to standardize and validate tools and workflows for these analyses, which began with the CPTC initiative, have provided an essential foundation for the group's work and helped advance cancer proteomics and proteomics more generally.

Richard Smith, director of proteomics research at the Pacific Northwest National Laboratory (PNNL) and a CPTAC researcher, highlighted the "systematic way" the program went about "validating the mass spectrometry-based platforms for doing proteomic measurements" as a key contribution.

When CPTC launched in 2006, proteomics was still in its relative infancy, with many mass spec workflows facing a variety of challenges including issues around reproducibility and throughput. These remain sticking points for the field even today, but significant advances have been made, and the CPTC and CPTAC projects had notable impacts, Smith said.

"They put together a consortium that worked together to really effectively develop a common approach to refine the details, increase the throughput, and that did things on a scale and with a level of care that really hadn't been done at that point," he said.

The project "started with the premise that in order for proteomics to have clinical utility, it had to have defined protocols that were used by all the labs doing proteomics if they wanted to publish at a certain level of credibility," said Karin Rodland, Smith's colleague at PNNL and a professor emeritus at Oregon Health and Science University. Rodland did not participate in the CPTC portion of the initiative but has been involved in subsequent stages of the effort.

Such work was needed to give mass spec-based proteomics the broader acceptance it lacked at the time, Rodland noted.

"Mass spectrometry in the early 2000s had developed a reputation for being irreproducible because of poor experimental design in the very first plasma proteomic studies," she recalled. "It had given the field a bad name. There was a tendency to poo-poo mass spectrometry. And that was what [the CPTC] was all about. Erasing that bad name."

The effort has also served to spark the development of new tools, noted Alexey Nesvizhskii, professor of bioinformatics at the University of Michigan.

"It stimulated new technology developments and new computational developments," he said, noting that many of the tools developed within CPTAC are now widely used by investigators outside the initiative.

Nesvizhskii's lab has produced a number of new proteomics software tools, including MSFragger and IonQuant, through its work within the initiative.

Using the tools and workflows developed and validated through the initiative, CPTAC researchers have generated some of the largest and most comprehensive cancer proteomic and proteogenomic datasets to date, providing both new biological insights and data that outside scientists can use in their research. The NCI's Proteomic Data Commons, which hosts the CPTAC data along with data from associated efforts like the International Cancer Proteogenome Consortium (ICPC) and Applied Proteogenomics Organizational Learning and Outcomes (APOLLO) project, currently features proteomic data on roughly 2,400 cancer samples spanning more than 20 cancer types.

"The network has generated an impressive number of cancer-type profiling studies," said Connie Jimenez, head of the OncoProteomics Laboratory at the VU University Medical Center Amsterdam. Jimenez is not a CPTAC participant but said that she uses the tools and data developed by the project in her work.

"These are important landscape studies that are exploratory and hypothesis generating," she said. "They were large scale so that a representative sample per tumor type could be analyzed, and there was [a] big focus on the quality of the samples and quality control. They've generated deep insights into cancer biology and have also yielded a lot of biomarkers for follow-up research and also novel drug targets."

Jimenez noted that while there are a number of cancer proteomics research groups and initiatives around the world, few have the resources, financial and otherwise, for projects as ambitious as CPTAC.

"It's the scale [of the CPTAC data] and the number of different data types," Jimenez said, which include genomic, transcriptomic, microRNA, and methylation data, as well as, in more recent studies, information on the tumor microenvironment and the immune system.

"They are very comprehensive, and when you put the information from all these different layers together, including the proteome and often the phosphoproteome, you can really get a mechanistic understanding and hypotheses of the cancer biology or how the cancer rewires and is able to grow and manipulate its environment," she said.

This data has also helped push proteomics closer to the mainstream of cancer research, Jimenez said.

"Still, the cancer field is a genomics field, and I think the CPTAC network has helped to get more focus on the proteins, the molecules in the cell that do most of the work and provide most of the drug targets and biomarkers," she said. "They have had, in recent years, a series of high-impact papers in Cancer Cell, Cell, Nature. … These articles get attention from oncologists, from people in the cancer omics field."

"I definitely see [proteomics] moving more into the mainstream," Rodland said. "The early adopters in clinical medicine are now aware of proteomics and phosphoproteomics and what they can contribute, and I am seeing more openness to including proteomics measurements in the early stages of clinical trials. We are getting positive responses from the people we are reaching out to."

"I think people are realizing that this type of data is important," Cutillas said. He noted, though, that the proteomics techniques employed by the CPTAC teams remain out of reach for many.

"It's not as widespread as, say, transcriptomic or genomic analysis because of the cost and availability of instruments and labs with expertise," he said.

Smith, likewise, noted that while mass spec proteomics capabilities are beginning to reach the broader community, it remains difficult for a typical lab "to do these studies of hundreds of samples with the depth of proteomic analysis where you are measuring 10,000 proteins and maybe 50,000 phosphorylation sites and doing it in a quantitative fashion."

Rodland said that in the newly launched fourth edition of the initiative, she is aiming to push the effort's proteomics insights closer to actual clinical applications. In the previous stage of the project, she and her colleagues identified the protein Aurora kinase B, for example, as key to early resistance to gilteritinib in FLT3-mutated acute myeloid leukemia (AML).

They are now continuing this work with AML clinical trial samples, investigating drug treatments and resistance in the disease.

"We are looking at clinical trials of combination drug therapies, trying to understand the cross-talk between combination therapies, how to identify synthetic lethality, what drugs to use in combination, and whether you do it simultaneously or sequentially," Rodland said. "We are trying to have a molecular understanding of combination drug therapy."