Swedish Team Develops Genome-Wide Database Assessing Impact of Individual Genes on Cancer Outcomes


NEW YORK (GenomeWeb) – A team led by researchers at Sweden's Royal Institute of Technology (KTH) and Uppsala University has developed an interactive database containing information on the impact of individual proteins on clinical outcomes in 17 cancer types.

Described this week in a paper published in Science, the resource offers cancer researchers a tool for navigating the massive amounts of molecular data generated by efforts like the National Cancer Institute's Cancer Genome Atlas (TCGA) initiative and could enable the identification of markers predictive of disease progression and outcome as well as possible drug targets, said Mathias Uhlén, professor of microbiology at KTH and first author on the study.

Uhlén and his colleagues developed the database, which they are calling the Human Pathology Atlas, as part of the larger Human Protein Atlas (HPA) project, which Uhlén leads out of KTH.

Using transcriptomic data from TCGA and antibody-based protein measurements from the HPA, the researchers looked at the prognostic role of each protein-coding gene in 7,932 tumor samples spanning 17 different tumor types. According to the authors, they generated 100 million Kaplan-Meier survival plots describing the prognostic value of these genes based on transcript levels, more than 900,000 of which are available at the pathology atlas resource.
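To illustrate the kind of analysis behind each plot, here is a minimal sketch of a Kaplan-Meier estimate computed separately for patients with high and low transcript levels of a single gene. It is not the authors' pipeline: the function names, the toy data, and the fixed expression cutoff are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple


def kaplan_meier(times: Sequence[float], events: Sequence[int]) -> List[Tuple[float, float]]:
    """Return (time, survival probability) at each time where a death occurred.

    events[i] is 1 if patient i died at times[i], 0 if the patient was censored.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve: List[Tuple[float, float]] = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients who died or were censored at the same time point.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            survival *= 1 - deaths / n_at_risk
            curve.append((t, survival))
        n_at_risk -= leaving
    return curve


def curves_by_expression(
    expression: Sequence[float],
    times: Sequence[float],
    events: Sequence[int],
    cutoff: float,
) -> Tuple[List[Tuple[float, float]], List[Tuple[float, float]]]:
    """Split patients at a hypothetical expression cutoff and estimate both curves."""
    high = [(t, e) for x, t, e in zip(expression, times, events) if x >= cutoff]
    low = [(t, e) for x, t, e in zip(expression, times, events) if x < cutoff]
    high_curve = kaplan_meier([t for t, _ in high], [e for _, e in high])
    low_curve = kaplan_meier([t for t, _ in low], [e for _, e in low])
    return high_curve, low_curve


# Toy data (invented): transcript level, follow-up time in years, death indicator.
toy_expression = [5.0, 1.0, 7.0, 2.0]
toy_times = [1, 2, 3, 4]
toy_events = [1, 0, 1, 1]

high_curve, low_curve = curves_by_expression(toy_expression, toy_times, toy_events, cutoff=3.0)
print("high expression:", high_curve)
print("low expression:", low_curve)
```

A gene would be a candidate prognostic marker if the two curves differ by more than chance would explain. Testing that takes a statistical comparison such as a log-rank test, which this sketch leaves out.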

The effort, Uhlén noted, differs from many other initiatives in that, rather than looking at links between mutations or gene variants and cancer outcome, the focus is on links between protein expression, as represented by transcript levels, and outcomes.

"This is completely separate from what most cancer researchers are asking when they ask if a mutation is good or bad," he said. "We are looking at the downstream effect of mutations not the mutations themselves."

The project was enabled by the comprehensive transcriptomic profiling done on TCGA samples as well as work published last year by Uhlén and collaborators demonstrating that transcript levels could serve as effective proxies for protein expression.

One of the main takeaways of the work, Uhlén suggested, is the breadth of genes and proteins that are prognostic for cancer. The researchers found that some 10,000 of the roughly 20,000 protein-coding genes in humans were prognostic in at least one of the cancer types. Many of these genes are involved in functions like mitosis and regulation of apoptosis, he said, noting that "this makes sense because we know that a lot of cancers are widely proliferating cells."

Among the down-regulated prognostic genes are many tissue-specific molecules, which Uhlén noted also fits into existing understandings of cancer and in particular the observation by pathologists that the more de-differentiated a cancer is the more aggressive it is likely to be.

More than presenting any firm conclusions, though, the study presents researchers with a resource they can use to explore potential biomarkers and drug targets more deeply. The fact that 10,000 genes are prognostic for one of the 17 cancers is interesting but not particularly useful from a clinical perspective, Uhlén noted.

"Oncologists and pathologists are not very impressed by prognostic genes because they are only statistical," he said. "You can have a bad gene and survive very well, and you can have a good gene and not survive."

What is needed going forward is more detailed study of specific cancers and questions to identify sets of molecules that can consistently predict outcomes in large groups of patients. More investigation into the effects of different therapies is also needed, Uhlén said, noting that he and his colleagues did not have treatment information for the patient samples they used to build the database.

"So, now it would be very nice to move to better stratified or better quality patient material where we also know how they were treated," he said. "We are only scratching the surface right now."

The pathology atlas is a "knowledge base" that takes the vast amounts of data present in the TCGA and HPA datasets and presents it in a form more accessible to the average cancer researcher, Uhlén suggested. "Pathologists understand Kaplan-Meier plots and so on. So, we show which are [potentially] prognostic genes and now it is up to different groups and research consortia to go into different cancers and validate those."

He added that he also believes the resource will prove useful for discovery of new drug targets, and noted work he and his colleagues did in the paper to identify potential targets in liver cancer.

For this work they combined the TCGA expression data with a model of cell metabolism developed in Uhlén's lab.

"We have a metabolic model [comprising] 4,000 enzymes, and so we can look at the expression [of each enzyme] and whether it is present or not in a [tumor sample]," he said. "And we can put that into the model, and we can see if they can produce the energy required, because cancers need a lot of energy because they are growing very fast."

Through this analysis they found that many of the enzymes present are redundant. "If you lack an enzyme you will still do fine because the cell will [use] alternative pathways," he said.

However, Uhlén said, a small number of enzymes are essential, meaning they can't be routed around. More interestingly from a therapeutic perspective, a number appear to be redundant in healthy tissue but essential in tumor tissue.

"And those could, of course, be fantastic targets for cancer therapy," he said. "This is one reason we are saying this could be a part of personalized medicine, because we can use next-generation sequencing of these patients, put that data into the model, and then we could say these patients should be treated with this inhibitor and not this one and so on."
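The idea of an enzyme being redundant in healthy tissue but essential in tumor tissue can be shown with a toy sketch. It is a deliberate simplification, not the genome-scale metabolic model described in the paper. It assumes energy can be produced whenever every enzyme in at least one pathway is expressed, and all enzyme names, pathways, and expression profiles are hypothetical.

```python
from typing import FrozenSet, List, Set


def can_produce_energy(expressed: Set[str], pathways: List[FrozenSet[str]]) -> bool:
    """True if at least one pathway has all of its enzymes expressed."""
    return any(pathway <= expressed for pathway in pathways)


def essential_enzymes(expressed: Set[str], pathways: List[FrozenSet[str]]) -> Set[str]:
    """Enzymes whose removal alone stops energy production in this sample."""
    if not can_produce_energy(expressed, pathways):
        return set()
    return {
        enzyme
        for enzyme in expressed
        if not can_produce_energy(expressed - {enzyme}, pathways)
    }


# Hypothetical model: two alternative routes to the same energy output.
toy_pathways = [frozenset({"E1", "E2"}), frozenset({"E3", "E4"})]

# Hypothetical expression profiles: the tumor sample has lost enzyme E4.
healthy_expressed = {"E1", "E2", "E3", "E4"}
tumor_expressed = {"E1", "E2", "E3"}

healthy_essential = essential_enzymes(healthy_expressed, toy_pathways)
tumor_essential = essential_enzymes(tumor_expressed, toy_pathways)

# Candidate targets: essential in the tumor but dispensable in healthy tissue.
candidate_targets = tumor_essential - healthy_essential
print(sorted(candidate_targets))
```

In this toy case the healthy sample can route around the loss of any single enzyme, while the tumor sample depends entirely on the first pathway, so the enzymes in that pathway come out as tumor-specific candidates.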

Uhlén and his colleagues are now following up their work with more in-depth studies of lung and colorectal cancer, some of which they published in the Science paper, confirming the prognostic potential of several genes at both the mRNA and protein levels.

"We're planning also to do a deeper analysis for other cancers, as well," he said. "But we also hope that this new database will inspire other groups around the world to do that kind of work."