What's in The Cancer Genome Atlas?


With cancer researchers collecting more and more information on mutations, epigenetics, gene expression, and clinical features associated with different cancers, where does the information go? The data collected by large international groups like The Cancer Genome Atlas and the International Cancer Genome Consortium are deposited in databases that are accessible to the public and searchable in various ways, says The Scientist's Carina Storrs. She has tips for researchers looking to use the information in these databases in their own work.

Both groups' data portals allow searches of mutation and expression data for specific genes, so that might be a good place to start, Storrs says. Alternatively, researchers can search through information on all known genes or pathways implicated in a certain tumor type. "Both ICGC and TCGA data are publicly available, but note that they have already been processed: sequences have been confirmed by various techniques, and patient-identifying information, such as the presence of germline SNPs, has been removed," Storrs says. Researchers can also search for gene drivers in various tumor types, the frequency of certain mutations, or the functional impact of a given mutation.

Various universities and research centers, like Memorial Sloan-Kettering Cancer Center, have their own cancer genomics databases, which can also serve as a starting point for research, Storrs adds.
