DENVER (GenomeWeb News) – As researchers get ready to wrap up the pilot phase of the Cancer Genome Atlas later this year, they're gearing up for the second phase of the project, which will likely involve sequencing roughly 20 to 25 tumor types in the next five years. Second-generation sequencing methods are playing an increasing role in this effort, attendees heard at the American Association for Cancer Research meeting here in Denver yesterday.
Several National Cancer Institute representatives, TCGA members, and those already using data from the TCGA project came together to discuss the past, present, and future of the project at the AACR meeting.
TCGA, officially launched in late 2005, is a research collaboration aimed at characterizing the genomes of several tumor types. By working together in an integrated, high-throughput research effort, TCGA can tackle these genomes quickly and cost-efficiently, Daniela Gerhard, director of the NCI's Office of Cancer Genomics, said during her presentation.
Those involved said researchers expect to finish the pilot phase of TCGA this fall. Among the goals of that pilot: identifying high-quality biospecimens, characterizing 500 glioblastoma multiforme and ovarian cancer cases and 200 lung cancer samples, and ensuring that the data has sufficient power to detect changes present in about three to five percent of samples.
TCGA published a paper describing preliminary results for 206 glioblastoma multiforme (GBM) samples in Nature last September. At the session yesterday, Peter Park, a researcher affiliated with Harvard University, the Brigham and Women's Hospital, and the Children's Hospital, said the team has now completed 277 GBM samples and counting. When they evaluated 700 more genes in dozens of samples, Park said, the team found many additional genes present at relatively low frequency.
Members of the consortium are reportedly in the process of migrating from Sanger sequencing to high-throughput, second-generation sequencing platforms. The collaboration has been modifying its pipeline for second-generation sequencing over the past few months.
They are also making progress characterizing ovarian cancer genomes, sequencing 600 genes in 23 serous ovarian cancer samples. In so doing, TCGA members have so far identified 91 mutations in 56 different genes in ovarian cancer.
Eventually, they plan to scale this up to nearly 6,000 genes and to examine microRNA (miRNA) sequences and miRNA target sequences. The first ovarian cancer data should be out at the end of May, with much more data likely released this summer, researchers said.
Those involved in the session explained that TCGA has yielded data providing insights into DNA methylation and mutation, pathway views, cancer sub-type classifications, and more. Now, it's up to the research community to come up with creative ways to use transcriptome, copy number, methylation, sequence, miRNA expression, and clinical data, Park and others said.
Gerhard noted that the team has learned many lessons in the past few years, including insights into tissue collection and the effect of tissue quality on molecular data. For instance, she said, it's been difficult to procure cancer samples that have adequate clinical annotation and information about patient outcomes.
A standard operating procedure for next-generation tissue processing has been established — a move that should benefit not only future TCGA efforts but also research by other members of the community, Gerhard said.
Data from TCGA is being made freely available online. To protect patient privacy, the data will be available in two tiers: a public tier and a protected tier that can only be accessed by bona fide scientists who have submitted an application form.
"We encourage anyone and everyone to use it for their own research," Gerhard said.
While future funding for the project has not yet been clearly defined, Gerhard said TCGA's pilot phase, which had a budget of $100 million, has more than achieved the proof-of-concept goals set out for it. Looking ahead, Gerhard called for an improved analytical pipeline for TCGA, a view echoed by other speakers.
Speaking on the future of TCGA, NCI Deputy Director Anna Barker said the project requires an analytical pipeline that can ensure close to real-time interpretation of results while allowing data integration and sharing. That will likely require increased investment in bioinformatics tools, she added.
Results from TCGA so far indicate that it is possible to distinguish between cancer-related signals and noise, discover new cancer genes, classify cancer sub-types, find clinically relevant data, integrate research teams, and do high-throughput research, Barker said. Now, she emphasized, the pressure is on for researchers to work together and meet specific goals and milestones in the effort to understand and ultimately treat cancer.