Geospiza said this week that the Ontario Institute for Cancer Research in Toronto will use its FinchLab software to manage data from a number of next-generation sequencing projects, including data for the recently launched International Cancer Genome Consortium.
Founded at the end of 2005, the OICR has rapidly accumulated a sizable fleet of second-generation sequencers. As of March, the center had five Illumina Genome Analyzers and five Applied Biosystems SOLiD systems. At the time, an OICR official told BioInform sister publication In Sequence that the center plans to increase its next-gen sequencing capacity another three- to fourfold over the next two to three years.
In April, the OICR took on an even larger role in the next-gen sequencing community as the headquarters for the ICGC, which plans to resequence around 25,000 samples of around 50 different tumor types over the next 10 years. The OICR bioinformatics team, led by Lincoln Stein, will manage the ICGC Data Coordination Center [BioInform 05-16-08].
Geospiza CEO Todd Smith said that the company responded to a request for proposals for data-management systems that OICR issued in March. An OICR spokesperson told BioInform in an e-mail that the institute chose Geospiza because it “best met our scientific and strategic needs.”
“This is really a validation of what we have been working toward with this software product,” Smith told BioInform. “It’s very important that a significant group is an early adopter” of the software, he said.
Francis Ouellette, associate director of informatics and biocomputing at OICR, told BioInform in an e-mail that he and his OICR colleagues are “taking a systematic approach” to understanding the DNA sequence variations responsible for cancer, “and plan to integrate [FinchLab] with other scientific activities at the OICR … like the ICGC.”
Cruising the Exome
Ouellette said that OICR is generating diverse data for ICGC as well as for its internal research projects, which include human exome genome sequencing, expression analysis, and epigenomics studies on human cancers with an initial emphasis on pancreatic cancer.
Data-management needs at OICR include tumor, sample, and analysis tracking; planning sequencing runs; quality control; and preparing reports. For example, FinchLab will be put to work “tracking Applied Biosystems and Illumina reads from the cancer genomics sequencing project,” said Ouellette.
“We are using FinchLab to help us analyze and process cancer genomics data from our next-generation sequencing platform and we will use FinchLab to complement our in-house developed bioinformatics tools and analysis pipeline,” he added.
OICR has just begun integrating its in-house data-analysis system with FinchLab, he said.
Smith explained that the institute may have contemplated continuing with its home-grown system but ultimately opted for a commercial solution. “The people there certainly have the capabilities to do that, but they wanted to see if they could avoid that step this time,” he said. “They know how hard it is.”
Many research institutes have developed in-house informatics infrastructures for Sanger sequencers and other technologies that produce far less data than next-gen sequencing, explained Smith. “The dialogue we have been having is, ‘This is really different and the system we have isn’t going to meet that need,’” he said.
“Our view has been for a very long time that a lot of these programmers in these institutions should be more deeply focused on the science problems rather than the infrastructure,” said Smith.
Smith said he views Geospiza’s role in the second-generation sequencing era as not only helping with data handling but “converting it into information, which has always been the focus and direction of the company.”
He added that since Geospiza was founded in 1997, “our core talents and technologies have been directed toward genetic analysis.”
When the Mailroom Sequences
Second-generation sequencing technologies create opportunities for scientists to explore genomes in “very new ways,” he said, noting that this market is no longer limited to large labs.
Referring to a recent ad campaign from Illumina in which a researcher claims, “I’ve turned my mailroom into a genome center,” Smith said, “I ask the question, ‘Well, that’s great, [but] where are you going to put your data center?’ And that’s what we can help them with.”
Smith cited the company’s long experience with managing Sanger sequencing data as an advantage in the rapidly evolving market for next-gen data-management tools. “We have a real understanding of the different kinds of experiments people do,” he said.
Second-generation sequencing “turns the knob to 11, if you will, but people are going to ask the same kinds of questions, just in a new way with new data,” said Smith.
While Geospiza’s focus in the past has been core laboratories, the company’s potential customer base in the second-generation era includes biotechnology and pharmaceutical companies, too. After all, “most people who have the technology in place are struggling with it,” he said.
Providing OICR with technology is also “validation of our experience in science in working with these data,” said Smith.