New sequencing technologies will play an important role at the three Genomic Sequencing Centers for Infectious Diseases, which in April were awarded five-year contracts totaling $106 million by the National Institute of Allergy and Infectious Diseases.
The three centers are at the Broad Institute and the J. Craig Venter Institute, which each won a $43 million contract, and the University of Maryland School of Medicine's Institute for Genome Sciences, which received $20 million.
Though the three contracts are new awards, they continue a previous initiative, under which the NIAID awarded five-year contracts to two Microbial Genome Sequencing Centers in 2003: the Broad Institute and The Institute for Genomic Research (TIGR), which merged with JCVI in 2006.
Like their predecessors, the three new GSCIDs aim to provide the infectious disease research community with rapid, cost-effective, high-quality sequencing services for pathogenic microorganisms (including viruses, bacteria, fungi, and protozoa) as well as invertebrate vectors. In addition, in an expansion of the original initiative, the three centers will conduct genotyping studies of microbes and their human hosts to study variation in host response.
The centers fit in with NIAID's overall genomics program, which aims to provide a variety of services to the scientific community, including functional genomics, proteomics, structural genomics, bioinformatics, and reagents.
According to Maria Giovanni, assistant director for microbial genomics and advanced technology at NIAID, the three centers provide more than just sequencing services, as they also offer expertise in pathogen biology. The ultimate goal is to use the sequencing data to develop new diagnostics, vaccines, and drugs, she said.
Sequencing projects to be conducted by the three centers are selected from white paper proposals that can be submitted by members of the scientific community, the sequencing centers, or both. "It's a very flexible program because we want to respond to what the scientific community needs," Giovanni told In Sequence last week.
For example, under the last contract, JCVI built an influenza virus sequencing pipeline and sequenced almost 4,000 complete human and avian influenza virus genomes, which it made publicly available through GenBank. This year, with the pipeline already in place, the institute also sequenced isolates of the H1N1 virus, an example of a "response to a need in the scientific community and to a public health need," she said.
The Broad Institute, for one of many projects under its last contract, set up a pipeline to sequence Dengue viruses, of which only very few complete genome sequences were available at the time. To date, she said, it has completed about 1,500 Dengue virus genomes.
New sequencing technologies, with their ability to generate large amounts of sequence data quickly and cheaply, have already changed the way the NIAID-funded sequencing centers conduct projects, and will likely continue to do so in the future.
Changes start with the "white paper" proposals. "Sometimes, the community members don't understand the full capabilities of the technologies, so we help them understand what's possible," said Bruce Birren, co-director of the genome sequencing and analysis program at the Broad Institute and director of the institute's GSCID. "Usually, that involves raising their sights and being more ambitious than they otherwise would have been."
Birren's center uses all of the new sequencing platforms available at the Broad Institute, which include 454's Genome Sequencer FLX, Illumina's Genome Analyzer, Applied Biosystems' SOLiD, and Helicos' Genetic Analysis system, "and we have relationships with all of the instrument manufacturers who have sequencing platforms that aren't quite yet ready for market," he said.
"We no longer have one sequencing method that's perfect for all applications," according to Birren. "Some of these give you very large amounts of very cheap data, with very short reads; others give you longer reads at higher cost. Part of the challenge, and the fun, is matching the technology to the problem."
Because the goals of a project determine which sequencing strategy is most appropriate, proposals also need to be more specific now in stating these goals, according to Claire Fraser-Liggett, director of the Institute for Genome Sciences at the University of Maryland.
White papers "are going to have to address more than just 'What's the organism of interest and how many isolates do you have on hand?'" she told In Sequence last week. "There is going to have to be some considerable thought put into what the real goals of the project will be, and based on those goals, what level of sequence coverage … will be required in order to answer specific questions."
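The link Fraser-Liggett draws between project goals and sequence coverage can be made concrete with the standard depth-of-coverage relation: average coverage equals total sequenced bases divided by genome size. The sketch below is only an illustration. The function names, the 4.4-megabase genome, the 100-base reads, and the 30-fold target are all hypothetical values chosen for the example, not figures from the centers.

```python
import math


def mean_coverage(num_reads: int, read_length_bp: int, genome_size_bp: int) -> float:
    """Average depth of coverage: total sequenced bases divided by genome size."""
    return num_reads * read_length_bp / genome_size_bp


def reads_needed(target_coverage: float, read_length_bp: int, genome_size_bp: int) -> int:
    """Smallest number of reads that reaches the target average coverage."""
    return math.ceil(target_coverage * genome_size_bp / read_length_bp)


if __name__ == "__main__":
    # Hypothetical project: a 4.4 Mb bacterial genome sequenced with 100-base reads.
    genome_size_bp = 4_400_000
    read_length_bp = 100

    n = reads_needed(30, read_length_bp, genome_size_bp)
    print(f"Reads needed for 30-fold coverage: {n:,}")
    print(f"Coverage achieved: {mean_coverage(n, read_length_bp, genome_size_bp):.1f}x")
```

A proposal that aims only to detect known variants might set a lower target than one that aims to assemble a genome from scratch, which is why the stated goals drive the sequencing plan.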
Fraser-Liggett said that her center expects to use primarily 454 and Illumina sequencing technology "over the next couple of years," and currently has two of each in place.
Meanwhile, at JCVI's sequencing center, new sequencing technologies from 454, Illumina, and ABI have already replaced much of the ABI 3730xl Sanger sequencing, and are continuing to do so, according to Bill Nierman, director of infectious diseases at JCVI.
When the awards were made in April, he said, many of the sequencing strategies that went into the original proposal had already "been dramatically revised to accommodate next-gen sequencing, and to reduce the input of Sanger sequencing with 3730xls."
Sanger sequencing still plays a role in some resequencing and influenza projects, he said, but "we are looking to quickly migrate those projects to 454 platforms." Lately, he said, his center has been thinking about conducting de novo sequencing by 454, and sequencing additional strains on one of the short-read platforms, either the Illumina or the SOLiD.
One of the opportunities that new sequencing technologies have afforded is to study entire populations instead of single organisms, according to Birren. "The right scale of projects for bacterial sequencing has changed, in just a few years, from an organism to a couple of organisms to now hundreds of isolates," he said.
"Originally, it was a big deal to sequence a genome," he said. "But now, the fact that the sequencing is so rapid and so cheap [means] we can start to characterize populations."
This is important not only to understand how microbial populations infect humans and are recognized by the immune system, but also for designing effective vaccines against them, according to Birren.
For example, in a study of drug-resistant tuberculosis, in order to understand what mutations are involved, and how the organism can afford to carry them, "we need to be looking at hundreds of examples to have the statistical power to make inferences, and these new technologies make it possible to sequence, literally, hundreds of different strains of drug-resistant TB," he said.
The introduction of new sequencing technologies has also shifted the bottleneck in projects away from sequencing itself and toward the front end of the process (acquiring and preparing samples) and the back end (data analysis).
A larger percentage of the budget is now devoted to bioinformatics than under the previous contract at TIGR, according to Fraser-Liggett, who was president and director there until 2007. "It's less expensive to generate considerably more data, but we need more effort downstream to do the same level of analysis," she said.
Nierman added that the informatics infrastructure needed to support the instruments is "a big part" of the new technologies.
"What we have to make sure is that we are building the databases and the tools that go along with this data," said Giovanni, adding that NIAID has "a large investment in bioinformatics and databases."
Along with the increased data output, the cost of sequencing has come down considerably with the advent of the new technologies. The original solicitation for the centers, posted in early 2008, states a production cost of $1 or less per Q20 kilobase for whole-genome shotgun data, including overhead and equipment costs, and finished genome sequence costs of less than $20 per kilobase above that.
Those numbers "refer primarily to Sanger sequencing, which makes up a very modest component of all of the work that we are doing now," according to Birren. Costs "have fallen by a factor of severalfold just by moving from Sanger to 454, and another 10-fold going to Illumina, and other new technologies are likely to further reduce that," he said.
Cutting costs further remains a priority, "and we ask [the centers] for numbers every couple of months," said Giovanni.
It is also in the interest of the centers, "because every time we cut the cost of sequencing in half, we double the number of organisms we can sequence," Birren said.
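Birren's arithmetic follows directly from a fixed budget: the number of genomes a center can afford is the budget divided by the cost per genome, so halving the unit cost doubles the count. The sketch below works through that under stated assumptions. The $1 per kilobase starting point echoes the figure in the original solicitation, while the budget, genome size, coverage depth, and the function name are hypothetical values chosen for illustration.

```python
def genomes_per_budget(
    budget_usd: float,
    cost_per_kb_usd: float,
    genome_size_kb: float,
    coverage: float,
) -> int:
    """Whole genomes affordable when each needs `coverage` passes over the genome."""
    cost_per_genome = cost_per_kb_usd * genome_size_kb * coverage
    return int(budget_usd // cost_per_genome)


if __name__ == "__main__":
    # Hypothetical inputs: $128,000 budget, 4,000 kb genome, 8-fold coverage.
    budget_usd = 128_000
    genome_size_kb = 4_000
    coverage = 8

    for cost_per_kb in (1.00, 0.50, 0.25):
        n = genomes_per_budget(budget_usd, cost_per_kb, genome_size_kb, coverage)
        print(f"${cost_per_kb:.2f} per kb -> {n} genomes")
```

Because the count is rounded down to whole genomes, the doubling is exact only when the budget divides evenly by the cost per genome, as it does with these inputs.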
According to Fraser-Liggett, "We have always put our major emphasis on [cutting costs], as have the other centers, and that was a requirement the first time around as well." Cost savings can come not only from new technologies, she said, but also from improving existing ones, for example by optimizing protocols or by increasing the level of automation.
"That is why cost continued to decrease even when sequencing platforms and technology did not necessarily change very rapidly," she said. "There was always room for improvement in the overall process."
Fraser-Liggett pointed out that the cost of a finished genome sequence depends on the definition of "finished," and that many projects today don't require the same level of completion as past projects did.
In addition to reducing costs, the new sequencing technologies have also sped up projects, for better or for worse.
"With smaller genomes, you can essentially complete an entire project in a day on these very high-throughput platforms," she said. "We don't have the luxury we did previously with the slower capillary sequence technology, being able to generate three-fold coverage of data, and look at it, and then make decisions."
Besides moving over to second-generation sequencing platforms, the three centers that won the recent NIAID awards are keeping an eye on emerging sequencing technologies. Speaking for the JCVI, Nierman said that "I think within the five years of this contract, we are going to be doing projects that will look significantly different … than we are going to be seeing as we start."
Added Fraser-Liggett, "It would not surprise me at all [if] four years from now, what we are doing, and how we are going about these studies, may be very different than what our plan would be as we get started now."
She cautioned, however, that "you can almost get distracted, in part, by trying to go after all the new technologies, to the detriment of getting the work done."
But the new contracts — unlike the previous ones — also let the centers devote part of their budget to technology R&D, which is "going to facilitate the implementation of any new technologies into all the sequencing pipelines," she said.
For instance, "if a new technology looks to be very promising," centers can now do side-by-side comparisons with existing technology as part of ongoing projects.
But R&D activities are not restricted to testing and implementing new sequencing platforms.
For example, researchers may explore hybrid sequencing strategies to decode pathogens that have proven problematic in the past because of the large amounts of repetitive DNA they contain, Fraser-Liggett said.
According to Nierman, "a lot of technology development is going to go into genotyping applications" — such as developing new array-based approaches for targeted genotyping.
Also, as sequencing gets cheaper, "the genotyping and many other array-based applications will migrate to sequencing platforms," he said.
Another area of R&D will pursue methods to sequence small amounts of material, said Birren. For example, in order to study how the host immune response shapes the evolution of a pathogen, samples need to be analyzed directly from a patient. "Some of the technologies we need would allow us to sequence smaller and smaller amounts of material," he said.
Overall, the new technologies could eventually help the three centers fulfill their mission to improve infectious disease research. "The centers clearly have a mastery of technology, based on a long expertise and experience applying that technology, but the key to the centers is coupling that technology to important problems in infectious disease, through our connections to the infectious disease community," Birren said.
"The white paper process is really excellent at bringing together infectious disease doctors and epidemiologists with persistent problems that their own disciplines haven't been able to solve yet," he added, "and putting them in touch with the power of genomic technology that really, we think, is going to leapfrog this."