New sequencing technologies, in particular 454 Life Sciences’, have already shown their value in analyzing bacteria and viruses in metagenomic samples.
But in the future, the new platforms may play an even bigger part in the growing field of metagenomics. According to a recent report by the National Research Council of the US National Academies of Sciences, new sequencing technologies will be especially helpful for obtaining sequence information from microbes present in low abundance, since their oversampling cost is much lower than that of Sanger sequencing.
“It was just way too prohibitively expensive to attack [metagenomics] by Sanger sequencing, unless you were Craig Venter,” said Bill Farmerie, whose core facility at the University of Florida, Gainesville, is considering using its 454 sequencer in a metagenomics project. “But this [technology] makes it available.”
According to the NAS report, entitled “The New Science of Metagenomics: Revealing the Secrets of Our Microbial Planet,” one run on 454’s GS FLX generates the same number of base pairs as about 100 runs on an ABI 3730xl instrument “for approximately 20 percent of the cost.”
Specifically, the report, which lists technical features of 454’s GS FLX, Illumina’s 1G Genome Analyzer, and Applied Biosystems’ upcoming SOLiD instrument, cites a 2006 metagenomics study of samples from an iron mine in Minnesota that used 454’s technology.
The study, published in BMC Genomics by a group led by scientists from San Diego State University, “suggested that the 454 sequencing data are remarkably similar to those generated from the same sample with Sanger sequencing, at least in terms of 16S rRNA sequences.
“Additional studies to validate the utility of the short reads clearly are warranted, but the initial data support the role of new sequencing technologies in future metagenomics studies because they will allow for deeper sampling of environmental samples than is currently possible,” the report states.
As of this week, 454’s website lists seven published metagenomic studies, with samples that range from the mouse gut to soil and ocean water.
Other users are now also testing Illumina’s Genome Analyzer for metagenomics projects.
“I think Illumina is going to be a great platform for that as well,” Elaine Mardis, co-director of the Genome Sequencing Center at Washington University, told In Sequence last week.
Her center is currently testing ways to generate 50 base-pair reads on the Illumina instrument. “We are looking in our experiment now to see how well we can pick out matches within a metagenomic sample [using Illumina’s 50 base pair tags],” she said. “Fifty base pairs should be plenty to get lots of information about microbes that are in there.”
But the shorter reads associated with next-gen sequencers may prevent researchers studying a metagenomic sample from asking questions other than what species are present.
For example, the NAS report doubts that researchers will be able to assemble genomes from metagenomics data. The new technologies “are still vexed with issues such as shorter read lengths than those that have become routine with Sanger sequencing,” the report states. “The limitations have obvious consequences for assembly, particularly for metagenomics applications in which assembly is already complex and difficult.”
One way to improve assemblies from metagenomic data, the report suggests, is to compare the data to reference genomes, and the new technologies are already helping in creating more of these references.
For example, last year scientists from Washington University, the Institute for Genomic Research, and Stanford University launched the Human Gut Microbiome Initiative, a project that aims to generate “deep draft” genome sequences of 100 cultured bacterial reference species found in the human gut. The group plans to finish 15 of them using a combination of 454 sequencing and Sanger sequencing.
“A cost-effective strategy involves producing the bulk of the coverage by shotgun reads on a 454 Life Sciences pyrosequencer,” coupled with paired-end reads from a conventional ABI 3730xl capillary instrument, according to a project outline.
The project, funded by the National Human Genome Research Institute, is still underway and will probably be completed next year, Mardis said. According to the project outline, the initiative will cost approximately $2.8 million.
But even for other analyses of metagenomic data, such as comparing proteins encoded by different microbes in these samples, short read lengths might not be good enough.
At a conference in San Diego last month, Bob Strausberg, deputy director of the J. Craig Venter Institute, said that only Sanger sequencing currently provides reads long enough to analyze and compare proteins encoded by DNA in metagenomic samples. He cited a study of proteorhodopsin-like genes in a Sargasso Sea sample that revealed how the genes have adapted to the different wavelengths of light available, concluding that “none of this could be done without Sanger reads.”