JGI Installs Second PacBio, Phases out 454 as it Preps to Generate Nearly 50 Trillion Bases in 2012


By Andrea Anderson

The US Department of Energy's Joint Genome Institute is tweaking its sequencing technology lineup as it gears up to produce more than 47 trillion bases of DNA sequence in the 2012 fiscal year, with more than half that sequencing output slated to go toward its largest user program, the Community Sequencing Program.

JGI is currently running eight Illumina HiSeqs, two Illumina MiSeqs, and a single PacBio RS machine, according to JGI spokesman David Gilbert. The institute just phased out two Roche 454 instruments, along with five Illumina GAIIx sequencers, and has purchased a second RS system that was delivered this week and should be up and running next month, Gilbert said.

As it has expanded its fleet of Illumina sequencers and built up its capability to generate long reads using the PacBio RS, JGI has also been decreasing its reliance on Roche 454 instruments, JGI Director Eddy Rubin told In Sequence. "We've been able to do many of the applications cheaper with Illumina," he explained, "and we're hoping that many of the things we did with Roche [454 sequencers] we can do with PacBio."

Longer reads are still an advantage for many projects, including sequencing efforts involving genomes or metagenomes that have not been characterized previously. But Rubin said JGI is planning to use PacBio platforms for many of the long-read applications where Roche 454 sequencing was used in the past.

"One of our big focuses is on de novo sequencing," Rubin said. "For our de novo sequencing, where we have not seen the genome before — like metagenomic environments, fungi, plants — having long reads is a real advantage, so PacBio does contribute to that."

Although it is still relatively expensive to generate sequence data using PacBio systems, he noted that the read lengths and nature of the errors generated with the RS instrument are appealing.

"Its throughput is expensive and a disadvantage, but its long reads are great," Rubin explained. "It has errors, but the errors are random — it seems to cover both high- and low-GC [regions of the genome] very well."

Nevertheless, much of the sequencing done at JGI will be done on Illumina machines, which Rubin called the "big workhorse for much of what we're doing."

41 New Proposals

JGI last week announced that it had chosen 41 research projects out of 152 applicants to participate in its 2012 Community Sequencing Program.

The Walnut Creek, Calif., genomics facility provides sequencing services and analytical assistance for projects selected for the CSP, Rubin explained. For their part, participating researchers must be able to show that they have the financial wherewithal to bring other aspects of the projects to fruition, including funding for collecting the samples and dealing with the data provided by JGI.

Eligible projects are those related to nearly every aspect of the biosphere except biomedical science, Rubin told IS, including studies of previously uncharacterized genomes, plant-microbe interactions, and microbial communities.

The 41 proposals that were approved for 2012 run the gamut from genome sequencing studies aimed at fleshing out branches of the tree of life to metagenomic efforts related to bioremediation and/or biofuel production.

In addition to DNA sequencing, some CSP projects will also include RNA sequencing aimed at assessing gene expression or metatranscriptomes.

Some of the 2012 CSP projects will involve whole-genome sequencing of specific organisms — including a study headed by researchers at Oregon State University and the University of California, Riverside that is focused on sequencing around 1,000 fungal genomes in an effort to establish two or more reference genomes for each of the 577 fungal families.

Nevertheless, many of the 2012 CSP projects are geared toward understanding the genomics of communities and interactions between organisms in specific environments rather than individual genomes, Rubin explained.

As part of one of the larger CSP projects, for instance, a team led by University of North Carolina microbiology researcher Jeff Dangl is looking at interactions between plants such as Arabidopsis, maize, or the biofuel crop plant Miscanthus and the microbial communities in the rhizosphere around the roots of these plants.

"Somewhere in the future we want to improve plant growth by recruiting certain microbial communities," Rubin said. "Genomics in the past has very much been about looking at a microbe or a tree. This is about looking at their interactions."

Overall, roughly 55 percent of the sequence generated at JGI in fiscal year 2012 is expected to go toward CSP projects. The remaining sequencing capacity at JGI will be devoted to the DOE's Bioenergy Research Centers, the JGI Director's Science Program, the Low Dose Radiation Research Program, and the International Cooperative Biodiversity Groups program.

Specifically, around 30 percent of the estimated 47.1 trillion bases to be generated at JGI next year will go to the Bioenergy Research Centers, centers funded by the DOE that are interested in converting biomass into biofuels. The JGI Director's Science Program has been allotted 10 percent and the Low Dose Radiation Research and ICBG programs will make up the remaining five percent of the sequencing output.
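As a rough illustration of what those shares mean in absolute terms, the short sketch below applies the reported percentages to the estimated 47.1 trillion bases. The variable and function names are hypothetical, the percentages are the approximate figures quoted above, and the Low Dose Radiation Research and ICBG programs are treated as a single combined five percent share because no finer breakdown is given.

```python
# Illustrative arithmetic only; all figures are the article's estimates.
# The total is expressed in billions of bases so the math stays in integers.
TOTAL_GIGABASES = 47_100  # 47.1 trillion bases

# Approximate percent shares as reported; the last entry is a combined share.
SHARES_PERCENT = {
    "Community Sequencing Program": 55,
    "Bioenergy Research Centers": 30,
    "JGI Director's Science Program": 10,
    "Low Dose Radiation Research and ICBG programs": 5,
}


def allocate(total, shares):
    """Return each program's portion of `total`, given integer percent shares."""
    return {name: total * pct // 100 for name, pct in shares.items()}


allocation = allocate(TOTAL_GIGABASES, SHARES_PERCENT)
for name, gigabases in allocation.items():
    print(f"{name}: about {gigabases / 1000:.3f} trillion bases")
```

Under these assumptions, the CSP share works out to a little under 26 trillion bases, which is consistent with the "more than half" figure cited earlier.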

A 30-Fold Increase

Prior to 2009, JGI relied primarily on Sanger sequencing for the CSP and other projects (IS 7/11/2008). Since then, though, the institute has increasingly turned to next-generation sequencing platforms.

By adopting high-throughput sequencing technologies over the past few years, JGI has dramatically increased its sequencing output: Between 2009 and 2011, the institute increased the amount of sequence generated by roughly 30-fold with a budget that has stayed level at around $70 million, Gilbert noted.

For the CSP in particular, sequence output in 2012 is expected to be four to five times higher than it was in 2011.

"All of the metagenomic projects are getting bigger and bigger," Rubin said.

"With Sanger [sequencing], metagenomic projects were tens of millions of base pairs," he added. "Now we have metagenomic projects that are terabases."

To deal with the massive amounts of sequence data it is generating, JGI has started using the Berkeley Supercomputing facility, known as the National Energy Research Scientific Computing Center, or NERSC, for storing and analyzing sequence data. To that end, Rubin said JGI is in the process of converting genomic algorithms into a format that is compatible with the high-performance computing hardware at NERSC.

Along with DNA and RNA sequencing capabilities, he noted that the center has also been developing other sequencing-related capabilities, including technology for amplifying and sequencing DNA from single microbial cells that can't be grown in culture — technology that allows for projects that involve metagenomic sequencing of a given microbial community in parallel with single-cell sequencing of microbes from the same environment.

"The JGI primarily used to just offer sequence analysis and progressively we're offering additional things: we're offering single-cell genomics, we have a small DNA synthesis program," Rubin said. "There are various other capabilities that we are progressively offering to start becoming a next-generation genome center."

Have topics you'd like to see covered in In Sequence? Contact the editor at anderson [at] genomeweb [.] com.