NEW YORK (GenomeWeb News) – In a paper appearing online in Nature today, researchers from the Department of Energy's Joint Genome Institute and elsewhere reported on the findings from the first 56 genomes to be sequenced and analyzed through the Genomic Encyclopedia of Bacteria and Archaea, or GEBA, project.
By sequencing and analyzing 53 bacterial and three archaeal genomes, the researchers not only identified new protein and gene families, but also improved their understanding of microbial function and phylogeny. And with even larger-scale, phylogeny-based microbial sequencing efforts on the horizon, members of the GEBA team expect to gain a more complete picture of individual microbes and microbial communities as a whole.
"Our results strongly support the need for systematic 'phylogenomic' efforts to compile a phylogeny-driven 'Genomic Encyclopedia of Bacteria and Archaea' in order to derive maximum knowledge from existing microbial genome data as well as from genome sequences to come," senior author Jonathan Eisen, a researcher at the University of California at Davis and head of JGI's phylogenomics program, and his co-authors wrote.
Although almost 1,000 bacterial and archaeal genomes have been sequenced so far, the researchers explained, the selection of these genomes has been biased toward organisms of interest for their physiology and/or role in human, plant, or animal disease.
In contrast, the GEBA project, which was launched in 2007, aims to catalog and characterize microbes based on their phylogeny or place in the tree of life.
"[W]hile the currently available sequenced genomes cover a wide range of biological and functional diversity, they have not covered a wide enough range of phylogenetic diversity," Eisen said in a statement. "What distinguishes GEBA is that it is less about the individual genomes and more about building a more balanced catalog of the diversity of genomes present on the planet which in turn should facilitate searches for novel functions and our understanding of the complex processes of the biosphere."
For the pilot phase of GEBA, the team set out to sequence 100 bacterial and archaeal genomes. They have already selected 159 "high priority" microbial isolates based on phylogenetic relationships inferred from small subunit ribosomal RNA information. Data from 92 of the genomes has been released to the Integrated Microbial Genomes database and GenBank so far.
The current study describes the findings from the first 56 (53 bacterial and three archaeal) genomes, which the researchers decoded using shotgun sequencing with Sanger, Roche 454, and/or Illumina technology. Detailed information about the sequencing has been published in several issues of the journal Standards in Genomic Sciences.
Based on their results so far, the researchers concluded that rRNA sequences are useful for helping pick out microbes from different phylogenetic positions in the tree of life, though they cautioned that there are limits to the resolution of this method.
By analyzing the genomes, the team has already found 1,768 protein families that don't share significant sequence similarity to any known proteins. They also identified previously undetected members of known protein families.
For instance, several of the sequenced organisms contained a new type of cellulase enzyme that seems to efficiently break down plant material in acidic environments. Investigators at JGI are currently characterizing these proteins in the hope of applying the information to future biofuel technology.
"The information from this first set of organisms has provided a rich source of novel enzymes and detailed biochemical pathways that can help scientists optimize processes of critical importance to areas of the DOE mission, such as biofuels production, bioremediation, and how carbon is captured and cycled in the environment," co-author Edward Rubin, director of JGI, said in a statement.
After the pilot phase of the project wraps up, Eisen and his co-workers hope to expand GEBA to include hundreds or thousands of genomes, eventually sequencing both cultured and uncultured microbes.
"The known phylogenetic diversity of bacteria and archaea is immense with hundreds of major lineages and probably millions if not hundreds of millions of species," Eisen said. "This encyclopedia project is starting at the top — with the major phylogenetic groups — 100 genomes from across the tree. But we have barely scratched the surface of characterizing the diversity on the planet."