As research tools become more refined, scientists are applying single-cell genomics approaches to fill in what is known about microbial diversity. Speaking at the eighth annual DOE-JGI User Meeting last week, the Broad Institute's Paul Blainey and Tanja Woyke from the US Department of Energy Joint Genome Institute showcased how both their groups are using single-cell genomics to bolster knowledge of candidate phyla.
The vast majority of bacteria cannot be cultured, and a number of them have only been glimpsed when their sequences are picked up by metagenomic studies. As a result, little is known about a number of candidate bacterial phyla, such as OP9, NC10, and ZB3.
With new approaches, though, "we are really starting to fill in these branches," Woyke said.
The Broad's Blainey, who is also an assistant professor at MIT, outlined his lab's approach to single-cell genomics, which has two broad parts: isolating the cells and amplifying DNA from those cells.
To isolate cells, his lab takes a microfluidic-based approach using a 48-reactor device with a number of microfabricated valves that they developed. "With a laser trap, we can grab hold of bugs [flowing] into the chamber for lysis and whole-genome amplification," he said.
For that amplification, his lab turns to multiple displacement amplification, though Blainey noted the approach is very sensitive to contamination. Common sources of contamination, he said, include the sample itself (for instance, when more than one cell is isolated), the lab environment, and even the amplification reagents themselves, which he likened to "a prize in a cereal box."
To combat contamination, Blainey said his lab reduces the volume of the reaction, making the ratio of the cell to the contaminants more favorable. He added that this is not a complete solution but it is an improvement. Additionally, the lab typically measures contaminants using tools like digital PCR.
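The arithmetic behind the volume argument can be sketched as follows. This is a minimal illustration, assuming that reagent-borne contaminant DNA is present at a fixed concentration, so its total mass scales with reaction volume, while the target cell contributes a fixed amount of DNA. The function name and all numeric values are hypothetical and are not figures from Blainey's talk.

```python
def target_to_contaminant_ratio(cell_dna_fg, contaminant_fg_per_nl, reaction_volume_nl):
    """Ratio of target-cell DNA to reagent-borne contaminant DNA.

    Assumption (not from the talk): contaminant mass scales linearly
    with reaction volume, while the cell's DNA mass is fixed.
    """
    contaminant_fg = contaminant_fg_per_nl * reaction_volume_nl
    return cell_dna_fg / contaminant_fg


# Hypothetical values chosen only to show the scaling.
CELL_DNA_FG = 5.0               # DNA from one bacterial cell, in femtograms
CONTAMINANT_FG_PER_NL = 0.001   # contaminant DNA per nanoliter of reagent

# A 50-microliter tube reaction versus a 60-nanoliter microfluidic reaction.
bench = target_to_contaminant_ratio(CELL_DNA_FG, CONTAMINANT_FG_PER_NL, 50_000)
micro = target_to_contaminant_ratio(CELL_DNA_FG, CONTAMINANT_FG_PER_NL, 60)

print(f"tube reaction ratio:         {bench:.3f}")
print(f"microfluidic reaction ratio: {micro:.3f}")
print(f"improvement factor:          {micro / bench:.1f}x")
```

Under these assumptions the improvement equals the ratio of the two volumes. Contamination from the sample or the lab environment does not shrink in the same way, which is consistent with Blainey's remark that smaller volumes are not a complete solution.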
Still, he said, single-cell data is "not quite like what one gets from isolates." It gives uneven amplification, he said, though that can be normalized. Chimeras also appear due to the amplification method. If there is a reference genome, those chimeric structures can be excluded, though that isn't an option for de novo genomes.
Instead, using an iterative 'Jackknife' assembly, in which reads from one cell are mapped to a co-assembly of reads from other cells, Blainey and his team could then identify chimeric structures in the first cell. From this approach they saw "pretty dramatic improvements," he said.
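The leave-one-out logic of such a jackknife can be sketched in a few lines. This is a toy illustration and not the Broad team's pipeline: exact substring matching stands in for read mapping, `toy_coassemble` is a hypothetical placeholder for a real assembler, and the sequences are invented.

```python
def toy_coassemble(reads):
    """Hypothetical stand-in for an assembler: treats each read as a contig."""
    return list(reads)


def looks_chimeric(read, contigs):
    """Flag a read whose two halves each match a contig but whose whole does not.

    Exact substring matching is a simplification of real read mapping.
    """
    if any(read in contig for contig in contigs):
        return False
    half = len(read) // 2
    left, right = read[:half], read[half:]
    left_hit = any(left in contig for contig in contigs)
    right_hit = any(right in contig for contig in contigs)
    return left_hit and right_hit


def flag_chimeras(cells):
    """For each cell, check its reads against a co-assembly of all other cells."""
    flagged = {}
    for name, reads in cells.items():
        other_reads = [
            r for other, rs in cells.items() if other != name for r in rs
        ]
        contigs = toy_coassemble(other_reads)
        flagged[name] = [r for r in reads if looks_chimeric(r, contigs)]
    return flagged


# Invented toy data: the second read of cell_a joins two unrelated regions.
cells = {
    "cell_a": ["CCCCGGGG", "AAAACATGCA"],
    "cell_b": ["AAAACCCCGGGGTTTT"],
    "cell_c": ["TGCATGCATGCATGCA"],
}

flagged = flag_chimeras(cells)
for cell, reads in flagged.items():
    print(cell, reads)
```

Holding out one cell at a time means the reference used to judge a cell's reads is never built from that cell's own, possibly chimeric, reads.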
Additionally, combining single-cell data with metagenomic data can help improve the single-cell dataset.
Blainey and his team have applied this approach to parts of the phylogenetic tree that are little understood. In an example he gave, they focused on a candidate phylum called OP9. They gathered samples from two locations in the Great Basin: Little Hot Creek, California, and Great Boiling Spring, Nevada.
Using the Roche 454 platform, they sequenced a number of OP9 cells and co-assembled about 15 genomes. This, Blainey said, gave them a high-coverage but fragmented view of the genome.
By combining metagenomic and single-cell data, Blainey was able to bin the OP9 genome from Great Boiling Spring. "So we were off to the races combining this with a single-cell dataset," he said.
Both OP9 genomes are nearly complete, he said, and within them, the researchers are getting a look at the biology of the phylum. For instance, he noted that they contain markers indicative of both Gram-negative and Gram-positive bacteria. In addition, they are likely polysaccharide-degrading anaerobic heterotrophs.
JGI's Woyke, who heads the Microbial Genomics Program there, has also been taking a single-cell approach to flesh out what's known about these candidate phyla. She and her colleagues took samples from nine different sites across the world, including marine, freshwater, thermal, and sediment sites, where there was likely underexplored diversity, she said.
From those sites, they isolated 9,600 single cells, which they then filtered down to 201 cells for which they obtained draft genomes. The 9,600 cells were filtered both to identify ones that met certain quality standards and to home in on cells from the candidate phyla of interest. This process, she added, took about a year as they worked to figure out the best way to assemble the genomes.
The assemblies ranged in size, with the largest contig 400 kb in length. By taking a co-assembly approach, they were able to improve assemblies and generate more complete ones as well. Completeness of the genomes also varied, she said, from about 35 percent of the genome to nearly all of it.
With these sequences, she said, they were able to resolve a number of inter- and intra-phylum relationships, as well as provide more substantive genetic data for candidate bacterial phyla such as SAR406, OP3, OP8, and others.
She said that combining metagenomic data and single-cell genomic data also helps improve sequence quality. "Now we can more accurately bin them to phyla," she said, whereas before, some bacteria could only be binned at the kingdom level.
Additionally, Woyke and her colleagues were able to uncover possible novel functions, as indicated by the presence in archaea of sigma factors, which are needed for the initiation of transcription. Previously, sigma factors had been seen only in bacteria, and TATA boxes in archaea.