By Monica Heger
This article was originally published April 2.
The Single Cell Genomics Center at Bigelow Laboratory for Ocean Science is establishing itself as the first automated, high-throughput single-cell genomics facility in the US and plans to add single-cell sequencing to its list of services this year.
The nonprofit, based in East Boothbay, Maine, was established in 2009 to offer single-cell services such as cell sorting, cell lysis, and whole-genome amplification.
Director Ramunas Stepanauskas told In Sequence that the center's expertise is in the pre- and post-sequencing steps of single-cell genomics, and that it has so far partnered with the Department of Energy's Joint Genome Institute for the sequencing portion of its own research projects.
When the center launches its single-cell sequencing service, it will contract out the sequencing portion to JGI and likely other service providers as well.
Currently, the institute offers pre-sequencing services for single cells (cell sorting, lysis, and whole-genome amplification) as well as sequencing of the 16S ribosomal RNA gene, which identifies the species a cell came from. Customers can then analyze the genomes further using their own technology, whether PCR, sequencing, or something else.
Prices for these services vary, but for one sample, which typically yields about 100 single amplified genomes and the 16S rRNA sequence, the cost is around $5,000, Stepanauskas said.
He said that there has been an increasing interest in next-gen sequencing of single genomes, so the institute decided to include that in its offerings. However, he said, Bigelow's expertise is still on the front and back ends.
"Our focus is on those parts of the workflow that are really specific to single cells," Stepanauskas told In Sequence. "The actual sequencing is a standard process and at the moment we are not planning to re-invent the wheel."
Rather, he said, Bigelow's expertise is in techniques like cell sorting, lysis, and genome amplification, as well as the assembly and analysis of single-cell genomes, which pose problems for standard assembly algorithms because of the highly uneven coverage inherent to whole-genome amplification.
Stepanauskas said that the key to Bigelow's operations is its ability to scale up — moving from processing just a few cells in test tubes to tens of thousands of cells in multiple 384-well plates.
A 2008 National Science Foundation grant enabled the center to purchase robotic liquid handling systems and analytical tools to establish a high-throughput pipeline, he said.
The institute started working primarily with marine microbes but has since worked with customers worldwide on around 40 different projects studying single-cell genomes from microbes found miles below the surface of the Earth, symbionts of insects, the gut microbiota of vertebrates, and, increasingly, human cells, said Stepanauskas.
Currently, most of its customers are academic, but Stepanauskas said commercial firms have shown increased interest, including biopharmas and biofuel developers.
"It took industry a few years to sense the field and get confidence that single-cell genomics is a reliable and useful tool," he said.
Bigelow is perhaps the only institution dedicated to single-cell genomic services. Stepanauskas said that it has streamlined and automated a lot of the sample prep process, which helps to reduce contamination and increase genome recovery.
While Bigelow uses the standard multiple displacement amplification strategy to amplify genomes from single cells, Stepanauskas said that all the reagents are first run through a decontamination process.
"We've found that there are no commercially available MDA reagents that are free from DNA contaminants," he said.
Additionally, the Bigelow researchers monitor the MDA reactions themselves to determine which cells are likely to have the highest genome recovery. Stepanauskas could not disclose details about this process because he plans to publish it, but said that essentially, monitoring the kinetics of each MDA reaction can help identify which reactions were the most efficient and will result in the greatest recovery.
Only those cells are selected for whole-genome sequencing. This approach allows the researchers to recover between 95 percent and 97 percent of most genomes, he said.
Other groups using MDA have reported a wide range of genome recovery rates, with some reporting around 90 percent but most reporting closer to 70 percent.
Furthermore, being able to process thousands of cells at once helps the Bigelow researchers choose the most interesting ones to study with whole-genome sequencing, Stepanauskas said.
For example, he said the team worked on a project with samples from the deep ocean, about 800 meters below the surface, in which it was looking for uncultivated bacteria with carbon-fixing genes that work in complete darkness.
The original sample contained thousands of cells. Then, after sorting and amplifying the genomes of all those cells, they used PCR as a first-pass screen to identify those bacterial cells with a gene encoding a key enzyme in carbon fixation that is found in all green terrestrial plants. Whole-genome sequencing allowed them to confirm the finding and study those particular bacterial species in greater detail.
Bigelow is also focused on the analysis and assembly of single-cell genomes. Standard assembly algorithms cannot be used on sequence data from single cells because "the products from whole-genome amplification of individual cells don't cover the genome evenly, and that confuses assembly algorithms," Stepanauskas said.
Following whole-genome amplification, some regions of the genome are over-amplified compared to others, so some regions may be covered by only a few reads while others are covered hundreds of times. A standard assembler, which generally expects roughly even coverage, does not handle that well.
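To make the coverage problem concrete, the short Python sketch below computes per-position read depth for a toy 20-base "genome." The alignment positions and counts are invented purely for illustration and are not data from Bigelow or JGI; the function name `coverage_depth` is likewise hypothetical.

```python
def coverage_depth(genome_length, alignments):
    """Return per-position read depth.

    alignments is a list of (start, length) pairs, one per aligned read.
    """
    depth = [0] * genome_length
    for start, length in alignments:
        # Clip reads that would run past the end of the genome.
        for pos in range(start, min(start + length, genome_length)):
            depth[pos] += 1
    return depth


# Toy data (an assumption for illustration): the first half of the genome
# is covered by 2 reads, the second half by 200 reads.
alignments = [(0, 10)] * 2 + [(10, 10)] * 200
depth = coverage_depth(20, alignments)

# Ratio between the deepest and shallowest positions.
fold_range = max(depth) / min(depth)
print(min(depth), max(depth), fold_range)
```

In this toy case the deepest region is covered 100 times more than the shallowest, which is the kind of spread that can mislead an assembler relying on coverage depth.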
Researchers at Bigelow have tried a couple of different protocols, including one developed by Pavel Pevzner's team at the University of California, San Diego, which does not use coverage depth to make assembly decisions (IS 3/13/2012), and one developed by researchers at JGI.
JGI's assembly algorithm uses k-mer frequency to computationally remove sequence reads that are over-represented in the data set, said Stepanauskas. Then, that normalized data set is run through a slightly modified standard assembler.
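The general idea of k-mer-based normalization can be sketched in a few lines of Python. This is not JGI's algorithm, whose details the article does not give; it is a minimal illustration of one common approach, in which a read is discarded when the median count of its k-mers among the reads kept so far has already reached a cutoff. The function names, the values of `k` and `cutoff`, and the toy reads are all assumptions chosen for illustration.

```python
from collections import Counter
from statistics import median


def kmers(read, k):
    """Return all overlapping substrings of length k in a read."""
    return [read[i:i + k] for i in range(len(read) - k + 1)]


def normalize_reads(reads, k=5, cutoff=3):
    """Drop reads whose k-mers are already well represented.

    A read is kept only if the median count of its k-mers, among the
    reads kept so far, is below the cutoff.
    """
    counts = Counter()
    kept = []
    for read in reads:
        read_kmers = kmers(read, k)
        if not read_kmers:
            continue  # read shorter than k
        if median(counts[km] for km in read_kmers) < cutoff:
            kept.append(read)
            counts.update(read_kmers)
    return kept


# Toy data: one over-amplified read seen 10 times, one rare read seen once.
reads = ["ACGTTGCAAG"] * 10 + ["TTGACCGGAT"]
kept = normalize_reads(reads, k=5, cutoff=3)
print(len(kept))
```

With these toy inputs, only a few copies of the over-represented read survive while the rare read is retained, so the normalized set has far more even coverage than the original. Real tools use memory-efficient k-mer counting and much larger values of `k`.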
Going forward, Stepanauskas said one of the main bottlenecks of single-cell sequencing is the cell lysis process. Scientists have made a number of improvements in areas like assembly and genome recovery from whole-genome amplification, but cell lysis is often overlooked and remains challenging.
Microbial cells are very diverse, said Stepanauskas, and some are contained in cell walls that are very easy to lyse, while other cell walls are very difficult to open without damaging the DNA.
Stepanauskas said that the technique Bigelow uses is effective on marine microorganisms and microbiota in digestive systems, but when examining samples from tougher environments like soils and alkaline hydrothermal vents, "the success rate of lysis is much lower and requires improvements."
Despite these challenges, Stepanauskas said single-cell sequencing still offers considerable advantages over metagenomic sequencing, and he thinks it will be an increasingly valuable technique.
Metagenomics is a "great tool to discover genes," he said, but it is "ineffective in figuring out how the genes fit together, and which bacteria the genes came from."
Additionally, he said, there is enormous diversity in microbes, and rarely are two cells identical. Genes move around through recombination and bacteria are constantly evolving and developing mutations.
"That variability is going to always challenge metagenomic assembly," he said. "Even if you obtain a genome from metagenomic assembly, while it may be useful in many ways, you have to keep in mind that there's probably not a single cell that's exactly like that."