Researchers at the Karolinska Institute in Sweden have developed a single-cell transcriptome sequencing method that uses molecular tags and microfluidics to enable quantitative gene expression measurements in a single cell.
The method differs from other single-cell approaches, such as Smart-seq2, in its focus: it aims to accurately count the number of transcripts per cell in order to answer questions about cell type and cell identity, rather than to sequence full-length transcripts in search of SNPs or splice site variations, senior author Sten Linnarsson, an assistant professor at the Karolinska Institute, told In Sequence.
The technique, which was published last month in Nature Methods, builds on previous work the researchers published in 2011 in Genome Research. The new method makes use of Fluidigm's C1 system for sample prep and unique molecular identifiers (UMIs) to directly count the number of transcripts.
According to Linnarsson, there are two main challenges in single-cell transcriptome analysis. The first is capture efficiency — how many RNA molecules are converted to cDNA. The other challenge is amplification bias.
The use of UMIs "allows you to eliminate the amplification bias," Linnarsson said. A short random sequence is attached to each individual cDNA molecule. For instance, if a specific gene in a single cell yields 10 cDNA copies, once those molecules are amplified, it is impossible to tell that the cell started with 10 molecules, he explained. "But, if you attach a short random sequence to all those molecules, they'll be different." So, if amplification results in, say, 10 million molecules, there will still be only 10 different tags. "Instead of counting reads, you count the number of different tags, so you can see the 10 original molecules in the data."
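The counting idea Linnarsson describes can be illustrated with a minimal Python sketch. It is not the researchers' pipeline: the function name `count_molecules`, the gene label, and the tag sequences are hypothetical, and the sketch assumes that no two original molecules of a gene receive the same tag and that tags are read without sequencing errors.

```python
from collections import defaultdict

def count_molecules(reads):
    """Count distinct UMIs per gene from (gene, umi) read pairs.

    Assumes each original cDNA molecule carries a unique tag and that
    tags are read without error (both are simplifying assumptions).
    """
    umis_per_gene = defaultdict(set)
    for gene, umi in reads:
        umis_per_gene[gene].add(umi)
    return {gene: len(umis) for gene, umis in umis_per_gene.items()}

# Hypothetical example: 10 original molecules of one gene, each with a
# distinct 5-base tag, amplified unevenly to mimic amplification bias.
original_umis = ["AACGT", "ACGTA", "CGTAA", "GTAAC", "TAACG",
                 "TTGCA", "TGCAT", "GCATT", "CATTG", "ATTGC"]
reads = []
for i, umi in enumerate(original_umis):
    # Molecule i is copied (i + 1) * 100 times.
    reads.extend([("GeneA", umi)] * (i + 1) * 100)

read_count = len(reads)                   # what read counting would report
molecule_counts = count_molecules(reads)  # what tag counting reports
print(read_count, molecule_counts)
```

However unevenly the copies are amplified, the number of distinct tags per gene stays at the number of starting molecules, which is the property the quote describes.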
While the use of UMIs reduced amplification bias, the use of microfluidics and optimized reagents in the sample prep steps helped to increase capture efficiency, Linnarsson said. Previous single-cell RNA-seq protocols have been able to convert only between 5 percent and 10 percent of the RNA molecules in a cell to cDNA, but Linnarsson's team was able to achieve a capture efficiency of 48 percent.
"It's hard to say exactly how [the microfluidics] improves capture efficiency," Linnarsson said. He attributes the improvements to a number of features. First, the entire procedure is done in a closed chamber. Second, the reaction volume is very small, around 200 times smaller than if the procedure is done in an Eppendorf tube, he said, which helps prevent side reactions.
Other researchers working on single-cell sequencing techniques have observed the same phenomenon: reducing the volume in which reactions take place results in less bias.
Aside from the microfluidic system, the researchers attributed the improvement in capture efficiency to the use of optimized reagents, most notably a template-switching oligo design.
In the Nature Methods study, the team tested the protocol on 41 mouse embryonic stem cells. Cells were captured in a microfluidic chamber of the C1 system and cDNA synthesis was performed using a template-switching oligo carrying a 5-base UMI.
In order to measure the efficiency of cDNA synthesis, the researchers added a known number of control RNA molecules to each well. Counting the resulting number of cDNA molecules, the researchers found efficiency was around 48 percent.
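The spike-in calculation amounts to a simple ratio, sketched below in Python. The function name `capture_efficiency` and the input numbers are hypothetical; the article does not state how many control molecules were added per well, so the figures are chosen only to reproduce the reported value of roughly 48 percent.

```python
def capture_efficiency(observed_cdna, spiked_in_rna):
    """Fraction of control RNA molecules recovered as cDNA molecules."""
    if spiked_in_rna <= 0:
        raise ValueError("spiked_in_rna must be positive")
    return observed_cdna / spiked_in_rna

# Hypothetical well: 1,000 control RNA molecules added, 480 distinct
# cDNA molecules counted afterwards.
efficiency = capture_efficiency(observed_cdna=480, spiked_in_rna=1000)
print(f"{efficiency:.0%}")
```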
After cDNA synthesis, the molecules were amplified and adapters were added through a tagmentation reaction. They were then sequenced on the Illumina HiSeq 2000. Using the C1 system enabled cDNA molecules to be synthesized and amplified from 96 single cells simultaneously, and then the cDNA molecules from all 96 cells could be sequenced together on one run.
Sequencing depth averaged around 10 reads per cDNA molecule and total mRNA counts were just over 200,000 molecules per cell.
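As a back-of-the-envelope check, the reported averages imply the following read budget. The Python sketch below uses only the figures quoted in this article and assumes, for illustration, that all 96 capture sites yield a cell sequenced at the average depth; the constant names are hypothetical.

```python
READS_PER_MOLECULE = 10        # average depth reported above
MOLECULES_PER_CELL = 200_000   # "just over 200,000", so a lower bound
CELLS_PER_RUN = 96             # cells processed together (assumes all sites filled)

reads_per_cell = READS_PER_MOLECULE * MOLECULES_PER_CELL
reads_per_run = reads_per_cell * CELLS_PER_RUN

print(f"{reads_per_cell:,} reads per cell")
print(f"{reads_per_run:,} reads per 96-cell run")
```

Under these assumptions, each cell needs about 2 million reads and a full 96-cell run about 192 million.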
The researchers estimated that the cost per cell is around $22, including reagents, the microfluidic chip, and sequencing costs. Linnarsson said the team kept costs low by making its own reagents, and that sequencing accounted for the majority of the cost.
Kun Zhang, an associate professor of bioengineering at the University of California San Diego, who has also been developing single-cell sequencing methods but was not involved in the Karolinska study, told IS that the study's authors were able to reduce noise to a very low level, "to a point where the distribution almost matched the theoretical distribution, and that is pretty amazing."
Additionally, he said the recovery of nearly half of a cell's total RNA molecules makes it the "most sensitive method I've seen so far." Methods typically recover around 10 percent of mRNA molecules, Zhang said. He attributed this improvement to the template-switching oligo, which was RNA-based rather than the more commonly used DNA-based oligo. Additionally, he said, "implementing the protocol in Fluidigm's device may have contributed [to the improved capture efficiency] because the reaction volume is smaller."
He said the method could be useful for studying stem cells, cancer cells, or any cell "where you want to obtain an accurate measure of gene expression." The only limitation of the method is that, because it sequences only the ends of the transcript and not its full length, it will not pick up isoforms. "That's both an advantage and disadvantage," he said. "The disadvantage is that you can't look at isoforms, but the advantage is, that since you don't look at isoforms, you get one count per molecule, so the counting is more accurate."
Another group from the Karolinska Institute has developed Smart-seq2, a single-cell RNA-seq protocol that analyzes full-length transcripts. Linnarsson said that the two approaches have complementary benefits. While his counting method will give a very sensitive measurement of gene expression from a single cell, Smart-seq2 is better suited for analyzing SNPs or splice site variations, he said.
Linnarsson said his group is interested in using the method as a "tool to investigate complex tissue like the nervous system to address the questions of cell type and cell identity." His team is now aiming to sequence 10,000 single cells from the nervous system. Longer term, he is interested in scaling up the method to look at 100,000 or 1 million cells. To do this, he said he would need to further miniaturize the sample prep process.