This article was originally published Aug. 18.
NEW YORK (GenomeWeb) – Researchers at the University of California, San Francisco have developed a single-cell low-coverage transcriptome sequencing method that can classify neurons into subtypes and identify biomarkers.
The technique, which the UCSF team published in Nature Biotechnology earlier this month, combines Fluidigm's C1 Single-Cell Prep System with Clontech's SMARTer Ultra Low RNA kit and Illumina's Nextera XT kit to prepare sequencing libraries.
The long-term goal of the research is to understand cellular diversity in the neocortex, co-lead author Alex Pollen, a postdoctoral scholar in UCSF's department of regeneration medicine, told In Sequence. "There is an astonishing diversity of neurons in the brain," he said, even in very early stages of development when cells appear to be the same. Single-cell sequencing is ideal for picking apart the diversity, but can quickly become prohibitively expensive, he said. So the team wanted to establish the lower limit of coverage necessary to still distinguish closely related cells.
The C1 system was important for making the single-cell method automated and higher throughput, Pollen said, as it enabled automated capture of up to 96 cells per run on a microfluidic chip. Not only does that automate and standardize the procedure, he said, but the reactions that would normally be performed in a tube, including lysis, reverse transcription, and initial amplification, are now performed in nanoliter volumes directly on the chip. "It's thought that these small volumes maximize the effective concentration of reactions, which may maximize the sampling accuracy, which is very important when you're starting with such a small quantity of nucleic acids," he said.
In addition, the C1 system helps increase throughput, Tomasz Nowakowski, co-lead author and postdoctoral scholar in UCSF's department of regeneration medicine, told IS. Previous techniques for single-cell transcriptome analysis involved using microcapillary tubes, which is not very efficient, he said. But single-cell sequencing using the C1 system for sample prep enables hundreds of cells to be analyzed within a reasonable time frame, he said. The technique "really revolutionizes the way we can analyze cells in the developing brain," he said.
The researchers tested their strategy on 301 single cells from 11 populations using the C1 system. They first sequenced the transcriptomes of each cell to high coverage on the Illumina HiSeq, generating approximately 89 million reads per cell, which they used as a reference. Next, they pooled dozens of cells and sequenced the transcriptomes to lower coverage on the MiSeq, generating approximately 2.7 million reads per cell.
Across the 301 cells, which came from a range of different sources, they found that expression levels from the low-coverage RNA-seq data correlated well with those from the high-coverage data, with an average correlation of about 0.91. However, for low-abundance transcripts, the correlation dropped to just 0.25.
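A comparison of this kind can be sketched in a few lines of Python. The example below is only an illustration, not the authors' pipeline: it assumes per-gene read counts for one cell at two depths, simulates them from a made-up abundance profile, and takes the Pearson correlation of log-scaled counts per million. The function names, the normalization, and every number in it are assumptions chosen for the sketch.

```python
import numpy as np

def log_cpm(counts):
    """Convert raw read counts to log2(counts-per-million + 1)."""
    counts = np.asarray(counts, dtype=float)
    total = counts.sum()
    if total == 0:
        raise ValueError("cell has no reads")
    return np.log2(counts / total * 1e6 + 1.0)

def coverage_correlation(high_counts, low_counts):
    """Pearson correlation between log-scaled expression at two depths."""
    return float(np.corrcoef(log_cpm(high_counts), log_cpm(low_counts))[0, 1])

# Hypothetical cell: simulated relative abundances, not data from the study.
rng = np.random.default_rng(0)
true_fraction = rng.dirichlet(np.full(2000, 0.3))  # 2,000 genes, skewed abundances
high = rng.multinomial(5_000_000, true_fraction)   # deep library
low = rng.multinomial(50_000, true_fraction)       # shallow library

r_all = coverage_correlation(high, low)
print(f"correlation across all genes: {r_all:.2f}")
```

Restricting the same calculation to genes with few reads in the deep library would be one way to look at low-abundance transcripts separately, where shallow sampling is noisiest.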
The team then compared transcriptomes from four different cell types that would be expected to show major differences in gene expression: pluripotent, skin, blood, and neural. They evaluated the low-coverage transcriptomes from those cells using principal component analysis to identify the genes that explain variation between the cells. The analysis was able to separate the cells into their correct source populations and also identify genes that reflect the biological properties of the cells. When the researchers repeated the analysis on the high-coverage data, the results and the genes identified were similar. About 78 percent of the top 500 genes identified were the same between the low- and high-coverage data sets.
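The general idea of ranking genes by their contribution to the leading principal components, then checking how many top genes two data sets share, might look like the following Python sketch. It is a simplified stand-in under stated assumptions: the input is a cells-by-genes matrix of log-scaled expression, genes are scored by their largest absolute loading on the first few components, and the helper names (top_pca_genes, overlap_fraction) are hypothetical. The article does not describe how the study ranked genes, so this should not be read as the published method.

```python
import numpy as np

def top_pca_genes(expr, n_components=2, n_top=500):
    """Indices of the genes with the largest absolute loadings on the leading PCs.

    expr: cells x genes matrix of log-scaled expression values.
    """
    centered = expr - expr.mean(axis=0)
    # Rows of vt are the principal-component loadings over genes.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    score = np.abs(vt[:n_components]).max(axis=0)
    ranked = np.argsort(score)[::-1]
    return set(ranked[:n_top].tolist())

def overlap_fraction(genes_a, genes_b):
    """Fraction of genes in genes_a that also appear in genes_b."""
    return len(genes_a & genes_b) / len(genes_a)

# Toy data: 6 cells, 50 genes; genes 0-9 separate two groups of cells.
rng = np.random.default_rng(0)
expr = rng.normal(0.0, 0.01, size=(6, 50))
expr[:3, :10] += 5.0
noisier = expr + rng.normal(0.0, 0.01, size=expr.shape)  # stand-in for a second depth

top_a = top_pca_genes(expr, n_components=1, n_top=10)
top_b = top_pca_genes(noisier, n_components=1, n_top=10)
print(overlap_fraction(top_a, top_b))
```

On real data, the two matrices would hold the same cells sequenced at low and high coverage, and the overlap would be computed for the top 500 genes.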
Next they looked deeper at neural cells to see whether low-coverage transcriptome sequencing could distinguish between closely related cells in a heterogeneous population.
Specifically, they looked at cells from the germinal zone of a human cortex at gestational week 16 to capture radial glia and newly generated cortical neurons. They also looked at cells further along in development as well as pluripotent cells.
Low-coverage and high-coverage transcriptomes matched, and when the researchers down-sampled the low-coverage sequence data further, they found that they could group cells accurately with 5,000 to 50,000 reads per cell. The team was able to identify highly expressed genes that correlated with cell type. For instance, the neural progenitor cells had high expression of genes associated with proliferation, while the cells furthest along in development had high expression of genes associated with neuronal maturation.
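Down-sampling of this sort amounts to drawing a fixed number of reads from a cell's library without replacement. The Python sketch below shows one way to do that on a vector of per-gene counts using NumPy's multivariate hypergeometric sampler. The library is simulated, the depths are picked to echo the range mentioned above, and the function name downsample_counts is an assumption for illustration rather than anything taken from the study.

```python
import numpy as np

def downsample_counts(counts, n_reads, rng):
    """Draw n_reads reads without replacement from a per-gene count vector."""
    counts = np.asarray(counts, dtype=np.int64)
    if n_reads > counts.sum():
        raise ValueError("cannot draw more reads than the library contains")
    return rng.multivariate_hypergeometric(counts, n_reads)

# Hypothetical single-cell library: 1 million reads over 2,000 genes.
rng = np.random.default_rng(1)
library = rng.multinomial(1_000_000, rng.dirichlet(np.full(2000, 0.3)))

depths = [50_000, 10_000, 5_000]
subsampled = {d: downsample_counts(library, d, rng) for d in depths}
genes_detected = {d: int((c > 0).sum()) for d, c in subsampled.items()}

for d in depths:
    print(f"{d:>6} reads: {genes_detected[d]} genes with at least one read")
```

Each down-sampled profile could then be fed to the same clustering or principal component analysis as the full data to see at what depth the grouping of cells starts to break down.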
The single-cell transcriptome sequencing could also be used to further subdivide the broad groups of cells into known subtypes and potentially new subtypes, the researchers reported.
For instance, some newborn neural cells expressed markers of inhibitory interneurons, such as GAD1 and DLX genes, as well as previously unreported markers such as PDZRN3, whereas the rest of the cells from that group expressed the proneural genes NEUROD1 and NEUROD6. In addition, some newborn neural cells expressed UNC5D, a gene that is required for the earliest phases of migration, and other genes such as ROBO2 and NTM, whose roles in newborn cortical neurons are unknown.
"In some ways, this is a conceptual shift from trying to understand the molecular identity of a sample, like with bulk tissue RNA-seq, to using single cell surveys to try to understand the molecular identity of distinct populations of cells," Pollen said. "Going forward, we want to apply this towards examining the cell diversity during specification of diverse lineages in the developing brain."
The team determined that approximately 10,000 reads per cell could sufficiently sort cells into broad categories from a heterogeneous population, but that around 50,000 reads per cell could help further distinguish similar cells and help define the genes responsible for variation among cells of the same type.
"Subtle distinctions could be determined at 50,000 reads per cell," Pollen said. "This read depth also enabled the discovery of numerous biomarkers."
Pollen said that the team's next step is to use barcoding to pool and sequence many more cells at once. "Eventually, we want to do a large-scale survey that covers many cells." The study anticipates the utility of future cell capture and barcoding strategies that retain cell-of-origin information, Pollen said. "If you can capture and tag those cells, we'll learn something new, even with low sequencing depths."
Nowakowski added that although many of the cell populations the team is studying have been described previously, very little is known about specific genes and their contribution to normal development and function.
"We're excited to follow up in future studies on some of those biomarkers and trying to understand what kind of signaling pathways may be going on in some of these populations, and at the molecular level what the differences might be between them and what that means for brain development," he said.