SAN FRANCISCO (GenomeWeb) – A research team from Stanford University has developed an in situ RNA sequencing technology that can be used to study gene expression while maintaining 3D spatial information at single-cell resolution.
The technology, dubbed STARmap for spatially resolved transcript amplification readout mapping, makes use of hydrogel-tissue chemistry, targeted signal amplification, sequencing by ligation, and an error correction method, and the researchers demonstrated that it could be used to map more than 1,000 genes simultaneously in the mouse brain.
The Stanford researchers described the technology in a paper published in June in Science, and have filed a patent application on the method, but their commercial plans are unclear as they did not respond to multiple requests for comment.
Jehyuk Lee, an assistant professor at Cold Spring Harbor Laboratory, who was not affiliated with the study but who previously developed the in situ RNA sequencing technology known as FISSEQ, said that the Stanford researchers were able to "overcome multiple technical challenges." The paper is "the first that shows ... that you can define cell types and activities and cell states with single-cell resolution in the brain," he added.
The study includes several "different types of innovations," Lee said. In his opinion, the two key innovations were the group's error correction-mediated, barcode-based sequencing-by-ligation strategy, which they dubbed SEDAL, as well as the creation of a library of cDNA probes hybridized with cellular RNAs, a strategy termed SNAIL for specific amplification of nucleic acids via intramolecular ligation.
SNAIL uses a pair of probes, a primer and a padlock probe, that must both hybridize to the same RNA molecule before the padlock probe can be circularized and amplified by rolling circle amplification. That generates a DNA nanoball that has multiple copies of the cDNA probes. The researchers first designed a library of 1,024 five-base barcodes and built one barcode into each padlock probe to serve as a gene identifier.
They then developed their own sequencing-by-ligation approach, SEDAL, which includes an error correction mechanism, to sequence just the five-base barcodes. The SEDAL strategy uses two different short degenerate probes: reading probes to decode bases and fluorescence probes to convert decoded sequence information into fluorescence signals.
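As a rough illustration of the barcode arithmetic described above, a library of five-base barcodes drawn from the four DNA bases has 4^5 = 1,024 members. The Python sketch below enumerates such a library and assigns barcodes to genes. The function names and gene names are hypothetical, and the actual probe design in the paper may differ.

```python
from itertools import product

BASES = "ACGT"  # four-letter DNA alphabet


def make_barcode_library(length=5):
    """Return every possible barcode of the given length, in lexicographic order."""
    return ["".join(p) for p in product(BASES, repeat=length)]


def assign_barcodes(genes, library):
    """Map each gene to a unique barcode (illustrative assignment only)."""
    if len(genes) > len(library):
        raise ValueError("more genes than available barcodes")
    return dict(zip(genes, library))


library = make_barcode_library(5)

# Hypothetical gene names, used only for illustration.
gene_to_barcode = assign_barcodes(["geneA", "geneB", "geneC"], library)

# Reverse lookup: a decoded barcode identifies the gene it was assigned to.
barcode_to_gene = {barcode: gene for gene, barcode in gene_to_barcode.items()}
```

Reading back a decoded five-base barcode is then a dictionary lookup, which is why only the barcode, and not the transcript itself, needs to be sequenced in this kind of scheme.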
Lee said that the team's SEDAL technique was a "clever work around limitations" posed by sequencing by ligation or synthesis. Previously developed commercial sequencing-by-ligation techniques, such as the former SOLiD technology and Complete Genomics' technology, either had high error rates or high background fluorescence. But the advantage of sequencing by ligation is that it can be performed at room temperature, while sequencing-by-synthesis technologies such as Illumina's require heat, which would have made it challenging to run directly on a microscope as the team did in this study, Lee said.
The team first tested their strategy on a panel of highly expressed genes in the mouse brain and found that the error rate was around 1.8 percent. Next, they looked at expression in a set of 160 genes in the mouse brain using five-base barcoded SNAIL probes over six rounds of SEDAL sequencing.
They noted that the gene expression pattern between replicates was consistent, and that for known neurons, the spatial distribution was as expected. The 160 genes included 112 genes that serve as markers of cell types as well as 48 activity-regulated genes.
The researchers used the 112 cell-type markers to classify cells. They pooled more than 3,000 cells from the replicates and found three clusters of major cell types: excitatory neurons, inhibitory neurons, and non-neuronal cells.
Next, they assessed differential gene expression across experimental conditions using the 48 activity-regulated genes. To do this, they flash froze brain tissue from mice that had been kept in the dark for four days and from mice that had been given one hour of light afterward. As expected, they found differences in gene expression in the primary visual cortex from mice that were exposed to light, compared to those that were not.
To test the scalability of the technique, the researchers increased the gene list from 160 to 1,020 genes. Analyzing the mouse neocortex, they found a higher initial rate of color misassignment, which led to around 40 percent of the amplicons being filtered out by the error-correction method. Even so, they were able to cluster the single cells into 16 cell subtypes, which included the 13 originally identified with the 160-gene panel plus three additional subtypes.
The team concluded that although it would be possible to "encode and decode more than one million codes," the 1,020-gene set was near the upper limit for optical resolution. In order to analyze a larger gene set or whole transcriptome, it would be necessary to enhance optical resolution, potentially by using "super-resolution microscopy," the authors wrote.
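The capacity figures can be sanity-checked with simple arithmetic, assuming (the article does not state this) that codes are drawn from the four DNA bases, so that a barcode of n bases yields 4^n codes. Under that assumption, the hypothetical helper below finds the shortest barcode that covers a target number of codes.

```python
def min_barcode_length(n_codes, alphabet_size=4):
    """Smallest barcode length whose code space holds at least n_codes entries.

    Assumes every position can take any of alphabet_size symbols; this is an
    illustrative assumption, not a description of the authors' encoding.
    """
    length = 0
    capacity = 1  # number of distinct codes available at the current length
    while capacity < n_codes:
        capacity *= alphabet_size
        length += 1
    return length


bases_for_1020_genes = min_barcode_length(1020)
bases_for_million_codes = min_barcode_length(1_000_001)
```

Under this assumption, the 1,020-gene panel fits within five bases and more than one million codes would need ten, which is consistent with the authors' point that the practical ceiling comes from optical resolution and not from the size of the code space.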
Finally, the researchers applied their technique to larger 3D tissue blocks, analyzing more than 30,000 cells across six layers of the mouse brain. For this experiment, the team focused on 23 cell-type marker genes and five activity-regulated genes. They were able to identify 11 cell types that corresponded to cell types identified in their previous experiments and were able to get spatial information for those cells within the 3D tissue block.
Compared to other forms of in situ techniques, like single-molecule FISH, Lee said that this method should be significantly cheaper, because it does not require the high-resolution imaging that is necessary for smFISH.
NanoString Technologies is also developing a technology that will analyze gene expression and quantify proteins while preserving spatial information. The firm is developing the technology, called Digital Spatial Profiling, to be read out using either its nCounter system or next-generation sequencing.
There are also differences from previously developed in situ RNA sequencing techniques, like FISSEQ, the technique that Lee originally developed and that is being commercialized by startup ReadCoor. FISSEQ, similar to the Stanford technique, also generates DNA nanoballs, but in that technique, up to around 35 bases are sequenced, as opposed to just the five bases that the Stanford team sequences.
But because of the error correction method that the Stanford team developed, "their sensitivity is higher than previous in situ sequencing techniques, which don't have the sensitivity to get to single-cell resolution," Lee said.
Lee said he thinks one major application of the technique will be to validate cell-type signatures and cell positions from RNA-seq experiments, including single-cell RNA-seq experiments. "This is going to be a really key technology," he said.
Lee anticipated that there would be a number of other publications coming out in the next several months describing different in situ RNA sequencing technologies. "But, this is the first that shows that for the brain, you can define cell types, activities, and cell states with single-cell resolution," he said. "When you do single-cell RNA-seq, you don't know where the different cells are in the brain, and whether they have a slightly different function depending on where they are, but with this you can compare similar cell types and understand whether there are cell-state or activity differences," he said. That ability would be key for large-scale studies like the Allen Brain Atlas, a 10-year project to characterize cell types and understand neural connections in the mouse brain, he added.
Going forward, Lee said he would like to see the researchers demonstrate that the technique can work on other tissue types. Tissues like the pancreas or kidney may be more technically challenging because they are denser than the brain. The rolling circle amplification that the Stanford team used tends to work best in brain tissue, Lee said, because other tissues are more densely packed with proteins. "When you fix them, they become very tightly crosslinked," he said. The Stanford team's technique may address the problem of protein crowding, he said, in that it includes some clearing of those proteins that are not bound to the probes, but that would have to be demonstrated in a different tissue type.
"Once that's solved, it could be applicable to many biological questions," he said.