NEW YORK – Researchers from Washington University in St. Louis and their collaborators have developed a high-throughput reporter gene assay that can detect cell type-specific gene regulation at the single-cell level in live tissues.
Described in a Nature Genetics study published last month, the new method, named single-cell massively parallel reporter assay (scMPRA), combines single-cell RNA sequencing with a massively parallel reporter assay (MPRA), enabling researchers to measure the activity of cis-regulatory sequences (CRSs) across multiple cell types in parallel.
About 98 percent of the human genome does not code for proteins, and most genetic variants associated with diseases are mapped to the noncoding region of the genome, according to Barak Cohen, a professor of genetics at Washington University in St. Louis and the corresponding author of the study.
“The leading hypothesis at the moment is that the majority of the disease-causing heritable variation in the noncoding [region] is exerting its effects through altering cis-regulatory elements, [which include] enhancers, promoters, silencers, and insulators,” said Cohen. Additionally, scientists believe that these pathogenic genetic variants have cell type-specific effects, he said.
To interrogate cis-regulatory sequences and their associated variants en masse, researchers have been using MPRAs. By incorporating unique DNA barcodes in different candidate reporter genes, these assays allow scientists to measure the activity of target CRSs at a large scale.
“MPRA is just a way to do lots and lots of reporter gene assays in parallel,” Cohen said. “It leverages the fact that any assay that you can turn into a sequencing assay immediately becomes high throughput and scalable.”
However, because MPRA is generally performed in monocultures or as a bulk assay, Cohen pointed out, its readout usually only reflects the average CRS activity across all the cell types, omitting CRS information at the single-cell level.
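A toy calculation shows how a bulk readout can hide cell type-specific activity. This is only a sketch: the cell-type labels, activity values, and tissue fractions are invented for illustration and do not come from the study.

```python
# Hypothetical activity of one CRS in each cell type (arbitrary units)
# and the fraction of the tissue each cell type makes up.
# All values are invented for illustration.
activity = {"rod": 8.0, "bipolar": 1.0, "glia": 0.0}
fraction = {"rod": 0.25, "bipolar": 0.25, "glia": 0.5}


def bulk_readout(activity, fraction):
    """Abundance-weighted mean activity, roughly what a bulk assay reports."""
    return sum(activity[cell] * fraction[cell] for cell in activity)


print(bulk_readout(activity, fraction))
```

In this made-up case, the single bulk value sits between the cell-type values and says nothing about the CRS being strongly active in one cell type and silent in another.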
To address this limitation, Cohen and his collaborators developed scMPRA, which combines MPRA with single-cell transcriptome profiling, allowing researchers to simultaneously measure reporter gene activity in single cells and determine the identity of each cell.
According to Cohen, the core of scMPRA is a two-level barcoding strategy that can measure the copy number of all reporter genes present in a single cell using mRNA readouts.
For this, a specific barcode is deployed to report the identity of the CRS, while a second, random barcode serves as a proxy for the copy number of the reporter genes in the cell. Importantly, because the pool of random barcodes is large, the probability of two identical barcodes ending up in the same cell is “vanishingly small,” the authors noted.
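The two-level barcoding idea can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' pipeline: the input tuple layout and both function names are hypothetical, activity is assumed to be the number of distinct transcripts (UMIs) divided by the number of distinct random barcodes seen for a CRS in a cell, and random barcodes are assumed to be drawn uniformly from the pool. Because the count comes from mRNA, only plasmids that produce at least one transcript are counted here.

```python
from collections import defaultdict


def per_plasmid_activity(reads):
    """Estimate per-plasmid activity for each (cell, CRS) pair.

    reads: iterable of (cell_id, crs_barcode, random_barcode, umi) tuples
    (a hypothetical layout). Returns a dict mapping (cell_id, crs_barcode)
    to distinct UMIs divided by distinct random barcodes.
    """
    umis = defaultdict(set)      # distinct transcripts per (cell, CRS)
    plasmids = defaultdict(set)  # distinct random barcodes per (cell, CRS)
    for cell, crs, rbc, umi in reads:
        umis[(cell, crs)].add((rbc, umi))
        plasmids[(cell, crs)].add(rbc)
    return {key: len(umis[key]) / len(plasmids[key]) for key in umis}


def collision_probability(n_plasmids, pool_size):
    """Birthday-problem chance that at least two of n_plasmids share a
    random barcode, assuming uniform draws from pool_size barcodes."""
    p_unique = 1.0
    for i in range(n_plasmids):
        p_unique *= (pool_size - i) / pool_size
    return 1.0 - p_unique


# Invented example reads; the last tuple is a PCR duplicate and is ignored.
reads = [
    ("cell1", "CRS_A", "r1", "u1"),
    ("cell1", "CRS_A", "r1", "u2"),
    ("cell1", "CRS_A", "r2", "u3"),
    ("cell1", "CRS_B", "r9", "u4"),
    ("cell1", "CRS_A", "r1", "u1"),
]
result = per_plasmid_activity(reads)
print(result)
print(collision_probability(100, 10**9))
```

The second function shows why a large barcode pool matters: with a fixed number of plasmids per cell, the collision probability shrinks as the pool grows. The pool size used in the example is arbitrary, not a figure from the study.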
For their study, the researchers tested scMPRA on a library of promoter variants in intact retinas. Specifically, the team created a library of sequence variants in the Gnb3 promoter and introduced the library into live mouse retina tissue.
Additionally, to help distinguish whether a cell harbors a silent plasmid or contains no plasmid after transfection, the researchers added a cassette to the Gnb3 promoter library that allows detection of plasmids carrying silent promoter variants. scMPRA delivered reproducible measurements for each promoter variant in rod cells, bipolar cells, glial cells, and interneurons.
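The logic of separating silent plasmids from absent ones can be sketched as a simple decision rule. This is an assumption-laden illustration: the article does not say how the cassette is read out, so the sketch assumes it yields a transcript count that does not depend on the promoter variant. The function name, inputs, and labels are all hypothetical.

```python
def classify_cell(reporter_umis, cassette_umis):
    """Label a cell's plasmid status from two hypothetical UMI counts.

    reporter_umis: transcripts driven by the promoter variant.
    cassette_umis: transcripts from the plasmid-detection cassette
    (assumed here to be independent of the promoter variant).
    """
    if reporter_umis > 0:
        return "active plasmid"
    if cassette_umis > 0:
        return "silent plasmid"
    return "no plasmid detected"


print(classify_cell(reporter_umis=0, cassette_umis=3))
```

Without the second signal, the last two cases would be indistinguishable, since both yield zero reporter transcripts.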
“Since the beginning of massively parallel reporter assays, people have been waiting for a single-cell option,” said Max Staller, a researcher at the University of California, Berkeley, who previously worked in Cohen’s lab as a postdoctoral researcher but was not involved in this study. “What is even cooler is that [scMPRA] works in tissues; that's what makes it so transformative,” he added.
Calling the dual barcoding strategy used in scMPRA “a very clever solution,” Staller said the new method helped solve multiple bottlenecks for the field. For a start, he said, the method extended MPRA to the single-cell level, allowing researchers to explore cell type-specific CRSs and their variants.
In addition, “when you transfect plasmids, you frequently get dozens or hundreds of plasmids per cell, and that makes it very difficult to measure the activity of each regulatory element,” Staller pointed out. The barcoding scheme used in scMPRA allows researchers to effectively determine how many plasmids are in each cell after transfection, he said.
Moreover, he applauded the method’s ability to distinguish between cells carrying silent plasmids and those without a plasmid.
As such, Staller, whose lab studies how regulatory DNA changes during cell differentiation, thinks this technology will offer researchers a tool to better understand how cis-regulatory elements get turned on and off as cells differentiate in the tissues of an animal. “We're super excited about this technology,” he said. “I can't wait to use it in the lab. I think it's going to be really transformative for developmental biology.”
Despite the method’s promise, one current limitation of scMPRA is throughput, Cohen pointed out. While bulk MPRAs can typically process upward of tens of thousands of reporter genes at once, he said, scMPRA’s current throughput is “much less,” on the scale of hundreds. As such, “getting the throughput up is the major challenge going forward,” he said, adding that as the throughput of single-cell sequencing technologies increases, the method’s capacity will likely increase as well.
While the researchers primarily carried out scMPRA using 10x Genomics’ single-cell sequencing platform in this study, Cohen said the method is platform-agnostic and can work with other scRNA-seq technologies.
As for cost, Cohen said scMPRA is largely comparable to standard single-cell sequencing experiments. “If you can afford a 10x experiment, you can afford this,” he said.
Moving forward, his team is hoping to continue increasing the throughput of scMPRA, Cohen said. In addition, as a typical genome-wide association study signal encompasses hundreds to thousands of genetic variants, there is a need to assess how collections of genetic variants affect gene expression.
Although the collaborators in this study have filed a patent application pertaining to scMPRA, Cohen said his team currently does not have any plans to commercialize the technology. However, “we would be happy to license it to anybody who wants to license it,” he said.
Ultimately, Cohen said the hope is to generate enough cell type-specific data with scMPRA so that scientists can develop a machine learning algorithm to help recognize and study CRSs without the need for reporter assays.
“The pipe dream is that, with scMPRA, you could produce enough training data that you would be able to train a neural network to recognize enhancers,” he said.