Date: 2024-05-23

This story has been updated with more complete information about NIH funding.

NEW YORK – Researchers from Carnegie Mellon University and the University of Washington have developed a single-cell assay that adds information on the 3D structure of the genome to gene expression data from the same cell.

Led by Carnegie Mellon's Jian Ma and Zhijun Duan of UW and the Fred Hutchinson Cancer Center, the team devised GAGE-seq, or genome architecture and gene expression by sequencing, a coassay that combines the chemistries for chromatin conformation capture (Hi-C) and split-pool single-cell RNA-sequencing. "It allows us to directly link genome topology to gene expression," Duan said in an email. A description of GAGE-seq appeared last week in Nature Genetics.

"There has been a lot of debate in the literature — some explicit, some implicit — about how functionally important 3D genome organization is," said William Noble, a researcher at UW who is collaborating with Ma and Duan in the 4D Nucleome Consortium, but who was not involved in the paper. Some think that the association between topological domains and gene expression is "overplayed," he said, "but there's also a lot of evidence that, in some cases, local chromatin architecture really impacts how genes get turned on and off."

Nobody has yet been able to address these kinds of questions, Noble said, "because integrating [Hi-C] with single-cell gene expression data is really challenging."

Early data from the new method do not appear to be able to settle the debate, though. "Indeed, GAGE-seq told us that although structure-based and transcriptome-based cell types are in general highly congruent with each other, sometimes there are discrepancies between the two, due to the fact that the dynamics of the two is not always on the same page," Duan said.

The method can profile thousands of single cells, and the gene expression data serve as a pivot point that makes the Hi-C data easier to use. "The single-cell Hi-C space has lagged behind in some sense because interoperability of data is challenging," said Andrew Adey, a single-cell research expert at Oregon Health and Science University. "That's where these coassays are valuable, particularly if they root their analysis in RNA." He noted that the GAGE-seq authors used their RNA data to further integrate their results with spatial transcriptomics data. "That gives you additional context to your Hi-C component and is a powerful way to bridge your data and improve the biological interpretation," he said.

The GAGE-seq project was supported in part by the 4D Nucleome Program, a National Institutes of Health initiative to look at genome structure and dynamics in time and space. Ma is the principal investigator on a five-year, approximately $10 million grant issued in 2020 from the NIH Common Fund.

In the paper, Duan disclosed that he is the inventor on a provisional patent application covering the GAGE-seq protocol. Whether GAGE-seq might be commercialized is unclear, though. "It's up to the University of Washington," Duan said. "Personally, I don't have such an interest [or] plan right now."

Noble noted that the assay appears to make it easier to do single-cell Hi-C in a high-throughput manner. "If you also get gene expression data, that makes the appeal of generating single-cell Hi-C data much higher," he added.

The method combines Hi-C, which crosslinks chromatin and ligates DNA to produce pairwise proximity points between genomic regions, and the split-pool barcoding approach to single-cell RNA sequencing. The chemistry uses biotinylated primers to generate cDNAs from the transcripts in the cell and a special second chromatin fragmentation step to introduce attachment points for the first DNA barcode. All DNA fragments receive two rounds of barcoding before the chromatin crosslinking is reversed. Next, the biotinylated cDNA is pulled out and turned into a scRNA-seq library, which is sequenced separately from the scHi-C library.
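To illustrate how two rounds of barcoding can tie the separately sequenced libraries back to the same cell, here is a minimal Python sketch. It is not the authors' pipeline. It assumes that a cell's Hi-C and RNA reads carry the same pair of barcodes, which the description above implies but does not spell out, and every field name (`bc1`, `bc2`, `locus_a`, `locus_b`, `gene`, `umi`) is hypothetical.

```python
from collections import defaultdict


def cell_key(read):
    """Combine the two barcoding rounds into one per-cell identifier."""
    return (read["bc1"], read["bc2"])


def pair_libraries(hic_reads, rna_reads):
    """Group reads from both libraries by their shared barcode pair.

    Assumption: a cell's chromatin contacts and its cDNA carry the same
    (round 1, round 2) barcode combination. Field names are illustrative.
    """
    cells = defaultdict(lambda: {"contacts": [], "umis": set()})
    for read in hic_reads:
        # One Hi-C read pair records one contact between two loci.
        cells[cell_key(read)]["contacts"].append(
            (read["locus_a"], read["locus_b"])
        )
    for read in rna_reads:
        # Storing (gene, UMI) in a set collapses duplicate reads.
        cells[cell_key(read)]["umis"].add((read["gene"], read["umi"]))
    # Keep only barcode pairs observed in both libraries.
    return {
        key: data
        for key, data in cells.items()
        if data["contacts"] and data["umis"]
    }


if __name__ == "__main__":
    hic = [
        {"bc1": "A", "bc2": "C", "locus_a": "chr1:100", "locus_b": "chr1:900"},
        {"bc1": "A", "bc2": "D", "locus_a": "chr2:50", "locus_b": "chr2:70"},
    ]
    rna = [
        {"bc1": "A", "bc2": "C", "gene": "Actb", "umi": "AAT"},
        {"bc1": "A", "bc2": "C", "gene": "Actb", "umi": "AAT"},
    ]
    for key, data in pair_libraries(hic, rna).items():
        print(key, len(data["contacts"]), len(data["umis"]))
```

With two rounds of barcoding, the number of distinguishable cells grows with the product of the barcode counts in each round, which is how split-pool designs can label many cells without isolating them individually.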

In proof-of-concept experiments, the authors generated both data types for 683 human and 586 mouse cells. Human cells passing "stringent" QC showed an average of 181,240 chromatin contacts and 24,784 unique molecular identifiers (UMIs) from 2,699 genes per cell. For mouse cells, chromatin contacts averaged 206,113 per cell with 16,596 UMIs from 2,256 genes per cell.

In studies of the mouse cortex, the researchers were able to identify 28 known cell types, including excitatory and inhibitory neuron subtypes. "Although 3D genome features are known to encode cell identity, scHi-C often identified fewer cell types in complex tissues than scRNA-seq," they noted.

The team then used the single-cell gene expression data as a "bridge" to analyze 3D genome variation alongside other data types. First, they integrated it with spatial transcriptomics data obtained with MERFISH (multiplexed error-robust fluorescence in situ hybridization), a method being commercialized by Vizgen. According to Duan, this had never been done before.

The researchers also linked cis-regulatory elements to target genes by integrating their data with single-cell chromatin accessibility datasets. In addition, they reconstructed developmental trajectories based on 3D genome features during bone marrow differentiation, "which revealed a complex link between gene expression and the 3D genome," they wrote.

"Simultaneously measuring multiple molecular properties of a cell will always lead to better understanding of its cell state," Duan said. Potential research applications include probing the role of the 3D genome in tumorigenesis and adding another data type to large-scale genomic perturbation studies.

"In terms of our understanding of what modulates 3D genome structure, I think we understand snippets," Adey said. Perturbing the genome might reveal changes in how chromatin folds and the mechanisms by which it does so. "We need methods like this to show those patterns," he said. The authors "lay out the protocol well. Time will tell how easy it is to use in others' hands."