Integration of Single-Cell RNA-seq, Genomic Data Reveals Cell Types Involved in Genetic Diseases

NEW YORK — With a new framework combining different omic data types, researchers have begun to home in on the cell types where variants uncovered through genome-wide association studies exert their effects.

Researchers from the Broad Institute and elsewhere developed the method, dubbed sc-linker, which combines single-cell RNA sequencing data, epigenomic SNP-to-gene maps, and GWAS summary statistics to identify the cells in which genetic loci uncovered through GWASs affect disease.

"Unlike previous studies, we analyze gene programs that represent different functional facets of cells, including discrete cell types, processes activated specifically in a cell type in disease, and processes activated across cells irrespective of cell-type definitions," senior author Aviv Regev, formerly a researcher at the Broad and now at Genentech, and colleagues wrote in their paper.

As they reported on Thursday in Nature Genetics, they additionally applied their approach to different diseases, including immune-related and psychiatric conditions.

The sc-linker framework relies on two initial types of input data: GWAS summary statistics and single-cell RNA sequencing data from both healthy and disease-affected tissues. From the single-cell RNA sequencing data, the researchers identified gene programs specific to certain cell types, cellular processes, and disease development. Using a tissue-specific enhancer-gene linking strategy, they then tied those gene programs to SNPs to generate SNP annotations. At the same time, using the GWAS summary statistics, they associated SNPs with particular human traits.

They then combined these two data streams to uncover which gene programs were enriched in which diseases and what mechanisms might be at play.
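The two data streams described above can be illustrated with a minimal Python sketch. It is not the sc-linker implementation: the function names, SNP IDs, gene names, and statistics are all hypothetical, and the enrichment score is a naive stand-in (the ratio of mean GWAS chi-square inside versus outside the annotation) rather than the statistical model the researchers used.

```python
from statistics import mean


def snps_for_program(program_genes, enhancer_gene_links):
    """Return the SNPs linked, via enhancer-gene links, to any gene in a program.

    enhancer_gene_links is a list of (snp_id, gene) pairs (hypothetical format).
    """
    return {snp for snp, gene in enhancer_gene_links if gene in program_genes}


def enrichment_ratio(annotated_snps, gwas_chisq):
    """Naive stand-in score: mean chi-square of annotated SNPs over the rest.

    Returns None when either group is empty, since the ratio is undefined.
    """
    inside = [chi2 for snp, chi2 in gwas_chisq.items() if snp in annotated_snps]
    outside = [chi2 for snp, chi2 in gwas_chisq.items() if snp not in annotated_snps]
    if not inside or not outside:
        return None
    return mean(inside) / mean(outside)


# Toy inputs, invented purely for illustration.
links = [("snp1", "GENE_A"), ("snp2", "GENE_A"), ("snp3", "GENE_B"), ("snp4", "GENE_C")]
program = {"GENE_A"}  # a hypothetical cell-type gene program
chisq = {"snp1": 4.0, "snp2": 6.0, "snp3": 1.0, "snp4": 1.0}  # made-up GWAS statistics

annotation = snps_for_program(program, links)
score = enrichment_ratio(annotation, chisq)
print(sorted(annotation), score)
```

In this toy case, the SNPs linked to the program carry larger association statistics than the remaining SNPs, so the score exceeds 1. That is the kind of signal that, in a properly calibrated analysis, would point to a gene program being enriched for a trait.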

After benchmarking the approach on five blood cell traits, the researchers applied sc-linker to analyze a number of disease types.

For instance, they analyzed 11 autoimmune diseases using 10 immune cellular process programs and six immune cell-type programs that they had identified in four datasets: two of peripheral blood mononuclear cells (PBMCs), one of cord blood, and one of bone marrow. This identified cell type-disease enrichments that were in line with known biology, such as the association of T cells with eczema and enrichments for multiple sclerosis across all immune cell types.

Other links had less previous support, such as an enrichment of B cells in ulcerative colitis and T cells in celiac disease. That T cell enrichment was largely driven by the ETS1 gene, which is associated with T cell development and interleukin-2 signaling, and CD28, which is needed for T cell activation. This finding indicated that aberrant T-cell maintenance and activation may affect inflammation in celiac disease, the researchers noted.

They similarly examined psychiatric diseases and uncovered an enrichment of γ-aminobutyric acid (GABA)-ergic neurons in major depressive disorder. GABA-ergic neurons are involved in the regulation of stress, and this enrichment was driven by TCF4 and PCLO, which are involved in neuronal differentiation and synaptic vesicle trafficking, respectively. The researchers noted that because the GABA-ergic pathway is independent of the serotonin pathway that many depression treatments target, it could represent a source of additional therapeutic targets.

They further noted that rare M cells, which play a role in gut microbiome homeostasis, increase in number during the development of ulcerative colitis.

According to the researchers, these findings could form the starting point for further analyses.

Additionally, sc-linker could be updated with additional data types as they become available. "In the long term, with the increasing success of phenome-wide association studies and the integration of multimodal single-cell resolution epigenomics, this framework will continue to be useful in identifying biological mechanisms driving a broad range of diseases," they added.