NEW YORK – Induced pluripotent stem cells (iPSCs) can provide a window into regulatory variants related to traits and diseases that may be more difficult to uncover with differentiated tissues, according to new research published this week by investigators at the European Molecular Biology Laboratory (EMBL), the German Cancer Research Center, and elsewhere.
Such stem cells "are undifferentiated and therefore reflect the ancestral state of all cells," senior author Oliver Stegle, an EMBL group leader and division head at the German Cancer Research Center, and his colleagues wrote in their paper, noting that stem cells "could be particularly relevant when searching for the cause of diseases that occur early in development."
For a paper appearing in Nature Genetics on Thursday, Stegle and his team brought together array-based genotyping, whole-genome sequencing, and RNA sequencing data from 1,367 iPSC lines characterized in five prior studies, representing nearly 900 healthy donors and a few dozen individuals with rare diseases. With these data, they were able to map quantitative trait loci (QTLs), uncovering variants that influence expression, splicing, or other features of nearby or more distant genes.
With this "integrated iPSC QTL" (i2QTL) collection, the team catalogued hundreds of previously unappreciated expression QTLs. An analysis centered on the subset of samples with available genome sequence data also highlighted rare variants with pronounced expression effects, including those with ties to conditions such as monogenic diabetes, Bardet-Biedl syndrome, hereditary cerebellar ataxia, and global developmental delay.
"We were surprised to find such a large number of disease-associated genetic variants that are already visible in the expression pattern at the earliest time point of cell differentiation, represented by the iPSCs," co-first author Marc Jan Bonder, a researcher affiliated with EMBL and the German Cancer Research Center, said in a statement.
In iPSC lines developed from study participants with rare genetic conditions, the team demonstrated that transcriptomic and genomic profiles can help to home in on causal genes that have previously been implicated in the diseases in question.
On the common variant side, meanwhile, the investigators saw thousands of iPSC eQTL sites that overlapped with loci reported in genome-wide association studies of traits and conditions ranging from heart disease and blood lipid levels to liver disease. They noted that the iPSC transcriptome and genome dataset is expected to inform future studies on still other traits and conditions.
"Overall, the genetic maps and colocalization catalogs generated in this study form a valuable reference dataset, further aiding in the interpretation of risk variants in a unique cell type relevant for development, cellular differentiation, cancer, and rare disease research," the authors wrote. "We expect that the genetic maps presented here, in combination with the constantly growing GWAS and rare-disease resources, will reveal missing molecular underpinnings of complex and rare genetic diseases and traits manifesting during development."
In a related Nature Genetics study, members of the same team relied on multiplexed pooling analyses, single-cell RNA sequencing, and other strategies to track down common variant eQTLs contributing to development. That work spanned several time points in more than 200 human iPSC lines that were generated for the Human Induced Pluripotent Stem Cell Initiative (HipSci) and directed to differentiate into neurons, including dopaminergic ones.
That analysis uncovered nearly 1,300 eQTLs at sites linked to neurological traits or conditions, including a significant proportion of sites beyond those included in the current version of the Genotype-Tissue Expression collection. The team also tracked down several cell types involved in neuronal differentiation and detected molecular features found in iPSC lines with poor neuronal differentiation.
"Based on molecular markers that predict differentiation bias, we estimate that 13 [percent] of iPSC lines in the HipSci resource produce very few neuronal cell types under the conditions tested," the authors reported, suggesting that "a priori selection is enabled by gene expression profiling data from the pluripotent state that is easily obtainable and often already available."