NEW YORK (GenomeWeb) – Researchers from McGill University in Montreal have developed a targeted sequencing technique that focuses on functional methylomes and have created an adipose tissue-specific panel that they plan to use in population-scale studies to identify epigenetic differences in metabolic diseases like obesity.
The approach, described in Nature Communications last week and called MCC-seq for methyl-C capture sequencing, involves first creating a whole-genome sequencing library, bisulfite converting and amplifying it, and then doing a target capture. For the target capture, the researchers used Roche's NimbleGen SeqCap Epi product.
Using SeqCap Epi, they designed targeted panels specific for adipose tissue, validated that the panels' performance was comparable to other approaches such as whole-genome bisulfite sequencing, Agilent's SureSelect Human Methyl-seq kit, and an Illumina array, and demonstrated the method in a small cohort of obese individuals.
Elin Grundberg, the senior author of the study and an assistant professor in human genetics at McGill University, told GenomeWeb that her lab had previously been working with Illumina arrays to study epigenetic modifications but wanted something more comprehensive. The group has also used whole-genome bisulfite sequencing, but that was not cost-effective to do on hundreds or thousands of individuals, she said.
The researchers designed two panels with the goal of capturing the putative functional and disease-linked methylome in adipose tissue.
The lab is now using the larger, more comprehensive panel to look for epigenetic biomarkers in obese individuals that predict specific metabolic complications, such as heart disease or diabetes. In addition, Grundberg said that they plan to develop epigenetic panels for different diseases and tissues. The next one, she said, will be an autoimmune panel for blood.
In a first version of the adipose tissue panel, the group targeted 87 Mb of sequence, including nearly 2.5 million CpGs and around 1.3 million SNPs. After determining that the V1 panel performed well, the team decided to scale up, designing a more comprehensive V2 panel that targeted 156 Mb of sequence.
The V2 panel includes over 4.4 million CpGs and over 2.8 million SNPs. The panel covers CpGs in low-methylated and unmethylated regions that had been identified from whole-genome bisulfite sequencing data of adipose tissue, CpGs within human adipocyte regulatory elements (H3K4me1 and H3K4me3), all CpGs in Illumina's 450K array, over 28,000 regions covering metabolic disease-associated GWAS loci, and more than 250,000 SNPs from Illumina's HumanCore BeadChip.
To assess the performance of the larger panel, they ran a six-plex capture and sequenced the samples on one lane of the Illumina HiSeq using 100-base paired-end reads. On average, 62 percent of the reads mapped to target regions with 15x mean coverage. Around 65 percent of the target regions were covered at a sequence depth of 5x or greater.
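For readers who want to see how figures like these are typically derived, here is a minimal Python sketch. It is not the authors' pipeline: the function name, the per-base depth list, and the toy numbers are all hypothetical, and it measures the 5x fraction per target base, whereas the study reports it per target region.

```python
def capture_metrics(target_depths, reads_on_target, reads_mapped):
    """Summarize a capture run from per-base depths over the target.

    Hypothetical helper for illustration; not taken from the study.
    """
    if not target_depths or reads_mapped <= 0:
        raise ValueError("need at least one target base and one mapped read")
    # Share of mapped reads that landed on the targeted regions.
    on_target_rate = reads_on_target / reads_mapped
    # Average sequencing depth across all targeted bases.
    mean_coverage = sum(target_depths) / len(target_depths)
    # Share of targeted bases covered by at least five reads.
    frac_at_5x = sum(1 for d in target_depths if d >= 5) / len(target_depths)
    return on_target_rate, mean_coverage, frac_at_5x


# Toy numbers chosen for illustration, not data from the study.
depths = [0, 3, 5, 12, 20, 30, 4, 8, 25, 43]
rate, mean_cov, frac5 = capture_metrics(depths, reads_on_target=620, reads_mapped=1000)
```

In practice these inputs would come from an aligner and a coverage tool rather than hand-typed lists.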
A comparison of the MCC-seq protocol with whole-genome bisulfite sequencing and an Illumina array for overlapping CpGs showed high correlation. The researchers also compared the method against Agilent's targeted protocol, SureSelect Human Methyl-seq. While the Agilent method worked well, Grundberg said that for her lab's purposes it was not the right approach because it was not strand specific, so it could not correlate epigenetic variants with genotypes.
Next, the team sought to validate the panels in disease cohorts, testing the V1 panel on adipose tissue from 72 obese individuals undergoing bariatric surgery.
The team multiplexed samples in groups of four and sequenced to an average depth of 25x. They detected over 2 million CpGs in at least one individual. They narrowed down the list by restricting further analysis to CpGs with at least 5x coverage, leaving them with just over 1.7 million CpGs.
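The coverage filter described above can be sketched as follows. This is an illustration under stated assumptions, not the study's code: it assumes each CpG comes with counts of methylated and unmethylated reads from a single sample, and the function name and site labels are hypothetical.

```python
def filter_cpgs(cpg_counts, min_coverage=5):
    """Keep CpGs with enough reads and compute their methylation fraction.

    cpg_counts maps a CpG label to (methylated_reads, unmethylated_reads).
    Hypothetical helper for illustration; not taken from the study.
    """
    kept = {}
    for site, (meth, unmeth) in cpg_counts.items():
        coverage = meth + unmeth
        if coverage >= min_coverage:
            # Methylation level is the share of reads that stayed unconverted.
            kept[site] = meth / coverage
    return kept


# Toy counts for three made-up sites.
counts = {
    "cpg_a": (18, 2),  # 20 reads, passes the filter
    "cpg_b": (1, 2),   # 3 reads, dropped
    "cpg_c": (0, 5),   # exactly 5 reads, passes and is unmethylated
}
levels = filter_cpgs(counts)
```

A low-coverage site is dropped because a methylation fraction estimated from only a few reads is too noisy to compare across individuals.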
To validate MCC-seq's ability to simultaneously call genotypes, the group used an Illumina array to genotype 24 adipose tissue samples and compared that to the MCC-seq panel, obtaining 99 percent concordance.
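A concordance figure of this kind is usually the share of sites called on both platforms where the two genotypes agree. The sketch below shows that calculation with hypothetical names and toy calls; the article does not say how the authors handled missing calls or allele order, so both choices here are assumptions.

```python
def genotype_concordance(array_calls, seq_calls):
    """Fraction of sites called on both platforms where genotypes match.

    Genotypes are two-letter strings such as "AG"; None marks a missing call.
    Hypothetical helper for illustration; not taken from the study.
    """
    shared = [
        site for site in array_calls
        if array_calls[site] is not None and seq_calls.get(site) is not None
    ]
    if not shared:
        raise ValueError("no sites were called on both platforms")
    # Sort the alleles so that "AG" and "GA" count as the same genotype.
    matches = sum(
        1 for site in shared
        if sorted(array_calls[site]) == sorted(seq_calls[site])
    )
    return matches / len(shared)


# Toy calls for four made-up SNPs.
array_calls = {"snp1": "AG", "snp2": "CC", "snp3": "TT", "snp4": "AG"}
seq_calls = {"snp1": "GA", "snp2": "CC", "snp3": "CT", "snp4": None}
concordance = genotype_concordance(array_calls, seq_calls)
```

Here snp4 is excluded because the sequencing side has no call, and two of the remaining three sites agree.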
Finally, the group wanted to demonstrate that the panel could be used for epigenome-wide association studies. To do this, they focused on a specific trait — plasma triglyceride levels — from the 72 individuals.
Triglyceride levels were highly variable across the cohort. Looking at the MCC-seq data, they identified over 3,000 CpGs that were associated with triglyceride levels and evaluated them based on their overlap with histone marks in human adipocytes and regions of low or no methylation. The CpGs associated with triglycerides were most likely to map to enhancer histone marks or regions of low methylation, supporting the "mounting evidence that disease-trait-associated epigenetic variants localize, to a large extent, to distal regulatory regions," the authors wrote.
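The article does not describe the statistical model behind the association test. As a rough illustration of the idea only, the sketch below ranks CpGs by a plain Pearson correlation between methylation and a trait. That is a stand-in technique, not the authors' method, and all names and values are hypothetical. A real epigenome-wide analysis would also adjust for covariates and correct for multiple testing.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 long")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


def rank_cpgs_by_trait(methylation, trait):
    """Return (cpg, r) pairs sorted by absolute correlation, strongest first."""
    scores = [(cpg, pearson_r(levels, trait)) for cpg, levels in methylation.items()]
    return sorted(scores, key=lambda item: abs(item[1]), reverse=True)


# Toy data for four made-up individuals; not values from the study.
triglycerides = [1.0, 2.0, 3.0, 4.0]
methylation = {
    "cpg_a": [0.8, 0.6, 0.4, 0.2],  # falls steadily as the trait rises
    "cpg_b": [0.5, 0.6, 0.4, 0.5],  # weak relationship
}
ranked = rank_cpgs_by_trait(methylation, triglycerides)
```

With these toy values, cpg_a ranks first because its methylation falls in a straight line as triglycerides rise.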
They then used a variety of methods to focus in on the CpGs most strongly correlated with triglycerides — comparing to array data from a separate cohort study, to RNA-seq data from nearby genes, to transposase-accessible chromatin sequencing to further map the triglyceride-associated CpGs linked to hypomethylated regions, and to the National Human Genome Research Institute's database of GWAS results.
Notably, they identified several genes linked to 19 of the CpGs that were previously cited for metabolic disease traits, including CD36, RPTOR, and ABCG5/ABCG8.
The most significant CpG of the triglyceride-associated loci was located within the CD36 gene, so the team focused on that gene for further follow-up. CD36 is known to play a role in lipid metabolism and has previously been linked to metabolic disease susceptibility. The CpG maps to an area of low methylation that is unique to adipose tissue.
Looking at CD36 expression in healthy and obese individuals, the researchers found that expression was significantly higher in adipocytes than in blood cells and that methylation was negatively correlated with CD36 expression.
"This is a first step toward understanding epigenetic differences in metabolic disorders," Grundberg said. Although individuals with metabolic disorders often have clear, well-defined traits, like obesity, "within that obese population there is huge variation" on things like triglyceride levels and cholesterol, which are important for diseases like diabetes and cardiovascular disorders, she said.
Moving forward, Grundberg said that her lab plans to use the more comprehensive V2 panel to study a larger cohort of the same population, between 400 and 500 obese individuals who have undergone bariatric surgery. "It's a well-phenotyped population that is homogenous in traits, but have distinct complications of metabolic syndromes," she said. The goal is to identify epigenetic biomarkers that can predict some of those distinct complications, like diabetes, she said.