Using an approach called reduced representation bisulfite sequencing to simultaneously gauge genome-wide DNA methylation and sequence patterns in three generations of family members, researchers have found evidence that genotype has a more widespread influence on DNA methylation patterns than previously appreciated.
The researchers, from the HudsonAlpha Institute for Biotechnology and Duke University, reported in PLoS Genetics earlier this month that data for six family members and two unrelated individuals suggest genetic influences account for the majority of the differential methylation detected between homologous chromosomes, while parent-of-origin effects, or imprinting, were found at around 8 percent of the differentially methylated sites.
"Overall, our results demonstrate that the influence of genotype on patterns of DNA methylation is widespread in the genome and greatly exceeds the influence of imprinting on genome-wide methylation patterns," senior author Richard Myers, president and director of HudsonAlpha, and his co-authors wrote.
The results suggest that something about specific DNA sequences, regardless of which parent they are inherited from, seems to help set up DNA methylation patterns and the degree of methylation, Myers told In Sequence. "While it's not shocking that that might happen," he said, "what's striking is how much it happens. It's the vast majority of the allelic influence on methylation patterns."
Cytosine methylation — particularly at CpG sites where cytosine and guanine nucleotides neighbor one another — is an epigenetic mark that can contribute to normal development and aging processes. But shifts in methylation have also been linked to a range of diseases including cancer, autoimmune disease, and psychiatric conditions.
While different alleles often have the same methylation pattern, the researchers noted, there are instances in which parts of paired, homologous chromosomes show distinct methylation profiles. This differential methylation is sometimes attributed to imprinting — a situation in which epigenetic patterns are contingent on whether a given stretch is inherited from an individual's mother or father. But a handful of past studies have hinted that DNA sequences can affect methylation too.
To explore this in more detail, the team turned to reduced representation bisulfite sequencing, or RRBS, a method that involves sequencing genomic DNA that's been fragmented with a restriction enzyme targeting CpG-rich sites in the genome. The resulting DNA fragments are then treated with bisulfite to convert unmethylated cytosine residues to uracil so that methylated and unmethylated cytosines are distinguishable from one another by sequencing.
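The logic of bisulfite conversion can be illustrated with a minimal Python sketch. It is a toy model, not the team's pipeline: the function names and the example sequence are hypothetical, only the top strand is considered, and conversion is assumed to be complete. After amplification, the uracil produced from an unmethylated cytosine is read as thymine.

```python
def bisulfite_convert(seq, methylated_positions):
    """Simulate bisulfite treatment plus amplification on one strand.

    Unmethylated C is read as T (C -> U -> T); methylated C is
    protected and is still read as C. Positions are 0-based.
    """
    methylated = set(methylated_positions)
    return "".join(
        "T" if base == "C" and i not in methylated else base
        for i, base in enumerate(seq)
    )


def call_cpg_methylation(reference, read):
    """Compare a converted read to the reference at each CpG cytosine."""
    calls = {}
    for i in range(len(reference) - 1):
        if reference[i:i + 2] == "CG":
            if read[i] == "C":
                calls[i] = "methylated"
            elif read[i] == "T":
                calls[i] = "unmethylated"
    return calls


# Hypothetical 12-base reference with CpGs at positions 1, 5 and 9;
# only the cytosine at position 1 is methylated in this toy example.
reference = "ACGTCCGGACGA"
read = bisulfite_convert(reference, methylated_positions={1})
print(read)
print(call_cpg_methylation(reference, read))
```

Because every unmethylated cytosine is rewritten, converted reads no longer match the reference directly, which is one reason the analysis of such data is harder than ordinary resequencing.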
The approach is more cost-effective and less analytically taxing than whole-genome bisulfite sequencing, since it leaves researchers with a much smaller, but very CpG-rich, collection of genome sequences.
Alex Meissner pioneered the RRBS strategy a few years ago when he was a post-doctoral researcher in Rudolf Jaenisch's lab at the Whitehead Institute for Biomedical Research and Massachusetts Institute of Technology, Myers noted.
For the current study, Myers and his colleagues came up with computational methods that helped them to use the method to simultaneously detect CpG methylation and genetic patterns in the same samples.
"We wanted to look at as many CpGs methylated in the genome as we possibly could and this was one of the best techniques for doing that," Myers explained. "We didn't really redevelop [the RRBS method], but we had to basically figure out how to analyze the data, which is a big part of it."
Though they used the Illumina Genome Analyzer for their experiments, Myers said the RRBS strategy is compatible with any sequencing platform.
Despite its utility, Myers called the approach a "stop gap" measure as researchers wait for sequencing costs to decline to the point where whole-genome bisulfite sequencing is feasible for many samples.
With roughly 29 million CpGs in the human genome and about half turning up in non-repetitive parts of the genome that could be assessed by next-generation sequencing methods, Myers noted, the goal is to eventually find as many functionally important sites with dynamic methylation as possible.
The team is already doing some whole-genome methylation sequencing studies, he said, though that method carries considerable computational challenges of its own, largely owing to the bisulfite conversion used to help gauge methylation.
For the current study, researchers digested one microgram of genomic DNA with an enzyme called MspI, filled in the ends of these DNA fragments, and added adenosine to the 3'-ends before adding on Illumina adaptors and purifying the fragments. They then treated the bits of DNA with bisulfite, and amplified and sequenced the fragments.
Using the RRBS method and paired-end sequencing on the Illumina GA, the researchers sequenced 40 to 120 base pair stretches of bisulfite-converted DNA. The DNA came from non-immortalized peripheral white blood cells isolated from fresh blood samples taken from six members of a three-generation family and two unrelated individuals.
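The "reduced representation" step can be sketched in silico. MspI recognizes CCGG and cuts after the first cytosine, so each internal fragment begins with CGG. The Python below is a rough illustration under stated assumptions: the function names and the toy sequence are hypothetical, and treating the 40 to 120 base pair range as a fragment-size filter is an assumption made for the example, not a description of the lab protocol.

```python
import re

MSPI_SITE = "CCGG"  # MspI cuts between the two cytosines: C^CGG


def mspi_digest(seq):
    """Cut a sequence at every MspI site and return the fragments."""
    # A lookahead finds every site; the cut falls one base into it.
    cut_points = [m.start() + 1 for m in re.finditer(f"(?={MSPI_SITE})", seq)]
    bounds = [0] + cut_points + [len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:])]


def size_select(fragments, min_len=40, max_len=120):
    """Keep fragments whose length falls inside the chosen window."""
    return [f for f in fragments if min_len <= len(f) <= max_len]


# Toy sequence: two MspI sites separated by 50 bases of filler.
toy = "A" * 10 + "CCGG" + "T" * 50 + "CCGG" + "A" * 200
fragments = mspi_digest(toy)
print([len(f) for f in fragments])
print([len(f) for f in size_select(fragments)])
```

In this toy case only the fragment between the two sites survives the filter. In a real genome, short fragments arise where CCGG sites cluster, which is why the retained fraction is CpG-rich.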
"Analysis of a family allows for the determination of a SNP's parental origin along with inheritance patterns of DNA methylation levels and therefore permits the direct examination of genetic and epigenetic mechanisms of differential methylation," the researchers explained.
"By analyzing DNA methylation in a family, the impact of alleles versus the impact of a chromosome's parental origin on the inheritance of methylation can be clearly resolved."
In the process, the team generated 10 or more sequencing reads for roughly one million CpGs per sample. Once they tossed out sites where they didn't have at least 10 reads in all of the samples, they were left with information for about 950,000 CpG sites.
"To be able to be quantitative, we wanted to have ten reads for each so we could basically distinguish ten percent differences [in methylation]," Myers explained.
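The arithmetic behind the ten-read threshold is simple: with n reads at a site, the observed methylation fraction can only take the values 0/n, 1/n, and so on up to n/n, so ten reads give steps of ten percentage points. A minimal Python sketch follows. The data layout, sample names, site labels, and read counts are invented for illustration and are not taken from the study.

```python
MIN_READS = 10  # coverage threshold quoted in the article


def percent_methylated(methylated_reads, total_reads):
    """Percentage of reads at a CpG that still show a cytosine."""
    return 100.0 * methylated_reads / total_reads


def resolution_step(total_reads):
    """Smallest methylation difference (in percentage points) that
    a site with this many reads can represent."""
    return 100.0 / total_reads


def sites_covered_in_all(samples, min_reads=MIN_READS):
    """Return the CpG sites with at least min_reads in every sample.

    samples maps sample name -> {site: (methylated_reads, total_reads)}.
    """
    shared = None
    for counts in samples.values():
        ok = {site for site, (_, total) in counts.items() if total >= min_reads}
        shared = ok if shared is None else shared & ok
    return shared or set()


# Hypothetical counts for two samples at two CpG sites.
samples = {
    "child": {"chr1:100": (7, 10), "chr1:250": (3, 8)},
    "mother": {"chr1:100": (12, 12), "chr1:250": (9, 15)},
}
kept = sites_covered_in_all(samples)
print(kept)
print(percent_methylated(*samples["child"]["chr1:100"]))
print(resolution_step(MIN_READS))
```

Here the second site is dropped because one sample has only eight reads there, mirroring how the roughly one million CpGs per sample shrink to the set covered in all samples.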
Though it is possible to cover marginally more CpG sites by purifying more DNA fragments and doing deeper sequencing of the samples, he said, that level of additional resolution did not seem to be cost-effective at the moment, since researchers would have to sequence about twice as deeply to get methylation data on 10 percent more CpG sites.
Along with the percentage of methylated reads for each CpG, the team also looked for heterozygous SNPs corresponding to sequence fragments. Together, analyses of DNA sequence variant inheritance and methylation patterns in the family members made it possible to distinguish parent-of-origin influences on methylation from sites where the differential methylation hinged on genetic patterns.
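The distinction can be pictured with a simplified decision rule. The Python below is only a conceptual sketch, not the statistical method used in the paper: the function, labels, and observations are hypothetical. The idea is that if the same SNP allele sits on the more methylated chromosome no matter which parent passed it on, the effect tracks genotype, whereas if the more methylated copy always comes from the same parent no matter which allele it carries, the effect tracks parent of origin.

```python
def classify_site(observations):
    """Classify a differentially methylated site from family data.

    observations is a list of (allele, parent) pairs, one per
    heterozygous individual, naming the SNP allele found on the more
    methylated chromosome and the parent that chromosome came from.
    """
    alleles = {allele for allele, _ in observations}
    parents = {parent for _, parent in observations}
    if len(alleles) == 1 and len(parents) > 1:
        # Same allele, inherited from either parent.
        return "genotype-dependent"
    if len(parents) == 1 and len(alleles) > 1:
        # Same parental copy, whichever allele it carries.
        return "parent-of-origin"
    return "unresolved"


# Hypothetical observations at three sites.
print(classify_site([("G", "maternal"), ("G", "paternal")]))
print(classify_site([("G", "maternal"), ("A", "maternal")]))
print(classify_site([("G", "maternal")]))
```

The last case shows why a family is needed: a single individual cannot separate the two explanations, since the allele and its parental origin are confounded.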
The researchers found evidence of imprinting at around 8 percent of the differentially methylated loci, including some imprinted loci that hadn't turned up in past studies, Myers noted.
Overall, though, the group's data suggested that variable methylation between homologous chromosomes is far more apt to stem from genetic patterns, with at least 92 percent of this differential methylation relating to haplotype.
The findings extended beyond the family members as well: 40 of the heterozygous SNPs that had been linked to differential methylation in the family members were also found in at least one of the unrelated individuals, with 30 showing the same sorts of methylation effects in these individuals as well.
From the patterns identified in family members and unrelated individuals, the researchers estimate that as much as 80 percent of methylation variability is a consequence of genetic patterns.
In general, these effects were less common in CpG islands, turning up more often in intergenic sequences and parts of the genome that are less well conserved. That led the study authors to speculate that there is "evolutionary constraint on DNA methylation levels, as genetic variants that affect DNA methylation tend to lie in regions under less selective pressure."
Moreover, the researchers' RNA sequencing experiments in lymphoblastoid cell lines suggest that genotype-dependent methylation can influence the expression of genes: 22 percent, or more than one-fifth, of genes associated with genotype-dependent DNA methylation events also showed allele-specific gene expression.
"Our results show that genome sequence will need to be considered when assessing how genes are regulated, especially by DNA methylation, in the context of disease," first author Jason Gertz, a postdoctoral researcher in Myers's HudsonAlpha lab, said in a statement. "This is particularly important for cancer research where DNA methylation is being considered as a biomarker."
If DNA sequence determines how methylation patterns are set up early in development, Myers added, it's conceivable that the same alleles that are influencing methylation in the blood cells tested are acting in a similar manner in other tissues.
"If any of these end up being disease-associated, then that might mean that we can detect them more easily," he said. "We may not have to go after the specific tissue to find disease-specific variants."
Nevertheless, he cautioned, that idea is still just speculation, since more studies are needed to look at the relationship between genotype and differential methylation in other tissues and cell types.
Related studies are underway, Myers said, including methylation studies by members of the ENCODE consortium on a range of cell lines and tissue types. Members of his lab are also looking at methylation patterns in post-mortem brain samples from individuals with major psychiatric conditions and healthy controls as part of the Pritzker consortium.
Such studies are benefiting from a decline in sequencing costs, Myers said, adding that his team has continued to improve and streamline the RRBS method to make it even more cost-effective as they gear up to look at many disease and control samples with the Illumina HiSeq 2000, which provides more sequence data and allows more samples per lane.
Myers estimated that the cost of doing relatively deep RRBS is now a third of what it was when the team started the PLoS Genetics study and is somewhere in the neighborhood of $500 to $1,000 per sample — a price that is partly contingent on an institute's operating costs and overall sequencing output.
"A lot of the [decrease in cost] is because sequencing costs have gotten better, but a lot of it is because we've gotten streamlined and we're doing, literally, thousands of samples this way," Myers said, "so that makes it cheaper for us."