NEW YORK (GenomeWeb) – A UK research team has come up with a new method for applying bisulfite sequencing-based methylation profiling to individual cells.
As they reported in Nature Methods this weekend, the researchers are currently able to assess cytosine methylation levels at up to nearly half of the genome's methylation-prone CpG sites (where cytosine and guanine bases neighbor one another) in mouse cells using the single-cell bisulfite sequencing, or scBS-seq, approach.
In its proof-of-principle experiments, the team applied scBS-seq to dozens of single cells: mouse oocytes, and mouse embryonic stem cells that had been grown in serum with or without kinase inhibitors.
When they integrated cytosine methylation information gleaned from 12 individual mouse oocyte cells using scBS-seq, for example, the researchers generated a genome-wide cytosine methylation map that closely resembled that available from studies of the DNA methylome in this cell type using bulk samples.
"When we merged the single [oocyte] cell datasets, we got a profile from a dozen oocytes that very, very nicely recapitulates what you would get if you sequenced 100 to 150 [oocytes in bulk]," Babraham Institute epigenetics researcher Gavin Kelsey, co-senior author on the study, told In Sequence.
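The merging step Kelsey describes can be pictured as pooling read counts per CpG site across cells. The following is a minimal sketch under stated assumptions, not the authors' pipeline: the function name `merge_cells` and the input layout (one dictionary per cell, mapping a site to methylated and total read counts) are hypothetical choices made for illustration.

```python
from collections import defaultdict

def merge_cells(cells):
    """Pool per-site counts across cells and return a methylation fraction per site.

    cells: list of dicts mapping (chrom, pos) -> (methylated_reads, total_reads).
    This input layout is an assumption made for illustration.
    """
    pooled = defaultdict(lambda: [0, 0])
    for calls in cells:
        for site, (meth, total) in calls.items():
            pooled[site][0] += meth
            pooled[site][1] += total
    # Sites with no reads in any cell are simply absent from the result.
    return {site: m / t for site, (m, t) in pooled.items() if t > 0}

# Two toy cells: each covers only part of the genome, but together they cover more.
cell_a = {("chr1", 100): (1, 1), ("chr1", 200): (0, 1)}
cell_b = {("chr1", 100): (0, 1), ("chr1", 300): (2, 2)}
merged = merge_cells([cell_a, cell_b])
print(merged)
```

Because each single-cell library covers a different sparse subset of sites, the pooled profile covers more of the genome than any one cell does, which is why a dozen merged cells can approximate a bulk dataset.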
In the past, researchers have applied reduced representation bisulfite sequencing (RRBS) to assess cytosine methylation profiles in individual cells. That enzymatic digestion-based single-cell bisulfite sequencing method, described by Fuchou Tang's group at Peking University, is effective for profiling methylation patterns in CG-rich portions of the genome, Kelsey explained.
In contrast, his team's newly developed approach is aimed at applying bisulfite sequencing across the genome of individual cells in a manner that's more or less unbiased.
Kelsey noted that the scBS-seq method draws broadly on another bisulfite sequencing method known as post-bisulfite adaptor tagging, or PBAT, which was presented in Nucleic Acids Research by Japanese researchers in 2012.
Rather than attaching sequencing adaptors to DNA and then treating it with bisulfite, which converts unmethylated cytosine bases to uracil (read as thymine after amplification), the PBAT method involves bisulfite treatment prior to adaptor ligation to reduce the DNA loss associated with the harsh chemical treatment.
"The appeal of that method, on which our single-cell method is based, is that it does all of the molecular biology after you do the bisulfite treatment," Kelsey said.
"One of the problems of bisulfite treatment is that it also tends to degrade DNA," he explained. "So if you spend a lot of time fragmenting DNA, ligating on adaptors, and things like that, a lot of those reactions are basically wasted."
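For readers less familiar with the chemistry, the readout behind any bisulfite method can be sketched in a few lines. This is a toy illustration, not part of the study: it assumes complete conversion, ignores DNA degradation, and the function names `bisulfite_convert` and `call_methylation` are hypothetical.

```python
def bisulfite_convert(seq, methylated_positions):
    """Return the sequence as it would be read after bisulfite treatment and amplification.

    Unmethylated cytosines are read as T; methylated cytosines remain C.
    Assumes 100 percent conversion efficiency, which real experiments do not reach.
    """
    return "".join(
        "T" if base == "C" and i not in methylated_positions else base
        for i, base in enumerate(seq)
    )

def call_methylation(reference, read):
    """Compare a converted read with the reference, position by position.

    Returns {position: True/False} for every reference cytosine that reads as C or T.
    """
    calls = {}
    for i, (ref_base, read_base) in enumerate(zip(reference, read)):
        if ref_base == "C" and read_base in "CT":
            calls[i] = read_base == "C"
    return calls

reference = "ACGTCG"
read = bisulfite_convert(reference, methylated_positions={1})
print(read)
print(call_methylation(reference, read))
```

A cytosine that still reads as C is inferred to have been methylated, while one that reads as T is inferred to have been unmethylated.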
In the scBS-seq context, Kelsey and his colleagues fragmented DNA during the bisulfite conversion of unmethylated cytosines. But rather than following that step with sequencing adaptor ligation, they performed primer extension to produce complementary DNA strands that included Illumina sequencing adaptors.
In an effort to catch and amplify as many fragments as possible, the team performed this tagging step five times. They then added a second sequencing adaptor, amplified the tagged DNA molecules by PCR with indexed primers, and fed the DNA into standard Illumina library prep protocols.
"What we did to make it work for single cells was to do a number of cycles of primer extension. We essentially do a pre-amplification step before we make a regular Illumina library," Kelsey said.
On the analytical side, meanwhile, the method involves a relatively routine mapping of bisulfite sequence read data. But because the multiple rounds of primer extension done during sample preparation obscure some directional information in libraries, Kelsey said, the team typically spends more time mapping each sequence than it would when dealing with reads produced from a conventional bisulfite sequencing library.
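One way to see why losing directional information adds mapping work is a toy search over the possible strand origins of a read. The sketch below is an assumption-laden illustration rather than the team's mapping procedure: it uses exact substring matching instead of a real aligner, treats every cytosine as unmethylated, and borrows the strand labels OT, OB, CTOT, and CTOB from a convention common among bisulfite aligners.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def revcomp(seq):
    """Reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]

def c_to_t(seq):
    return seq.replace("C", "T")

def g_to_a(seq):
    return seq.replace("G", "A")

def strand_hits(read, reference, directional=True):
    """Return the strand labels under which a fully converted read matches the reference.

    A directional library needs two searches; a non-directional one needs four.
    """
    ref_ct = c_to_t(reference)
    ref_ga = g_to_a(reference)
    searches = [
        ("OT", c_to_t(read), ref_ct),            # original top strand
        ("OB", c_to_t(read), revcomp(ref_ga)),   # original bottom strand
    ]
    if not directional:
        searches += [
            ("CTOT", g_to_a(read), revcomp(ref_ct)),  # complementary to original top
            ("CTOB", g_to_a(read), ref_ga),           # complementary to original bottom
        ]
    return [label for label, query, target in searches if query in target]

# Toy reference; the read is the reverse complement of a converted top-strand fragment,
# the kind of molecule extra rounds of primer extension can introduce.
reference = "GATTCACGGACTTGCCAGTA"
read = revcomp(c_to_t(reference[2:12]))
print(strand_hits(read, reference, directional=True))
print(strand_hits(read, reference, directional=False))
```

In this toy case the two-way directional search finds nothing, while the four-way search places the read on the strand complementary to the original top strand. Doubling the number of searches per read is one plausible reason mapping takes longer for such libraries.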
In their current study, the researchers began by applying this approach to a dozen oocytes at the metaphase II stage of the cell cycle that had been collected from mice after ovulation. They also tested 32 mouse embryonic stem cells with scBS-seq, including 20 grown in typical growth medium and 12 grown in serum containing kinase inhibitors.
The methylation profiles detected for those individual cells were subsequently compared with those found in seven negative control scBS-seq libraries and pooled cells from the same sample types.
The researchers sequenced each sample to a relatively low depth with Illumina's HiSeq, mapping around 3.9 million reads per sample, on average.
The sequence data provided a peek at methylation levels at between 1.8 million and 7.7 million CpG sites per sample (3.7 million, on average), representing roughly 18 percent of CpGs in an average individual cell tested by scBS-seq.
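As a back-of-the-envelope check, the figures above can be related to one another. The calculation below uses only the rounded numbers quoted in this article, so the implied genome-wide CpG total and the per-sample percentages are rough derived estimates, not values reported by the study.

```python
# Rounded figures quoted in the article.
mean_sites = 3.7e6      # average CpG sites covered per cell
mean_fraction = 0.18    # roughly 18 percent of CpGs
min_sites, max_sites = 1.8e6, 7.7e6

# Implied total number of CpG sites (an estimate derived from rounded inputs).
implied_total = mean_sites / mean_fraction

# Approximate coverage at the low and high ends of the reported range.
low_fraction = min_sites / implied_total
high_fraction = max_sites / implied_total

print(f"Implied CpG total: about {implied_total / 1e6:.1f} million")
print(f"Coverage range: about {low_fraction:.0%} to {high_fraction:.0%}")
```

On these rounded inputs, the implied total comes to a little over 20 million CpG sites, with individual libraries spanning roughly a tenth to somewhat more than a third of them.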
It's anticipated that additional CpGs can be profiled by generating deeper sequence coverage — something the study's authors demonstrated through deeper sequencing with slightly longer reads on two of the mouse oocyte cell libraries.
Results from experiments performed in the current study hint that it's possible to boost the number of CpG sites assessed across the genome by using 150 base pair paired-end reads rather than paired-end reads spanning 100 base pairs apiece.
Still, Kelsey noted that there can be diminishing returns related to declines in sequence quality at the ends of the reads, depending on the size of the DNA fragments going into the library.
"If one was to [sequence] paired ends and you're overlapping, then obviously you're sequencing the same bit of DNA twice," he said. "So there's an element of diminishing returns for certain read lengths."
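The diminishing returns Kelsey describes follow from simple geometry. Here is a minimal sketch, assuming idealized reads with no quality trimming or adaptor read-through; the fragment lengths are hypothetical values chosen for illustration, not measurements from the study.

```python
def paired_end_yield(fragment_len, read_len):
    """Return (unique_bases, overlapping_bases) for one paired-end fragment.

    Each read covers at most the whole fragment; once the two reads meet in
    the middle, further read length only re-sequences the same bases.
    """
    sequenced = 2 * min(read_len, fragment_len)
    unique = min(fragment_len, 2 * read_len)
    return unique, sequenced - unique

# Hypothetical fragment sizes, compared at the two read lengths mentioned in the article.
for fragment_len in (180, 400):
    for read_len in (100, 150):
        unique, overlap = paired_end_yield(fragment_len, read_len)
        print(f"fragment={fragment_len} read={read_len} unique={unique} overlap={overlap}")
```

For a short fragment, moving from 100 to 150 base pair reads adds only overlap, whereas for a longer fragment the extra read length yields new sequence.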
In the two mouse oocyte libraries sequenced most extensively, the team saw methylation profiles at almost 50 percent of the CpG sites in the genome, though Kelsey cautioned that that may be approaching the current saturation levels for a given single-cell library.
"I don't think these libraries will allow us to get much more information from a [single cell]," he said. "But no doubt there will be further improvements in the method that will allow one, if one wanted to, to cover all CpGs in the library."
In contrast, bisulfite sequencing on bulk cell samples typically offers a look at somewhere on the order of 70 percent of CpG sites, since not all reads can be mapped back to the genome after bisulfite treatment.
Even in samples where fewer than half of CpG sites are directly interrogated, though, scBS-seq appears to provide valuable information on methylation patterns in individual cell genomes and the variability from one cell to the next.
Moreover, Kelsey explained, it's possible to infer additional methylation patterns in several regions based on methylation marks profiled directly. "For some regions, you can infer methylation patterns if you've got sufficient isolated CpGs, or CpG islands," he said, "because there's a tremendous degree of concordance across the CpG islands."
Using the mouse oocyte data, the team was able to assess the technical variability in their method, since that cell type is known for its homogeneity and for well-demarcated regions of high or low cytosine methylation. Generally speaking, experiments in those cells pointed to good reproducibility and relatively little noise in the scBS-seq experiments.
Beyond using the oocyte data to study technical variation, Kelsey and his colleagues are keen to begin interpreting the methylation patterns in mouse oocytes as an entry point to more detailed studies of methylation in oocytes from other animals.
"My group is very interested in understanding the methylation profile of oocytes," he explained. "One of the motivations for the single-cell approach is that we can move beyond studies in the mouse into other species for which oocytes are going to be much more difficult to get in numbers."
In contrast, the mouse embryonic stem cells considered in the study are far more prone to heterogeneity in their transcriptional and cytosine methylation profiles.
In that cell type, Kelsey explained, researchers hope to use single-cell methylation profiling methods such as scBS-seq to untangle the "immense degree of variation" in methylation profiles from one cell to the next so that they can get a better view of the relationships between particular methylation patterns in an embryonic stem cell and that cell's eventual fate.
Results from the current study hint that cell-to-cell methylation variability is enhanced at sites annotated as enhancers in mouse embryonic stem cells, though Kelsey emphasized that "we cannot explain all the sites of methylation variation between cells as mapping to enhancer elements."
"There's quite a large amount of cell-to-cell variation that appears not to map to any standard genome annotation," he said. "It's going to be quite interesting to explore what those regions are and what functional importance they might have."
He and his colleagues are especially interested in continuing to apply scBS-seq to characterize methylation in oocytes from other organisms to develop a better understanding of methylation programming and reprogramming in pre-implantation embryos across multiple mammalian species.
More generally, though, the study's authors argue that the scBS-seq method should be applicable to other studies of embryonic development as well as clinical tests aimed at designing stem cell-centered therapies or more effectively understanding and treating cancer or fertility problems.
While the group is quite pleased with the way scBS-seq is performing technically, Kelsey said there are ongoing efforts to streamline the method before rolling it out for more widespread applications.
In particular, he noted that the sample preparation side of the approach remains quite time-consuming due to the multiple rounds of primer extension reaction entailed in the pre-amplification step.
"Doing 12 libraries over a couple of days is fine," Kelsey said, "but if we wanted to really exploit the potential of single-cell methods, we'd like to be able to do 100 libraries in an experiment. We need to work on the method to get to that level of throughput."