Two teams of researchers have developed new methods that use high-throughput sequencing to profile DNA methylation and that could be useful for analyzing methylation sites across large genomes.
One group, led by scientists at the University of California, San Diego, and Virginia Commonwealth University, devised a targeted methylation-profiling method that uses padlock probes to capture tens of thousands of bisulfite-converted short genomic targets containing CpG sites for sequencing.
The second team, led by George Church's group at Harvard Medical School, has come up with a similar method, also based on padlock probes, to selectively enrich CpG-containing bisulfite-converted DNA regions for sequencing.
That group has also developed a complementary method, called methyl-sensitive cut counting, that uses a methylation-sensitive enzyme to interrogate 1.4 million methylation sites across the human genome.
The San Diego/Virginia group designed a set of approximately 30,000 padlock probes in order to assess methylation of approximately 66,000 CpG sites within 2,020 CpG islands, or 2.1 megabases in total, and used them to compare methylation in three human fibroblast lines and eight human pluripotent stem-cell lines. They used Illumina's Genome Analyzer to sequence the captured DNA.
The researchers found that although the padlock capture was very specific, the capture efficiency between different probes was quite uneven. Using a couple of normalization approaches, they were able to reduce this representation bias somewhat, but wrote that "we think there is room for further improvement, especially in better understanding the annealing thermodynamics of padlock probes, characterizing potential sequence-dependent bias of DNA polymerase or ligase, and post-capture normalization of biased libraries."
According to Kun Zhang, a professor in the department of bioengineering at UCSD and the senior author of the paper, reducing this bias "is quite important since it is probably the biggest limitation of padlock probes compared to other array-based or solution-based capture methods."
In the future, they could possibly expand their approach to a larger number of CpG sites, he and his colleagues wrote. "As we have not encountered an upper limit, it seems possible that all CpG islands in the human genome (about 20 megabases in total size) or other genomic targets of similar size can be captured and sequenced in single-tube reactions," according to the paper.
Zhang said that the padlock method is significantly cheaper than whole-genome bisulfite sequencing, and it is more flexible than reduced-representation bisulfite sequencing regarding the selection of targets. Unlike affinity-capture methods, it quantifies the methylation sites absolutely, so studies across labs can be compared easily, he added.
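As a rough illustration of what absolute quantification means for bisulfite sequencing in general: bisulfite treatment converts unmethylated cytosines so that they are read as T, while methylated cytosines are still read as C, so the methylation level at a CpG site can be estimated as the fraction of reads showing C. The minimal Python sketch below assumes the base calls at a single CpG cytosine have already been collected from aligned reads. The function name and the example reads are hypothetical, and this is not the analysis pipeline used in either paper.

```python
from collections import Counter


def methylation_level(bases_at_cpg):
    """Estimate the methylation level at one CpG cytosine.

    bases_at_cpg: iterable of base calls (one per read) observed at the
    cytosine position after bisulfite conversion. 'C' is counted as
    methylated, 'T' as unmethylated; any other call is ignored.
    Returns a fraction between 0 and 1, or None if no C/T calls exist.
    """
    counts = Counter(base.upper() for base in bases_at_cpg)
    methylated = counts["C"]
    unmethylated = counts["T"]
    total = methylated + unmethylated
    if total == 0:
        return None  # no informative reads at this site
    return methylated / total


# Hypothetical base calls from eight reads covering one CpG site.
example_reads = ["C", "C", "T", "C", "T", "T", "C", "C"]
level = methylation_level(example_reads)
print(level)
```

Because the estimate is a simple ratio of read counts at each site, it does not depend on a lab-specific enrichment signal, which is the sense in which such values are comparable across studies.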
"One limitation is that you have to spend the money and effort to make a set of padlock probes," he said. On the other hand, once the probes are made, the method can be scaled up to large numbers of samples without incurring any further costs.
Zhang said the approach will benefit projects that aim to accurately quantify absolute methylation levels across large numbers of targets. "Any study that requires such information will find this method useful," he said.
The method "will be very popular" for studying methylation in cancer, stem cells, neuronal plasticity, memory and learning, and aging, according to Yuan Gao, a professor in the Center for the Study of Biological Complexity and in the department of computer science at Virginia Commonwealth University, who is the other senior author of the paper. "We have ongoing collaborations on several of these already," he added.
The Church group, by contrast, designed 10,000 shorter padlock probes to profile about 7,000 CpG sites in the Encyclopedia of DNA Elements pilot project region, and used them to study methylation in this area in B-lymphocytes, fibroblasts, and induced pluripotent stem cells from members of the Personal Genome Project. Like the other team, they used Illumina's Genome Analyzer to sequence the captured DNA.
Both padlock probe methods are similar in principle, but with the Church group's method, the captured DNA can be sequenced directly without library construction. However, their probes are shorter, capturing fewer CpG sites than the probes used by Zhang's team.
"The superior specificity of padlock probes is a key to its success when the genomic DNA is bisulfite-converted," a treatment that reduces the DNA's sequence complexity, Jin Billy Li, a postdoc in the Church lab and one of the lead authors of the study, said in an e-mail. The method "is particularly useful to profile cytosine methylation at selected CpG loci in numerous genes."
Counting the Cuts
Li's team's second method, methyl-sensitive cut counting, or MSCC, is by contrast a genome-wide approach. It uses the methylation-sensitive restriction enzyme HpaII, which cuts only unmethylated CCGG sequences; a sequencing library of tag fragments is then generated from the cut sites.
Though the human genome contains 2.3 million HpaII sites, only about half of these are "sufficiently unique for use in profiling," and these cover about 1.4 million CpG sites, according to the paper. The researchers profiled these sites in a single B-lymphocyte cell line from a PGP individual.
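The counting idea behind MSCC can be sketched in a few lines of Python: locate HpaII recognition sites (CCGG) in a reference sequence, then tally how many sequenced tags originate at each site. Since HpaII cuts only unmethylated sites, a higher tag count at a site suggests lower methylation there. This is a toy sketch under simplifying assumptions, not the authors' pipeline: the function names, the short example sequence, and the tag positions are all hypothetical, and tags are assumed to be already mapped to the start coordinate of the site they came from.

```python
import re

HPAII_SITE = "CCGG"  # recognition sequence of HpaII


def find_hpaii_sites(sequence):
    """Return the 0-based start positions of every CCGG in the sequence."""
    return [match.start() for match in re.finditer(HPAII_SITE, sequence.upper())]


def tag_counts_per_site(sites, tag_positions):
    """Count mapped tags per HpaII site.

    sites: list of site start positions.
    tag_positions: site start position assigned to each mapped tag
    (assumed to come from an upstream alignment step).
    Tags that do not match a known site are ignored.
    """
    counts = {site: 0 for site in sites}
    for position in tag_positions:
        if position in counts:
            counts[position] += 1
    return counts


# Hypothetical reference fragment with three HpaII sites.
reference = "ATCCGGTTACCGGAACCGGT"
sites = find_hpaii_sites(reference)

# Hypothetical mapped tags: three from the first site, one from the third.
tags = [2, 2, 2, 15]
counts = tag_counts_per_site(sites, tags)
print(sites)
print(counts)
```

In this toy example the site with no tags would be the candidate for high methylation, although in practice a zero count could also reflect low coverage or a site that cannot be mapped uniquely.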
"The protocol is extremely simple, just like many other next-generation sequencing library-construction [protocols]," Li said. In addition, MSCC "is free of probe design and synthesis and bisulfite treatment."
He pointed out that both methods developed by his group "are relatively unbiased," which he said is important to identify differentially methylated regions or sites.
Both groups said they plan to commercialize their methods. Zhang said he and his colleagues are "in discussion with several companies" about commercializing their padlock probe method, and Li said that Harvard recently filed patents on both technologies described in their paper.
For their padlock-probe-based approaches, both groups used custom-made oligonucleotide libraries from programmable microarrays, which Agilent Technologies provided to them as part of an early-access program it has maintained over the last two years.
But it is not yet clear whether Agilent plans to turn this application for its oligo libraries into a new product. "Our intent is to see the creative uses [researchers] devise for this tool," Fred Ernani, senior product manager for emerging genomics applications at Agilent, told In Sequence by e-mail. "While we don't always commercialize the technology described in the resulting papers, we are continually evaluating their potential to become products."
So far, Agilent has developed one commercial product that uses its custom oligo libraries: its SureSelect genomic target enrichment kit, which launched earlier this year and is based on a method developed by the Broad Institute (see In Sequence 2/24/2009).