Researchers from the University of Chicago and Emory University have come up with a bisulfite treatment-related sequencing method that shows promise for detecting a DNA modification known as 5-carboxycytosine, or 5caC, across the genome with base-level resolution.
The team outlined the basis for its scheme — dubbed chemical modification-assisted bisulfite sequencing, or CAB-seq — in a study published earlier this month in the Journal of the American Chemical Society. The method hinges on the availability of a chemical catalyst that slightly alters 5caC's chemistry.
By labeling the modified base in that manner, authors of the study explained, it becomes possible not only to enrich for it but also to change its response to bisulfite treatment. Then, by performing bisulfite conversion on the same sample before and after a 5caC-labeling treatment, researchers can determine the position of 5caC bases.
"We offered two methods. For the first method we used chemistry approaches to selectively label [5caC] with biotin," senior author Chuan He, a chemistry and biophysical dynamics researcher with the University of Chicago, told In Sequence. "That allows us to enrich carboxylC-containing DNA for sequencing."
In its existing form, the CAB-seq method is best suited to looking at only a limited number of loci at once in mammalian genomes, he noted, since the relative scarcity of stable 5caC intermediates in those organisms necessitates deep sequencing to detect the cytosine marks.
Because it can be applied in conjunction with antibody- or biotin label-based enrichment, though, he predicted that the CAB-seq approach could be used to look at much larger swaths of sequence when sufficient quantities of sample are available.
The team is continuing to work on ways of improving the approach with an eye to eventually understanding 5caC profiles across the genome.
"We are in the process of developing more effective methods for enrichment and readout of [5caC], building on the method we just published," He said. "Hopefully, we can get a whole-genome map of [5caC]."
Cytosine methylation has garnered a good deal of attention as an epigenetic mark on DNA. For the most part, methylation studies to date have focused on 5-methylcytosine.
But researchers are increasingly interested in understanding related modifications to the cytosine base as well, particularly in light of studies suggesting that the 5-hydroxymethylcytosine base formed during cytosine demethylation might also serve as a source of epigenetic information — at least in some biological contexts.
Consequently, several groups have taken a crack at coming up with sequencing approaches for routinely profiling 5hmC marks.
For instance, researchers in Hagan Bayley's University of Oxford lab characterized current patterns associated with 5hmC bases as they pass through the pore of alpha-hemolysin, a protein being pursued in the nanopore sequencing sphere (IS 10/12/2010).
Another UK-based group reporting in Science last spring outlined an approach called oxidative bisulfite sequencing, or oxBS-seq, for identifying — and distinguishing between — methylcytosine and hydroxymethylcytosine bases (IS 5/1/2012).
Members of He's University of Chicago lab have been developing sequencing methods for 5hmC detection, too. In 2011, He and his colleagues published results related to a single-molecule hydroxymethylation sequencing strategy that sprang from a collaboration with Pacific Biosciences researchers (IS 12/6/2011).
Since then, He's group has developed a so-called Tet-assisted bisulfite sequencing, or TAB-seq, approach for getting base-level information on 5hmC locations in the genome, described in Cell last spring (IS 5/22/2012).
Nevertheless, he noted, 5hmC is not the final intermediate formed during cytosine demethylation. Instead, that base gets converted to 5-formylcytosine (5fC) and 5caC, which are recognized and lopped off by DNA base repair systems to form fully demethylated cytosine.
"If you just sequence [5hmC] … you don't know which [5hmC] is a stable epigenetic mark and which are going toward demethylation," He explained.
"The best way [to distinguish between those options] is to sequence [5fC] and [5caC]," he added. "Once you get to the [5fC] and [5caC] stage, you're committed to demethylation."
In Cell earlier this year, He and his colleagues described a method for detecting the 5fC intermediate. For their latest study, they set their sights on the other pre-demethylation intermediate, 5caC.
First, the team had to track down a chemical with selectivity for carboxycytosine. That search led them to a chemical catalyst called 1-ethyl-3-[3-dimethylaminopropyl]-carbodiimide hydrochloride, commonly abbreviated EDC, which catalyzes the coupling of an amine to 5caC's carboxyl group.
By then using amines tagged with biotin, He and his co-authors noted in their new paper, it becomes possible to enrich for 5caC with the help of biotin-binding streptavidin beads.
The new labeling scheme helps those interested in seeing 5caC by sequencing, too, He said.
That's because 5caC normally acts like unmodified cytosine during bisulfite treatment, getting converted so that it is read as thymine. After amine labeling, though, 5caC bases become more resilient under bisulfite treatment conditions. Rather than appearing as thymine in bisulfite sequence data, the labeled 5caC moieties resist conversion and are read as cytosine, resembling methylated cytosine bases.
Theoretically, that means that the position of each carboxycytosine can be detected across the genome, He said, by comparing data from a standard bisulfite sequencing experiment with sequences generated following a 5caC-labeling step and bisulfite treatment.
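To make the comparison concrete, here is a minimal Python sketch of the subtraction logic described above. It is an illustration only, not the authors' analysis pipeline: the `SiteCounts` type, the `estimate_5cac` function, the `min_depth` cutoff, and the example read counts are all hypothetical. It assumes that at each cytosine position the reads have already been tallied as C (unconverted) or T (converted), and that any 5caC signal shows up as extra unconverted reads in the labeled sample relative to the standard bisulfite sample.

```python
from dataclasses import dataclass


@dataclass
class SiteCounts:
    """Read counts at one cytosine position (hypothetical input format)."""
    c_reads: int  # reads showing C, i.e. the base resisted bisulfite conversion
    t_reads: int  # reads showing T, i.e. the base was converted

    @property
    def depth(self) -> int:
        return self.c_reads + self.t_reads

    @property
    def unconverted_fraction(self) -> float:
        return self.c_reads / self.depth


def estimate_5cac(standard_bs, labeled_bs, min_depth=10):
    """Estimate the 5caC fraction per position as the gain in unconverted
    reads after labeling (labeled minus standard), floored at zero.

    Positions missing from either sample, or covered by fewer than
    `min_depth` reads in either sample, are skipped.
    """
    estimates = {}
    for pos in sorted(standard_bs.keys() & labeled_bs.keys()):
        std, lab = standard_bs[pos], labeled_bs[pos]
        if std.depth < min_depth or lab.depth < min_depth:
            continue
        gain = lab.unconverted_fraction - std.unconverted_fraction
        estimates[pos] = max(0.0, gain)
    return estimates


# Made-up counts for three positions, purely for illustration.
standard = {100: SiteCounts(20, 80), 200: SiteCounts(50, 50), 300: SiteCounts(1, 2)}
labeled = {100: SiteCounts(40, 60), 200: SiteCounts(50, 50), 300: SiteCounts(3, 0)}
calls = estimate_5cac(standard, labeled)
```

In this toy input, position 100 gains unconverted reads after labeling, position 200 does not change, and position 300 is dropped for low coverage. A real analysis would also need a statistical test and a correction for the labeling background that He mentions, both of which this sketch leaves out.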
In practice, though, there are still hurdles to overcome before CAB-seq can be applied for whole-genome sequencing. That's particularly true for sequences in which 5caC is rare, since the sequencing depth needed to pick up the modified bases may be prohibitively expensive.
"In principle, if someone has [sufficient] funding they can do this genome-wide," He said. "Right now, the major problem is that the [5caC] levels are quite low, so you need a lot of sequencing depth to see [5caC]."
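The link between low 5caC levels and sequencing depth can be sketched with a simple binomial model. The Python below is a rough illustration under stated assumptions, not a calculation from the study: it assumes each read covering a site independently carries the mark with a fixed probability, ignores labeling background and conversion errors, and uses hypothetical function names and thresholds (`prob_at_least_k`, `min_depth_needed`, three supporting reads, 95 percent confidence).

```python
from math import comb


def prob_at_least_k(depth, fraction, k):
    """Probability that at least k of `depth` reads carry the mark,
    treating each read as an independent draw with probability `fraction`."""
    p_fewer = sum(
        comb(depth, i) * fraction**i * (1 - fraction) ** (depth - i)
        for i in range(k)
    )
    return 1.0 - p_fewer


def min_depth_needed(fraction, k=3, confidence=0.95, max_depth=100_000):
    """Smallest depth at which at least k supporting reads are expected
    with the requested confidence, or None if max_depth is not enough."""
    for depth in range(k, max_depth + 1):
        if prob_at_least_k(depth, fraction, k) >= confidence:
            return depth
    return None


# Compare a hypothetical abundant mark with a hypothetical rare one.
depth_common = min_depth_needed(0.10)
depth_rare = min_depth_needed(0.01)
```

Under this model the required depth grows roughly in inverse proportion to the modification fraction, which is the intuition behind He's comment: the rarer the mark at a site, the more reads are needed before it can be seen at all.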
In mammalian samples, for instance, studies so far suggest 5caC and its fellow demethylation intermediate 5fC tend to be quickly demethylated to cytosine. Based on such patterns, He speculated that 5caC's primary function in mammalian systems is its role as a stepping-stone to full demethylation.
Still, researchers haven't dismissed the possibility that the modification could have a biological or regulatory role in its own right, particularly in non-mammalian organisms with high levels of 5caC, which have been identified in unpublished studies.
"There's always a possibility that cells generate 5fC and then 5caC to add an additional layer of complexity for regulation," He explained. "So it's important to have an accurate way to detect both of them."
By continuing to develop and improve methods for profiling each of the demethylation intermediates across the genome, the team hopes that it will eventually be possible to more fully explore that possibility.
As it stands at the moment, the CAB-seq approach appears to be best suited for looking at locus-specific 5caC patterns in mammalian samples, He said, or for doing more extensive 5caC sequencing in organisms with relatively high 5caC levels.
He noted that it should also be possible to track 5caC genome-wide using the current approach if there is enough starting material on hand to do antibody-based enrichment or enrichment for labeled 5caC with streptavidin beads.
Nevertheless, He and his colleagues are looking at ways of tweaking and improving their existing CAB-seq protocol, including potential changes to the chemistry used to label 5caC, with the aim of bolstering the method's sensitivity and specificity for 5caC.
"The labeling chemistry, though specific for [5caC], still has some background," He said. "So if we can exclusively label [5caC], which we're trying right now, that … could be directly applied to our methods."
He and his co-authors have filed for patents related to the 5caC methods described in the new study as well as the 5fC sequencing methods developed previously.
A University of Chicago spinout company called Wisegene — which currently markets reagents and kits related to the He team's 5hmC sequencing method, TAB-seq — may ultimately take steps towards commercializing products related to the group's newer 5caC and 5fC profiling approaches. In the case of the CAB-seq approach outlined in the current study, He noted that the group "probably will improve it a little bit before we commercialize."