Pacific Biosciences' customers are using the company's recently launched analysis method for base modifications to study methylation and other epigenetic events in microbes and pathogens such as Escherichia coli and Salmonella.
The PacBio system has a unique feature that allows for epigenetic analysis without performing a separate experiment, such as methyl-seq or bisulfite sequencing. Instead, base modifications can be detected by analyzing the kinetics of the system. As the polymerase incorporates nucleotides, it pauses detectably at a modified base, such as one carrying a methyl or a hydroxymethyl group, and that pause signals the presence of the modification.
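The kinetic idea can be illustrated with a minimal Python sketch. This is not PacBio's algorithm: the function name `flag_pauses`, the threshold `min_ratio`, and all of the pause durations are hypothetical. It assumes that an observed pause duration and an expected, unmodified pause duration are available for each position.

```python
def flag_pauses(observed, expected, min_ratio=3.0):
    """Return 0-based positions whose observed pause is at least
    min_ratio times the pause expected for an unmodified base.

    min_ratio is an illustrative threshold, not a published value.
    """
    flagged = []
    for pos, (obs, exp) in enumerate(zip(observed, expected)):
        # Skip positions with no usable baseline to avoid dividing by zero.
        if exp > 0 and obs / exp >= min_ratio:
            flagged.append(pos)
    return flagged


# Hypothetical mean pause durations per position, in seconds.
observed = [0.21, 0.19, 1.05, 0.22, 0.20]
expected = [0.20, 0.20, 0.21, 0.20, 0.20]

print(flag_pauses(observed, expected))  # [2]
```

Only the third position is flagged, because its pause is about five times the baseline. A real analysis would also need to account for coverage and statistical noise.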
The company published a proof-of-principle paper describing its base modification sequencing strategy in Nature Methods two years ago (IS 5/11/2010).
Recently, the company released software that flags base-modification events in the sequence data. Users of the PacBio RS with C2 chemistry can download the software for free. Alternatively, users can analyze the sequence data for these pauses using their own bioinformatics tools.
The current version of PacBio's algorithm does not distinguish between the different types of modifications, but that information will be incorporated into future versions of the algorithm, Jonas Korlach, scientific fellow and co-founder at PacBio, told In Sequence.
Each event produces a slightly different "pausing pattern" in the polymerase, which is then further analyzed to determine the type of modification. Currently, PacBio provides a motif identification tool for bacterial methylomes. Bacterial methylomes can be characterized by determining methyltransferase specificities from the methylation patterns throughout the genome. Additionally, the company currently provides recommendations for which base modifications typically display certain secondary peaks, which can help in identifying the precise modification, Korlach said.
The company has published a technical note on its website describing how researchers can use the software to analyze 6-methyladenine and 4-methylcytosine in bacterial genomes.
Additionally, the long reads of the PacBio will "ultimately allow phasing of epigenetics," said Korlach. This could help researchers better understand phenomena such as chromosome inactivation and imprinting, and help create a more comprehensive epigenetic profile for analyzing development and disease progression.
PacBio customers such as the Department of Energy's Joint Genome Institute and the US Department of Agriculture are currently using the method to evaluate base modifications in microbes that are important for biofuels or bioremediation and in pathogens that cause foodborne disease or infect livestock.
The microbial space in particular is one that "our customers are excited about," said Korlach. "It's well known from the literature that methylation plays an important role in the basic life function of microbial growth," he said. But there is also evidence that it impacts pathogenicity, "inducing changes in microbial diseases that have direct effects on how infectious the strain is," for instance.
"Methylation profiles may relate to the differences in strain fitness … and virulence in humans," Robert Mandrell, research leader of the Produce Safety and Microbiology Research group at the USDA's Agricultural Research Service, told In Sequence.
Mandrell's group does not own a Pacific Biosciences machine, but in a proof-of-principle collaboration with PacBio, the company sequenced six E. coli genomes from isolates of the O145 strain representing two different outbreaks — a 2007 outbreak in Belgium associated with ice cream, and another in Arizona in 2010 associated with romaine lettuce.
Originally, the group simply wanted to take advantage of PacBio's long reads to help with assembly. And, indeed, incorporating the long reads of the PacBio helped the team reduce the number of contigs in one of the isolates to eight from 353 using a combination of Illumina and 454 sequencing.
Examining the genomes for base modifications, the group identified an unusual methyltransferase in the strain that caused the outbreak in Arizona. A PacBio team has also identified the same methyltransferase in the E. coli strain responsible for the outbreak in Germany last summer.
While the methyltransferase was "not novel, it was unusual to see it in E. coli," Mandrell said.
The strain responsible for the outbreak in Germany last summer, O104:H4, was identified as an enteroaggregative E. coli strain that acquired a phage genome with the capability of producing a Shiga toxin. The PacBio team found the methyltransferase in the phage and Mandrell's USDA team found the same methyltransferase in the isolate from the Arizona outbreak.
Because the Shiga toxin-producing phage from the O104 German E. coli strain played such a large role in that strain's pathogenicity, Mandrell said this indicates that the methyltransferase could have a role in pathogenicity as well. However, he said that further functional studies would need to be done to demonstrate this. Additionally, he said, the strain from the 2007 Belgian outbreak did not have the methyltransferase and it, too, caused illness.
There is already some evidence that methyltransferases can affect pathogenicity. For instance, it's been demonstrated that "mucosal pathogens have methyltransferases that can alter expression profiles of genes involved in virulence," said Mandrell.
He said his next step is to use PCR to screen a collection of more than 6,000 Shiga toxin-producing E. coli isolates from the agricultural region of Salinas Valley in California to see if those also have the same methyltransferase.
Additionally, he said, the group would like to continue to do sequencing on the PacBio to evaluate base modifications in different outbreak strains. To do this, they will work with another USDA ARS center in Nebraska that has a PacBio machine.
The DOE's JGI, which currently owns two PacBio machines, is also using the company's recently launched software to analyze base modifications.
Rex Malmstrom, who heads the micro-scale applications group at JGI and has been working on epigenetic sequencing, said that the system is especially useful for detecting base modifications other than 5-methylcytosines. Modifications like N6-methyladenines, which are very common in bacteria, are not detectable by chemical conversion methods such as bisulfite sequencing, he said (IS 6/19/2012). The institute is now using the PacBio to measure N6-methyladenines in most of the bacterial genomes it is sequencing.
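A toy Python model shows why chemical conversion misses adenine methylation. It relies on the standard behavior of bisulfite treatment: unmethylated cytosines are read as thymine after conversion, while 5-methylcytosines are protected. The function `bisulfite_readout` and the six-base sequence are hypothetical, and the model ignores incomplete conversion and strand effects.

```python
def bisulfite_readout(seq, methyl_c=(), methyl_a=()):
    """Simulate the bases read after bisulfite conversion.

    methyl_c: 0-based positions of 5-methylcytosine (protected, stay C).
    methyl_a: 0-based positions of N6-methyladenine. Deliberately unused,
              because bisulfite chemistry does not act on adenine.
    """
    protected = set(methyl_c)
    return "".join(
        "T" if base == "C" and i not in protected else base
        for i, base in enumerate(seq)
    )


seq = "GATCCG"
print(bisulfite_readout(seq))                # GATTTG (no methylation)
print(bisulfite_readout(seq, methyl_c=[3]))  # GATCTG (5mC leaves a C)
print(bisulfite_readout(seq, methyl_a=[1]))  # GATTTG (same as unmethylated)
```

The methylated adenine produces exactly the same readout as the unmodified sequence, so the conversion step carries no information about it.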
New England Biolabs is also collaborating with PacBio to evaluate N6-methyladenine and 4-methylcytosine in restriction enzymes and their associated DNA methyltransferases, chief scientific officer Rich Roberts told IS.
Roberts said that PacBio is doing the sequencing and base modification analysis, first testing the protocol in bacterial genomes with only two to three restriction modification systems. By sequencing the whole genome and looking for the methylation sites, the researchers are then able to "find a motif that is the recognition sequence for whatever methylase is in there," Roberts said.
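The step from methylation sites to a recognition motif can be sketched in a few lines of Python. This is an illustration under simplifying assumptions, not the tool PacBio or New England Biolabs uses. The helper names `contexts` and `consensus`, the `min_fraction` cutoff, and the toy genome are all hypothetical, and the sketch considers only one strand and a fixed window width.

```python
from collections import Counter


def contexts(genome, sites, flank=2):
    """Collect fixed-width windows centred on each modified position."""
    windows = []
    for pos in sites:
        start, end = pos - flank, pos + flank + 1
        # Drop sites too close to the sequence ends for a full window.
        if start >= 0 and end <= len(genome):
            windows.append(genome[start:end])
    return windows


def consensus(windows, min_fraction=0.9):
    """Keep a base in each column if it occurs in at least min_fraction
    of the windows; otherwise report N (any base)."""
    if not windows:
        return ""
    motif = []
    for column in zip(*windows):
        base, count = Counter(column).most_common(1)[0]
        motif.append(base if count / len(windows) >= min_fraction else "N")
    return "".join(motif)


# Toy genome with modified adenines at 0-based positions 3, 9 and 15.
genome = "TTGATCCAGATCGGGATCAT"
sites = [3, 9, 15]

windows = contexts(genome, sites)
print(windows)             # ['TGATC', 'AGATC', 'GGATC']
print(consensus(windows))  # NGATC
```

In this toy case the conserved positions spell out GATC around the modified adenine, while the variable flanking base collapses to N. Real motif discovery would also have to handle multiple methyltransferases in one genome, degenerate positions, and both strands.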
The company is studying methylation in a specific class of restriction enzymes in which it was difficult to determine recognition sequences with traditional methods. This is because these enzymes cut randomly, so the only way to determine the recognition sequence was to look at methylation sites directly. "That was quite difficult to do," said Roberts, and "would often take several months of experiments to get a clear recognition sequence."
It was "so labor intensive that we didn't want to do it," he added.
Roberts said that his group is still developing protocols to evaluate 5-methylcytosine, which doesn't elicit as strong a signal on the PacBio as N6-methyladenine or 4-methylcytosine. Additionally, while the methylation work has so far been done in collaboration with PacBio, he said New England Biolabs recently applied for a grant to purchase its own machine.