SAN FRANCISCO (GenomeWeb) – Single-molecule sequencing to identify methylation patterns can pick apart the various bacterial strains present in a metagenomic sample, according to researchers from the Icahn School of Medicine at Mount Sinai.
The researchers took advantage of a feature of Pacific Biosciences' sequencing technology that enables the direct detection of methylation to develop a metagenomic sequencing technique that uses strain-specific methylation signatures to determine the composition of a mixed sample and to match plasmids to their host genomes. The team described the method in Nature Biotechnology this week.
The researchers have filed a patent on the method and are exploring potential commercial opportunities, according to Gang Fang, corresponding author and assistant professor of genetics and genomic science at Mount Sinai. He said that it could have applications in outbreak settings and in diagnosing pathogens from microbiomes. Initially, it would be especially useful for low- to medium-complexity samples, such as the infant microbiome, he added.
For the study, the researchers first developed a method to calculate methylation motifs and assign scores to those motifs, as well as a motif-filtering approach that would assign contigs with evidence of methylation to a motif-specific bin.
Essentially, standard PacBio sequencing is done on a sample and contigs are assembled, Fang explained. For each assembled contig, the researchers assign a methylation signature. Then, filtering gets rid of the vast majority of non-methylated contigs. After that, the team has a matrix in which each row represents a metagenomic contig and each column represents a methylation motif. "Each entry tells you how much the motif is methylated on a particular contig," he said, and then "contigs are clustered based on their similarity of methylation motifs."
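The matrix-and-clustering step Fang describes can be illustrated with a minimal Python sketch. It is not the team's implementation: the contig names, the methylation fractions, the distance threshold, and the choice of single-linkage clustering are all assumptions made for illustration, and the three motifs are simply well-known bacterial methylation motifs used as column labels.

```python
from math import sqrt

# Hypothetical methylation matrix: one row per contig, one column per motif.
# Each entry is the fraction of motif sites that look methylated (made-up values).
motifs = ["GATC", "CCWGG", "GANTC"]
profiles = {
    "contig_1": [0.95, 0.02, 0.01],
    "contig_2": [0.91, 0.05, 0.00],
    "contig_3": [0.03, 0.97, 0.88],
    "contig_4": [0.00, 0.93, 0.90],
    "contig_5": [0.02, 0.01, 0.96],
}


def distance(a, b):
    """Euclidean distance between two methylation profiles."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cluster_contigs(profiles, max_distance=0.3):
    """Single-linkage clustering: contigs within max_distance share a bin.

    The threshold is an arbitrary assumption, not a value from the study.
    """
    names = sorted(profiles)
    parent = {name: name for name in names}

    def find(name):
        # Follow parent links to the representative of this contig's bin.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if distance(profiles[a], profiles[b]) <= max_distance:
                parent[find(a)] = find(b)

    bins = {}
    for name in names:
        bins.setdefault(find(name), []).append(name)
    return sorted(bins.values())


bins = cluster_contigs(profiles)
for members in bins:
    print(members)
```

In this toy input, contigs with near-identical motif profiles land in the same bin, while a contig methylated at only one of the motifs ends up in a bin of its own.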
The process is effective because bacterial species have diverse methylation patterns. Unlike human genomes, where methylation primarily occurs at CpG sites, bacterial strains have, on average, three different methylation motifs, Fang said, making it unlikely that any two strains will have the same combination of methylation patterns.
To validate their method, the researchers created a synthetic metagenomic mixture of PacBio sequence reads from eight separately sequenced bacterial species. Motif filtering identified 16 motifs from the metagenomic contigs, based on the methylation scores. Fourteen of those motifs, or 88 percent, were exact matches to the true methylated motifs, while the remaining two were closely related.
Next, they analyzed methylation profiles of contigs assembled from PacBio sequencing of a fecal microbiome sample from an adult mouse. Their method identified 38 methylated motifs and nine distinct contig bins. Seven bins consisted of contigs from the order Bacteroidales and were all very similar in sequence composition.
Fang said that the methylation method for metagenomic sequencing has an advantage over so-called sequence composition-based approaches because even strains that have highly similar genome sequences tend to have different methylation patterns.
One of the most important advantages, though, is that the method can link plasmids and mobile elements to specific strains, he said. This is important because plasmids can contain antibiotic resistance genes or virulence factors, and it would be helpful to know which strain in a sample they belong to. Metagenomic sequencing methods that rely on coverage levels to link plasmids to their hosts are often not effective because plasmids can replicate independently of their hosts. However, plasmids do carry the same methylation patterns as their hosts.
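The idea of linking a plasmid to its host through shared methylation can be sketched as a nearest-profile lookup. This is a simplified illustration rather than the researchers' procedure: the strain names, profile values, and cutoff below are hypothetical, and a plasmid is left unassigned when no host profile is close enough.

```python
from math import sqrt

# Hypothetical per-strain methylation profiles (fraction methylated per motif).
host_bins = {
    "strain_A": [0.94, 0.03, 0.01],
    "strain_B": [0.02, 0.95, 0.89],
}


def distance(a, b):
    """Euclidean distance between two methylation profiles."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def assign_host(plasmid_profile, host_bins, max_distance=0.3):
    """Return the host whose profile is closest, or None if none is close.

    max_distance is an assumed cutoff chosen only for this example.
    """
    best = min(host_bins, key=lambda name: distance(plasmid_profile, host_bins[name]))
    if distance(plasmid_profile, host_bins[best]) > max_distance:
        return None
    return best


# Made-up plasmid profiles: the first resembles strain_A, the second resembles neither.
clear_match = assign_host([0.90, 0.00, 0.04], host_bins)
no_match = assign_host([0.50, 0.50, 0.50], host_bins)
print(clear_match, no_match)
```

Returning no assignment for an ambiguous profile reflects the fact that not every mobile element can be matched to a host with confidence.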
The researchers demonstrated that when they transformed Escherichia coli and Helicobacter pylori with a plasmid from an E. coli strain, they could pick apart the strains and match the plasmid to its host based on the methylation profile.
They further demonstrated this on simulated metagenomic samples of between 20 and 200 different strains, as well as on a mouse gut microbiome, in which they identified 19 mobile genetic elements and conclusively matched eight to their corresponding hosts.
Stephan Schuster, research director at the Singapore Center on Environmental Life Sciences Engineering at Nanyang Technological University, who was not affiliated with the study, said that the ability to link plasmids to their host genome was a particular advantage of the method. "It's a nicely done and very interesting study," he said.
He speculated that the method's applications could include studying microbial samples on biofilms, and that it could have medical applications, like analyzing dental plaque, or looking for the presence of pathogens on a catheter or other type of device that would be implanted.
For such applications, he said, there would be a limited number of microbes that would have to be teased apart, and methylation signatures could be a good way of doing that since they would also be able to tie in the plasmids. However, he questioned whether the resolution would be high enough for really complex environmental samples.
Fang concurred that initially, the best applications would be for low- and medium-complexity samples. Another issue, he said, is that the per-base cost of sequencing with PacBio is still higher than with shorter-read sequencing systems. It would likely cost about twice as much to use the PacBio methylation approach on a sample as to do shotgun sequencing on an Illumina instrument, he estimated.
Nonetheless, he said he expects this method to be complementary to other methods. For instance, it could be particularly helpful in parsing a sample that contains species from the order Bacteroidales, which he said is important for disease in the gut microbiome. The sequences of various strains are all very similar to each other but have different methylation patterns, he explained. It could also be helpful when linking a plasmid to its host is critical for understanding antibiotic resistance, virulence, or function, he said.