SAN FRANCISCO (GenomeWeb) – Dovetail Genomics is working to modify its in vitro proximity ligation genome assembly method and bioinformatics pipeline for metagenomics applications under a new two-year, $938,000 grant from the National Institutes of Health.
The Santa Cruz, California-based firm aims to build off the assembly protocol it calls Chicago, modifying it so that it is able to pick out specific bacterial strains from metagenomic samples. In particular, its goal is to be able to work with fecal samples, identifying the various strains in the gut microbiome.
Ramesh Ramakrishnan, Dovetail's senior vice president of R&D, said that the firm would eventually look to commercialize the product, but that it is still too early to say when it would do so or what a potential product would look like.
Dovetail Genomics first described its Chicago assembly method at the Plant and Animal Genomes conference in 2015 and published a study in Genome Research last year describing it in more detail.
In Chicago, the first step is what's called chromatin reconstitution, Ramakrishnan said. Although bacteria do not have chromatin, he said, the method can still work for metagenomics samples, since the DNA will still bind to proteins. Traditionally, chromatin reconstitution has been done in the context of human DNA using histones, but a similar mechanism will work for other organisms' genomes, he said.
After proteins are bound to the DNA, the complexes are fixed with formaldehyde and cut with a restriction enzyme, leaving sticky ends. Biotinylated nucleotides are added to those ends, and the free blunt ends are ligated. Finally, the crosslinks are reversed and the binding proteins removed. The resulting fragments are then digested with exonuclease to remove the biotinylated nucleotides and sequenced.
The method is similar to Hi-C, for which Dovetail also offers assembly services, but the main difference is in the resolution, Ramakrishnan said. Hi-C links DNA fragments that are "hundreds of kilobases to megabases" away from each other, "so proximity is somewhat limited," he said. With the Chicago method, however, the firm has demonstrated it can link DNA pieces that are between 10 kilobases and 100 kilobases away from each other.
That increased resolution has advantages for metagenomic samples, he said. For instance, one potential application is to look for causes of diarrheal disease by doing metagenomic sequencing of fecal samples. The two most common causes of diarrheal disease are Escherichia coli and Shigella. Both often also have plasmids associated with them that may contain virulence factors or antibiotic resistance genes, but because the plasmids are separate from the genome, "you don't always know which plasmids are associated with which genomes." The Chicago method can help in linking the plasmids with the correct genomes to identify the pathogenic strain that's causing diarrheal disease.
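The plasmid-to-genome linking idea described above can be illustrated with a minimal Python sketch. It is not Dovetail's pipeline, which the company has not described in detail; the function name `assign_plasmids`, the contig labels, and the `min_links` threshold are all hypothetical. The sketch assumes that each proximity-ligation read pair has already been mapped to a pair of contigs, and it simply assigns each plasmid contig to the chromosomal contig with which it shares the most read pairs.

```python
from collections import Counter, defaultdict


def assign_plasmids(read_pairs, plasmid_contigs, min_links=2):
    """Assign each plasmid contig to its best-supported host contig.

    read_pairs: iterable of (contig_a, contig_b) tuples, one per read pair
        (hypothetical input format, assumed for illustration).
    plasmid_contigs: set of contig names known to be plasmids.
    min_links: minimum number of supporting read pairs (illustrative value).
    Returns a dict mapping plasmid name to host contig name, or None.
    """
    links = defaultdict(Counter)
    for a, b in read_pairs:
        # Count only pairs that join a plasmid to a non-plasmid contig.
        if a in plasmid_contigs and b not in plasmid_contigs:
            links[a][b] += 1
        elif b in plasmid_contigs and a not in plasmid_contigs:
            links[b][a] += 1

    hosts = {}
    for plasmid in plasmid_contigs:
        counts = links[plasmid]
        if not counts:
            hosts[plasmid] = None
            continue
        host, n = counts.most_common(1)[0]
        # Leave the plasmid unassigned when the evidence is too thin.
        hosts[plasmid] = host if n >= min_links else None
    return hosts


# Made-up example data: three plasmids and two chromosomal contigs.
pairs = [
    ("pA", "ecoli_1"), ("ecoli_1", "pA"), ("pA", "shigella_1"),
    ("pB", "shigella_1"), ("shigella_1", "pB"), ("pB", "shigella_1"),
    ("pC", "ecoli_1"), ("ecoli_1", "ecoli_1"), ("pA", "pB"),
]
result = assign_plasmids(pairs, {"pA", "pB", "pC"})
```

In this toy data, `pA` is assigned to `ecoli_1`, `pB` to `shigella_1`, and `pC` is left unassigned because a single read pair falls below the threshold. A real metagenomic analysis would also need to normalize for contig length and abundance, which this sketch ignores.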
The method can work on DNA inputs down to 500 nanograms and sometimes less, Ramakrishnan said. It doesn't require the bacterial sample to be cultured, which is important, since many organisms do not grow in culture and culturing leads to biases, enabling some species to vastly outgrow others.
Going forward, the main development challenge will be ensuring that the method can work on complex samples, Ramakrishnan said, and will be able to distinguish individual bacterial strains from a mixed sample.
The major key will be the bioinformatics pipeline, Ramakrishnan said. While adapting the Chicago library prep steps for a metagenomic sample should be pretty straightforward, the bulk of the firm's work would be on the bioinformatics side. "The major focus of the company will be further developing the algorithm and refining it, making it more expansive so we can deconvolute complex mixtures," he said.
A number of other groups have been developing metagenomic sequencing strategies, and in at least one case, such approaches are being used for clinical purposes. For instance, Charles Chiu, director of the Viral Diagnostics and Discovery Center at the University of California, San Francisco, has launched a clinical metagenomic test for meningitis and encephalitis out of UCSF's Clinical Microbiology Laboratory. Chiu's team uses a shotgun sequencing strategy and has developed a custom informatics pipeline, SURPI+, to identify pathogens from the sequence data.
Researchers from Johns Hopkins have also demonstrated that metagenomic sequencing of brain or spinal cord biopsies can identify pathogenic microbes in individuals with suspected infection-induced neurologic disorders.
And, a group from the Cincinnati Children's Hospital has tested a metagenomic sequencing protocol on fecal samples to identify patients harboring multidrug resistant bacteria.
Ramakrishnan said that the major difference between the various metagenomic sequencing protocols is in the bioinformatics. Each group uses a separate bioinformatics algorithm. In addition, the other approaches do not include Dovetail's sample prep step for proximity ligation to de novo assemble the metagenome.
Ramakrishnan said that the firm will most likely commercialize the product for research use. It is too early to say whether it will also pursue the clinical market, for instance for infectious disease testing, but he said that it was possible. In addition, he said, while the initial NIH grant is to develop the technology for fecal samples, the method could ultimately have applications for a whole range of complex metagenomic samples.