This article has been modified to clarify details of the method.
NEW YORK (GenomeWeb) – Researchers from the Genome Institute of Singapore have developed a method to improve the accuracy of Oxford Nanopore Technologies' MinION. The researchers are now working to improve the method and want to use it for applications such as 16S sequencing from complex samples, RNA sequencing, and sequencing complex genomic regions.
The researchers described the method recently in a study published in GigaScience. Niranjan Nagarajan, associate director of computational and systems biology at GIS, told GenomeWeb that his group wanted to find a way to improve the accuracy of the MinION. "Nanopore sequencing is an incredibly exciting technology," he said, but, "its Achilles heel is the accuracy of the raw reads that come out of the sequencer and the presence of systematic errors."
The method borrows from the concept used in Pacific Biosciences' circular consensus sequencing: by sequencing the same molecule over and over, a consensus sequence can be computed, which improves accuracy.
Although nanopore sequencing requires linear DNA, the GIS team's approach relies on circularization in the initial steps.
The method, which they call intramolecular-ligated nanopore consensus sequencing (INC-Seq), involves first circularizing a template DNA molecule. Then the researchers use rolling circle amplification (RCA) to amplify that molecule, which generates a linear product containing multiple repeating units of the template DNA.
"Sequencing this piece of DNA reads the same original sequence multiple times, allowing us to merge the information computationally to a single accurate read of the original piece of DNA," Nagarajan said.
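The merging step Nagarajan describes can be illustrated with a toy example. The sketch below is not the published INC-Seq pipeline: it assumes the length of the repeating unit is already known and that the copies differ only by substitution errors, so a simple per-position majority vote is enough. Real nanopore reads also contain insertions and deletions, so an actual implementation would need to align the copies first. The function names `split_into_copies` and `majority_consensus` are hypothetical and chosen only for illustration.

```python
from collections import Counter
from typing import List


def split_into_copies(concatemer: str, unit_length: int) -> List[str]:
    """Cut a concatemer read into consecutive full-length copies.

    Assumption: the repeat unit length is known in advance. Any
    partial copy at the end of the read is dropped.
    """
    if unit_length <= 0:
        raise ValueError("unit_length must be positive")
    n_full = len(concatemer) // unit_length
    return [
        concatemer[i * unit_length:(i + 1) * unit_length]
        for i in range(n_full)
    ]


def majority_consensus(copies: List[str]) -> str:
    """Return the most common base at each position across the copies.

    Assumption: all copies have the same length and are already in
    register (substitution errors only, no insertions or deletions).
    """
    if not copies:
        raise ValueError("need at least one copy")
    return "".join(
        Counter(column).most_common(1)[0][0] for column in zip(*copies)
    )


# Toy read: six copies of the made-up template ACGTTGCA, five of which
# carry one substitution at a different position, plus a partial tail.
template = "ACGTTGCA"
read = (
    "ACGTTGCA"  # error-free copy
    "ACCTTGCA"  # G -> C at position 2
    "ACGTTGGA"  # C -> G at position 6
    "TCGTTGCA"  # A -> T at position 0
    "ACGTAGCA"  # T -> A at position 4
    "ACGTTGCT"  # A -> T at position 7
    "ACG"       # partial copy, ignored
)

copies = split_into_copies(read, len(template))
consensus = majority_consensus(copies)
print(len(copies), consensus)
```

Because each error falls at a different position, every column still has a clear majority for the true base, which is why the accuracy of the consensus can exceed that of any single pass over the template.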
As a proof of concept, the researchers first applied the method to 16S rRNA sequencing, which is used to identify bacteria from a mixed sample. Since 16S sequences can be similar, long, accurate reads are important. The group first used synthetic sequences to validate the method, and compared INC-Seq reads to 2D reads generated with a standard MinION sequencing protocol.
Only 72 percent of the 2D reads could be mapped to the correct reference, and 88 percent could be mapped to a sequence belonging to the same species as the reference. However, 92 percent of INC-Seq reads, which were generated with six segments, could be mapped to the correct reference and 99 percent could be mapped to a sequence belonging to the same species as the reference.
Next, the researchers tested the method on a bacterial community consisting of S. cristatus, S. oralis, and P. micra. They found that 16S rRNA sequencing using INC-Seq improved accuracy of the reads to 97 percent compared to 84 percent with standard 2D reads. Rates of mismatch errors were reduced tenfold to 0.7 percent from 7.5 percent. In addition, the method was able to recover the correct abundances of each species.
The team then tested INC-Seq on a more complex community, consisting of 10 microbial species with varying levels of abundance. Again, 16S sequencing using INC-Seq was able to recapitulate the relative abundances of each species, although it struggled with the most abundant species, suggesting that it may have been affected by amplification bias.
Nagarajan said that his group is continuing to work on improving the method. For instance, he said, his team is working on a protocol that involves using dumbbell adaptors to enable the same DNA molecule to be sequenced multiple times. Essentially, a hairpin loop is attached at either end of a DNA molecule to enable it to be read over and over. Although this approach would still require a rolling circle amplification (RCA) step, it should have fewer length biases than other types of circularization approaches, Nagarajan said. The current INC-Seq method works best when the DNA templates are all approximately the same size, a constraint the hairpin approach would not share.
RCA causes other issues as well. For instance, sometimes DNA molecules get stuck together, and sometimes the polymerase jumps from one template to another, which leads to a hybrid read with sequences from two different templates, Nagarajan said. But, he added, the team is working on reducing those problems.
The dumbbell adaptor would also reduce the turnaround time from essentially a full day to just a couple of hours, he said.
Nagarajan is also testing different applications, such as RNA sequencing and assembly analysis. He said he is especially interested in using it for RNA sequencing to study complex genes, isoforms, and large gene families. In addition, he said he would be looking to use it to "disambiguate particularly complex regions of the human genome."
Further, if the group is able to successfully develop the dumbbell adaptor method, which would improve turnaround time, he said it would be a good method for rapid 16S sequencing in complex samples to "confidently identify organisms down to the species or even strain level." It could be used to identify specific crop pathogens in field settings, for instance, he said.
Nagarajan said that the methods are freely available for anyone to test. In addition, he said that he is talking with Oxford Nanopore about the potential of developing an improved version that would be available to users. "The protocol is quite straightforward and we know of many labs that are already using it," he said.