BALTIMORE – Researchers from Aalborg University in Denmark have generated near-finished microbial genomes from pure bacterial cultures or metagenomes using only nanopore long-read sequencing.
Their approach, described in a paper published in Nature Methods earlier this month, demonstrated that nanopore sequencing, using Oxford Nanopore Technologies' new R10.4 flow cell and Q20+ chemistry, can now produce bacterial genomes without the need for short-read or reference polishing.
"For us in the lab, this is really an important step," said Mads Albertsen, a microbiologist at Aalborg University and the lead investigator of the study. "We can [now] use a single, readily available technique to do more-or-less perfect bacterial genome assemblies."
According to Albertsen, being able to assemble close-to-perfect bacterial genomes with a single sequencing technology has been a long-time goal for his lab. His team started out using Illumina short-read sequencing, but he said it turned out to be "very difficult" to assemble repetitive and similar regions of the bacterial genomes with short reads.
In 2014, his lab became an early customer of Oxford Nanopore, before the company commercially rolled out its first sequencer, the MinION. The team first used nanopore long-read data as scaffolding for Illumina sequencing, Albertsen said, but as nanopore sequencing improved, the group started using the long reads to assemble bacterial genomes about five years ago.
However, due to the high consensus error rate of nanopore sequencing, the team always had to employ Illumina short-read data to remove the errors, especially for insertions and deletions derived from homopolymer regions in the genome.
While it is possible to achieve near-perfect bacterial genome assemblies using only Pacific Biosciences HiFi sequencing, Albertsen said that because the PacBio platform is big, expensive, and often requires a lot of DNA input, the technology "is not something that has reached small labs." Oxford Nanopore's platforms, on the other hand, which he said are less expensive, "really democratize who can actually do really nice bacterial genome assemblies."
For their study, Albertsen's team used a PromethION platform to test Oxford Nanopore's R10.4 flow cell and Q20+ chemistry, which are still under early access, on ZymoBIOMICS mock microbial community samples consisting of seven bacterial species and one fungus. Additionally, the team sequenced activated sludge samples from an anaerobic digester using the new flow cell and chemistry on a GridION device.
The results showed that the R10.4 chemistry improved nanopore sequencing's ability to call homopolymers. Specifically, the researchers reported that at a consensus sequence level, the new chemistry was able to correctly resolve nearly all homopolymers that were less than 11 bp long.
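To make the homopolymer threshold concrete, the following is a minimal sketch of how one might tally homopolymer runs in a sequence and check what share falls below a given length. It is not the method used in the study; the function names, the 11 bp cutoff argument, and the toy sequence are illustrative assumptions.

```python
from collections import Counter
from itertools import groupby


def homopolymer_lengths(sequence, min_length=2):
    """Count homopolymer runs (stretches of one repeated base) by length.

    Hypothetical helper for illustration; runs shorter than
    `min_length` are ignored.
    """
    counts = Counter()
    for _base, run in groupby(sequence.upper()):
        length = sum(1 for _ in run)
        if length >= min_length:
            counts[length] += 1
    return counts


def fraction_below(counts, cutoff):
    """Return the share of counted runs that are shorter than `cutoff` bp."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    short = sum(n for length, n in counts.items() if length < cutoff)
    return short / total


# Toy sequence (an assumption, not real data): runs of 4 T, 2 A, 12 C, 2 G.
toy = "ACG" + "T" * 4 + "AA" + "C" * 12 + "GG" + "A"
runs = homopolymer_lengths(toy)
share_short = fraction_below(runs, cutoff=11)
print(dict(runs), share_short)
```

Run on a real bacterial reference genome, a tally like this would show how many homopolymers exceed the length the new chemistry reportedly resolves.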
"If you actually look in reference genomes of bacteria, the vast majority of homopolymers are below nine or 10 [bp]," Albertsen said, adding that even though nanopore sequencing still cannot handle homopolymers that are larger than 50 bp very well, these larger homopolymers rarely exist in bacteria. That said, he pointed out that for eukaryotes and other organisms, the concerns about homopolymers still persist for nanopore sequencing, even with the new chemistry.
"It's really fascinating how far nanopore [sequencing] has come in a short time," said Fritz Sedlazeck, a researcher at Baylor College of Medicine who is experienced with the technology.
According to Sedlazeck, who is also an early-access customer of Oxford Nanopore, the new R10.4 chemistry, which his lab recently tested on the PromethION, also performed "very well" for him, generating sequencing data with a median error rate of about 1.2 percent.
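For readers comparing that error rate with the "Q20+" label, the sketch below converts between an error rate and a Phred-scaled quality score, assuming the label refers to the standard Phred scale (Q = -10 log10 of the error probability). The function names are illustrative.

```python
import math


def error_rate_to_phred(error_rate):
    """Convert an error probability (0 < p <= 1) to a Phred-scaled score."""
    if not 0 < error_rate <= 1:
        raise ValueError("error_rate must be in the interval (0, 1]")
    return -10 * math.log10(error_rate)


def phred_to_error_rate(q):
    """Convert a Phred-scaled score back to an error probability."""
    return 10 ** (-q / 10)


# A 1.2 percent error rate corresponds to roughly Q19.2,
# while Q20 corresponds to a 1 percent error rate.
print(round(error_rate_to_phred(0.012), 1))
print(phred_to_error_rate(20))
```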
In addition, Sedlazeck said the improved consensus accuracy of the new chemistry, as demonstrated in this study, "will help a lot" with genome assembly.
"The raw read accuracy is one thing, but the consensus error is really the important thing in my opinion," he said, adding that the improved consensus accuracy, a result of reduced systematic errors, will enable researchers to identify true variants.
While the Danish researchers defined "near-finished genomes" as genomes that do not require short-read or reference polishing, Sedlazeck said the term is still "not very quantitative" and needs to be further fleshed out. "I like that they are thinking about this [term] … the downside is, they didn't postulate any metrics," he added.
Sedlazeck said he believes it is important for the field to develop standards and include quantifying aspects when evaluating the quality of genome assemblies. One possible solution, he said, is to gauge the quality of the assembled genomes using a core set of essential genes as benchmarks.
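One way to picture such a benchmark is a simple completeness score: the fraction of an expected core gene set that is found intact in an assembly. The sketch below is a hypothetical illustration, not a metric proposed by Sedlazeck or the study authors; the function name and the four-gene list are toy assumptions rather than a real benchmark set.

```python
def core_gene_completeness(expected_genes, found_genes):
    """Return (fraction of expected genes found, sorted list of missing genes).

    Hypothetical metric for illustration only.
    """
    expected = set(expected_genes)
    if not expected:
        raise ValueError("expected_genes must not be empty")
    present = expected & set(found_genes)
    missing = sorted(expected - present)
    return len(present) / len(expected), missing


# Toy inputs (assumptions): a tiny "core set" and the genes annotated
# in an imaginary assembly.
core_set = ["dnaA", "gyrB", "recA", "rpoB"]
annotated = ["dnaA", "recA", "rpoB", "lacZ"]
completeness, missing = core_gene_completeness(core_set, annotated)
print(completeness, missing)
```

A score like this gives a number that can be compared across assemblies, which is the kind of quantifying aspect Sedlazeck called for.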
Echoing Sedlazeck's point, Albertsen said that it was "really difficult" to settle on the meaning of "near-finished genomes" during their study, as there are many possible ways to define the term. "There's no good cutoff for what you mean by near-finished," he added. "It really is case specific, and it depends on how many homopolymers there are in the specific bacterial genome."
Despite the promise of nanopore sequencing, Albertsen said he does not think it will replace PacBio HiFi sequencing anytime soon. "If I had all the money in the world and all the time, I would do PacBio [sequencing]," he said. "The nice thing about PacBio HiFi reads is that the raw read quality of the individual reads is so great that you can actually do a lot of things with the raw reads." Still, he said that for small labs, nanopore sequencing is "so much easier and cheaper to do."
Moving forward, Albertsen said he is hoping to leverage nanopore sequencing to assemble genomes for as many bacterial species as possible, further completing the tree of life.
"What we are pushing for now is trying to develop methods and get funding to make sure we can make a tree of life," he said. "I think now the technology is actually starting to be there and starting to be realistic."