This article has been updated with comments from researchers.
NEW YORK (GenomeWeb) – Researchers at the Norwich Medical School of the University of East Anglia and Public Health England have used Oxford Nanopore MinIon sequence data, in conjunction with short-read Illumina sequence data, to identify the position and structure of an antibiotic resistance island in the genomes of two Salmonella Typhi strains.
The study, which was published in Nature Biotechnology today and also assessed the technical performance of the MinIon, is "the first publication that shows the utility of the technology to solve a complex biological problem," according to Justin O'Grady, a lecturer in medical microbiology at the Norwich Medical School and the senior author of the study.
"This is an interesting study that shows the power of long reads to resolve tricky parts of the genome that are troublesome to assemble with short read technologies," said Nick Loman, a researcher at the University of Birmingham who was not involved in the project. His group has also used the MinIon to analyze Salmonella outbreak strains, but has not published its data yet.
O'Grady is a participant in Oxford Nanopore's early access program and has had the MinIon in his research lab since June. He decided to use the instrument to study two strains of a recently emerged multi-drug resistant haplotype of Salmonella enterica Typhi, H58.
His colleagues had already determined that the two H58 strains were resistant to antibiotics, but they did not carry the plasmid where resistance usually resides, suggesting that a resistance island had integrated into their genomes. "They knew it was there, they could find the genes, but they could not position it in the chromosome or find its structure," O'Grady said.
Also, de novo assemblies of Illumina sequence data for the two strains resulted in highly fragmented genomes with 86 and 143 contigs, respectively, from which they were unable to pinpoint the insertion site of the resistance island.
Using MinIon data, which they generated with the earlier R6 chemistry, they were able to scaffold a number of Illumina contigs, and "we got sufficient reads across the island to tell us its position and structure," he said. To achieve this, they mapped the MinIon reads to the Illumina contigs using the LAST alignment tool, which can work with the many mismatches and gaps of the Oxford Nanopore data.
They repeated the experiment with the more recent R7 chemistry, which produced the data included in the publication, and obtained "very similar results, with slightly better identity against the reference."
For one strain, for example, they used 40 MinIon reads to link contigs across the insertion sequences of the island. The Illumina data had not covered that sequence well because of its high GC content, leading to a break in the assembly, they determined. For the other strain, using the MinIon reads they discovered that one antibiotic resistance gene appears to be present twice, once on a plasmid and once inserted into the bacterial genome.
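The idea of using long reads to link contigs can be illustrated with a minimal Python sketch. It is not the authors' pipeline: it assumes the alignments have already been reduced to hypothetical (read, contig) pairs, and it simply counts how many reads align to both members of each contig pair. The function name and the input format are assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations


def count_linking_reads(alignments):
    """Count long reads that align to both contigs of each contig pair.

    `alignments` is an iterable of (read_id, contig_id) tuples, a
    hypothetical simplification of real aligner output.
    """
    # Collect the set of contigs each read aligns to.
    contigs_by_read = defaultdict(set)
    for read_id, contig_id in alignments:
        contigs_by_read[read_id].add(contig_id)

    # Every read that touches two or more contigs supports a link
    # between each pair of those contigs.
    links = defaultdict(int)
    for contigs in contigs_by_read.values():
        for pair in combinations(sorted(contigs), 2):
            links[pair] += 1
    return dict(links)


if __name__ == "__main__":
    example = [
        ("read1", "contigA"), ("read1", "contigB"),
        ("read2", "contigA"), ("read2", "contigB"),
        ("read3", "contigC"),
    ]
    print(count_linking_reads(example))
```

A real analysis would also have to check alignment positions and orientation at the contig ends, which this sketch ignores.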
In addition to resolving the resistance island, the researchers created a hybrid genome assembly from MinIon and Illumina data for one of the strains, using the SPAdes software. This reduced the number of contigs to 34, from 86 in the original Illumina-only assembly, and increased the N50 length to 319 kilobases, from 154 kilobases. The hybrid assembly failed, however, to resolve the structure of the island completely due to the low sequence coverage with Illumina data.
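For readers unfamiliar with the N50 statistic, the following Python sketch shows one common way to compute it: sort the contigs from longest to shortest and report the length of the contig at which the running total first reaches half of the assembly size. The contig lengths in the example are made up and are not taken from the study.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths."""
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first contig that pushes the running total to at least
        # half of the assembly size defines the N50.
        if running >= half_total:
            return length
    raise ValueError("no contig lengths given")


if __name__ == "__main__":
    # Illustrative lengths only.
    print(n50([80, 70, 50, 40, 30, 20]))
```

A higher N50 with fewer contigs, as reported for the hybrid assembly, indicates that more of the genome sits in long, contiguous pieces.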
Knowing the position and structure of the resistance island will allow scientists to design PCR assays so they can quickly identify the strains and track their epidemiology around the world, O'Grady said.
While it would have been possible to characterize the island using PCR-based assays alone, he said, that would have taken much more time and effort than using the MinIon.
While pursuing the resistance island, the researchers also took a good look at the performance of the MinIon in their hands, and how it compares to other platforms.
In one 18-hour run using the R7 chemistry, for example, they generated 93.4 megabases of data, a total of 16,401 sequencing reads, including template, complement, and two-direction or 2D reads, with a median read length of about 5.4 kilobases and read lengths up to 66.7 kilobases.
The median accuracy of the reads, based on Phred scores from basecalling, was 68.4 percent, ranging from 49.9 percent accuracy for template reads to 60.2 percent for complement reads and 84.2 percent for 2D reads.
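A Phred score Q is conventionally related to the estimated error probability by P = 10^(-Q/10), so the implied accuracy is 1 - P. The short Python sketch below applies that standard conversion. Under this formula, the reported figures are consistent with whole-number Phred scores of 5 (68.4 percent), 3 (49.9 percent), 4 (60.2 percent), and 8 (84.2 percent), although the article does not state which scores the authors observed.

```python
def phred_to_accuracy(q):
    """Convert a Phred quality score to the implied base-call accuracy."""
    error_probability = 10 ** (-q / 10)
    return 1 - error_probability


if __name__ == "__main__":
    for q in (3, 4, 5, 8):
        print(q, round(phred_to_accuracy(q) * 100, 1))
```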
They also assessed the accuracy of the MinIon reads based on how they mapped to the Illumina data, and found a median accuracy of 64.3 percent for the template reads, 61.6 percent for the complement reads, and 71.5 percent for the 2D reads.
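The article does not spell out how mapping-based accuracy was calculated. One common definition, assumed here purely for illustration, is alignment identity: the number of matching bases divided by the total number of aligned columns, counting mismatches, insertions, and deletions. A minimal Python sketch with invented counts:

```python
def alignment_identity(matches, mismatches, insertions, deletions):
    """Fraction of alignment columns that are exact matches.

    This is one possible definition of read accuracy; the study may
    have used a different one.
    """
    columns = matches + mismatches + insertions + deletions
    if columns == 0:
        raise ValueError("alignment has no columns")
    return matches / columns


if __name__ == "__main__":
    # Invented counts for a single read alignment.
    print(alignment_identity(matches=70, mismatches=10, insertions=5, deletions=15))
```

Because indels enter the denominator, a read with many deletions scores poorly under this definition even if few of its bases are miscalled.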
Many of the errors appeared to be indels, in particular deletions, which often involved stretches of only As and Ts or only Gs and Cs. An analysis of the substitution errors showed that the MinIon has trouble distinguishing between Gs and Cs, especially in homopolymers, the authors wrote.
According to O'Grady, the accuracy has improved with the latest R7.3 chemistry and updated basecalling software and "averages around the mid-80 [percent]," which he said comes close to PacBio's accuracy of about 85 percent but is still much lower than Illumina's 98 to 99 percent accuracy.
His lab did not try to maximize read length for the MinIon – he said other users have obtained reads as long as 150 kilobases – but his median read length of 5.4 kilobases is "comparable" to that of PacBio, which promises average read lengths of 10 to 15 kilobases with its latest P6-C4 chemistry.
The yield of the MinIon also appears to be approaching PacBio's. Using 48-hour runs, O'Grady's lab has produced up to 150 megabases of data, according to the paper, but other users have reported yields approaching 600 megabases per run. Using PacBio's new chemistry, throughput for the RSII is expected to be between 500 megabases and 1 gigabase per SMRT cell, according to the company.
Oxford Nanopore has had issues with the quality of the flow cells, O'Grady said, but given improvements in yield and consistency he thinks MinIon data alone could be used for de novo assemblies, without the use of short-read data, similar to how PacBio data has been used lately.
In addition, because of its small size, potentially low cost, and ease of use, the MinIon could have applications "far beyond the genome center," he said, including routine infectious disease diagnostics, possibly at the point of care. "I would like to see that type of technology reach clinical microbiology laboratories within five to 10 years," he said.
O'Grady said he is aware of software being designed that can analyze MinIon data remotely and report diagnostically relevant information on pathogens back to users, which "makes the technology very accessible to everybody and takes away the need for a bioinformatician in the lab."
Oxford Nanopore has not said when it plans to complete the MinIon early access program and make the platform commercially available, and it has not set a price yet. The company did not respond to a request for comment for this article. However, the cost per flow cell is expected to be on the order of $1,000, "which is a lot for a single sample, but it depends on the sample you are trying to diagnose," O'Grady said. "If you are looking at sepsis, or respiratory tract infections, you might be able to justify that kind of cost."
His lab is currently looking to use the MinIon for sepsis diagnosis, identifying pathogens directly from blood DNA without cell cultures.
O'Grady acknowledged that other next-generation sequencing platforms, such as Illumina's, are already being used to analyze clinical samples. However, "this particular device is very accessible, and as a non-expert, I was able to take it, run it in the lab, and get results in a week or two," he said.