MinIon Early-Access Users Evaluate Platform's Performance, Explore Various Applications

NEW YORK (GenomeWeb) – Less than a year after Oxford Nanopore Technologies started shipping its first MinIon sequencers to early-access customers, several users presented results at the Advances in Genome Biology and Technology meeting in Marco Island, Fla., last week.

Sara Goodwin, manager of technology development at Cold Spring Harbor Laboratory, delved into the technical performance of the MinIon, while her colleague James Gurtowski focused on data analysis for de novo genome assembly.

Researchers at The Genome Analysis Centre (TGAC) in Norwich, UK, explored the MinIon for surveying microbes from air samples, while other groups looked into analyzing methylated DNA bases, genotyping tropical disease pathogens, and de novo genome assembly.

Cold Spring Harbor Laboratory entered the MinIon Access Program last year and has run more than 40 MinIon flow cells so far. Earlier this year, Michael Schatz presented the de novo assembly of a yeast strain from error-corrected nanopore reads at a conference, a project the group also published on a preprint server. In addition, CSHL researchers have used the MinIon to sequence amplified cDNA from a human cell line, Goodwin reported.

Over the course of the early-access program, the technology has undergone multiple updates, including three different flow cells – versions R6.0, R7.0, and R7.3 – and a number of sample preparation protocols.

Variability in performance between individual flow cells has been considerable, she said. While at least one flow cell of each of the three types performed well, there were also "some not so good ones." Performance still varies between individual R7.3 flow cells, but is more consistent than with the earlier versions.

One strong indicator of flow cell performance is the number of usable pores. Each chip carries 512 nanopores but the number of available pores "can decline precipitously," Goodwin said, likely as a result of shipping. Having at least 350 to 400 active pores indicates good performance, they found.

On the sample prep side, the researchers have tested various DNA-shearing protocols using Covaris g-Tube technology, all of which produced some very long reads, on the order of 100 kilobases. Unsheared yeast DNA yielded shorter average read lengths, possibly because the DNA breaks during sample prep. Typically, the researchers now shear the DNA into 10-kilobase fragments, which works well for their de novo assemblies.

Early library prep protocols required the motor protein that feeds the DNA through the nanopore to be loaded onto the DNA by the user, but it now comes preloaded onto the hairpin adapters and carries a His-tag for affinity purification, which has increased yield.

The longest read the CSHL team has generated so far is 190 kilobases in size, but Goodwin cautioned that very long reads are not always useful because they often do not align well. It is not clear, though, whether that is due to the quality of the reads or because the researchers have not found the best alignment method to process them yet.

Overall, 2D reads, which use information from both the forward and reverse strand, have been "much more useful" than one-dimensional (1D) reads, she said, and the longest 2D read her team has produced so far is 57 kilobases. The average fraction of 2D reads out of all reads has increased from 19 percent for R6.0 flow cells to 30 percent for R7.3 flow cells, and for individual runs it has exceeded 50 percent.

With regard to biases in the data, the researchers noticed that pentamers with low GC-content seemed to be underrepresented, in particular poly-As and poly-Ts. This has somewhat improved with newer versions of the flow cells and basecallers, Goodwin said.

Regarding the error profile of the reads, the CSHL scientists found that, when using the BLAST aligner, mismatch errors dominated, with 13 percent mismatched bases for average 2D reads from R7.3 flow cells, followed by 8 percent deletion and 3 percent deletion errors. This is different from what other groups have found using different aligners – those groups reported that indel errors dominate, Goodwin said.
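To make the three error categories concrete, the following is a minimal sketch of how mismatch, insertion, and deletion rates can be tallied from a single gapped pairwise alignment. It is not the CSHL team's pipeline and does not parse BLAST output. The function name `error_profile`, the use of "-" as the gap character, and the toy sequences are assumptions made for illustration only.

```python
def error_profile(aligned_read, aligned_ref):
    """Tally per-column error rates for one gapped pairwise alignment.

    Both strings must have the same length; '-' marks a gap (an assumption
    of this sketch). Rates are fractions of all aligned columns.
    """
    if len(aligned_read) != len(aligned_ref):
        raise ValueError("aligned sequences must have equal length")

    counts = {"match": 0, "mismatch": 0, "insertion": 0, "deletion": 0}
    for read_base, ref_base in zip(aligned_read, aligned_ref):
        if read_base == "-" and ref_base == "-":
            continue  # column with no information
        if ref_base == "-":
            counts["insertion"] += 1  # extra base in the read
        elif read_base == "-":
            counts["deletion"] += 1   # reference base missing from the read
        elif read_base == ref_base:
            counts["match"] += 1
        else:
            counts["mismatch"] += 1

    total = sum(counts.values())
    if total == 0:
        raise ValueError("alignment has no informative columns")
    return {kind: n / total for kind, n in counts.items()}


# Toy alignment (hypothetical sequences, not real nanopore data).
profile = error_profile("ACGTA-GTAC", "ACCTAGG-AC")
```

Because the split between mismatches and indels depends on how an aligner scores gaps against substitutions, the same reads can yield different profiles under different aligners, which is consistent with the discrepancy Goodwin described.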

Finding the right aligner for Oxford Nanopore data is "not a solved issue" and there is a lot of development going on in this area right now, she said.

The percentage of mappable reads has increased steadily over time. While only 8 percent of reads from R6 flow cells aligned, an average of 38 percent of 1D reads from R7.3 flow cells do so, and up to 80 percent of 2D reads from individual R7.3 flow cells aligned.

As reported earlier, the researchers generated nanopore data at more than 120-fold coverage for their yeast de novo assembly over 30 MinIon runs. About 90 percent of the data came from just four good flow cells, Goodwin said, and each produced on the order of 500 megabases of data.

The final assembly, from nanopore reads that were error-corrected with short Illumina reads using CSHL's Nanocorr approach, consists of 95 contigs with an N50 contig size of 585 kilobases, she reported.
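For readers unfamiliar with the N50 statistic quoted above, here is a minimal sketch of the standard definition: the contig length at which contigs of that size or longer account for at least half of the total assembly length. The function name `n50` and the example lengths are illustrative assumptions and are not taken from the CSHL assembly.

```python
def n50(contig_lengths):
    """Return the N50 of a collection of contig lengths.

    Contigs are sorted from longest to shortest; the N50 is the length of
    the contig at which the running total first reaches half of the
    combined length of all contigs.
    """
    ordered = sorted(contig_lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("no contig lengths given")


# Hypothetical contig lengths in kilobases, for illustration only.
example_n50 = n50([100, 80, 60, 40, 20])
```

A higher N50 for the same total assembly size generally means the genome has been reconstructed in fewer, longer pieces.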

Certain genomic elements, such as rRNA, gene cassettes, transposons, LTR retrotransposons, and telomeres, were picked up at higher frequency in the nanopore data than in Illumina MiSeq data, she said.

The researchers also tested the MinIon on human cDNAs, with the ultimate goal of sequencing full-length isoforms to determine splice sites, particularly in single cells. For this, they amplified cDNA using a long-range polymerase and created libraries similar to those from genomic DNA. No shearing was necessary because few cDNAs were longer than 10 kilobases, and the average read length was 1 kilobase.

Going forward, the CSHL team plans to study the effect of base methylation on the nanopore data.

Microbial surveillance

Richard Leggett and his collaborators at TGAC have tested the MinIon for the detection of microbial species in metagenomic samples. Eventually, the researchers would like to use the device for real-time surveillance of environmental samples, such as air and water, he said.

The TGAC group has been an early-access user of the MinIon since last April. As expected for an alpha- or beta-testing program, there have been frequent updates to the technology, Leggett said, but "the general trend is upwards" and performance has been improving. "I think there is some possibility for the future," he said.

To analyze the nanopore data after alignment, the researchers have developed a software tool called NanoOK that produces a report with information on the data quality and error profiles.

To test the MinIon, they sequenced the Human Microbiome Project mock community, a mix of DNA from 20 diverse bacteria. For the analysis, they used a program called Kontaminant, developed at TGAC, that looks for k-mers in the reads that match k-mer libraries built from reference sequences and provides rapid feedback on the sample composition. Using Kontaminant, they were able to detect all 20 species.
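The general idea of screening reads against k-mer libraries can be sketched as follows. This is a simplified illustration of k-mer matching, not Kontaminant's actual implementation or interface. The function names, the choice of k, and the toy sequences are all assumptions.

```python
def kmers(sequence, k):
    """Return the set of all k-length substrings of a sequence."""
    sequence = sequence.upper()
    return {sequence[i:i + k] for i in range(len(sequence) - k + 1)}


def build_library(references, k):
    """Map each reference name to the set of k-mers it contains."""
    return {name: kmers(seq, k) for name, seq in references.items()}


def screen_read(read, library, k):
    """Return, per reference, the fraction of the read's k-mers found in it."""
    read_kmers = kmers(read, k)
    if not read_kmers:
        return {}  # read shorter than k
    return {
        name: len(read_kmers & ref_kmers) / len(read_kmers)
        for name, ref_kmers in library.items()
    }


# Toy references and read (hypothetical sequences, for illustration only).
references = {
    "species_A": "ACGTACGTTAGCCGATAGGCTA",
    "species_B": "TTGACCATGGCAAGTCCGTAAC",
}
library = build_library(references, k=5)
scores = screen_read("TACGTTAGCCGA", library, k=5)
best_match = max(scores, key=scores.get)
```

Set lookups of this kind are cheap, which is one reason k-mer screening can give quick feedback on what a sample contains. A real tool would also need to cope with sequencing errors and reverse-complement strands, which this sketch ignores.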

They also sequenced samples from an ongoing air sample analysis project on the MinIon and compared the results with Illumina data. In one air sample from a greenhouse, they identified 1,400 species from 500,000 Illumina reads and 140 species from 13,000 MinIon reads. Ninety-nine of the species overlapped between the two platforms.

To enable real-time data analysis, the researchers have explored running Kontaminant on a low-cost Raspberry Pi computer and were able to analyze nanopore data on it.

The missing piece for real-time surveillance is rapid library preparation, for example, on a microfluidic device, Leggett said.

A handful of additional MinIon early-access users also presented posters at AGBT. A group led by Yutaka Suzuki at the University of Tokyo, for example, used the MinIon for genotyping tropical disease pathogens, including Dengue virus and Plasmodium falciparum.

Another group, at the University of Maryland, developed a new algorithm for the de novo assembly of long, error-prone reads from Pacific Biosciences and Oxford Nanopore. Using MinIon data, they assembled the E. coli genome into a single contig.

Finally, Winston Timp from Johns Hopkins University has developed a computational approach for identifying DNA methylation patterns in single cells using MinIon data.