NEW YORK (GenomeWeb) – Five research labs on two continents have taken the Oxford Nanopore MinIon sequencer through its paces, assessing how reproducible the device's performance is across laboratories and developing standard protocols and reference data.
The labs are part of the MinIon Analysis and Reference Consortium (MARC), a group formed by a number of participants in Oxford Nanopore's MinIon early-access program (MAP) that plans to conduct a series of projects around the technology.
Results from the first phase were published today as an online preprint in F1000Research. The consortium made all raw and aligned nanopore reads available through the European Nucleotide Archive, and the European Bioinformatics Institute coordinated the data distribution and analysis. The journal also started a nanopore analysis channel this week, and published an editorial by a group of MAP participants.
When Oxford Nanopore released the second version of its flow cells, "it became clear that groups were having different degrees of success with the MinIon," the study authors wrote, and the reasons were hard to determine from single sequencing runs. The goal of the study was to assess the yield, accuracy, and reproducibility of the MinIon by performing replicate experiments at several sites, and to identify the technical factors that determine high performance.
For the study, the five laboratories — based in the UK, the US, Canada, and the Netherlands — each generated MinIon sequence data from the same substrain of Escherichia coli K-12 in duplicate, using the same protocol to culture the bacteria, extract their DNA, prepare sequencing libraries, and sequence them. Because Oxford Nanopore released an updated sequencing kit and protocol while the study was ongoing, the labs also generated a second dataset using those updates, for a total of 20 flow cell runs for the entire study. All experiments were performed between the end of March and the end of April of this year.
Overall, the researchers found "considerable variability" in the quality of flow cells, but flow cells that had a large number of active pores when they arrived at the lab "produced data of comparable yield, quality, and accuracy."
Using R7.3 flow cells and SQK-MAP005 chemistry, a typical experiment resulted in about 20,000 two-dimensional reads, with a median length of 6.5 kilobases, generating about 115 megabases of data. When they used an 8-kilobase shearing protocol, almost 5 percent of 2D reads were at least 10,000 bases long, some of them more than 50,000 bases. The total error for individual 2D base calls was 12 percent, consisting of 3 percent miscalls, 4 percent insertions, and 5 percent deletions. A single run yielded enough 2D bases to cover the E. coli genome 25-fold.
When the researchers considered only so-called 'pass' 2D reads, the yield per flow cell decreased to 75 megabases, from about 12,000 reads whose length distribution peaked at 6,700 bases.
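The coverage and yield figures reported above can be checked with simple arithmetic. The following is a minimal sketch, not the consortium's analysis code: it assumes an E. coli K-12 genome size of roughly 4.64 megabases (a value not stated in the article), and the function and variable names are hypothetical.

```python
# Back-of-the-envelope check of the yield figures quoted in the article.
# Assumption: E. coli K-12 genome is about 4.64 megabases (not given in the article).
GENOME_SIZE_BASES = 4_640_000


def fold_coverage(total_bases, genome_size=GENOME_SIZE_BASES):
    """Average depth of coverage: sequenced bases divided by genome size."""
    return total_bases / genome_size


def mean_read_length(total_bases, read_count):
    """Mean read length implied by a total yield and a read count."""
    return total_bases / read_count


# Rounded per-flow-cell figures from the article (R7.3 flow cells, SQK-MAP005).
runs = {
    "all 2D reads": {"bases": 115e6, "reads": 20_000},
    "pass 2D reads": {"bases": 75e6, "reads": 12_000},
}

# Error components for individual 2D base calls, in percent, as reported.
error_components = {"miscalls": 3, "insertions": 4, "deletions": 5}
total_error = sum(error_components.values())

if __name__ == "__main__":
    for label, run in runs.items():
        cov = fold_coverage(run["bases"])
        mean_len = mean_read_length(run["bases"], run["reads"])
        print(f"{label}: ~{cov:.1f}-fold coverage, implied mean length {mean_len:.0f} bases")
    print(f"total 2D error: {total_error} percent")
```

Under that genome-size assumption, 115 megabases works out to roughly 25-fold coverage, matching the reported figure, and the 75 megabases of 'pass' reads to roughly 16-fold. The implied mean read lengths come from rounded totals, so they are only indicative and need not match the reported median or distribution peak.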
One reason the quantity and quality of data varied between runs is that many steps in the standard protocol are sensitive to the quality of the materials and reagents used. Researchers also accidentally deviated from the protocol in some cases, and unexpected computer failures occurred during runs.
"A large component of variability in MinIon data quality was contingent on lab-specific behavior," the authors wrote, but wet-lab method variations and computer failures "had minimal effects on data quality," with the exception of the amount of fuel mix added.
Forced restarts of the scripts that control the MinIon "were found to be the largest source of variation among the 20 runs," they wrote, resulting in "extreme variation" in the length of the sequencing run and yield.
"The performance of the MinIon device itself was consistent," they found, and no experiment failed due to problems with the device. They also did not observe any GC-bias, although they noted this may be hard to detect with an E. coli genome.
Most important for data yield was the initial number of active pores in a flow cell, they wrote.
Overall, the results suggest a variety of ways to improve the performance of the MinIon, they wrote, including clearer protocol steps, methods for longer library molecules, and improved run scripts.
Data generated in the study "are intended as a snapshot of the state of the MinIon technology in April 2015," they concluded, and the release of the data is intended to inspire others to develop new software and conduct additional analyses.
MARC intends to perform more experiments using the MinIon Mk1 device and reagents that Oxford Nanopore recently released, to see whether the new hardware and chemistry change data yield, accuracy, and error profile.
Going forward, the consortium plans to start a second phase of experiments in order to identify protocol changes "that improve the performance and extend potential applications of the platform."