NEW YORK (GenomeWeb) – In a study posted last week to the preprint server arXiv, researchers from the University of Washington presented nanopore sequencing data generated with the Mycobacterium smegmatis porin A (MspA) pore.
The paper, authored by Jens Gundlach, a professor of physics at the University of Washington, and colleagues, builds on a series of studies the team has published on nanopore sequencing using the MspA pore. The group first showed that MspA's size and shape were desirable characteristics for nanopore sequencing in 2010, and in 2012 the group combined the MspA pore with a method for controlling DNA translocation through the pore.
Gundlach told In Sequence that he has submitted the paper to a peer-reviewed journal and that, under the journal's publication rules, he could not comment on the paper until it had gone through peer review and been published.
In the arXiv paper, the UW team sequenced a bacteriophage genome known as phi X 174, generating 92 aligned reads comprising 118 kb of sequence data, or 21.9x average coverage of the phi X 174 genome. Around 60 percent of the reads were longer than 1,000 bases and the longest reads were around 4,500 bases.
Last year, Illumina licensed worldwide exclusive rights to nanopore sequencing technology being developed by the University of Washington team and a group from the University of Alabama, Birmingham.
Illumina declined to comment on the publication and what it means for potential commercial development.
Oxford Nanopore, which is commercializing the MinION nanopore sequencer, also declined to comment. The company first released data from its system in February at the Advances in Genome Biology and Technology meeting on Marco Island, Fla. Earlier this month, researchers from the University of Birmingham in the UK publicly released data that they generated with the MinION: a single 8.5 kb read of the Pseudomonas aeruginosa genome. The company has not yet published a peer-reviewed study on the details of its nanopore sequencing technology.
Although the arXiv study has not yet been peer reviewed, Stuart Lindsay, director of the Center for Single Molecule Biophysics at Arizona State University, who is also working on nanopore sequencing technology, called it a "very impressive advance in nanopore sequencing." He said it is clear that the MspA pore provides higher resolution than the alpha-hemolysin pore, which had been the primary biological nanopore under study before Gundlach published on the MspA pore.
Mark Akeson, chair and professor of biomolecular engineering at the University of California, Santa Cruz and a consultant to Oxford Nanopore, added that the paper is a "very nice set of experiments and important confirmation" of the work being done at Oxford Nanopore. Akeson's UCSC group developed the ratcheting system that makes use of the phi29 DNA polymerase to control translocation through the nanopore.
Lindsay added, however, that the technology is "nowhere near de novo sequencing" due to the errors that the researchers report. He said that, at least initially, it would have application in hybrid assembly and species identification, but not de novo sequencing.
The authors themselves acknowledge this limitation, writing that the "amplitude of ion current levels alone does not provide enough information for direct de novo sequencing, i.e. conversion of ion currents to accurate sequences in the absence of a reference for alignment and comparison." However, improved algorithms that take advantage of the information "contained in the variance, the duration, and the voltage dependence of each current level" may enable this application.
In the UW team's proof-of-concept paper, the researchers demonstrated the sequencing of the phi X genome using a single MspA pore, established in a lipid bilayer. They used a shotgun-ligation approach to establish the phi X libraries, which included a restriction enzyme digest to linearize the DNA, and SPRI bead purification.
The linearized DNA is ratcheted through the nanopore with the processive enzyme, phi29 DNA polymerase. Phi29 is attached to the lipid bilayer, and it "unzips" the double stranded DNA molecule, feeding the 5' end through the MspA pore. Once it reaches the end of the strand, the DNA molecule moves in the other direction, and the polymerase synthesizes the complementary strand.
The UW researchers said that the current signal at any given moment is determined by four nucleotides, so they developed a base-reading algorithm that associates each of the 256 possible 4-mers with an ion current level.
First, the team set about validating their approach for measuring current from the 256 4-mers and created a quadromer map with current level estimates for each of the 256 quadromers.
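The quadromer-map idea can be illustrated with a minimal Python sketch. It enumerates the 256 possible 4-mers and looks up one current level per overlapping 4-mer in a sequence. The current values here are random placeholders, not measurements from the paper, and the function names are hypothetical.

```python
from itertools import product
import random

BASES = "ACGT"

def all_quadromers():
    """Return every 4-mer over A, C, G, T (4**4 = 256 of them)."""
    return ["".join(p) for p in product(BASES, repeat=4)]

def make_placeholder_map(seed=0):
    """Assign each quadromer a made-up current level (illustration only)."""
    rng = random.Random(seed)
    return {q: rng.uniform(0.0, 1.0) for q in all_quadromers()}

def predict_levels(sequence, qmap):
    """Look up one current level per overlapping 4-mer window."""
    return [qmap[sequence[i:i + 4]] for i in range(len(sequence) - 3)]

quadromer_map = make_placeholder_map()
levels = predict_levels("GATTACAGATTACA", quadromer_map)
print(len(quadromer_map), "quadromers;", len(levels), "predicted levels")
```

A sequence of N bases yields N - 3 predicted levels under this model, one for each 4-base window.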
Next, they compared their quadromer map with sequencing of the phi X 174 genome. They demonstrated that the predicted current levels matched the observed current levels with a correlation coefficient of around 0.99, and that errors tend to result from "shifts in the positioning of the DNA within the pore's constriction due either to DNA secondary structure within the vestibule or DNA interactions with the pore vestibule or constriction."
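For readers unfamiliar with the statistic, the comparison of predicted and observed levels can be sketched as a Pearson correlation. The article does not say which correlation measure the authors used, so Pearson is an assumption here, and the two short lists below are invented numbers, not data from the paper.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Invented example values: levels predicted from a quadromer map
# versus levels "observed" in a recording.
predicted = [0.20, 0.35, 0.50, 0.65, 0.80]
observed = [0.22, 0.33, 0.52, 0.63, 0.81]
r = pearson(predicted, observed)
print(f"r = {r:.3f}")
```

A value near 1 means the observed levels rise and fall almost exactly in step with the predictions, which is what the reported figure of around 0.99 indicates.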
They then tested the ability to sequence the whole phi X 174 genome. They created a library by ligating adapters to the full-length phi X genome. Sequencing generated 106 long ion current recordings, with "long" defined as greater than 200 bp. When the recordings were aligned using the quadromer map, 92 of them aligned with high confidence to the genome. Of those 92 reads, around 60 percent were longer than 1,000 bp; 20 percent were longer than 2,000 bp; and 10 percent were longer than 3,000 bp.
Read lengths were much shorter than the distribution of the library, suggesting that DNA polymerase "dissociation from the strand is the primary cause of termination," the authors wrote.
The 92 reads comprised 118 kb of sequence and a mean 21.9x coverage of the phi X 174 reference genome.
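The coverage figure can be checked with simple arithmetic. The sketch below assumes the commonly cited phi X 174 genome length of 5,386 bases, which is not stated in the article, and treats the rounded 118 kb total as exact.

```python
# Figures from the article
total_bases = 118_000   # 118 kb of aligned sequence (rounded)
num_reads = 92

# Assumption: phi X 174 genome length, not given in the article
genome_length = 5_386

mean_coverage = total_bases / genome_length
mean_read_length = total_bases / num_reads

print(f"mean coverage: {mean_coverage:.1f}x")
print(f"mean aligned read length: {mean_read_length:.0f} bases")
```

Under these assumptions the result agrees with the reported 21.9x, and the implied mean aligned read length is a little under 1,300 bases.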
The team next used the reads to help with hybrid assembly by aligning 11,000 100-base single-end Illumina MiSeq reads to a single 3,800-base nanopore read.
The UW team also demonstrated that the nanopore reads could be used for species identification. By taking sub-regions of the nanopore reads and aligning them to a 156 Mb database that contained over 5,000 viral genomes, they found that the nanopore reads aligned to the phi X 174 genome with high confidence, "implying that nanopore read quality is sufficient for unambiguous species identification."
Finally, they assessed the ability to detect SNPs. They first introduced 1,044 SNPs into the reference genome and into the quadromer maps associated with each of those SNPs. They then aligned 33 nanopore reads to the modified reference, successfully calling 808, or 77.4 percent, of the SNPs.
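One consequence of a 4-mer current model is that a single-base change alters every quadromer window that overlaps it, up to four in total. This follows from the model as described, not from a statement in the paper. The sketch below uses a made-up 10-base sequence to show the effect and then reproduces the reported call rate.

```python
def quadromers(seq):
    """Return the overlapping 4-mers of a sequence, in order."""
    return [seq[i:i + 4] for i in range(len(seq) - 3)]

def changed_windows(ref, alt):
    """Indices of quadromer windows that differ between ref and alt."""
    pairs = zip(quadromers(ref), quadromers(alt))
    return [i for i, (a, b) in enumerate(pairs) if a != b]

# Made-up reference with one substitution (C -> T) at position 5
ref = "ACGTACGTAC"
pos = 5
alt = ref[:pos] + "T" + ref[pos + 1:]

affected = changed_windows(ref, alt)
print("windows affected by the SNP:", affected)

# Call rate from the figures in the article
called, introduced = 808, 1044
call_rate = 100 * called / introduced
print(f"call rate: {call_rate:.1f} percent")
```

Because each SNP shifts several consecutive current levels rather than one, a variant leaves a broader footprint in the signal than a single changed level would.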
Looking ahead, the authors identified several areas where they could make improvements. First, they noted that the phi29 DNA polymerase may not be the ideal polymerase, writing that "much of the variance in current levels associated with the nanopore sequencing system described here results from the erratic and stochastic motion of the phi29 polymerase as it feeds the DNA through the pore." Switching to a different enzyme would likely improve performance, they said.
In addition, they said that there is room to improve the variant calling algorithms for nanopore sequencing data, as well as to improve on the device itself through advances in "nanopore parallelization, channel setup, and microfluidics."
If the UW system is commercialized, it will almost certainly compete with Oxford Nanopore's MinION. In the paper, the researchers said that the experimental device they used cost less than $20,000, but Oxford Nanopore already expects to market its MinION at $1,000.
According to Akeson, the roadmap that the UW team laid out in the paper is "very close to the MinION."