
NEW YORK (GenomeWeb) – Researchers at the University of Toronto have used the Oxford Nanopore MinIon to sequence three clinically relevant human genes – CYP2D6, HLA-A, and HLA-B – and determine their variants and haplotypes without the need for statistical phasing.
The results of the study, which was published online this week ahead of peer review in F1000Research, an online publishing platform, "demonstrate that nanopore sequencing is an emerging standalone technology with potential utility in a clinical environment to aid in medical decision-making," according to the authors, who participate in Oxford Nanopore's early-access program for the MinIon.
"Ultimately the error rate needs to be reduced before nanopore-seq can be used in the clinic," Ron Ammar, a postdoc at the Donnelly Centre at the University of Toronto and the lead author of the paper, told GenomeWeb. To his knowledge, this is the first published nanopore study that uses human DNA, he said.
For their study, the researchers PCR-amplified the three genes from HapMap sample NA12878, which has been extensively characterized using other sequencing platforms. After generating the amplicons, which are about 4 to 5 kilobases long, they used a total of 1.5 micrograms of PCR product from the three genes to construct MinIon sequencing libraries.
Using the R7.3 chemistry in a single 24-hour MinIon run, they produced about 19,700 1D template reads, 9,600 1D complement reads, and 7,500 2D reads, each of which results from combining a template read and its complement. Mean read lengths were 2.7 kilobases for 1D template and complement reads, and 3.5 kilobases for 2D consensus reads.
Next, they mapped the reads to the human genome reference assembly using the BLASR aligner, which was originally developed for Pacific Biosciences data. They were able to align about 19 percent of the 1D template reads, 28 percent of the 1D complement reads, and 63 percent of the 2D consensus reads. The mean mapping accuracy for all read types was on the order of 72 to 74 percent.
For the final analysis, they performed two alignments, using only 2D reads where possible because of their better accuracy. For one, they only selected the single best alignment for each long read. This, they wrote, was important for analyzing the CYP2D6 gene, which has two closely related pseudogenes nearby, and allowed them to verify that their PCR product indeed derived from CYP2D6.
For the second, they allowed multiple alignments to the reference for each read, enabling them to "gather all reads for a single gene in a single pileup to any of the eight HLA-A/B loci to generate a consensus sequence."
Several variant and haplotype callers did not work for the analysis because of the "long reads, high error rates and continuously evolving error profile of the MinIon basecalls at this early stage," the authors wrote, but they were able to obtain variants and haplotype information from the coverage of aligned reads, and haplotype proportions from clinical marker positions across all aligned reads. The mean sequence coverage was 1,240-fold for CYP2D6, 790-fold for HLA-A, and 1,420-fold for HLA-B.
The researchers validated their results against genotyping data from Complete Genomics, data they generated using Sequenom MassArray and Taqman qPCR, and haplotypes from statistical phasing and from the HapMap project.
With the high coverage they achieved, they were able to call many positions with a consensus of 70 to 90 percent. Though the MinIon basecalls had a high error rate, "the majority of errors appear to be randomly distributed across the length of the reads, which is why increasing coverage can yield a consensus that matches variant calls from existing sequencing and genotyping platforms such as Illumina, Complete Genomics, and Sequenom," they wrote.
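The reasoning behind that statement, that randomly distributed errors are outvoted once enough reads cover a position, can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function name `consensus`, the toy reads, and the simple majority vote are all assumptions made for illustration, and the reads are assumed to be already aligned to the same coordinates. A plain majority vote would also mask heterozygous positions, which a real analysis has to handle.

```python
from collections import Counter

def consensus(pileup_reads, gap="-"):
    """Majority-vote consensus over reads aligned to the same coordinates.

    Returns the consensus string and, for each position, the fraction of
    covering reads that agree with the consensus base.
    """
    length = len(pileup_reads[0])
    bases, support = [], []
    for i in range(length):
        # Collect the bases observed at this position, ignoring gaps.
        column = [read[i] for read in pileup_reads if read[i] != gap]
        if not column:
            # No read covers this position.
            bases.append("N")
            support.append(0.0)
            continue
        base, count = Counter(column).most_common(1)[0]
        bases.append(base)
        support.append(count / len(column))
    return "".join(bases), support

# Hypothetical toy pileup: three of the five reads carry one error each,
# at different positions, so each error is outvoted by the other reads.
reads = [
    "ACGTAC",
    "ACGTAC",
    "ACCTAC",
    "ACGTTC",
    "TCGTAC",
]
seq, support = consensus(reads)
print(seq, support)
```

Because the errors in the toy reads fall at different positions, every column still has a clear majority, which is the property the authors describe for the MinIon data at high coverage.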
In total, they had enough long mappable reads to phase all variants in the loci they studied, without the need for parental haplotypes or statistical phasing. However, not all nanopore-determined haplotypes matched the statistically phased haplotypes from the other platforms, which the researchers attributed to an early PCR error or sample contamination.
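Phasing from long reads rests on a simple observation: when a single read spans two variant positions, the alleles it carries at both positions sit on the same chromosome copy. A minimal Python sketch of that idea follows. The function names `phase_pair` and `top_haplotypes`, the positions, and the toy reads are hypothetical, the reads are assumed to be aligned without indels, and this is not the method used in the paper.

```python
from collections import Counter

def phase_pair(reads, pos_a, pos_b):
    """Count the allele combinations seen together on individual reads."""
    pairs = Counter()
    for read in reads:
        # Only reads that span both positions carry phase information.
        if len(read) > max(pos_a, pos_b):
            pairs[(read[pos_a], read[pos_b])] += 1
    return pairs

def top_haplotypes(pairs, n=2):
    """Return the n most frequently observed allele combinations."""
    return [combo for combo, _ in pairs.most_common(n)]

# Hypothetical toy reads with variant positions at indices 1 and 4.
# One read carries a sequencing error at index 4, and one is too short
# to span both positions.
reads = [
    "CAGGGT", "CAGGGT", "CAGGGT", "CAGGGT",  # haplotype A...G
    "CTGGCT", "CTGGCT", "CTGGCT",            # haplotype T...C
    "CAGGCT",                                # error read
    "CAG",                                   # does not reach index 4
]
pairs = phase_pair(reads, 1, 4)
haplotypes = top_haplotypes(pairs)
print(pairs, haplotypes)
```

In this toy case the two best-supported combinations stand out from the single error read, but an early PCR error or contamination of the kind the researchers mention would show up as a well-supported, incorrect combination that counting alone cannot flag.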
Overall, "while nanopore sequencing with the MinIon is demonstrably error-prone in its current stage of development, we assert that this technology holds promise for clinical applications because accurate consensus sequences can be built with sufficient coverage given the high number of reads generated," they concluded.
In terms of clinical applications, "it will be very hard, in my opinion, to do haplotyping at this point, unless they increase the accuracy more," said Nezih Cereb, CEO of HLA typing firm Histogenetics, who was not involved in the study. "It's a promising technology, but with 70 percent concordance, you cannot do much."
Histogenetics uses several sequencing platforms for HLA typing, including Illumina's MiSeq and Pacific Biosciences' PacBio RS II. The company is also an early-access user of the MinIon.
In the future, Ammar and his colleagues plan to use the MinIon for RNA-seq and to sequence repeat regions on the order of kilobases in the human genome.