This story has been updated to indicate that three groups have sequenced a human genome on the MinIon.
NEW YORK (GenomeWeb) – Two research groups have independently sequenced whole human genomes on Oxford Nanopore Technologies' MinIon device, indicating that the technology is progressing from being able to scan small bacterial or viral genomes to handling large eukaryotic genomes. Although both groups, who presented their results at a user group meeting in New York City last week, acknowledged that the cost of sequencing a human genome on the MinIon is prohibitive for routine use and basecalling is not yet accurate enough for clinical purposes, they said that it is a good first step and that the technology shows promise, particularly with the anticipated release of the higher throughput PromethIon.
A third group, a consortium of eight different laboratories, also sequenced a human genome on the MinIon and released initial data at the meeting last week.
"It's an exciting first step to be able to sequence human genomes on the MinIon, which doesn't require fluorescence or any imaging equipment," Michael Simpson, head of the genomic medicine research group at King's College London, told GenomeWeb. "I'm hopeful this will scale with the PromethIon."
Simpson presented work from a team of researchers associated with the University of Oxford's Wellcome Trust Centre for Human Genetics and genome analytics firm Genomics. They used the MinIon to sequence the genome of the well-characterized human sample, NA12878, which was also the first reference sample for the National Institute of Standards and Technology's Genome in a Bottle Consortium.
To do the sequencing, Simpson's group used Oxford Nanopore's latest chemistry release, known as R9.4, as well as a 1D sequencing protocol with size selection and PCR. They ran eight MinIon devices in parallel, using a total of 47 flow cells, three of which utilized a long-read protocol. The sequencing took about four weeks, Simpson said, and consumables costs ran around £20,000 ($25,400).
From the standard libraries, the researchers achieved mean read lengths of 6,587 base pairs, and from the long-read libraries, 7,684 base pairs. The genome was sequenced to an average coverage of about 40X and more than 99 percent of the genome was covered by at least one read. Overall accuracy was 99.93 percent, and at 35X coverage, there was a 0.30 false negative rate and a 0.32 false positive rate for basecalling.
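As a rough back-of-the-envelope check on those figures, the average yield per flow cell can be estimated from the reported coverage and flow cell count. The sketch below assumes a haploid human genome size of about 3.1 gigabases, a figure that is not stated by the researchers, and the function name is purely illustrative.

```python
# Assumption: approximate haploid human genome size; not a figure from the study.
GENOME_SIZE_BP = 3.1e9


def per_flow_cell_yield_gb(coverage, flow_cells, genome_size_bp=GENOME_SIZE_BP):
    """Estimate the average sequence yield per flow cell, in gigabases.

    Total bases sequenced is approximated as coverage * genome size,
    then divided evenly across all flow cells.
    """
    total_bases = coverage * genome_size_bp
    return total_bases / flow_cells / 1e9


# Reported values: about 40X coverage from 47 flow cells.
estimate = per_flow_cell_yield_gb(40, 47)
print(f"Approximate yield per flow cell: {estimate:.2f} Gb")
```

Under that assumption, the estimate works out to a little over 2.6 gigabases per flow cell on average, though actual yields would have varied between runs and between the standard and long-read protocols.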
"SNP calling is not yet at the quality of Illumina," Simpson said, particularly for heterozygous SNPs, but "from our simulations, basecalling just needs to improve by a small amount to achieve usable SNP calls directly from the sequence data."
The goal of sequencing a human genome was to act as a proof of principle and to highlight the advantages of long nanopore reads. To that end, the researchers looked at a couple of clinically relevant but challenging-to-analyze genes. For example, short reads often do not map uniquely to the cancer predisposition gene PMS2, but the nanopore reads were long enough to avoid such mapping issues. They found similar results when they looked at another tricky region, the beta defensin cluster on chromosome 8. Again, short reads often do not uniquely align, but the longer nanopore reads did not have those issues and provided more uniform coverage across the entire region. The researchers also demonstrated that nanopore sequencing was able to identify large structural variants, including a heterozygous 2.5-kilobase deletion and a heterozygous 7-kilobase deletion.
"The longer the reads, the better," Simpson said, adding that the group was also able to phase entire chromosomes.
For a second test case, the researchers sequenced the genome of an individual with an immune disorder associated with cytopenia and cerebral ataxia. The individual had previously been sequenced as part of a trio whole-genome sequencing project. The previous sequencing, done with short-read sequencing technology, had identified two de novo missense variants located 2.3 kilobases apart in the SAMD9L gene. However, it was unclear whether the two variants were on the same haplotype.
For this genome, the group ran 35 flow cells, getting similar read lengths, throughput, and error rates as for the previous genome. The sequencing generated eight reads that spanned both de novo variant sites, seven of which confirmed that the variants are on the same haplotype.
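The logic behind that kind of read-based phasing can be sketched in a few lines. The example below is a simplified illustration, not the group's actual method: it assumes each spanning read has already been reduced to a pair of allele calls ("ref" or "alt") at the two variant sites, and the function name and input data are hypothetical. Reads carrying the same allele type at both sites support the variants being on the same haplotype (cis), while mixed reads support opposite haplotypes (trans).

```python
from collections import Counter


def classify_spanning_reads(reads):
    """Count spanning reads that support a cis or trans arrangement.

    reads: list of (allele_at_site_1, allele_at_site_2) tuples,
    where each allele is "ref" or "alt".
    """
    counts = Counter()
    for site_1, site_2 in reads:
        if site_1 == site_2:
            # alt/alt or ref/ref: consistent with both variants on one haplotype
            counts["cis"] += 1
        else:
            # alt/ref or ref/alt: consistent with variants on opposite haplotypes
            counts["trans"] += 1
    return counts


# Toy input only: the alleles on the individual reads were not reported,
# so these eight tuples are placeholders chosen to mirror a 7-to-1 split.
toy_reads = [("alt", "alt")] * 7 + [("alt", "ref")]
counts = classify_spanning_reads(toy_reads)
print(counts["cis"], "reads support cis;", counts["trans"], "supports trans")
```

With a per-base error rate as high as current nanopore data shows, a single discordant read among eight would not be surprising, which is why a majority of spanning reads, rather than unanimity, is the useful signal.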
Simpson said that the group has a number of next steps. First, they will generate more data on the genomes to see whether increased sequencing depth helps improve variant calling. That will also enable them to look "at a more detailed level at whether the errors we see at base calling are random or whether there is some structure to those," he said.
Independently, researchers from UMC Utrecht's Center for Molecular Medicine demonstrated that whole-genome sequencing with the MinIon could identify disease-causing structural variants. Although such an application is not yet ready for diagnostic purposes, Wigard Kloosterman, who led the work, told GenomeWeb that the group would continue to consider it as the technology improved.
Kloosterman's team began using the MinIon for whole-genome sequencing in April of this year, using the R7 chemistry. By September, the group had switched to the R9 chemistry and completed around 90 runs, reaching just below 10X coverage with the R7 and R9 chemistries. They then upgraded to the R9.4 chemistry and by mid-November had run 122 flow cells and generated a genome with 16X mean coverage.
The genome they chose to sequence was that of an individual with chromothripsis, a phenomenon involved in congenital diseases as well as cancer that is characterized by complex genome rearrangements. The individual had also been sequenced using Illumina technology, which had identified 40 breakpoints involved in structural variants.
Using nanopore sequencing may be preferable for this, however, due to the longer reads. "To figure out what went wrong, we need to be able to build the long-range structure of the rearrangements, which is difficult to do from short-read data," Kloosterman said.
To analyze the nanopore data, Kloosterman's group developed a structural variant caller, NanoSV, which was able to detect all 40 breakpoints that had been identified by Illumina sequencing. The researchers also wanted to see whether the MinIon could uncover additional structural variants. They looked for structural variants genome-wide, using three different callers on the nanopore data, including NanoSV, and they compared the nanopore data with the Illumina data. In general, when structural variant calls overlapped between the Illumina data and nanopore data, the calls were accurate, Kloosterman said. But accuracy dropped considerably if structural variants were only called by two different nanopore callers. The NanoSV caller, in particular, had about a 30 percent to 40 percent validation rate for calls that were not also called by Illumina sequencing.
The group plans to continue to refine NanoSV by tweaking its parameters. "We want to fine-tune the caller on the algorithmic side to catch more signatures and find more support for the structural variants that we can pick up," Kloosterman said, adding that it is a matter of balancing sensitivity and specificity. In addition, he said, the group wants to optimize the tool for speed and ease of use. The researchers do not plan to commercialize NanoSV, which Kloosterman said is freely available on GitHub.
Another important application the group demonstrated is the ability of the nanopore sequence data to determine heritability. Kloosterman said that because of the long reads, the researchers were able to figure out whether the structural variant breakpoints originated on maternal or paternal chromosomes. For the patient they sequenced, for example, all 40 breakpoints were inherited paternally. "That gives new biological insights into how the disease originated," Kloosterman said.
The group has since begun sequencing the genome of a second patient, aiming for 15X coverage, which Kloosterman hopes will be possible with between 15 and 20 sequencing runs using the R9.4 chemistry.
Longer term, Kloosterman is interested in applying the technology to cancer genomes. If the team gains access to the higher throughput PromethIon, he would also like to apply it to population sequencing projects. As part of the Genome of the Netherlands project, "we've found many structural variants, and it would be interesting to apply long-read technology to that project," he said.
Because the current error rate is just below 10 percent, Kloosterman said, structural variant detection is a more appealing application than analyzing SNVs. Nonetheless, researchers have now demonstrated that the MinIon can generate "robust sequencing of human genomes," he said, which is a "paradigm shift."