NEW YORK (GenomeWeb) – Nanopore sequencing can detect structural variants in a mixture of PCR amplicons from cancer cell lines, according to new research by a Johns Hopkins University team.
In a proof-of-principle study published yesterday as a preprint on bioRxiv, the scientists showed that they could detect a number of well-characterized structural variants, including large deletions, inversions, and translocations, that affect two tumor suppressor genes in pancreatic cancer cell lines, using amplicon sequence data from Oxford Nanopore's MinIon. The team was led by Jim Eshleman, a professor of pathology and oncology at JHU School of Medicine, and Winston Timp, an assistant professor of biomedical engineering at Hopkins.
The paper represents the first published study that evaluates the ability of the MinIon nanopore technology to detect structural variants, and the researchers plan to present their results during a poster session at the American Society of Human Genetics annual meeting in Baltimore this week.
"This work serves as proof of principle, showing the ability of nanopore sequencing to correctly and reliably detect SV with only hundreds, instead of millions of reads," the authors wrote. It also demonstrates that the MinIon can pick out well-characterized SVs from mixtures of PCR amplicons that are diluted 1:100 in wild-type sequence.
"What is demonstrated is that nanopore sequencing is good enough quality to identify [structural variants] when the template constituents are known," Eshleman told GenomeWeb.
For clinical applications, throughput will need to improve, he said, because this would require genome-wide detection of SVs or targeted sequencing of regions likely to contain an SV, rather than amplicon sequencing. "We are hopeful that this will be accomplished with advances in the current MinIon platform as well as with the larger instruments that Oxford is producing," he said.
In terms of structural variant detection, Oxford Nanopore is still "at a larval stage" compared to other technologies that offer long-range genome information, such as Pacific Biosciences, BioNano Genomics, Illumina synthetic long reads, and OpGen optical mapping, Timp told GenomeWeb. "We are still understanding the kinds of data and the best ways to analyze it," he said, adding that "we think that the MinIon has more potential for diagnostic and screening use, due to its low logistical overhead."
For their study, the Hopkins investigators PCR-amplified 10 structural variants that inactivate either the CDKN2A/p16 or the SMAD4/DPC4 tumor suppressor gene, using genomic DNA from pancreatic ductal adenocarcinoma cell lines.
These structural variants, which included interstitial deletions, translocations, inversions, and a combination of a translocation and an inversion, had been previously detected by either SNP arrays or whole-genome sequencing and confirmed by PCR and Sanger sequencing across the junctions. The size of the amplicons was just under 600 base pairs, and the researchers also included a wild-type control and a replicate.
They generated libraries for all 12 amplicons and sequenced them in a single MinIon run, which produced about 2.5 megabases of data from approximately 4,000 2D reads with an average read length of 640 base pairs. All amplicons mapped to the expected regions of the human genome, and their representation did not depend on the complexity of the SV, though one variant in particular had a low number of correctly aligned reads.
Next, to test the sensitivity of the assay, the researchers diluted six amplicons 1:100 in a background of wild-type p16 and sequenced the mix on the MinIon, generating 2.6 megabases of data from more than 4,000 2D reads with an average read length of 650 base pairs. All six amplicons aligned to the expected regions of the human genome, and one of the structural variants was detected from a mere 378 reads.
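As a rough sanity check, the yield reported for each of the two runs should be close to the number of reads multiplied by the average read length. The short Python sketch below works through that arithmetic. It assumes exactly 4,000 reads per run, which is only an approximation, since the study is described as producing "approximately" 4,000 reads in the first run and "more than" 4,000 in the second. The function name and run labels are illustrative and do not come from the paper.

```python
def estimated_yield_mb(read_count: int, mean_read_length_bp: float) -> float:
    """Approximate total yield in megabases from read count and mean read length."""
    return read_count * mean_read_length_bp / 1_000_000


# Assumed inputs: read counts are rounded to 4,000 for both runs,
# which is an approximation of the figures quoted in the article.
runs = {
    "12-amplicon run": (4000, 640),
    "1:100 dilution run": (4000, 650),
}

for name, (reads, mean_len) in runs.items():
    print(f"{name}: ~{estimated_yield_mb(reads, mean_len):.2f} Mb")
```

Under these assumptions the estimates come to about 2.56 and 2.60 megabases, in line with the roughly 2.5 and 2.6 megabases reported for the two runs.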
Finally, the team assessed how accurately they could detect the breakpoints of the structural variants, using an SV detection tool called Lumpy. While they were able to detect the correct breakpoints in some of the samples, those "lacked precision," they noted. However, they said this can likely be solved bioinformatically by optimizing alignment parameters or tailoring breakpoint detection "to the idiosyncrasies of nanopore sequencing data."
Timp said his group is already testing recently released bioinformatics tools, such as Nanopolish and PoreSeq, as well as unreleased tools, which he expects "will greatly improve the detection of breakpoints."
Overall, MinIon nanopore sequencing has advantages over short-read sequencing technologies in terms of its long reads that enable sequencing through repetitive regions, its speed that generates results in minutes rather than hours, its low capital cost of $1,000 per device, and its small size, the researchers wrote.
Its main limitations are the currently high mismatch and indel error rate and the limited yield per run, ranging from megabases to gigabases. But both continue to improve, they wrote, for example through better bioinformatic tools and new hardware expected from Oxford Nanopore later this year and next year.
"Given the speed, small footprint, and low capital cost, nanopore sequencing could become the ideal tool for the low-level detection of cancer-associated SVs needed for molecular relapse, early detection, or therapeutic monitoring," the authors wrote.
Timp said he and his colleagues are currently following up on their study by attempting to capture common SV hotspots in solution, which will allow them to detect breakpoints with unknown alternative ends. This work, he said, is ongoing but still requires validation.
In terms of alternative platforms for structural variant detection, Timp said that Pacific Biosciences' recent announcement of the Sequel platform is "very exciting" because of the reduced instrument price and dramatic increase in yield. He also noted that PacBio already produces polished data with established bioinformatic pipelines.
"But Oxford Nanopore is a different animal — the platform cost is effectively zero, as is the footprint and operation," he said, adding that this will allow sequencing to spread from research laboratories to "the typical hospital/clinical setting."