Australian researchers have developed a targeted, multiplexed sequencing strategy that relies on automated primer design software and carefully selected PCR conditions to minimize sequence bias across amplicons of interest.
"We designed the methods to facilitate our work in cancer predisposition gene 'discovery' screening," University of Melbourne genetic epidemiology researcher Daniel Park told In Sequence in an email message. "However, Hi-Plex can be applied to a broad range of research and diagnostic applications requiring the analysis of a moderate number of genomic regions."
In addition to software for designing the primers themselves, Hi-Plex incorporates a heel clamp primer approach, modest amplicon sizes, specific PCR cycling conditions, and high-fidelity enzyme selection to perform high-throughput sequencing on multiple targets of interest in parallel, including amplification of some tricky-to-tackle sequences.
Though the approach spans the primer design, library preparation, and even some sequence analysis steps in the targeted sequencing process, the sequencing step itself is performed using existing sequencing instruments such as Illumina or Life Technologies' Ion Torrent platforms.
The Hi-Plex method was born from the University of Melbourne team's ongoing efforts to track down new cancer susceptibility genes and the alterations in them that increase individuals' risk of developing cancer — work that it believes may ultimately lead to better predictive tests for these pathogenic mutations.
"A major bottleneck for the validation of candidate genes is the need to sequence their complete coding regions in 'germline' DNA (e.g. from a blood sample) from thousands of cancer-affected and unaffected people," Park said. "We needed a much faster, cheaper, and less labor-intensive approach to take over from the traditional approaches of [high-resolution melting analysis] or Sanger sequencing."
There are several existing strategies for doing targeted, multiplex sequencing, including a variety of commercial kits, such as Ion AmpliSeq from Life Tech, Agilent's HaloPlex, or Illumina's TruSeq Amplicon library preparation method.
In an effort to devise a somewhat cheaper and simpler strategy of their own, Park and his Australian colleagues introduced several targeted sequencing tweaks spanning the primer design through sequence analysis steps.
During the PCR process, for example, their Hi-Plex protocol relies on primers with so-called 5-prime heel clamps. Each primer combines a gene-specific sequence with a universal adapter sequence suited to the sequencing technology being used.
Such primers "serve to enhance gene-specific primer binding after initial seeding and also to allow universal adapter primers to 'drive' the majority of the PCR," Park said. "In this way we minimize bias that would otherwise result from differing gene-specific primer efficiencies."
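As a rough illustration of the primer layout Park describes, the sketch below joins a universal heel to a gene-specific 3-prime portion. The heel and gene-specific sequences are made-up placeholders rather than sequences used in Hi-Plex, and the function names are hypothetical.

```python
# Placeholder universal heel sequences (assumptions for illustration only;
# real heels depend on the sequencing platform's adapters).
FORWARD_HEEL = "ACGTACGTACGTACGTAC"
REVERSE_HEEL = "TGCATGCATGCATGCATG"


def add_heel(heel: str, gene_specific: str) -> str:
    """Return a 5'-heeled primer: universal heel followed by the
    gene-specific portion at the 3' end."""
    return heel.upper() + gene_specific.upper()


def heeled_pair(forward_gs: str, reverse_gs: str) -> tuple[str, str]:
    """Build the forward and reverse heeled primers for one amplicon."""
    return add_heel(FORWARD_HEEL, forward_gs), add_heel(REVERSE_HEEL, reverse_gs)


# Hypothetical gene-specific sequences for a single target.
fwd, rev = heeled_pair("ggatcctgaagctt", "ttcagcatgcaacc")
print(fwd)
print(rev)
```

Because every primer in the pool carries the same heel, a single pair of universal adapter primers can take over amplification after the first few cycles, which is the bias-reducing behavior Park describes.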
Hi-Plex also hinges on automated primer design software that produces uniform and relatively small amplicons — currently on the order of 100 bases of sequence between the primers.
Though the method could theoretically be tailored to achieve a range of insert sizes, that consistent amplicon length appears to help when tackling tricky-to-read sequences while keeping the amplicons easy to process, Park said.
He noted that the approach can also provide an advantage when dealing with fragmented DNA found in some formalin-fixed, paraffin-embedded tumor samples or dried blood spot cards.
The fairly narrow range of insert sizes offers an edge when selecting amplicons to take forward for sequencing analysis, according to Park, since these PCR products can be easily nabbed for subsequent steps in the library preparation protocol.
"[T]he tight size distribution of the product library or libraries allows us to perform stringent size selection in a single gel lane using a simple agarose gel electrophoresis," he explained. "This contributes to the ability to use primers that would otherwise not be deemed of sufficient quality due to off-target priming effects."
Another trick the team employs is the use of permissive PCR cycling conditions. Park noted that by spanning a gradient of annealing and extension temperatures during each cycle, the Hi-Plex protocol can accommodate primers with a variety of annealing temperatures and extend across a range of sequences, from those with high guanine and cytosine, or GC, content to low-GC regions.
Coupled with a highly processive, high-fidelity polymerase for minimizing PCR errors or drop-off, that set of cycling conditions is designed to produce accurate amplicons even in the face of somewhat poor primer quality.
"Our demonstration in the recent papers included gene-specific primers of apparent low quality," Park said, "which highlights the utility of Hi-Plex in contexts where optimal primer design is not possible."
In their proof-of-principle paper in Biotechniques, for example, the researchers performed 60-plex amplicon sequencing on the breast cancer predisposition genes PALB2 and XRCC2 using DNA from lymphoblastoid cell lines or formalin-fixed, paraffin-embedded breast cancer samples from an Australian cancer study.
In that analysis, some 87 percent of reads generated by Ion Torrent PGM sequencing of Hi-Plex-prepared amplicons could be mapped to targeted portions of PALB2 and XRCC2 in the human reference genome.
So far the team has tackled as many as 96 samples at a time using Hi-Plex, Park said, noting that it should be feasible to stretch that out to look at far more samples simultaneously using a dual-index method for multiplexing.
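To see why dual indexing scales sample numbers, consider that each sample is tagged with one index on each end of the amplicon, so the number of distinguishable samples is the product of the two index set sizes. The sketch below uses hypothetical index names and set sizes chosen only for illustration.

```python
from itertools import product


def dual_index_pairs(first_indexes, second_indexes):
    """Return every (first, second) index combination; each combination
    can label one sample in a pooled sequencing run."""
    return list(product(first_indexes, second_indexes))


# Hypothetical index names: 12 on one end, 8 on the other.
first = [f"A{n:02d}" for n in range(1, 13)]
second = [f"B{n:02d}" for n in range(1, 9)]

pairs = dual_index_pairs(first, second)
print(len(pairs))  # 96 combinations from only 20 index sequences
```

Doubling both sets to 24 and 16 indexes would give 384 combinations, which is how a dual-index scheme can stretch well beyond 96 samples without a matching increase in the number of index sequences.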
At the moment, the researchers have taken a crack at sequencing the Hi-Plex-prepared amplicons using either 2x150-base paired-end reads generated using Illumina TruSeq chemistry or 200-base Ion Torrent reads.
In their platform comparison study in Analytical Biochemistry, for instance, the University of Melbourne researchers found that Hi-Plex performance was similar when used in conjunction with Illumina or Ion Torrent sequencing.
Hi-Plex is expected to be compatible with any of the existing high-throughput sequencing instruments, Park noted. Even so, he said that platforms amenable to paired-end sequencing offer a bit of an edge when using one of the variant-calling approaches designed with Hi-Plex in mind.
By designing amplicons in such a way that paired-end reads spanning a certain sequence overlap with one another, he said, it's possible to do very stringent variant calling by only registering the variants that are supported by both reads from a given pair.
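The idea of calling only variants that both reads of a pair support can be sketched in a few lines. This is a simplified illustration, not the group's caller: it assumes both reads have already been aligned to the same amplicon interval, oriented to the reference strand, and contain substitutions only, so all three strings have equal length.

```python
def concordant_variants(reference: str, read1: str, read2: str):
    """Return (position, ref_base, alt_base) for each position where both
    reads of a pair agree with each other and differ from the reference.
    Positions where the two reads disagree are never called."""
    if not (len(reference) == len(read1) == len(read2)):
        raise ValueError("reference and both reads must cover the same interval")
    calls = []
    for pos, (ref, b1, b2) in enumerate(zip(reference, read1, read2)):
        if b1 == b2 and b1 != ref:
            calls.append((pos, ref, b1))
    return calls


reference = "ACGTACGT"
read1 = "ACGAACGT"  # differs from the reference at position 3
read2 = "ACGAACTT"  # differs at positions 3 and 6
print(concordant_variants(reference, read1, read2))  # [(3, 'T', 'A')]
```

The mismatch at position 6 appears in only one read of the pair, so it is treated as a likely sequencing or PCR error and dropped, while the position 3 change is seen in both reads and is kept.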
"[W]e wanted to be able to make maximum use of paired-end sequencing to allow very accurate 'calling' of variants," Park said. "This feature is generally useful, but will be particularly valuable in the context of accurately detecting rare somatic mutations in cancer diagnostics, for example."
In research slated for publication in the not-too-distant future, Park and his colleagues reportedly applied that variant calling approach to a more thorough, high-throughput analysis of the PALB2 and XRCC2 genes.
Upon publication of that study, Park said the group plans to introduce "an automated variant caller that uses the primer co-ordinate information for 'clipping' and completely-overlapping read-pairs comparison for accuracy."
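Primer clipping matters because the bases at each end of an amplicon read come from the synthetic primer rather than the sample, so they cannot reveal a real variant. The sketch below shows the general idea only; it is not the group's unreleased caller, and it assumes the read spans the full amplicon so that the primer lengths alone determine what to trim.

```python
def clip_primers(read: str, fwd_primer_len: int, rev_primer_len: int) -> str:
    """Trim primer-derived bases from both ends of a full-amplicon read,
    returning only the insert sequence between the primers."""
    if fwd_primer_len < 0 or rev_primer_len < 0:
        raise ValueError("primer lengths must be non-negative")
    if fwd_primer_len + rev_primer_len >= len(read):
        return ""  # nothing but primer sequence remains
    return read[fwd_primer_len:len(read) - rev_primer_len]


# Hypothetical read: 3-base primers flanking a 6-base insert.
print(clip_primers("AAACCCGGGTTT", 3, 3))  # CCCGGG
```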
Though Hi-Plex does not currently achieve the levels of multiplexing that are possible with commercial kits, its developers say the method offers advantages on the labor side, since the protocol is relatively short and hands-off.
In particular, Hi-Plex sample prep can be done for a fairly modest price using readily available equipment in the lab, Park said, since the "only significant outlay is for primers, which will cover many thousands of reactions."
In their unpublished targeted sequencing study of PALB2 and XRCC2, Park noted that the researchers managed to screen for cancer-related mutations in coding portions of the genes at a cost of around $20 per sample using Hi-Plex-based targeted, multiplex sequencing.
That price tag included library preparation as well as sequencing reagents, labor, and analytical costs, though the group is keen to find ways of reining in the cost per specimen even further going forward.
The researchers are also aiming to develop an even more sophisticated primer design method, while tweaking the Hi-Plex protocol in ways that make it possible to multiplex and sequence many more amplicons simultaneously.
"In future experiments, we will test Hi-Plex for considerably higher parallelization," Park and co-authors wrote in Biotechniques, "with the aim of achieving robust thousands-plex single-tube multiplexing."
The current iteration of the Hi-Plex primer software, known as Hiplex-primer, is publicly available through GitHub.