NCI Develops High-Throughput, Low-Cost Sequencing Assay for HPV Research


NEW YORK (GenomeWeb) – Researchers at the National Cancer Institute have developed a new sequencing assay that detects 51 different types of papillomavirus that can infect human epithelial tissues.

The novel open-access test employs two stages of PCR amplification and enables a single technician to process a 768-sample batch in three days, running more tests for less cost than commercially available systems.

As described last week in the Journal of Clinical Microbiology, the assay was validated against a standard commercial test from Roche called the Linear Array HPV Genotyping Test. Specific and sensitive results were enabled by using a PCR reagent from Integrated DNA Technologies called RNase H2-dependent primers. The workflow was developed for both the Thermo Fisher Scientific Ion and Illumina sequencing platforms, and it runs at a cost of between approximately $2 and $10 per sample.

"Our motivation to develop this assay was to generate a high-throughput, low-cost assay that can be applied to really large epidemiological and surveillance studies, and that is what we can achieve with this assay," said co-author Nicolas Wentzensen, deputy branch chief and a senior investigator at NCI.

Development of the assay was funded through NCI's intramural research program, and it is now available for use by other labs.

There are more than 200 types of HPV, most of which cause conditions such as warts and genital lesions. About 14 types of the virus are considered high risk for causing cervical, anal, and throat cancers, while another 35 or so are also present in skin, mucosa, and squamous epithelia but are considered low risk.

According to Wentzensen, although the low-risk HPV types do not have any clinical significance, researchers often test for them to study etiology, viral biology, population prevalence, and sexual transmission. For these purposes, there are a few commercially available tests, including the Roche test.

Sarah Wagner, a co-author on the JCM paper and a developer at Leidos Biomedical Research — which is the operator of the Frederick National Laboratory that is sponsored by NCI — said in an email that the priorities for the new assay were that it would be "low cost, high throughput, and broad spectrum," and yet also have a sensitivity and specificity comparable to the Linear Array.

The workflow Wagner worked out consists of three steps, the first of which is direct amplification of all 51 types of HPV.

While some assays use primers for conserved regions in order to amplify many subtypes with the same primers, the disadvantage of this approach is that the amplification may not be optimal for all subtypes. "Our assay uses specific primers for each genotype," Wentzensen explained, so that, in total, there are 127 primers, with some types having a few more primers to account for variation in primer binding sites in some genotypes.

After direct amplification of each type at its optimal conditions, the second step is essentially "reducing the individual sequences and making them more similar," Wentzensen said.

Finally, the third step prepares target DNA for NGS, including adding barcodes. Multiple types are sequenced in parallel, and the bioinformatic readout can be automatically converted into an Excel sheet containing the typing results, Wentzensen said. "It's a highly automated workflow, particularly on the bioinformatics side," he added.

Wagner noted that the combination of type-specific and consensus priming strategies is unique to the NCI assay, called TypeSeq, as is the broad spectrum of detectable types and the deliberate equalization of type copy number to boost sensitivity.

There are only a few commercially available genotyping assays for HPV that have similar broad type coverage. However, according to Wentzensen, the problem with these is that they are "kind of laborious, they're not really suited for high throughput, and they're also relatively expensive."

The study showed that the NCI test has high concordance with the Linear Array, an industry-standard test that detects 37 HPV subtypes and has been widely evaluated in clinical and epidemiological studies, Wentzensen said.

"We really feel like we can use it in situations where we would otherwise use Linear Array, but where we have such a large number of specimens that it would take too much time and be too costly," he said.

There were some discrepant results between the two tests, but that was expected, Wentzensen said, and depending on the application of the assay these may not be so meaningful. The group saw high overall agreement of over 93 percent, with most subtypes in the 98 percent and 99 percent range, he said. But this estimate is skewed by the large number of samples that are negative for more rare subtypes.

Percentage concordance goes down for some targets when the group looked only at positive agreement, "but still we are in a range where we are very happy with the performance, and the lowest percentages are found for types that have very low numbers [in the population], and were also low-risk types that we don't know very well and are less relevant," Wentzensen said.

For example, of the 863 clinical samples in the study, there were only 15 women with a positive HPV-34 target; seven of these were positive for both assays, and eight were positive for TypeSeq but not Linear Array. The net result was a 46 percent positive agreement, "but the numbers are so low that even a few discrepants make a big impact," Wentzensen explained.

On the other hand, there was very good agreement for the carcinogenic types, he said. For example, the group had 288 samples positive for HPV-16 on both tests, and some discrepants, with a positive agreement of 91 percent.
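The gap between overall and positive agreement can be checked with a short calculation. The sketch below uses the HPV-34 counts quoted above; the zero count for samples positive only on Linear Array is implied by those figures rather than stated, and treating all 863 samples as scored for HPV-34 on both assays is an assumption. The function name and the formula for positive agreement (double positives divided by samples positive on either assay) are illustrative choices, not necessarily the exact statistic used in the JCM paper.

```python
def agreement(both_pos, only_a, only_b, both_neg):
    """Return (overall agreement, positive agreement) for two assays.

    overall  = concordant results / all samples
    positive = double positives / samples positive on at least one assay
    """
    total = both_pos + only_a + only_b + both_neg
    overall = (both_pos + both_neg) / total
    any_pos = both_pos + only_a + only_b
    positive = both_pos / any_pos if any_pos else float("nan")
    return overall, positive


# HPV-34 counts as quoted in the article
total_samples = 863
both_pos = 7            # positive on TypeSeq and Linear Array
typeseq_only = 8        # positive on TypeSeq only
linear_array_only = 0   # implied: 7 + 8 accounts for all 15 positives
# Assumption: every remaining sample was negative on both assays
both_neg = total_samples - both_pos - typeseq_only - linear_array_only

overall, positive = agreement(both_pos, typeseq_only, linear_array_only, both_neg)
print(f"overall agreement:  {overall:.1%}")   # 99.1%
print(f"positive agreement: {positive:.1%}")  # 46.7%
```

With 848 double negatives, overall agreement stays near 99 percent even though fewer than half of the positive calls match, which is why the overall figure is skewed for rare types. The positive agreement of about 46.7 percent is in line with the roughly 46 percent cited above.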

Though the assay was developed for both the Thermo Fisher Scientific and Illumina systems, the choice of sequencing instrument had an effect on cost. According to the study, the cost to run the TypeSeq assay on MiSeq was $9.50 per sample at 96 samples per flowcell. Running the TypeSeq workflow on the Ion S5 cost between $2 and $6 per sample, depending on the Ion chip type and scale of multiplexing, with the total cost of the assay plus NGS for the standard Ion batch size of 768 samples plus controls being under $6 per sample, excluding DNA extraction, labor, and equipment.

The average total cost per sample, considering the different costs for each instrument but also the different throughputs, was around $3.50 per sample, and the researchers noted in the study that any DNA extraction method that produces high purity DNA free of PCR inhibitors is compatible with the assay.

The hands-on processing time was approximately 2.5 minutes per sample for manual processing, and under 2 minutes per sample for automated processing.

The genotyping analysis workflow, meanwhile, is completely automated, and "due to the highly parallelized processing was typically completed within one hour, with no user intervention or judgment required for calling," the authors wrote in the JCM study.

IDT reagent key to workflow

To achieve a high-throughput workflow, Wagner said the team knew it needed to utilize NGS. For their purposes, the more widely used consensus primer approach would not have achieved a high enough multiple-type sensitivity without requiring large numbers of reads per sample, which would increase the cost, she said.

"Very few broad-spectrum HPV typing assays use type-specific primers, likely due to the scale of multiplexing needed and the degree of homology between types, which makes the primer design particularly challenging," Wagner noted. She decided to use a combination of type-specific and consensus priming for the assay, with the type-specific PCR intended to "increase and normalize the copy number of all types present within the sample in a non-competitive system to reduce the chance of drop outs," and the consensus priming strategy then incorporating sequencing adapters and barcodes.

Wagner noted that an initial version of the assay took six months to develop and used standard primers for the type-specific multiplex PCR. This version, when it was benchmarked against the Linear Array, had adequate performance but the team wished to improve the sensitivity.

"It was clear that the sensitivity was affected in the type-specific PCR when late-amplifiers — in this case the viral isolates present at low viral loads — were not able to catch up to the early-amplifiers due to primers being consumed in artefacts and dimers," she said.

Wagner worked for six more months testing a range of buffer compositions, additives, and priming systems to reduce dimers, and finally hit on developing a version of the assay that employed IDT's RNase H2-dependent primers. These were "by far the most effective of everything tested," she said.

Specifically, the artefact and dimer formation were so low in the workflow using the IDT reagent that the group was "able to equalize each type's copy number within and between samples at the first PCR step, regardless of genomic copy number," Wagner said, adding, "This gave the system robust multiple-type sensitivity even with low numbers of sequencing reads."

The group's bioinformatician, David Roberson, developed a custom automated analysis plugin "with the ability to demultiplex dual-barcoded libraries on the Ion S5 platform, since dual barcoding is not currently supported by Thermo Fisher Scientific," Wagner said. That allowed for a very high level of sequencing throughput of approximately 800 samples per Ion 540 chip, and therefore a low per-sample sequencing cost, she added.
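To illustrate the general idea of dual-barcode demultiplexing, here is a minimal sketch. It is not the NCI plugin: the function name, the toy barcode sequences, the fixed barcode length, and exact-match lookup are all assumptions made for illustration, and a real pipeline would also handle sequencing errors, adapters, and barcode orientation. The point is that each sample is identified by a pair of barcodes, one at each end of the read, so m barcodes on one end and n on the other can distinguish up to m x n samples.

```python
from collections import defaultdict


def demultiplex(reads, sample_sheet, bc_len):
    """Assign reads to samples by their (leading, trailing) barcode pair.

    reads        : iterable of read sequences (strings)
    sample_sheet : dict mapping (start_barcode, end_barcode) -> sample name
    bc_len       : barcode length in bases (hypothetical fixed length)
    """
    by_sample = defaultdict(list)
    unassigned = []
    for read in reads:
        # A read must be longer than the two barcodes to carry any insert
        if len(read) <= 2 * bc_len:
            unassigned.append(read)
            continue
        pair = (read[:bc_len], read[-bc_len:])
        sample = sample_sheet.get(pair)
        if sample is None:
            unassigned.append(read)
        else:
            # Keep only the insert, with both barcodes trimmed off
            by_sample[sample].append(read[bc_len:-bc_len])
    return dict(by_sample), unassigned


# Toy example: two samples share a start barcode but differ in end barcode
sample_sheet = {
    ("ACGT", "TTAG"): "sample_1",
    ("ACGT", "GGCA"): "sample_2",
}
reads = [
    "ACGT" + "CCCCAAAA" + "TTAG",  # sample_1
    "ACGT" + "GGGG" + "GGCA",      # sample_2
    "TTTT" + "AAAA" + "TTAG",      # unknown pair, left unassigned
]
assigned, unassigned = demultiplex(reads, sample_sheet, bc_len=4)
print(assigned)
print(len(unassigned))
```

In the toy data, the shared start barcode alone could not separate the two samples; only the pair does, which is what lets a limited barcode set cover a large batch on a single chip.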

Wagner said that she hadn't been aware of the RNase H2-dependent primers during the initial assay development, but came across them when searching for oligo modifications that could reduce dimers.

"Because of their excellent specificity and sensitivity, I'm continuing to use the RNase H2-dependent primers in the development of several new multiplex assays which wouldn't be possible with standard primers," she said.

Although IDT has historically marketed the primers for genotyping, many other PCR applications could benefit from their use.

Allen Nguyen, IDT's director of vertical market development, said in an email that the company has been "selling 'do-it-yourself' rh primer and RNAse H2 components for several years, and a small number of researchers have utilized the potential of the reagents to improve PCR specificity." 

At the same time, the company has been developing and just launched a "fully optimized, complete solution for targeted amplicon sequencing using our rhAmp PCR technology, called rhAmpSeq Amplicon Sequencing," he said.

Nguyen noted that the company has also seen a lot of interest in the rhAmpSeq system for various targeted sequencing applications, including clinical research, agricultural biotech, targeted genotyping by sequencing (GBS), and, in the CRISPR market, efficient parallel confirmation of gene editing sites.

IDT has had over a dozen beta testers in various research areas, Nguyen said. For example, researchers at Cornell University's College of Agriculture and Life Sciences used the rhAmpSeq system to study markers by targeted GBS, using an amplicon panel targeting nearly 2,000 markers across various grape species. Researchers also presented a poster at the 2019 Advances in Genome Biology and Technology meeting describing use of the technology to evaluate CRISPR off-target effects.

The firm's new rhAmpSeq system uses the same underlying chemistry as used by Wagner, but IDT has significantly optimized the rh primer design and amplification mixes to further reduce primer dimers and non-specific amplification, and to maximize sequencing performance, Nguyen said.

For its TypeSeq HPV test, the NCI group now plans to coordinate with any labs interested in using the test. Wentzensen said that since the assay was published, a few groups have already contacted the team. "We've gotten a lot of interest already, and we're already doing some technology transfer with the [US Centers for Disease Control and Prevention]," he said, adding there is also interest from groups using the assay as a reference for other genotyping efforts.