NEW YORK (GenomeWeb) – Most next-generation sequencing labs that test for germline variants get concordant results, according to a recently completed global external quality assessment (EQA) study, though many labs have trouble detecting low-frequency somatic variants consistently.
NGS labs continued to use a variety of DNA target selection approaches and bioinformatics tools for base- and variant calling, according to the study, though some methods and approaches gained in share. Overall, labs largely used sequencing platforms from Illumina and Thermo Fisher Scientific, with Illumina dominating the germline testing area.
The aim of the 2016 pilot EQA scheme, conducted by the European Molecular Genetics Quality Network (EMQN) in collaboration with UK NEQAS for Molecular Genetics, was to assess the performance of labs involved in NGS testing for germline and somatic variants. Simon Patton, director of EMQN, presented results from the study, which wrapped up in March, at the Association for Molecular Pathology Global 2017 meeting in Berlin last month.
This was EMQN's fourth NGS pilot EQA scheme, and a record number of laboratories participated: 227 labs from 36 countries registered for germline testing and 76 labs from 23 countries for somatic testing. The previous year, a total of 236 labs took part in the scheme.
For both flavors of the scheme, the greatest number of participants came from Germany, the UK, and France, though labs from Switzerland, Spain, Denmark, and Australia also had good representation. The study was open to any NGS laboratory, not just diagnostic or clinical labs, though EMQN has traditionally served laboratories involved in patient testing.
Labs participating in germline testing were sent a human genomic DNA sample that EMQN had extensively validated beforehand with the help of four sequencing laboratories, on both Illumina and Ion Torrent platforms. In the future, upfront validation may no longer be necessary, Patton said, because consensus calls could be derived from the data submitted by the laboratories, provided a sufficient number of labs participate and there is no significant bias in the technologies they use.
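As a rough illustration of how consensus calls might be derived from participants' submissions, the following Python sketch keeps any variant reported by a minimum share of labs. The function name, the 75 percent threshold, the minimum lab count, and the variant coordinates are all assumptions made for this example; the article does not describe how EMQN would compute a consensus.

```python
from collections import Counter


def consensus_calls(lab_calls, min_labs=3, min_fraction=0.75):
    """Return the variants reported by at least min_fraction of the labs.

    lab_calls maps a lab ID to the set of variants that lab reported, each
    variant being a (chrom, pos, ref, alt) tuple. All labs are assumed to
    have analyzed the same region. Thresholds are illustrative only.
    """
    if len(lab_calls) < min_labs:
        raise ValueError("too few labs to derive a consensus")
    # Count how many labs reported each variant.
    counts = Counter(v for calls in lab_calls.values() for v in set(calls))
    needed = min_fraction * len(lab_calls)
    return {v for v, n in counts.items() if n >= needed}


# Made-up variants, for illustration only.
A = ("12", 100, "C", "T")
B = ("12", 200, "G", "A")
C = ("1", 300, "T", "G")
submissions = {
    "lab1": {A, B},
    "lab2": {A, B},
    "lab3": {A, B, C},
    "lab4": {A},
}
consensus = consensus_calls(submissions)
```

In this toy data set, the variant seen by only one lab is dropped from the consensus. A simple vote like this would inherit any systematic bias shared by most participants, which is the caveat Patton raised about technology bias.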
Labs were asked to analyze the sample in their usual way, sequencing as little as one gene or as much as the entire exome or genome. A total of 212 labs submitted up to three sets of results each, and 30 of them submitted whole-genome sequencing analyses.
For the somatic NGS scheme, labs received a formalin-fixed paraffin-embedded (FFPE) cell line reference sample with engineered mutations that were present at allele frequencies of about 5 percent, as well as a matched FFPE sample without those mutations. The samples came from Horizon Discovery and contained a total of six somatic variants, four in the KRAS gene and two in the NRAS gene, the identities of which were not known to participants. Fifty-eight labs submitted up to three results for the somatic scheme.
Patton said that for the 2016 scheme, EMQN and its partner Euformatics had invested considerably in automating the data analysis process, which previously required a lot of manual intervention, and had developed a standardized report format. That, he said, made the analysis more robust and allowed them to open the assessment to a larger number of participants. They are considering licensing the technology out to other providers of quality assessment schemes, he added.
For the germline scheme, about 80 percent of laboratories used an Illumina sequencing platform, most commonly the MiSeq and the NextSeq 500, and about 20 percent a Thermo Fisher Scientific Ion Torrent platform, mostly the PGM. A single laboratory ran the Roche 454 GS Junior, which Roche discontinued several years ago.
The platform mix was more evenly split for the somatic scheme, where 55 percent of labs used Illumina sequencers, again mostly the MiSeq, and 45 percent used Ion Torrent machines, in most cases the PGM.
Overall, data quality — the percentage of data above Q30, for example — on average appeared to be higher for Illumina platforms than for Ion Torrent platforms, Patton said, though he cautioned that fewer labs with Ion Torrent machines participated.
Regarding target capture solutions, the overwhelming majority of labs used commercial capture kits, including off-the-shelf and custom kits. Of the kits used by participants, about 30 percent came from Illumina, 25 percent from Thermo Fisher Scientific, 18 percent from Agilent Technologies, and 10 percent from Multiplicom, which is now part of Agilent. Other kits came from Roche NimbleGen, Qiagen, Fluidigm, Kapa Biosystems, and New England Biolabs.
Similarly, labs used a large variety of bioinformatics solutions. Overall, 70 percent performed bioinformatics in house — about half using commercial software and half pipelines built from public tools — and 13 percent outsourced their bioinformatics (the remainder did not reveal their strategy).
With regard to specific tools, for read alignment, about 80 percent of participants used a type of Burrows-Wheeler aligner (BWA). This number has grown since the first EQA for NGS in 2014, when 65 percent of labs used a BWA, Patton said. For variant calling, 61 percent of labs used GATK compared to 50 percent in 2014. Also popular for this task were VarScan (11 percent) and Samtools (8 percent).
Patton pointed out that none of these aligners and variant callers were originally written for clinical applications — they all came out of research laboratories. "They have never been through the same level of scrutiny and software validation that you would expect from a diagnostic software tool," he said, yet many commercial platforms have incorporated versions of these tools. "We all need to understand that there is an inherent risk of mistakes being made because the tools were never really developed with the stringency required for a diagnostic application."
When it came to consistency of the results between laboratories, participants in the germline scheme fared well: among labs that analyzed the same gene, 89 percent obtained at least 80 percent concordant variant calls, which is 15 percent more than in the 2015 EQA NGS scheme. "The majority of labs consistently get the same variants when they test the same gene, which can only be seen as a positive for anything in diagnostics," Patton said.
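To make the concordance figure concrete, here is a minimal Python sketch of how a per-lab concordance score and the share of labs reaching an 80 percent threshold could be computed. The scoring rule (shared calls divided by all calls made by either the lab or the reference), the function names, and the data are assumptions for illustration; the article does not say how EMQN defined concordance.

```python
def concordance(lab_calls, reference_calls):
    """Fraction of variants shared by the lab and the reference set.

    Assumed definition: size of the intersection divided by size of the
    union, so both missed and extra calls lower the score.
    """
    union = lab_calls | reference_calls
    if not union:
        return 1.0  # nothing expected and nothing called
    return len(lab_calls & reference_calls) / len(union)


def share_of_labs_at_or_above(results, reference_calls, threshold=0.8):
    """Fraction of labs whose concordance meets the threshold."""
    scores = [concordance(calls, reference_calls) for calls in results.values()]
    return sum(score >= threshold for score in scores) / len(scores)


# Made-up variant identifiers for one gene, for illustration only.
reference = {"v1", "v2", "v3", "v4", "v5"}
results = {
    "labA": {"v1", "v2", "v3", "v4", "v5"},  # all five reference variants
    "labB": {"v1", "v2", "v3", "v4"},        # misses one variant
    "labC": {"v1", "v2", "v6"},              # misses three, adds one
}
share = share_of_labs_at_or_above(results, reference)
```

In this toy example, two of the three labs reach the 80 percent mark. A real assessment would also have to restrict the comparison to the regions each lab actually sequenced.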
Results looked less rosy for participants in the somatic scheme. Each of the six variants was detected by only 50 to 58 percent of labs, counting only those labs that actually tested for KRAS and NRAS mutations. According to EMQN's report, the reason might have been the low allele frequency of 5 percent, which might have been below the labs' detection cutoff.
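The effect of a reporting cutoff on a variant present at roughly 5 percent can be sketched in a few lines of Python. The read counts, the cutoff values, and the function names below are hypothetical and chosen only to show how sampling variation can push an observed allele frequency just under a lab's threshold.

```python
def variant_allele_frequency(alt_reads, total_reads):
    """Observed allele frequency: reads supporting the variant / all reads."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return alt_reads / total_reads


def is_reported(alt_reads, total_reads, min_vaf):
    """True if the observed allele frequency reaches the lab's cutoff."""
    return variant_allele_frequency(alt_reads, total_reads) >= min_vaf


# Hypothetical site covered by 1,000 reads; a variant expected at about
# 5 percent is observed in 48 reads in one run and 60 reads in another.
low_run = variant_allele_frequency(48, 1000)
missed_at_5_percent = not is_reported(48, 1000, min_vaf=0.05)
found_at_2_percent = is_reported(48, 1000, min_vaf=0.02)
found_in_other_run = is_reported(60, 1000, min_vaf=0.05)
```

With a 5 percent cutoff, the first run would not report the variant even though it is present, while a lab with a lower cutoff, or a run that happened to sample more variant reads, would report it.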
EMQN was also able to benchmark laboratories that used the same sequencing platform and the same capture kit, and rank their performance in terms of various quality metrics. Doing that, they found "quite a lot of variation" between labs, Patton said, although the number of labs included in each group was quite small.
One challenge for the 2016 scheme, which also applies to other NGS proficiency testing studies, is that data formats are still not standardized, Patton said. For example, VCF (Variant Call Format) provides a standard for how variants are reported, yet there is still too much flexibility because the same information can be encoded in several different ways. "Some people add commas where others add a full stop," Patton said. "All of that seems trivial, but it actually makes comparing data very difficult."
For the 2016 EQA NGS scheme, EMQN and Euformatics therefore triaged the data submitted by labs upfront, running a number of checks before they accepted it. If the data did not meet their standards, labs were asked to reformat it and provide it in the correct file type, which quite a few labs needed to do. "What I would like to see is more harmonization of the data standards so you can just take a VCF file and it will work in any platform," Patton said.
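The kind of harmonization Patton describes can be illustrated with a short Python sketch that maps differently written records onto one comparable form. The specific inconsistencies handled here (a "chr" prefix on chromosome names, lowercase alleles, and a decimal comma in a reported frequency) are assumed examples, and the function names are hypothetical; the article does not list the checks EMQN and Euformatics actually ran.

```python
def normalize_variant(chrom, pos, ref, alt):
    """Map one variant record to a canonical (chrom, pos, ref, alt) tuple."""
    chrom = chrom.strip()
    # Treat "chr12" and "12" as the same chromosome name.
    if chrom.lower().startswith("chr"):
        chrom = chrom[3:]
    return (chrom.upper(), int(pos), ref.strip().upper(), alt.strip().upper())


def parse_fraction(text):
    """Read a frequency written with a decimal point or a decimal comma."""
    return float(text.strip().replace(",", "."))


# The same made-up variant, written in two different styles.
lab_one = normalize_variant("chr12", "100", "c", "t")
lab_two = normalize_variant("12", 100, "C", "T")
same_variant = lab_one == lab_two
same_frequency = parse_fraction("0,05") == parse_fraction("0.05")
```

After normalization the two records compare as equal. Note that inside a VCF file a comma separates multiple values, so treating it as a decimal separator would only be safe for free-text report fields, which is one reason such differences make automated comparison difficult.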
Meanwhile, EMQN and its partners are developing and running new external quality assessment schemes for techniques that are gaining importance in diagnostics, including cancer liquid biopsy testing and noninvasive prenatal testing.
A pilot EQA scheme for NIPT, for example, in which 35 labs participated, is just wrapping up, and the results will be presented at the International Society for Prenatal Diagnosis annual meeting in San Diego in July. Results from a liquid biopsy cancer testing scheme that EMQN and its partners conducted in collaboration with the International Quality Network for Pathology (IQN Path), in which 36 labs participated, are currently being analyzed and will be presented at an IQN Path workshop in Florence in June. Both schemes attracted a lot of interest from users, Patton said.