Head-to-Head Comparisons of Commercial SARS-CoV-2 MDx Tests Point to Few Differences in Accuracy


NEW YORK – In more than a dozen recently published comparisons of commercial molecular diagnostic SARS-CoV-2 tests, researchers have generally found them to have similar performance. Since many tests are proving to be highly accurate, factors such as location of testing, time to results, and workflow may ultimately be more pivotal for labs needing to bring on testing. 

Conducting head-to-head evaluations of technologies is one of the best ways to know how well tests work. In a crisis, though, these evaluations must sometimes take a back seat to patient care. Almost four months into the COVID-19 outbreak, test comparisons are now popping up at a rapid pace.

Alex Greninger, a clinical virologist at the University of Washington, for example, co-authored a comparison of SARS-CoV-2 assays from Cepheid, DiaSorin, Hologic, and Roche.

Published last month in The Journal of Clinical Microbiology, the evaluation was part of the lab's own assessments of the tests on systems it already had in house.

Although the tests with emergency authorization have instructions for use that contain limits of detection, Greninger said direct comparisons are critical. Viral loads vary as much as a billion-fold between patient samples, for example, he said in an email, adding, "It's hard to compare tests on paper because there's no good standards for SARS-CoV-2."

Greninger and his colleagues evaluated 169 nasopharyngeal swab samples from patients. All the assays — Cepheid Xpert Xpress, Hologic Panther Fusion, DiaSorin Simplexa, and Roche Cobas 6800 — were 100 percent specific using the UW team's own US Centers for Disease Control and Prevention-based lab developed test as the gold standard. In terms of analytical performance, the lab's LDT and the Cepheid test were the most sensitive, with the Hologic, DiaSorin, and Roche tests only failing to detect positive specimens near the limit of detection of the lab's own assay.

Now, the UW lab is using all of the tests it evaluated, he said, and "other than the LDT, all the platforms are sample-to-answer and relatively quick."

Separately, individual evaluations of the DiaSorin and Hologic technologies have also been published recently, showing high sensitivity and specificity compared to lab-developed testing. Specifically, the DiaSorin test showed almost perfect agreement with a modified version of a World Health Organization protocol, while the Hologic test performed similarly to Stanford's EUA LDT. In the latter study, the authors concluded that considerations such as reagent availability, turnaround time, labor requirements, cost, and instrument throughput should be used to guide assay choice.

Similar to the UW lab, Daniel Rhoads, director of microbiology at University Hospitals in Cleveland, and his team evaluated EUA assays on testing platforms the health system already had in use.

"Like many large healthcare systems, our laboratories are using many different RNA assays —currently five different platforms," Rhoads said in an email. "This approach is largely due to supply chain challenges, which have prevented us from consistently obtaining adequate reagents to rely on fewer platforms," he added.

In a study published last month in JCM, the Cleveland team compared Abbott ID Now, DiaSorin Simplexa, and the CDC's EUA assay. Obtaining reagents and making time to perform the evaluation were the most challenging parts, Rhoads said, noting that it was still critical to do the comparisons.

Rhoads, who was also a co-author of a new guidance paper from the American Society for Microbiology on SARS-CoV-2 assay validation, noted that the amount of virus detected in primary specimens can span up to eight log units. For example, the DiaSorin test yielded cycle threshold (Ct) values between 9 and 35, while the CDC assay had Ct values between 11 and 40.
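
As a rough check on how those Ct ranges relate to log units, the sketch below converts a Ct spread into an approximate fold difference in starting template. It assumes each amplification cycle exactly doubles the target, an idealization that is not stated in the studies, and the function name is purely illustrative. Ct values from different assays are not directly interchangeable, so each range is treated on its own.

```python
import math


def ct_span_to_log10(ct_low: float, ct_high: float, efficiency: float = 1.0) -> float:
    """Approximate log10 fold difference in starting template across a Ct range.

    Assumes each cycle multiplies the target by (1 + efficiency).
    efficiency=1.0 means perfect doubling, which real assays only approximate.
    """
    if ct_high < ct_low:
        raise ValueError("ct_high must be >= ct_low")
    cycles = ct_high - ct_low
    return cycles * math.log10(1.0 + efficiency)


# Ct ranges quoted in the article
diasorin_span = ct_span_to_log10(9, 35)   # 26 cycles
cdc_span = ct_span_to_log10(11, 40)       # 29 cycles

print(f"DiaSorin: about {diasorin_span:.1f} log units")  # about 7.8
print(f"CDC:      about {cdc_span:.1f} log units")       # about 8.7
```

Under the perfect-doubling assumption, both ranges land close to the eight-log spread Rhoads described.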

In addition to reagent supply chain challenges, specimen collection devices, transport media, and personal protective equipment are also constrained, Rhoads said.

"Because of the scarcity of these supplies, labs, hospitals, and companies are commonly using approaches that are typically uncommon [such as] testing saliva for viral RNA, using self-collected specimens, and transporting swabs in saline," Rhoads said. "As a diagnostic laboratory community we are resourceful, but limitations of available physical resources continue to stress the system." 

Other groups have also recently published their comparisons, carried out under similar duress.

A team based at Northwell Health in New York recently published a pair of evaluations, for example. In the first, published last month in JCM, the lab compared a modified version of the CDC assay to tests from DiaSorin, GenMark, and Hologic.

Using 104 nasopharyngeal swab specimens from patients, the team found the modified CDC, DiaSorin, and Hologic tests had perfect positive percent agreement, while the GenMark test had a PPA of 96 percent. The negative percent agreement was 100 percent for the GenMark and DiaSorin tests, 98 percent for the CDC-based test, and 96 percent for the Hologic test, with retesting of discordant samples improving these values.
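
For readers less familiar with the metrics, positive and negative percent agreement are computed from a two-by-two table of results against the comparator method. The sketch below shows the arithmetic. The function name and the counts are hypothetical, chosen only to illustrate the calculation, and are not taken from the Northwell study.

```python
def percent_agreement(both_pos: int, ref_pos_only: int,
                      test_pos_only: int, both_neg: int) -> tuple[float, float]:
    """Return (PPA, NPA) in percent for a test versus a comparator method.

    both_pos      : positive by both the test and the comparator
    ref_pos_only  : comparator positive, test negative
    test_pos_only : test positive, comparator negative
    both_neg      : negative by both
    """
    ref_positives = both_pos + ref_pos_only
    ref_negatives = both_neg + test_pos_only
    if ref_positives == 0 or ref_negatives == 0:
        raise ValueError("need at least one comparator positive and one negative")
    ppa = 100.0 * both_pos / ref_positives
    npa = 100.0 * both_neg / ref_negatives
    return ppa, npa


# Hypothetical counts for illustration only (not study data)
ppa, npa = percent_agreement(both_pos=48, ref_pos_only=2,
                             test_pos_only=1, both_neg=49)
print(f"PPA = {ppa:.1f}%, NPA = {npa:.1f}%")  # PPA = 96.0%, NPA = 98.0%
```

Because both figures are measured against a comparator rather than a true gold standard, they are reported as agreement rather than sensitivity and specificity.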

The group also did a workflow analysis and concluded that the Hologic test had approximately 2 hours of hands-on time, while the DiaSorin and GenMark tests were in the 15-minute range. The Hologic test also had the longest run time of the four, at approximately 4.5 hours.

The study concluded that the DiaSorin and Hologic assays outperformed both the modified CDC and GenMark assays when it came to overall LoD and clinical correlation, but that the Hologic platform may be more appropriate for higher-volume testing.

A second study by the Northwell team, also published last month in JCM, compared decentralized tests from GenMark, Abbott, and Cepheid to the Roche Cobas system.   

In analytical validation studies of 108 patient samples, the lab found the limit of detection for the Xpert Xpress to be 100 copies per ml, while LoDs for the ePlex and ID Now were 1,000 and 20,000 copies per ml, respectively. All three assays showed 100 percent negative percent agreement. The lab also performed a workflow analysis, revealing that the ID Now had the fastest time-to-result per specimen, at about 17 minutes, compared to approximately 46 minutes for the Xpert Xpress and 90 minutes for the ePlex.

However, the tests were run using samples in viral transport media, which has since been removed from the protocol for the Abbott ID Now test, as previously reported.

A team in Canada also recently compared a test from Cepheid with two tests from Roche. In a study published last month in the Journal of Clinical Virology, they found high concordance among nasopharyngeal samples with high viral loads between the Cepheid Xpert Xpress, Roche TIB Molbiol LightMix, and Roche Cobas 6800 tests. However, in samples with high cycle thresholds, which correlate with low viral loads, the Cepheid test was more discordant. The authors noted that this could be improved with analysis of endpoint values, essentially involving assessment of the total amount of fluorescence detected at the end of the final nucleic acid amplification cycle in the Cepheid test.

In JCV last week, a team at Columbia University's Irving Medical Center compared Abbott ID Now, Cepheid Xpert Xpress, and Roche Cobas tests. In 113 remnant patient samples, the percent agreement with the Roche test was 74 percent and 99 percent for the Abbott and Cepheid tests, respectively. Specifically, with medium and high viral loads, both tests showed 100 percent positive agreement, but for samples with cycle thresholds above 30, which indicate low viral loads, the ID Now had positive agreement of 34 percent, while the PPA for the Cepheid test was 97 percent. However, these assessments were also done using samples collected in transport media, which is no longer indicated for the ID Now test.

Other studies have also evaluated the Cepheid test. For example, a multi-center study in the Netherlands published in JCV this month showed it had complete agreement with an in-house LDT, while a multi-center international study published in JCM showed 99 percent positive agreement compared to standard-of-care tests on 483 remnant upper and lower respiratory specimens.

On the other hand, the Abbott rapid test has had more variable results. In a study in JCM comparing nasopharyngeal swabs collected in transport media tested on the Abbott RealTime SARS-CoV-2 assay to nasal swabs tested on the ID Now, researchers found an overall agreement between the tests of 75 percent, and suggested negative results on the ID Now were likely related to a higher limit of detection and sampling errors.

A preprint on medRxiv compared the two Abbott tests with a modified CDC assay. It found that although the ID Now was faster, using dry nasal swab samples it missed 13 percent of patients whose nasopharyngeal samples were positive with the m2000, and a medical record review deemed the discrepant results to be true positives.

Preliminary reports from other groups have also found the ID Now to be less accurate than expected, although the issues with sample collection may account for the discrepancy. Abbott has reportedly agreed with the FDA to conduct additional studies for its ID Now test using more patient samples.

The Roche tests, meanwhile, were among the first to launch and have been treated as reference tests in many studies, according to an expert at the Foundation for Innovative New Diagnostics.

In one evaluation, authors at the University of Ljubljana in Slovenia described conducting validation studies and switching from an LDT to the Roche test within 48 hours after the tests showed 98 percent agreement in nasopharyngeal samples from 502 patients.

A comparison in JCM of the Roche Cobas 6800 to the Cepheid Xpert Xpress by a team at the University of Chicago reported 99 percent agreement between the tests.

Also, a comparison of the Roche Cobas 6800 assay and the Hologic Panther Fusion by researchers at Weill Cornell Medical Center published earlier this month in Virology showed the two systems had comparable clinical performance. 

A comparison of the Roche Cobas 6800 assay to a SARS-CoV-2 LDT for the NeuMoDx 96 system showed comparable analytical and clinical performance. A team of authors based in Germany noted in a JCV study last month that the NeuMoDx 80-minute turnaround time and random-access capabilities made the system suited for automating medium-throughput SARS-CoV-2 testing, or as a supplemental instrument to run stat testing in labs that use high-throughput systems.

System-agnostic kit-based testing solutions are also being evaluated. These have the advantage of enabling labs to run COVID testing using basic instruments that most labs typically have, and a few evaluations of test kits have been published recently.

In one, published earlier this month in JCV, researchers in the Netherlands compared seven kits — from Altona Diagnostics, BGI, CerTest Biotec, KH Medical, PrimerDesign, R-Biopharm AG, and Seegene.

They found some variation in the limits of detection and detection rates for clinical samples, but concluded that all the kits could be useful for routine diagnosis of COVID-19 by experienced labs.

The Altona test was also part of another JCV study, in which researchers at Johns Hopkins University's molecular virology lab compared the Altona RealStar test to an assay from GenMark and a CDC-based LDT, showing comparable analytical performance for the three tests.

Researchers in Canada have also surveyed testing done by public health labs in different provinces in a study published last week in JCV.

Labs were using a slew of kits and test systems for SARS-CoV-2 molecular detection, including products from Thermo Fisher Scientific, Roche, Altona, New England Biolabs, Seegene, R-Biopharm, Quidel, SolGent, and PerkinElmer. Overall, the analytical sensitivities were equivalent between LDTs and most commercially available methods, the researchers said, although there was some variability in the limits of detection.

Finally, one study evaluated a panel-based test. A group in France compared the Qiagen QIAStat panel to the WHO test and found 97 percent positive agreement, with 100 percent sensitivity and 93 percent specificity on 69 primary clinical samples.

The Clinical and Public Health Microbiology Committee of the American Society for Microbiology recently provided some guidance for labs validating commercial assays. These experts noted that, outside of highly specialized academic and commercial labs, clinical microbiology labs tend to be unfamiliar with EUA classification, so assay verification can be daunting.

"Further compounding anxiety for laboratories are major issues with supply chain that are dramatically affecting the availability of test reagents and requiring laboratories to implement multiple commercial EUA tests," the authors wrote.

The guidance highlighted that there can be confusion about how to verify commercial EUA tests, since these are neither fully FDA-cleared nor laboratory-developed tests. Verification should include a number of steps, the authors said, including an assessment of accuracy and precision, with particular attention to the setting of use that the FDA authorizes for a test in its authorization letter and accompanying paperwork.

Should the overall comparable accuracy of molecular SARS-CoV-2 testing hold up in future studies — and testing continue to be necessary for the foreseeable future — labs will likely be able to decide which test to use based on other factors.