UCSF Group Attributes Discovery of Novel Human Viruses to Contaminated Sample Prep Products


A recently published research paper has shined a spotlight on the potential for manufactured sample prep products to harbor contaminating DNA sequences that can confound the results of highly sensitive molecular detection techniques such as next-generation sequencing, especially when used for new pathogen discovery and detection.

A team led by researchers from the University of California, San Francisco, used unbiased, ultra-deep sequencing of multiple clinical sample types to discover a novel, hybrid DNA virus whose sequence was nearly identical to that of a new virus discovered by a separate research group earlier this year in seronegative hepatitis patients.

However, upon further sleuthing, the UCSF team determined that its virus — and likely the one discovered by the previous research group — was not a human infectious agent, but instead the result of DNA contamination in silica spin column products manufactured by Qiagen and used by both groups for nucleic acid purification.

"Especially for those working in the pathogen discovery field, this is a wake-up call, and is something that needs to be taken into account," Charles Chiu, director of the UCSF-Abbott Viral Diagnostics and Discovery Center, and principal investigator on the new study, told PCR Insider this week. "Any laboratory that does unbiased deep sequencing, at least for pathogen detection and discovery, has to be aware of the potential for contamination."

According to a company spokesperson, Qiagen "is committed to the highest standards of quality and takes the issue very seriously," and as such has begun a "comprehensive evaluation process" to attempt to reproduce the findings of both groups.

The company has also begun offering users of the spin column in question free replacement nucleic acid purification kits that it believes are more conducive to deep sequencing applications, and the spokesperson noted that the findings are "not linked to Qiagen platforms used for clinical diagnostics and do not have an impact on reporting clinical results."

In their paper, published in September in the Journal of Virology, Chiu and colleagues described how they used Illumina HiSeq sequencing to discover and de novo assemble a novel, highly divergent DNA virus at the interface between the parvovirus and circovirus families.

The putatively new virus, which they dubbed parvovirus-like hybrid virus, or PHV, was of great interest to the researchers because both parvoviruses and circoviruses are known to broadly infect insects, vertebrate animals, and humans, and certain specific viruses have been linked to hepatitis.

Furthermore, in May, scientists from the National Institutes of Health and the Institute of Infectious Disease at the Third Military Medical University of China published a paper in PNAS describing how they used unbiased deep sequencing from Solexa (now part of Illumina) to discover a virus that they called NIH-CQV in the blood samples of 92 people from China who had hepatitis that was not caused by any of the five known hepatitis viruses.

Chiu and colleagues determined that the sequence of their virus was almost identical to that of the Chinese/NIH team's virus — a finding that, coupled with some other oddities, made them question their original results.

After Chiu and colleagues sequenced PHV, they noticed that "we were getting some results that were suspicious for potential contamination," said Chiu, who is also assistant professor of laboratory medicine and medicine/infectious diseases and associate director of the Clinical Microbiology Laboratory at UCSF's School of Medicine.

"For instance, we were identifying sequences from the virus in every sample that we looked at, [and] we had independently identified the virus from two different sample types — in one laboratory … in serum samples from a patient with seronegative hepatitis, and in another laboratory … in a patient from Nigeria with diarrhea," Chiu said.

To actually prove that contamination was the cause, though, the group had to systematically trace every step of its experiments until it reached the Qiagen spin columns, specifically those in the QIAamp Viral RNA Mini and QIAamp UltraSens Virus kits, which are used in many molecular virology studies. Interestingly, the NIH/Chinese researchers did not use these exact kits; they used the QIAamp MinElute Virus kit, which contains the same spin column technology.

"The experiment that proved it for us was really when we directly eluted water without even going through the extraction procedure to convert sample to DNA — just passed water directly through the Qiagen column — and could actually recover sequences from the virus," Chiu said. "In fact, we could recover the whole genome of the virus."
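The two warning signs Chiu describes, a sequence that turns up in every sample and one that can be recovered from a water-only control, can be expressed as a simple screening rule. The following is a minimal sketch of that logic, not the UCSF group's actual pipeline; the function name, data layout, and the taxon labels other than PHV are hypothetical and chosen only for illustration.

```python
from typing import Dict, Set


def flag_suspect_taxa(
    sample_hits: Dict[str, Set[str]],
    blank_hits: Set[str],
) -> Dict[str, str]:
    """Return taxa that look like contamination, with the reason for each flag.

    sample_hits maps a sample ID to the set of taxa detected in that sample.
    blank_hits is the set of taxa detected in a water-only (no-sample) control.
    """
    flags: Dict[str, str] = {}
    if not sample_hits:
        return flags

    # Taxa found in every sample, regardless of sample type or patient.
    in_all_samples = set.intersection(*sample_hits.values())

    all_taxa = set().union(*sample_hits.values())
    for taxon in sorted(all_taxa):
        if taxon in blank_hits:
            # Strongest signal: the sequence is recovered with no sample at all.
            flags[taxon] = "present in water-only control"
        elif len(sample_hits) > 1 and taxon in in_all_samples:
            # Weaker signal: the sequence shows up in every sample examined.
            flags[taxon] = "present in every sample"
    return flags


# Hypothetical detections; "virus_A" and "virus_B" are placeholder labels.
hits = {
    "serum_hepatitis": {"PHV", "virus_A"},
    "stool_diarrhea": {"PHV", "virus_B"},
}
print(flag_suspect_taxa(hits, blank_hits={"PHV"}))
```

A flag from a rule like this is only a prompt for follow-up; as the article describes, confirming the source still required tracing each experimental step back to the columns.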

A path to contamination

The UCSF team even mapped out a potential hypothesis for how the contaminating DNA sequences got into the spin columns. Specifically, they believe that the silica used in the spin columns is sourced from oceanic diatoms — a type of algae — which originally harbored the viral sequences.

"My gut feeling is that it is the source," Chiu said. "I contacted both of the original metagenomic researchers who had deposited sequences from this virus into environmental databases. [They] made clear to me that they do not use these column-based kits — they use an entirely independent extraction method, yet they were still able to obtain these sequences."

Furthermore, Chiu and colleagues then did a "metagenomic survey" of publicly available environmental databases and were able to find their specific sequences "remarkably, only from waters off the Pacific Ocean," Chiu said.

"Given the fact that the spin columns contain silica components ultimately derived from these diatoms, it's certainly a reasonable hypothesis that this virus comes from the ocean," he added. "We haven't formally proven it, but we are currently looking at some oceanic samples that they have kept from these environmental studies to see if we can recover sequences from this virus. That would help to establish its origin."

PCR Insider contacted both Qiagen and the co-corresponding authors from the Chinese/NIH team about the findings. A Qiagen spokesperson emailed a statement regarding the company's position. The co-authors from the PNAS paper did not respond in time for this publication.

Qiagen has "immediately started a comprehensive evaluation process, including the production of the membrane by external vendors, and will reproduce the findings of the study in-house," the spokesperson said. "To this end, Qiagen has reached out to the authors of the two studies … for additional information about the experimental details such as primers/probe sequences and PCR conditions."

The spokesperson also noted that the firm has already received this information from Chiu's group, but has not yet received it from the Chinese/NIH team. He also noted that some of the findings of the PNAS paper do not jibe with the idea of contaminated nucleic acid purification products. For instance, he said, hepatitis patients in whom the virus was putatively identified showed a prominent antibody immune response while the control subjects did not. Further, the mean viral load of infected patients was much higher than one would expect from amplifying contaminating DNA.

"Qiagen will continue to further investigate the potential contamination of our products and the production chain and give an update on the results of our investigation on our website as soon as possible," the spokesperson said.

Chiu confirmed with PCR Insider that his group has been working with Qiagen to replicate the experiment, and that Qiagen has introduced it to another nucleic acid purification kit — the QIAamp UCP Pathogen kit — that is more conducive to deep sequencing experiments.

The products used in the UCSF and Chinese/NIH studies "are designed for use in PCR and other applications detecting specific targets but are not designed to be DNA-free," the Qiagen spokesperson said. "In standard PCR applications the described contamination will not impair the results. For deep sequencing applications, however … a product from the QIAamp UCP product line would be the appropriate sample preparation kit. It has a 32 times higher purity level compared to standard methods" — a claim that has been confirmed in peer-reviewed publications.

Chiu said that early results from his lab show that the QIAamp UCP kit does not harbor detectable virus sequences, but that the investigation is ongoing.

Burden of proof

Chiu also echoed the Qiagen spokesperson's statements about the contaminated purification kits being adequate for most molecular detection methods besides unbiased deep sequencing. For instance, he noted, although real-time PCR is a highly sensitive detection method, it is also designed to target a very specific sequence, and thus likely would not be affected by contaminating DNA of a completely different sequence.
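The distinction can be illustrated with a toy model. The sketch below assumes a deliberately simplified targeted assay that reports a template only if it contains both primer-binding sites in order, using exact substring matching and ignoring strand orientation, mismatches, and probe chemistry. All sequences and names are made up for illustration.

```python
from typing import Dict, List


def targeted_assay(templates: Dict[str, str], fwd_site: str, rev_site: str) -> List[str]:
    """Names of templates carrying both primer sites, forward site first."""
    detected = []
    for name, seq in templates.items():
        start = seq.find(fwd_site)
        # The second site must lie downstream of the first for a product to form.
        if start != -1 and seq.find(rev_site, start + len(fwd_site)) != -1:
            detected.append(name)
    return detected


def unbiased_sequencing(templates: Dict[str, str]) -> List[str]:
    """Every template present is reported, whatever its sequence."""
    return list(templates)


# Made-up sequences for illustration only.
extract = {
    "target_virus": "ACGTTGCAAGGCTTACCGGATTCA",
    "column_contaminant": "TTGACCGTAGGCATCGATCCGTAA",
}
print(targeted_assay(extract, "TGCAAG", "CCGGAT"))  # ['target_virus']
print(unbiased_sequencing(extract))  # ['target_virus', 'column_contaminant']
```

In this model the contaminant is invisible to the targeted assay because it lacks the primer sites, while the unbiased approach reports it alongside the real target.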

"The difference with … unbiased deep sequencing is that … we not only have to worry about potentially amplified product in the laboratory from the positive controls, but because we amplify everything, we have to be aware of the potential of contamination coming from the environment, laboratory reagents, and even sample to sample," Chiu said. "With unbiased deep sequencing, I think contamination is an even greater concern, given that we're not targeting any one pathogen or small number of pathogens. We're really targeting the full spectrum of pathogens at once."

Chiu also said that "now that we have the technology, with the capacity to interrogate clinical samples by next-generation sequencing within 24 to 48 hours, it completely behooves us to rule out contamination before reporting any novel, potentially infectious agent."

Some of this onus does fall on manufacturers, he said, and the problem does extend beyond spin columns for nucleic acid purification. For instance, reaction tubes are not guaranteed to be DNA-free, just DNase- and RNase-free. And contamination can even lurk in reagents or water.

"This is why, in my laboratory, we follow strict rules to prevent contamination, such as unidirectional work flow and strict contamination-free hoods," Chiu said. "But even with these rules, and the use of ultra-clean reagents, we still often get contamination. For example, for a long time we were getting Burkholderia sequences in our next-generation sequencing data sets. It turns out that this is a medically relevant bacterial pathogen. When we tracked the origin of these sequences, we actually found that they were contaminating our samples from the air, through a very tiny leak in one of our HEPA filter laminar flow hoods that was drawing air from the external environment."

The potential for contamination is especially important to consider in studies trying to identify or detect new or potentially pathogenic organisms because of the possible public health implications. The work of Chiu et al. calls to mind a controversy that began in 2006 when scientists first discovered a new gammaretrovirus called xenotropic MLV-related virus, or XMRV, and reported the presence of XMRV sequences in samples from patients with chronic fatigue syndrome or prostate cancer.

A few years later, two separate papers (one published in Science in 2009 and another in PNAS in 2010, both since retracted) fanned the flames of the controversy by reporting the use of molecular detection techniques such as qPCR to detect either XMRV or polytropic murine leukemia virus-related gene sequences in the blood of patients who had been diagnosed with CFS.

Subsequently, however, several other research groups published studies that used molecular and other testing techniques to debunk an association between CFS and the viruses, and to suggest that either laboratory contamination, widespread contamination of commercial qPCR and sample prep reagents with MLV-related sequences, or both likely explained away the erroneous link (PCR Insider 6/9/2011 and 9/20/2012). And another study published in late 2012 did indeed provide evidence that commercial PCR reagents and even human DNA preparations from several life science vendors contained MLV or mouse DNA, suggesting that the extent of the contamination was much larger than previously thought (PCR Insider 1/5/2012).

Despite the fact that the XMRV-CFS link was eventually debunked, damage had been done. For instance, in 2010 nonprofit blood banking association AABB put out recommendations discouraging CFS patients from donating blood for fear that they may spread the purported infectious agent.

"We are fortunate to have technologies that have a really broad detection spectrum, but unfortunately also have high sensitivity for detecting contaminants," Chiu said. "This can confound the way we interpret the results, and in some cases, with XMRV for instance, it had a lasting impact of wasted time, effort, and resources on studying a virus that turns out not to be a human pathogen. We're happy that we were able to trace this contamination rather rapidly. From here on out, the burden will be on researchers doing pathogen discovery to prove that their discovery is not actually tainted by contamination."