NEW YORK (GenomeWeb) – Researchers from Washington University in St. Louis have developed a capture method for isolating and sequencing viral genomes from mixed samples, and demonstrated that it could capture full viral genomes present at levels too low to be identified by metagenomic sequencing.
The technique, described recently in Genome Research, is similar to another method for isolating and sequencing viral genomes from samples that was published the same week by a separate group from Columbia University.
Kristine Wylie, co-lead author of the WUSTL study and assistant professor of pediatrics at the university, told GenomeWeb that the work grew out of previous projects in which the group characterized viruses from metagenomic data, including the Human Microbiome Project and a study of unexplained fever in children.
"In both cases, we saw that [metagenomic] sequencing was a really useful tool to look at the virome in an unbiased way," Wylie said, "but we saw that it just wasn't as sensitive as we wanted it to be and we weren't getting the genome coverage we wanted for analysis."
As the Columbia University team also noted, Wylie said that viral genomes typically make up a tiny fraction of the genomic material in a metagenomic sample, making it difficult to obtain comprehensive information about them.
As a result, the WUSTL team decided to turn to capture methods to enrich for viral genomes. Specifically, they designed an assay to capture all viruses that were known to infect vertebrates. To do this, they used NimbleGen's services to create around 2 million custom-designed oligonucleotide probes that span around 200 Mb of sequence. "We designed the targets we wanted for the virome panel, and then [NimbleGen] synthesized the oligos," co-lead author Todd Wylie, who is also the director of microbial genomics computing at WUSTL, told GenomeWeb.
The researchers' assay, which they dubbed ViroCap, includes targets from 34 viral families comprising 190 annotated viral genera and 337 species. All DNA and RNA viruses from vertebrate hosts, with the exception of human endogenous retroviruses, are included.
The team compared the assay to metagenomic sequencing in two sets of human samples. The first set consisted of clinical samples that had previously been tested and found positive for viruses. Metagenomic sequencing identified 10 viruses in the 14 samples. The ViroCap panel identified those same 10 viruses plus an additional four viruses and "resulted in dramatic improvements in all sequence coverage metrics," the authors wrote. The median viral genome coverage increased to 83 percent with the panel, from 2 percent with metagenomic sequencing.
The second set of samples included eight patient samples from a research study of children with unexplained fever. The children had been previously tested with a number of different PCR assays and found to be positive for one or more viruses.
Metagenomic sequencing identified 11 viruses, while the ViroCap assay detected those 11 plus an additional seven viruses. Similar to the first experiment, the ViroCap assay yielded improvements in all metrics, including viral genome coverage, which increased to a median of 76 percent from 2 percent.
The captured genomes from both experiments included both DNA and RNA viruses ranging in size from 5 kb to 161 kb. Eight viruses were covered completely, while 12 had more than 90 percent coverage.
Next, the researchers tested whether the assay could detect viruses even if they diverged significantly from the reference genome. To do this, they tested ViroCap on anelloviruses, which are single-stranded DNA viruses that have a common genome structure but can have between 30 percent and 50 percent sequence diversity among separate species. They ran the assay on previously characterized samples that were positive for anellovirus.
The capture sequencing protocol was able to generate anellovirus contigs longer than 1 kb. The contigs ranged from 58 percent to 98 percent similar to the reference genomes.
Kristine Wylie said there were two keys to detecting divergent viral genomes. The first was a function of the assay design, which accounted for all known variation. In addition, the researchers tiled capture probes across the viral genomes to include highly conserved regions, which likely also enabled detection of genomes that shared little sequence homology to known viruses.
The second key is a function of hybridization technology itself, which allows for inexact matches, she said. "If we have a probe that's 100 bp, not every nucleotide needs to match," she said. "It allows us to be a little less stringent than PCR."
Todd Wylie added that bioinformatics techniques also enabled heterogeneous viruses to be captured with just a 200-Mb panel.
The team also assessed how specific the panel was for viruses, as opposed to capturing and enriching for human or bacterial genomic information. They found that on average, about 5 percent to 6 percent of reads aligned to non-viral genomes. They attributed some of the off-target reads to the fact that the libraries had been handled by humans during the incubation, dilution, and amplification steps.
The ViroCap assay is publicly available and the researchers plan to update it periodically with new viral sequences. The team does not plan to commercialize it.
In the future, the researchers plan to use the assay to continue supporting their clinical research studies, including a project to characterize the viromes of children in both developing and developed countries to see how they compare. "We're just starting to get a view of what a normal virome looks like," Kristine Wylie said. She added that one of the group's relatively surprising findings is that asymptomatic kids are "loaded with viruses."
Another potential avenue for the research would be to design more syndrome-specific panels. "We've cast a wide net with our panel," Todd Wylie said, but because the design of the panel is open access, researchers could "take a subset of that and make a much cheaper panel" — one that focuses on respiratory viruses, for instance.