This article was originally published Oct. 10.
NEW YORK (GenomeWeb) – A team from Michigan State University and the Argonne National Laboratory has developed parallel metagenomic sequencing strategies designed to characterize wastewater-borne viruses.
As they reported in the Journal of Virological Methods earlier this month, the researchers performed Illumina sequencing on randomly amplified viral particles that had been concentrated and purified either directly from wastewater samples or from intermediate cell cultures.
The direct sequencing step aims to get a complete picture of the DNA and RNA viruses present in a sample, the study's first author Tiong Gim Aw, a fisheries and wildlife researcher with Michigan State University, explained. On the other hand, the cell culture approach is intended to identify viruses from the same sample that are prone to infecting human cells.
"We wanted to try, with the cell culture, to see if we could sequence potentially infectious viruses from wastewater," Aw told In Sequence.
"When we do direct sequencing, we get a lower percentage of human viruses," he explained. "But we obtain sequences from … viruses present in high abundance in wastewater — especially bacteriophages, which are viruses that infect bacteria."
The methods hinge not only on random amplification of viral sequences on the sample preparation side, but also on bioinformatics methods used to combine the short reads coming off the sequencing instrument into longer stretches of sequence that can be more easily analyzed.
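As a rough illustration of what combining short reads into longer stretches means, the following Python sketch merges reads that share overlapping ends. It is a toy greedy overlap approach, not the method used in the study and not how production assemblers are built; the function names, the `min_len` threshold, and the example reads are all hypothetical.

```python
def overlap(a: str, b: str, min_len: int) -> int:
    """Return the length of the longest suffix of `a` that equals a prefix
    of `b`, provided it is at least `min_len` long; otherwise return 0."""
    max_len = min(len(a), len(b))
    for n in range(max_len, min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the pair of sequences with the longest overlap
    until no pair overlaps by at least `min_len` bases."""
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while True:
        best_len, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i == j:
                    continue
                n = overlap(a, b, min_len)
                if n > best_len:
                    best_len, best_i, best_j = n, i, j
        if best_len == 0:
            return contigs
        # Join the two sequences, keeping the shared bases only once.
        merged = contigs[best_i] + contigs[best_j][best_len:]
        contigs = [c for k, c in enumerate(contigs) if k not in (best_i, best_j)]
        contigs.append(merged)


# Hypothetical reads: three overlap end to end, one is unrelated.
example_reads = ["ATGCGT", "GCGTAC", "GTACCA", "TTTTGG"]
example_contigs = greedy_assemble(example_reads)
```

In this made-up example, the three overlapping reads collapse into one longer contig while the unrelated read is left on its own. Real metagenomic data add sequencing errors, repeats, and uneven coverage, which is why dedicated assemblers are used in practice.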
For the current study, which focused on untreated wastewater samples, the team did de novo assembly with the Velvet algorithm to get longer viral sequences that could be annotated and identified through comparisons to viral sequence databases, Aw said, noting that "Velvet provided good output in terms of the number of contigs and also the number of reads that could be assembled into contigs."
He emphasized that the metagenomic sequence data alone does not indicate the infectiousness of viruses that appear to be present in the mix. For that, follow-up experiments, including cell culture, remain important.
Nevertheless, the methods provide a peek at viruses that may be present in water or wastewater samples, which should help in evaluating everything from water quality to the effectiveness of various water treatment methods.
"The amazing thing about metagenomics is that we get to look into the world of viruses and see things that we've never seen before in different environments," Aw said.
He and his co-authors noted that, in the past, microbial analyses of water and wastewater focused largely on the presence or absence of problematic bacterial species such as Escherichia coli.
In the wake of various water contamination and fecal pollution events, they explained, there has been increasing interest in getting a broad look not only at the bacteria present in such samples, but also at other microbes, including protozoa and viruses.
For the current study, Aw and his co-authors attempted to assess wastewater viromes using Illumina short reads, reasoning that the coverage depth provided by the high-throughput instrument would help in interrogating a wider collection of viral sequences.
Aw noted that some past metagenomic sequencing studies have centered on Roche 454 reads, which are longer but become pricey when used to look at environmental samples sequenced to high coverage.
"We used Illumina, which has a higher output," he said. "We hoped that by using that, we'd have enough sequencing depth to find some of the human viruses in wastewater."
The researchers used Illumina's GAIIx instrument to assess viral libraries from two wastewater samples collected at a treatment plant in East Lansing, Michigan.
Prior to sequencing, the viral particles from this wastewater were concentrated, purified, and exposed to a membrane filtration step designed to remove DNA or RNA from bacteria or other microbes.
For the direct sequencing arm of the study, the viral particles were then concentrated further, enzymatically treated to remove free nucleic acids, and split into pools aimed at extracting and randomly amplifying either DNA viruses or RNA viruses. After adding appropriate adaptors and barcodes, the pooled samples were assessed by Illumina metagenomic and metatranscriptomic sequencing, respectively.
In parallel, the team established a viral cell culture using the human epithelial lung carcinoma cell line A549. Viral particles obtained from the cell cultures showing signs of infection were similarly concentrated, prepared, and sequenced to identify DNA or RNA viruses from the samples that were capable of setting up shop in the cell line, which is believed to be susceptible to several enteric viruses.
After quality control, read trimming, and de novo assembly of the 18 million reads generated from non-cultured samples, the investigators used tBLASTx and Metagenome Analyzer (MEGAN) methods to match the sequences up with known viruses in NCBI's GenBank viral reference sequence database.
The analysis indicated that identifiable viral sequences in those samples corresponded to numerous bacteriophages, as well as viruses that can infect humans and other animals. Still, a large swath of the viral sequences did not match known viruses in GenBank.
For the sample that had been cultured, meanwhile, the team generated just over 13 million reads, most of which appeared to stem from human viruses. Perhaps unexpectedly, given the nature of the samples tested, Aw noted, these potential pathogens tended to include species known for causing respiratory rather than gastrointestinal conditions.
The sequencing approaches are expected to become increasingly informative as more viral sequences from wastewater and other environmental samples are unraveled and added to existing sequence databases. In the meantime, Aw noted that there may be benefits to coming up with analytical methods to deal with unknown viruses, such as approaches that are less focused on annotation and matching to known viral sequences.
As a follow-up to their proof-of-principle study of untreated wastewater, the researchers are using similar sequencing and cell culture approaches to assess samples collected at different stages of wastewater treatment in the hopes of understanding how effective these processes are at removing viruses, particularly those that might be infectious to humans.
Aw said that sequencing-based virome profiling strategies could likely be applied to samples collected at other environmental sites, though the approaches needed to isolate, extract, and concentrate viral particles from alternative sample types such as soil may differ.
In the case of wastewater samples, advanced concentration techniques are not usually needed, according to Aw, since viruses tend to be highly abundant. On the other hand, additional concentration steps are needed when dealing with samples that have more modest viral contents.
The researchers plan to continue refining their method. For instance, Aw said, it would be advantageous to find ways of sequencing lower and lower concentrations of genetic material from viruses to reduce reliance on amplification, which can bias the representation of viral sequences detected in a given sample.
The general methods described in the paper are expected to be compatible with any sequencing technology that provides sufficient throughput and sequencing depth, though additional research is needed to determine the depth of coverage, or number of sequencing reads, required to identify all of the viruses present in a given sample.