COLD SPRING HARBOR, NY (GenomeWeb) – Using the metagenomics analysis tool Taxonomer, researchers have been able to quickly tease out what pathogens are present in clinical and other samples, according to the University of Utah's Aurélie Kapusta.
In addition, the Utah researchers, along with Arup Laboratories and bioinformatics startup IDbyDNA, have developed a Taxonomer-based clinical sequencing test to help diagnose respiratory disease, Kapusta said at the Biology of Genomes meeting held here this week.
Researchers from the university, Arup, and IDbyDNA developed Taxonomer to quickly detect and classify pathogens from within samples, and have used the tool in both basic research and clinical settings.
In particular, they have used Taxonomer to identify the Zika virus within a patient sample from Utah as well as to confirm or rule out the presence of the Ebola virus in samples from West Africa.
"In principle, Taxonomer can identify any life form in any sample," Kapusta said.
Taxonomer is a k-mer-based tool that uses both nucleic acid- and protein-level information to group organisms into taxonomic classes. Users can upload FASTA or FASTQ files or enter an SRA run identifier on the Taxonomer site, and the reads are then compared to a database containing human, bacterial, fungal, and viral sequences to assign each one an organism of origin. And it does this quickly, Kapusta said. She noted that, in one instance, it was able to transfer and analyze 186,000 reads on four CPUs in three minutes for a study of hemorrhagic fever.
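As a rough illustration of the general k-mer matching idea, and not of Taxonomer's actual implementation, the sketch below indexes the k-mers of a few reference sequences and assigns each read to the reference that shares the most k-mers with it. The function names, the toy sequences, and the choice of k = 5 are all assumptions made for this example, and it works at the nucleotide level only.

```python
from collections import Counter, defaultdict


def kmers(seq, k):
    """Return every overlapping substring of length k in seq."""
    seq = seq.upper()
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def build_index(references, k):
    """Map each k-mer to the set of reference labels that contain it."""
    index = defaultdict(set)
    for label, seq in references.items():
        for kmer in kmers(seq, k):
            index[kmer].add(label)
    return index


def classify_read(read, index, k):
    """Assign a read to the label sharing the most k-mers with it.

    Ties go to whichever label was counted first; reads with no
    matching k-mers are reported as "unclassified".
    """
    votes = Counter()
    for kmer in kmers(read, k):
        for label in index.get(kmer, ()):
            votes[label] += 1
    if not votes:
        return "unclassified"
    return votes.most_common(1)[0][0]


# Toy, made-up reference sequences (hypothetical, not real genomes).
references = {
    "virus_A": "ACGTACGTTAGCCGATAGGCTA",
    "bacterium_B": "TTGACCGGTAAACCGGTTTACG",
}
index = build_index(references, k=5)

print(classify_read("CGTTAGCCGATA", index, k=5))
print(classify_read("GGGGGGGG", index, k=5))
```

A production classifier would use much longer k-mers, a compact on-disk index, and a taxonomy-aware rule for reads that match several organisms; this sketch only shows the lookup-and-vote step.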
She added that they have a new Taxonomer server that can reach a speed of 2 million reads per minute with 16 CPUs.
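To put the two quoted speeds on a comparable footing, the arithmetic can be worked out per CPU. This is a back-of-the-envelope sketch only: the helper name is made up for the example, and the older figure reportedly includes data transfer time, so the comparison is approximate.

```python
def reads_per_minute_per_cpu(reads, minutes, cpus):
    """Normalize a throughput figure to reads per minute per CPU."""
    return reads / minutes / cpus


# Hemorrhagic fever study: 186,000 reads on four CPUs in three minutes
# (this figure includes transfer as well as analysis).
older = reads_per_minute_per_cpu(186_000, 3, 4)

# New server: 2 million reads per minute with 16 CPUs.
newer = reads_per_minute_per_cpu(2_000_000, 1, 16)

print(f"older run:  {older:,.0f} reads/min/CPU")
print(f"new server: {newer:,.0f} reads/min/CPU")
print(f"ratio:      {newer / older:.2f}x")
```

On these numbers the new server processes roughly eight times as many reads per CPU per minute as the earlier run.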
The tool can be applied to both basic and clinical science, she noted. For instance, Taxonomer has been used to tease out cases of contamination within next-generation sequencing data as well as distinguish similar isoforms from within RNA-seq samples from cone snail venom.
At the same time, in the clinic, Taxonomer was used to determine that a Utah man who had fallen ill and died was infected with the Zika virus and that the virus was similar to strains in Mexico, which the man had recently visited, Kapusta said.
Similarly, she and her colleagues applied it to samples from patients in West Africa who exhibited symptoms of hemorrhagic fever, the cause of which was uncertain. In one patient, Kapusta reported, it confirmed the presence of the Ebola virus, while it determined that another patient was instead infected with the Lassa virus and that a third had a Chlamydophila psittaci infection, both of which can cause similar symptoms.
It likewise was able to detect the Japanese encephalitis virus within a sample from a 16-year-old boy with encephalitis for whom previous tests came back negative, she added.
Kapusta and colleagues have also developed a Taxonomer-powered tool called Explify, a commercial sequencing test, for use in the clinic to determine the causes of severe respiratory disease. It is aimed particularly at patients who have been hospitalized with disease whose cause other tests cannot identify. Pneumonia of unknown cause is a major issue, she added, and in a large portion of cases, the causative pathogen isn't found.
With Explify, which she said would be available soon, doctors could order the test and send samples for sequencing and analysis to Arup and IDbyDNA. A report would be returned to the physician within 48 hours.
Kapusta said that the overarching goal is to enable "Google-like exploration" of sequencing datasets. To do this, they are testing whether they can conduct these searches against the entirety of the RefSeq database, which includes reference sequences for some 66,000 organisms.