NEW YORK (GenomeWeb) – Bioinformatics start-up Hyrax Biosciences has launched its first product, a computational tool that provides low-cost HIV drug resistance testing to clinical researchers.
The Exatype HIV drug resistance testing platform is a web-based system that offers tools for analyzing raw HIV sequences from one or many patients and for generating clinical reports. It features algorithms that combine the actual sequence with protein coding or other functional information to identify resistance mutations and distinguish them from sequencing errors. The software runs on Amazon Web Services so it scales easily to accommodate increased demand. Each sequencing run is analyzed on its own cloud instance so there are no wait times and customers' datasets are kept separate from each other.
Hyrax opened its doors in October and is based at the South African National Bioinformatics Institute at the University of the Western Cape. Exatype began as a doctoral project at the institution in the laboratory of Hyrax CEO and Co-founder Simon Travers. The project was funded by the South African Department of Science with support from the Technology Innovation Agency and the South African Medical Research Council's Strategic Health Innovation Partnerships unit.
Travers joined the university in 2010. At the time, there was growing interest in using next-generation sequencing to study HIV, and he found himself fielding repeated requests for help from researchers working in the space, especially with data analysis. Specifically, researchers were interested in NGS as an alternative to Sanger sequencing for drug resistance testing. Part of the reason is the cost of Sanger-based testing, which can be prohibitive for researchers and clinicians in low-resource countries, which tend to have high infectious disease burdens.
For example, depending on the country, it can cost some $300 to $500 to do a single resistance test, which is not scalable for projects that involve hundreds of test samples. "NGS gives you the benefit of economies of scale because you don't need all of the sequences from a sequencing run to be from the same patient so you can pull sequences from multiple patients onto a single plate," Travers told GenomeWeb. NGS is also more sensitive than Sanger sequencing, enabling researchers to identify resistance in viral populations at prevalence levels of less than 20 percent.
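To make the sensitivity point concrete, here is a minimal Python sketch of how a variant's prevalence could be computed from the residues observed at one position across many reads. It is not Exatype's method. The function names, the 1 percent noise floor, and the read counts are assumptions chosen for illustration; only the 20 percent figure comes from the text above.

```python
from collections import Counter

SANGER_LIMIT = 0.20    # prevalence level cited in the article
MIN_PREVALENCE = 0.01  # hypothetical noise floor, for illustration only


def variant_prevalence(residues, reference):
    """Return {residue: fraction of reads} for each non-reference residue."""
    counts = Counter(residues)
    total = sum(counts.values())
    return {aa: n / total for aa, n in counts.items() if aa != reference}


def minority_variants(residues, reference):
    """Keep variants at or above the noise floor but below the 20 percent level."""
    return {
        aa: freq
        for aa, freq in variant_prevalence(residues, reference).items()
        if MIN_PREVALENCE <= freq < SANGER_LIMIT
    }


# 1,000 hypothetical reads covering one position: 950 carry K, 50 carry N.
reads = ["K"] * 950 + ["N"] * 50
print(minority_variants(reads, "K"))  # {'N': 0.05}
```

In this made-up example the N variant is present in 5 percent of reads, well under the 20 percent level mentioned above.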
In 2010, Travers said, there were few resources for analyzing HIV drug resistance data. "I put a PhD student on the project [who developed] an easy-to-use pipeline ... and it sort of snowballed from there into having a fully working pipeline." The doctoral project wrapped in 2014. Travers and colleagues have published two articles that describe some of the underlying algorithms. One of these, published in Nucleic Acids Research in 2014, describes a mapping tool that is designed to map sequence reads to short lengths of coding DNA. Specifically, it searches for open reading frames that minimize the number of stop codons and uses hidden Markov models to map each query sequence in the codon space to the coding regions in the reference. In this way, the software is able to distinguish errors associated with the sequencing process from true mutations, according to the NAR paper.
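The stop-codon criterion described above can be sketched in a few lines of Python. This is only an illustration of that one idea, not the published tool: it checks the three forward reading frames of a read, counts stop codons in each, and picks the frame with the fewest. The hidden Markov model step is not shown, and the function names and the example read are invented for this sketch.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def codons(seq, frame):
    """Split seq into complete codons starting at offset frame (0, 1 or 2)."""
    return [seq[i:i + 3] for i in range(frame, len(seq) - 2, 3)]


def count_stops(seq, frame):
    """Count stop codons when seq is read in the given forward frame."""
    return sum(codon in STOP_CODONS for codon in codons(seq.upper(), frame))


def best_frame(seq):
    """Forward frame with the fewest stop codons; ties go to the lowest offset."""
    return min(range(3), key=lambda frame: count_stops(seq, frame))


# Hypothetical read: frames 0 and 2 each contain one stop codon, frame 1 has none.
read = "GCTAACTGAC"
print([count_stops(read, f) for f in range(3)])  # [1, 0, 1]
print(best_frame(read))                          # 1
```

A real mapper would also need to consider the reverse strand and frame changes within a read, which this sketch ignores.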
In the context of HIV resistance testing, sequence mapping is, on the one hand, relatively easy because input reads are mapped to a single gene sequence that is responsible for drug resistance, Travers explained. But the difficulty with HIV, compared to other organisms, is that it's not a homogeneous population. "Every single sequence read ... is potentially unique so we need to be able to identify what is actually sequencing error and what is a true mutation in every single sequence read," he said.
This is especially important for homopolymer regions of the genome, which consist of runs of a single repeated nucleotide that can confuse some sequencing instruments. These regions matter for HIV drug resistance testing because some important resistance mutations occur in them. Since the company's algorithm maps reads in the codon space as opposed to treating each nucleotide independently, "we are quite easily able to identify sequencing errors, particularly indels, from true SNPs," Travers said. Following SNP calling, variants pass through a drug resistance testing algorithm and, lastly, the system generates a clinical report listing the drugs to which the mutations identified in the sample confer resistance.
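One way to picture why working in codon space helps is the rough heuristic sketched below in Python. It is an assumption-laden illustration, not the company's algorithm: given a read already aligned to the reference (two equal-length strings with "-" marking gaps), it labels gaps whose length is not a multiple of three as frameshifts, which in a coding region are more plausibly sequencing errors, and reports mismatched bases as candidate substitutions. The function name, labels, and sequences are all hypothetical.

```python
import re


def classify_differences(ref_aln, read_aln):
    """Label differences in a gapped pairwise alignment ('-' marks a gap).

    Returns a sorted list of (alignment column, label) tuples.
    """
    if len(ref_aln) != len(read_aln):
        raise ValueError("aligned sequences must have equal length")
    events = []
    # A gap run in the reference is an insertion in the read;
    # a gap run in the read is a deletion.
    for kind, aligned in (("insertion", ref_aln), ("deletion", read_aln)):
        for match in re.finditer(r"-+", aligned):
            length = match.end() - match.start()
            if length % 3 == 0:
                label = f"in-frame {kind}"
            else:
                label = f"frameshift {kind} (possible sequencing error)"
            events.append((match.start(), label))
    # Mismatched bases outside gaps are candidate substitutions.
    for col, (ref_base, read_base) in enumerate(zip(ref_aln, read_aln)):
        if "-" not in (ref_base, read_base) and ref_base != read_base:
            events.append((col, "substitution"))
    return sorted(events)


# Hypothetical alignment: one base missing from a run of A's, plus one mismatch.
ref = "ATGAAAAAAGGTCAT"
read = "ATGAAAAA-GGTCGT"
for column, label in classify_differences(ref, read):
    print(column, label)
```

For this example the sketch reports a frameshift deletion at column 8, inside the run of A's, and a substitution at column 13.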
After the doctoral project wrapped, the researchers spent several months rewriting the pipeline to make it more user-friendly. They also implemented it on AWS to take advantage of the low-cost compute resources that cloud infrastructure provides and to avoid the costs of purchasing and maintaining resources in-house. This way, the company has access to virtually unlimited compute, so it isn't limited in the number of analysis projects it can run at a given time. Customers also save on compute costs since they only have to pay for the resources they use in the cloud.
The size of input data files also does not matter with Exatype since the system can simply provision additional compute resources to meet the increased demand. HIV drug resistance tests focus on a single gene, but customers might analyze several samples that they've sequenced in a single run, and file sizes are typically about a gigabyte. "The largest we've analyzed is 100 patients that have been sequenced at the same time but theoretically you can go three times as much as that," Travers told GenomeWeb. Running the software on the cloud also ensures that customers have access to the most up-to-date versions of the software at all times.
Exatype accepts files from Roche 454, Thermo Fisher Ion Torrent, and Illumina sequencing platforms. "The way we set up the [application programming interface], we can actually have the [sequencing] machine push the data through to us if need be," Travers said. "We haven't done that with a machine but we've set up the infrastructure to be able to do that and we are in discussions with some companies about possibly being able to do that on their systems."
The company currently has a tiered pricing structure under which it charges a per-sample price for analysis. Travers declined to discuss specific price points because these vary from one user to the next. He did say that the company's pricing structure allows it to lower its prices for customers in resource-limited settings and charge more for clients in wealthier contexts. Hyrax is also mulling a licensing option for bulk users.
Although it is initially focused on HIV drug resistance testing, Hyrax also plans to develop solutions for TB and antibiotic resistance testing. "We really feel that being a South African company, we are able to provide this type of diagnostics to people who need it most in resource-limited settings," Travers said.