NEW YORK (GenomeWeb) – A collaborative project recently launched by researchers in Canada and the UK is taking aim at a common problem: incorporating genomics into clinical care and making test results easily accessible to, and interpretable by, primary care providers.
The so-called Sharing Mycobacterial Analytic Capacity (SMAC) project is focused on tuberculosis patients in the UK and Canada and includes researchers from the BC Centre for Disease Control (BCCDC), which is the public health arm for British Columbia's Provincial Health Services Authority; Oxford University; and Public Health England (PHE).
The project is led by Jennifer Gardy, a senior scientist in BCCDC's Communicable Disease Prevention and Control Services and an assistant professor at the University of British Columbia; and Derrick Crook, a professor at the University of Oxford and director of PHE's National Infection Service. Their teams will share knowledge, data, and materials around TB genomics to speed up the clinical validation and implementation of genomics-based TB diagnostics first in the UK and then in BC. As part of their efforts, the researchers will clinically validate a pipeline for analyzing TB genomes as well as develop a template for reporting the results of genomics testing that will help clinicians use genomics to better diagnose, treat, and track TB cases.
"Genome BC is really trying to get more and more into funding not just basic and translational research but really getting into … that point of clinical implementation," BCCDC's Gardy told GenomeWeb in an interview last week. "Recognizing that the UK was a little further ahead there, there was a really great opportunity to learn from them and to build on this existing relationship."
SMAC is one of the fruits of a memorandum of understanding signed last year by Genome British Columbia and Genomics England. The partners agreed to work on improving diagnostic capabilities and outcomes for patients with cancer, rare diseases, and infectious diseases by sharing information and jointly developing tools. The first phase of the MoU focused on building expert working groups, suggesting candidate projects, and harmonizing data-sharing tools and resources. The second phase of the project, which kicked off this year, centered on launching pilot projects in each of the three priority disease areas.
One of the diseases selected for study is tuberculosis. According to estimates from the World Health Organization, in 2011, there were approximately 8.7 million new cases of tuberculosis identified and 1.4 million tuberculosis-related deaths worldwide. More recent WHO figures are higher: an estimated 9.6 million new TB cases and 1.5 million deaths in 2014.
Gardy's research group in Canada and Crook's lab in the UK have worked on TB genomics for several years and have collaborated on projects in the past. In one study published earlier this year, they showed that genomic testing is faster and often more accurate than current techniques used for diagnosing and characterizing TB infections.
That study, published in Lancet Respiratory Medicine, compared whole-genome sequencing of TB isolates with routine laboratory diagnostic workflows in eight participating laboratories over an eight-month period. The researchers compared the two approaches in terms of diagnostic accuracy, processing times, and cost. Their findings showed that genomic testing can reduce the time to diagnose and characterize TB infections from an average of 31 days down to an average of 9 days. They also found that WGS-based diagnostics cost 7 percent less per year to run than current diagnostic workflows.
Signing the MoU extended the existing relationship between the two groups and imposed a more formal structure on the partnership, Gardy told GenomeWeb. "You want these data-sharing activities to benefit both sides and also benefit the larger community, so you have to stop and think 'What's the smartest project? What's going to have the most impact on the UK side and the Canadian side? What's that one golden story that you want to tell?'"
The partners have chosen to validate a bioinformatics pipeline put together by researchers from PHE and Genomics England for clinical use, she said. The so-called COMPASS-TB pipeline, which was used in the aforementioned Lancet Respiratory Medicine study, uses the Kraken metagenomics software from Johns Hopkins University to identify the Mycobacterium tuberculosis genome and then assembles the sequences using the H37Rv reference genome. The pipeline also includes tools for calling SNPs and for predicting resistance gene mutations, as well as a clustering algorithm that searches the newly sequenced isolate against previously sequenced ones to determine whether there is a larger outbreak at play.
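The clustering step described above can be illustrated with a small sketch. The article does not say how COMPASS-TB clusters isolates, so what follows is a generic approach rather than the pipeline's actual method: it counts SNP differences between aligned sequences and groups isolates by single-linkage clustering under a distance threshold. The function names, the toy sequences, and the threshold value are all hypothetical.

```python
from itertools import combinations
from typing import Dict, List, Set


def snp_distance(a: str, b: str) -> int:
    """Count positions where two aligned sequences differ, skipping uncalled bases ('N')."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for x, y in zip(a, b) if x != y and x != "N" and y != "N")


def cluster_isolates(isolates: Dict[str, str], threshold: int) -> List[Set[str]]:
    """Single-linkage clustering: any two isolates within `threshold` SNPs share a cluster."""
    parent = {name: name for name in isolates}

    def find(name: str) -> str:
        # Follow parent links to the cluster root, shortening the path as we go.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for a, b in combinations(isolates, 2):
        if snp_distance(isolates[a], isolates[b]) <= threshold:
            parent[find(a)] = find(b)  # merge the two clusters

    clusters: Dict[str, Set[str]] = {}
    for name in isolates:
        clusters.setdefault(find(name), set()).add(name)
    return sorted(clusters.values(), key=lambda c: sorted(c))


# Toy aligned sequences (hypothetical; real inputs would be genome-wide SNP calls).
isolates = {
    "isolate_A": "ACGTACGTAC",
    "isolate_B": "ACGTACGTAT",
    "isolate_C": "ACGAACGTAT",
    "isolate_D": "TTTTTCGGGG",
}
clusters = cluster_isolates(isolates, threshold=1)  # threshold is illustrative only
print(clusters)
```

With single linkage, isolate_A and isolate_C land in the same cluster even though they differ by two SNPs, because each is within one SNP of isolate_B. That chaining behavior suits transmission chains, where a strain accumulates mutations as it passes from case to case.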
"We've done a lot of pipeline testing and we've done a lot of work on what comes out of that pipeline and goes into the report that describes the genomic data," she told GenomeWeb. Moreover, the work to fine-tune and validate the pipeline in the UK "saves us a lot of development work as we go forward with implementing TB genomics in our own jurisdiction," she added.
Meanwhile, members of Gardy's research group are using their expertise in information visualization techniques to develop a two-page laboratory report that will be used to communicate the results of TB genomics tests to clinicians. The new report will make it easy for doctors to locate the information they need and to understand how the results have been interpreted, the partners said. Among other details, it will list the antibiotics likely to work against the infection and indicate whether the tested patient is part of a disease outbreak.
"A lot of groups have used various InfoVis techniques to redesign health reports and electronic health records in the past, but these have been almost entirely reports for things like blood tests or other tests where you have a numerical value for each test result and that value has to be interpreted along a scale of bad-normal-good," Gardy told GenomeWeb. "The data we are producing with genomic analysis of TB isolates … are all data types that nobody's yet tackled from a report design perspective. This is the first time it's been tried with this sort of data."
To develop the report template, the researchers are using design study methodology, which is "essentially a formalization of the user-centric design idea," she explained. At every stage of the design process, developers ensure that they account for the end users' needs, desires, and restrictions, she said. "This project is all about taking a complex data type and making it clinically useful for the community, so the design study approach, in which users are constantly and iteratively feeding back into the report design process, is really the best way to produce a meaningful output."
In addition to sharing data with their cross-border collaborators, the researchers will also contribute the genomes that they have sequenced to large shared public repositories like the National Center for Biotechnology Information's Sequence Read Archive (SRA). For example, the BC researchers have just sequenced 1,400 TB genomes, which they plan to eventually make available in a public database, Gardy said. In the past, they have contributed their data to the SRA or the European Bioinformatics Institute's European Nucleotide Archive and they will likely use these outlets for the new datasets as well, she told GenomeWeb.
In addition to sequence data, the researchers will share metadata around the samples via these repositories or through other sites such as Figshare, Gardy said. The metadata shared along with the samples includes the date the sample was sequenced, where it was collected, and details of the lab that performed the sequencing. The partners also plan to share the template of the report that they develop through the SMAC initiative via GitHub, so that others can use it to report their own lab results, she said. They also plan to share the bioinformatics pipeline following the clinical validation process.
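As a rough illustration of what a per-sample metadata record of this kind might look like, here is a minimal sketch. The article names only the kinds of information shared (sequencing date, collection location, sequencing lab), so the field names, the identifier, and the placeholder values below are all assumptions and do not reflect the schema the SMAC partners or any repository actually use.

```python
import json
from dataclasses import asdict, dataclass
from datetime import date


@dataclass
class IsolateMetadata:
    """Hypothetical metadata record for one sequenced TB isolate."""

    isolate_id: str           # assumed identifier; not mentioned in the article
    sequencing_date: date     # date the sample was sequenced
    collection_location: str  # where the sample was collected
    sequencing_lab: str       # lab that performed the sequencing

    def to_json(self) -> str:
        """Serialize to JSON, writing the date in ISO 8601 form."""
        record = asdict(self)
        record["sequencing_date"] = self.sequencing_date.isoformat()
        return json.dumps(record, sort_keys=True)


# Placeholder values only; not real sample data.
example = IsolateMetadata(
    isolate_id="EXAMPLE-0001",
    sequencing_date=date(2015, 1, 1),
    collection_location="placeholder location",
    sequencing_lab="placeholder lab",
)
print(example.to_json())
```

A plain, self-describing format such as JSON is one reasonable choice here because it can be deposited alongside sequence data and read without special tooling.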
"The more genomes we sequence and make available to each other and to the scientific community, [and] the more we annotate those with useful, meaningful metadata about what drugs [an] isolate is resistant to [and] the treatment outcome for each patient, the more we build an amazing genomic resource that the community can mine for future studies," Gardy said.
Such studies could focus on discovering new TB drug targets, identifying antibiotic resistance mutations, or developing new molecular diagnostics. The data could also be used to design software that, for example, predicts drug resistance. "The fact that we are getting this data out there in the open is a huge benefit not just to the parties involved with the project but to the whole community," she said.