Next-Gen Sequencing Helps Uncover Cause of TB Outbreak, Piece Together Transmission Patterns


By Monica Heger

This story was originally published Feb. 25.

Whole-genome sequencing could prove useful in a public health setting for monitoring epidemics, according to researchers from the British Columbia Center for Disease Control.

Combining whole-genome sequencing with social network analysis, the researchers determined that a tuberculosis outbreak in British Columbia several years ago was likely not triggered by genetic changes to the pathogen, but rather by increased use of crack cocaine in the community.

In a study published last week in the New England Journal of Medicine, the team sequenced the whole genomes of tuberculosis isolates from 32 patients, as well as four historical isolates from the same region that were sampled before the outbreak. They then combined the sequence information with a detailed social network analysis.

Sequencing the whole genomes of the different isolates allowed the researchers to determine that the isolates originated from two separate lineages. Patrick Tang, senior author of the paper and a medical microbiologist at the BCCDC, said that analyzing the isolates with conventional genotyping methods would not have revealed that there were in fact two separate strains.
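To illustrate why single-base resolution can separate strains that coarser typing lumps together, here is a minimal sketch in Python. It is not the method used in the study: the isolate names, the toy sequences, and the distance threshold are all invented for illustration, and the grouping is a simple single-linkage rule over pairwise SNP counts.

```python
# Toy aligned variant-site strings for hypothetical isolates.
# These sequences are invented for illustration, not data from the study.
isolates = {
    "A1": "ACGTACGTAC",
    "A2": "ACGTACGTAT",
    "B1": "TGGTTCGAAC",
    "B2": "TGGTTCGAAT",
}


def snp_distance(a, b):
    """Count positions at which two equal-length aligned sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def group_lineages(seqs, threshold):
    """Single-linkage grouping: an isolate joins (and merges) every group
    containing a member within `threshold` SNPs of it."""
    groups = []
    for name in seqs:
        linked = [
            g for g in groups
            if any(snp_distance(seqs[name], seqs[m]) <= threshold for m in g)
        ]
        merged = {name}.union(*linked)
        groups = [g for g in groups if g not in linked] + [merged]
    return groups


# The threshold of 2 SNPs is an arbitrary choice for this toy data.
lineages = group_lineages(isolates, threshold=2)
for members in lineages:
    print(sorted(members))
```

In this toy data, isolates within a group differ by one SNP while isolates in different groups differ by four or five, so the two groups separate cleanly. Real outbreak analyses typically rely on phylogenetic methods rather than a fixed cutoff.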

Using conventional methods, "we can recognize an outbreak, but that's it. We don't have the resolution to analyze individual relationships within the outbreak," Tang told In Sequence.

Tang and his team analyzed pathogen samples taken from patients during a tuberculosis outbreak that occurred in a medium-sized community in British Columbia from 2006 to 2008. During that period, 41 cases of tuberculosis were diagnosed in the community, a 10-fold increase over the normal annual incidence in that region.

The team sequenced samples isolated from 32 of those patients, generating between 9.3 million and 21.3 million reads per sample with 50-base paired-end reads on the Illumina GA. The sequencing was conducted at the Michael Smith Genome Sciences Center in Vancouver.
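For a rough sense of what those read counts mean, the short Python sketch below estimates average sequencing depth as total sequenced bases divided by genome length. The genome length of about 4.4 million bases for Mycobacterium tuberculosis is an assumption not stated in the article, and the estimate treats every reported read as a usable 50-base read, ignoring unmapped or duplicate reads.

```python
# Approximate M. tuberculosis genome length in bases.
# Assumption: roughly 4.4 million bases; this figure is not from the article.
GENOME_LENGTH = 4_400_000
READ_LENGTH = 50  # bases per read, as reported in the article


def mean_coverage(num_reads, read_length=READ_LENGTH, genome_length=GENOME_LENGTH):
    """Expected average depth: total sequenced bases divided by genome length."""
    return num_reads * read_length / genome_length


low = mean_coverage(9_300_000)    # fewest reads reported per sample
high = mean_coverage(21_300_000)  # most reads reported per sample
print(f"Estimated depth: about {low:.0f}x to {high:.0f}x")
```

Under these assumptions, each sample would have been covered on the order of 100-fold or more, though the usable depth after alignment and filtering would be lower.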

Before doing the sequencing, the researchers hypothesized that the outbreak could be due either to mutations that made the pathogen more virulent or to a social or environmental cause. Sequencing the strains and identifying the two separate lineages helped the researchers rule out a genetic cause. "It would be very unusual for a mutation to cause both lineages to become more virulent," Tang said.

Stephen Bentley, a senior scientist in pathogen genomics at the Wellcome Trust Sanger Institute, agreed that the sequencing illustrated that the outbreak was not caused by a more virulent pathogen. "From an epidemiologic view, that's an important distinction," he said. "It's not something the pathogen is doing; it's something the human is doing."

The detailed social network analysis supported the sequencing results, suggesting that increased use of crack cocaine made the first patients more susceptible to infection and also provided a point of contact from which the disease spread.

Tang said he thinks whole-genome sequencing will be a routine tool for monitoring epidemics and public health in the future.

"As [sequencing] methods get cheaper and methods for analyzing the data become easier and more standardized, this will definitely replace the current genotyping methods, which are very low resolution," he said.

Initially, he said, sequencing would be useful for studying outbreaks that take a relatively long time to develop, such as tuberculosis, which takes months to years. Once an outbreak is recognized, sequencing could be used to figure out patterns of transmission, which would help identify the source of the outbreak as well as where resources should be concentrated in order to control it.

However, he noted that the sequencing would still need to be combined with epidemiologic data, such as a social network analysis. "That was the key," he said. "We used the best methods in both typing the organism and epidemiological analysis."

Tang said that sequencing is still not fast enough to address outbreaks that occur in days or weeks, and that improvements in data analysis will also be necessary in order to apply the approach in the public health setting. Nevertheless, he predicted that in the next one to five years next-gen sequencing will be a "routine method for analyzing infectious outbreaks."

Bentley, whose team at the Sanger Institute sequenced 240 isolates of the Streptococcus pneumoniae pathogen to study how it has built up resistance to drugs (IS 2/8/2011), agreed with Tang that next-gen sequencing would be implemented in public health settings within the next five years.

"It's something we're working on ourselves very enthusiastically," he said. While both his recent work and the current NEJM study were done on the Illumina GA, a more practical implementation in the public health setting would likely be on a third-generation platform, he said.

Those machines will have "faster turnaround and lower cost, which are important for a clinical setting," he said.

Specifically, Bentley said that Illumina's MiSeq instrument and Life Technologies' Ion Torrent PGM would be good possibilities. He added that the nanopore sequencing platform being developed by Oxford Nanopore could be another option, although the company does not yet have a machine available.

Pacific Biosciences recently demonstrated that its single-molecule system can be used to rapidly characterize pathogens from a disease outbreak by sequencing the bacterial strain from last year's Haitian cholera epidemic (IS 12/14/2010).

Although the Sanger Institute has a PacBio RS, Bentley said it would likely not use that machine for pathogen sequencing in a public health setting.

While it is a good machine for obtaining "long-range sequence information," it is "a rather large machine, and not really designed for that sort of application." However, future generations may be more suitable, he said.

While it may be a few years before sequencing is commonly used in the public health setting, Bentley said his team is preparing for that scenario by building up databases of pathogen genomes. The goal is to enable a comparison of a patient's sequenced pathogen to those in the database, which "will be useful in determining how to treat the patient, and in spotting the epidemic and determining how to manage it," Bentley said. "And, you'll be able to predict things like level of virulence and response to antibiotics."

Tang's team at the BCCDC is also moving forward with pathogen sequencing, looking at a second tuberculosis outbreak and methicillin-resistant Staphylococcus aureus.

Currently, he said, the BCCDC is doing all its sequencing on the Illumina GA and HiSeq 2000 platforms at the Michael Smith Genome Sciences Center. The institute may consider purchasing a machine of its own in the future, but he said he could not predict if that would happen, or what machine it would be likely to purchase.

Have topics you'd like to see covered by In Sequence? Contact the editor at mheger [at] genomeweb [.] com.