NEW YORK (GenomeWeb News) — By analyzing the genomes of more than 3,600 strains belonging to a pathogenic group of Streptococcus bacteria, a team of researchers has unraveled the genetic events leading to its emergence as the cause of an epidemic.
After sequencing the strains, Houston Methodist Research Institute's James Musser and colleagues compared them to work out the order in which the strains acquired virulence factors and toxin variants. Through that effort, they traced the emergence of this epidemic to a multistep process that produced, in the early 1980s, a single organism with increased virulence that then spread, as they reported in a Proceedings of the National Academy of Sciences Early Edition paper to be published this week.
"[T]he molecular evolutionary events transpiring in just one bacterial cell ultimately produced many millions of human infections worldwide," Musser and his colleagues wrote.
Group A Streptococcus (GAS), a Gram-positive bacterial pathogen, causes some 600 million infections globally each year and an estimated 10,000 to 15,000 severe infections in the United States. Mild illness due to GAS includes strep throat and skin infections like impetigo, while severe infections include streptococcal toxic shock syndrome (STSS) and necrotizing fasciitis. According to the US Centers for Disease Control and Prevention, STSS and necrotizing fasciitis each make up between 6 percent and 7 percent of invasive cases.
Musser and his colleagues noted that a wave of serotype M1 GAS infections arose in many countries in the late 1980s and early 1990s, including severe cases that struck public figures like the Muppeteer Jim Henson. But how this strain emerged has been unclear.
"[W]e know relatively little about the evolutionary genetic events and epidemiological forces that underpin temporal variation in bacterial disease frequency and severity," the researchers said. "Lack of precise understanding of these topics significantly hobbles our ability to understand and predict bacterial strain emergence and epidemics."
To uncover how this strain emerged, Musser and his colleagues turned to 3,615 serotype M1 organisms collected from a variety of locations in Europe and North America between the 1920s and 2013. They then sequenced them using Illumina HiSeq 2000 or Illumina MiSeq instruments.
Before determining the polymorphisms present in those strains, the researchers first resequenced two serotype M1 reference strains, MGAS5005 and SF370, using an Illumina platform to 125-fold coverage and corrected some 235 errors in the assemblies.
Using the bioinformatic tool variant ascertainment algorithm (VAAL), the researchers found more than 13,000 SNPs and nearly 1,600 indels in the core genome of the 3,615 serotype M1 organisms as compared to the corrected reference strains. Still, they noted that the 3,615 serotype M1 organisms were all closely related, with each strain differing by an average of 48 SNPs and eight indels in their core genomes.
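To make the "average of 48 SNPs" figure concrete, the following is a minimal Python sketch of one way such a number could be computed: counting mismatched positions between aligned core-genome sequences and averaging over all pairs of strains. The article does not say whether the average was taken between pairs of strains or against the reference, so the pairwise reading here is an assumption. The function names and the toy sequences are hypothetical illustrations, not the study's data, and this is not how VAAL itself works.

```python
from itertools import combinations


def snp_distance(a: str, b: str) -> int:
    """Count positions at which two aligned sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for x, y in zip(a, b) if x != y)


def mean_pairwise_snps(seqs: list[str]) -> float:
    """Average SNP distance over every unordered pair of sequences."""
    pairs = list(combinations(seqs, 2))
    if not pairs:
        raise ValueError("need at least two sequences")
    return sum(snp_distance(a, b) for a, b in pairs) / len(pairs)


# Hypothetical toy "core genomes" (made-up data, for illustration only).
toy_core_genomes = ["ACGTACGT", "ACGTACGA", "ACGAACGA"]
print(mean_pairwise_snps(toy_core_genomes))
```

With thousands of genomes, the number of pairs grows quadratically, so a real analysis would likely rely on a variant table and dedicated tooling rather than direct string comparison.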
Further, they reported that five genes encoding virulence factors — M protein (emm), regulator of protease B (ropB), streptococcal inhibitor of complement (sic), and the two-component system control of virulence regulator (covR) and sensor (covS) — had higher variation levels than average.
Additionally, by comparing the mobile genetic elements of the strains under study to those of the references, Musser and his colleagues found that most of the strains causing disease since the mid-1980s had the same MGE content as the MGAS5005 reference genome, meaning they contained prophage 5005.1, which encodes the superantigen SpeA, and prophages 5005.2 and 5005.3, which encode the DNases Spd3 and streptococcal DNase D2 (SdaD2), respectively.
Most of the SF370 reference-like strains, they added, dated back to 1988 or earlier, while all MGAS5005-like strains were isolated in 1988 or later. Additionally, they noted that the ancestral SF370-like organisms had either the speA1 or speA2 allele, while the descendant MGAS5005-like strains had only the speA2 allele.
Musser and his colleagues said that they found "three critical points" using a number of phylogenetic inference tools.
The speA2-encoding prophage in the modern MGAS5005-like strains is nearly identical in sequence to the speA1-encoding prophage from the SF370-like strains from the 1920s, they noted. Additionally, they said that the two prophages integrated at the same chromosomal spot in both MGAS5005- and SF370-like strains. Finally, all strains encoding speA isolated since the 1970s have the speA2 allele.
This, they said, indicates that the speA2 allele evolved from the speA1 allele in a single event that predates the emergence of severe invasive infections. The speA2 allele then became common in contemporary M1 strains through clonal expansion, they added.
Similarly, Musser and his colleagues found that the prophage containing spd3 is present in most M1 strains isolated since the 1970s, before the rise of invasive infections.
Musser and his colleagues also pinpointed the arrival of the 5005.3 prophage, which encodes sdaD2, to before the acquisition of the speA-containing 5005.1 prophage.
"These findings also strongly argue for a clonal expansion and identity-by-descent evolutionary pathway," the researchers said.
The last link in the chain of events leading to the emergence and spread of these strains was the acquisition of a 36-kb region through horizontal gene transfer, Musser and his colleagues said. This purA-nadC recombination event occurred after the early 1970s.
And based on a best-fit rooted time-tree, Musser and his team estimated that the year of origin for the most recent common ancestor of modern MGAS5005-like strains was 1983.
Clinical reports indicated that serotype M1 strains isolated before the resurgence are less virulent than the later, resurgent strains. To test this, Musser and his team exposed two groups of monkeys to either the pre-resurgence reference strain SF370 or the post-resurgence reference strain MGAS2221 and monitored them for 21 days.
Based on bacterial burden, the post-resurgence MGAS2221 appeared to be more virulent in both their pharyngitis and necrotizing fasciitis models of disease.
"It is now clear from our analysis that a complex multistep molecular process occurring sequentially over decades culminated in a single organism with increased virulence that fulminantly disseminated globally and displaced other GAS M1 lineages," the researchers said.
They added that although theirs was a retrospective study, it is now becoming possible to conduct similar studies in near real time. "These types of studies, conducted in synchrony with detailed analyses of relevant emergent phenotypic properties (e.g., virulence, antimicrobial agent resistance, dissemination capacity), will undoubtedly provide new information useful for significantly enhanced basic, clinical, and translational research on human, veterinary, and plant pathogens," they said.