PacBio, Collaborators Publish Analysis of E. coli Outbreak Genome

By a GenomeWeb staff reporter

NEW YORK (GenomeWeb News) – An international team of researchers has used Pacific Biosciences' single-molecule sequencing technology to gain insight into the pathogenicity and evolutionary origins of the O104:H4 strain of Escherichia coli responsible for a recent outbreak in Europe that resulted in thousands of illnesses and more than 50 deaths.

Published today in the New England Journal of Medicine, the study sequenced the outbreak strain and 11 related E. coli strains. It determined that the outbreak strain, C227-11, is a member of the enteroaggregative pathotype of E. coli, or EAEC, and that its genome "can be distinguished from those of other O104:H4 strains because it contains a prophage encoding Shiga toxin 2 and a distinct set of additional virulence and antibiotic-resistance factors," the authors wrote.

In particular, they determined that a recent horizontal genetic exchange with another E. coli pathotype — the Shiga toxin-producing enterohemorrhagic E. coli, or EHEC, strain — enabled the emergence of the highly virulent outbreak strain.

The team also found that expression of the Shiga toxin 2 gene, stx2, was increased by certain antibiotics, including ciprofloxacin, which "suggests that caution is warranted in the use of certain classes of antibiotics to counteract this newly emerged pathogen."

The findings are in line with those of other groups that have used sequencing to elucidate the origins of the deadly E. coli strain. Two teams — one from BGI and the other a collaboration between the University of Münster and Life Technologies — were the first to sequence the genome, using the Ion Torrent PGM, and both groups found that the strain contained pathogenic features of both the EHEC and EAEC pathotypes. Further sequencing and assembly on Roche's 454 GS Junior determined that the strain was EAEC with an acquired phage genome that produces the Shiga toxin.

The authors of the NEJM study, who included researchers from PacBio, the University of Maryland School of Medicine, the World Health Organization, Denmark's Statens Serum Institut, and elsewhere, noted that before the outbreak, only three EAEC genomes had been sequenced, so "genome-scale knowledge of the phylogeny of enteroaggregative E. coli was limited."

In order to gain further insight into the evolutionary history of the outbreak strain, they used the PacBio RS to sequence the outbreak strain C227-11, seven other EAEC O104:H4 strains, and four reference strains.

Using three PacBio RS instruments in parallel, they obtained approximately 75-fold coverage for each of the isolates in about five hours per isolate. The mean read length was 2,067 bases.
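For a rough sense of scale, the relationship between fold coverage, mean read length, and read count can be sketched as below. The article does not give a genome size, so the 5.3-megabase figure is only an illustrative assumption for an E. coli genome, and the function names are hypothetical. The calculation also treats every read as having exactly the mean length, which real data will not.

```python
# Back-of-the-envelope coverage arithmetic.
# Coverage and mean read length come from the article; the genome size
# is an ASSUMPTION for illustration, not a figure reported in the study.

COVERAGE = 75                       # approximate fold coverage per isolate (from the article)
MEAN_READ_LENGTH_BP = 2_067         # mean read length in bases (from the article)
ASSUMED_GENOME_SIZE_BP = 5_300_000  # hypothetical E. coli genome size


def reads_needed(coverage: int, genome_size_bp: int, mean_read_length_bp: int) -> int:
    """Smallest read count whose total bases reach the target fold coverage."""
    total_bases = coverage * genome_size_bp
    # Ceiling division using integers only, so there is no float rounding.
    return -(-total_bases // mean_read_length_bp)


def expected_coverage(n_reads: int, mean_read_length_bp: int, genome_size_bp: int) -> float:
    """Average fold coverage: total sequenced bases divided by genome size."""
    return n_reads * mean_read_length_bp / genome_size_bp


if __name__ == "__main__":
    n = reads_needed(COVERAGE, ASSUMED_GENOME_SIZE_BP, MEAN_READ_LENGTH_BP)
    cov = expected_coverage(n, MEAN_READ_LENGTH_BP, ASSUMED_GENOME_SIZE_BP)
    print(f"reads needed: {n}, resulting coverage: {cov:.2f}x")
```

Under that assumed genome size, the estimate comes to a little under 200,000 reads per isolate; a different genome size scales the result proportionally.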

They then compared all 11 strains alongside data from the Ion Torrent and 454 sequencing efforts in order to identify copy-number variations and single-nucleotide variations across all the isolates.

They also used data from 53 E. coli and shigella genomes to generate a phylogenetic tree outlining the evolution of the outbreak strain. While EAEC isolates were spread throughout the entire tree, suggesting that they are the "most diverse" of the pathotypes, all EAEC O104:H4 strains formed a "distinct clade, with highly conserved core genomes."

The similarity between the Shiga-toxin-encoding strains from the German outbreak and the EAEC O104:H4 strains lacking the Shiga-toxin-encoding phage signals that the incorporation of the phage into the EAEC genome was a "relatively recent event," the authors wrote.

Furthermore, the fact that the outbreak strain lies within the EAEC O104:H4 clade "confirms that the outbreak strain is not a prototypical enterohemorrhagic E. coli strain that has acquired the virulence features of enteroaggregative E. coli."

The researchers found that the outbreak isolate also differs from other EAEC strains in the number of SPATE (serine protease autotransporters of Enterobacteriaceae) proteases it harbors. While EAEC genomes rarely encode more than two SPATEs, the C227-11 genome encodes a combination of three: SepA, SigA, and Pic.

"We speculate that the combined activity of these SPATEs, together with other enteroaggregative E. coli virulence factors, accounts for the increased uptake of Shiga toxin into the circulation," which resulted in the elevated virulence of the strain, the authors wrote.

"This multi-strain sequencing data and analysis significantly increases the amount of scientific information available for the study of this new deadly form of E. coli and has yielded critical insights into its causative agent," David Rasko, an assistant professor at the University of Maryland School of Medicine's Institute for Genome Sciences and a co-author of the paper, said in a statement.

"Our results provide the most complete published genome of this strain to date and highlight the importance of DNA sequencing to understanding how the plasticity of bacterial genomes facilitates the emergence of new pathogens," he added.
