NEW YORK (GenomeWeb News) – A comparative genomics study is providing new insights into the processes that lead to enterohemorrhagic forms of Escherichia coli.
Japanese researchers sequenced the genomes of three enterohemorrhagic E. coli, or EHEC, strains and compared these newly sequenced genomes with the genomes of two EHEC O157:H7 strains and 21 strains of Shigella or non-EHEC E. coli.
The research, scheduled to appear online this week in the Proceedings of the National Academy of Sciences, revealed that enterohemorrhagic forms of E. coli have large genomes packed with viral and other sequences not found in other E. coli strains, including some coding for toxins similar to those produced by Shigella. In addition, the researchers found that EHEC strains seem to arise in parallel in E. coli from different strain backgrounds or phylogenies.
"[T]he present study clearly demonstrates how E. coli strains belonging to different phylogenies can independently evolve into EHEC," senior author Tetsuya Hayashi, an infectious disease researcher at the University of Miyazaki, and his colleagues wrote. "The selective forces and special genetic factors … promoting such parallel evolution have yet to be identified, but our results yield unique insights into the dynamic evolution of bacterial complex virulence systems."
EHEC strains produce toxins, similar to those produced by Shigella, that can cause human infections characterized by diarrhea and abdominal cramps. These infections can lead to more severe conditions such as bloody diarrhea and hemolytic uremic syndrome, a life-threatening condition linked to acute kidney failure.
The O157:H7 strain is the most common and best-known EHEC strain, but several other EHEC strains have also been identified.
For the latest paper, the researchers sequenced the genomes of three EHEC strains isolated in Japan (O26, O111, and O103) using whole-genome shotgun sequencing with Applied Biosystems' 3730xl instruments. They then compared these genomes with two O157:H7 strains, six Shigella strains, and 15 previously sequenced, non-EHEC E. coli strains.
The researchers found that the genomes of the EHEC strains were larger than the other E. coli genomes. While the non-EHEC strains have genomes that are about 5.5 million bases each, the EHEC strains have genomes containing about 400,000 extra bases.
This difference in genome size was partly due to an increase in sequences from prophages (sequences from viruses that infect bacteria), integrative elements, and plasmids. The EHEC genomes also contained more protein-coding genes and transfer RNAs than the non-EHEC genomes sequenced so far.
Overall, the researchers found that the EHEC strains contained between 98 and 135 copies of 38 new or previously identified sequence elements. Of these, the team found that 15 were shared between all of the EHEC strains.
There were other differences as well. The team grouped the coding sequences in the 25 strains into 12,940 groups sharing at least 90 percent amino acid sequence identity, identifying 1,919 sequences that were conserved in all of the strains.
But when they compared the nearly 13,000 sequence groups found across the strains, they found that the gene repertoires of the EHEC strains clustered together and were distinct from the non-EHEC clusters.
"These results indicate that the whole-gene repertoires of the EHECs are more similar to each other than to any of the other strains," the authors wrote.
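The article does not say how the repertoire comparison was computed, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes each strain is reduced to the set of sequence groups it carries, uses the Jaccard index as a stand-in similarity measure, and runs on made-up strain names and group IDs rather than data from the study.

```python
from itertools import combinations

# Hypothetical toy data: strain name -> set of sequence-group IDs it carries.
# These names and IDs are illustrative only, not taken from the study.
repertoires = {
    "EHEC_A": {"g1", "g2", "g3", "g4"},
    "EHEC_B": {"g1", "g2", "g3", "g5"},
    "K12_like": {"g1", "g2", "g6"},
}


def core_groups(reps):
    """Return the sequence groups present in every strain."""
    return set.intersection(*reps.values())


def jaccard(a, b):
    """Shared groups divided by all groups seen in either strain."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def pairwise_similarity(reps):
    """Jaccard similarity for every pair of strains, keyed by sorted name pair."""
    return {
        (s, t): jaccard(reps[s], reps[t])
        for s, t in combinations(sorted(reps), 2)
    }


core = core_groups(repertoires)
similarity = pairwise_similarity(repertoires)
```

In this toy example the two EHEC-labelled strains share more of their groups with each other than either shares with the third strain, which is the kind of pattern the authors describe; a real analysis would feed such pairwise values into a clustering step.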
Based on these findings, the researchers suggest that EHEC strains have arisen through parallel evolution from a variety of E. coli phylogenies.
"[The non-O157 EHEC] genome sequences provide critical genetic information for developing efficient strategies to control non-O157 EHEC infections," the researchers explained. "[O]ur genomic comparison of these non-O157 EHECs with O157 EHEC and other fully sequenced E. coli/Shigella strains revealed a genetic mechanism underlying the parallel evolution of EHECs."