NEW YORK (GenomeWeb News) – Using an alignment-free phylogeny strategy, a research duo has created new bacterial phylogenies that are based on features found across the genome rather than on the sequences of selected genes.
The researchers used a so-called feature frequency profile, or FFP, approach, which looks at how often certain features appear in the genome, to develop a pair of phylogenies for Escherichia coli and Shigella species. An evolutionary phylogeny of the bugs focused on core features in the three dozen genomes assessed in the study, while a phenetic phylogeny looked at all available features in these genomes.
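As a rough illustration of what an alignment-free, frequency-based comparison can look like, the Python sketch below counts short overlapping substrings in each genome and compares the resulting profiles. It rests on assumptions that go beyond this report: "features" are treated as fixed-length substrings (k-mers), the Jensen-Shannon divergence is used as one plausible distance measure, and the function names, toy sequences, and choice of k are all hypothetical. It is not the authors' implementation.

```python
from collections import Counter
from math import log2


def feature_frequency_profile(sequence, k):
    """Return the relative frequency of each overlapping k-mer in `sequence`.

    Assumption: a "feature" is a substring of fixed length k.
    """
    sequence = sequence.upper()
    counts = Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))
    total = sum(counts.values())
    return {feature: n / total for feature, n in counts.items()}


def jensen_shannon_divergence(p, q):
    """Base-2 Jensen-Shannon divergence between two sparse profiles.

    Assumption: this is one reasonable distance between profiles;
    the report does not say which measure the study used.
    """
    divergence = 0.0
    for feature in set(p) | set(q):
        pf = p.get(feature, 0.0)
        qf = q.get(feature, 0.0)
        mf = (pf + qf) / 2  # frequency in the midpoint profile
        if pf > 0:
            divergence += 0.5 * pf * log2(pf / mf)
        if qf > 0:
            divergence += 0.5 * qf * log2(qf / mf)
    return divergence


def pairwise_distances(genomes, k):
    """Compute a distance for every pair of named genomes."""
    profiles = {name: feature_frequency_profile(seq, k)
                for name, seq in genomes.items()}
    names = sorted(profiles)
    return {(a, b): jensen_shannon_divergence(profiles[a], profiles[b])
            for a in names for b in names}


# Hypothetical toy sequences, far shorter than real genomes.
toy_genomes = {
    "strain_1": "ACGTACGTGGCA",
    "strain_2": "ACGTACGAGGCA",
    "strain_3": "TTGACCATTGAC",
}
distances = pairwise_distances(toy_genomes, k=3)
```

A distance table of this kind could then be handed to a standard tree-building method such as neighbor joining. No sequence alignment or gene selection is involved at any step, which is the property the researchers highlight.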
"To avoid possible bias in tree building as a consequence of subjective gene selection, we used a method of alignment-free genome comparison that uses FFPs to enable an efficient comparison of whole-genome sequences," corresponding author Sung-Hou Kim, a researcher affiliated with the University of California at Berkeley, Lawrence Berkeley National Laboratory, and Korea's Yonsei University, and co-author Gregory Sims, an informatics and physical biosciences researcher at the J. Craig Venter Institute and LBNL, wrote.
The research, appearing online last night in the Proceedings of the National Academy of Sciences, not only identified some genomic features that correspond to specific phylogroups, but also offers new clues about E. coli and Shigella evolution. For instance, the researchers noted, the phylogenetic data supports the notion that both groups of bacteria descended from a facultative or opportunistic pathogen species.
"If we assume that the ancestral E. coli progenote that first infected primates closely resembles the most basal group in our evolutionary phylogeny, a likely conclusion is that this microbe was not a harmless commensal strain," the authors noted.
Although E. coli and Shigella were originally classified as distinct genera, Sims and Kim explained, evidence collected over the past few decades suggests that they belong to an overlapping group that includes a variety of species with a range of relationships with humans and other animals.
"The progenitor strains of today's commensal and pathogenic variants were likely present in the primate gut preceding the divergence of the great apes, perhaps greater than 30 [million years ago]," they wrote.
The duo employed the feature-based FFP strategy in hopes of gaining new evolutionary insights into the E. coli/Shigella clade and, more broadly, of exploring the utility of the approach.
Using features from 36 sequenced E. coli and Shigella genomes, the researchers put together two phylogenies — an evolutionary phylogeny based on core genome features and a phenetic phylogeny from all identifiable features of sufficient size — and subsequently found sets of features that coincide with the newly defined phylogroups.
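The distinction between the two feature sets can be sketched in a few lines of Python. The sketch assumes, as a simplification not spelled out in this report, that a feature is a k-mer, that "core" means present in every genome, and that "all" means present in at least one genome. The function names and toy sequences are hypothetical.

```python
def kmer_set(sequence, k):
    """Return the set of distinct overlapping k-mers in `sequence`."""
    sequence = sequence.upper()
    return {sequence[i:i + k] for i in range(len(sequence) - k + 1)}


def partition_features(genomes, k):
    """Return (core_features, all_features) for a dict of named genomes.

    Assumption: core = k-mers found in every genome,
    all = k-mers found in at least one genome.
    """
    feature_sets = [kmer_set(seq, k) for seq in genomes.values()]
    if not feature_sets:
        return set(), set()
    core_features = set.intersection(*feature_sets)
    all_features = set().union(*feature_sets)
    return core_features, all_features


# Hypothetical toy sequences standing in for whole genomes.
toy_genomes = {
    "strain_1": "ACGTACGT",
    "strain_2": "ACGTACGA",
    "strain_3": "ACGTTTGA",
}
core, everything = partition_features(toy_genomes, k=3)
```

Under these assumptions, a tree built only from `core` would play the role of the evolutionary phylogeny, while one built from `everything` would play the role of the phenetic phylogeny.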
These clade-defining features and related bits of the genome "may provide useful information for understanding evolution of the groups and for quick diagnostic identification of each phylogroup," they wrote.
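One simple way to think about a clade-defining feature is as one that appears in every member of a group and in no genome outside it. The Python sketch below applies that definition. The definition itself, the function name, and the toy data are assumptions made for illustration; the report does not describe how the authors selected these features.

```python
def clade_defining_features(feature_sets, clade):
    """Return features shared by every clade member and absent elsewhere.

    `feature_sets` maps a genome name to its set of features;
    `clade` is a collection of genome names forming the group of interest.
    """
    inside = [features for name, features in feature_sets.items()
              if name in clade]
    outside = [features for name, features in feature_sets.items()
               if name not in clade]
    if not inside:
        return set()
    shared_by_clade = set.intersection(*inside)
    seen_elsewhere = set().union(*outside)
    return shared_by_clade - seen_elsewhere


# Hypothetical feature sets for four genomes in two groups.
toy_feature_sets = {
    "strain_1": {"ACG", "CGT", "GGA"},
    "strain_2": {"ACG", "TTA", "GGA"},
    "strain_3": {"CGT", "TTA", "CCA"},
    "strain_4": {"CGT", "CCA"},
}
markers = clade_defining_features(toy_feature_sets, {"strain_1", "strain_2"})
```

A short list of such markers is the kind of resource that could support the quick diagnostic identification the authors mention, since checking a new genome for a handful of features is much cheaper than rebuilding a tree.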
When they compared these phylogenies with those created using alignment-based approaches, the team found that the new phylogenies were mainly consistent with those reported previously, though the FFP approach did identify some distinct groupings and sub-groupings, as well as differences in the order in which certain phylogroups diverged from one another.
As such, the researchers said, the approach provides new clues about bacterial evolution as well as insights into the classification of existing bacterial species.
"Using the FFP method, we are able to examine two aspects of genomic evolution that are highly revealing: the evolution both of core features, which we suggest infers ancestral history of organisms (evolutionary phylogeny), and of the composition of all features, which is likely to reflect phenetic grouping of extant organisms (phenetic phylogeny)," they concluded.