NEW YORK (GenomeWeb) – Researchers from the University of Hong Kong have performed a comparative genomic analysis of pre-epidemic and epidemic Zika virus in hopes of learning how and why the virus went from causing fewer than 20 sporadic cases in Africa and Asia before 2007 to causing a large epidemic with neurological conditions in children and adults.
As they reported today in Emerging Microbes & Infections, the researchers analyzed sequences of 24 Zika isolates with complete genome or complete polyprotein sequences available in GenBank, including strains collected from humans, animals, and mosquitoes in Africa, Asia, the Pacific islands, and Latin America between 1947 and 2015. The team also compared the Zika samples to representative genome sequences of other human-pathogenic flaviviruses, such as Spondweni virus, dengue virus serotype 2, Japanese encephalitis virus, West Nile virus, yellow fever virus, and tick-borne encephalitis virus.
The team constructed phylogenetic trees for the coding regions and conducted protein family analysis, among other comparisons. "Besides the reported phylogenetic clustering of the epidemic strains with the Asian lineage, we found that the topology of phylogenetic tree of all coding regions is the same except that of the non-structural 2B (NS2B) coding region," the team wrote in its paper. "This finding was confirmed by bootscan analysis and multiple sequence alignment, which suggested the presence of a fragment of genetic recombination at NS2B with that of Spondweni virus."
Phylogenetic analysis of the ten putative structural and NS coding regions showed that the Zika strains were clustered into the African and the Asian lineages, and that the epidemic Zika strains clustered with the Asian lineage strains. "The complete polyprotein sequences of [Zika] within the same lineage are more similar than those of different lineages," the authors wrote. "Notably, there is a change in the tree topology at the NS2B coding region, with a possible recombination occurring between [Zika] and [Spondweni]."
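To illustrate the idea behind scanning a genome for a recombinant fragment, here is a minimal Python sketch. It is not the bootscan method the authors used, which builds bootstrapped trees in each window; it swaps in a simpler sliding-window identity comparison. The function names, window sizes, and toy sequences are all hypothetical.

```python
def identity(a, b):
    """Fraction of aligned, non-gap positions at which two sequences match."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    return sum(x == y for x, y in pairs) / len(pairs)


def window_closest(query, references, window, step):
    """For each window of the alignment, report the reference closest to the query."""
    results = []
    for start in range(0, len(query) - window + 1, step):
        end = start + window
        scores = {
            name: identity(query[start:end], seq[start:end])
            for name, seq in references.items()
        }
        best = max(scores, key=scores.get)
        results.append((start, end, best, scores[best]))
    return results


# Toy aligned sequences (made up for illustration, not real viral data).
query = "AAAACCCCAAAA"
references = {
    "zika_like": "AAAAGGGGAAAA",
    "spondweni_like": "TTTTCCCCTTTT",
}

closest = window_closest(query, references, window=4, step=4)
for start, end, name, score in closest:
    print(f"{start}-{end}: closest to {name} (identity {score:.2f})")
```

A switch in the closest reference within one region, as in the middle window of this toy example, is the kind of signal that would prompt a closer look for recombination.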
Using all 24 available Zika genome sequences, the researchers found low Ka/Ks ratios (the ratio of nonsynonymous to synonymous substitution rates), suggesting that all the genes in the Zika genome are likely under stabilizing selection. In comparing the pre-epidemic Asian lineage strains with the epidemic Zika strains, the team detected 24 amino acid substitutions in the genomes of the epidemic strains, four of which are associated with a change in the hydrophilicity or hydrophobicity of the amino acids.
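As a rough illustration of how a Ka/Ks ratio is read, the Python sketch below computes the ratio from counts of differences and sites, applying a Jukes-Cantor correction to each proportion. The article does not say which estimation method the authors used, so this approach, the function names, and the example counts are assumptions.

```python
import math


def jukes_cantor(p):
    """Correct an observed proportion of differences for multiple substitutions."""
    if not 0 <= p < 0.75:
        raise ValueError("proportion must be in [0, 0.75)")
    return -0.75 * math.log(1 - 4 * p / 3)


def ka_ks_ratio(nonsyn_diffs, nonsyn_sites, syn_diffs, syn_sites):
    """Ka/Ks from counts of differences and of sites (all inputs hypothetical)."""
    ka = jukes_cantor(nonsyn_diffs / nonsyn_sites)
    ks = jukes_cantor(syn_diffs / syn_sites)
    if ks == 0:
        raise ValueError("Ks is zero, so the ratio is undefined")
    return ka / ks


def interpret(ratio, tolerance=0.1):
    """Conventional reading of the ratio; the tolerance is an arbitrary choice."""
    if ratio < 1 - tolerance:
        return "purifying (stabilizing) selection"
    if ratio > 1 + tolerance:
        return "positive selection"
    return "approximately neutral"


# Made-up counts for one gene: few nonsynonymous changes, many synonymous ones.
ratio = ka_ks_ratio(nonsyn_diffs=5, nonsyn_sites=750, syn_diffs=30, syn_sites=250)
label = interpret(ratio)
print(f"Ka/Ks = {ratio:.3f}: {label}")
```

A ratio well below 1, as in this made-up case, means amino acid changes are accumulating more slowly than silent changes, which is the pattern the researchers describe.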
In comparing pre-epidemic African lineage virus strains and the epidemic Zika strains, the team further detected 75 amino acid substitutions. Most of these were markers that differentiate the African and Asian lineages of Zika, but 15 of the substitutions are present only in the epidemic Zika strains and not in the pre-2007 strains.
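The distinction between lineage markers and epidemic-only substitutions can be sketched in Python as a column-by-column comparison of aligned protein sequences. This is a simplified illustration under stated assumptions: the function name and toy sequences are hypothetical, and the criterion used here (a residue shared by every epidemic strain and absent from every pre-epidemic strain) may differ from the authors' own.

```python
def epidemic_specific_sites(pre_epidemic, epidemic):
    """Return (position, pre-epidemic residues, epidemic residue) for each
    alignment column where all epidemic strains share one residue that no
    pre-epidemic strain carries. Positions are 1-based."""
    length = len(epidemic[0])
    if any(len(seq) != length for seq in pre_epidemic + epidemic):
        raise ValueError("all sequences must be aligned to the same length")
    sites = []
    for i in range(length):
        epi_residues = {seq[i] for seq in epidemic}
        pre_residues = {seq[i] for seq in pre_epidemic}
        if len(epi_residues) == 1 and not (epi_residues & pre_residues):
            sites.append((i + 1, "/".join(sorted(pre_residues)), next(iter(epi_residues))))
    return sites


# Toy aligned protein fragments (made up for illustration).
pre_2007 = ["MKNPA", "MRNPA"]
epidemic_strains = ["MKTPA", "MKTPA"]

specific = epidemic_specific_sites(pre_2007, epidemic_strains)
print(specific)
```

In the toy data, position 2 varies among the older strains but is not new in the epidemic strains, while position 3 carries a residue seen only in the epidemic strains, so only position 3 is reported.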
Further, the authors wrote, "we have detected a number of amino acid substitutions throughout the genome and a conformational change in the SLI structure at the 3'-UTR of the epidemic [Zika] strain. We have also detected a possible recombination of a NS2B fragment between the Asian lineage of [Zika] and [Spondweni]."
Since mutations in other similar viruses have been associated with changes in virulence, replication efficiency, antigenic epitopes, and host tropism, the authors added, it's important to conduct further studies to ascertain the biological significance of all genomic changes from pre-epidemic to epidemic Zika.