NEW YORK (GenomeWeb) – A research team led by scientists at the University of Maryland School of Medicine's Institute for Genome Sciences has published a paper describing a statistical model that uses genome-wide protein sequence divergence estimates to calculate the evolutionary age of Plasmodium species.
The researchers used the method, which is described in a recent issue of Molecular Biology and Evolution, to estimate the evolutionary ages of seven mammalian Plasmodium species. Their results "indicate that the mammalian-infecting Plasmodium evolved contemporaneously with their hosts, with little evidence of parasite host-switching on an evolutionary scale, provid[ing] a solid timeframe within which to place the evolution of new Plasmodium species," Joana Silva, an assistant professor in U of M's microbiology and immunology department and the IGS, and one of the paper's authors, said in a statement.
"Once the sequence of the nuclear genome becomes available for the new species of Plasmodium recently found in non-human great apes in Africa, it will be very interesting to see what the age estimates for those parasite species tell us in terms of their transmission mode among mammalian hosts," she added.
Identifying the age of Plasmodium species can help investigators understand how malaria is transmitted as well as the likelihood of transmission from animals to humans. "We want to determine how common it is to find these parasites being able not only to infect a variety of host species but also [to do it] in a high-enough frequency," Silva told GenomeWeb.
Part of the reason for this, she explained, is that as current malaria research focuses on eradicating the disease in humans, there are concerns that Plasmodium strains found in great apes, which are similar to the strains that infect humans, could cross over into humans. It's especially concerning because several new diseases that have affected humans in the last decade have been caused by pathogens that originated in animals or animal products. Being able to characterize how species such as Plasmodium evolve is crucial because it provides valuable information about their propensity for cross-species infection.
Simply put, if the parasite is as old as its host, "we can postulate that the most likely transmission is a vertical transmission, but if the parasites are much younger than the host, then there has to be some of these lateral jumps or host switches," Silva said.
However, estimating the timing of the divergence of these parasitic protozoans "remains highly controversial," with multiple studies suggesting a wide range of possible evolutionary ages, the researchers wrote. Some of the methods used in these studies try to infer evolutionary age by "calibrating DNA sequence polymorphism (usually in P. falciparum or in P. vivax) or sequence divergence between species with a substitution rate inferred under the assumption of co-speciation of parasite species and their respective host," the researchers wrote. Other studies "have calibrated divergence with rRNA substitution rates estimated for bacterial or eukaryotic taxa," they said. However, these methods look only at a few genomic loci whose polymorphisms and divergence may be unique and not representative of the entire genome, according to the paper.
Larger quantities of sequence data are now available for some Plasmodium species, and, by extension, far more loci to analyze, which could shed more light on the evolutionary age of the species. However, while the aforementioned approaches work well on small sets of genes, they would be computationally expensive to apply to a larger pool, Silva said. As such, she and her team set about creating a more computationally efficient approach that makes use of molecular data from all genes in the genome and is not biased toward the characteristics of a few loci.
Their method, which is based on total least squares regression, relies on the assumption that the baseline evolution rate of all genomes being compared is similar and requires that a large proportion of the proteins in the taxa of interest evolve according to a protein-specific molecular clock. Silva explained that if each protein-coding gene in a genome evolves in a clocklike manner, meaning that each gene evolves at a certain rate that remains more or less constant in all the Plasmodium species, then it's possible to compare the rate of protein evolution in a given gene between Plasmodium species and determine when the species diverged from each other.
So if, for example, "you compare that gene between species A and B and the gene differs by 10 percent, and then you look at species C and D and that gene differs by 40 percent, what you can say is that if the rate of protein evolution has been constant, then the species C and D diverged from each other four times longer than species A and B," she said. And "if you see a divergence that's four times as long, it means that the separation happens four times earlier."
For a given genome, the U of M method runs this comparison for all protein-coding genes at the same time. For the method to work, most of the protein-coding genes in the genome must maintain similar evolution rates across species. In the case of Plasmodium, which has 4,000 protein-coding genes, "it needs to be true that most of the 4,000 genes do satisfy that condition of evolving in a clocklike manner," Silva said. And if that's the case, "then you calculate the divergence for all 4,000 genes and ... compare the divergence of those … between species A and B [and] between species C and D ... to determine how old the divergence between one species pair is relative to the other species pair."
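The genome-wide comparison Silva describes can be illustrated with a small sketch. It is not the authors' implementation. It assumes a simple total least squares fit of a line through the origin, with equal error variance in both species pairs, and it runs on simulated per-gene divergences. The function name `tls_slope_through_origin`, the simulated gene rates, and the noise level are all hypothetical choices made for illustration.

```python
import numpy as np

def tls_slope_through_origin(x, y):
    """Total least squares slope for y ~ slope * x, with the line forced through the origin.

    x, y: per-gene protein divergences for two species pairs (same gene order).
    Assumes equal error variance in x and y; the published model may differ.
    """
    data = np.column_stack([x, y])
    # The first right singular vector gives the direction of the line through
    # the origin that minimizes the summed squared perpendicular distances.
    _, _, vt = np.linalg.svd(data, full_matrices=False)
    dx, dy = vt[0]
    return dy / dx

# Simulated data (hypothetical): each gene has its own clock rate, and the
# C-D pair diverged four times earlier than the A-B pair.
rng = np.random.default_rng(0)
n_genes = 4000
gene_rate = rng.gamma(shape=2.0, scale=0.05, size=n_genes)
true_ratio = 4.0
d_ab = gene_rate + rng.normal(0.0, 0.005, n_genes)               # divergence between A and B
d_cd = gene_rate * true_ratio + rng.normal(0.0, 0.005, n_genes)  # divergence between C and D

relative_age = tls_slope_through_origin(d_ab, d_cd)
print(f"Estimated age of C-D split relative to A-B split: {relative_age:.2f}")
```

The fitted slope plays the role of the relative age: if most genes evolve in a clocklike manner, the points fall near a single line and the slope recovers how much older one split is than the other. Total least squares is used here rather than ordinary regression because the divergence estimates for both species pairs carry error.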
For this study, the researchers explored several hundred loci from seven Plasmodium species. They were able to confirm the existence of a molecular clock specific to Plasmodium proteins, and also to estimate the relative age of mammalian-infecting Plasmodium, according to the paper. In the latter instance, they also found "striking" overlap between the age of the parasite and that of its host, Silva said. They estimated, for example, that the age of the divergence between the human and rodent versions of Plasmodium is between 65 and 120 million years, which overlaps with the estimated age of divergence between rodents and primates of between 70 and 100 million years, she said.
Some specific findings reported in the paper include that the split between the human parasite P. vivax and P. knowlesi, which comes from Old World monkeys, occurred 6.1 times earlier than the split between P. falciparum and P. reichenowi, which are parasites of humans and chimpanzees respectively. They also found that the split between P. falciparum and P. reichenowi occurred between 3 million and 5.5 million years ago, and that mammalian parasites originated over 64 million years ago — about 22 times earlier than the split between P. falciparum and P. reichenowi.
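As a rough consistency check on these figures, the relative ages can be multiplied by the calibration range for the P. falciparum and P. reichenowi split. The sketch below is plain arithmetic on the numbers quoted in this article. The helper `scale_age_range` is a hypothetical name, and the naive scaling is an assumption that ignores any uncertainty in the relative-age estimates themselves.

```python
def scale_age_range(calibration_range_mya, relative_age):
    """Scale a (low, high) age range in millions of years by a relative-age factor."""
    low, high = calibration_range_mya
    return (low * relative_age, high * relative_age)

# Figures quoted in the article.
falciparum_reichenowi_mya = (3.0, 5.5)  # split between P. falciparum and P. reichenowi
mammalian_factor = 22                   # mammalian parasites: about 22 times earlier
vivax_knowlesi_factor = 6.1             # P. vivax / P. knowlesi: 6.1 times earlier

mammalian_origin = scale_age_range(falciparum_reichenowi_mya, mammalian_factor)
vivax_knowlesi = scale_age_range(falciparum_reichenowi_mya, vivax_knowlesi_factor)

print("Mammalian parasite origin (Mya):", mammalian_origin)
print("P. vivax / P. knowlesi split (Mya):", vivax_knowlesi)
```

The first range works out to roughly 66 to 121 million years, which lines up with the "over 64 million years" and "between 65 and 120 million years" figures cited above. The second range, roughly 18 to 34 million years, is derived here by simple multiplication and is not a figure reported in the article.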
For their next steps, the researchers are looking to use the method to explore Plasmodium species that were not included in this study, Silva said. They are also looking to apply the method to genomic data from other kinds of creatures. For this study, they used it to analyze data from the Drosophila 12 Genomes project — and identified the same clocklike evolution process in its genes — but going forward, they'll look into analyzing data from organisms such as fungi, Silva said.
The research team responsible for this study included scientists and statisticians from the National Center for Biotechnology Information and the University of Maryland, College Park's department of applied mathematics and statistics. The project was supported by the National Institute of Allergy and Infectious Diseases and by the Intramural Research Program of the National Institutes of Health's National Library of Medicine.