NEW YORK (GenomeWeb) – Two independent research teams, led by scientists in the US and Europe, have analyzed Ebola virus genomes from more than 400 patient samples collected during the outbreak in West Africa, allowing them to glean new information about the virus' transmission routes and long-term evolution.
One study, published in Cell today by Pardis Sabeti at the Broad Institute of Harvard and MIT and colleagues, sequenced samples from 232 patients in Sierra Leone, collected over seven months last year, and analyzed them along with 86 previously published Ebola genomes.
The European Mobile Laboratory Consortium, led by Miles Carroll at Public Health England, sequenced 179 patient samples collected in Guinea between March 2014 and January 2015 and also analyzed them along with the previously published genomes. They published their results yesterday in Nature.
The results of the two studies are quite concordant, according to Shirlee Wohl, a graduate student at Harvard University and one of the lead authors of the US study. "It is encouraging to see that two datasets from two different countries lead us to complementary conclusions about the evolution of the virus," she told GenomeWeb in an email. "This underscores the usefulness of viral whole genome sequencing to understand [the] spread and evolution of Ebola virus during an epidemic."
Both studies, she said, showed that the virus split into two lineages that mixed early during the outbreak and evolved separately later on, supporting the idea that "there was minimal exportation of the virus from Sierra Leone to other countries during the majority of the outbreak."
In addition, the two studies found that the virus changed more slowly over a longer period of time than during a short window of rapid evolution at the beginning of the epidemic, which both groups attributed to purifying selection that weeded out deleterious mutations over time.
To date, the Ebola epidemic in West Africa has resulted in more than 27,000 cases and has claimed more than 11,000 lives. It likely started when a bat transmitted the virus to a boy in Guinea in late 2013. Since then, the virus has spread only from human to human, mainly in Guinea, Sierra Leone, and Liberia.
To track the origin and transmission of the virus at the genomic level, Sabeti and her collaborators initially sequenced Ebola genomes from 78 patients in Sierra Leone obtained early during the outbreak. They published results from that study last summer.
For their new paper, the researchers sequenced 232 additional Ebola genomes collected between June and December of 2014 and analyzed them together with 86 existing genomic sequences. They did not include 175 Ebola genomes from Sierra Leone that were published by a Chinese team last month but said that those data would probably not have changed their main findings.
The new samples came from two laboratories in Sierra Leone, the Kenema Government Hospital and a US Centers for Disease Control and Prevention lab in Bo. They were sequenced at the Broad Institute, using Illumina technology, and the genomes were assembled and analyzed using a new computational pipeline that is available as open-source software.
The researchers actually attempted to sequence samples from more than 550 patients, but many samples failed because there was a six-month delay in sample shipments that led to the degradation of nucleic acids in the samples. The delay was caused by the introduction of new virus inactivation protocols, required because of heightened scrutiny from US agencies "at a time when safety protocols were not uniformly standardized," Daniel Park, another lead author of the study and a Broad Institute researcher, told GenomeWeb.
A future goal, the authors wrote, is to learn which inactivation protocols are best suited for high-quality genome sequencing and to establish sequencing capabilities in outbreak countries, so that no Ebola samples would need to be shipped abroad. Researchers recently started sequencing Ebola genomes locally, including in Liberia, Sierra Leone, and Guinea.
By tracking Ebola genomes over a seven-month period, "we're beginning to see signs of what we might expect when a virus is allowed to evolve over the long term in the human population," Wohl said. For example, potentially damaging mutations in protein-coding regions that accumulated early during the outbreak are starting to disappear.
There are also signs of long-term infection, such as mutations in viral cell surface proteins that might interact with the human immune system, and Ebola genome editing by the human host. "The important takeaway message is that the virus is beginning to evolve in the manner of an endemic human pathogen," Wohl said. "We need to shut that down immediately."
Another important finding of the study is that after two viral lineages were initially introduced to Sierra Leone in May or early June of 2014, there was little border crossing of Ebola between Sierra Leone and its neighbors, according to Wohl. "This information confirms that border control was effective at stopping the virus from crossing those country borders," she said. A third lineage, derived from one of the two initial ones, emerged in Sierra Leone in mid-June and became dominant in that country, accounting for 97 percent of the genomes analyzed in the study.
The fact that all viruses in Sierra Leone seem to derive from the same origin also suggests that the virus is only transmitted between humans, and no further infections from animals contributed to the epidemic, Wohl said.
The Broad researchers continue to sequence additional samples in the US, she said, both samples from the Kenema Government Hospital and samples that initially failed because of their low quality. "We hope that these additional samples, in conjunction with samples collected and sequenced by other groups, will fill in our understanding of transmission and spread of the virus."
One thing the group is adamant about is the early release of Ebola genome data prior to publication. "As we generated our data, we made it immediately available," Wohl said, and the data were used by others working on diagnostics, vaccines, therapies, and Ebola surveillance. Lack of data from the Ebola outbreak "is a tremendous problem," she said, in particular the lack of clinical data.
"There's some heterogeneity amongst those who are sequencing this virus as to how quickly data is released," Park said. "In the future, it will be important to set up systems that incentivize this."
The European Mobile Laboratory Consortium team, for its study, sequenced 179 patient samples processed by its laboratory in Guéckédou, Guinea, all collected within an 11-month period starting in March 2014 and sent back to Europe for sequencing. Most samples came from patients in Guinea, with a few from Sierra Leone and Liberia, and the patients chosen for sequencing had a relatively high viral load and about 80 percent mortality. Like the US group, the researchers included previously published Ebola genomes in their analysis.
Phylogenetic analysis showed an initial lineage from early cases in Guinea in March 2014. This lineage stayed largely confined to Guinea and "was almost successfully contained in May 2014 by the intervention of the multi-agency response," the authors wrote.
In May and June, a second lineage emerged, which spread into Sierra Leone, Liberia, and further into Guinea and was responsible for the large epidemic in these three countries that persists today.
The analysis, though retrospective, is "a really accurate way to look at virus spread because we have location data, time data, and outcome data," lead author Miles Carroll told GenomeWeb earlier this month.
The data showed that the outbreak was initially "relatively well contained" in Guinea, he said, but before it could be controlled completely, someone carried the virus to Freetown in Sierra Leone, where it caused an explosion of cases in May and June. It then traveled back from Freetown to Guéckédou in Guinea, causing "a much bigger second wave of outbreak, which killed many more people than the first," he said.
Another objective of the study was to understand the mutation rate of the virus, because "there were lots of scare stories in the media that it was mutating twice as quickly as previous outbreaks," he said, which turned out to be wrong. His team also found no correlation between increases or decreases in mortality and particular virus clusters.
The researchers also looked for mutations in the viral glycoprotein, which is the target of vaccines and immunotherapies, and while there were some, the protein appears to be relatively stable, which "is good news for vaccine developers," Carroll said. They also found mutations in the viral polymerase, the target of a drug candidate, "but we need to test those mutations to see if they confer resistance to the drug," he said.
In the meantime, the EMLab has set up a sequencing laboratory in Guinea that uses Oxford Nanopore's MinION technology and tracks down the transmission routes for the last Ebola cases.