Overestimated mutation rate

At the start of the epidemic in West Africa, the Ebola virus did not change as rapidly as was thought at the time. ETH researchers explain why scientists misjudged the rate.

Scientific evidence from the start of the last major Ebola epidemic in West Africa, which suggested that the virus was changing exceptionally rapidly, was probably the result of methodological biases. This has been shown by scientists led by Tanja Stadler, a professor at ETH Zurich’s Department of Biosystems Science and Engineering in Basel; they published the corresponding study in the journal PNAS this week.

When Ebola developed into an epidemic in 2014, an international team of scientists estimated that the pathogen’s genome would change on average every 9.5 days, based on virus samples and computer simulations. This is an atypically high rate of change: normally, the Ebola virus genome mutates at just under half that speed. The high estimate led to fears at the time that if the virus changed rapidly, it could also quickly become more virulent.

However, in later studies, researchers evaluating much larger numbers of virus samples could not confirm the high rate. They showed that when viewed over the whole epidemic, the pathogen only changed at its typical slow speed.

ETH researchers have now shown that the high estimated mutation rates at the start of the epidemic were due to the limited number of virus samples available at the time in combination with the computer models used, which calculate their estimates from the genetic data of virus samples together with underlying assumptions.

“The smaller the amount of genetic data available for a model, the bigger the influence of the underlying model assumptions on the end result,” explains ETH professor Stadler. She highlights that the early estimates for the Ebola epidemic based on a small dataset were heavily influenced by assumptions, which in retrospect have proven to be inaccurate.
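The effect Stadler describes can be illustrated with a deliberately simplified toy calculation. The sketch below is not the method used in the study, which concerns the tree prior in phylogenetic models; it assumes a plain Gamma prior on a Poisson mutation rate, and all names and numbers (TRUE_RATE, PRIOR_SHAPE, PRIOR_RATE, the exposure values) are hypothetical choices made only for illustration. The prior assumes a rate twice as high as the "true" one, and the estimate moves away from that assumption only as more data comes in.

```python
def posterior_mean_rate(prior_shape, prior_rate, mutations, exposure):
    """Posterior mean of a Poisson rate under a Gamma(prior_shape, prior_rate) prior.

    The Gamma prior is conjugate to the Poisson likelihood, so the posterior
    is Gamma(prior_shape + mutations, prior_rate + exposure).
    """
    return (prior_shape + mutations) / (prior_rate + exposure)


# Hypothetical values, chosen only to make the effect visible.
TRUE_RATE = 1.0    # "true" number of mutations per unit of observation time
PRIOR_SHAPE = 4.0  # prior mean = PRIOR_SHAPE / PRIOR_RATE = 2.0 (twice too high)
PRIOR_RATE = 2.0


def estimate(exposure):
    """Rate estimate after observing `exposure` units of time.

    Idealised data: the observed mutation count is set to its expected value,
    so the only source of error is the prior assumption.
    """
    mutations = TRUE_RATE * exposure
    return posterior_mean_rate(PRIOR_SHAPE, PRIOR_RATE, mutations, exposure)


small = estimate(2.0)    # little data: estimate stays close to the prior
large = estimate(200.0)  # much data: estimate approaches TRUE_RATE

if __name__ == "__main__":
    for exposure in (2.0, 20.0, 200.0):
        print(f"exposure={exposure:6.1f}  estimated rate={estimate(exposure):.3f}")
```

With little data the estimate sits roughly halfway between the assumed and the true rate; with a hundred times more data it is almost entirely determined by the observations.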

Current computer models, however, do not simplify reality as much as those used a few years ago, and they are less heavily influenced by the underlying assumptions, says Stadler. For example, the new models no longer assume that everyone infected has the same probability of passing on the pathogen to other people; instead, they take into account different population structures. While the new models – some of which are being developed in the Stadler group – are more complex and require a lot more computation, they provide more accurate results even at the start of an epidemic, when very little genetic data is available. New calculations by the ETH scientists with the genetic data from 2014 show this increase in accuracy.

Reference

Möller S, du Plessis L, Stadler T: Impact of the tree prior on estimating clock rates during epidemic outbreaks, PNAS, 2 April 2018, doi: 10.1073/pnas.1713314115
