Phylogeographic Analysis and Identification of Factors Impacting the Diffusion of Foot-and-Mouth Disease Virus in Africa

Foot and mouth disease (FMD) is endemic in sub-Saharan Africa and can lead to important and continuous economic losses for affected countries. Due to the complexity of the disease epidemiology and the lack of data, there is a need to use inferential computational approaches to fill the gaps in our understanding of the circulation of FMD virus on this continent. Using a phylogeographic approach, we reconstructed the circulation of FMD virus serotypes A, O and SAT2 in Africa and evaluated the influence of potential environmental and anthropological predictors of virus diffusion. Our results show that over the last hundred years the continental circulation of the three serotypes was mainly driven by livestock trade. Whilst our analyses show that serotypes A and O were introduced into Africa through livestock trade, the SAT2 serotype probably originates from African wildlife populations. The circulation of serotype O in eastern Africa is impacted by both indirect transmission through persistence in the environment and anthropological activities such as cattle movements.

bioRxiv preprint first posted online Jun. 28, 2018; doi: http://dx.doi.org/10.1101/358044. The copyright holder for this preprint (which was not peer-reviewed) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license.

Phylogeographic tree reconstruction for serotype A
The reconstructed phylogeographic tree of the African serotype A viruses has a time to most recent common ancestor (TMRCA) of around 1926 (1889.6-1950), with a geographic origin in the eastern part of Africa and high posterior probabilities for Kenya (49.83%) and Ethiopia (35.95%) (see fig. 1a). For serotype A there is no clear clade separation between the western and eastern sides of Africa, as the first isolated clade combines all the western African sequences as well as sequences from Sudan, Ethiopia and Egypt. Although a few transmission events are observable between the two sides of Africa, all of them involve Sudan as the link.

Phylogeographic tree reconstruction for serotype O
The TMRCA of the African serotype O is estimated to be 1937 (1921-1952) [...] West and Central African countries (Cameroon, Nigeria, Niger and Togo) and seems to originate from Sudan. Overall, the situation for serotype O is quite similar to that of serotype A, with only a few observed transmissions between the eastern and western sides of Africa and with Sudan acting as the link between them.

Phylogeographic tree reconstruction for serotype SAT2
The diversity of serotype SAT2 viruses is much greater than that of serotypes A and O, and the TMRCA is much older, estimated as 1583 (95% HPD: 1440-1722). Due to these long timescales and the low posterior probabilities on the location, it is difficult to estimate a location of origin (see fig. 1c).
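As an illustration of how an interval such as the 95% HPD quoted for the TMRCA can be read, the following minimal sketch computes the shortest interval containing a given share of posterior samples. It is not the software used in this study; the function name `hpd_interval` and the toy sample values are hypothetical and serve only to show the principle.

```python
import math


def hpd_interval(samples, mass=0.95):
    """Return the shortest interval that contains `mass` of the samples."""
    if not samples:
        raise ValueError("samples must not be empty")
    xs = sorted(samples)
    n = len(xs)
    # Number of consecutive sorted samples the interval has to cover.
    k = max(1, math.ceil(mass * n))
    best_lo, best_hi = xs[0], xs[k - 1]
    # Slide a window of k samples over the sorted values, keep the narrowest.
    for i in range(1, n - k + 1):
        lo, hi = xs[i], xs[i + k - 1]
        if hi - lo < best_hi - best_lo:
            best_lo, best_hi = lo, hi
    return best_lo, best_hi


# Hypothetical root-age samples (calendar years), NOT taken from the study.
toy_years = [1440, 1500, 1550, 1570, 1580, 1583, 1590, 1600, 1650, 1722]
print(hpd_interval(toy_years, mass=0.5))
```

A wide interval, as reported here for SAT2, indicates that the posterior samples of the root age are spread over several centuries.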

Five geographically defined main clades, with location posterior probabilities above 45%, can be observed. The first clade is exclusively composed of sequences from Botswana, Namibia and Zimbabwe and seems to have its origin in the first half of the 19th century. The second clade is composed of Ethiopian, Kenyan, Ugandan and Tanzanian sequences and seems to originate at the transition between the 19th and 20th centuries. The third clade seems to have emerged at the end of the [...]
[...] countries (see supp. [...])
Environmental and anthropological factors affecting FMDV
Predictive factors for FMDV diffusion using a discrete location approach
[...] trees. We considered the set of predictors to be either 'conductors', i.e. enhancing viral diffusion, or 'resistors', i.e. impeding viral diffusion. We observed that the diffusion process was enhanced by the average daily temperature (BF 4), the logarithm of the cattle density (BF 4) and human densities (BF 9) (see table 2). It was impeded by the accessibility (BF 8), the distance between sampled locations (BF 8), the average amount of precipitation per year (BF 7) and by the average daily temperature (BF 7) (for all the results see supp. tables 9 and 10). To gain a better understanding of the impact of the average temperature and precipitation on the viral diffusion, we selected different thresholds of precipitation and temperature to parametrize our GLM analysis (see supp. tables 11 and 12). We detected that low precipitation values (< 80 mL/year) were associated with an impeding (negative) impact on the viral diffusion process, whereas high precipitation was associated with an enhancing (positive) effect. We also observed that in the case of low [...]

[...] by the presence of fragmented cropland (BF 6). We were not able to detect a predictor with an enhancing (positive) influence on the diffusion process (see table 2).
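The Bayes factors (BF) quoted for each predictor can be related to how often a predictor is included in the model across posterior samples. The sketch below uses the common formulation of a Bayes factor as posterior odds divided by prior odds of inclusion. The text does not state the exact computation or the prior used, so the function names, the 0/1 indicator samples and the prior inclusion probability of 0.5 are assumptions made purely for illustration.

```python
def posterior_inclusion_probability(indicators):
    """Share of posterior samples in which the predictor's 0/1 indicator is on."""
    if not indicators:
        raise ValueError("indicators must not be empty")
    return sum(indicators) / len(indicators)


def inclusion_bayes_factor(posterior_prob, prior_prob):
    """Bayes factor for including a predictor: posterior odds / prior odds."""
    for name, p in (("posterior_prob", posterior_prob), ("prior_prob", prior_prob)):
        if not 0.0 < p < 1.0:
            raise ValueError(f"{name} must lie strictly between 0 and 1")
    posterior_odds = posterior_prob / (1.0 - posterior_prob)
    prior_odds = prior_prob / (1.0 - prior_prob)
    return posterior_odds / prior_odds


# Hypothetical indicator samples for one predictor (NOT data from the study).
toy_indicators = [1, 1, 1, 0]
p_post = posterior_inclusion_probability(toy_indicators)
# Assumed prior inclusion probability of 0.5, chosen only for this example.
print(inclusion_bayes_factor(p_post, prior_prob=0.5))
```

Under this formulation, a larger BF means the data support including the predictor more strongly than the prior alone would; the direction of the effect (conductor or resistor) comes from the sign of the associated coefficient, not from the BF.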

To gain a better understanding of the role of the fragmented crop and cattle density we isolated the [...]

[...] Africa and North-East Africa. The evolution of the SAT2 serotype seems to be quite different.

According to our results, this serotype seems to have emerged around 1583. Of the 5 main clades that compose the SAT2 phylogeny, only two emerged outside Southern Africa (clades 2 and 5). Those [...] clades. Additionally, these two clades have emerged more recently than those involving Southern African countries (second part of the 19th century/early 20th century for clades 2 and 5, against late 18th century/early 19th century for the other clades).

The SAT2 serotype analysis shows signs of the impact that the African rinderpest epidemics had on FMDV circulation in Africa. Although FMDV was first reported in southern Africa in 1795, it had likely coevolved with buffalo over millennia, resulting in a large, diverse viral pool, but the rinderpest epidemic decimated almost all potential FMDV carriers and probably pushed the virus through a huge bottleneck [36]. It is thought that FMDV re-emerged from wild buffalo populations that survived the [...]

Using both a discrete and a continuous framework, we looked at the effect that diverse environmental [...]

Regarding the effect of the different selected predictors on virus diffusion in a continuous setting, our results suggest that cattle densities above 125 cattle per km² and the presence of cropland (pure cropland or mixed with other types of land) both have a negative impact on virus diffusion. Our results suggest that the virus had difficulty spreading beyond the geographic region located at the root of the tree, where high cattle densities and low crop densities were present, and spreading to areas with low cattle densities but high crop densities, presumably due to a lack of suitable hosts.

However, it is difficult to know exactly whether it is the cattle or the crop density that had the most impact, due to the high correlation of the two variables at the time and in the region of origin.

The location uncertainty found at the root of the continuous tree could explain the differences between the discrete and continuous methods in estimating the effect of the cattle density on virus diffusion.

For our analysis, this uncertainty seems to be interpreted by the SERAPHIM programme as a period during which the virus hardly moves. This seems to drive SERAPHIM to the conclusion that the high cattle densities found near the origin of the epidemic are related to this lack of movement, and therefore to estimate that they have a negative influence on virus diffusion. Although we suspect a link between cattle density and the location of emergence of the analysed clade, we think that the continuous analysis does not offer the resolution needed to understand that relation (i.e. the spatial HPD interval is too large). By parameterizing each rate of among-location movement as a function of predictors, the discrete approach therefore seems more appropriate to characterise the environmental and anthropological effects on virus diffusion in this endemic situation.
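To make the idea of parameterizing each among-location rate as a function of predictors more concrete, the sketch below uses a log-linear form in which every predictor contributes its value, multiplied by a coefficient and a 0/1 inclusion indicator, to the logarithm of the rate. This form is an assumption commonly associated with discrete phylogeographic GLMs and is not spelled out in the text; the function name `glm_rate` and all numerical values are hypothetical.

```python
import math


def glm_rate(predictor_values, coefficients, indicators):
    """Movement rate between two locations under an assumed log-linear GLM.

    log(rate) = sum_k coefficient_k * indicator_k * predictor_value_k
    """
    if not (len(predictor_values) == len(coefficients) == len(indicators)):
        raise ValueError("all three sequences must have the same length")
    log_rate = sum(
        b * d * x for x, b, d in zip(predictor_values, coefficients, indicators)
    )
    return math.exp(log_rate)


# Hypothetical pair of locations with two standardized predictors,
# e.g. cattle density and distance (values invented for illustration).
x_pair = [1.0, 2.0]
betas = [0.5, -1.0]   # positive: 'conductor'; negative: 'resistor'
deltas = [1, 0]       # only the first predictor is included in the model
print(glm_rate(x_pair, betas, deltas))
```

In such a formulation, a predictor with a positive coefficient raises the rate between a pair of locations and one with a negative coefficient lowers it, which matches the conductor/resistor reading used in the discrete analysis.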

In conclusion, the reconstructed phylogeographic tree patterns for FMDV serotypes A, O and SAT2 reflect a situation where recent FMDV circulation is mainly driven by commercial exchanges, through pastoral herd movements, and where wildlife seems to have almost no influence on the intra-continental circulation of the disease. However, the observed patterns for SAT2 reflect a situation where wildlife (wild buffaloes) constitutes the original host of the serotype, whilst the observations for A and O, whose TMRCAs are estimated at around 1926 and 1937 respectively, suggest that those serotypes were imported into Africa at the start of the 20th century. We observed that indirect transmission through the environment and direct transmission through anthropological activities had an enhancing effect on virus diffusion in Eastern Africa.
