Phylogenomics and Spatiotemporal Dynamics of Bovine Leukemia Virus Focusing on Asian Native Cattle: Insights Into the Early Origin and Global Dissemination

Bovine leukemia virus (BLV), the causative agent of enzootic bovine leukosis, is currently one of the most important pathogens affecting the cattle industry worldwide. Determining where and in which host it originated, and how it dispersed across continents will provide valuable insights into its historical emergence as the cattle pathogen. Various species in the Bos genus were domesticated in Asia, where they also diversified. As native cattle (taurine cattle, zebu cattle, yak, and water buffalo) are indigenous and adapted to local environments, we hypothesized that Asian native cattle could have harbored BLV and, therefore, that they were important for virus emergence, maintenance, and spread. In this study, phylogeographic and ancestral trait analyses—including sequences obtained from Asian native cattle—were used to reconstruct the evolutionary history of BLV. It was shown that, since its probable emergence in Asia, the virus spread to South America and Europe via international trade of live cattle. It was inferred that zebu cattle were the hosts for the early origin of BLV, while taurine cattle played the significant role in the transmission worldwide. In addition, the results of positive selection analysis indicate that yak had a substantially minor role in the transmission of this virus. In this study, endogenous deltaretrovirus sequences in bats, collected in Asian countries, were also analyzed on whether these sequences were present in the bat genome. Endogenous deltaretrovirus sequences were detected from bat species endemic to specific regions and geographically isolated for a long time. Endogenous deltaretrovirus sequences from these geographically isolated species represent ancient exogenous deltaretroviruses distributions. The phylogenetic analysis revealed that these newly obtained endogenous deltaretrovirus sequences were closely related to those of BLV from Asian native cattle, indicating that BLV-related ancient deltaretroviruses circulated in Asia long before the emergence of BLV. Together, our analyses provide evidence for origin and spatiotemporal dynamics of BLV.

Bovine leukemia virus (BLV), the causative agent of enzootic bovine leukosis, is currently one of the most important pathogens affecting the cattle industry worldwide. Determining where and in which host it originated, and how it dispersed across continents will provide valuable insights into its historical emergence as the cattle pathogen. Various species in the Bos genus were domesticated in Asia, where they also diversified. As native cattle (taurine cattle, zebu cattle, yak, and water buffalo) are indigenous and adapted to local environments, we hypothesized that Asian native cattle could have harbored BLV and, therefore, that they were important for virus emergence, maintenance, and spread. In this study, phylogeographic and ancestral trait analysesincluding sequences obtained from Asian native cattle-were used to reconstruct the evolutionary history of BLV. It was shown that, since its probable emergence in Asia, the virus spread to South America and Europe via international trade of live cattle. It was inferred that zebu cattle were the hosts for the early origin of BLV, while taurine cattle played the significant role in the transmission worldwide. In addition, the results of positive selection analysis indicate that yak had a substantially minor role in the transmission of this virus. In this study, endogenous deltaretrovirus sequences in bats, collected in Asian countries, were also analyzed on whether these sequences were present in the bat genome. Endogenous deltaretrovirus sequences were detected from bat species endemic to specific regions and geographically isolated for a long

INTRODUCTION
The rapid growth of the global cattle industry has been associated with changes in the spatial dynamics and types of pathogens in cattle population. Over the years, the domestication of wild cattle and/or the cross-breeding of indigenous animals with domesticated ones, which affect disease dynamics, resulting in emergence of pathogens. Moreover, the increasing global trade of live animals and selective breeding also elevate the risk of dissemination of pathogenic viruses (Perry et al., 2013).
Bovine leukemia virus (BLV) is one of the emerging pathogens in cattle population, and it belongs to the deltaretrovirus genus, Retroviridae family (Aida et al., 2013). BLV causes enzootic bovine leukosis (EBL), which is the most common neoplastic disease with B-cell lymphosarcoma in cattle. Although only a small fraction (3∼5%) of BLV-infected cattle develop lymphosarcoma, EBL is responsible for significant economic losses in dairy and beef cattle farms worldwide. EBL is listed as a disease of importance to international trade by OIE (the World Organization for Animal Health) (Ott et al., 2003;Benitez et al., 2020).
EBL in cattle was first reported in Germany in 1878 (Boeke and Stoye, 1997). The disease gradually spread from the initial endemic area of Memel in East Prussia (now Klaipeda, in Lithuania) along the coasts of the Baltic Sea by approximately 1920 (Sihvonen, 2015). During the two world wars, massive international trades of cattle probably contributed to the spread of BLV across Europe and other countries (Bendixen, 1963). In the mid-twentieth century, EBL had been reported in most cattle raising countries. BLV infections in Western Europe have been eradicated and remain free of infection to this day through continuous surveillance and eradication programs (OIE, 2021). However, BLV is still common in Canada, the United States, and many countries in eastern Europe, South America, and Asia (Polat et al., 2017). Previous detailed phylogenetic studies have indicated that BLV can be divided into at least 10 genotypes associated with different geographic distributions (Polat et al., 2017). For example, genotype 1 is found all over the world, genotype 4 is mainly found in Europe and Russia, and genotypes 6 and 10 are mainly found in Asia (Polat et al., 2017). As cellfree viruses are unstable, close physical contact and exchange of infected lymphocytes, or body fluids, are required for BLV transmission (Constable et al., 2017). Therefore, the worldwide distribution of BLV genotypes between distant geographic locations has been probably driven by the virus spreading through the transport of live infected animals.
Given the wide geographic distribution of BLV, little is known about the place or region where BLV originated. Europe is considered as the historical origin of EBL, however, while taurine cattle (Bos taurus) was introduced to Europe during the Neolithic period (8,000 years ago), the earliest record of the symptoms associated with BLV dates back to only 140 years ago (Scheu et al., 2015). This historical context raises the possibility that the emergence and rapid spread of BLV in taurine cattle populations occurred recently. BLV predominantly infects taurine cattle, but a number of studies reported that it also infects Asian native cattle, such as zebu (B. indicus), yak (B. grunniens) and, sporadically, water buffalo (Bubalus bubalis) (Ma et al., 2016;Wang et al., 2018). However, BLV dissemination routes relative to roles each host species played remain unknown.
Most agree with the thoughts that bats are reservoirs of emerging viruses. It was shown that bats could harbor exogenous retroviruses (Hayward et al., 2020). In fact, bats have been reservoirs for emerging viral diseases, including Hendra virus, Nipah virus, Marburg virus and Ebola virus, severe acute respiratory syndrome (SARS) virus as well as the current pandemic SARS-CoV-2 (Irving et al., 2021). Therefore, bats could be considered as the comparative phylogeography of widespread, co-distributed species, providing unique insights into regional biodiversity and diversification patterns (Roberts, 2006). Previous studies have reported that endogenous deltaretrovirus sequences are present in the bat genome (Farkasova et al., 2017;Hron et al., 2019). Although bat species are distributed worldwide, some species are endemic to specific regions and geographically isolated for a long time (Millien-Parra and Jaeger, 1999). Endogenous deltaretrovirus sequences from these geographically isolated species represent ancient exogenous deltaretroviruses distributions.
The aim of this study was to gain the information on the area, original host, and dispersal route of BLV, focusing in particular on Asian native cattle. In the present study, BLV infections were epidemiologically analyzed, and BLV proviral sequences were obtained from extensively collected samples of Asian native cattle. Bayesian phylogeographic techniques were used to infer the origin and dispersal routes of BLV. We then combined geographic distribution of bat species and phylogenetic relationships of bat endogenous deltaretrovirus sequences with BLV to further validate Asian origin of BLV. The data generated through this study provide evidence that BLV was transmitted from Asian zebu cattle to taurine cattle and was then disseminated throughout the world.

Detection of Proviral Genome and Analysis of the Bovine Leukemia Virus Genetic Lineage in Native Cattle
To analyze the epidemiological distribution and obtain the BLV sequences from Asian native cattle, archived field samples from 256 zebu cattle (B. indicus), 16 native taurine cattle (B. taurus), 268 yaks (B. grunniens), and eight water buffaloes (Bubalus bubalis) collected in 11 Asian countries were used. In addition, samples from 37 zebu cattle in Madagascar were also analyzed. Among these samples, the BLV proviral DNA was detected in zebu cattle samples obtained from seven countries (Bhutan, Cambodia, Laos, Myanmar, Philippine, Vietnam, and Madagascar), native taurine cattle samples from Republic of Kazakhstan, and yak samples from three countries (China, Nepal, and Pakistan), while the proviral DNA was not detected in water buffalo samples (Table 1). These results indicate that BLV is widely distributed in Asian native cattle population (within the Bos genus).

Phylodynamic Analysis of Bovine Leukemia Virus From Asian Native Cattle
To gain insights into the temporal and spatial dissemination of BLV in Asian native cattle, the full length BLVgp51 sequences were analyzed using a Bayesian Markov Chain Monte Carlo (MCMC) phylogeographic approach (Figure 2 and Table 2). The maximum clade credibility (MCC) tree revealed that BLV sequences were segregated into two reciprocally monophyletic clusters, lineage I and lineage II, in 1791 (95% HPD: 1640-1889) and 1817 (95% HPD: 1680-1912), respectively (posterior probability = 1). BLV sequences from most zebu cattle and all yak samples obtained in Asian countries fell into lineage II, while sequences from taurine cattle samples from all over the world fell into lineage I.
To estimate where and which host BLV originated, ancestral traits at all internal tree nodes were reconstructed by a parsimony-based algorithm. The results of the analysis using sampling locations as terminal taxa showed that BLV sequences obtained from Asia reconstructed at the BLV root in 100% (node ID: A in Figure 3B), suggesting that BLV originated in Asia. Similarly, the results of the analysis using host species as terminal taxa showed that zebu cattle reconstructed at the BLV root in 100% (node ID: A in Figure 3A), suggesting that the ancestral host of BLV is zebu cattle. Overall, these results indicate that BLV was likely to have originated in Asian zebu cattle.

Positive Selection Analysis of Bovine Leukemia Virus Sequences in Each Lineage
To investigate whether lineages I and II appeared because of viral adaptation, BLV sequences from each lineage were separately analyzed for positive selection by using the CODEML site test in the Phylogenetic Analysis by Maximum Likelihood (PAML) program package. The non-synonymous (dN) and synonymous (dS) substitution rates of BLV envelope sequences were calculated. The Bayes empirical Bayes calculation of posterior probabilities in PAML identified five amino acids (121, 281, 290, 291, and 301) as being under positive selection with posterior probability >0.50 in the BLV sequences of lineage II ( Figure 4A). The selected sites likely correspond to predicted BLV receptor binding sites ( Figure 4A). Among the sequences classified in linage II, variants in three positively selected sites were only detected in BLV sequences from yaks ( Figure 4B). No evidence of selection was detected for lineage I (data not shown).

Identification of Major Bovine Leukemia Virus Migratory Routes
To analyze BLV migration pathways in detail, the sampling locations were divided into 11 geographic areas ( Figure 5A). In the results of the root state posterior probabilities (RSPP) of the tree location files obtained from BEAST analysis, Southeast Asia showed the highest RSPP value, indicating that the BLV progenitor is likely to have emerged in Asia (PSPP = 0.277, Figure 5B). Using the discrete phylogeographic model, the BLV dataset analysis identified a total of 46 well-supported transmission routes through the Bayes factor (BF) test using a cutoff value of three (Table 3). Based on the results of the ancestral trait reconstruction (Figure 3B), following the initial emergence of BLV in Asia, the progenitor of BLV lineage I reached Europe in 1824 (95% HDP, 1697-1912, node ID: C), and it subsequently reached South America in 1850 (95% HDP, 1744-1923, node ID: D). Then, the spread of BLV from South America to the United States and Mexico began in 1942 (95% HDP, 1913(95% HDP, -1972 node ID: D) (Figure 3 and Table 2). Potential transcontinental BLV spreads were also detected from North America to Japan, from Europe to Russia, and from Southeast Asia to China in 1950 (Table 3 and Figure 3, summarized in Figure 5C). The Bayesian skyline plot analysis indicated that the effective population size significantly increased during the mid-1900s, followed by a slight decline after the 2000s ( Figure 5D).

Endogenous Deltaretrovirus Sequences From Asian Bats
To obtain information about the long-term evolution of BLV in Asia, archived bat samples, which were collected in Asian countries were analyzed for the presence of endogenous deltaretrovirus sequences in their genome. Specifically, the Frontiers in Microbiology | www.frontiersin.org FIGURE 1 | Maximum likelihood (ML) phylogenetic tree of BLVgp51 partial sequences. ML phylogenetic tree of the BLVgp51 partial sequences (444 bp) derived from 699 bovine leukemia virus (BLV) sequences. The tree was constructed using the BLVgp51 sequences from 13 native cattle in this study and 686 sequences from known BLV. The year and the location of sequence data collected were obtained from the GenBank database. Genotypes 1 through 10 are indicated by the letter "G." Native cattle are indicated by large circles, with zebu cattle and yak being identified by blue and gray colors, respectively; the sequences obtained in this study are surrounded by a black border. Taurine cattle are indicated by a small circle, and the sequences obtained in this study are represented by a white circle.  The node abbreviations "G1-G10" and "A-I" indicate the BLV genotypes and the node ID, respectively, based on the coalescent event (node ID; Table 2). The sequences obtained in this study are marked by a black circle.
FIGURE 3 | Time scale phylogenetic tree and ancestral state reconstruction. Ancestral states of hosts (A) and locations (B) in bovine leukemia virus (BLV) were estimated using the maximum parsimony method in Mesquite using the MCC tree. On the left, the node pie-charts and branch colors for the host species indicate taurine cattle, zebu cattle, and yak, respectively; on the right, the node pie-charts and branch colors for the locations show Asia, South America, Europe, United States/Mexico, and Africa, respectively. The posterior probability of the MCC tree is indicated by the size of the black circle at each node. The node abbreviations (A-I) indicate the node ID based on the coalescent event (node ID; Table 2).
DNA samples of 41 bat species from eight bat families found in 10 Asian countries were examined (  -Parra and Jaeger, 1999). Moreover, the ML phylogenetic analysis using partial LTR sequences obtained from the analyzed samples revealed that endogenous deltaretrovirus sequences obtained from bats endemic to Asia were closely related to the BLV LTR of lineage II (Figure 6).
As some endogenous deltaretrovirus positive bat species were diverged and distributed in Asia during the late Miocene to Pliocene, these results indicate that BLV-related ancient deltaretroviruses circulated in Asia long before the emergence of BLV (Figure 6).  : 48, 50, 77, 98, 99, 112, 120, 135, 136, 187, 292, and 298). (B) The amino acid composition of the positive selection of zebu cattle and yak in BLV lineage II is visualized in Weblogo.

DISCUSSION
Bovine leukemia virus is highly prevalent in cattle worldwide, however, little is known about its emergence and evolutionary history. This in part reflects the fact that our current understanding of BLV diversity comes almost exclusively from domesticated, phenotypically and genetically selected taurine cattle. Native cattle are highly adapted to local environments and resources and are not so often exported to other countries compared to the major commercial cattle breeds (Ajmone-Marsan et al., 2010). Therefore, it was hypothesized that Asian native cattle could have harbored BLV, causing them to be important for virus emergence, maintenance, and spread (Chen et al., 2010;Qiu et al., 2012;Decker et al., 2014). This study represents the first analysis of genetic diversity of BLV in a wide range of native cattle population in Asia, and the first application of Bayesian phylogeographic analysis to gain new insights into the epidemiological and spatiotemporal dynamics of BLV. Together with the results from endogenous deltaretrovirus sequences obtained from the genome of indigenous Asian bats, these data provide evidence for origin and evolutionary dynamics of BLV. The results of the phylodynamic and ancestral trait reconstruction analyses reveal that progenitor of BLV originally circulated in Asia ( Figure 3B). This conclusion was also supported by the high RSPP value for Southeast Asia obtained from the BEAST analysis (RSPP = 0.227 for Southeast Asia; Figure 5B). Although our results are not consistent with the report of the first EBL cattle in Germany, Asia is still a potential geographic region where the emergence and maintenance of BLV could have occurred (Bendixen, 1963). BLV exhibits a relatively narrow host range and predominantly infects the Bos genus. Not only taurine cattle, but also diverse Bos species such as zebu, banteng, and gaur originated and were domesticated in Asia (Fuller, 2006;Melletti and Burton, 2014; Perez-Pardal et al., 2018). These species have been extensively cross-bred with other species of the Bos genus through the process of adaptation. This historical background may explain why the phylodynamic model inferred that Southeast Asia was the ancestral location of BLV origin. Further studies, including surveillance of diverse Bos species using a wide range of samples, will be needed to determine the more detailed geographic origin of BLV.
The results of ancestral trait reconstruction analysis reveal that zebu cattle were the source of BLV introduction into the taurine cattle population (Figure 3). This is also supported by the identification of a monophyletic clade, mainly constituted by zebu cattle and yak BLV (lineage II), sharing a MRCA with taurine BLV (lineage I). Lineage II also includes yak sequences, indicating that this animal should also be a source of cross-species transmission to taurine cattle. However, from the results of the Bayesian phylodynamic analysis and ancestral trait reconstruction analysis of lineage II, zebu cattle were reconstructed at the MRCA of the lineage II root in 100% (node ID: H and I in Figure 3A), suggesting that BLV in the yak  Figure 5C).
population is the result of recent cross-species transmission from zebu cattle. These results are further supported by the positive selection analysis, which showed that the BLVgp51 sequences from yak displayed a higher diversity and positive selection sites near predicted receptor binding sites (Figure 4). As the habitat of yak is limited to high altitude areas, it is concluded that the role of this animal played for the transmission cycle may be limited (Qiu et al., 2012).  The results of the Bayesian phylodynamic analysis and ancestral trait reconstruction analysis of lineage I indicated that the ancestral BLV sequence of Asian zebu cattle was introduced to taurine cattle not only in South America, but also in Europe (Figures 3B, 5, and route No. 31 of Table 3). Indeed, in the eighteenth century, zebu cattle were extensively exported from Asia to South America, mainly to Brazil, and were cross-bred with local taurine cattle for genetic improvement (Felius et al., 2014). Our results indicate that this process may have contributed to the spread of BLV in the taurine population. On the other hand, it remains unclear how the BLV ancestor reached European countries, because zebu cattle were not introduced in large numbers to European countries. A possible explanation of the BLV spread in Europe could be that in the eighteenth century, western European countries were directly involved in trade with Asian countries as well as with South America via the Atlantic Ocean. Their major transit ports were used during transit on the route from Asia to South America. As a result, large numbers of live cattle from diverse countries in Asia, South America, and Europe may have temporarily been held in the same port, or in quarantine facilities, causing the risk of increased virus transmission in Europe (Acemoglu et al., 2005). It is known that zebu cattle that were en route to Brazil from India introduced the rinderpest to Belgium via the port of Antwerp (Mammerickx, 2003). After the spread of BLV in Europe, genotype 4 spread to Russia, diversified to genotype 7, and was thus maintained, while it was eradicated in Europe (Polat et al., 2017).
The present analysis, however, identifies taurine cattle as the host responsible for the later dissemination of BLV worldwide over time. This probably began after World War II, and it coincided with a drastic increase of live cattle trade between continents (Ajmone- Marsan et al., 2010). The phylogenetic analysis suggests that the BLV genotype 1 is predominantly distributed worldwide as a pandemic genotype, and its source location is South America (Figure 3, MRCA: D) (Polat et al., 2017). The global dispersal of genotype 1 appears to have occurred in two steps, starting with the widespread export of the virus from South America via the United States to the rest of the world around 1950, followed by local diffusion within the countries where it was introduced (Figure 3). The initial step coincided with the worldwide distribution of the established commercial breeds to many other countries, and the second step coincided with the cross breeding with local populations (Polat et al., 2017). The phylogenetic analysis also demonstrates that the significant increase in genetic diversity observed during the late-1900s, coincided with the increase of international animal trade activities (Figure 5D; Ajmone-Marsan et al., 2010;Felius et al., 2014). This increase in genetic diversity, which occurred worldwide, may be explained by the adaptation of BLV to the local cattle that presented diverse genetic backgrounds (Figure 3). The increase was then followed by a notable decline, which was associated with the successful eradication programs adopted in European countries (Polat et al., 2017).
In this study, endogenous deltaretrovirus sequences, which were closely related to BLV lineage II, were detected in the bat genome of several indigenous Asian species (Table 4). Previous studies suggest that bat species were infected with the ancestors of endogenous deltaretroviruses around the end of the Paleogene and beginning of the Neogene (Hron et al., 2019). The inferred date corresponds closely with the adaptive radiation of several bat clades and species including endogenous deltaretrovirus positive bat species; for example, the tMRCA of the Myotis and the Hipposideros Asian clade is estimated to be 8.8 MYA and 19.9 MYA, and the time of migration and genetic isolation of the Japanese population of Rhinolophus nippon is estimated to be 0.5 MYA (Table 4; Ruedi et al., 2013;Amador et al., 2018;Ikeda and Motokawa, 2021). On the other hand, molecular dating estimates in previous studies have suggested that the tribe Bovini rapidly diversified into the three subtribes-Bovina, Bubalina, and Pseudorygina-during the late Middle Miocene (around 13 MYA) (Melletti and Burton, 2014). As these ancestors of bats and Bovini subtribes are currently more diversified in Asia, it can be assumed that ancient exogenous deltaretroviruses would circulate, endogenized to bat genome and would be maintained as they evolve at the host mutation rate of evolution. The LTR sequences of endogenous deltaretrovirus were also detected from bat endemic to Ryukyu islands. Ryukyu islands became isolated from mainland during the whole Quaternary (Millien-Parra and Jaeger, 1999;Yamada, 2017). It is possible that transmission of ancient deltaretrovirus occurred directly between animals of the tribe Bovini and bat species or between hosts via an intermediate host, or were transmitted to both hosts from unknown host. Regardless of the route of transmission, having the result of our study as a presumption that ancient exogenous deltaretrovirus were circulated in Asia before the isolation of Ryukyu islands, we can assume that BLV-related ancient deltaretroviruses in the past would circulate in Asia long before the emergence of this virus. Interestingly, while human T-lymphotropic virus (HTLV) was inferred to have emerged as a result of cross-species transmission of simian T-lymphotropic virus (STLV) from animal reservoirs in Africa, and several previous studies pointed out that STLV originated in Asia. Therefore, these observations suggest that this region is likely to be the center of origin of deltaretroviruses (Slattery et al., 1999;Reid et al., 2016;Afonso et al., 2019). Further analysis is needed to detect other natural hosts of BLV, because the infection rate in zebu cattle is relatively low for the maintenance of deltaretrovirus as a reservoir (Mwenda et al., 1999;Takemura et al., 2002;Traina-Dorge et al., 2005;Jegado et al., 2019).
This study has several limitations, as the phylodynamic models were constructed using BLVgp51 sequences, and the evolutionary history of other gene sequences was not included. Moreover, the host species of most of the analyzed sequences collected before 2000 were represented by taurine cattle. Although the topologies of ML phylogenetic tree of four BLV open reading frames were mostly compatible with each other, and it was shown to be appropriate for the trait analyses conducted in our previous study, the analysis of full length genomes from various species collected on different years could also provide basic information to deepen the long history of viral evolution (Ohnuki et al., 2021).
In summary, by analyzing BLV sequences from Asian native cattle and endogenous deltaretrovirus sequences from indigenous Asian bat species, it was inferred that ancient BLV emerged and circulated in Asia long before it was detected in Europe in the late 1800s. The potential cross-species transmission of BLV to taurine cattle, possibly through Asian zebu cattle, is likely to have occurred around the late 1800s in South America and Europe. These results can not only elucidate the mechanisms of BLV dissemination worldwide, but also provide new insights into the evolution of deltaretroviruses.

Bovine Sample Collection
Blood samples were collected from 256 zebu cattle in six Asian countries. Each sample was unrelated and taken from one head per farm. The detailed sampling method has been described in our previous work (Lwin et al., 2018b). In addition, samples from 37 zebu cattle in Madagascar were also collected, because this population is closely related to Asian zebu populations (Table 1; Kaufman, 2008). Blood samples were also collected from 16 taurine cattle (Bos taurus) in the Republic of Kazakhstan. Blood samples and skin tissue samples from 268 yaks (Bos grunniens) and 16 water buffaloes (Bubalus bubalis) were collected in four countries (Kyrgyzstan, China, Pakistan, and Nepal) (Guo et al., 2006;Qiu et al., 2012). The DNA extraction procedure and the information on samples have been included in our previous study (Yonesaka et al., 2016;Lwin et al., 2018a,b).

Detection of Bovine Leukemia Virus Proviral DNA and Sequencing Analysis
A 1304 bp fragment of the complete BLVgp51 gene was amplified by PCR using KOD FX Neo (TOYOBO, Osaka, Japan) with following primers (Forward Primer 5 -AGA TGG GAG CTA CAC CAT TCA-3 , Reverse Primer 5 -CAC AGA GGC CAC ATT AAG A-3 ) to detect BLV proviral sequences in the genome of zebu cattle, taurine cattle and water buffaloes. The following cycling conditions were used: 94 • C for 2 min, followed by 40 cycles of denaturation at 98 • C for 10 s, and extension at 68 • C for 60 s. The amplified PCR fragments were purified by QIAquick gel extraction kit (QIAGEN, Hiden, Germany), and were then sequenced (FASMAC, Atsugi, Japan). The sequences included a 903 bp sequence coding the complete BLV envelope GP51 region, corresponding to nucleotide positions 4826-5738 on the pBLV903 genome (GenBank accession number EF600696). The obtained sequences were analyzed in Sequencher TM 5.2.4 (GeneCodes, Michigan, United States).
A 198 bp fragment of the LTR of the BLV proviral genome in yak was amplified by nested PCR using PrimeSTAR GXL DNA polymerase (Takara, Shiga, Japan) with the following primers: BLTR256F (5 -GAG CTC TCT TGC CCG AGA C-3 ) and BLTR453R (5 -GAA ACA AAC GCG GGT GCA AGC CAG-3 ) for first PCR, and BLTR306F (5 -GTA AGG CAA ACC ACG GTT T-3 ) and BLTR408R (5 -AGG AGG CAA AGG AGA GAG T-3 ) for second PCR, as previously described (Polat et al., 2015). The following cycling conditions were used: 45 cycles of denaturation at 98 • C for 10 s, annealing at 60 • C for 15 s, extension at 68 • C for 30 s.

The Maximum Likelihood Phylogenetic Tree
A total of 699 BLVgp51 partial sequences (444 bp) as well as collection date and location data were retrieved from GenBank. These sequences were aligned with 13 sequences obtained in this study using MEGA7 software (Kumar et al., 2016). The best-fitting nucleotide substitution model was selected using the Akaike information criterion corrected (AICc). A ML phylogenetic tree of the BLVgp51 sequences was inferred by applying the Kimura 2-parameter model plus gamma distribution (K2+G) model. The reliability of the phylogenetic relationships was evaluated by non-parametric bootstrap analysis with 1,000 replicates.

Phylogenetic Reconstruction and Phylodynamic Analysis of Bovine Leukemia Virus
To analyze the evolutionary history of BLV, the time to the most recent common ancestor (tMRCA) and the effective population size were estimated by employing the Bayesian MCMC approach in BEAST v 2.4.8 (Bouckaert et al., 2014). Coalescent theory has been applied to phylogenetic methods to reconstructing the epidemic history of viruses, providing estimates of origins, dating of common ancestors, and population dynamics. The best nucleotide substitution model for the Bayesian MCMC-derived phylogenetic tree was selected based on the Bayesian information criterion (BIC) values in MEGA7 (Kumar et al., 2016). The Hasegawa-Kishino-Yano (HKY) model that showed the lowest BIC value was used for this analysis. The temporal scale of the evolutionary process was inferred from the sampling dates of the sequences using relaxed clock models (exponential and lognormal relaxed clocks) and four demographic models (two parametric priors, coalescent constant population and coalescent exponential population; two non-parametric priors, coalescent extended skyline and coalescent Bayesian skyline). Therefore, the best fitting demographic models were selected based on the BF test and Akaike's information criterion for MCMC (AICM) values using Tracer v1.5 (Rambaut and Drummond, 2007). The BF and AICM analyses also showed that the exponential relaxed clock was better than the lognormal one in fitting the data. Also, under the exponential relaxed clock, the BF and AICM analyses showed that the coalescent exponential population fitted the data better than other demographic models did. For the Bayesian MCMC analysis, HKY+exponential relaxed clock+coalescent exponential population were selected. A total of 156 BLVgp51 sequences (sequence length, 502-903 bp) were used for the analysis in BEAST. First, the sequences with collection date, country, and host information were obtained from the database. A 75% random sampling was performed for countries with more than 10 sequences registered, because the BLV database was biased toward many closely related sequences in the same country. Then, Bayesian MCMC simulations for 400 million cycles, sampling every 10,000 states, and 10% burnin were applied to infer the BLV evolutionary phylogenetic tree under the best fitting model. The convergence was assessed by estimating the effective sample size (ESS) (>200) of a parameter sampled from the Bayesian MCMC using Tracer v 1.5 (Rambaut and Drummond, 2007). The maximum clade credibility (MCC) tree was constructed using TreeAnnotator v1.8.0 (Rambaut and Drummond, 2016), and finally, the MCC tree was generated in FigTree v 1.4.4 (Rambaut, 2019).

Ancestral State Reconstruction
To study the evolutionary history of BLV, ancestral state reconstruction analyses were performed using Mesquite v 3.61 (Maddison, 2016). The BLVgp51 MCC tree previously estimated in this study (Figure 2) was used for the input data. The ancestral states of each BLV in relation to its host and geography were reconstructed using a parsimony approach. The minimum number of trait changes along the tree that are necessary to explain the present host state and sampling location associations at the tree tips were calculated to estimate the evolutionary process.

Positive Selection Analysis
Of the 156 BLVgp51 sequences (903 bp) used in this study, 63 were classified as lineage II, and were used to test the positive selection. The phylogenetic trees were reconstructed by employing the Bayesian MCMC method in BEAST v 2.4.8, and the resulting trees were then used for detecting positive selection. For the detection of positive selection pressure, the analysis was performed with the PAML 4.8 site model. Amino acid sites under positive selection pressure were identified by posterior probabilities through the Bayes empirical Bayes (BEB) method (Yang, 2007). The selected sites were compared to those obtained from the previous study of predicted BLV receptor binding sites (Corredor et al., 2018).

Spatial Phylogenetic Reconstruction of Evolutionary Dynamics
The global geographic origins of BLV and its significant dispersal routes between infected countries were inferred using discretestate ancestral reconstruction methods in BEAST v 2.4.8. Each BLV isolate was assigned an individual location state, based on the country/region of the sample location data. Diffusions among the location states were modeled using an asymmetric model, with Bayesian stochastic search variable selection. The same settings as those described for the Bayesian phylogenetic dating analyses above were used. The resulting diffusion rates were then used to calculate BF in SPREAD (Bielejec et al., 2011). The migration pathways were interpreted as "supported" when the BF was ≥ 3. To minimize the impact of sampling bias, countries with a sample size of 10 or more were kept below 20% of the sample.

Detection of Bat Endogenous Deltaretrovirus and Its Relationship to Bovine Leukemia Virus
To assess the presence of endogenous deltaretrovirus sequences, DNA or tissue samples obtained from 46 bats originated in Asia-comprising 41 species and eight families were analyzed. The DNA was extracted by phenol chloroform method. The LTR and gag regions of endogenous deltaretroviruses were amplified by PCR using PrimeSTAR GXL DNA polymerase (Takara, Shiga, Japan) (information on the specific primers used is included in Supplementary Table 1). The following cycling conditions were used: 40 cycles of denaturation at 94 • C for 10 s, annealing at 60 • C for 15 s, and extension at 68 • C for 45 s. The amplified fragments were purified by QIAquick gel extraction kit (QIAGEN, Hiden, Germany), and were then sequenced (FASMAC, Atsugi, Japan). All amplicons were sequenced directly, and sequences with ambiguous positions excluded from further analysis. To determine the evolutionary relationships of the obtained sequences among endogenous and exogenous deltaretroviruses, the LTR sequences were aligned using ClustalW in MEGA7, and a phylogenetic tree was estimated using the ML method with the K2+G model of nucleotide substitution. To evaluate the robustness of each node, bootstrap resampling analysis was performed with 1,000 replicates.

DATA AVAILABILITY STATEMENT
The datasets presented in this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found in the article/ Supplementary Material.

ETHICS STATEMENT
The animal study was reviewed and approved by Kobe University Animal Experimentation Regulations and the Animal Research Committee at Tokyo University of Agriculture, and we confirm that all experiments were performed in accordance with the committee's guidelines and regulations. Written informed consent was obtained from the owners for the participation of their animals in this study.