Phylogeography of Yersinia ruckeri reveals effects of past evolutionary events on the current strain distribution and explains variations in the global transmission of enteric redmouth (ERM) disease

Phylogeographic patterns and population genetic structure of Yersinia ruckeri, the pathological agent of enteric redmouth disease (ERM) in salmonids, were investigated on the basis of concatenated multiloci sequences from isolates of different phenotypes obtained between 1965 and 2009 from diverse areas and hosts. Sequence analyses revealed genetic differentiation among subpopulations with the largest genetic distance occurring between subpopulations of Europe and Canada and/or South America. Bayesian analysis indicated the presence of three ancestral population clusters. Mismatch distribution displayed signatures characteristic of changes in size due to demographic and spatial expansions in the overall Y. ruckeri population, and also in the geographically separate subpopulations. Furthermore, a weak signal of isolation by distance was determined. A significant positive correlation between genetic and geographical distances was observed. These results revealed that the population of Y. ruckeri has undergone both ancient and recent population changes that were probably induced by biogeography forces in the past and, much more recently, by adaptive processes forced by aquaculture expansion. These findings have important implications for future studies on Y. ruckeri population dynamics, on the potential role of genetic structure to explain variations in ERM transmission, and on the effect of past evolutionary events on current estimations of gene flow.


INTRODUCTION
Studies on the evolutionary history of organisms have benefited from the availability of an increasing amount of data, especially multiple whole genome sequences. This fact has led to more accurate reconstructions of phylogenetic relationships within several bacterial species (Pritchard et al., 2000). Dispersal, geographic isolation, drift processes and selection leave their signature in the pattern of molecular diversity of contemporary populations (Avise, 2000). Thus, sequence data provide direct genealogical information that can be efficiently used to estimate phylogenetic relationships and parameters associated with population dynamics.
Despite the limitations of multilocus sequence typing (MLST) and eBURST analysis for the phylogenetic inferences to determine exact relationships between individual isolates (Castro-Nallar et al., 2015), large MLST data set represent a valuable resource from which population level trends can be obtained. Analysis of allele frequencies can facilitate recognition of distinct populations, and the comparisons of allelic diversity among populations are informative since ancient populations are expected to be more diverse than recent populations (Slatkin and Hudson, 1991;van Gremberghe et al., 2011).
Throughout recorded history, Yersinia ruckeri, the etiological agent of enteric redmouth disease (ERM), has been spread multiple times from the USA, probably by egg or carrier fish transfers, as the culture of rainbow trout (Oncorhynchus mykiss) became more widely practiced in the world (Austin and Austin, 2012). There is a hypothesis that Y. ruckeri could have already existed previously in the USA as suggested by the study of isolates recovered from the National Fisheries Center (USA), which showed some of them to be dated before the first reports of isolation by Rucker in the 1950s (Bullock et al., 1978). In addition, an Australian Y. ruckeri isolate was also found dating back to the 1960s (Roberts, 1983). On the other hand, Y. ruckeri strains in the UK were not reported until 1980s, nevertheless, the first isolations were achieved in the 1970s but the findings were never published (Roberts, 1983).
Although ERM could potentially affect different salmonid fish (mainly rainbow trout), the microorganism has been isolated from non-salmonid and marine fish (Fuhrmann et al., 1983). Mammals, birds, invertebrates, and even humans were considered as possible vectors of the Y. ruckeri (Willumsen, 1989;De Keukeleire et al., 2014). This bacterium has been also recovered from feces and sewage sludge and the aquatic environment, including water (Willumsen, 1989). In addition, it has been described as being readily able to form biofilms (Coquet et al., 2002). These biofilms may be a source of recurrent infection in rainbow trout farms.
ERM has been successfully controlled for decades by vaccination. Although formulations of most commercial vaccines are based only on common serotype O1a, different degrees of cross-protection among serotypes have been reported (Stevenson, 1997). However, recently ERM vaccine breakdowns have been described in Europe and the USA mostly attributed to Y. ruckeri non-motile and lipase negative strains (biotype 2) (Austin et al., 2003;Fouz et al., 2006;Arias et al., 2007;Calvez et al., 2014). Other epizootics have been reported in Spain, being caused by uncommon serotype O2b in rainbow trout (Romalde et al., 2003), and in Australia and Chile by serotype O1b/biotype 1 strains in vaccinated Atlantic salmon (Salmo salar) (Bastardo et al., 2011;Bridle et al., 2012).
Y. ruckeri remains as a concern for aquaculture due to the expanding range of both hosts and pathogen across the world. In this study, the genetic structure of a broad geographical range Y. ruckeri population was analyzed using MLST data to evaluate the evolutionary past of this pathogen within the context of ERM re-emergence. Such data together with information on pathogenhost associations are critical to understand the dynamics of ERM agent and to form hypothesis concerning its past and future spread.

Data Set Construction
The data set used in this study consisted of multiloci concatenated sequences of 103 Y. ruckeri isolates previously typed into 30 sequence type (ST) and described by a MLST scheme (Bastardo et al., 2012). DNA sequences for 6 housekeeping genes (glnA, gyrB, Y-HSP60, recA, dnaJ, and thrA) and the concatenated sequences were downloaded from the Y. ruckeri MLST database hosted on publmlst.org (http:// pubmlst.org/yruckeri/). For analyses, each ST was considered as one haplotype, and the 103 isolates of Y. ruckeri were considered as one population, which were divided into nine different subpopulations on the basis of the geographical site of isolation as designated in the MLST database (Table 1).

Phylogenetic and Phylogeographical Analysis
Phylogenetic relationships among the concatenated sequences (2876 bp) of the 30 haplotypes (equivalent to each ST) were estimated by constructing a maximum-likelihood (ML) tree with PHYML 3.0 (Felsenstein, 1989). For the analysis of genetic structure and haplotype sharing, a haplotype network was constructed using the program NETWORK 4.1 (http://www. fluxus-engineering.com). Concatenated sequences of the six housekeeping genes from all Y. ruckeri isolates were previously aligned using the software DNA Alignment 1. 3 (http://www. fluxus-engineering.com), and the output file were imported into NETWORK 4.1 to conduct a network analysis using a median joining algorithm (Bandelt et al., 1999). The network of closely related haplotypes was displayed using the geographical origins.

Genetic Differentiation
Haplotype polymorphism (Hd), nucleotide diversity (π), and average number of pairwise differences (K) were calculated to assess the genetic variability using the software DnaSP 5 (Librado and Rozas, 2009). The pairwise genetic differentiation (F ST index) (Wright, 1965) was determined employing the program Arlequin3.5 (Excoffier and Lischer, 2010). An analysis of molecular variance (AMOVA) was performed to partition genetic variation among regions of restricted gene flow using Arlequin 3.5. This program was also used to determine the number of migrants N M (estimated number of migrants between populations per generation) among subpopulations, assuming a constant migration index.

Genetic Population Structure Analysis
The genetic population structure of Y. ruckeri was reconstructed using a Bayesian Monte-Carlo Markov chain (MCMC) sampling method that was implemented using the Structure software (Pritchard et al., 2000). This algorithm identifies genetically distinct populations assigning individual haplotypes to populations on the basis of allele frequencies, and determines the individual membership coefficient in each probabilistic population. For this analysis, an admixture model and the assumption of correlated allele frequencies among subpopulations were assumed (Falush et al., 2003). The probability of assigning individuals into clusters was estimated using 100,000 burn-in repetitions and a final run of one million MCMC steps. The number of clusters (K) was set from 1 to 15, and all runs were replicated 20 times to test the stability of the results. The most probable number of populations (K) was determined by means of the model value ( K) based on the second-order rate of change, with respect to K, in the likelihood distribution (Evanno et al., 2005) employing the program Structure harvester (Earl and von Holdt, 2011).

Analysis of Demographic History
Historical demographic structure of the genetic variation at the concatenated loci sequences was investigated using coalescentbased Tajima's D, Fu and Li D * and F * , and Fu's F S statistics to test the hypothesis that all mutation are selectively neutral (Tajima, 1989;Fu and Li, 1993;Fu, 1997). Additional tests of neutrality were also performed by assessing the haplotype structure using Ramos-Onsins' R 2 (Ramos-Onsins and Rozas, 2000) and Strobeck's S tests (Strobeck, 1987). Analyses of demographic expansion were conducted using DnaSP 5. This program evaluates the significance of these analyses comparing the observed statistics to a distribution of values generated with 5000 coalescent simulations. The demographic history of the populations were examined using the frequency of distribution of number of mismatches between pairwise sequences, and by modeling the expected distributions under the demographic scenarios of population constant size using DnaSP 5. The population expansion and spatial expansion assumptions were also modeled employing Arlequin 3.5. The fitted model was tested statistically by calculating the sum of squared deviation (SSD) of the observed data relative to the model. Harpending's raggedness (r) index (Harpending, 1994) was determined to quantify the smoothness of mismatch distributions. Confidence intervals for mismatch distribution parameters were obtained by performing 1000 bootstrap replicates.

Spatial Analysis
Geographical coordinates for each isolate were located using Google Earth (http://earth.google.com/). The physical distance between different pair of sampling locations was then calculated using the ARC CALC 3 Spherical Trigonometry Calculator macro (http://www.jqjacobs.net/astro/arc_form.html). The presence of phylogeographic structure was tested for individual allele frequencies in six equally-spaced classes of geographic distances by mean the spatial autocorrelation Moran's I index (Moran, 1950) using the software Gedis v1.74 (Peña et al., 2009). Furthermore, to assess the relative influence of drift and gene flow, association between genetic distance and geographic distance was determined employing the Isolation By Distance (IBD) web service v 3.1.6 (Jensen et al., 2005). Significance of the associations was tested with a partial Mantel test using 10,000 randomizations (Mantel, 1967).

Phylogenetic and Phylogeographical Relatedness
Phylogenetic and phylogeographical analyses included isolates from diverse areas and hosts isolated between 1965 and 2009 ( Table 1). The median-joining network constructed using all 103 concatenated sequences, illustrated the mutational relationship of the Y. ruckeri haplotypes (Figure 1). All haplotypes (except haplotype 19) differed by less than three mutational steps with a considerable divergence between genomes occurring in different regions. The major groups were separated by one mutational step. Haplotype 2 was the most common interior haplotype found in six different areas, so it is most likely the oldest haplotype. Many haplotypes (13) were tip alleles, being considered as more recently derived and geographically restricted. On the other hand, the majority of haplotypes differed by only one or two mutational steps, suggesting a demographic expansion. Geographical clustering was observed among haplotypes from Portugal (PO), Peru (PE), USA (US), and Chile (CH) subpopulations while the haplotypes present in UK, Denmark/Germany (DG), Finland/Norway (FN), Spain/France (SF), and Canada (CA) were spread into the network. Phylogenetic relationship among all haplotypes using maximumlikelihood was very poorly resolved and not informative probably because the sequence variation contain insufficient phylogenetic signal (data not shown).

Genetic Differentiation
Haplotype diversity ranged from 0.47 to 1.00 among subpopulations with a mean value of 0.79 for the overall population ( Table 2). The highest Hd values were observed in CA and DG subpopulations while the lowest value was determined for CH subpopulation. On the other hand, π values ranged to 0.0004 in SF, PE, and CH to 0.0017 in CA and DG, regardless the haplotype diversity observed in each area. Furthermore, genetic differentiation based on haplotype diversity among subpopulations was significant (χ 2 = 411.87; P < 0.001).
At the finest scale, pairwise F ST values were low among several subpopulation being between −0.0253 and 0.0860, indicative of a high degree of gene flow between different pairs of subpopulations; however these values were not statistically significant (P > 0.05) ( Table 3). A negative F ST value indicates great differences between two random individuals from the same population, rather than between two random individuals from different populations. In contrast, significant pairwise F ST values were detected for PE, CH and US compared with the other subpopulations, and for CA respect to UK, indicative of restricted gene flow (P < 0.05). The AMOVA results indicated that the highest proportion of the molecular variance (82.67%) could be explained by variations within each subpopulation, while the rest of the variance is explained by the genetic differences among them (Fisher = 0.1733; P = 0.000). The net number of migrants between subpopulations per generation N m ranged between 0.4 to infinity ( Table 3). Moderate N m values were found for PE and US respect to the other geographical areas, with the exception of US vs. SF, which were two orders of magnitude larger than the majority (104.6). Infinity values indicated the highest gene flow between pairs of subpopulations. Only the estimated gene flow for CH was low, indicating that this subpopulation is genetically isolated and/or with limited gene flow.

Genetic Population Structure
Bayesian analysis of population structure evaluated with Evanno's criterion K, indicated that the data were highly consistent with the presence of three genetic populations or genetic clusters (K = 3) (Figure 2A). The majority of isolates from subpopulations FN, PO, SF, PE, UK, DG, and US appear to share similar proportions of ancestry, with posterior probabilities between 0.500 and 0.798 ( Figure 2B). These isolates were included in the genetic cluster I (red color) with the exception of six isolates from PE, two isolates from FN, and one isolate from UK. These findings evidence the existence of wide spread processes involved in the current distribution of   Y. ruckeri. On the other hand, all isolates from CH appeared to be distinct, sharing genetic cluster II (green color) together with other two isolates from PE, and one isolate from UK (posterior probabilities from 0.567 to 0.872) indicating a separate diversification. However, the third genetic cluster (blue color) included in addition to the DG strains, the rest of isolates from FN, US, and CA excluded in clusters I and II (with posterior probabilities values between 0.567 and 0.872). Interestingly, high levels of admixture were present within all groups.
To check the substructure within the clusters, the analysis was re-run separately for the strains assigned to each cluster.
K determined the presence of genetic subpopulations into each cluster establishing K = 5 and K = 6 genetic subpopulations in cluster I ( Figure 3A) and cluster II (Figure 3B), respectively. Genetic subpopulations in cluster III were supported by two K peaks at K = 8 and K = 5 but in both scenarios similar proportions of ancestry were retained ( Figure 3C). Although relevant geographical differentiations were obtained in concordance with the areas of isolation for the majority of Y. ruckeri strains, the isolates found outside of the corresponding genetic subpopulation were indicative of possible migration or introduction processes, gene flow and/or isolation by distance.
An AMOVA conducted by partitioning variation among and within the three clusters revealed that 59.1% of the total variation was attributed to within-population differences (F CT = 0.3504; P = 0.000), 31.8% of the variation was associated to the haplotype frequencies among the 3 clusters (F SC = 0.1424; P = 0.000), whereas the remaining 9.1% of variation was attributed to within subpopulation-clusters haplotype differences (F ST = 0.4414; P = 0.000). Nei's G ST and N m (1973) were also determined to examine the haplotype pairwise differentiation and gene flow among the clusters (I, II and III), respectively. High levels of genetic differentiation (P < 0.05) was detected among the three clusters with F ST between 0.0886 and 0.1309, while the estimation of gene flow showed N m values from 1.9 to 2.3 between populations, supporting limited genetic flow by genetic isolation of clusters.  Table 1. For each plotting strains ordered by ancestry coefficients (A) and by geographic origin (B) are showed.

Demographic History
Tajima's D and Fu and Li's F * and D * neutrality tests for the overall population analyzed had significant negative values ( Table 2). These results allowed the rejection of the neutral model in the Y. ruckeri population as a result of relatively recent population expansion. Fu's F S was also negative (F S = −20.646; P < 0.001), which occurs when an excess of rare haplotypes are present, indicating population expansion or genetic hitchhiking events. Ramos-Onsins' R 2 statistic and Strobeck's S index determined that the total population has significant positive R 2 (0.056) and high S (1.00) values supporting also possible population expansion. However, FN and PO subpopulations showed positive but non-significant F S values, indicating natural selection or population growth. Positive D * values exhibited by CA and DG subpopulations suggest balancing selection, although these values were no significant. When neutrality was tested within each genetic cluster (I, II, and III), the results indicated similar conditions to those observed for the overall population ( Table 2).
The mismatch distribution of pairwise nucleotide differences in concatenated sequences of the overall Y. ruckeri population showed a smooth unimodal distribution, characteristic of a large population expansion ( Figure 4A). The SSD statistic and raggedness index values were low, supporting these results ( Table 4). Study-wide site-frequency spectra revealed an excess of singleton mutation when compared with expected frequencies under a stable population size ( Figure 4B). Similar patterns were observed in the individual distribution for each subpopulation except for FN, UK, and CA, which exhibited a more ragged distribution (data not shown). The FN population differed significantly from the modeled distribution for an expanding population (P = 0.04) with high raggedness value (r = 0.9722, P = 0.05) indicating constant population at equilibrium, although raggedness P value supported spatial expansion (P = 0.12). Furthermore, the mismatch distribution for each cluster (I, II, and III) also showed a unimodal distribution typical of demographic expansion, supported by the low values for the SSD statistic and raggedness index ( Table 4).
The mismatch distribution parameter, τ (the time in mutational steps per generation since the modeled expansion event), from the raggedness calculation, was 1.10 (95% CI: 0.65, 1.58) for the demographic expansion, and 1.13 (95% CI: 0.67, 0.12) for the spatial expansion for the entire population of Y. ruckeri ( Table 4). The divergence times were variable for subpopulations founding highest estimations in PO and CA for the demographic expansion, and in CH and US for the spatial expansion. On the other hand, for cluster I τ was estimated at 1.00 (95% CI: 0.00, 53) and 0.54 (95% CI: 0.35, 2.40) mutational steps for both population expansion and spatial expansion, respectively. However, τ parameters were higher for cluster II and cluster III, showing similar τ values for demographic and spatial expansion (τ values between 2.61; 95% CI: 0.86, 6.43 and 4.44; 95% CI: 2.40, 6.69). These findings together provide evidence for demographic and spatial expansion in Y. ruckeri population occurring at different times, regardless that the rate of evolution could be similar among subpopulations.

Spatial Analysis
Spatial dependence of haplotype frequencies was only detected at the intermediate distance (4500-6000 km) (P = 0.000) indicating that the allele frequencies tend to be more different at this geographical distance, and suggesting isolation by distance ( Figure 5A). Although not significant Moran's I values were observed within the clusters, this value showed a decrease when the pairwise distances increased (data not shown). The partial Mantel test determined that genetic distances and geographical distances among Y. ruckeri subpopulations were positively correlated (Z = 42.10 * 10 7 ; r = 0.5915; one-side P = 0.0020) ( Figure 5B). Similarly, cluster I showed a positive but nonsignificant correlation (Z = 24.23 * 10 8 ; r = 0.0260; one-side P = 0.4690). However, clusters II and III provided evidence of nonsignificant but negative correlation within the subpopulations (cluster II: Z = 14652.4478, r = −0.9532; one sided P = 1.0000, and cluster III: Z = 8189.3193, r = −0.3541; one sided P = 0.8260) supporting the signal of isolation by distance detected for the overall population.

DISCUSSION
In this study, biogeographical and inferred dispersal patterns of Y. ruckeri population were analyzed on a global scale using multi-sequence data. A pattern of sequence divergence corresponding to geographical area was observed at relatively recent points in the evolutionary history of Y. ruckeri. Neighbour network showed a large number of rare haplotypes separated by single-nucleotide differences. Statistical signatures of population expansion were detectable in all subpopulations. Moreover, mismatch distribution indicate that few haplotypes spread and mutated to generate several closely related haplotypes, as is commonly observed during population expansion (Rogers and Harpending, 1992). Furthermore, Tajima's and Fu and Li tests indicated a significant negative deviation from evolutionary neutrality. If Y. ruckeri had emerged in one region early in the past and went on to seed the other regions since then (Austin and Austin, 2012), the same haplotypes in all regions would have predominantly collected and signatures of ancient population expansions would not be observed. Conversely, these results revealed that Y. ruckeri global population had undergone a population expansion some time in the relative recent past.
This study suggests that the ERM disease emergence in South America and Europe over last few decades resulted from two independent and parallel events. First, an early spread of ERM probably facilitated by human-mediated dispersal, and second, intrinsic factors such as genetic differentiation of Y. ruckeri populations, niche specialization and/or isolation by distance.
Although trout farming dates back over 400 years in Europe, about 150 years in the USA, and about 100 in South Africa, Y. ruckeri was only isolated for the first time from rainbow trout in the Hagerman Valley of Idaho, USA in 1950s (Busch, 1981). Then, the pathogen was increasingly isolated from other states of the USA and from Canada (Wobeser, 1973). The first report of ERM in Europe was published in 1981 by Lesel et al. (1983), who described the isolation of Y. ruckeri from rainbow trout in the southwestern of France. Subsequently, Y. ruckeri have been isolated in the 1980s in Denmark, Italy, Norway, UK, and Spain (Austin and Austin, 2012). In South America, reports about the isolation of Y. ruckeri from salmonid are limited. The first occurrences of Y. ruckeri in Peru were reported from 1998 to 2000 (Bravo and Kojagura, 2004). In Chile, regardless being one of the largest producers of salmonids, the occurrence of ERM was only reported occurring in Atlantic salmon (Salmo salar) in 1992 (Toledo et al., 1993). A possible route for the introduction of the pathogen was only described by Michel et al. (1986), who reported in France a clinical case of this bacteriosis in minnows (Pimephales promelas) imported from the United States at least since 1981 for live-bait fishing. In addition, as pointed out by Gall and Crandell (1992) and Barnes (2011), different hatcheries may have imported salmonid eggs, fry or brood stocks from infected areas in the absence of strict controls and monitoring schemes. Thus, the pathogen associated with aquaculture animals may have influenced the infectious microbiota and the parasitc fauna of the wild populations (Willumsen, 1989). These works suggested that the worldwide spread of Y. ruckeri could have been facilitated by the growth and expansion of aquaculture and other human activities. However, the possibility of Y. ruckeri being ubiquitous in the environment, and that aquaculture practices favored its selection and spread cannot be ruled out, despite the fact that the bacterium has never been described in natural fresh water microbial ecology studies.
Analyses of population structure performed in the present study demonstrated a genetic separation among Y. ruckeri strains in North America, Europe, and South America. The major ancestral population was formed by isolates from USA, Peru, UK, Finland, and Spain, and isolated mainly from O. mykiss. The second ancestral population was defined for the majority of isolates from Chile (isolated from S. salar, and belonging to O1b serotype) being more closely related to isolates from UK (from O. mykiss, serotype O1b), and in less proportion to some isolates from Portugal (isolated from fish farm sediment and O. mykiss, belonging respectively to serotypes O3 and O1a). The third ancestral population was formed by Y. ruckeri strains from Canada, Portugal, and Denmark not belonging to the serotype O1a, and with the majority of strains isolated from hosts different to O. mykiss. The evidenced genetic admixture in the entire Y. ruckeri population also supports the previous hypotheses of specificity of Y. ruckeri strains and niche specialization (Bastardo et al., 2012). The population of Y. ruckeri analyzed in this study reflected both signatures of population expansion (a non-equilibrium condition) and isolation by distance (which is consistent with evolutionary equilibrium). This phenomenon may be the result of residual effects of range expansion, and by transfer of strains through countries due to human activities. Humans have had an enormous influence on the global environment and are known to have inflicted genetic variation within other species in the recent past (Conover and Munch, 2002). Recently, the maintenance and spread of Y. pestis in Madagascar have been reported as a dynamic and highly active process that relies on the natural cycle between the primary hosts, the black rat, and its flea vectors as well as human activity (Vogler et al., 2011). In Y. ruckeri case, it is still likely that sequences reflect mainly a partial return to evolutionary equilibrium after expansion events.
Time from population expansion occurred in a population could be estimated based on the parameter τ (τ = 2µt, where µ is the mutation rate per nucleotide/year) (Slatkin and Hudson, 1991). Without a known mutation rate from Y. ruckeri housekeeping genes, it is not possible to accurately pinpoint the time since the inferred expansion events. However, considering that the typical rates of spontaneous mutation per generation in bacteria have been estimated to be on the order of 10 −10 substitutions per site per year (Drake, 1991), the mutation rate for Y. ruckeri (based on 2876 bp) can be preliminary estimated to be on the order of 2.8 × 10 −7 substitutions per site per year. In theory, if an average of 100-300 generations per year is assumed (Gordon et al., 2002), the time since expansion indicated by mismatch distribution parameter for the overall population of Y. ruckeri is at least several thousand years ago (approximately 6375-20,000 years), varying between 800 and 30,000 years among the different areas. This fact suggests ancient spread of Y. ruckeri long before to the emergence of modern ERM disease, and highlights the scenario of the independent ERM disease emergence events in the North America and Europe in the last centuries. However, these date ranges should be regarded as preliminary, and further studies are needed on the ancestry and molecular clock in the evolution of Y. ruckeri to confirm this hypothesis. Furthermore, the mismatch distributions observed are consistent with expansion in almost all regions, the high values of raggedness index of the Canada, Finland and Portugal subpopulations may indicate a more stable demographic equilibrium with a less sudden expansion (Harpending, 1994).
No evidence for genetic differentiation was obtained in this study to explain the emergence and spread of Y. ruckeri biotype 2. However, non-motile isolates were genetically grouped into the same group of motile strains strengthening the theory that biotype 2 has evolved from related motile Y. ruckeri strains (Wheeler et al., 2009;Bastardo et al., 2012). The presence of different non-motile haplotypes in the USA, UK, Finland, and Peru support the independent emergence of biotype 2 in geographically separate areas (Ström-Bestor et al., 2010;Bastardo et al., 2012). Furthermore, genetic mixure observed in this study suggests that the emergence of virulent of Y. ruckeri variants could be forced by factors extrinsic to the population such as resistance to antibiotics or vaccination.
Selective pressure induced by intensive vaccination could cause changes in phenotypic and immunogenic properties, which can cause outbreaks in vaccinated fish. The observed change from Y. ruckeri motile to non-motile isolates being recovered from disease outbreaks could be due to vaccine induced strain replacement (Martcheva et al., 2008). In this context, Pulkkinen et al. (2010) suggested that the presence of several genetically distinct bacterial populations in one area Moran's I was plotted for individual allele frequencies across 6 distance classes (black line). Significant value (black dot) of Moran's I indicate positive spatial dependence at P < 0.05 (B) Mantel test for isolation by distance. Regresion based on genetic distance (PhiF ST ) values among 9 subpopulations. Regresion slope = 0.0431 ± 0.005; R 2 = 0.350 Mantel probability P < 0.01. might favor virulence, if the virulent strains have a competitive advantage. The explanation for the increase of biotype 2 Y. ruckeri cases in Europe and USA, as well as the emergence of virulent motile isolates belonging to serotypes O1b and O2b in South America could be associated with this evolutionary change.
Few studies have addressed questions of phylogeographic structure and dispersal limitation in bacteria on a truly global scale in discontinuos but globally common habitats (van Gremberghe et al., 2011). Such studies would provide a realistic insight into the degree of dispersal limitation typically encountered by bacteria. In this study, the significant geographic contribution to the overall genetic differences of Y. ruckeri was supported by the positive correlation between genetic and geographic distances among strains and populations. The lack of a total geographic structure could be caused by similar nucleotide diversity across spatial scale among the largest subpopulations of USA, Europe and Peru. Nevertheless, geographic isolation of Y. ruckeri haplotypes was evidenced in Chile, Canada, and Denmark, which may have epidemiological implications due to differences in clinical outcomes associated with specific genotypes.
In summary, this study provides phylogeographical findings of signature of ancient demographic processes in Y. ruckeri population, including spatial expansion, occurring theoretically at least thousands years ago, and more recent genetic divergence among regions. Furthermore, these results suggest that genetic divergence occurring in Y. ruckeri populations over last decades are independent, and in some cases isolated events, highlighting the usefulness of genetic studies in explaining the variations in the transmission and maintenance of ERM disease.

AUTHOR CONTRIBUTIONS
AB, CR, and JR conceived and designed the experiments; AB performed the experiments; AB and JR wrote the paper; CR critically revised the manuscript; AB, CR, and JR edited and approved the manuscript.