The Gene Flow Direction of Geographically Distinct Phytophthora infestans Populations in China Corresponds With the Route of Seed Potato Exchange

Phytophthora infestans is a widespread destructive plant pathogen that causes economic losses worldwide to potato production. In this study, we sequenced four mitochondrial DNA gene sequences of 101 P. infestans isolates from five potato-growing regions in China to investigate the population structure and dispersal pattern of this pathogen. The concatenated mtDNA sequences in the populations showed high haplotype diversity, but low nucleotide diversity. Although there was a degree of spatial structure, our phylogeographic analyses support frequent gene flow between populations and the direction of gene flow, primarily from north to south, corresponds to the route of seed potato transportation, suggesting a role of human activities in the dispersal of P. infestans in China.


INTRODUCTION
Gene flow, or migration, refers to the movement of alleles among spatially or temporally separated populations and is a major driving force in the evolution of populations. Gene flow may increase the genetic variation of local pathogen populations but decrease the genetic difference between populations, which can slow or limit population differentiation (Endler, 1973;Zhan, 2016). Population genetics studies have contributed to our understanding of the evolutionary history of plant pathogens, as well as to the development of strategies to prevent their spread in major crops (Mcdonald and Linde, 2002). For example, some soil-borne or seedborne plant pathogens, such as the fungal Cryphonectria parasitica, are spatially structured and quarantine procedures can be effective in preventing the invasion of plant diseases caused by the pathogens (Gilbert, 2002;Horton, 2010).
Phytophthora infestans, the causal agent of the Great Irish Famine, is a highly destructive plant pathogen, causing worldwide losses in potato production of $6.7 billion annually (Haverkort et al., 2008). Knowledge of population genetics and the evolutionary biology of P. infestans are important for developing sustainable, effective strategies for controlling potato late blight. In population genetics, selectively neutral loci, such as microsatellites (simple sequence repeats, SSRs), are the most frequently used genetic markers (Kirk and Freeland, 2011) and have been employed to investigate the evolutionary dynamics and population structure of P. infestans (Goodwin, 1997;Lebreton et al., 1998;Li et al., 2012). However, SSRs are often expensive and time-consuming (Vieira et al., 2016). Mitochondrial DNA (mtDNA) is now a popular marker because of its utility in investigating evolutionary processes, such as migration or dispersal (Holderegger et al., 2006;Wang et al., 2020). Cardenas et al. (2011) conducted a population genetics analysis of P. infestans in Colombia and Venezuela using four nuclear (ITS, Ras, β-tubulin, and Avr3a) and one mitochondrial (Cox1) gene, and revealed the low genetic diversity and low frequency of heterozygotes in P. infestans.
Potato (Solanum tuberosum L.) is the fourth most important staple food crop worldwide. China is the world's largest producer of potatoes, producing ∼99.21 million tonnes on ∼5.76 million ha in 2017 1 . With government support and shifts in dietary habits, potato production in China is expected to increase substantially in the coming decades (Kearney, 2010). Potato planting in China can be roughly divided into four producing regions according to climate conditions and management practices: the northern single-cropping, central double-cropping, winter-cropping, and southern mixed-cropping zones (Jansky et al., 2009). Of these, the northern single-cropping zone, including Heilongjiang Province, is the major producer of potatoes and accounts for 50% of China's total potato yield.
The existence of P. infestans in Kunming and Chongqing of China was noted in as early as 1938 and 1940, respectively (Ristaino and Hu, 2009). Molecular population genetics approaches have been used to investigate the genetic diversity of P. infestans in China, and to identify the evolutionary forces responsible for its diversity. For example, based on SSRs and amplified fragment length polymorphisms (AFLPs), Guo et al. (2009) found low genotypic diversity of P. infestans in Northern China. Subsequently, a population analysis of P. infestans inferred from mtDNA haplotypes and restriction fragment length polymorphism (RFLP) data suggested multiple migrations of this pathogen into China (Guo et al., 2010). In addition, Tian et al. (2016) investigated the genetic structure of P. infestans populations in northwestern China using SSR markers and found that migration and selection might be important in shaping the genetic structure and diversity of regional populations. However, the population structure of P. infestans and the migration dynamics of this pathogen in China remain poorly understood.
This study analyzed the concatenated mtDNA sequences of the partial cytochrome c oxidase subunit 1 (cox1), NADH dehydrogenase subunit 9 (nad9), NADH dehydrogenase subunit 4 (nad4), and ATP synthase subunit alpha (atp1) genes from 101 P. infestans isolates collected in China. The objectives of this study were to: (i) characterize the genetic structure of P. infestans in China; (ii) investigate the geographic and host-origin effects on the population structure; and (iii) determine gene flow among populations, particularly in terms of the source-sink dynamics of the populations. 1 http://www.fao.org/faostat

Sampling P. infestans and DNA Extraction
A total of 101 P. infestans infected potato (Solanum tuberosum) and tomato (S. lycopersicum) leaves, stems and tubers exhibiting typical late blight symptoms were collected from 34 sampling sites in Fujian, Hebei, Beijing, Heilongjiang, Inner Mongolia, and Jiangsu, China (Supplementary Table S1 and Figure 1). These isolates were grouped into five populations based on their geographical origin: Fujian (FuJ, n = 36), Hebei and Beijing (HeB, n = 11), Heilongjiang (HLJ, n = 20), Jiangsu (JSu, n = 21), and Inner Mongolia (NMG, n = 13). The pure culture of each isolate was preserved on antibiotic-free rye agar slants covered with mineral oil (Sigma Diagnostics Inc, St Louis, MO, United States) at 15 • C in the dark and transferred to new slants at 6-month intervals. P. infestans genomic DNA was extracted using a modified cetyl trimethylammounium bromide (CTAB) method, as described previously (Li et al., 2009;Chen et al., 2012). The genomic DNA was eluted with 200 mL ultrapure water and stored in TE (10 mM Tris-HCl, 0.1 mM EDTA, pH8.0) at -20 • C.

PCR Amplification and Sequencing of the mtDNA Genes
Supplementary Table S2 lists the PCR primers and cycling parameters. PCR amplification was performed in a total volume of 25 µL, comprising 5 µL of TransTaq reaction buffer (5×), 2 µL of dNTPs (2.5 mM each), 1 µL each of forward and reverse primers (10 µmol/L), 14.25 µL of ddH 2 O,0.25 µL of TaKaRa Ex Taq DNA polymerase with the proofreading activity (5 U/µL), and 1 µL of template cDNA. The PCR program comprised an initial denaturation at 95 • C for 2 min, followed by 30 cycles of 90 • C for 20 s, 57 • C for 25 s (for cox1; 41 • C for nad9 and 63 • C for nad4 and atp1; Supplementary Table S2), 72 • C for 5 min; and a final 5 min extension at 72 • C. The amplification products were separated by electrophoresis on 1% agarose gels, visualized by UV transillumination, and cleaned using an EasyPure Quick Gel Extraction Kit (TransGen, Beijing, China). The amplicons were ligated to pMD18-T Simple Vector (Takara, Beijing, China) and sequenced by Sangon Biological (Shanghai, China) using an ABI3730 automated DNA sequencer (Applied Biosystems, Foster City, CA, United States). At least three positive clones from each transformation were sequenced to obtain a consensus sequence.

Sequence Dataset
The resulting sequences were assembled using DNAMAN 6.0 (Lynnon, Quebec, Canada) and deposited in GenBank under accession numbers MN458052 to MN45815 for cox1, MN458254 to MN458354 for nad9, MN458153 to MN458253 for nad4, and MN457951 to MN458051 for atp1. The four gene sequences were aligned by codons using TranslatorX (Abascal et al., 2010). A recent study found evidence for intragenic recombination in an effector gene of P. infestans (Yang et al., 2018). To test recombination signals in our data set, we conducted recombination analysis using the RDP 4.95 software package (Martin et al., 2015). No recombination signals were FIGURE 1 | Map of the locations from which the Phytophthora infestans isolates were collected in this study. Colors indicate sampling locations, as shown in the key. Detailed information on the samples is given in Supplementary Table S1. identified; therefore, we combined the four gene sequences for subsequent analyses.

Genetic Diversity and Population Differentiation
To assess how genetic diversity varied across geographical and host populations, we calculated the two summary statistics, nucleotide diversity (π ) and haplotype diversity (H d ), using DnaSP 5.10 (Librado and Rozas, 2009). The nucleotide diversity was also calculated according to a sliding-window (100 bp) analysis, with a 30 bp step size estimate the step-wise diversity across the sequence. A minimum spanning haplotype network was built using PopART (Leigh and Bryant, 2015).
Genetic differentiation among populations was evaluated using K ST , Z, and S nn using DnaSP 5.10 (Hudson et al., 1992;Hudson, 2000;Librado and Rozas, 2009). The level of genetic differentiation among P. infestans was also quantified using F ST (Weir and Cockerham, 1984). We calculated pairwise F ST values using noncorrected pairwise differences and obtained p-values after a randomization test with 10,000 permutations using Arlequin 3.5 (Excoffier and Lischer, 2010). Default values were used for the remaining parameters. The degree of differentiation (given by the F ST values) was classified as low (<0.05), moderate (0.05-0.15), large (0.15-0.25); or great (> 0.25) (Balloux and Lugon-Moulin, 2002). We also applied discriminant analysis of principal components (DAPC), a multivariate method designed to identify clusters of genetically related individuals, to allow investigation of the genetic structure of P. infestans populations (Jombart et al., 2010). This method has the advantage of not assuming panmixia. We performed the DAPC analysis based on pre-defined groups, using the adegenet package (Jombart, 2008) implemented in R software (ver. 3.5.1; R Development Core Team, Vienna, Austria).
To test the effects of geography and host on the genetic diversity of P. infestans, we conducted analysis of molecular variance (AMOVA) using Arlequin 3.5 (Excoffier and Lischer, 2010). The statistical significance of the ϕ-statistics was tested based on 1,023 permutations (default). To determine the potential geographic and host-origin effects on P. infestans populations, the isolates were divided into five populations based on geographic origin, as described above. They were also divided into two populations based on host origin, i.e., S. tuberosum (potato, n = 70) and S. lycopersicum (tomato, n = 31). Pairwise population differentiation was also estimated based on G ST using the PopGenome 2.16 R package (Pfeifer et al., 2014).

Phylogeographic Analysis of P. infestans
To explore the spatial diffusion patterns of P. infestans across regions in China, we used Bayesian stochastic search variable selection (Lemey et al., 2009) to reconstruct the spatial pathway of this pathogen in BEAST 1.10.4 . The five geographic locations in China were coded as discrete states. Before the analysis, we performed a date-randomization test (Ramsden et al., 2008) to analyze the temporal signal in the dataset using the TipDatingBeast package (Rieux and Khatchikian, 2017). However, we found no evidence of temporal structure in our dataset, so we applied a previous informative prior distribution of 1.5 × 10 −6 to 3.3 × 10 −6 substitutions/site/year on the clock rate (Yoshida et al., 2013). The best nucleotide substitution model, GTR+F+I, was identified by ModelFinder (Kalyaanamoorthy et al., 2017) with the Bayesian information criterion and used for analyzing the combined mtDNA gene sequences.
We calculated the marginal likelihoods using path sampling (Baele et al., 2012) and analyzed them in terms of Bayesian skyline, constant-size coalescent, and exponential-growth coalescent tree priors, as well as the strict clock and uncorrelated lognormal relaxed clock (Drummond et al., 2006). An exponential growth tree prior and uncorrelated lognormal relaxed clock gave the best fit to the data (Supplementary Table S3). The collection dates for all P. infestans samples were used as calibration points. Two independent Markov chain Monte Carlo (MCMC) runs were performed to estimate the posterior distributions of the parameters, with samples taken every 10,000 steps over 50,000,000 steps. After discarding the first 10% of the samples as burn-in, all parameters showed sufficient sampling, as confirmed by Tracer 1.7 , which showed that the effective sample size was above 200.

Population Historic Dynamics of P. infestans
As neutrality indices, Tajima's D and Fu's F S statistics were calculated using Arlequin 3.5 (Excoffier and Lischer, 2010). Tajima's D test is based on comparison of the estimated number of segregating sites with a mean pairwise difference among sequences (Tajima, 1989). A negative Tajima's D indicates bias toward rare alleles, which is indicative of recent population expansion. Fu's F S statistic is based on the allele or haplotype distribution and usually has a negative value, which is also indicative of a recent population expansion (Fu, 1997). We also used the exponential growth model implemented in BEAST to analyze the population demographic history of P. infestans.

Genetic Diversity of P. infestans and Haplotype Network
We obtained 1,968 bp of mitochondrial genes in P. infestans in China, including cox1 (735 bp), nad9 (348 bp), nad4 (396 bp), and atp1 (489 bp). No indels were detected in our dataset. In the entire mtDNA, 141 polymorphic sites were identified, representing 7.16% of all sites analyzed, including 100 single variable sites and 41 parsimony-informative sites. The transition/transversion ratio (R) was 0.702, indicating a strong transversion bias during the mitochondrial evolution of P. infestans.
In total, 62 haplotypes were identified from the 101 P. infestans isolates studied, with a haplotype diversity of 0.920 ( Table 1). The overall nucleotide sequence diversity of these isolates was 0.0023 (Table 1). When the P. infestans isolates were categorized according to geographic origin, the highest nucleotide diversity (0.00920 ± 0.00025) was observed in the FuJ population and the lowest (0.00153 ± 0.00033) was found in the HLJ population. Greater variation in the concatenated mtDNA sequences was observed in P. infestans isolates isolated from potato compared with those isolated from tomato, when the isolates were categorized according to host origin ( Table 1). Figure 2 shows the minimum spanning haplotype network of the P. infestans population. There were 58 low-frequency haplotypes, each found in a single P. infestans isolate. Four haplotypes were shared by two or more isolates. Hap_2 and Hap_4 were commonly shared haplotypes, being present in 26 and 13 isolates, respectively. Of all the identified haplotypes, 11 private haplotypes were found in the isolates from tomato, of which 10 were from the FuJ population and one was from the JSu population. The population network showed low levels of sequence divergence and a high frequency of unique mutations, which is a signature of rapid population expansion. The stepwise diversity across the sequence indicated two highly variable regions at sites 1111-1270 and 1681-1870 (Figure 3), which were located within the nad4 and atp1 genes, respectively.

Differentiation Between P. infestans Populations
With the exception of the comparison between HLJ and HeB, three independent tests of population differentiation for all populations, grouped by geography or host-species, were significant ( Table 2); the F ST values ranged from 0.028 to 0.289. The highest F ST value was between the HLJ and FuJ populations and the lowest was between NMG and HeB ( Table 2). More than half of the 10 F ST values were higher than 0.05, suggesting a degree of spatial structure of the P. infestans populations in  China. Our investigation of F ST also showed significant genetic differentiation between P. infestans from tomato versus potato. The DAPC analysis revealed patterns of genetic differentiation that were similar to those of the F ST analysis. DAPC scatterplots also indicated that the NMG population was relatively distinct from the other populations along the second discriminant function axis, while the JSu population had a more subtle structural difference along the third discriminant function axis (Figure 4). This suggests that geography contributes to the differentiation of P. infestans populations.
AMOVA also revealed significant variation among geographic groups, accounting for 13.87% of the total variation in P. infestans ( ST = 0.139, p < 0.001; Table 3). When host species was used a grouping factor, similar results were observed, i.e., there was significant population differentiation among groups ( ST = 0.087, p < 0.001), accounting for nearly 8.66% of the total variation in P. infestans (Table 3). These results suggest that the effect of geography on the genetic variance of P. infestans is greater than that of host species, although both contributed to the genetic variance in P. infestans.
The results of sliding-window analysis of the pairwise G ST values of population differentiation are plotted in Figure 3B. The pairwise G ST values estimated based on the geographic groupings were slightly higher than those based on the host species groupings, across the sequence regions located in the cox1, nad9, and nad4 genes. However, the former values were significantly higher than the latter in the highly variable region in the atp1 gene (alignment sites 1681-1870).

The Dispersal Pattern of P. infestans Populations in China
All F ST values were lower than 0.33 (Table 2), indicating a degree of gene flow between these populations. To gain insight into the circulation of P. infestans across China, we reconstructed the migration pathway of this pathogen. Our Bayesian phylogeographic analysis suggested that there were six major migration events in the diffusion process of P. infestans, with mean migration rates ranging from 0.922 to 1.323; four of the migrations were from HLJ to HeB, NMG, JSu, and FuJ, and the remaining two were from FuJ to HeB and NMG (Figure 5 and Supplementary Table S4). The observed state changes (MCMC jumps) also suggest that HLJ has acted as the seeding population for P. infestans in China (Figure 5). The Markov rewards for FuJ (14,748; 95% credibility interval, 7,756-50,530) were higher than those for other regions, suggesting that Fujian played a major role in the evolution and persistence of P. infestans.

Demographic History of the P. infestans Population in China
Tajima's D test was significantly negative for all populations. Fu's F S test was also significantly negative for all populations, except the HeB population ( Table 1). The negative values of both neutrality tests indicate an excess of rare mutations, suggesting recent population expansion. This is also indicated by the results of the coalescence-based demographic analysis, which showed that P. infestans had experienced substantial population expansion until the last sampling year ( Figure 6A). Interestingly, the demographic expansion of P. infestans correlates with the increase of potato production in China over the past decades (r = 0.95, DF = 23, p = 0.000, Figure 6B).

DISCUSSION
We obtained new sequence data for four mtDNA genes from 101 isolates of P. infestans in China. Using these data, we first investigated the genetic diversity of this pathogen. High haplotype diversity and low π was found. This differs from a recent study, which reported high genetic diversity in the effector gene Avr3a of 96 P. infestans isolates collected from six potato-planting regions in China (Yang et al., 2018). In addition to genetic markers, the major difference between the two studies was that only isolates with distinct genotypes were chosen in the previous, but not the current study (Yang et al., 2018). However, high genetic variation was also found in the mtDNA genes in the current study. In total, 62 nucleotide haplotypes ( Table 1) and 70 isoforms (data not shown) were detected in the 101 sequences. The genetic variation is comparable to that of P. infestans nuclear genes (Yang et al., 2018).
Despite the high haplotype diversity of the P. infestans population, the low π suggests only small differences between haplotypes. This was also observed in the minimum spanning haplotype network (Figure 2). The high haplotype and low π observed in this study indicates rapid population expansion from a small effective population size (Grant and Bowen, 1998). Based on FAO data 2 , the production of potato in China increased from ∼12.91 million tons in 1961 to 6.45 million tons in 2006. This coincidence suggested a link between potato cultivation area and the population size  of P. infestans. Indeed, our results supported this with a positive correlation between the demographic history and the production of potato in China ( Figure 6B). Further studies aiming to understand how human activities influences the demographic expansion of the P. infestans population will be interesting. Among the five geographic populations of P. infestans, the FuJ population had the highest genetic diversity. One potential reason for this is that the FuJ population may have acted as an important hub for the spread of P. infestans in China, because Fujian Province, a winter potato-cropping zone, imports seed tubers extensively from other cropping zones. This is also supported by the results regarding the MCMC jumps ( Figure 5). Comparatively, the genetic variation was higher in P. infestans from potato than from tomato ( Table 1). This might be due to the larger sample size of potato sequences compared to tomato sequences (70 vs. 31 isolates). When we recomputed the haplotype diversity after standardizing sample sizes with the smaller population (tomato in this case) using a bootstrapping approach, the difference in genetic variation between the two populations disappeared (p> 0.5, data not shown). Current studies usually use the relatively short, reasonably variable gene fragments flanked by highly conserved sequences as mtDNA markers, such as the cox1 gene, which was used for DNA barcoding to identify plants and fungi (Robba et al., 2006;Stockinger et al., 2010). For intraspecific studies, slow mtDNA evolution generally results in low levels of polymorphism in mtDNA genome. A similar observation has been made for the eEF-1α gene of P. infestans (Wang et al., 2020). The atp1 gene is a rotary enzyme that exploits the proton gradient across the inner membrane, generated by the electron transport chain, to synthesize ATP (Song et al., 2018), however, it had the highest variability among the four mtDNA genes examined in this study, as indicated in Figure 3. This gene can distinguish closely related P. infestans populations between different geographic regions or host species; therefore, it could be an ideal genetic marker for population genetic analysis.
Our phylogeographic analysis revealed multiple north-tosouth dispersal events in P. infestans in China (Figure 5 and Supplementary Table S4). This accords with the fact that seed potatoes are normally produced in northern China, suggesting that the pattern of P. infestans dispersal is associated with humanmediated activities. In China, the majority of potatoes in the winter-cropping and central double-cropping zones in southern China are intended for export, and not for planting, while seed potatoes are supplied from the northern single-cropping zone. Further studies to understand how human-mediated activities influence the migration of P. infestans should be informative.

CONCLUSION
In conclusion, this study examined the population genetic structure and phylogeography of P. infestans. The results showed high haplotype diversity, but low nucleotide diversity, of P. infestans in China. The diversification patterns are likely shaped by gene flow. We also found multiple north-to-south migration pathways of P. infestans in China, suggesting that the movement of seed potato tubers through the country has played a major role in the dispersal of P. infestans in China. This information is potentially valuable for the development of more effective strategies to control this pathogen. Further analyses of larger datasets with P. infestans outside of China will lead to a more comprehensive view of the evolutionary history of this pathogen.

DATA AVAILABILITY STATEMENT
The datasets generated for this study can be found in the GenBank database under accession numbers MN458052 to MN45815 for cox1, MN458254 to MN458354 for nad9, MN458153 to MN458253 for nad4, and MN457951 to MN458051 for atp1.

AUTHOR CONTRIBUTIONS
QC and FG conceived and designed the experiments and analyzed the data. CC, FG, BL, and QW performed the experiments. QC and FG wrote the manuscript. All the authors reviewed the manuscript.

FUNDING
This work was supported by grants from the National Natural Science Foundation of China (Grant No. 31772141), Science and Technology Major Project of Fujian Province (Grant No. 2017NZ0003-1) and The Provincial Public Welfare Project in Fujian Province (Grant No. 2019R1024-4).

ACKNOWLEDGMENTS
We thank Dr. Zhenguo Du and Dr. Fengping Chen (Fujian Agriculture and Forestry University) for their comments and suggestions on the manuscript and Dr. Yingbo Zhang (Chinese Academy of Tropical Agricultural Science) for his assistance during Figure 3's preparation.