Phylogeography and Population Structure Analysis Reveal Diversity by Gene Flow and Mutation in Ustilago segetum (Pers.) Roussel tritici Causing Loose Smut of Wheat

Ustilago segetum (Pers.) Roussel tritici (UST) causes loose smut of wheat, which accounts for considerable grain yield losses globally. For effective management, knowledge of its genetic variability and population structure is a prerequisite. In this study, UST isolates sampled from four different wheat growing zones of India were analyzed using the second largest subunit of the RNA polymerase II (RPB2) and a set of sixteen neutral simple sequence repeat (SSR) markers. Among the 112 UST isolates genotyped, 98 haplotypes were identified. All the isolates were categorized into two groups (K = 2), each consisting of isolates from different sampling sites, on the basis of the unweighted pair group method with arithmetic averages (UPGMA) and the Bayesian analysis of population structure. The positive and significant index of association (IA = 1.169) and standardized index of association (rBarD = 0.075) indicate that the population is not randomly mating. Analysis of molecular variance showed that the highest variance component is among isolates (91%), with significantly low genetic differentiation among regions (8%) (Fst = 0.012). Recombination was not detected (Rm = 0). The results showed that UST isolates have a clonal genetic structure with limited genetic differentiation, and that human-mediated gene flow and mutation are the prime evolutionary processes determining its genetic structure. These findings will be helpful in devising management strategies, especially for selection and breeding of resistant wheat cultivars.


INTRODUCTION
Loose smut, caused by the basidiomycete fungus Ustilago segetum (Pers.) Roussel tritici Jensen (UST), is one of the most serious diseases of wheat (Triticum aestivum L.) globally. The disease is favored by a moist and cool climate during anthesis (Quijano et al., 2016). The fungus converts the floral tissues of the spike into fungal teliospores, causing yield losses equivalent to the percentage of smutted spikes (Green et al., 1968; Singh, 2018). The primary inoculum of the pathogen survives in the embryo of wheat seeds (Kassa et al., 2015). Wilcoxon and Saari (1996) documented that the fungus can reduce profit by 5-20 per cent at an infection level of 1-2 per cent. Similarly, Nielsen and Thomas (1996) reported 15-30% annual yield losses as a result of UST infection. Joshi et al. (1980) reported loose smut incidence of up to 10% in the North Western parts of India. Besides India, 5-10% loose smut incidence has also been reported from Russia, New Zealand, and the USA (Thomas, 1925; Bonne, 1941; Atkins et al., 1943; Watts Padwick, 1948; Menzies et al., 2009; Kaur et al., 2014).
The infection process and disease cycle of UST on wheat have been discussed in detail by several workers (Wilcoxon and Saari, 1996; Ram and Singh, 2004). Dikaryotic spores of UST land on the wheat floret, germinate, and penetrate the ovary through the feathery stigma during anthesis (Dean, 1964; Shinohara, 1976). Mycelia of UST stay alive within the embryo of infected seeds and move systemically through the growing point of the tillers without showing any visible symptoms (Kumar et al., 2018). The symptoms become visible on emergence of spikes from the boot. Several methods are available to manage loose smut, including the use of disease-free seed, seed treatment with hot water or systemic fungicides, and host resistance, and these are highly effective in controlling the disease (Jones, 1999; Bailey et al., 2003; Knox et al., 2014). Unfortunately, high genetic variability in the pathogen population may give rise to strains resistant to fungicides and may also reduce the lifespan of resistant varieties (Randhawa et al., 2009). Therefore, understanding the variability in the pathogen population and the mechanisms that generate it is important for framing effective disease management and resistance breeding strategies.
Traditionally, variation in fungal pathogens has been deciphered on the basis of morphology, cultural characters, physicochemical characters, virulence pattern, mating type, and disease reaction on differential hosts (Kaur et al., 2014; Kashyap et al., 2015; Yu et al., 2016). Unfortunately, these methods are time consuming, highly influenced by the environment, and thus not very precise. More recently, DNA profiling based on restriction fragment length polymorphism (RFLP), random amplified polymorphic DNA (RAPD), amplified fragment length polymorphism (AFLP), and inter simple sequence repeat (ISSR) markers has been extensively employed to study the population biology and genetic diversity of fungi (Bennett et al., 2005; Kashyap et al., 2016; Kumar et al., 2016; Yu et al., 2016). Karwasra et al. (2002) used RAPD, ISSR, and AFLP profiling to assess the extent of genetic variation among regional UST isolates collected from Haryana, India. SSR, or microsatellite, markers have an advantage in studies of genetic diversity, population genetic structure, genetic linkage mapping, and quantitative trait loci (QTL) because of their high repeatability, transferability, codominance, and ubiquitous presence (Ellegren, 2004; Kumar et al., 2013a; Singh et al., 2014; Zhu et al., 2016). SSRs have recently been used to decipher diversity in many fungal plant pathogens, such as Puccinia triticina (Wang et al., 2009), Magnaporthe grisea (Shen et al., 2004), Rhynchosporium secalis (Bouajila et al., 2007), Phytophthora infestans (Zhao et al., 2008), Phaeosphaeria nodorum (Sommerhalder et al., 2010), Fusarium culmorum (Pouzeshimiab et al., 2014), Ustilago hordei (Yu et al., 2016), Puccinia graminis f. sp. tritici (Prasad et al., 2018), and Bipolaris oryzae (Ahmadpour et al., 2018). Storch et al. (2007) reported that protein-coding genes are generally more conserved and can be aligned more reliably.
Among protein-coding markers, the second-largest subunit of nuclear RNA polymerase II (RPB2), translation elongation factor 1-alpha (Tef1), beta-tubulin (Tub2), and actin (ACT) have been most frequently used for inferring phylogenetic relationships among fungi (Stielow et al., 2015; Raja et al., 2017). Functionally, the RPB2 gene is responsible for the transcription of protein-encoding genes (Sawadogo and Sentenac, 1990) and is present as a single copy in all eukaryotes (Thuriaux and Sentenac, 1992). The high level of polymorphism in this gene makes it an excellent tool to study molecular evolution and phylogenetic relationships (Matheny et al., 2007; Krimitzas et al., 2013; Wang et al., 2016; Kruse et al., 2017). Stockinger et al. (2014) reported that the RPB2 gene offers adequate phylogenetic resolution to resolve fungal lineages when compared to rDNA loci. Therefore, in the present study, phylogenetic analysis of the single-copy RPB2 gene was carried out to explore genetic differentiation of UST populations. Despite recent advances, the role of gene flow in the reproduction, dispersal, and evolution of UST populations is still poorly studied. To the best of our knowledge, fingerprinting and genetic diversity in UST populations using microsatellite markers have not been studied extensively on a large number of isolates. Thus, the present investigation was undertaken to study the genetic variation in UST isolates collected from four different agro-ecological zones of India.
The specific objectives of the present investigation were to: (i) analyze the genetic diversity of UST isolates of Indian origin, grouped by geographic area of collection, by microsatellite and RPB2 gene sequence comparison; (ii) investigate the possibility of random mating within their sampling sites using MULTILOCUS version 1.31 (Agapow and Burt, 2001); and (iii) determine the population genetic structure of the UST population in four different wheat growing zones by employing genetic data analysis tools such as GenAlEx 6.5 (Peakall and Smouse, 2012), NTSYS-pc program V2.1 (Rohlf, 2002), DnaSP program (Tajima, 1983), STRUCTURE 2.3.4 (Pritchard et al., 2000), and Bottleneck v1.2 (Piry et al., 1999).

MATERIALS AND METHODS
Sampling, Isolation and Purification of Fungal Isolates
One hundred and twelve isolates of UST were collected from different agro-ecological zones of India during 2016-2018 (Table 1, Figure S1). A stratified random sampling method was adopted at the spike emergence to anthesis stage, with sampled fields at least 30 km apart. One smutted spike was collected per field to avoid the possibility of a mixture of genotypes. Isolates were assigned to four populations, named Central Zone (CZ), North Eastern Plain Zone (NEPZ), North Hill Zone (NHZ), and North Western Plain Zone (NWPZ). Single-teliospore cultures were obtained by inoculating teliospores on half-strength potato dextrose agar (50% PDA) and incubating at 25 ± 1 °C for 24 h. Single germinating teliospores were transferred to slants containing 50% PDA, incubated at 25 ± 1 °C, and maintained at 4 °C for further use.

Genomic Preparations
For genomic DNA extraction, isolates of UST were transferred, using a sterile needle, from Petri dishes to 100 ml Erlenmeyer flasks containing potato dextrose yeast (PDY) broth. The cultures were grown for 10 days in an orbital shaker (150 rpm) at 25 ± 1 °C. The fungal mat was harvested on sterile Whatman filter paper, frozen in liquid nitrogen, and ground to a fine powder with a pestle and mortar. The cetyl trimethyl-ammonium bromide (CTAB) method was used to extract genomic DNA as described by Kumar et al. (2013b). DNA was quantified by recording absorbance at 260 nm, and purity was determined by calculating the ratio of absorbance at 260 nm to that at 280 nm. The concentration of DNA was adjusted to 50 ng/µl for PCR analysis.
Frontiers in Microbiology | www.frontiersin.org

RNA Polymerase II Second Largest Subunit (RPB2) Gene Amplification and Sequence Analysis
A portion of the RNA polymerase II second largest subunit (RPB2) gene was amplified for all 112 UST isolates using polymerase chain reaction (PCR) with primers RPB2F (5′-AACCACCGATTTGGAGCAGT-3′) and RPB2R (5′-ACTCATTAGATGGCGGGGAGA-3′). The primers were designed from the sequence with NCBI accession number DQ846896.1. PCR amplification was performed in a Q cycler 96 (Hain Lifescience, UK). Each PCR reaction mixture (50 µl) consisted of 50 ng template DNA, 1.5 µM of each primer, 1.5 mM MgCl2, 0.2 mM of each deoxynucleotide, and 1.5 units of Taq DNA polymerase (New England Biolabs, USA), made up to the final volume with distilled water. Amplification consisted of initial denaturation at 94 °C for 4 min, followed by 35 cycles of denaturation at 94 °C for 1 min, annealing at 55 °C for 1 min, and extension at 72 °C for 1 min, and a final extension at 72 °C for 7 min. The amplified PCR products were separated on a 1% agarose gel, and the desired specific band was purified with the Wizard SV Gel and PCR Clean-Up System (Promega, Madison, WI, USA) according to the manufacturer's instructions. DNA was sequenced commercially at Eurofins Genomics India Pvt. Ltd. (Bangalore, India). A consensus sequence was obtained from the sequencing of both forward and reverse strands, and data quality was further checked using Chromas 2.32 (Technelysium Pty. Ltd.). The BlastN search programme was used to compare the sequences with those available in the National Center for Biotechnology Information (NCBI) databases. The sequences were aligned using MEGA 7, and gaps and missing data were not considered for phylogenetic analysis. The evolutionary tree was drawn using the neighbor-joining (NJ) method (Saitou and Nei, 1987), and evolutionary distances were determined by the method of Kimura (1980). The nucleotide sequences of the RPB2 gene were submitted to NCBI GenBank.
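The distance calculation mentioned above can be illustrated with a minimal sketch. It assumes that "Kimura (1980)" refers to the two-parameter (K2P) model, d = −½ ln[(1 − 2P − Q)√(1 − 2Q)], where P and Q are the proportions of transitional and transversional differences. The function name and the handling of gaps are illustrative choices, not taken from MEGA 7.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1, seq2):
    """Kimura two-parameter distance between two aligned sequences.

    Sites where either sequence carries a gap or an ambiguous base are
    skipped, mirroring the exclusion of gaps and missing data in the text.
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    valid = transitions = transversions = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # gap or ambiguous base: site not compared
        valid += 1
        if a == b:
            continue
        # A<->G and C<->T are transitions; every other change is a transversion
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    if valid == 0:
        raise ValueError("no comparable sites")
    p = transitions / valid
    q = transversions / valid
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))
```

A matrix of such pairwise distances is the usual input to neighbor-joining tree construction.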

Microsatellite Genotyping
For the amplification of each microsatellite marker, gradient PCR was performed to select the best annealing temperature. PCR amplifications were performed in a Q cycler 96 (Hain Lifescience, UK) in a total volume of 10 µl containing Promega™ PCR Master Mix, an additional 0.5 mM MgCl2, 0.05-0.15 µM forward primer, 0.05-0.15 µM reverse primer, and 1.0 µl template DNA (50 ng/µl). A PCR cocktail without template DNA was used as a control. The PCR program consisted of initial denaturation at 94 °C for 4 min, followed by 35 cycles of denaturation at 94 °C for 60 s, annealing at 51, 52, 53, 54, and 55 °C for 1 min, and extension at 72 °C for 1 min, with a final extension step at 72 °C for 7 min. PCR products were separated on a 4% agarose gel stained with ethidium bromide, alongside a 100 bp ladder (Genei, Bangalore), to assess polymorphism. Test isolates were scored on the basis of amplification and non-amplification of SSR markers. The number of different alleles per locus, effective alleles per locus, private alleles, and Shannon's Information Index were computed for each population using GenAlEx 6.5 (Peakall and Smouse, 2012). The polymorphic information content (PIC) value for each SSR marker was calculated using a formula in which k is the total number of alleles detected for a microsatellite and Pi is the frequency of the i-th allele in the set.
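The PIC formula itself does not appear in the text above. A widely used simplified definition that matches the stated symbols is PIC = 1 − Σ Pi² (summed over the k alleles). Treating this as the formula intended here is an assumption, and the sketch below only illustrates that definition. The function names are hypothetical.

```python
from collections import Counter


def allele_frequencies(allele_calls):
    """Frequency of each allele among the scored isolates at one locus."""
    counts = Counter(allele_calls)
    total = len(allele_calls)
    return [count / total for count in counts.values()]


def pic(frequencies):
    """Simplified PIC, 1 - sum(P_i^2), over the k alleles of one marker.

    Assumption: this is the simplified form, not the full Botstein formula.
    """
    if abs(sum(frequencies) - 1.0) > 1e-6:
        raise ValueError("allele frequencies must sum to 1")
    return 1.0 - sum(p * p for p in frequencies)
```

Under this definition a locus with two equally frequent alleles has PIC = 0.5, and the value rises toward 1 as more alleles of similar frequency are present.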

Population Structure and Gene Flow
The presence (1) and absence (0) of the desired amplicon for each SSR marker in all 112 isolates were treated as binary characters and analyzed using the NTSYS-pc program V2.1 (Rohlf, 2002). All the isolates were grouped into clusters using the Un-weighted Pair Group Method with Arithmetic average [UPGMA; (Yu et al., 2006)] in the SAHN subprogram. The Dice similarity coefficient, based on the proportion of shared alleles and computed with the SIMQUAL subprogram, was used to estimate the genetic similarity between isolates. Analysis of molecular variance (AMOVA) based on SSR markers was calculated using GenAlEx 6.5 to assess the genetic diversity in different populations [Table 1; (Peakall and Smouse, 2012)]. The fixation index (Fst) of the total populations and pairwise Fst among all pairs of populations were calculated to investigate population differentiation, and significance was tested based on 1,000 bootstraps. Gene flow among populations was estimated as the number of migrants per generation (Nm). Population structure analysis was executed with STRUCTURE 2.3.4 (Pritchard et al., 2000) using microsatellite loci data. The optimum number of populations (K) was selected by testing K = 1 to K = 15 using five independent runs, each with a burn-in period of 25,000 followed by 100,000 iterations, under a model allowing for admixture and correlated allele frequencies. The optimum number of populations was predicted using the simulation method of Evanno et al. (2005) in STRUCTURE HARVESTER version 0.6.92 (Earl and vonHoldt, 2012). The ΔK value was determined from the log probability of data [LnP(D)], based on the rate of change in LnP(D) between successive K values. Bottleneck v1.2 was used to determine whether there was an excess (a recent population bottleneck) or deficit (a recent population expansion) in H (gene diversity) relative to the number of alleles present in UST populations (Piry et al., 1999).
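The relation used to convert Fst into Nm is not shown in the text. A minimal sketch, assuming the standard island-model form Nm = 0.25 (1 − Fst) / Fst, is given below. The function name is hypothetical and the choice of this particular relation is an assumption.

```python
def migrants_per_generation(fst):
    """Number of migrants per generation from Fst under the island model.

    Assumes Nm = 0.25 * (1 - Fst) / Fst; undefined for Fst <= 0 or Fst >= 1.
    """
    if not 0 < fst < 1:
        raise ValueError("Fst must lie strictly between 0 and 1")
    return 0.25 * (1 - fst) / fst
```

For example, `migrants_per_generation(0.016)` gives about 15.4, close to the Nm of 14.989 reported for the pair with Fst = 0.016. The small difference would be expected if the published Fst is rounded.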
To determine whether loci displayed a significant excess or deficit in gene diversity, sign and Wilcoxon significance tests (WT) were performed (Cornuet and Luikart, 1996). Tajima's D value was estimated using the DnaSP program (Tajima, 1983). The number of haplotypes, the number of segregating sites, and the π and θw measures of nucleotide diversity for each population were determined with clone-corrected samples, whereas Nei's haplotype diversity (Hd) was also determined before clone correction. The values of π and θw are based on the average number of pairwise nucleotide differences and the total number of segregating sites in a set of DNA sequences, respectively. MULTILOCUS version 1.31 was used to measure linkage disequilibrium among SSR loci using the index of association (IA) and the rBarD index (Agapow and Burt, 2001). Tests of departure from random mating for both indices were done with 10,000 randomizations of the complete and clone-corrected MLH datasets. To examine recombination in UST populations, the proportion of compatible pairs of loci (PrCP) was determined using MULTILOCUS v.1.31 (Agapow and Burt, 2001). The null hypothesis of random mating was rejected if more compatible loci than expected in a randomized population were observed (P < 0.05).
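The two association indices can be sketched as follows. This is a minimal illustration of the definitions in Agapow and Burt (2001), not a reimplementation of MULTILOCUS. It assumes haploid genotypes with one allele state per locus and omits the randomization test used to obtain P-values. The function names are hypothetical.

```python
from itertools import combinations
from math import sqrt


def _variance(values):
    """Population variance (divides by the number of values)."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def association_indices(genotypes):
    """Return (IA, rBarD) for a list of haploid multilocus genotypes.

    Each genotype is a sequence with one allele state per locus.
    """
    n_loci = len(genotypes[0])
    pairs = list(combinations(genotypes, 2))
    # Per-locus distance for every pair of isolates: 0 if same allele, else 1
    per_locus = [[0 if a[j] == b[j] else 1 for a, b in pairs]
                 for j in range(n_loci)]
    # Total distance per pair is the sum over loci
    total = [sum(column) for column in zip(*per_locus)]
    v_obs = _variance(total)
    locus_vars = [_variance(d) for d in per_locus]
    v_exp = sum(locus_vars)  # variance expected if loci are independent
    denom = 2 * sum(sqrt(vj * vk) for vj, vk in combinations(locus_vars, 2))
    if v_exp == 0 or denom == 0:
        raise ValueError("indices are undefined without polymorphic loci")
    return v_obs / v_exp - 1, (v_obs - v_exp) / denom
```

With two completely linked loci, e.g. genotypes `[(0, 0), (1, 1), (0, 0), (1, 1)]`, both indices equal 1. Unlike IA, rBarD is scaled so that its maximum does not grow with the number of loci.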

RESULTS
The population statistic parameters revealed statistically significant negative values of Tajima's D (−1.68380 to −0.77815) in UST populations of different wheat growing zones, providing evidence that purifying selection and population expansion are operating in UST isolates (Table 2). Similarly, Fu and Li's D* and F* test statistics gave analogous results for UST isolates and pointed to purifying selection and population size expansion in different wheat growing zones. The statistically significant negative value of Fu's Fs statistic in all zones except NEPZ further denotes the expansion observed in the CZ, NWPZ, and NHZ populations (Table 2).
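To make the Tajima's D statistic concrete, the sketch below computes the number of segregating sites (S), the mean number of pairwise differences (π), Watterson's estimator (θw = S / a1), and D from a set of aligned sequences, following the standard formulation of Tajima's test. It assumes a gap-free alignment and is an illustration only, not the DnaSP implementation. The toy sequences in the usage note are hypothetical, not study data.

```python
from itertools import combinations
from math import sqrt


def tajimas_d(sequences):
    """Return (S, pi, theta_w, D) for aligned, gap-free sequences."""
    n = len(sequences)
    if n < 4:
        raise ValueError("at least four sequences are needed")
    if len({len(s) for s in sequences}) != 1:
        raise ValueError("sequences must have equal length")
    # Segregating sites: alignment columns with more than one base
    seg_sites = sum(1 for column in zip(*sequences) if len(set(column)) > 1)
    if seg_sites == 0:
        raise ValueError("D is undefined without segregating sites")
    # pi: average number of differences over all sequence pairs
    diffs = [sum(x != y for x, y in zip(a, b))
             for a, b in combinations(sequences, 2)]
    pi = sum(diffs) / len(diffs)
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    theta_w = seg_sites / a1
    d = (pi - theta_w) / sqrt(e1 * seg_sites + e2 * seg_sites * (seg_sites - 1))
    return seg_sites, pi, theta_w, d
```

For the toy alignment `["AAAA", "AAAT", "AATA", "ATAA"]`, every variant is a singleton, so π (1.5) falls below θw (about 1.64) and D is negative. An excess of rare variants of this kind is the pattern usually read as population expansion or purifying selection.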

Marker Development, SSRs Polymorphism and Gene Diversity
To evaluate allelic diversity among UST isolates, 35 microsatellite markers were used. Twenty-five SSR primer pairs produced clear single amplicons, while the rest did not amplify. Of the 25 primer pairs, only 16 (Table 3) showed polymorphism and were therefore used in the genetic diversity analysis of 112 UST isolates originating from four geographically distinct zones of India. The polymorphism of the different SSR loci is presented in Table 3. The number of alleles per locus varied from two to four, and allele size ranged from 170 to 750 bp. The PIC values ranged from 0.3713 to 0.6632. One SSR locus (UST31) was highly informative (PIC ≥ 0.5), and all the remaining loci were reasonably informative (0.25 < PIC < 0.5). The 16 polymorphic primer pairs revealed a total of 68 alleles across the 34 loci in 112 isolates, ranging from 2 to 4 alleles per isolate (Table 3). The markers were all selectively neutral according to the Ewens-Watterson test (Table S2).

Population Genetic Diversity
The genetic diversity indices for the four UST populations from different wheat growing zones are presented in Table 4. The number of different alleles (Na), effective alleles (Ne), and expected heterozygosity (He) averaged across all loci ranged from 1.824 to 2.0, 1.695 to 1.721, and 0.373 to 0.406, respectively, for the four populations (CZ, NEPZ, NHZ, and NWPZ). The    (Table S2), and responsible for 91% of the total genetic diversity (HT = 0.3611). The proportion of the total genetic diversity attributable to population differentiation (Gst) ranged from 0.0029 to 0.391, with an average of 0.0678 over all loci (Table S2).
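The per-locus indices named above can be sketched from their usual definitions: Na is the number of distinct alleles, Ne = 1 / Σp², He = 1 − Σp², and Shannon's index I = −Σ p ln p, where p is an allele frequency. The sketch assumes one allele call per isolate and uses the uncorrected (biased) estimators. It is not a reproduction of the GenAlEx output, and the function name is hypothetical.

```python
from collections import Counter
from math import log


def locus_diversity(allele_calls):
    """Diversity indices for one locus from a list of allele calls."""
    if not allele_calls:
        raise ValueError("at least one allele call is required")
    total = len(allele_calls)
    freqs = [count / total for count in Counter(allele_calls).values()]
    homozygosity = sum(p * p for p in freqs)
    return {
        "Na": len(freqs),                        # number of different alleles
        "Ne": 1 / homozygosity,                  # effective number of alleles
        "He": 1 - homozygosity,                  # expected heterozygosity
        "I": -sum(p * log(p) for p in freqs),    # Shannon's information index
    }
```

For a locus with two equally frequent alleles this gives Na = 2, Ne = 2, He = 0.5, and I = ln 2 (about 0.693). Population-level values such as those in Table 4 are averages of these quantities across loci.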

Population Genetic Structure and Gene Flow
The AMOVA comparing the four populations showed that 1% of the total variance was distributed among zones. A relatively higher proportion of the variation (91%) was distributed within UST isolates (Table 5). Genetic variation among wheat growing zones (Fst = 0.012), among isolates (Fis = 0.079), and within isolates (Fit = 0.090) is given in Table 5. Pairwise Fst values of the genetic distance between different populations were low but significant (P < 0.01; Table 6). Gene flow among populations (Nm) ranged from 0.00 (between NWPZ and NHZ; between NWPZ and CZ) to 29.131 (between CZ and NHZ). Pairwise estimates of gene flow indicated that Nm was more than 1.0 in most of the population pairs, suggesting gene flow between populations, although of different magnitudes, except for the NWPZ and NHZ pair and the NWPZ and CZ pair (Table 6). The highest value was observed for the CZ and NHZ pair (Nm = 29.131; Fst = 0.048), followed by the CZ and NEPZ pair (Nm = 14.989; Fst = 0.016). When population of NEPZ was compared with other populations for genetic distance (Fst) on the basis of SSR (0.009-0.037) and  Table 6).
The dendrogram based on unweighted Neighbor-joining method grouped all the 112 isolates representing four populations into two major clusters (Figure 2). Among these, 100 and 12 isolates were grouped in cluster 1 and cluster 2, respectively. The grouping by UPGMA using genetic distances did not show any spatial clustering among the different geographic zones (Figure 2). Several subgroups within cluster 1 were observed irrespective of population, indicating genetic variability within and among isolates in each population. The similarity coefficient over all isolates averaged 0.50. The substructure analysis of genetic relationships among UST isolates, excluding loci with null alleles, showed a clear ΔK peak at K = 2 (ΔK = 113.96) (Figure 3A). K = 2 was therefore the most likely value, indicating that all individuals grouped into two major clusters (Figure 3B).
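The ΔK criterion of Evanno et al. (2005) can be sketched as follows. The version below uses the mean LnP(D) across replicate runs at each K, takes the absolute second-order difference of those means, and divides by the sample standard deviation of LnP(D) at that K. This mean-based convention and the function name are assumptions, and the numbers in the usage note are hypothetical, not values from this study.

```python
from statistics import mean, stdev


def delta_k(lnp_by_k):
    """Map each interior K to Delta K.

    lnp_by_k maps K to the list of LnP(D) values from replicate runs.
    K values lacking a neighbor on either side, or with zero spread
    across runs, are skipped.
    """
    means = {k: mean(values) for k, values in lnp_by_k.items()}
    result = {}
    for k in sorted(lnp_by_k):
        if k - 1 not in means or k + 1 not in means:
            continue
        second_diff = abs(means[k + 1] - 2 * means[k] + means[k - 1])
        spread = stdev(lnp_by_k[k])
        if spread > 0:
            result[k] = second_diff / spread
    return result
```

For instance, with hypothetical runs `{1: [-1000, -1002, -998], 2: [-800, -801, -799], 3: [-790, -795, -785], 4: [-785, -790, -780]}`, ΔK is 190 at K = 2 and 1 at K = 3, so K = 2 would be selected. ΔK cannot be evaluated at the smallest or largest K tested.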

Linkage Disequilibrium and Population Expansion
Linkage disequilibrium analysis was performed to infer the reproductive strategy. The values of the IA (0.991-2.034) and rBarD (0.066-0.134) indices in the association tests differed significantly from zero in all the UST populations (Table 7). The null hypothesis of gametic equilibrium was therefore rejected for UST isolates sampled from the different wheat growing zones, showing that isolates in each zone were not mating at random (Table 7).
The results depicted in Table 8 show that all the UST populations evolved according to the stepwise mutation model (SMM).
Table 7 footnote: Index of association (IA) and rBarD, a modified index of association; P-values were estimated from 1,000 randomizations and are identical for IA and rBarD.
The sign tests in Bottleneck revealed a significant H deficit in 1 of the 13 loci under the infinite allele model (IAM; P = 0.00027), two-phase model (TPM; P = 0.00070), and stepwise mutation model (SMM; P = 0.00132) of evolution, indicating recent population expansion in CZ. Similarly, in the NHZ and NWPZ populations, a significant H deficit in 1 of the 14 loci was observed under the IAM, TPM, and SMM models of evolution (Table 8).
The bottleneck analysis supported the absence of any bottleneck in UST populations in the recent past. The concept of heterozygosity excess rests on the principle that the observed gene diversity is higher than the expected equilibrium gene diversity (Heq) in a recently bottlenecked population (Table 8).
In the CZ population, under the sign test with the IAM mutation model, the expected number of loci with heterozygosity excess was 6.35, while the observed number of loci with heterozygosity excess was 13 (Table 8). The expected and observed loci with heterozygosity excess calculated by using TPM and SMM models were 6.86 and 7.22, respectively. Similarly, the outcome for IAM, TPM, and SMM supported the absence of any bottleneck in the CZ population. Similar trends were also observed for the NHZ, NEPZ, and NWPZ populations using TPM and SMM, although one locus with heterozygosity deficiency (P = 0.00002) was also observed in the NEPZ population using the IAM model. The standardized differences test (SDT) gave T2 statistics of 4.024, 3.492, and 3.168 for IAM, TPM, and SMM, respectively, in the CZ population. The probability values were significant for IAM (P = 0.00003), SMM (P = 0.00024), and TPM (P = 0.00077). Thus, the null hypothesis was accepted by all the models. The one-tailed probability values of the WT for H excess under the three models, IAM (P = 0.00006), TPM (P = 0.00009), and SMM (P = 0.00015), indicated acceptance of the null hypothesis under all the models in the CZ population. Thus, all three tests (ST, SDT, and WT) indicated the acceptance of mutation-drift equilibrium (P > 0.05) in UST populations under all the mutation models for the CZ, NHZ, NEPZ, and NWPZ populations. Another powerful test, the qualitative graphical method based on the allele frequency spectra, detected a normal L-shaped curve, in which the alleles with the lowest frequencies (0.03-0.3) were the most abundant across all the wheat growing zones (Figure S2).
Table 8 footnote: Probability of deviation from mutation-drift equilibrium (H > Heq) was determined according to Cornuet and Luikart (1996).

DISCUSSION
Loose smut is a monocyclic, internally seed-borne disease. The seed-borne inoculum leads to rapid long-distance spread of the disease across the wheat growing zones of India. The Ustilago segetum tritici isolates were collected from four major wheat growing zones (NWPZ, NEPZ, CZ, and NHZ) of India. The genetic structure was analyzed by RPB2 gene sequence comparison and fingerprinting with newly developed SSR markers. The degree of nucleotide difference in the RPB2 region in UST populations is low. This may be due to concerted evolution, which has a homogenizing effect. Furthermore, no evidence of recombination (Rm = 0) was detected in the entire UST population. The low but significant Fst values (<0.01 and <0.05) and pairwise population differentiation among UST populations from different zones indicate low genetic differentiation in the total populations (Fst = 0.012). The UST populations appear either to be of common origin or limited distribution, to reproduce predominantly by asexual means, or to experience substantial gene flow (from CZ to NHZ and NEPZ and later from NEPZ to NWPZ and NHZ) coupled with genetic drift. However, in populations where mutation rates are high, Fst tends to fall back to zero (as in the case of CZ) as novel alleles are added to the population (Onaga et al., 2015). This happens because of the negative dependence of Fst on diversity (Charlesworth et al., 1997) and has been reported in several pathogens, even in the absence of asexual reproduction (Couch et al., 2005). To offset this bias due to mutation rates, Fst was compared with other genetic diversity indices. Analysis of molecular variance (AMOVA) showed that 91% of the total variation was due to differences among isolates within populations, and the variation among populations reflected only 1% of the total variation. The low degree of differentiation among UST populations may be due to admixture among isolates from the different geographic regions.
These results are also consistent with the low Shannon's indices (0.531-0.589). The overall Shannon's index (I = 0.552) suggests that more than 50% of the genetic diversity is explained by the differences between isolates. Therefore, all these results indicate that most of the genetic variation (91%) was distributed among isolates across the regions. Similar findings have been reported earlier for Ustilago maydis (Valverde et al., 2000), Mycosphaerella graminicola (Boeger et al., 1993), Phytophthora infestans (Goodwin et al., 1994), Rhynchosporium secalis (McDonald et al., 1999), and Rhizoctonia solani (Goswami et al., 2017) in analyses of their population structure. It is worth mentioning here that the fungal isolates are mostly similar at the genetic level despite the long distances among different wheat growing zones in India.
Haplotype analysis performed in the present study provides information on the number of haplotypes (h), their frequency and diversity, and genetic distances within and between RPB2 gene sequences. Hd can range from zero to 1.0, corresponding to no diversity and high levels of haplotype diversity, respectively (Nei and Tajima, 1981). In the present study, Hd values (0.104-0.473) indicated low to moderate levels of diversity in the different wheat growing zones. NWPZ showed the maximum diversity based on the number of haplotypes (i.e., 51 haplotypes from 56 UST isolates). Few haplotypes were shared among different populations, indicating a role for asexual reproduction and long-distance dispersal. Contrary to this, sexual reproduction, which occurs at the site of infection on the spike, was apparently prevalent in Indian wheat growing areas. Besides this, two other reasons may explain the genetic variability among isolates. First, there may have been multiple founder populations, resulting in the admixture of populations. Second, population genetic expansion may have taken place through the accumulation of different alleles in UST populations, as is evident from the total and shared mutations noticed in the present study. All of these explanations are plausible, as evidenced by the high levels of population admixture identified in the STRUCTURE analysis. Mutation generates diverse regional populations of UST, creating a pool of mutants from which new, virulent isolates can emerge. The sharing of many haplotypes among these populations supports multiple introductions of the same haplotype, which could be due to a pronounced asexual reproductive phase, as is the case with UST. Furthermore, the analysis revealed that all UST populations are admixed and contain haplotypes from multiple populations within a region or between regions.
To investigate the role of evolutionary forces on the UST population, different neutrality test statistics (Tajima's D, Fu and Li's D* and F*, and Fu's Fs) were used to examine the RPB2 sequence data for departure from neutrality. Significant and negative Tajima's D test statistics indicated that the RPB2 locus is experiencing population bottlenecks, where the population is largely uniform and only a few sequences compose the new population. The biology of UST and its colonization could serve as a source of population bottlenecks. Similarly, significant and negative values of almost all D* and F* test statistics showed strong purifying selection. Overall, the RPB2 gene data displayed genetic divergence in the structure of the population among the four wheat growing zones analyzed, and this was well supported by the results from microsatellite loci. Moreover, the 16 newly developed SSR markers used were polymorphic across all of the UST isolates. These results are comparable to earlier reports on SSR markers developed for other plant pathogens (Yang and Zhong, 2008; Pouzeshimiab et al., 2014). Thus, these SSR markers could be a useful tool to study the population biology and genetics of this fungus at the global level.
The loose smut fungus is carried as dormant mycelium within healthy-appearing seed and is spread by growing infected seed. Moreover, teliospores are easily shaken from the smutted heads and may be carried for long distances by wind, insects, or other agencies (Ram and Singh, 2004). The low differentiation among regions (8% of total variation) detected on the basis of microsatellites can be explained in different ways. Firstly, the level of gene flow (Nm) is sufficient to maintain genetic similarity. The low levels of population differentiation corresponded to high values of Nm. An Nm value >1 indicates little differentiation among populations, and under such circumstances migration is more important than genetic drift (McDermott and McDonald, 1993). Theoretically, the average gene flow value (Nm = 21) indicates that 21 isolates would need to be exchanged each generation among populations of different regions to achieve the current degree of similarity. The highest gene flow was recorded between the CZ and NHZ populations (Nm = 29.131). Regular gene flow and random mating among isolates from various populations could result in new pathotypes with improved pathological and biological fitness traits (Mishra et al., 2006). Besides this, the inbreeding coefficient (Fit = 0.090) indicated little genetic differentiation across UST populations. Corroborating results were also observed in American populations of P. nodorum (Fst = 0.004; Stukenbrock et al., 2006), a North American population of Mycosphaerella graminicola (Fst = 0.08; Zhan et al., 2003), and Septoria musiva populations in north-central and northeastern North America (Fst = 0.20; Feau et al., 2005). In UST populations, SSR data provided no discrete clustering of the different populations on the basis of structure analysis, and little among-population variance (8%) was observed in AMOVA.
Therefore, no specific demarcation of genetic grouping was noticed, and results further suggest large and widespread populations with high migration rates facilitated by wind-dispersed teliospores and frequent exchange and long distance transport of infected seed material in different wheat growing zones.
In the present study, the plausible reasons for the structuring of the regional collection were not elucidated, since neither a clear main direction of gene flow among the sampled sites nor significant isolation by distance (P = 0.49; > 0.05) was observed. The genetic identities of the four populations evaluated in the present study were close to 1. The moderate Gst value (0.0678) indicated weak genetic differentiation and minimal geographic clustering among the populations from the four zones and yielded an average Nm value of 6.875 across all loci and populations, suggesting that the level of gene flow was approximately seven times greater than that needed to prevent populations from diverging by genetic drift. Moreover, the absence of private alleles in all four zonal populations may indicate that the observed migration levels reflect gene flow. The wind direction in India is generally from North West to North East during the wheat cropping season. This may cause migration and gene flow between populations, leading to admixture among isolates of diverse geographic origin, as observed in the present study. The pathogen is cosmopolitan in distribution, and teliospores are known to be disseminated over long distances by wind (Ram and Singh, 2004). Besides this, long-distance gene flow in UST was human mediated, and subsequent natural gene flow gradually reduced the isolation by distance.
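As a consistency check, the average Nm quoted in this paragraph can be reproduced from the average Gst of 0.0678 if one assumes the relation Nm = 0.5 (1 − Gst) / Gst. The text does not state which relation was used, so this is an inference from the reported numbers rather than a documented step of the analysis:

```latex
N_m = \frac{0.5\,(1 - G_{st})}{G_{st}}
    = \frac{0.5\,(1 - 0.0678)}{0.0678}
    = \frac{0.4661}{0.0678}
    \approx 6.875
```

Because Nm = 1 is the conventional threshold above which migration outweighs drift, a value near 7 corresponds to the "approximately seven times" statement in the text.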
Mutation is the main evolutionary mechanism that generates polymorphisms, and its implications for disease management are well known (Jolley et al., 2005). For UST, point mutations (4 mutations) in the sequence of the RPB2 gene resulted in the introduction of new alleles into the population. Similar observations were documented by Lourenço et al. (2009), who studied the molecular diversity and evolutionary processes of Alternaria solani, a seed-borne pathogen in Brazil, using genealogical and coalescent approaches. However, the relatively low number of singleton mutations estimated in the present study for the RPB2 locus in different wheat growing zones does not signify a low mutation rate across the whole genome. Therefore, other housekeeping genes or genomic regions should be analyzed for a more accurate quantification of mutation occurrence in UST populations. Further, bottleneck analysis based on three models (IAM, TPM and SMM) indicated that the observed heterozygosity excess (He) was less than the expected excess heterozygosity (Hee) in all four wheat growing zones. Thus, the lower magnitude of He relative to the respective Hee reflects the absence of a genetic bottleneck in UST populations. Further, the negative TD-value of the UST population indicates that the population is undergoing demographic expansion. Further support for this hypothesis comes from the lack of private alleles in UST populations collected from all wheat growing zones.
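To illustrate why a negative value is read as a sign of expansion, the following is a minimal Python sketch of Tajima's D. It assumes that the "TD-value" above refers to Tajima's D, since the abbreviation is not expanded in the text. The function name `tajimas_d` and the toy sequences are hypothetical and are not data from this study. The statistic compares the mean number of pairwise differences with the number of segregating sites scaled by a harmonic sum; an excess of rare variants, as expected after population growth, pushes it below zero.

```python
from itertools import combinations
from math import sqrt


def tajimas_d(seqs):
    """Tajima's D for aligned, equal-length sequences (at least four).

    Gaps or ambiguity codes are treated as ordinary characters here,
    which is a simplification.
    """
    n = len(seqs)
    if n < 4:
        raise ValueError("need at least four sequences")
    if len({len(s) for s in seqs}) != 1:
        raise ValueError("sequences must be aligned to the same length")

    # Number of segregating (polymorphic) sites
    s_sites = sum(1 for column in zip(*seqs) if len(set(column)) > 1)
    if s_sites == 0:
        raise ValueError("no segregating sites; D is undefined")

    # Mean number of pairwise differences
    n_pairs = n * (n - 1) / 2
    pi = sum(
        sum(a != b for a, b in zip(s1, s2))
        for s1, s2 in combinations(seqs, 2)
    ) / n_pairs

    # Constants from Tajima (1989)
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n**2 + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)

    variance = e1 * s_sites + e2 * s_sites * (s_sites - 1)
    return (pi - s_sites / a1) / sqrt(variance)


if __name__ == "__main__":
    # Toy alignment in which every variant is a singleton: D is negative
    print(tajimas_d(["AAAA", "AAAT", "AATA", "ATAA"]))
    # Toy alignment with intermediate-frequency variants: D is positive
    print(tajimas_d(["AA", "AA", "TT", "TT"]))
```

A negative D is also expected under purifying selection or a selective sweep, so the statistic alone does not distinguish demographic expansion from selection.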
Knowledge of the population genetic structure of a pathogen provides information on its potential to overcome host genetic resistance (McDonald and Linde, 2002). The results of the present study showed that UST has a clonal genetic structure with limited differentiation between populations. This means that variability is mainly contributed by mutation and that recombination is uncommon. Therefore, wheat disease management measures, such as the replacement of infected seed and the use of fungicide-treated seed, could help to reduce UST severity and limit gene flow. Host resistance is also an economical and effective means of managing loose smut of wheat (Singh, 2018), and breeding efforts in different wheat growing zones have put emphasis on the exploration of more resistance sources and other gene pools to fill this gap. In addition, screening of germplasm and breeding material against genetically diverse isolates needs to be emphasized to develop durable and effective resistant cultivars. In a nutshell, the current study presents a first attempt to understand the genetic variation within and among populations of UST causing loose smut in different wheat growing regions. The results highlight that microsatellite markers can be used to analyze the genotypic and genetic diversity of UST populations.

AUTHOR CONTRIBUTIONS
The work was conceived and designed by PK and SK. The sampling survey was performed by PK, SK, PJ, and DS. Experiments were conducted by RT, PK, and RK. Data analysis was done by PJ and RT. The manuscript was drafted by PK and SK. The final editing and proofing of the manuscript were done by DS and GS. RK and RT contributed equally. The manuscript was approved by all the authors.