Viruses Without Borders: Global Analysis of the Population Structure, Haplotype Distribution, and Evolutionary Pattern of Iris Yellow Spot Orthotospovirus (Family Tospoviridae, Genus Orthotospovirus)

Iris yellow spot, caused by Iris yellow spot orthotospovirus (IYSV) (Genus: Orthotospovirus, Family: Tospoviridae), is an important disease of Allium spp. The complete N gene sequences of 142 IYSV isolates of curated sequence data from GenBank were used to determine the genetic diversity and evolutionary pattern. In silico restriction fragment length polymorphism (RFLP) analysis, codon-based maximum likelihood studies, genetic differentiation and gene flow within the populations of IYSV genotypes were investigated. Bayesian phylogenetic analysis was carried out to estimate the evolutionary rate. In silico RFLP analysis of N gene sequences categorized IYSV isolates into two major genotypes viz., IYSV Netherlands (IYSVNL; 55.63%), IYSV Brazil (IYSVBR; 38.73%) and the rest fell in neither group [IYSV other (IYSVother; 5.63%)]. Phylogenetic tree largely corroborated the results of RFLP analysis and the IYSV genotypes clustered into IYSVNL and IYSVBR genotypes. Genetic diversity test revealed IYSVother to be more diverse than IYSVNL and IYSVBR. IYSVNL and IYSVBR genotypes are under purifying selection and population expansion, whereas IYSVother showed decreasing population size and hence appear to be under balancing selection. IYSVBR is least differentiated from IYSVother compared to IYSVNL genotype based on nucleotide diversity. Three putative recombinant events were found in the N gene of IYSV isolates based on RDP analysis, however, RAT substantiated two among them. The marginal likelihood mean substitution rate was 5.08 × 10–5 subs/site/year and 95% highest posterior density (HPD) substitution rate between 5.11 × 10–5 and 5.06 × 10–5. Findings suggest that IYSV continues to evolve using population expansion strategies. The substitution rates identified are similar to other plant RNA viruses.

Iris yellow spot, caused by Iris yellow spot orthotospovirus (IYSV) (Genus: Orthotospovirus, Family: Tospoviridae), is an important disease of Allium spp. The complete N gene sequences of 142 IYSV isolates of curated sequence data from GenBank were used to determine the genetic diversity and evolutionary pattern. In silico restriction fragment length polymorphism (RFLP) analysis, codon-based maximum likelihood studies, genetic differentiation and gene flow within the populations of IYSV genotypes were investigated. Bayesian phylogenetic analysis was carried out to estimate the evolutionary rate. In silico RFLP analysis of N gene sequences categorized IYSV isolates into two major genotypes viz., IYSV Netherlands (IYSV NL ; 55.63%), IYSV Brazil (IYSV BR ; 38.73%) and the rest fell in neither group [IYSV other (IYSV other ; 5.63%)]. Phylogenetic tree largely corroborated the results of RFLP analysis and the IYSV genotypes clustered into IYSV NL and IYSV BR genotypes. Genetic diversity test revealed IYSV other to be more diverse than IYSV NL and IYSV BR . IYSV NL and IYSV BR genotypes are under purifying selection and population expansion, whereas IYSV other showed decreasing population size and hence appear to be under balancing selection. IYSV BR is least differentiated from IYSV other compared to IYSV NL genotype based on nucleotide diversity. Three putative recombinant events were found in the N gene of IYSV isolates based on RDP analysis, however, RAT substantiated two among them. The marginal likelihood mean substitution rate was 5.08 × 10 −5 subs/site/year and 95% highest posterior density (HPD) substitution rate between 5.11 × 10 −5 and 5.06 × 10 −5 . Findings suggest that IYSV continues to evolve using population expansion strategies. The substitution rates identified are similar to other plant RNA viruses.
The disease caused by IYSV is characterized by chlorotic or necrotic, straw-colored to white, dry, elongated or spindle shaped lesions along the scape ( Figure 1A). Lesions are frequently at middle to lower portions of the scape. The diamond-shaped lesions tend to be less defined on leaves (Pappu et al., 2008;Bag et al., 2015). The photosynthetic activity is affected in the infected plants leading to reduced bulb size. As the disease progresses, the lesions girdle the scape causing the seed head to collapse leading to severe crop losses ( Figure 1B; Gent et al., 2006). IYSV, as other tospoviruses, consists of a tripartite genome: Small (S) and Medium (M) RNAs encode proteins in both sense and antisense orientations (ambisense) while the Large (L) RNA encodes protein from negative sense strand. The L RNA codes for RNA dependent RNA polymerase (RdRp), M RNA codes for glycoprotein precursors (G N and G C ) and the non-structural movement protein (NSm), and the S RNA codes for nucleocapsid (N) and non-structural silencing suppressor protein (NSs) (reviewed in Bag et al., 2015;Pappu et al., 2020;Resende et al., 2020).
Genetic evolution of viruses directly impacts the host-virus interactions and as such is important to ascertain genetic diversity within a viral species (Sacristan and Garcia-Arenal, 2008;Gibbs and Ohsima, 2010). Genetic drift, migration, mutation, natural selection, segment reassortment, and recombination are the major sources of evolutionary changes in the genetic architecture of viral populations (Moya et al., 2004;Butkovic et al., 2021). Phylo-geographical analysis is a powerful tool to determine the geographical distribution pattern of virus, assessing their genetic variation, and historical events that are shaping the genetic architecture of the viral populations (Hewitt, 2004;Chen et al., 2012). Comprehensive genetic architecture and evolutionary genomic analysis of viral populations have become a subject of increasing attention in a number of viruses.
While IYSV is widely distributed in the world, the complete genome of only a few isolates are sequenced. Since the N gene is considered as one of the descriptors for tospovirus identification and classification, the N gene of a large number of isolates was sequenced and the genetic diversity was determined (de Avila et al., 1993;Pappu et al., 2006;Nischwitz et al., 2007;Iftikhar et al., 2014;Bag et al., 2015). The number of N gene sequences of IYSV isolates reported since the last study (Iftikhar et al., 2014) has been on the rise. Building on the earlier findings, we carried out a more detailed and a global analysis of the extent of genetic recombination, genetic diversity, genetic differentiation, and gene flow among different genotypes of IYSV isolates reported from different parts of the world. Further, Bayesian modelbased coalescent approaches were used to gain insights into the molecular evolutionary pattern of IYSV population.

Data Source of Nucleocapsid (N) Gene Sequences
Complete nucleocapsid (N) gene sequences of 142 IYSV isolates reported from across the globe were obtained the nucleotide sequence repository, GenBank. IYSV isolates analyzed were from 19 countries spread over six continents-Africa, Asia, Australia and New Zealand, Europe, North America (Canada, Mexico and the United States) and South America, infecting 10 different hosts including Allium cepa (the most commonly reported host), Allium porrum, Eustoma russellianum, Allium tuberosum, Allium chinense, Wild onion, Alstroemeria sp., Allium sativum, and Allium fistulosum. The N gene sequence (HQ267713) derived from tomato spotted wilt orthotospovirus (TSWV) infecting pepper crop in South Korea was used as an outgroup (Supplementary Table 1). Only complete IYSV N gene sequences (822 nt-long open reading frame coding for a 273-amino acid protein) were considered for analysis.

In silico Restriction Fragment Length Polymorphism (RFLP) Analysis
N gene sequences were analyzed for sequence variations by performing in silico RFLP analysis using Restriction Mapper (Restriction Mapper, 2009). The complete N gene sequence was virtually digested, and sites were mapped as recognized by restriction enzyme HinfI (Zen et al., 2005). Based on HinfI digestion, IYSV isolates can be grouped into IYSV Netherlands (IYSV NL ) or IYSV Brazil (IYSV BR ) types. The size of the largest fragment generated by digestion is considered for differentiating the given isolate into two groups. The genotypes viz., IYSV NL and IYSV BR are differentiated based on the resultant 308 and 468 bp fragments, respectively. Those isolates that yielded any different fragment size upon restriction digestion were grouped into IYSV other .

Phylogeny Construction
Multiple sequence alignment (MSA) was performed using MUSCLE algorithm available in MEGA7 (Edgar, 2004). Best-fit model of nucleotide substitution was determined using MODEL TEST in MEGA7. Aligned sequence relatedness was evaluated using the Maximum Likelihood (default parameters with 1,000 bootstrap replicates) method based on Tamura parameter 3 model (T92) with Gamma distributed (G) available in MEGA7 (Kumar et al., 2016). The phylogenetic tree was rooted using TSWV N gene reported from South Korea as an outgroup.

Population Selection Studies and Neutrality Test
Mean rates of non-synonymous (dN) and synonymous substitutions (dS) were calculated using codon-based maximum likelihood methods, i.e., SLAC (single like ancestor counting), FEL (fixed effects likelihood), and REL (random effects likelihood). DATAMONKEY server (Weaver et al., 2018) was used to calculate dN/dS ratio. To test the theory of neutral evolution, the test statistics such as Tajimas's D (Tajima, 1989), Fu and Li's D, and Fu and Li's F (Fu and Li, 1993;Fu, 1997) were computed in DnaSP software.

Genetic Differentiation and Gene Flow Estimates
DnaSP was used to compute nucleotide test statistics such as Ks, Kst (Kst value close to zero indicate no differentiation), Snn (Snn value close to one indicates differentiation) (Hudson, 2000) and haplotype statistics Hs, Hst (Hudson et al., 1992a,b). These tests estimate genetic differentiation within the populations of IYSV genotypes. Fst statistics was used to estimate the extent of the gene flow (panmixia or free gene flow has values close to zero whereas infrequent gene flow attains values close to one) (Hudson et al., 1992b).

Recombination Detection Analysis (RDA)
Unaligned sequences were loaded in SDT v1.2 program, pairwise scan was performed with the MUSCLE, and the sequence data was saved with minimum identity of 70% and maximum of 100% to ensure sequences were properly aligned. The aligned IYSV N sequences were then used as an input query and analyzed for recombination events using Recombination Detection Program (RDP) v 4.0 (Martin and Rybicki, 2000), BOOTSCAN (Salminen et al., 1995), 3SEQ, GENECONV (Sawyer, 1999), MAXCHI (Maynard, 1992), CHIMAERA (Posada and Crandall, 2001) and SISCAN (Gibbs et al., 2000) available in RDP 4 Beta 4.88. Default settings for the different recombination detection methods and a Bonferroni corrected P-value cut-off of 0.05 were used for analysis.

Recombination Analysis Tool (RAT)
Recombination analysis tool (RAT) was used for the analysis of aligned nucleotide sequences (Etherington et al., 2005). RAT algorithm uses pairwise comparisons between sequences based on the distance method to identify recombinants in nucleotide sequence alignment. Percentage of nucleotide similarities were compared using a sliding window size of 10% of the sequence length and an increment size being half of the window size.

Bayesian Evolutionary Analysis by Sampling Trees (BEAST)
Bayesian phylogenetic analysis was performed in BEAST v2.4.6 (Bouckaert et al., 2014) to estimate evolutionary rate. Strict, relaxed (exponential, lognormal) and random local clocks were utilized for comparison (Bouckaert et al., 2014). Demographic models-coalescent constant population, coalescent exponential population, coalescent Bayesian skyline and coalescent extended Bayesian skyline were used to infer demographic history. "Temporal signal" (i.e., genetic changes between sampling times are sufficient and there is statistical relationship between genetic divergence and time) in the dataset was assessed using TempEst program (Rambaut et al., 2016). Using Markov Chain Monte Carlo (MCMC) method Bayesian phylogenies were constructed in BEAST v2.4.6. First 10% of the samples were discarded as burn-in. Convergence of the chain to stationary distribution and adequate sampling were analyzed using Tracer v1.6 (Tracer, 2018). Tracer was used to analyze the Effective Sample Size (ESS) and other prior parameter values. Tree Annotator was used for generating Maximum Clade Credibility (MCC) phylogenetic trees with common heights node. FigTree (2018) was used to generate the dendrograms.

In silico Restriction Fragment Length Polymorphism Analysis
Computational RFLP-based analysis of N gene sequences recognizing HinfI restriction site divided the population into two major groups [79 NL (55.63%), 55 BR (38.73%)]. Thus, the genotype IYSV NL was found to be predominant over IYSV BR while the rest (5.63%) fell in neither category (IYSV other ) (Figure 2).

Molecular Phylogeny of IYSV N Gene
Phylogenetic tree of the N gene of IYSV constructed based on the aligned nucleotide sequences (Figure 3) using Maximum Likelihood method broadly clustered IYSV genotypes into two major clades (NL and BR types). Four IYSV other isolates (EU750697, KT27882, KT272883, and EU586203) and one IYSV BR isolate (FJ713700) clustered with the NL group. Similarly, four more IYSV other isolates (HQ148173, HQ148174, EU310290, and EU310291) and two IYSV NL isolates (FJ785835 and AF001387) clustered with the BR group. One IYSV BR isolate from Brazil (AF067070) formed a separate monophyletic clade along with a TSWV N gene isolate HQ267713 from South Korea (Outgroup). The clades also followed a geographical pattern as majority of IYSV NL genotypes are from North America and IYSV BR are from the Asian countries. Only one IYSV isolate has been reported from Brazil which formed a separate monophyletic clade even though in silico RFLP characterized it as an IYSV BR type.

Population Selection Studies, Neutrality Test, and Genetic Diversity Test
Gene codons that are in positive or negative selection pressure provide knowledge regarding the molecular evolution pattern of the N gene. The mean dN/dS (dN-rate of non-synonymous substitutions and dS-rate of synonymous substitutions) for N gene accessions belonging to IYSV NL group were found to be 0.192 with no positively selected codon site. SLAC methodology identified 20 negatively selected codons in the N gene of IYSV NL type. The data set when analyzed by FEL methodology revealed Frontiers in Microbiology | www.frontiersin.org one positively selected codon site (codon no. 139) against 62 negatively selected codons. The dN-dS (mean difference between dN and dS) was -0.803 based on REL analysis denoting that the codon sites are under purifying selection acting against deleterious non-synonymous substitutions ( Table 1).
For N gene accessions derived from IYSV BR , the mean dN/dS was found to be 0.191 with no positively selected codon site. SLAC methodology identified 27 negatively selected codons. The data set when analyzed by FEL revealed one positively selected codon site (codon no. 139) against 63 negatively selected codons. The dN-dS was -0.813 based on REL analysis suggesting that the codon sites are under purifying selection acting against deleterious non-synonymous substitutions. Six positively selected codon (codon nos. 30, 40, 109, 139, 210, and 225) and seven negatively selected codons were identified by REL analysis, respectively ( Table 1).
For N gene accessions of IYSV other group, the mean dN/dS was found to be 0.172 with no positively selected codon site. SLAC methodology identified three negatively selected codons. FEL methodology revealed no positively selected codon site against 38 negatively selected codons. The dN-dS was -0.805 based on REL analysis and it denotes codon sites are under purifying selection acting against deleterious non-synonymous substitutions. One positively selected codon (codon no. 270) and zero negatively selected codons were identified by REL analysis, respectively ( Table 1).
Nucleotide diversity (π) of IYSV BR was about two folds higher than that for IYSV NL (0.04513 and 0.01990; respectively, Table 2). However, the highest nucleotide diversity among the IYSV isolates was found in IYSV other (0.08042) indicating IYSV other is more genetically diverse than the IYSV NL and IYSV BR ( Table 2). Number of polymorphic sites (S) was 136 from the N gene sequences of eight isolates of IYSV other , IYSV NL showed 189 polymorphic sites among the N gene sequences obtained from the 79 isolates, whereas IYSV BR showed 230 polymorphic sites obtained from the 55 isolates (Table 2). IYSV other is more diverse than IYSV NL and BR based on number of polymorphic sites (S).

Neutrality Test
The test of neutral evolution analyzed based on the total number of mutations and segregating sites, revealed statistically significant and non-significant negative values of test statistic Tajimas's D for IYSV NL and IYSV BR , respectively (Tables 3, 4).

Genetic Differentiation and Gene Flow
Haplotype-based statistics (Hs and Hst) and nucleotide-based statistics (Ks, Kst, Snn) were estimated to evaluate genetic differentiation between the IYSV genotypes (

Recombination Detection Analysis
Three potential recombination events were detected among the IYSV N genes analyzed ( Table 6 and Figure 4). AF067070 (BR type) IYSV isolate from Brazil is a potential recombinant of isolates: JX861126 Bosnia (major parent) and AY345226 Australia (minor parent). This recombinant was detected by GeneConv, 3Seq, algorithms in RDP. The recombination breakpoint begins at 789 in alignment (789 without gaps) with breakpoint clustering at 99% confidence interval ranging from 730 to 809 in alignment (730-809 without gaps) and breakpoint ends at 12 in alignment (12 without gaps) with breakpoint clustering at 99% confidence interval ranging from 822 to 44 in alignment (822-44 without gaps). The second recombination event involved isolate HQ148174 (IYSV other ) from Iran putatively arising from EU310281 India (major parent) and AB180922 Japan (minor parent). However, only MaxChi algorithm detected this recombinant. The third recombinant, AB180918 (BR type) IYSV isolate from Japan, is the result of a potential recombination event 22 arising from HQ148174 Iran (major parent) and EU586203 Serbia (minor parent). This recombinant was detected by SiScan and 3Seq algorithms of recombination detection program. The recombination breakpoint begins at 20 in alignment (20 without gaps) with breakpoint clustering at 99% confidence interval ranging from 731 to 3 in alignment (731-3 without gaps) and breakpoint ends at 820 in alignment (820 without gaps) with breakpoint clustering at 99% confidence interval ranging from 731 to 3 in alignment (731-3 without gaps). Among the three putative recombinants detected, two belonged to IYSV BR type and the remaining one belonged to IYSV other type. IYSV isolate from Brazil is a potential recombinant and hence this isolate formed a separate monophyletic clade in the phylogenetic tree (Figure 3).

Recombination Analysis Tool
Recombination analysis tool (RAT) was used to substantiate the findings of the RDP. An isolate was considered recombinant when the major and minor parent isolates intersect at two points in the graph (Figure 5). Based on this criterion, HQ148174_Iran and AB180918_Japan were considered potential recombinants even though only MaxChi in the RDP4 program detected HQ148174_Iran as a recombinant.

Bayesian Evolutionary Analysis by Sampling Trees (BEAST)
The rates of nucleotide substitution in tospovirus genomes have not been reported. Therefore, the global repository of IYSV N gene sequences was used to estimate the rates of nucleotide substitution and discern the rapidity with which molecular evolution might occur in the tospoviruses. Genetic recombinants were removed for BEAST analysis since their inclusion violates the assumption of coalescent-based analyses and thus could result in incorrect estimates of the rate of evolution. For nucleotide models, Hasegawa-Kishino-Yano (HKY)-based analysis was performed and it converged satisfactorily. While comparing two models if the marginal posterior distributions of the loglikelihoods do not overlap then the model with the higher posterior distribution of log-likelihood was preferred. Estimate is a better approximation of the true posterior distribution when larger Effective Sample Size (ESS) is available (ESS > 200 are desirable). Based on the above criteria, General Time Reversible (GTR) relaxed exponential growth clock model with coalescent Frontiers in Microbiology | www.frontiersin.org FIGURE 5 | Recombination cross over sites in the Iris yellow spot orthotospovirus nucleocapsid (N) gene generated by Recombination Analysis Tool. The X-axis denotes sequence location along the genome and Y -axis represents the genetic distance. (A) AF067070_Brazil, the crossover site is located at 3 end portion of the genome; (B) HQ148174_Iran, the crossover sites were located toward the 5 and 3 ends of the genome; (C) AB180918_Japan, the crossover site is located in the middle portion of the genome.
constant population was found to be the best fit with a marginal likelihood mean substitution rate of 5.08 × 10 −5 subs/site/year, 95% highest posterior density (HPD) substitution rate between 5.11 × 10 −5 and 5.06 × 10 −5 and ESS was 305 (Supplementary Table 2). Bayesian phylogenetic tree separated the IYSV isolates into two distinct clades, clade I comprising of IYSV BR isolates and clade II comprising IYSV NL isolates. The isolates that belonged to the same geographic region (or same country) clustered together (Figure 6).

DISCUSSION
The global population structure and temporal dynamics of the IYSV conducted previously (Iftikhar et al., 2014) based on N gene sequences delineated that the viral isolates could be categorized into two major genotypes (IYSV BR and IYSV NL ). Further, temporal dynamics of IYSV showed greater incidence of IYSV BR post-2005 compared to IYSV NL . Since the last publication, the number of N gene sequences added to the public repository has increased significantly. To gain a better understanding of the evolutionary genomics and to further gain deeper insights into the evolution rate of IYSV, we analyzed 142 complete N gene sequences using a wide range of computational tools to infer molecular evolutionary genomics.
In silico RFLP analysis to categorize the genotype of IYSV isolates showed that the majority of the isolates belonged to IYSV NL category (55.63%), whereas 38.73% of IYSV BR isolates were observed. There was an increment in IYSV NL genotype incidence or characterization compared to IYSV BR since the last report (Iftikhar et al., 2014). Interestingly, gene flow estimates showed greater gene flow between NL and BR genotypes, rather than between the individual major genotypes and the "other" category. Even between the major genotypes, IYSV NL exhibited a greater gene flow with IYSV other . Also, a greater genetic diversity was observed in IYSV other , compared to NL and BR. However, codon substitution analysis of N gene showed little change since the last study (Iftikhar et al., 2014). In fact, the positively selected Recombination is a common phenomenon in RNA viruses but the implications of recombination for evolution is not well studied (Sztuba-Solinska et al., 2011). There is a serious limitation in understanding the contribution of recombination to evolution of IYSV due to lack of full-length genome sequences. The potential recombinants identified in this study belonged to BR type which seems to be evolving using population expansion strategies. The recombination breakpoints were at 5' and 3' ends suggesting that these are potential hot spots for recombination (Gawande et al., 2015).
In the BEAST analysis, General Time Reversible (GTR) relaxed exponential growth clock model with coalescent constant population was found to be the best fit model explaining the genetic architecture of IYSV population. In a similar analysis of PVY genomic sequences, it was found that the relaxed uncorrelated log normal clock was the best fit with a population of constant size (Gibbs et al., 2017). Further, similar topology of the phylogenetic tree was obtained by both ML method and Bayesian MCC based phylogeny for IYSV. PVY dating was reported by comparing the estimated phylogenetic dates with historical events in the worldwide adoption of potato and other PVY hosts (Gibbs et al., 2017). While the potato-PVY analysis was based on the sample collection dates over several decades, onion-IYSV interactions are relatively new and hence predicting the phylodynamic patterns and demographic history of IYSV require more such data on temporal scale. The PVY demographic history and population expansion was deduced and compared with that of geographic distribution of host (potato) suggesting direct influence of potato cultivation area on the population size of the virus (Mao et al., 2019). In this context, further studies how expansion of onion cultivation area influences the population expansion of IYSV will be interesting.
Bayesian coalescent estimates of evolutionary dynamics of citrus tristeza virus, based on the p25 gene, showed that the rate of substitution was at 1.19 × 10 −3 subs/site/year (Benítez-Galeano et al., 2017). Similarly, Bayesian phylogenetic reconstructionbased nucleotide substitution rates of CP gene derived from four species of viruses in Secoviridae family estimated it to range 9.29 × 10 −3 to 2.74 × 10 −3 (subs/site/year) (Thompson et al., 2014) while for tobamovirus the estimate ranged from 1 × 10 −5 and 1.3 × 10 −3 substitutions per site, per year (Pagan et al., 2010). Further, Bayesian analysis of VPg gene of PVY reveals that it has been evolving at a rate of 5.60 × 10 −4 subs/site/year (Mao et al., 2019). Thus, the mean substitution rates identified for the IYSV N gene are comparable to those found in other plant-infecting RNA viruses. Substitution rates tend to be higher in RNA viruses as they are shown to mutate at faster rate. These mutations help in viral emergence on novel hosts but are not adaptive (Sacristan and Garcia-Arenal, 2008). Furthermore, the time of divergence of PVY clades, clade N and clade O, was found to be the year 1861 CE (95% credibility interval 1750-1948 CE) (Mao et al., 2019). Similar estimation of temporal divergence of IYSV BR and IYSV NL and the role of geographically driven adaptation of IYSV are worth exploring for a better understanding of the evolutionary dynamics of IYSV.
There are a very limited number of sequences of the other IYSV genes (NSm, NSs, G N /G C , RdRp) and even fewer complete genome sequences. Evolutionary analysis on such small sample size is not feasible. In the absence of complete genome sequences of a considerable number of the virus isolates (as is the case of IYSV) extrapolation of results of a single (or a few) gene(s) for the entire species is not uncommon. There are increasing number of studies on molecular evolutionary analysis, including phylodynamics and temporal evolutionary features of plant viruses based on a single or few viral gene sequences, such as VPg of PVY (Mao et al., 2019), NABP and CP genes of Potato virus M (PVM) (He et al., 2019) and P3, CI, Nib genes of (PVY) (Gao et al., 2020). However, to avoid any discrepancies in extrapolating the evolutionary analysis based on one or a few genes of a virus to the entire virus species, next generation sequencing-based sequencing followed by de novo assembly would provide a near complete genomic sequences that could be used to generate a more comprehensive picture of the genetic diversity of the virus populations (Zarghani et al., 2018).

CONCLUSION
IYSV NL was found to be the predominant genotype on a global scale. Interestingly, the IYSV other genotype is genetically more diverse than IYSV BR and IYSV NL genotypes. Population structure analysis revealed that it is under purifying selection and the phenomenon of population expansion is occurring. BEASTbased molecular clock analysis showed that the rates of molecular evolution of IYSV N gene are similar to other plant RNA viruses. This study is a step forward in identifying molecular factors that contribute to the evolution of IYSV, and serves as a foundation for further evolutionary genomic studies on one of the economically important plant virus groups.

DATA AVAILABILITY STATEMENT
The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.

AUTHOR CONTRIBUTIONS
AT and HP conceived and designed the experiments. AT and YZ performed the experiments. AT, SR, YZ, and HP analyzed the data. AT, CO, SR, YZ, RI, and HP contributed reagents, materials, and analysis tools. AT, SR, and HP wrote the manuscript, proofread and finalized the manuscript. All authors read and approved the final manuscript.  (Grant No. 2018-5118128435), and USDA-NIFA Hatch project Accession #1016563 "Reducing the Impact of Pests and Diseases Affecting Washington Agriculture." The funders have no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.