Association Mapping of Seedling Resistance to Tan Spot (Pyrenophora tritici-repentis Race 1) in CIMMYT and South Asian Wheat Germplasm

Tan spot caused by Pyrenophora tritici-repentis (Ptr) is an important disease of wheat in many wheat producing areas of the world. A genome wide association study (GWAS) was conducted using 11,401 SNP markers of the Illumina Infinium 15K Bead Chip with whole genome coverage to identify genomic regions associated with resistance to tan spot in a diverse panel of 184 wheat genotypes originating from South Asia and CIMMYT. The GWAS panel was phenotyped for seedling resistance to tan spot with Ptr race 1 in two greenhouse experiments. Besides CIMMYT germplasm, several lines from South Asia (India, Bangladesh and Nepal) showed good degree of resistance to tan spot. Association mapping was conducted separately for individual experiments and for pooled data using mixed linear model (MLM) and Fixed and random model Circulating Probability Unification (FarmCPU) model; no significant MTAs were recorded through the MLM model, whereas FarmCPU model reported nine significant MTAs located on chromosomes 1B, 2A, 2B, 3B, 4A, 5A, 5B, 6A, and 7D. The long arms of chromosomes 5A and 5B were consistent across both environments, in which the Vrn-A1 locus was found in identified region of chromosome 5A, and MTA at IACX9261 on 5BL appears to represent the resistance gene tsn 1. MTAs observed on chromosomes 1B, 2A, 2B, 3B, 4A, 6A, and 7D have not been reported previously and are likely novel.


INTRODUCTION
Wheat is a widely grown cereal crop around the world, and it is considered as staple source of nutrition for nearly 40% of the world's population and supplies 20% of dietary protein and food calories (Giraldo et al., 2019). The forecast for global wheat utilization has been raised by 1.5 million tonnes for year 2019/20 than in 2018/19, which is mainly due to 3.5% rise in feed use demand (FAO et al., 2019). The present global wheat production is 766 million tonnes and is expected to rise to about 840 million tonnes by 2050; this demand excluded the requirement of animal feed and the adverse impact of global climate change on wheat production (Sharma et al., 2015). Hence, it is necessary to increase the wheat production to meet its increasing demand. However, changing climatic conditions and onset of severe plant disease epidemics significantly curtail the wheat grain yield and quality (Gurung et al., 2014). About 5-14% of global wheat yield is lost each year due to diseases (Oerke, 2006). A major disease of wheat is tan spot (synonym yellow spot or yellow leaf spot) which occurs in both temperate and warmer wheat growing areas in the world (Duveiller et al., 1998). This disease is caused by the necrotrophic fungal pathogen Pyrenophora tritici-repentis (Died.) Dreches [anamorph Dreschslera triticirepentis (Died.) Shoemaker]. The tan spot fungus was first described in 1823 (Hosford, 1982), and subsequently the disease was reported from Europe, USA, and Japan in early 1900, being considered as a saprophyte causing minor to severe spotting in wheat (Wegulo, 2011). Tan spot epidemics were first reported in the 1970s from Canada, USA, Australia, and Southern Africa (Hosford, 1971;Tekauz, 1976;Rees and Platz, 1992), and it further spread to the entire Central Asia. Tan spot pathogen infects the whole plant but is generally most prominent on leaves followed by stem and head tissues. This infection leads to reduction in photosynthetic area and eventually leads to yield reduction and quality deterioration. In severe cases, the yield losses can be beyond 50% (Wegulo, 2011). In recent years this necrotrophic disease is causing increased wheat yield losses globally, which is associated with reduced tillage practices as necrotrophic pathogens overwinters in wheat stubble (Cotuna et al., 2015) The fungus can produce at least three host selective toxins (HSTs) known as PtrToxA, PtrToxB, and PtrToxC causing chlorotic or necrotic symptoms. The toxins produced are genetically distinct on different host genotypes, based on which the tan spot isolates can be divided into eight races. The HSTs produced by the pathogen interact with the corresponding host sensitivity genes and result in compatible interaction called as effector-triggered susceptibility (ETS) which is described as an inverse gene-for-gene model or toxin model (Friesen et al., 2007). However, identification of non-race-specific resistance QTL clearly indicates that the inverse gene-for-gene model does not fully explain all interactions that occur in the tan spot pathosystem (Faris and Friesen, 2005;Faris et al., 2012). Resistance to tan spot is qualitatively or quantitatively inherited (Faris and Friesen, 2005;Chu et al., 2008;Singh et al., 2009;Chu et al., 2010;Liu et al., 2015;Singh et al., 2019;Hu et al., 2019;Liu et al., 2020), as single dominant gene tsn 1 on chromosome 5BL  confers host sensitivity to Ptr ToxA. The host Ptr ToxC sensitivity gene, Tsc1 was mapped to the short arm of chromosome 1A (Effertz et al., 2002).
Although biparental mapping was used effectively for identification of genomic regions for tan spot resistance, the limited recombination events in biparental mapping lead to limitation of identification of closely linked markers useful for MAS due to long linkage block (Riedelsheimer et al., 2012). The GWAS approach provides better resolution for identification of closely linked markers; also, it circumvents the need to develop specific mapping populations using contrasting parents, which requires long time. GWAS has previously been used for the identification of genomic regions' resistance to tan spot in spring wheat accessions by Gurung et al. (2014), and resistance QTLs were mapped to chromosomes 2B, 4B, and 7A. Patel et al. (2013) identified 11 QTLs located on chromosomes 1A, 1D, 2B, 2D, 6A, and 7A, and Singh et al. (2016) identified QTLs on short arm of chromosomes 1A, 1B, and 6B and long arm of chromosomes 4A, 6A, 2B, 3B, 5B, and 7B; however, all three studies used General Linear Model (GLM) procedure for association analysis, which is regarded as less stringent. The present GWAS study used a diverse panel of germplasm based on collection from CIMMYT, India, Bangladesh, and Nepal. The objective of the study was to identify genomic regions associated with seedling resistance to tan spot using 184 diverse spring wheat genotypes in controlled environmental condition using mixed-linear model (MLM) and Fixed and random model Circulating Probability Unification (FarmCPU) model to identify common genomic regions.

MATERIAL AND METHODS
A panel of 184 spring wheat genotypes originating from CIMMYT-Mexico (CIM-1 to CIM-97), India (IND-1 to IND-40), Bangladesh (BGD-1 to , and Nepal (NPL-1 to NPL-28) was used in the present study (Supplementary Table 1). These genotypes represent the modern elite varieties and breeding lines in the respective organization or countries. Two experiments were conducted in a greenhouse for disease assessment at seedling stage. Each experiment was conducted in completely randomized design with three replications. The experimental unit consisted of four plants per entry and four checks Erik (resistant), Glenlea (susceptible), 6B-365 (moderately susceptible), and 6B-662 (moderately resistant).

Disease Screening
For tan spot disease screening, the isolate MexPtr1 (race 1) that produces Ptr ToxA and Ptr ToxC (Singh et al., 2009) was used. The inoculation was done as described by Singh et al. (2011), and the inoculum concentration was adjusted to 4,000 conidia/ml. The seedlings were grown under controlled environmental condition in a greenhouse with the maintenance of air temperature of 20-22/16-18°C (day/night) with 16 h photoperiod. At two leaf stage or two weeks after sowing, the seedlings were inoculated with conidial suspension of the MexPtr1 isolate until runoff using a hand sprayer. After inoculation, the seedlings were incubated for 24 h under continuous leaf wetness in a mist chamber and were then returned to the greenhouse. Seedling response was evaluated seven days of post inoculation by following the 1-5 lesion rating scale developed by Lamari and Bernier (1989).

Genotyping
The GWAS panel was genotyped with Illumina Infinium 15 K Bead Chip by Trait Genetics GmbH, Germany. Markers with missing data points more than 10% (222 markers), or minor allele frequency less than 5% (2,707 markers), or unknown position (1,695 markers) were filtered, resulting into 11,401 markers for GWAS analysis.

Linkage Disequilibrium, Kinship, Principal Coordinate Analysis (PCA) and Population Structure Analysis
The linkage disequilibrium parameters R 2 among the SNP markers were calculated using Tassel 5 (http://www.maizegenetics. net), and the LD estimates as the allele frequency correlation (R 2 ) between SNP markers were plotted against the physical distances. A kinship matrix and clusters among individual genotypes were calculated using all 11,401 SNP markers; the heat map was generated using classical equation from Van Randen (2008) in R program. PCA analysis was performed using SNP markers, and PC1 was plotted against PC2.
The numeric transformation of genotypic data was done using XLSTAT (2010) as per required format of the Structure 2.3.4 software (Pritchard et al., 2000). The admixture model was adjusted with burn in period length for the 100,000 followed by 500,000 markers chain Monte Carlo (MCMC) replications. The subpopulation test range was kept from K1 to K10, each with five interactions (runs). The D K approach was used to access the actual subpopulations (Earl, 2012). D K was confirmed by the Evanno et al. (2005) method using the STRUCTUREHARVERSTER program (Earl, 2012). Average logarithm of the probability of the observed likelihood [LnP(D)] was calculated along with the standard deviation from the output summary. LnP(D) for each step of the MCMC was calculated for each class (K= 1 to 10) by computing the log likelihood for the data.

Statistical and GWAS Analysis
The combined analysis of variance was carried out for the two experiments; three variance components genotypic variance s 2 g , experimental variance s 2 e and interaction of genotype and experiment variance s 2 g * e , were estimated for tan spot using restricted maximum likelihood (Patterson and Thompson, 1971) estimation procedure of GenStat software, 17 th edition (VSN, International, Hemel Hempstead, UK). Broad-sense-heritability was estimated using the formula: Where s 2 g presents the genetic variance, s 2 ϵ represents the error variance, and nreps represents the number of replications.
Bartlet test was used to assess the homogeneity of error variance prior to pooling the two-experiment data for GWAS analysis. Marker-trait association (MTA) was performed using mixed-linear model (MLM) and fixed and random model circulating probability unification (FarmCPU). For GWAS analysis using MLM model, a Q + K model that considers both Kinship (K matrix) and population structure was adopted in Tassel (http://www.maizegenetics.net), whereas the FarmCPU model was performed using the R software package GAPIT v. 3.5. GWAS study was conducted for the two experiments separately as well as for the pooled experimental data. The markers were declared to be significant using Bonferroni correction with significant cutoff (p-values, 4.4 E-06) calculated at the alpha level of 0.05 using 11,401 markers to reduce false discovery rate in both MLM and FarmCPU models.

Evaluation of Tan Spot Resistance
The coefficient of correlation between the two experiments was high, with r = 0.73 at p ≤ 0.001. The broad-sense-heritability estimate based on seedling tan spot data was 85% for experiment 1, 78% for experiment 2, and 84% for across two environments. Analysis of variance showed that variances due to genotypes s 2 g , experiment s 2 e and their interaction s 2 g * e were all highly significant ( Table 1). The average tan spot scores were 1.8 and 2.0 in experiments 1 and 2. The checks Erik, 6B-662, 6B-365, and Glenlea had average tan spot scores 1.0, 2.3, 2.6, and 4.6, respectively over experiments, which confirms disease induction by P. tritici-repentis race 1. In pooled analysis two genotypes, HD 2733 from India and BL 4407 from Nepal, were found to be highly resistant and stood above the resistant check Erik. Another 141 genotypes were found to be moderately resistant with disease scores lower than the moderately resistant check 3B-662. Twenty-one genotypes were found to  Figure 2) and PCA analysis ( Figure 1D) also divided the population into two major groups. The two sub-populations were designated as Subpop 1 and Subpop 2, which comprised 140 and 44 genotypes respectively. Subpop 1 was mainly composed of genotypes originated from CIMMYT (93 accessions), Indian breeding programs (27) A clear clustering of CIMMYT and non-CIMMYT lines was also observed using Kinship analysis ( Figure 2) and PCA analysis ( Figure 1D). Broadly, three different groups were observed. A small group of non-CIMMYT lines includes genotypes from India, while the other two large groups which included CIMMYT and CIMMYT derived lines from India such as The LD decay plots ( Figure 3) were plotted using physical distances in mega base pairs (Mb) against marker R2 across the chromosomes. The average extent of LD, considered as physical distance taken for decay of R2 to a critical value of 0.10 across the genome, was approximately 25 Mb.

Marker Trait Association (MTA) for Tan Spot
An MLM model, a Q + K model that considers both Kinship (K matrix) and population structure, reported no significant MTAs at LOD 4.75, whereas MTAs reported at LOD 3 for MLM model were depicted in Supplementary Figure 1. A FarmCPU model, having advantage over the MLM model, has high power and less false positive through iterative usage of fixed and random effect and was also used to identify significant MTAs. Total nine MTAs showed significant association with tan spot using Bonferroni correction cutoff (p-values, 4.4 E-06) spreading over eight chromosomes viz., 2A, 2B, 3B, 4A, 5A, 5B, 6A, and 7D ( Table 2). SNP (IACX9261) on chromosome 5B was most stable and consistent in both individual experiments and in pooled analysis, followed by SNP (TA001138-0446) on chromosome 5A and SNP (AX-94880001) on chromosome 2B which are common among experiment 1 and in pooled analysis. The remaining five MTAs are exclusive to either experiment 1 or experiment 2. The significant markers with LOD scores in the Manhattan plot are presented in Figure 4. The R 2 explained by   significant markers range from 2.0 to 11.3%. The highest R 2 (11.3) was explained by SNP (IACX9261) on chromosome 5B. Comparison of significant MTAs identified from individual environment and pooled analysis showed that genomic regions on the long arm of chromosome 5A and 5B are most stable. The effects of resistant and susceptible alleles for MTAs on chromosomes 5A and 5B were shown with box plot in Figure 5.
The presence and absence of resistant alleles for total nine identified MTAs from both the models were examine in all 184 wheat genotypes (Supplementary Table 2). Almost all highly resistant genotypes in this panel showed the presence of resistance alleles for MTA on chromosome 5B (IACX9261); also lines with susceptible alleles for this MTA showed highly susceptible reaction as observed in genotypes IND-31, CIM-89, and CIM-16. Other MTA on 5A chromosome (TA001138-0446) was also found to be useful for differentiation of resistant and susceptible genotypes. Whereas other MTAs on chromosome 2B (AX-94880001), 4A (wsnp_Ex_c12450_19850925), 6A (RAC875_c103443_475), and on 7D (Kukri_c15768_1383) were observed to be less prominent in differentiation of resistant and susceptible genotypes.

DISCUSSION
In the present study, 184 diverse spring wheat genotypes were screened for seedling tan spot resistance in a greenhouse for the identification of significant MTAs. Field screening of large number of genotypes for tan spot is considered challenging due to the often-natural incidence of other foliar diseases that mimic tan spot symptoms; in addition, limitation of required light and humidity for inoculum growth in field condition precludes pathogen growth (Singh et al., 2009). Moreover, high level of positive correlation between greenhouse and field experiments for tan spot was observed by Evans et al. (1999), implying that MTAs from seedling experiments could be found in field experiments as well. The present study identified broad genetic base of resistance for tan spot, which includes genotypes from CIMMYT-Mexico, as well as from the three South Asian countries. The high resistance of CIMMYT germplasm was previously reported by Singh et al. (2016) and Ali et al. (2008); however, the present study adds more information using diverse genotypes other than germplasm set, which include stable  breeding lines form CIMMYT international nurseries viz., 40IBWSN, 28 ESWYT, and 18 HRWSN along with South Asian wheat genotypes and observed broad genetic base for resistance. The lines HD 2733 and BL 4407, newly found as highly resistant genotypes, can be used in breeding to incorporate tan spot resistance.
Prior to GWAS analysis, information about population structure is very important because the presence of population structure in the GWAS panel may cause spurious association results (Oraguzie et al., 2007). The presence of a subpopulation in a large population can be justified by selection and genetic drift (Buckler and Thornsberry, 2002). Population structure, kinship matrix, cluster analysis, and PCA analysis revealed there was moderate population structure, which has resulted from lines with common parents and two to three sibs in the pedigree.
In the present study, association analysis using MLM and FarmCPU model was adopted; the MLM model has limitation of false negative due to confounding between population structure, Kinship, and quantitative trait nucleotides; however, this limitation was overcome by using FarmCPU model as it performed marker test using associated markers as covariates in fixed effect and followed optimization with associated covariate markers in random effect, which enables to remove confounding and also control false positive (Liu et al., 2016), which is also proven from the results of quantile-quantile (QQ) plots (Supplementary Figure 2) which showed FarmCPU model fitted data well compared to the MLM model. Genomic regions identified for tan spot are categorized into two groups as stable and unstable. Unstable genomic regions are those which are expressed only in one experiment, whereas stable genomic regions for tan spot include the chromosomal regions which are constantly expressed across both individual experiments and pooled analysis or common either in one environment or in pooled analysis. Therefore, the genomic region on the long arm of chromosomes 5B and 5A is designated as stable and is explained here. GWAS results showed major role of tsn 1 gene on chromosome 5B in resistance to Ptr race 1, as the MTA at IACX9261 that is close to tsn 1 was stably significant across both experiments. Previous mapping studies also identified major roles of tsn 1 in conferring resistant to Ptr race 1, such as Chu et al. (2008) in a doubled haploid population, Singh et al. (2008) and Faris et al. (2012) in RIL populations, Kollers et al. (2014) in European winter wheat varieties, and Liu et al., 2020 from three hexaploid wheat mapping population LP573, SK, and TN. Previously, PCR based markers Xfcp620 and Xfcp623 were extensively used to detect tsn 1 gene (Lu et al., 2006;Faris et al., 2010). But from now on, the SNP IACX9261 found in the present study could also be used since it can be transformed to high throughput markers like KASP.
In addition to 5BL, another genomic region on chromosome 5AL appeared to be stable for tan spot resistance. The identified 5AL chromosome region in Chinese Spring RefSeq v. 1.0, harbors or overlaps with QTL identified in previous studies. QTs-Fcu-5AL flanked by markers barc1061 and cfd2185 by Chu et al. (2008), QYls.lrc-5A closely linked by gdm132 (Zwart et al., 2010), and for QTs.zhl-5A mapped between markers iwa7025 and iwa5173 (Kariyawasam et al., 2016) were previously identified on the same genomic region. Interestingly, MTAs on chromosome 5AL (TA001138-0446 and BS00022071_51) showed tight linkage with Vrn-A1, which matches with our previous results (Hu et al., 2019). The Vrn-A1 locus was reported to contribute to disease escape via its effects in alteration of flowering date in Fusarium head blight , spot blotch (Singh et al., 2018), and Septoria tritici blotch (Dreisigacker et al., 2015) in field condition. However, association of Vrn-A1 with seedling resistance to tan spot and its association with spot blotch resistance even after excluding the effect of days to flowering (Singh et al., 2018) implies its possible linkage with an unknown disease resistance gene or its pleiotropic role in resistance to tan spot (Hu et al., 2019). In CIMMYT spring wheat genotypes, the vrn-A1 allele for late flowering and tan spot resistance is almost fixed (Dreisigacker et al., 2016), which is supportive of tan spot resistance and may have contributed to a good level of tan spot resistance in CIMMYT germplasm.
Two resistance genes, tsn 2 controlling resistance to necrosis caused by Ptr race 3 (Singh et al., 2006) and tsn 5 controlling resistance to Ptr race 5 were reported previously in a marker interval of gwm 285 and wmc 366 on chromosome 3B. In the present study, we observed one significant MTA from experiment 2 on FIGURE 5 | Box plots for effects of resistant and susceptible alleles on average tan spot score for stable MTAs. Foot note: X axis-resistant and susceptible allele, Y axis-average tan spot score. chromosome 3B, but this MTA does not match the interval between gwm 285 and wmc 366. Hence it appears that this MTA does not represent tsn 2 or tsn 5 genes, which is in agreement with the fact that the Ptr isolate used in this study was for race 1 only. The significant MTAs identified on chromosomes 1B, 2A, 2B, 3B, 4A, 6A, and 7D do not match with previously identified QTL, and hence these might be novel genomic regions for resistance to Ptr race 1, for which further validation is needed. The single MAT on 5B (IACX9261) will be the first choice for the selection of resistant genotypes for tan spot, and in novel identified genomic regions, single MTA on chromosome 1B (BobWhite_c28635_785) will be a priority for validation as it showed better differentiation for resistant and susceptible genotypes. Overall, in this study, along with CIMMYT germplasm, diverse sources of resistant genotypes against Ptr race 1 were identified which can be used to develop broad genetic resistance to tan spot of wheat. Association mapping identified both known and novel QTLs for tan spot resistance along with novel markers potentially useful for marker-assisted selection. Together the identified novel resistant genotypes and genomic regions could be useful for developing cultivars with durable resistance to tan spot in wheat.

DATA AVAILABILITY STATEMENT
The datasets presented in this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found in the article/supplementary material.

AUTHOR CONTRIBUTIONS
PS and XH designed the research and supervised the experiments. RP and SB did the disease screening. GS, AJ, RS, MK, and KR provided plant materials. XH and PJ contributed to genotyping work. RP, XH, and PJ analyzed the data. All authors contributed to the article and approved the submitted version.