Combination of Genome-Wide Association Study and QTL Mapping Reveals the Genetic Architecture of Fusarium Stalk Rot in Maize

Stalk rot causes significant yield loss in maize worldwide. The breeding of resistant variety is a potential way to effectively control the disease. This study aims to dissect resistance genes for maize stalk rot caused by Fusarium graminearum using an integrated gene mapping method. A diversity panel of 165 inbred lines and an F2 population from the hybridization of CDMA66 and Huangzao 4 were used as materials. The 165 inbred lines were clustered into four subgroups, of which tropical materials are in the majority. Through combining disease severity index (DSI) and single nucleotide polymorphisms (SNPs) of Maize 50K chip and 20K, genome-wide association analysis (GWAS) revealed 34 SNPs that were significantly associated with stalk rot in maize (P < 0.001). These SNPs were distributed on chromosomes 1, 3, 4, 6, 8, and 10, of which the loci on chromosomes 4 and 8 were confirmed by the mapped QTLs for stalk rot. Finally, candidate genes were identified including GRMZM2G082709 which encoded NAC domain-containing protein and thioredoxin reductase (GRMZM5G841142). However, LRR receptor-like serine/threonine-protein kinase (GRMZM2G080041) was down-regulated after inoculation. These SNPs and candidate genes identified would provide essential information for resistant gene cloning and molecular breeding of anti-stalk rot variety in maize.


INTRODUCTION
Stalk rot is a destructive disease in maize and occurs worldwide. It has been prevalent in China for decades and causes extensive economic losses (Song et al., 2015). The genetic mechanism of maize stalk rot is complex. It is caused by one or more fungal pathogen(s). The causative agents, including Fusarium graminearum, Fusarium verticillioides, Fusarium moniliforme, Fusarium proliferatum, are soil-borne fungus, indicating it is difficult to control stalk rot. Once stalk rot occurs, manual harvest has to be conducted due to the lodging of plants. Thus, it is necessary to understand the genetic basis of stalk rot and to breed resistant varieties.
A genome-wide association study (GWAS) has been proven to be an effective method in identifying the genes, alleles, or heliotypes associated with agronomic traits. Moreover, many quantitative trait loci (QTL) or genomic regions that confer resistance to important maize diseases have been detected by the method (Gowda et al., 2015). Exploring the genome-wide association analysis between SNPs and phenotypes of a desired trait allows the identification of site for quantitative traits (Brachi et al., 2011).
The stalk rot resistance in maize is controlled by polygene and the additive genetic effects among the genes . Till now, a number of quantitative trait loci (QTLs) associated with maize stalk rot have been identified. For example, Pè et al. (1993) used an F2:3 population from B87 and 33-16 to detect the QTLs for stalk rot and found 5 QTLs that were located on chromosomes 1, 3, 4, 5, and 10, respectively. Yang et al. (2010) detected a major QTL-qRfg1 with a back cross population from 1145 and Y331. The QTL was located on Chromosome 10 and explained that 36.3% of the phenotype. A gene ZmCCT was further cloned from this region and the epigenetic regulation mechanism was studied (Wang C. et al., 2017). Zhang et al. (2012) narrowed the physical distance of a micro QTL-qRfg2 on bins 1.09/10-300 kb. Ye et al. (2019) cloned a pathogenic gene ZmAuxRP1 at a minor QTL-qRfg2 site and proved that the gene can promote the biosynthesis of indole-3-acetic acid (IAA). Ma et al. (2017) used a RIL population from H127R and C7-2 to map QTLs for maize stalk rot and explored a major QTL-qRfg3 on Chromosome 3, explaining 10.7-19.4% of the phenotype variation. Jung et al. (1994) found a resistant gene Rcg1 on Chromosome 4 for anthracnose stalk rot. The gene encodes NBS-LRR protein, which plays an important role in the pathogen recognition (Frey, 2005;Abad et al., 2006). Chen et al. (2017) identified a resistant QTL (QTL-Rgsr8.1) associated with maize Gibberella stalk rot and located it between 164.69 Mb and 166.72 Mb on Chromosome 8 by the wholegenome resequencing-based approach. Further, two candidateresistant genes were found in this region. One was responsed to auxin and the other encoded a disease resistance protein. At present, many QTLs (genes) have been reported, but no consistent QTL (gene) is able to be used in the molecular breeding of stalk rot in maize.
In this study, based on 165 maize inbred lines, we performed genetic diversity analysis, GWAS, and linkage analysis and performed phenotypic analysis through the inoculation of stalk rot pathogen on maize plants in the field. Finally, we obtained the candidate resistant genes of stalk rot, which is of great significance for further breeding of maize stalk rot resistant varieties.

Materials and Phenotyping
A set of 165 maize inbred lines (Supplementary Table 1), including resources from the International Maize and Wheat Improvement Center (CIMMYT) and China, were used as materials in this study. These inbred lines were collected or bred by the Maize Molecular Breeding Team, Qingdao Agriculture University, China.
The trials were conducted under artificial inoculation using the injection method. Fusarium graminearum were cultured on petri dish including PDA agar medium in the dark at 26 • C. A week before inoculation, mung beans were put into boiling water for 10 min; after filtering with gauze, mung bean medium were transferred into flasks for sterilization at 121 • C for 30 min. 5-6 Fusarium graminearum clumps with a diameter of about 1 cm were inoculated into 100 ml of mung bean medium. The clumps were shaken at 180 rpm for 5-7 days at 26 • C. When conidia were produced, they were filtered and ready for field inoculation. Aseptic conditions were maintained during all the preparation steps. At the tassel emergence stage, 1 mL of inoculum was injected into the second internode of the plant using a sterile syringe and needle after making a 2 mm hole, and the inoculation sites were sealed with tape for preventing the contamination. The control plants were mock-inoculated with sterilized water. The pathogen, kindly provided by Pathologist Professor Yan in Qingdao Agricultural University, was derived from a single isolate. Randomized complete block with three repetitions were adopted in the experiment. Each inbred line was planted in a row, 20 plants per row. The row length was 4 m, the adjacent row spacing was 0.6 m, and the plants within a row were 0.2 m apart from each other.
Plants showing typical stalk rot symptoms at the last milk stages were scored. There are 20 plants scored in each row. The infected plants initially appear brown at lower internodes and stalks, and then wilt, lodge, or even die. After being scored, every investigated plant was longitudinally cut into pieces to assess the mycelia. Disease incidence had six grades: 1, 2, 3, 4, 5, and 6. Grades 1-3 were regarded as resistant materials and 4-6 were regarded as susceptible materials (Yang et al., 2010). The grade values were then converted into the disease severity index (DSI) with following formula: DSI (%) = [sum (class frequency × score of rating class)] / [(total number of plants) × (maximal disease index)] × 100 (Chiang et al., 2017). We conducted experiments in Laiyang in 2017 and 2019, using different plots, and the environment in these 2 years was quite different. Therefore, we treat it as two environments. In order to eliminate the influence of environmental effects and genetic effects and improve the accuracy of the results, we used the lme4 package of R software (https://www.r-project.org) to fit the BLUE (best linear unbiased estimates) value. The mixed linear model was used for correcting these phenotypes.

Genetic Diversity Analysis
A total of 5,585 SNP markers were obtained by comparing maize 50K genotypes and 20K genotypes to the same version of maize B73 reference genome. These SNP sites were then filtered with parameters of missing rate >20%, heterozygosity >20%, and minor allele frequency (MAF) <0.05. At last, 4,666 SNP markers were chosen for GWAS analysis. Linkage disequilibrium (LD) analysis was performed for these SNP markers among 165 inbred lines by Tassel 5.2.3 (Bradbury et al., 2007). The population structure analysis of these inbred lines was conducted by Structure V. 2.3.4 (Pritchard et al., 2000;Evanno et al., 2005), and the principal component analysis (PCA) and genotype analysis were performed by GAPIT (Lipka et al., 2012).

GWAS Analysis
Based on the phenotype and genotype of 165 inbred lines, GWAS was performed to screen the significant SNP markers among the population by Fixed and random model Circulating Probability Unification (FarmCPU) (Liu, 2015). The FarmCPU model alternately uses a fixed-effects model and a random-effects model to control the confounding problem of the model, thereby better controlling the false-positive and false-negative problems, thus greatly improving the statistical power (Liu et al., 2016). The significance of the association between SNP markers and DSI was determined with a less stringent threshold of -log10(P)>3.5, similar to the approach adapted in the references (Phan et al., 2018).

QTL Mapping in a Genetic Linked Population
A stalk rot-resistant inbred line CDMA66 (♀) was hybridized with a stalk rot susceptible inbred line Huangzao4 (♂); the generated F 1 and F 2 and their parental lines were planted in Laiyang, Shandong, for stalk rot evaluation in the summer of 2018. At the silking stage, 350 well-grown F 2 plants were inoculated by Fusarium graminearum by the injection method. The genomic DNA of these inoculated plants was extracted using cetyltrimethylammonium ammonium bromide (CTAB) method (Chen and Ronald, 1999). At the mature stage, the disease grade of each plant was investigated according to the method in section Materials and Phenotyping, and then the DSI was calculated.
The polymorphism markers between CDMA66 and Huangzao4 used for genetic linkage map construction were screened from 985 SSRs that covered 10 maize chromosomes. The linkage map was constructed using the software IciMapping V3.2 (Wang J. L et al., 2012). Inclusive Composite Interval Mapping (ICIM)  was used for QTL mapping.

Candidate Gene Mining and Analysis
With the genomic sequence of the Maize B73 RefGen_v3 as a reference genome, candidate genes were chosen based on the SNP loci that were significantly associated with stalk rot (Schnable  et al., 2009). The chosen candidate genes were confirmed at MaizeGDB (https://www.maizegdb.org/). The SNP loci that were significantly associated with stalk rot (P < 0.001) were annotated at MaizeGDB 4 (https://www.maizegdb.org/) and NCBI 5 (https:// www.ncbi.nlm.nih.gov/).

Gene Expression Analysis
Two candidate genes' expression levels in CDMA66 and Huangzao4 were further analyzed using qRT-PCR at time points of 12, 24, 48, and 72 h after inoculation. Total RNA was extracted using an RNAiso kit (TaKaRa, Japan) according  to the manufacturer's instructions. Single-strand cDNA was synthesized with a reverse transcription kit (TaKaRa, Japan). Quantitative RT-PCR assays were performed with SYBR Premix Ex Taq (Accurate Biology, China) on a T100 TM Thermal Cycler (BIO-RAD). The primers used for qRT-PCR are listed in Supplementary Table 2. The temperature cycling conditions were 94 • C for 30 s, then 40 cycles of 95 • C for 10 s and 55 • C for 10 s and 72 • C for 30 s. With Actin1 as an inference gene, the relative expression of interest genes was determined using the 2 − Ct method . The qRT-PCR experiment was independently repeated at least twice, with each sample run in triplicate, and a representative data set was shown.

Phenotype Statistics
The DSI information of stalk rot of 165 maize inbred lines evaluated in 2017 and 2019 is presented in Table 1 and Figure 1.
The BLUE (best linear unbiased estimates) value is the corrected phenotype value. The DSI values were within the range of 0.14-1 in 2017 and within 0.05-1 in 2019. The absolute values of kurtosis and skewness of these indexes were both <1. That is, the phenotype data were suitable for QTL mapping.

Genetic Diversity Analysis
We established a correlation map to estimate the genomewide LD levels of 4,666 SNPs among the 165 inbred lines. Frontiers in Agronomy | www.frontiersin.org LD decayed differently in the 10 chromosomes of maize, with chromosomes 2 and 10 having the most and least decay rates, respectively (Figure 2A). When the cutoff value of r 2 was set to 0.14, the averaged LD decay distance of 165 inbred lines was approximately 300 kb. Structure software was used to calculate the K value. K reached its maximum at K = 2 and got its secondary maximum value at K = 4 ( Figure 2B). When K = 2, the inbred lines were grouped into temperate and tropical groups and the tropical materials are in the majority. When K = 4, the materials were divided into four subgroups, namely, Lvdahonggu, Tangsipingtou, tropical material group, and P group ( Figure 2C). The population grouping of maize inbred lines were analyzed using the principal component analysis (PCA) method, and they were well-separated by the first principal component (PC1), the second principal component (PC2), and the third principal component (PC3) (Figure 2D). PC1, PC2, and PC3 explained 26.2, 9.6, and 2.4% of the total variation, respectively, and the cumulative value was 38.2%.

GWAS Analysis
Filtering the genetic data by quality criteria before GWAS analysis improved the reliability of the results. The filtered genotype showed a low level of the heterozygosity for individuals and markers (Figure 3). The association analysis between selected SNPs and the stalk rot traits was performed using FarmCPU to explore associated loci and allelic variation. The results were shown in Figure 4. Quantile-quantile (Q-Q) plots implied that the population structure and genetic relationship of 165 inbred lines were well-controlled. A total of 19 SNPs that were closely related to stalk rot in maize were detected. These SNPs were distributed on chromosomes 1, 2, 3, 4, 6, 7, 8, and 10 ( Table 2). Among them, 3 SNP loci were detected on each of chromosomes 1, 2, and 6; 2 SNP loci were detected on chromosomes 3 and 8; 4 SNP loci were detected on chromosomes 4; and 1 SNP locus was detected on chromosomes 7 and 10.

Preliminary QTLs Mapping Using the Linked Population
A total of 428 polymorphism markers between CDMA66 and Huangzao4 were screened out of 985 SSR markers covering 10 chromosomes of maize (Supplementary Figure 1). Taking chromosome 1 polymorphism marker bnlg1556 and some individuals in the F 2 population as an example, the genotype analysis was performed, and the amplification results are shown in Supplementary Figure 2. Individuals in the F 2 population with the same bands as the parent P 1 or P 2 are homozygous, and individuals with bands from both parents are heterozygous. The polymorphism markers on chromosomes 9 and 2 showed the highest polymorphism rate (63%) and the lowest polymorphism rate (26%), respectively. After removing the partial segregation markers, 227 SSR markers were remained to construct a genetic linkage map (Figure 5). Using LOD 2.5 as the threshold, three QTLs closely related to disease resistance were mapped on chromosomes 3, 4, and 8 (Table 3, Figure 6A). Comparing the results of QTL mapping and GWAS, two SNP loci (PZE-108021117 and PZE-104097660) were exactly located at QTL intervals at bin 8.02 and bin 4.07. The QTL on Chromosome 4 was mapped between mmc 0371 and umc 1808, had a LOD value of 2.59 and a contribution rate of 5.49%, and showed an additive effect of 0.3916. The QTL on Chromosome 8 was spanned by marker bnlg 2235 and umc1530, having the LOD value of 4.96 (Figure 6B), explaining 10.62% of phenotypic

Candidate Gene Prediction
Some candidate genes containing the SNPs that were significantly associated with the resistance of stalk rot were selected, and the function was predicted at MaizeGDB 4 (https://www.maizegdb. org/) and NCBI 5 (https://www.ncbi.nlm.nih.gov/) ( Table 4). Gene GRMZM2G010912 was detected by PZE-104097660, but no functions were reported till now. SNP locus PZE-101101076 on maize Chromosome 1 was located within the interval of gene GRMZM2G082709, whose functional annotation is NAC domain-containing protein 86; SNP locus SYN4735 on maize Chromosome 2 was located within the interval of gene GRMZM2G080041, whose functional annotation is LRR receptor-like serine/threonine-protein kinase FLS2; SNP locus PZE-106068309 on maize Chromosome 6 was located within the interval of gene GRMZM5G841142 whose functional annotation is thioredoxin reductase 2; and SNP locus PZE-108053920 on maize Chromosome 8 were located within the interval of gene GRMZM2G006745, whose functional annotation is DRE-binding protein 1c.

Gene Expression Analysis
The expression levels of some candidate genes were analyzed in the stalk rot-resistant inbred line CDMA66 and susceptible inbred line Huangzao4. After CDMA66 was inoculated by Fusarium graminearum, the expression levels of GRMZM2G082709 and GRMZM5G841142 peaked at 12 and 72 h, respectively, and with the relative expression values of 5.28 and 6.95, respectively. When Huangzao4 was inoculated, the expression of GRMZM2G082709 was down-regulated, while GRMZM5G841142 showed an unchanged expression level. In addition, GRMZM2G080041 was up-regulated in Huangzao 4 but down-regulated in CDMA66 after the two inbred lines were inoculated by pathogen. In Huangzao 4, the maximum relative expression value of 4.37 appeared at 12 after inoculation. The gene GRMZM2G006745 changed the expression level in both Huangzao 4 and CDMA66 (Figure 7). The above genes may be involved in the mechanism of maize stalk rot.

DISCUSSION
GWAS and genetic linkage map have been widely used to mine the resistance genes of crops. With the method, Li et al. (2015) detected two major resistance-related QTLs (09HR and 03HR) and 31 SNP loci associated with head smut in maize. Moreover, four and one of the 31 identified SNPs were located within the interval of 09HR and 03HR, respectively. Chen et al. (2016) identified 45 SNP loci that were closely related to Fusarium ear rot (FER) resistance in maize by the GWAS method, eight of which were located within FER QTLs' intervals through integrating the results of GWAS and QTL mapping. Xiao et al. (2019) detected two QTLs (qRBSDVD5 and qRBSDVD6) resistance to RBSDVD on chromosomes 5 and 6 and further detected an associated locus in the interval of qRBSDVD6 by GWAS. In our study, we used the GWAS and linkage mapping methods to explore the resistance genes of maize stalk rot and  found 3 QTL loci and 34 SNP loci that were significantly related to maize stalk rot. The two SNP loci, PZE-108021117 and PZE-104097660, were located at the QTL intervals on bin 8.02 and bin 4.07, respectively. Previously, some QTLs for maize stalk rot have been reported to be located on bin 4.03-4.08 (Pè et al., 1993;Jung et al., 1994;Dong and Wang, 2001), and a QTL locus associated with stalk rot was on bin 8.06-8.08 (Chen et al., 2017). Furthermore, Yang et al. (2010) found two QTLs (qRfg1 and qRfg2) for stalk rot and narrowed the physical distance through fine mapping. The qRfg1 locus was located in a ∼500-kb interval on Chromosome 10 and flanked by SSR334 and SSR58. The qRfg2 locus was located in a ∼300-kb interval on Chromosome 1 and had markers SSRZ68 and SSRZ93 on both sides . For the previous studies, most materials used for genome-wide association analysis and genetic linkage map construction are temperate germplasms, while in our study, both temperate and tropical materials were included in order to find new resistance sites of maize stalk rot. GWAS is a powerful tool to identify marker variants that are significantly associated with complex traits (Pritchard et al., 2000). It is a powerful approach for exploring the molecular basis of phenotypic variations and has been widely used for functional gene discovery in plants (Li et al., 2016). In our study, we performed GWAS to detect functional genes associated with maize stalk rot. With the significant associated SNPs obtained from GWAS, NAC family gene was detected after comparing with the reference genome. NAC transcription factors are a large plant-specific transcription factor (TF) family and the NAC domain protein functions as homo-or hetero-dimer forms (Nakashima et al., 2012). Most NAC proteins have conserved DNA-binding domains that contain approximately 150 amino acids at the N-terminus. These domains are divided into five subdomains (A-E). The subdomains A, C, and D are highly conserved, while B and E are variable and involved in many unique functions of NAC proteins (Olsen et al., 2005). So we speculate that the functions of the maize NAC domain protein are mainly the functions that the B and E subdomains are involved in.
The NAC transcription factor family plays a key role in multiple biological processes, such as plant growth, development and stress responses . Harrington et al. (2019) found that the NAM-A1 NAC domain was required for protein binding in wheat and the mutation of the NAC domain would lead to the delayed peduncle and flag leaf senescence. Nogueira et al. (2005) found Saccharum sp. NACs (SsNAC23) was significantly associated with cold and water stresses in sugarcane. The Arabidopsis thaliana NAC proteins ANAC019 and ANAC055 might function as transcription activators to regulate the expression of Jasmonic acid (JA) and then regulate plant defense responses against herbivore attack, pathogen infection, and mechanical damages (Bu et al., 2008). Lv et al. (2016) found the overexpression of a novel NAC domain-containing transcription factor gene (AaNAC1) enhanced the content of artemisinin and increases tolerance to drought and Botrytis cinerea in Artemisia annua. Sun et al. (2013) found that the two rice genes ONAC122 and ONAC131 played an important role in the resistance to rice blast fungus. Lin et al. (2018) found that the maize NAC transcription factor ZmNAM played an important role in regulating the growth and development of root in maize. Zhang et al. (2014) found that NAC transcription factors participated in the response process of maize to drought stress.
We also found a gene related to thioredoxin reductase. Thioredoxin is a small protein that catalyzes the redox of disulfide bonds. It plays an important role by regulating the redox state of cells. In plants, the thioredoxin system is particularly complex, and it participates in plant metabolism, transcription and translation regulation, signal transduction, and plant resistance. The function of thioredoxin in plants is regulated by the dimer thioredoxin reductase (NTR) (Charles et al., 2000). Thioredoxin reductase (TrxR) is a dimer flavinase, a member of the pyridine nucleotide disulfide reductase family, and is widely expressed in cells of all levels of organisms from prokaryotes to humans (Holmgren, 2000). A thioredoxin gene, AtTrxh5, in Arabidopsis thaliana plays a role in the stress response of Arabidopsis to the Victorian oat blight (Avena sativa) disease (Sweat and Wolpert, 2007). Laloi et al. (2004) found that cytosolic thioredoxin AtTRXh5 may have potential significance in response to pathogens and oxidative stresses.
Besides, our research team examined one gene related to DRE-binding protein in Chromosome 8 and LRR receptorlike serine/threonine-protein kinase in Chromosome 2. At present, dehydration-responsive element-binding protein (DREbinding protein) is the most thoroughly studied stress-related transcription factor (Wei et al., 2013). DRE-binding protein transcription factors are involved in regulating the expression of abiotic stress-mediated gene in plants. Sharma et al. (2019) found that apple transcription factor, MdDREB76, conferred salt and drought tolerance in transgenic tobacco by activating the expression of stress-responsive genes. Guo et al. (2018) found that the overexpression of peanut AhDREB1 gene was effective in enhancing salt tolerance in peanuts. Plant receptorlike protein kinases (RLKs) are one of the larger gene families in plants. They have a special protein structure and play an important role in signal transmission. They can sense signal of growth, development and external environmental stress (Lally et al., 2001). Many reports have proved that most of the disease resistance of plants belongs to the LRR-RLK subfamily; for example, the bacterial blight resistance gene Xa21 was cloned from rice and belonged to the LRR subtype, which prevents pathogens from attacking rice (Song et al., 1995). The FLS2 gene of A. thaliana can encode LRR-type RLKs, which can play an important role in plant disease resistance and pathogen identification (Robatzek et al., 2006). These results suggest that multiple mechanisms might be involved in conditioning maize stalk rot resistance.

CONCLUSION
Genetic diversity analysis and GWAS about maize stalk rot were performed in a panel of 165 maize inbred lines. These materials were clustered into four subgroups. Thirty-four SNP loci and three QTLs that significantly associated with maize stalk rot were detected using FarmCPU. These three QTLs are located on chromosomes 3, 4, and 8, respectively. Among them, two SNP loci, PZE-108021117 and PZE-104097660, were located within QTL interval. Moreover, two genes (GRMZM2G096904 and GRMZM2G010912) which contain the two SNP loci respectively were chosen as candidate resistance genes for stalk rot in maize, but no more information was reported. We also found some other candidate genes such as NAC domain-containing protein 86 and DRE-binding protein that may be related to maize stalk rot. How these candidate genes are involved in the pathogenesis of maize stalk rot needs further verification. Our results would be helpful in discovering the mechanism of stalk rot resistance, and to breed resistant varieties of maize.

DATA AVAILABILITY STATEMENT
The original contributions presented in the study are included in the article/Supplementary Material, further inquiries can be directed to the corresponding author.

AUTHOR CONTRIBUTIONS
XS and MZ conceived and designed the study. MZ, SL, XS, JF, and ZS performed the experiments, phenotyped the populations, and field work. SL and JF contributed to genotype and phenotype data analysis. SL drafted the manuscript. All authors read and approved the final manuscript.