Genome-Wide Association Study for Grain Micronutrient Concentrations in Wheat Advanced Lines Derived From Wild Emmer

Wheat is one of the important staple crops as the resources of both food and micronutrient for most people of the world. However, the levels of micronutrients (especially Fe and Zn) in common wheat are inherently low. Biofortification is an effective way to increase the micronutrient concentration of wheat. Wild emmer wheat (Triticum turgidum ssp. dicoccoides, AABB, 2n = 4x = 28) is an important germplasm resource for wheat micronutrients improvement. In the present study, a genome-wide association study (GWAS) was performed to characterize grain iron, zinc, and manganese concentration (GFeC, GZnC, and GMnC) in 161 advanced lines derived from wild emmer. Using both the general linear model and mixed linear model, we identified 14 high-confidence significant marker-trait associations (MTAs) that were associated with GFeC, GZnC, and GMnC of which nine MTAs were novel. Six MTAs distributed on chromosomes 3B, 4A, 4B, 5A, and 7B were significantly associated with GFeC. Three MTAs on 1A and 2A were significantly associated with GZnC and five MTAs on 1B were significantly associated with GMnC. These MTAs show no negative effects on thousand kernel weight (TKW), implying the potential value for simultaneous improvement of micronutrient concentrations and TKW in breeding. Meanwhile, the GFeC, GZnC and GMnC are positively correlated, suggesting that these traits could be simultaneously improved. Genotypes containing high-confidence MTAs and 61 top genotypes with a higher concentration of grain micronutrients were recommended for wheat biofortification breeding. A total of 38 candidate genes related to micronutrient concentrations were identified. These candidates can be classified into four main groups: enzymes, transporter proteins, MYB transcription factor, and plant defense responses proteins. The MTAs and associated candidate genes provide essential information for wheat biofortification breeding through marker-assisted selection (MAS).


INTRODUCTION
Mineral elements are essential micronutrients for animals and human beings. It is estimated that approximately half of the world's population are at risk of micronutrient deficiency (Gregory et al., 2017;Tabbita et al., 2017), resulting in serious health problems (Cakmak and Kutman, 2018). For example, the deficiency in iron (Fe) and zinc (Zn) may cause overall poor health, anemia, increase morbidity and mortality rates, and low worker productivity (Peleg et al., 2008;Gibson, 2012;Borrill et al., 2014), whereas the manganese (Mn) shortfall may affect reproductive ability, mental development, and bone development (Li and Tan, 2012). In addition, the previous study showed that crops grown in high CO 2 environments not only have low nutritional value, but also lack Fe, Zn, and other important micronutrients (Myers et al., 2014). Developing micronutrientenriched agricultural crops (biofortification), agronomically and/or genetically, is considered the most cost-effective and sustainable approach to alleviate malnutrition and related health problems (Peleg et al., 2008;Bouis and Saltzman, 2017).
Wheat (Triticum aestivum, 2n = 6x = 42, AABBDD) is a major staple food crop worldwide. Therefore, the composition and nutritional quality of wheat grains have a significant impact on human health and well-being. Modern wheat cultivars are, however, inherently low in micronutrients and show a narrow genetic variation for micronutrients to be exploited in breeding programs (Velu et al., 2017). One effective strategy to improve the amount of mineral elements in wheat grains is to exploit the "left behind" genetic variation in wild relatives for grain micronutrients (Peleg et al., 2008(Peleg et al., , 2009.
Wild emmer wheat (Triticum turgidum ssp. dicoccoides, 2n = 4x = AABB) is the tetraploid progenitor of cultivated wheat, harboring a rich allelic repertoire for improvement of various economically important traits in wheat (Gomez-Becerra et al., 2010), including micronutrient concentrations (Yan et al., 2018). Substantially high concentrations of up to 190 mg kg −1 for Zn and 109 mg kg −1 for Fe in grains of wild emmer have been reported (Cakmak et al., 2004. The accumulation of minerals in seeds depends on a plethora of processes that are controlled by several genes (Peleg et al., 2009). Till now, dissection of the genetic basis of grain micronutrients content in wild emmer wheat was performed using tetraploid wheat recombinant inbred lines (RILs), derived from a cross between durum wheat (T. turgidum ssp. durum, cv. Langdon) and wild emmer wheat (accession G18-16) (Peleg et al., 2009;Fatiukha et al., 2020). The introgression and identification of wild emmer quantitative trait locus (QTL) linked to micronutrient accumulation in a hexaploid wheat background have been less reported. The wild emmer gene Gpc-B1 that belongs to the NAC-domain transcription factor affects grain protein content (GPC), grain Zn, Fe, and Mn concentrations (GZnC, GFeC, and GMnC) in wheat (Uauy et al., 2006;Distelfeld et al., 2007). However, this gene was associated with reductions in grain weight and yield in some environments and wheat cultivars (Uauy et al., 2006;Brevis and Dubcovsky, 2010;Tabbita et al., 2013). Thus, the exploration of genomic regions associated with high GZnC, GFeC, and GMnC that have less negative effect on grain yield is desirable.
Genome-wide association analysis (GWAS) has been widely used for deciphering the genetic basis of multiple traits in crops (Su et al., 2016;Wang et al., 2016;Ates et al., 2018;Liu et al., 2019). It was available to study QTLs related to agronomically important traits in large sets of germplasm resources such as landraces , elite cultivars (Sukumaran et al., 2015), and advanced breeding lines (Wang et al., 2017) as well as backbone parents and their derived lines (Yu and Tian, 2012;Yu et al., 2014;Xiao et al., 2016;Liu et al., 2019). In wheat, GWAS has been extensively applied to reveal genomic regions controlling traits such as GPC (Liu et al., 2019), grain ionome (Fatiukha et al., 2020), and yield-related traits (Tadesse et al., 2015).
In our previous study, the agronomically stable advanced lines were obtained from a cross between common wheat cultivar Chuannong16 (CN16 hereafter) and wild emmer accession D1 (Liu et al., 2019). We found that most of the tested advanced lines without Gpc-B1 showed thousand kernel weight (TKW), GPC and grain mineral concentrations (e.g. GZnC and GFeC) simultaneous improvement (Wang, 2015). In this study, GWAS was used to analyze the genetic basis of GFeC, GZnC, and GMnC in a multi-parent population which consisted of wild emmer as backbone parent and its derived advanced lines. The objectives of the current study were to identify genomic regions associated with high GFeC, GZnC, and GMnC in a hexaploid wheat population, and to scan promising candidate genes using the whole genome assembly of wild emmer wheat (Avni et al., 2017) and Chinese Spring (Appels et al., 2018). The identified markers and genomic regions will provide essential information for cloning genes related to high GFeC, GZnC, and GMnC and be useful in biofortification breeding programs.

Plant Materials and Experimental Design
The same set of 161 advanced lines (Liu et al., 2019) derived from a cross between CN16 and D1 were used in the present study. Wheat plants were arranged in the field using a randomized complete block design with three replicates over two growing seasons (2015 and 2016) at the Chongzhou (2015CZ and 2016CZ) and Wenjiang (2015WJ and 2016WJ) experimental stations of Sichuan Agricultural University, China (Liu et al., 2019). Individual plants were spaced 10 cm apart within a 2 m row with 30 cm between rows. Each replicate contained 20 plants in a 2 m row. Cultivation and management followed local field production conditions. The soil types at fields of Wenjiang and Chongzhou experimental stations are paddy soil and yellow soil, respectively. Mature seeds were harvested from the middle six wheat plants of each row and measured for GFeC, GZnC, GMnC, and TKW.

Phenotypic Measurements
Wheat grains were dried to a constant weight and then ground to a fine powder using Chopin CD1 AUTO (Renault, Boulogne-Billancourt, France). A 0.3 g powder samples were digested with a diacid mixture [HNO 3 (4): HClO 4 (1) (v/v)]. Atomic absorption Spectrometer (PinAAcle 900T, USA) was used to measure the GFeC, GZnC, and GMnC. The weight of 300 randomly sampled kernels (GB T 5519-2008(GB T 5519- , 2008 was recorded with an electronic balance to represent the TKW.

Genotyping and Population Structure and Linkage Disequilibrium (LD) Analysis
The wheat genomic DNA was isolated from seedlings using a CTAB method (Murray and Thompson, 1980) with minor modification and then, genotyped using the DArT markers (https://www.diversityarrays.com/). The obtained data were filtered according to call rate (minimum threshold value of 85%) and reproducibility (minimum threshold value of 95%) (Liu et al., 2019). The recalled marker data with missing data >10% and minor allele frequency (MAF) <5% were further filtered (Liu et al., 2019). The retained 13,116 markers were used for further study. The linkage disequilibrium (LD) and population structure analyses were conducted as described by Liu et al. (2019).

GWAS for GZnC, GFeC, and GMnC
The best linear unbiased prediction (BLUP) across four tested environments was performed using the META-R (Alvarado et al., 2015). The association analysis of markers and GZnC, GFeC, and GMnC was conducted using the general linear model (GLM) and mixed linear model (MLM) in TASSEL. The population structure (Q matrix) was used as a covariate to adjust population stratification. Kinship matrix (K) was calculated using the Scaled IBS method (Yu et al., 2006). A threshold p-value of -log 10 (p) ≥ 3 (Alomari et al., 2018) was used to define significant marker-trait associations (MTAs). MTAs were displayed with a Manhattan plot using Haploview4.2 software (Barrett et al., 2004). Important p-value distributions were shown with a quantile-quantile plot. The prediction of candidate genes linked to MTAs was performed as described by Liu et al. (2019).

Statistical Analysis
Analysis of variance (ANOVA) and estimates of heritability were performed using the META-R (Alvarado et al., 2015). T-test and the phenotypic Pearson correlation coefficients were calculated using SPSS version 22.0 (SPSS Inc., Chicago, IL, USA).

Phenotypic Variation for GFeC, GZnC, and GMnC
The variation for GFeC, GZnC, and GMnC in the advanced lines and their core parents were showed in Table 1. A wide range of phenotypic variation was found among the wheat advanced lines. These traits were significantly different among genotypes and showed medium-to-high heritability (0.67-0.98) ( Table 1). Most genotypes displayed relatively stable GFeC, GZnC, and GMnC across four environments (2015WJ, 2015CZ, 2016WJ, and 2016CZ) (Figure 1).
The wild emmer D1 had significantly (P < 0.05) higher GFeC (mean range 92.69-115.47 mg/kg) compared to that of CN16 (mean range 38.15-49.20 mg/kg) across all test environments ( Table 1). The GFeC of the advanced lines were ranged from 42.44 to 168.00 mg/kg (mean range 83.71-109.68 mg/kg) across four environments. The lowest GFeC (42.44 mg/kg) was recorded at 2016CZ, whereas the highest GFeC (168.00 mg/kg) was recorded at 2015 WJ. There is no significant difference in GMnC between D1 and CN16, except in two environments (2015CZ and 2015WJ). The GMnC of the advanced lines were ranged from 16.28 to 53.85 mg/kg (mean range 30.36-37.14 mg/kg) across four environments. The lowest GMnC (16.28 mg/kg) was recorded in 2016 at Wenjiang, whereas the highest GMnC (53.85 mg/kg) was recorded in 2015 at Wenjiang ( Table 1).
The mean TKW of the advanced lines under 2015WJ, 2015CZ, 2016WJ, and 2016CZ environments was 46.08, 52.90, 49.88, and 49.06 g, respectively, which were all higher than those of female parent CN16 (Table 1).

Phenotypic Frequency Distributions and Pearson's Correlation Analysis
Frequency distributions of advanced lines for GFeC, GZnC, and GMnC with an indication of the bi-parents controls were presented in Figure 2. The advanced lines exhibited normal distributions for all micronutrient variables under four environments. The highest frequency distribution of GFeC showed a large difference in four environments, implying this variable was subjected to environmental factors. In contrast, the highest frequency distributions of GZnC, GMnC, and TKW were clustered in a narrow range of 50-70 mg/kg, 30-40 mg/kg, and 45-55 g, respectively (Figure 2). Most wheat lines had both higher grain micronutrient concentrations and TKW than those of CN16 in four tested environments (Supplementary Table 1).
Pearson's correlation analysis revealed highly significant positive correlations among the GFeC, GZnC, and GMnC base on BLUP. The TKW had a significant positive correlation (P < 0.05) with GMnC and no negative correlation with GFeC and GZnC ( Table 2).

Population Structure and Linkage Disequilibrium (LD) Analysis
Based on population structure and LD analysis, the 161 RILs could be divided into three subgroups and the LD decay distances for the A, B, and D subgenomes and the entire genome were about 9, 12, 13, and 12 cM, respectively, which were rapidly decreased with increasing pairwise distance as described previously (Liu et al., 2019).

Candidate Genes That May Be Associated With GFeC, GZnC, and GMnC
The high-confidence MTAs were selected to predicate candidate genes using the annotated Chinese Spring genome sequence (RefSeq v.1.0) and wild emmer genome sequence (WEWSeq v.1.0). A total of 38 putative candidate genes for micronutrient concentrations were predicted (Supplementary Table 3). These genes can be roughly classified into four groups based on the types of proteins they encoded. The first group of candidates encoded enzymes related to iron-dependent dioxygenase, metaldependent hydrolase, phosphatase, and methyltransferases. The second group consisted of transporter proteins, such as sugar transporter, ABC transporter, and heavy metal transport proteins. The third group included MYB transcription factor, zinc finger family, and transmembrane proteins. The last group consisted of proteins related to plant defense responses, such as receptor-like kinase, leucine-rich repeat-containing protein, and mitogen-activated kinase (MAPK) (Supplementary Table 3).

DISCUSSION
The poor bioavailability and low concentration of grain micronutrients (especially Zn and Fe) in cultivated wheat are common problems (Xu et al., 2012). Enhancement of the wheat grain nutritional value through genetic biofortification is available (Borrill et al., 2014). The wild emmer accessions are a valuable resource for micronutrient improvement in wheat (Chatzav et al., 2010). In the present study, the same set of 161 advanced lines derived from wild emmer (Liu et al., 2019) were phenotyped in four environments and genotyped using the DArT markers to understand the genetic basis of GFeC, GZnC, and GMnC accumulation in wheat. GWAS identified 14 high-confidence MTAs for GFeC, GZnC, and GMnC by both GLM and MLM (   predicate using the wheat sequences annotation (RefSeq v.1.0; WEWSeq v.1.0). We have found that the GFeC, GZnC and GMnC of advanced lines derived from wild emmer were significantly higher than those of CN16, with the exception of GMnC in 2016WJ and 2016CZ. Medium-to-high heritability (0.67-0.98) were estimated for GFeC, GZnC, GMnC, and TKW, which indicated these traits were largely governed by genetic factors. Similar heritability for these traits has been reported in previous studies (Velu et al., 2017;Bhatta et al., 2018). These results demonstrated the potential value of wild emmer for grain micronutrient improvement in wheat. On the other hand, we have observed significant positive correlations among GFeC, GZnC, and GMnC and no significant negative correlations FIGURE 4 | Potential genotypes with high micronutrient concentrations selected from the RILs based on high-confidence MTAs associated with GFeC, GZnC, and GMnC. between grain micronutrient concentrations and TKW. Our results were consistent with a previous study that the wild emmer GZnC and GFeC were positively correlated (Peleg et al., 2009).
These results indicate that the alleles for GZnC and GFeC are either co-segregated or have a pleiotropic effect. Six MTAs on chromosomes 3B, 4A, 4B, 5A (2), and 7B are related to GFeC. Two MTAs (2258533 and 3033960) at position 0-0.38 cM on chromosome 5A are located in the same GFeC-QTL region as reported by Peleg et al. (2009). The MTA 2255722 (46.59 cM) was located in a GFeC-QTL region between markers wmc468-barc170 at position 32.6-50.8 cM on chromosome 4A (Pu et al., 2014). The remaining 3 MTAs on 3B, 4B, and 7B in our studies might be novel.
Among the three MTAs related to high GZnC, two MTAs (1077698 and 1234362) on chromosome 2A at position 139.90-145.15 cM were located in the same interval of wild emmer GZnC-QTL (112.4 ± 35.0 cM) (Peleg et al., 2009). The position of MTA 1136167 (359.46 cM) in the current study is different to the GZnC-QTL (39 cM) on chromosome 1A (Roshanzamir et al., 2013).
The five MATs related to high GMnC located on chromosome 1B. Several GMnC-QTLs have been reported on different wheat chromosomes such as 1A, 2A, 2B, 2D, 3A, 3B, 4B, 4D, 5A, 5D, 6B, and 7B (Shi et al., 2013;Pu et al., 2014;Bhatta et al., 2018). To our best knowledge, there have no reports of GMnC QTLs on chromosome 1B. Noteworthy, we have found that some wheat genotypes carrying MTAs for high GFeC, GZnC, and GMnC, but without reduction in TKW (Figure 4, Supplementary Table 2). The GFeC, GZnC, GMnC, and TKW of these wheat genotypes were significantly higher than those of the parent CN16 (p < 0.001) (Supplementary Table 2). These results indicate the presence of novel wild emmer alleles that would be used for grain minerals biofortification.
In the present study, the high-confidence MTAs were used to predicate putative candidate genes. We found that MYB transcription factor, dioxygenase, and hydrolase genes may play important roles in conferring high GFeC. For example, the dioxygenase genes in higher plants were reported to be involved in the absorption and transport of Fe (Kobayashi and Nishizawa, 2012). The MYB transcription factor in Malus xiaojinensis was responded to Fe deficiency stress (Han et al., 2020). In addition, we have identified hydrolase family protein/HAD-superfamily protein, thioredoxin-like protein AAED1, kelch repeat-containing protein, receptor-like protein kinase, and HD domain-containing metal-dependent phosphohydrolase family protein on chromosome 4A and 3B. The previous study showed that the kelch repeat-containing protein was associated with Fe acquisition and metal homeostasis (Kawahara et al., 2017).
Our results indicated that zinc ion binding (FAR1), heavy metal transport, zinc finger (C3HC4-type RING finger) family, and mitogen-activated protein kinase (MAPK) may be promising candidates for GZnC (Supplementary Table 3). Previous studies found that the metal transporter (ZIP1) could transport Zn in wild emmer wheat (Durmaz et al., 2011). A recent study reported that a MAPK-related gene may be involved in Zn accumulation in chickpea seeds (Upadhyaya et al., 2016;Alomari et al., 2018).
We have identified S-adenosyl-L-methionine-dependent methyltransferases, ABC transporter, phosphatase, and argininosuccinate synthase as the candidates for GMnC. For example, the ABC transporter was proved to be involved in the uptake and transport of manganese ions (Li and Tan, 2012). Arginase is a type of manganese enzyme, and therefore argininosuccinate synthase can lead to Mn accumulation (Li and Tan, 2012).
Previous studies have demonstrated that the grain yield and micronutrients (Fe, Zn, and Mn) are negatively correlated in wheat, making the simultaneous improvement of these two traits challenging (Shi et al., 2013;Bhatta et al., 2018). However, Peleg et al. (2008) reported neither negative nor positive associations between grain micronutrient concentrations and yield in wild emmer wheat. In our previous studies, we have introgressed the high GFeC, GZnC, and TKW traits of wild emmer into common wheat through wide hybridization (Wu et al., 2008;Wang, 2015). In this study, we found no significant negative correlation between grain micronutrient concentrations (GFeC, GZnC, and GMnC) and TKW in the advanced lines. Our findings demonstrate the improvement of grain micronutrient concentrations (GFeC, GZnC, and GMnC) without sacrificing grain yield could be possible by the exploitation of wild emmer MTAs.

CONCLUSION
Our results show the importance of wild emmer as valuable germplasm for the improvement of grain micronutrient concentrations in wheat. The advanced lines derived from wild emmer showed high grain micronutrients and a weak correlation between grain micronutrients and TKW, indicating simultaneous improvement of grain micronutrient concentrations (especially GFeC and GZnC) and TKW could be possible. The top-ranking 61 wheat genotypes carrying MTAs for high GFeC, GZnC, and GMnC without reduction in TKW could be valuable donor parents for wheat biofortification breeding.
A GWAS identified 14 high-confidence significant MTAs that were associated with GFeC, GZnC, and GMnC, of which nine MTAs were novel. In total, 38 putative candidate genes for these MTAs were predicated. The MTAs and associated candidate genes provide information for fine mapping and cloning of genes related to GFeC, GZnC, and GMnC and for wheat biofortification breeding programs.

DATA AVAILABILITY STATEMENT
The original contributions presented in the study are included in the article/Supplementary Material, further inquiries can be directed to the corresponding author.

AUTHOR CONTRIBUTIONS
BW, JL, and LH designed the experiment. JL, TL, and GT performed experiments. JL analyzed data and drafted the manuscript. YL and ZY provided the genotyping data. YZ, DL, and BW provided their valuable suggestions. LH and BW revised the manuscript. All authors read and approved the final manuscript.