Identification of Genetic Loci and Candidate Genes Related to Grain Zinc and Iron Concentration Using a Zinc-Enriched Wheat ‘Zinc-Shakti’

The development of nutritionally enhanced wheat (Triticum aestivum L.) with higher levels of grain iron (Fe) and zinc (Zn) offers a sustainable solution to micronutrient deficiency among resource-poor wheat consumers. One hundred and ninety recombinant inbred lines (RILs) from ‘Kachu’ × ‘Zinc-Shakti’ cross were phenotyped for grain Fe and Zn concentrations and phenological and agronomically important traits at Ciudad Obregon, Mexico in the 2017–2018, 2018–2019, and 2019–2020 growing seasons and Diversity Arrays Technology (DArT) molecular marker data were used to determine genomic regions controlling grain micronutrients and agronomic traits. We identified seven new pleiotropic quantitative trait loci (QTL) for grain Zn and Fe on chromosomes 1B, 1D, 2B, 6A, and 7D. The stable pleiotropic QTL identified have expanded the diversity of QTL that could be used in breeding for wheat biofortification. Nine RILs with the best combination of pleiotropic QTL for Zn and Fe have been identified to be used in future crossing programs and to be screened in elite yield trials before releasing as biofortified varieties. In silico analysis revealed several candidate genes underlying QTL, including those belonging to the families of the transporters and kinases known to transport small peptides and minerals (thus assisting mineral uptake) and catalyzing phosphorylation processes, respectively.


INTRODUCTION
About 3 billion people around the world, especially in countries where cereal-based foods represent the largest proportion of the daily diet, suffer from micronutrient malnutrition resulting primarily from iron (Fe) and zinc (Zn) deficiencies (Cakmak, 2002;Bouis and Islam, 2012;Black et al., 2013;Grew, 2018). The Fe and Zn deficiencies affect the immune system and cognitive abilities and are considered to be important causes of retarded growth (Bryan et al., 2004;Stoecker et al., 2009). Approximately one-fourth of the world population suffers from anemia caused by iron deficiency (Allen et al., 2006). The pregnant women and young children are the hardest hit sections to acute micronutrient malnutrition. WHO states that globally 45% of annual child mortality is attributed to malnutrition. In India alone, more than 50% of children below 5 years of age and pregnant women are anemic, whereas 38% of children of the same age group are stunted (Sharma et al., 2020).
Hence, food security is not only about consuming a sufficient quantity of food but also nutrients to ensure proper human health. The process of supplementation and fortification of food products are common practices to reduce micronutrient deficiency. However, supplementation and fortification have not been successful where malnutrition problem is alarming due to the unaffordability of either of the two options. Moreover, these are not sustainable approaches to combat micronutrient malnutrition due to recurring investments (Pfeiffer and McClafferty, 2007). Genetic biofortification, a strategy to develop the staple food crop varieties with increased levels of micronutrients and reduced levels of anti-nutrients using plant breeding techniques, has been heralded as a sustainable and longterm solution for contributing to alleviating the malnutrition problem (Bouis and Saltzman, 2017).
Wheat (Triticum aestivum L.) is the staple food for over 2.5 billion people, and accounts for 17% of the calories and 20% of human protein intake (Gerard et al., 2020). Wheat is also a main source of micronutrients in predominantly vegetarian populations lacking food diversity in many developing countries (Braun et al., 2010). Therefore, wheat is a suitable candidate for genetic biofortification to combat this hidden hunger among the rural and urban poor particularly from underdeveloped and developing regions of the world.
Genetic variability is the first prerequisite in plant breedingbased methods. Triticum species related to wheat such as Aegilops tauschii, Triticum monoccocum, Triticum dicoccum, Triticum boeticum, and Triticum spelta have been found to be the sources of tremendous genetic diversity for grain Zn and Fe concentrations and for other agronomic and nutritional quality traits . However, the introgression of genes from these wild relatives to the elite wheat lines requires substantial efforts, time, and resources. Hence, synthetic hexaploids were developed by crossing tetraploid durum wheat (T. durum or T. dicoccum) with Ae. tauschii and synthetic-derived lines have been used extensively in introgressing genes into the elite bread wheat (Ginkel et al., 2002). The introgressions from synthetic hexaploids were exploited to develop varieties like 'WB02' and 'Zinc-Shakti' with 20-40% increased Zn content than local checks .
Quantitative trait loci (QTL) mapping using bi-or multiparental populations is highly useful for the discovery of genes for quantitative traits such as grain Zn and Fe and for the development of molecular markers to be used in breeding programs . The advances made in nextgeneration sequencing technologies in wheat and other crops, generating thousands of markers faster and cheaper (Poland et al., 2012;Wang et al., 2014;Winfield et al., 2016), led to increased QTL mapping and genomic prediction studies in the past decade for multiple traits including grain Zn and Fe Tiwari et al., 2016;Velu et al., 2017;Liu et al., 2019;Cu et al., 2020). QTL linked to grain Zn and Fe contents have been reported on chromosomes 1A, 2A, 5A, 2B, 3D, 4B, 6A, 6B, and 7A as individual or pleiotropic genomic regions controlling grain Zn and Fe contents and/or thousand kernel weight (Xu et al., 2012;Hao et al., 2014;Crespo-Herrera et al., 2016).
Therefore, identifying the genomic regions that regulate the accumulation of Zn and Fe in the grain without any confounding effects on grain yield would allow breeders to develop high yielding biofortified cultivars. In the current study, we evaluated a recombinant inbred line (RIL) population, developed from a cross between the high grain Zn cultivar Zinc-Shakti and the low Zn cultivar Kachu, for grain Zn and Fe and other agronomic and yield attributing traits. The specific objectives of the study were to (1) identify stable QTL associated with grain iron (GFeC) and zinc (GZnC) concentration for use in wheat breeding programs, (2) identify lines with best QTL combinations to be deployed in future crossing program, and (3) dissect the role of epistasis in the genetic architecture of nutritional traits (GZnC and GFeC).

Planting Material
The F 6 population consisting of 190 RILs was developed from the cross between the CIMMYT's high Zn wheat cultivar Zinc-Shakti and the low Zn cultivar Kachu. The RIL population along with the two parents and two commercial checks, Baj and Borlaug100, were grown at Norman E. Borlaug Research Station, Ciudad Obregon, Sonora, Mexico to evaluate for the agronomic and nutritional quality traits.

Phenotyping and Analysis of Phenotypic Data
The RIL population was evaluated for grain zinc (GZnC), iron (GFeC) concentration, thousand kernel weight (TKW), and plant height (PH) for 3 consecutive years during 2017-2018 (Y1), 2018-2019 (Y2), and 2019-2020 (Y3). The yield components and agronomic traits were test weight (TW), days to heading (DH), and days to maturity (DM) for 2 consecutive years during 2017-2018 (Y1) and 2018-2019 (Y2). Each RIL was grown on a double-row plot of 1 m length and 0.8 m width in a bed-planting system in randomized complete block design with two replications. Diseases and pests were controlled chemically, whereas weeds were controlled both manually and chemically. Plant materials were harvested after physiological maturity when grains were totally dry in the field. Grain samples of about 20 g for each entry were carefully cleaned to discard broken grains and foreign material and then used for micronutrient analysis. GZnC and GFeC were measured with a "bench-top" non-destructive, energy-dispersive X-ray fluorescence spectrometry (ED-XRF) instrument (model X-Supreme 8000; Oxford Instruments plc, Abingdon, United Kingdom) standardized for high-throughput screening of mineral concentration of whole grain wheat (Paltridge et al., 2012). Two laboratory commercial checks, namely, Baj and Borlaug100, were used as in-house quality control checks. TKW was measured with Seed Count digital imaging system (model SC5000; Next Instruments Pty Ltd., NSW, Australia) that was standardized to measure TKW. The Seed-Count system can rapidly and accurately measure wheat grain samples, determining the grain number and grain physical characteristics based on flatbed scanner technology. The data on DH was measured by counting the number of days from germination to 50% of plants heading in a plot. DM was measured by counting the number of days from germination to physiological maturity when more than 50% of spikes were ripe and had turned yellow. PH was measured from the ground to the tip of the spike excluding awns at the late grain-filling stage.
All phenotypic data analysis was conducted in Meta-R (Multi Environment Trial Analysis with R) version 6.0 software. Best linear unbiased predictors (BLUPs) of each RIL were obtained for an individual year and across years in Meta R. These estimated BLUPs were used in QTL analysis. The broad sense heritability (h 2 ) was also estimated in Meta R for traits in each year and across years.

Genotyping
The genomic DNA was extracted by following the standard procedures by Dreisigacker et al. (2004). The population was genotyped with the DArTSeq technique (Edet et al., 2018) in the Genetic Analysis Service for Agriculture (SAGA) with current headquarters at CIMMYT, El Batan, Texcoco, Estado de Mexico. The array technology works on the genome complexity reduction concept by using a combination of restriction enzymes to obtain a representation of the whole genome. The FASTQ files were quality filtered using a Phred quality score of 30. More stringent filtering was also performed on barcode sequences using a Phred quality score of 10, which represents 99.9% of base call accuracy for at least 75% of the bases. A proprietary analytical pipeline developed by DArT P/L was used to generate allele calls for SNP and presence/absence variation (PAV) markers.

Linkage Mapping, QTL Analysis, and Epistatic Interactions
The initial genotypic information consisted of 40,059 markers codified as 0 and 1 for the two homozygous parental alleles and 2 for the heterozygotes. Parental non-polymorphic markers, markers with >30% missing data, and the redundant markers were discarded using bin function in QTL IciMappingv4.2 software. A filtered set of 909 highly informative DArTseq markers were used for linkage map construction. The linkage groups were assembled from the genotypic data using QTL IciMappingv4.2 software 1 , applying a LOD threshold of 3.0 between adjacent markers (Liu et al., 2019). Markers were ordered with the nnTwoOpt algorithm, using a 5 cM window size for rippling the markers in each linkage group. Linkage groups with less than five markers were discarded from the analysis. QTL were identified using two approaches: inclusive composite interval mapping (ICIM) and multienvironment QTL mapping, both algorithms implemented in QTL IciMappingv4.2 software. ICIM identifies additive and dominant QTL for single environment. The multi-environment QTL mapping uses a QTL by environment interaction (QEI) model and estimates both additive and additive × year interaction effects for each QTL. This methodology first conducts stepwise regression in each environment to identify the most significant marker variables and then one-dimensional scanning on the adjusted phenotypic values across the environments to detect QTL with both average effect and QEI effects. A LOD threshold of 2.5 was applied to call significance. QTL nomenclature was done following the standard procedure 2 .
We estimated two-and three-locus epistatic interactions among identified QTL in R using a script described in Sehgal et al. (2017). Marker-marker interactions were declared significant at a threshold of p < 0.0001 and R 2 was used to describe percentage variation explained (PVE) by the significant interactions.

In silico Analysis of QTL
After the identification of QTL, an in silico search of the candidate genes was conducted in the Ensemble Plants database 3 of the bread wheat genome with the Basic Local Alignment Search Tool (BLAST) using the sequence information of the markers present within the peak of the QTL and the flanking markers.

QTL Additive Effects Estimation
The two genotypic classes of the flanking markers of all important QTL (with highest PVE and pleiotropic for GZnC and GFeC) were compared for the average trait value. Additive effects were then estimated for multiple QTL combinations. The RILs with the best combination of QTL for GZnC and GFeC were identified by estimating the additive effects of multiple QTL.

Phenotypic Evaluations
The mean values of the traits for the two parents, Kachu and Zinc-Shakti, RILs, and the two checks, namely, Baj and Borlaug100, in year I (Y1), year II (Y2), year III (Y3), and across years are shown in Table 1. Large and significant differences were observed between the parents for all the traits except TKW. The RIL population showed a normal to near normal frequency distribution for all the traits in all the years and across the years (Figure 1 and Supplementary Figures 1-5). The range, mean, coefficient of variation (CV), heritability, and variance estimates for the RIL population provided in Table 1. Transgressive segregants for GZnC and GFeC were obtained in the RIL population. For GZnC, the highest performing three RILs showed average values of 70.8, 69.4, and 68.4 mg/kg, which are 9.4 to 15.6 mg/kg higher than the two parents and checks. For GFeC, the three RILs showed average values of 42.9, 42.7, and 42.5 mg/kg, which are 2.9 to 5.0 mg/kg higher than the two parents and checks.
The Pearson correlation coefficients (r) observed in the population between GZnC and GFeC were statistically significant (p < 0.001) in each year of evaluation and across years (Figure 2 and Table 2). GZnC is found to have a consistently significant negative correlation with DH and DM at p < 0.05 to 0.001, whereas it exhibited a significant positive correlation with TKW The t-test results indicating significant difference between parents for each trait are provided in the environment column. *Significant values at p < 0.05, **significant values at p < 0.01, ***significant values at p < 0.001.
(p < 0.001) in Y2, Y3, and across years. GFeC showed a significant positive correlation with PH in Y1, Y2, and across years (p < 0.05 to 0.001) and a negative association with TW in Y2 and over the years (p < 0.001). The DH, DM, and PH exhibited a highly significant correlation with each other in each year and also across the years (p < 0.001). TKW showed a consistently negative correlation with DH, DM, and PH in all the environments ( Table 2).

Linkage Map Construction and QTL Analysis
The linkage map representing all the wheat chromosomes was constructed using 909 non-redundant filtered markers. The number of markers in the linkage groups ranged from 5 (5D) to 109 (2B), whereas the map distance of the linkage groups ranged between 24 cM (1D) and 537 cM (2A). Among the total number of markers, 45.32 and 40.70% of markers were grouped on the A and B genome, respectively, whereas the D genome had the least number of markers (13.97%; Table 3). The whole linkage map covered a genetic distance of 4665 cM with an average inter-marker distance of 5.13 cM. Considering all environments individually, QTL mapping using ICIM identified a total of 15 QTL each for GZnC and GFeC.
Of these, nine and four QTL for GZnC and GFeC, respectively, were detected in at least two environments ( Table 4) and thus were stable. The nine stable QTL detected for GZnC showed considerable variation in LOD scores and percentage variation explained (PVE) in different environments. For example, QTL detected on chromosomes 2A and 2B showed the highest LOD (11.0 and 9.2) and PVE (10.3 and 13.3%) in one of the environments and a moderate value for both LOD and PVE in the remaining environments. The QTL on chromosome 7D, on the other hand, showed moderate values for both LOD scores (6.7-8.0) and PVE (7.9-9.5%) in all environments where it was detected. The QTL on chromosome 1D was detected in all four environments; however, it had minor effects in one of the environments (PVE of 5.4% in environment 2). Out of four stable QTL detected for GFeC, the one on chromosome 2A was identified in all four environments with moderate to high LOD scores (5.2-9.4) but consistently high PVE (>10.0%) in all environments.

In silico Analysis
In silico analysis identified 13 candidate genes underlying six QTL with highest PVE for GZnC and GFeC ( Table 7). The most significant of these are those coding for ABC transporters (TraesCS2A02G110200 underlying QTL QZnC-2A.2) and oligopeptide transporters (TraesCS7D02G099500 and TraesCS7D02G139600 underlying QZnC-7D.1) playing critical roles in the transport of small peptides, secondary amino acids, and mineral uptake or the kinase-like superfamily, catalyzing phosphorylation processes (TraesCS1B02G357900 underlying QZnC-1B.1).

DISCUSSION
Quantitative trait loci mapping is a useful strategy to identify genomic regions governing quantitative traits and to identify molecular markers to facilitate marker-assisted breeding (MAB) of the desired trait into the elite germplasm. Identification of stable QTL across a wide range of environments is of great importance in a MAB program. The present study dissected QTL for two important nutritional traits in wheat, grain Zn and Fe concentration. We identified several stable genomic regions governing GFeC and GZnC through the application of both inclusive composite interval mapping, which identifies  additive and dominant QTL in single environments, and multienvironment QTL mapping which uses QTL-by-environment interaction (QEI) model and identifies QTL with average additive effect and QEI effects (Li et al., 2015). The parental lines in the present study showed contrasting phenotype for grain iron and zinc concentrations and agronomic traits. The high Zn parent 'Zinc-Shakti' is a synthetic hexaploid wheat (CROC1_/AE.SQUARROSA(210)//INQALAB 91 * 2/KUKUNA/3/PBW343 * 2/KUKUNA), whereas the other parent 'Kachu' is a high-yielding adapted line grown widely in South Asia. Both coefficient of parentage (COP = 0.17) and SNP-based diversity (π = 0.26) indicated that the parents are  distantly related and this highly likely is the reason for the ample transgressive segregation observed in our population for both nutritional traits. The action of loci with complementary additive effect differentially present in parental lines can be detected when progenitors are distantly related (Rieseberg et al., 1999).
In agreement with this, we detected QTL originating from both parents, signifying the complementary effect of QTL. Kumar et al. (2007); and Wu et al. (2012) demonstrated the importance of QTL-by-environment interactions for quantitative traits and emphasized that the estimation of the main-effect QTL is biased if QEI were not examined. We detected 27 and 23 QTL in multi-environment QTL mapping and 9 and 4 QTL in ICIM analysis for GZnC and GFeC, respectively. Most significantly, all stable QTL detected in ICIM analysis were also detected in multi-environment analysis. It is, however, noteworthy that multi-environment analysis identified a lot more pleiotropic QTL for GZnC and GFeC which could not be detected in ICIM analysis. For GZnC, QTL with the highest PVE were identified on chromosomes 1B and 7D (7.7 and 8.1%, respectively), whereas for GFeC, these were detected on chromosomes 2A and 6B (10.1 and 10.2%, respectively). Previous QTL studies have also mapped QTL for GZnC and GFeC on these chromosomes (Peleg et al., 2008;Xu et al., 2012;Srinivasa et al., 2014;Krishnappa et al., 2017;Velu et al., 2017Velu et al., , 2018Liu et al., 2019). Most significantly, we detected seven pleiotropic regions or overlapping regions for GZnC and GFeC in the present study on chromosomes 1B, 1D, 2B, 6A, and 7D (Figure 3). Previous studies have reported pleiotropic QTL for GZnC and GFeC on chromosomes 2B, 3B, 3D, 4B, and 5A (Xu et al., 2012;Hao et al., 2014;Crespo-Herrera et al., 2016. Hence, the identification of new pleiotropic regions for GZnC and GFeC in the current study has expanded the diversity of QTL that could be used for simultaneous improvement of GZnC and GFeC. Notably, the identification and cloning of GPC-B1 gene Cysteine-type peptidase activity located on chromosome 6B an early regulator of senescence and affects remobilization of protein and minerals to the grain by Uauy et al. (2006) followed by Pearce et al. (2014) documented the GPC-B1 is a NAC transcription factor and has a paralogous copy on chromosome 2 in wheat. Apparently, some of the QTL identified in this study may have associated with the GPC-B1. Based on the best pleiotropic QTL combination, we have identified nine transgressive individuals from the RIL population that could be used in the breeding pipeline ( Table 6). The higher number of pleiotropic QTL for GZnC and GFeC obtained in the present study vis-à-vis previous studies is largely due to the higher correlation obtained between GZnC and GFeC as compared with previous studies (Xu et al., 2012;Crespo-Herrera et al., 2017;Rathan et al., 2020). For example, Xu et al. (2012); and Crespo-Herrera et al. (2017) reported correlations between GZnC and GFeC ranging from 0.42 to 0.82 and 0.38-0.63, respectively, in different environments, whereas we obtained correlations ranging from 0.76 to 0.96 between the two traits in different years. Further, it is noteworthy that none of the QTL identified here had significant G × E effects as all QTL detected here had larger LOD scores for the additive average effect than the LOD score of the interaction ( Table 5). The contribution of epistatic interactions in the genetic architecture of disease resistance, end-use quality, and grain yield has been extensively investigated in wheat using bi-parental designs (Zhou et al., 2002;Yang et al., 2005;Ma et al., 2006;Kumar et al., 2007;Mann et al., 2009;Singh et al., 2009;Kolmer et al., 2011;Wu et al., 2012). However, little information is available on epistatic interactions of QTL for nutritional traits (Xu et al., 2012). The current study identified significant epistatically interacting QTL for both GZnC and GFeC by twoand three-locus interactions, which suggests the significant role of epistasis in their genetic architecture. Particularly for GZnC, PVE by epistatically interacting loci was higher than PVE explained by 11 additive QTL. Hot spots of epistatic interactions were found on chromosomes 2B and 4A for GZnC and on chromosomes 2A and 7D for GFeC.
In silico BLAST search identified various potential candidate genes underlying QTL with high PVE or pleiotropic QTL for GZnC and GFeC (Table 7). QZnC-2A.2 appears to overlap with a gene coding for ABC transporter. In the past years, a great wealth of information has been gained in understanding the interaction of Fe and Zn homeostasis in plants as a consequence of the chemical similarity between their divalent cations. In this regard, ABC transporters have been predicted to play an important role in plants. In A. thaliana, for example, the ABC transporters are shown to contribute to the accumulation of Cdphytochelatin (Cd-PC) complexes in the vacuole. The chemical similarity of Cd 2+ to Zn 2+ and the important role played by phytochelatins in Zn homeostasis (Tennstedt et al., 2009) suggest that these transporters or their homologs might contribute to Zn homeostasis. The pleiotropic QTL identified on chromosome 7D displayed its location in a region where genes code for oligopeptide transporters (OPTs) family. OPTs encode integral membrane proteins that play critical roles in the transport of small peptides, secondary amino acids, glutathione conjugates, and mineral uptake (Kumar et al., 2019). The expression pattern of genes belonging to the subfamily of OPT transporters in wheat during iron starvation experiments revealed an early high transcript accumulation of a few in roots (Kumar et al., 2019). The proven role of OPTs in long-distance iron transport or signaling in Arabidopsis (Stacey et al., 2008) further suggests both candidates, TraesCS7D02G099500 and TraesCS7D02G139600, underlying this pleiotropic QTL are strong for future validation studies. The BLAST results for QZnC-1B.1, another pleiotropic QTL for GZnC and GFeC, displayed its location in a region where genes code for the serine-threonine/tyrosine-protein kinaselike superfamily (Scheeff and Bourne, 2005), which catalyzes phosphorylation processes, and some are known to activate Zn channels and ZnT zinc transporters (Thingholm et al., 2020). In addition, in the region of QFeC-1D.3, there was a gene encoding for papain-like cysteine peptidase superfamily. Cysteine proteins act as "redox switches" and play an important role in regulatory and signaling pathways; sense concentrations of oxidative stressors and unbound zinc ions in the cytosol and control the activity of metalloproteins (Giles et al., 2003).

CONCLUSION
The GZnC was found to have a highly significant positive correlation with GFeC (0.76-0.96) suggesting the possibilities of simultaneous improvement of both GZnC and GFeC concentration in wheat. Further, identification of pleiotropic regions for GZnC, GFeC, and TKW suggests the possibilities of genetic improvement of GZnC and GFeC without compromising grain yield. The novel pleiotropic QTL identified in the present study have not only expanded the diversity of QTL that could be used in wheat breeding programs but also has opened vistas of validating many underlying candidate genes for biofortification. The nine RILs identified with the best combination of pleiotropic QTL can be used in the breeding pipeline and can also serve as direct candidates of biofortified varieties.

DATA AVAILABILITY STATEMENT
The original contributions presented in the study are included in the article/Supplementary Material, further inquiries can be directed to the corresponding author/s.

AUTHOR CONTRIBUTIONS
NR did the QTL mapping and wrote the manuscript. DS contributed to the assistance for QTL mapping. KT contributed to the assistance on identification of candidate genes using bioinformatics. A-MS supervised the data and wrote, reviewed, and edited the manuscript. RS contributed basic germplasm and resources. VG contributed to the resources, conceptualization, developed the mapping population, and generation of genotypic data. All authors contributed to the article and approved the submitted version.