QTL Mapping of Grain Zn and Fe Concentrations in Two Hexaploid Wheat RIL Populations with Ample Transgressive Segregation

More than 50% of undernourished children live in Asia and more than 25% live in Africa. Coupled with an inadequate food supply, mineral deficiencies are widespread in these populations; particularly zinc (Zn) and iron (Fe) deficiencies that lead to retarded growth, adverse effects on both the immune system and an individual's cognitive abilities. Biofortification is one solution aimed at reducing the incidence of these deficiencies. To efficiently breed a biofortified wheat variety, it is important to generate knowledge of the genomic regions associated with grain Zn (GZn) and Fe (GFe) concentration. This allows for the introgression of favorable alleles into elite germplasm. In this study we evaluated two bi-parental populations of 188 recombinant inbred lines (RILs) displaying a significant range of transgressive segregation for GZn and GFe during three crop cycles in CIMMYT, Mexico. Parents of the RILs were derived from Triticum spelta L. and synthetic hexaploid wheat crosses. QTL analysis identified a number of significant QTL with a region denominated as QGZn.cimmyt-7B_1P2 on chromosome 7B explaining the largest (32.7%) proportion of phenotypic variance (PVE) for GZn and leading to an average additive effect of −1.3. The QTL with the largest average additive effect for GFe (−0.161) was found on chromosome 4A (QGFe.cimmyt-4A_P2), with 21.14% of the PVE. The region QGZn.cimmyt-7B_1P2 co-localized closest to the region QGZn.cimmyt-7B_1P1 in a consensus map built from the linkage maps of both populations. Pleiotropic or tightly linked QTL were also found on chromosome 3B, however of minor effects and PVE between 4.3 and 10.9%. Further efforts are required to utilize the QTL information in marker assisted backcrossing schemes for wheat biofortification. A strategy to follow is to intercross the transgressive individuals from both populations and then utilize them as sources in biofortification breeding pipelines.


INTRODUCTION
Malnutrition in its different forms affects more than 2 billion people across the globe, and undernutrition is the main cause of the 45% of deaths in children aged under 5 years of age (WHO, 2017). Recent figures indicate that 155 million children suffer from stunting and 52 million are wasted, of which more than 50% live in Asia and more than 25% live in Africa (WHO, 2017); areas where there is an alarmingly high incidence of malnutrition (Muthayya et al., 2013). Mineral deficiencies, particularly Zn and Fe lead to retarded growth and affect the immune system and cognitive abilities (Bryan et al., 2004;Osendarp et al., 2007;Stoecker et al., 2009;Kambe et al., 2014). In 2016, the United Nations established a decade of action to combat malnutrition and this included the promotion and provision of healthy and sustainable food systems, encompassing investments in agriculture (WHO, 2017).
The generation of biofortified, staple food crops such as wheat, is an important opportunity to contribute to the solution of the hidden hunger problem in low income countries . Agronomic practices can contribute to wheat biofortification. For instance, it is proven that foliar applications of Zn can increase this mineral concentration in the grains, but only in conditions when the soil is Zn-deficient, and there are indications that an effective method for increasing grain Zn is the combination of soil fertilization with foliar applications . However, augmenting mineral concentrations solely through agronomic practices implies increased production costs for farmers, which are often not able to afford.
Genetic biofortification offers a solution that is not opposite to agronomic practices, and in fact can be synergistic . However, wheat improvement for higher concentrations of micronutrients in the grain requires substantial efforts in resources and money, starting from the identification of source materials with high nutrient concentration in the grain, to pre-breeding and breeding for final product development. Furthermore, the identification of favorable alleles from diverse origins is fundamental for wheat biofortification (Singh and Velu, 2017), since breeding progress requires the combination of different loci in breeding pipelines. Sources with vast genetic diversity for grain Zn and Fe concentrations, besides wheat landraces, are species such as Aegilops tauschii (Coss.), Triticum monoccocum L., Triticum dicoccum Schrank ex Schübl., Triticum boeticum Boiss., and Triticum spelta L. . The exploitation of these genetic resources can be through the development and utilization of synthetic hexaploid wheats (SHWs) to introgress favorable alleles from the tetraploid species T. dicoccum and the diploid species A. tauschii in elite bread wheats (Mujeeb-Kazi, 1995). Also favorable alles can be transferred from T. spelta to bread wheat by direct crossing due to the hexaploid nature of T. spelta. Even though, landraces, SHWs and T. spelta can be directly crossed with bread wheat, it requires substantial efforts to apply the appropriate selection methods to only transfer the loci of interest without losing the adaptability and yield potential of elite germplasm (Velu et al., 2012. Modern technologies such as next-generation sequencing and advanced statistical procedures can facilitate the identification and introgression of genomic regions associated with higher Zn and Fe in the grains of elite germplasm. One way to identify genomic regions associated with traits of interest is the QTL or linkage mapping procedures. When mapping populations are developed and then phenotyped in different environments or years, it is possible to conduct QTL analysis for multi-environmental trials in various ways (Da Costa E Silva et al., 2012b;Li et al., 2015). An approach that can be utilized is inclusive-composite interval mapping (ICIM) , where first a stepwise regression is performed in each environment to identify markers that significantly explain the phenotypic variation, which are then used to adjust the phenotypic values, and then interval mapping is conducted on the adjusted phenotypic data across environments to detect QTL with significant average additive effect, and/or QTL with significant interaction with the environment. This two-step approach has proven to be effective in controlling the genetic background effect (selection of cofactors), which in consequence reduces the variance of the estimated genetic parameter and hence increases power and precision . The ICIM approach for multi-environmental trials gives three LOD scores, the first for the QTL-by-environment interaction, the second for the average additive effect and thirdly an overall LOD score which is the sum of the first and the second. The genome wide significance threshold can be obtained through an empirical formula or through permutation tests .
One gene that is cloned and reported to increase Zn, Fe, and protein content in wheat grains is the Gpc-B1 locus in chromosome 6BS, initially mapped in a population of recombinant inbred lines of tetraploid wheat (Uauy et al., 2006). Gpc-B1 gene encodes a NAC transcription factor (NAM-B1) that accelerates senescence and increases nutrient translocation from leaves to grains (Uauy et al., 2006). Additionally, various authors have identified QTL associated with grain Zn and Fe concentrations and efficiency on various chromosomes of wheat and wheat relatives, for instance 1A, 2A, 2B, 3D, 4B, 6A, 6B, and 7A (Tiwari et al., 2009;Xu et al., 2012;Srinivasa et al., 2014;Velu et al., 2016). Some of these works have mapped QTL that are either tightly linked or are pleiotropic for grain Zn and Fe concentration, and even show some association with thousand kernnel weight (Xu et al., 2012;Hao et al., 2014;Crespo-Herrera et al., 2016). These findings are relevent, since they indicate the possibility of breeding for higher concentration of micronutrients sumultaneously. Supported by previous QTL mapping results in the Global Wheat Program at CIMMYT Crespo-Herrera et al., 2016), efforts are being made to develop molecular markers associated with grain Zn and Fe concentrations in the grains, and their further validation and utilization in marker assisted backcrossing schemes.
In the present study we evaluated two diverse recombinant inbred line populations for three crop seasons. With the aid of genotyping by sequencing and QTL analysis for multienvironmental trials, we identified QTL for grain zinc and iron that may be useful for wheat biofortification.

Plant Materials
Two F6 populations (Pop1 and Pop2), each of 188 recombinant inbred lines (RILs), were developed from the cross of a synthetic hexaploid wheat (SHW) and a T. spelta L. derived line ( Table 1). The parental lines were developed at CIMMYT for their medium-high concentration of GZn and GFe and crossed with the expectation of observing transgressive segregation in the mapping populations. Parent 1 (Table 1) of both populations was distributed in the 1st Harvest Plus Yield Trial (Velu et al., 2012), an internationally and annually distributed nursery in South Asia to CIMMYT collaborators within the International Wheat Improvement Network (IWIN). Trials were irrigated five times throughout the crop cycle and fertilized at a rate of 200-50 (N-P), of which 50-50 was applied in pre-sowing, and 150-00 at tillering stage. Diseases and pests were controlled chemically, whereas weeds were controlled manually and chemically. The soil of the field trials was previously enriched with 25 kg ha −1 of ZnSO 4 .7H 2 O over three crop cycles. Soil analysis of the land were RILs were grown had an average Zn concentration of 1.2 ppm at soil depth of 0-30 cm, and 0.86 ppm at a soil depth of 30-60 cm. The average Fe concentration in the soil was 5.01 and 6.1 ppm, at 0-30 and 30-60 cm soil depth, respectively.

Phenotyping and Analysis of Phenotypic Data
Mineral concentrations in the grains (GZn and GFe) were measured with a "bench-top, " non-destructive, energy-dispersive X-ray fluorescence spectrometry (EDXRF) instrument (model X-Supreme 8000; Oxford Instruments plc, Abingdon, UK) standardized for high-throughput screening of GZn and GFe in whole grain wheat (Paltridge et al., 2012).
All analyses of the phenotypic data were conducted in R v3.3.2 (R Development Core Team, 2013) with the lme4 package (Bates et al., 2015). Best linear unbiased predictors (BLUPs) of each RIL were obtained for single crop cycles by specifying a single year analysis model where RILs and reps were regarded as random effects. Additionally BLUPs across years of evaluation were obtained by specifying a multi-year analysis model, where RILs, RILs × year interaction and reps within year were regarded as random effects. Heritability (h 2 ) was computed from the variance components. In addition, to assess the significance of the genotype-by-year interaction (GxY), we performed an analysis of variance with the RILs, years and the interaction of these as fixed effects.

Genotyping
Populations were genotyped with the Diversity Array Technology (DArT), and DArT-Seq. The array technology reduces the DNA complexity by using a combination of restriction enzymes to obtain a representation of the whole genome; the variable fragments of DArT are hybridized to a library of the species of interest, thus showing its nature of "presence/absence" patterns (Wenzl et al., 2004). The difference between DArT and DArT-Seq is that the latter works with the next generation sequencing technologies and skips the hybridization process, thus greater amounts of marker information can be obtained (Sansaloni et al., 2011). SNP calling was made simultaneously for both populations.

Linkage Analysis and QTL Mapping
A total of 9,034 markers were obtained for each population after the genotyping procedures. The markers that were not retained for the linkage analysis were those with: more than 20% missing data, minor allele frequency lower than 5% and those that were monomorphic between the parents of each population. Linkage and QTL analyses were conducted with the ICIMapping software (Li et al., 2008;Meng et al., 2015). The chromosome location of DArT markers (Akbari et al., 2006) was used as anchoring information to group the DArT-Seq markers using a LOD = 5.0 as significance threshold, in this way the markers with unknown chromosome assignment (DArT-Seq) can be grouped together with those that have chromosome information (DArT) given the indicated significance threshold. Markers were ordered with the Traveling Salesman algorithm, using a 5 cM window size for rippling the markers in linkage groups (LGs). Linkage groups with less than three markers or markers with no linkage were discarded from the analysis. The LG for Pop1 were built with 5,301 markers, of which 4,120 were DArT-Seq. The LG for Pop2 were constructed with 4,875 markers, of which 4,521 were DArT-Seq. Consensus maps derived from both populations were built for chromosomes that commonly harbored genomic regions associated with GZn and GFe. The consensus maps were built with the package LPmerge (Endelman and Plomion, 2014) in the R software R v3.3.2.
Inclusive composite interval mapping (ICIM) was used to make QTL analysis. The ICIM method applies a strategy in which a stepwise regression is firstly made, so markers with significant effect on QTL are selected, and then follows an interval mapping step, where the phenotypic values are adjusted by the selected marker variables, except for the two markers that flank the scanning position at each mapping step for background control. ICIM was performed with the multi-environmental model built in ICIMapping . Significant LOD thresholds were taken at the 5% tail of the null distribution in a 1,000 permutations test (Da Costa E Silva et al., 2012a,b).
After the QTL analysis a search was conducted with the Basic Local Alignment Search Tool (BLAST) in the Ensemble Plants database of the bread wheat genome (http://plants.ensembl.org/ index.html) with the default provided parameters. The search was conducted with the sequence of the markers flanking the QTL.

Phenotypic Evaluations
The levels of GZn in the progenitors of Pop1 ranged from 43.7-61.7 to 31.1-35.6 mg·kg −1 for GFe, while the range of GZn in the progenitors of Pop2 was 49.5-66.2 and 33.2-37.6 mg·kg −1 for GFe ( Table 2).
The   Statistically significant (p < 0.001) correlation coefficients (r) were observed for each population between GZn and GFe in each year of evaluations and across years (Figure 3). Additionally, since significant GxY was identified with the ANOVA for GZn, we calculated Kendall's τ coefficient of concordance to determine the presence of significant rank changes, which ranged from 0.20-0.35 (p < 0.01) of Pop1, and 0.27-0.49 (p < 0.0001) for GZn of Pop2. Kendall's τ coefficient of concordance was not calculated for GFe as the ANOVA did not indicate significant GxY interaction.

QTL Analysis
Of the total number of markers for Pop 1 (5,301) and Pop2 (4,875), more than 50% were grouped within the B genome, whereas the D genome had the least number of markers grouped ( Table 3). Linkage groups representing all wheat chromosomes were built with the genotypic data of each population. For Pop1 the number of markers in the linkage groups ranged from 24 (5D) to 635 (2B). The number of markers of the LGs for Pop2 ranged from 40 (2D) to 1,002 (1B). The map distance of the linkage groups ranged from 62.6 cM (5D) to 641 cM (3B). The average density of the linkage maps ranged from 0.5 to 1.7 markers·cM −1 .
The multi-environmental QTL analysis showed the association of various genomic regions with GZn and GFe in both populations (Tables 4, 5). The genome wide significant threshold was LOD = 4.5 and LOD = 5.0 for Pop1 and Pop2, respectively. The analysis indicated the presence of QTL for GZn on common chromosomes of both populations, namely: 1B, 6A, and 7B. In addition, one QTL for GFe in 5B was found in both populations. The consensus map built for 7B mapped the QTL for GZn from each population on near-by positions (Figure 4), indicating the presence of common regions for GZn. Consensus maps for chromosomes 1B and 6A were not possible to construct because of the reduced amount (<20%) of markers shared between linkage groups of each population.
The QTL with the second largest average additive effect was found on chromosome 6A (QGZn.cimmyt-6A_P2) between the interval 178.5-179.5, linked to marker 1697218, which explained a total PVE of 8.53%. Additional QTL of interest are those found on chromosomes 1A and 1B (Table 5), which displayed 10.78 and 11.25% of PVE, respectively.

Sequence Alignment (BLAST)
The sequence of the flanking marker of each QTL were entered in the Ensambl Plants database of the bread wheat genome sequence (http://plants.ensembl.org/index.html). Of all the QTL identified, 10 were located where genes coding for uncharacterized proteins are present and/or no-gene annotation is available. The rest of the QTL appeared to be located in regions were genes coded for diverse proteins, including mainly: Leucine-rich repeat (4 QTL), P-loop containing nucleoside triphosphate hydrolase (3

DISCUSSION
The study of mapping populations, through the implementation of QTL analysis is useful not only to identify genomic regions associated with traits of interest, but also to utilize the information of associated markers in breeding programs to efficiently incorporate particular loci in elite germplasm. In our study, through the application of QTL-by-environment interaction with composite interval mapping  it was possible to locate several genomic regions associated with  GZn and GFe in the two mapping populations that were studied over a period of three crop cycles in northwest Mexico. Our results are in agreement with other authors' findings in that GZn and GFe are traits of quantitative nature (Tiwari et al., 2009;Xu et al., 2012;Hao et al., 2014;Srinivasa et al., 2014;Crespo-Herrera et al., 2016;Krishnappa et al., 2017). These Previous QTL studies have also mapped QTL for GZn and GFe in various chromosomes of wheat and wheat related species, inlcuding 1A, 2A, 2B, 3A, 3D, 4B, 5A, 6A, 6B, 7A, 7B, with PVE ranging from 2.3% in chromosome 5A (Krishnappa et al., 2017) to 27.1% in chromosome 3B (Srinivasa et al., 2014). In our study the largest PVE (32.79%) was displayed by QGZn.cimmyt.7B_1P2 in chromosome 7B. A recent QTL mapping study also identified promising genomic regions on chromosomes 1A, 2A, 2B, 5A, 7A, and 7B, with PVE ranging from 2.3 to 14.4% (Krishnappa et al., 2017). Such QTL originate from the line "Triticum dicoccon PI94624/Aegilops squarrosa [409]//BCN, " a SHW parent that is also present in the pedigree of Turutur (Parent 2, Pop2). QTL in our study were also found on those chromosomes of Pop2, except in 5A. However, it is difficult to stablish similarities between the genomic regions, because our map and that produced by Krishnappa et al. (2017) are based on different and not easily comparable type of markers (DArT vs. SSR).
The genotype by environment interaction has important implications for crop performance and breeding, particularly the cross-over type interactions. For wheat biofortification, the ideal case is to obtain stable wheat genotypes that perform well without cross-over interaction when evaluated in other environments or years in a determined geographical area. The analysis of the phenotypic data in our study indicated the presence of significant GxY interaction, however when we examined the data further and calculated the Kendall's τ coefficient as an indicator of rank changes in the data, we observed that τ values are highly significant (p < 0.001), particularly for GZn of both populations, which indicates a change-of-magnitude rather than a cross-over type of interaction. In addition to that, the QTL analysis showed that most of the LOD scores for the additive average effect were larger than the LOD score of the interaction (Tables 4, 5), which indicates that QTL with larger LOD (Add) are more stable than those with larger LOD (GxY) .
Transgressive segregation was observed in both populations, particularly for GZn, given that the progenitors had similar GZn levels and the source of higher mineral concentrations are putatively of different origin, i.e., one from SHW and the other two from T. spelta. In fact, the coefficient of parentage (COP) between parents of Pop2 is 0.02, which is equal to the probability of genes being identical by descent, and it is calculated from the pedigree information as described by Cockerham (1983) with the Browse application of the International Crop Information System (ICIS) software described by McLaren et al. (2005). The largest range of segregation was detected in Pop2, which on average doubled that of Pop1 (COP = 0.148), and also progenitors of Pop2 contained higher GZn than the progenitors of Pop1 throughout the period of evaluation ( Table 2). The fact that the progenitors of Pop2 were more distantly related than those of Pop1 could be the reason why more QTL were found in Pop2 than in Pop1. One of the attributable causes of transgressive segregation is the action of loci with complementary additive effect differentially present in parental lines, which can be observed when progenitors are distantly related (Rieseberg et al., 1999). In line with this, QTL were found to be originated from both parents, indicating the complementary effect of QTL. In addition to the finding of complementary genomic regions in the two evaluated populations, it is possible to select those transgressive individuals with ideal QTL combination to utilize them in the breeding pipeline.
QGZn.cimmyt-7B_1P2, detected in Pop2 and located on chromosome 7B between markers 1079651 and 1262636, was the major QTL identified in this work, with 32.8% of the PVE, and the largest additive effect (−1.29) for GZn, originated from Louries (Parent 1, Pop2). Interestingly, in the consensus map of 7B, this region co-located with another QTL (QGZn.cimmyt-7B_1P1) detected in Pop1, which also displayed the highest PVE in this population. However, the relationship between these two regions needs to be further validated to determine if they are the same or not. One possibility to study this relationship while the wheat genome sequence is fully annotated and complete, is by firstly developing user-friendly markers from the DArT-seq sequence and then intercrossing the lines that carry this QTL to study the seggregation pattern in F2 and F3 generations.
Even though, the assembly and annotation of bread wheat remains challenging, recent advances report a 78% genome coverage (Clavijo et al., 2017), and it was possible to align the sequence of the markers flanking the QTL with the reference sequence of the wheat genome, available in the Ensambl plants database. Various sequences overlapped or were located in regions where genes code for proteins with unknown function, low confidence coding or not annotated yet. Nevertheless, the BLAST results for QGZn.cimmyt-1A_P2, QGZn.cimmyt-7B_2P2, QGZn.cimmyt-7B_1P1, QGFe.cimmyt-2A_P2 displayed its location in a region where genes code for the Kinase like superfamily, which catalyze phosphorylation processes in which some protein structures are Zn related (Scheeff and Bourne, 2005). Additionally, in the region of QGZn.cimmyt-1A_P2 there was a gene encoding for Multicopper oxidases, which are reported to be involved in the uptake of Zn and Fe in green algae (Herbik et al., 2002). Furthermore, for the QGZn.cimmyt-3B_2P2, QGFe.cimmyt-2A_P2, QGFe.cimmyt-2B_P2, and QGFe.cimmyt-3B_2P2, the BLAST results showed  that on such region of 3B there are genes encoding for the Cytochrome P450, which is reported to be related to Zn and Fe homeostasis, and frequently expressed under high Zn conditions in Arabidopsis (van de Mortel et al., 2006). The QTL in 1B of both populations displayed a large PVE, 15.1% in Pop1 and 11.25% in Pop2. According to the BLAST results, QGZn.cimmyt.1B_P1 appears to overlap with a gene that codes a protein belonging to the endo/exonuclease/phosphatase domain, which function is associated with DNA binding and repair during DNA replication (Wu et al., 2015;Kim et al., 2017). The QTL in chromosome 1B from Pop2 was located in a region where genes code for proteins with unknown function. The additional QTL found in a common chromosome were QGZn.cimmyt.6A_P1 and QGZn.cimmyt.6A_P2 in chromosome 6A. According to the BLAST results, QGZn.cimmyt.6A_P1 appeared to be located in a region where genes belonging to the Leucine-rich repeat, P-loop NTPase and Winged helix DNAbinding domains are present. QGZn.cimmyt.6A_P2 overlapped with a gene coding for a protein of unknown function. However, all these findings require further studies to determine if these regions are effectively related to our QTL findings.
We found that two genomic regions on chromosome 3B that are either pleiotropic or tightly linked for GZn and GFe. Other studies have also reported similar patterns in different chromosomes such as 2B, 4B, and 5A (Xu et al., 2012;Hao et al., 2014;Crespo-Herrera et al., 2016). These pleiotropic or tightly linked regions can partly explain the positive correlation that exists between GZn and GFe. Furthermore, this finding indicate the possibility of simultaneously breed for both traits. However, an additional, non-genetic factor that can contribute to the simultaneous allocation of Zn and Fe to the grains is nitrogen uptake. For example, in durum wheat, at high enough availability of N, Zn, and Fe are more significantly translocated to the grains (Erenoglu et al., 2011;Kutman et al., 2011).
From our analysis we conclude that the regions identified on chromosomes 7B, 6A, 3B, and 1B are of particular interest for wheat biofortification. Further efforts are required to incorporate the marker information in marker assisted backcrossing schemes. With the current information, one strategy to follow is to intercross the transgressive individuals from both populations and then utilize them as sources in the breeding pipeline, under this scheme we are currently validating and introgressing the QTL found by Crespo-Herrera et al. (2016) and Hao et al. (2014).

AUTHOR CONTRIBUTIONS
VG and RS developed the progenitors of the population. VG, RS, and YH developed the populations. VG, JS, LC, RS, and YH, phenotyped and genotyped the population. YH and LC analyzed the data. LC wrote the main part of the manuscript. All authors discussed the data, read, edited the manuscript and approved the final version.