Validation of Genes Affecting Rice Grain Zinc Content Through Candidate Gene-Based Association Analysis

Several key genes governing Zn homeostasis and grain zinc content (GZC) have been functionally characterized. However, the effects of these genes in diverse breeding populations have not been evaluated; thus, their availability in breeding is unclear. In this study, the effects of 65 genes related to rice zinc responses on GZC were evaluated using two panels of breeding lines, and the superior haplotypes were identified. One panel consisted of mega varieties from the International Rice Research Institute (IRRI), South Asia, and Southeast Asia (SEA), and the other panel is breeding lines/varieties from South China (SC). In addition, a multiparent advanced generation intercross (MAGIC) population, named as DC1, was also employed. Three analytical methods, single-locus mixed linear model (SL-MLM), multilocus random-SNP-effect mixed linear model (mrMLM), and haplotype-based association analysis (Hap-AA), were applied. OsIDEF1 (which explained 12.3% of the phenotypic variance) and OsZIFL7 (8.3–9.1%), OsZIP7 (18.9%), and OsIRT1 (17.9%) were identified by SL-MLM in SEA and SC, respectively, whereas no gene was significantly associated with GZC in DC1. In total, five (OsNRAMP6, OsYSL15, OsIRT1, OsIDEF1, and OsZIFL7, 7.70–15.39%), three (OsFRDL1, OsIRT1, and OsZIP7, 11.87–17.99%), and two (OsYSL7 and OsZIP7, 9.85–10.57%) genes were detected to be significantly associated with GZC in SEA, SC, and DC1 by mrMLM, respectively. Hap-AA indicated that Hap1-OsNRAMP5, Hap5-OsZIP4, Hap1-OsIRT1, Hap3-OsNRAMP6, Hap6-OsMTP1, and Hap6-OsYSL15 had the largest effects for GZC in SEA, whereas Hap3-OsOPT7, Hap4-OsIRT2, Hap4-OsZIP7, Hap5-OsIRT1, and Hap5-OsSAMS1 were the most significant in the SC population. Besides, superior alleles were also identified for the significant genes. The genes significantly associated with GZC and their superior haplotypes identified in different panels could be used in enhancing GZC through molecular breeding, which could further address the problem of Zn malnutrition among rice consumers.


INTRODUCTION
Zinc (Zn) is an important micronutrient for global human nutritional status (Keith et al., 2006;Sadeghzadeh, 2013). Zinc deficiency has been associated with serious health concerns, particularly in children from the developing world. Rice is an important crop for more than a quarter of the global population (Bouis and Welch, 2010;Swamy et al., 2016). Therefore, biofortifying rice grain Zn content is an efficient approach to combat Zn malnutrition. Nevertheless, genetic improvement of grain zinc content (GZC) is relatively cost-effective and efficient compared to the agronomic or postharvest processing method used for Zn biofortification (White and Broadley, 2011;Swamy et al., 2016).
The generated genomic resources would pave the way for identification of donors, alleles, and haplotypes associated with traits of interest (Varshney et al., 2009;Alexandrov et al., 2015). The GZC is a typical quantitative trait controlled by many genes individually, explaining tiny proportions of the total observed phenotypic variation (Swamy et al., 2016). Natural variants and haplotype association analysis have been proven to be more beneficial in capturing genes associated with complex traits. Previous approaches for genetic improvement of GZC were mainly based on traditional mapping using populations derived from biparental crosses that hardly consider the existence and effects of natural variants and haplotypes in the populations (Swamy et al., 2016). However, it is noteworthy that only few cloned genes are being used in breeding (Swamy et al., 2016). The details of favorable haplotypes for corresponding genes are crucial for crop breeding (Bevan et al., 2017;Abbai et al., 2019;Sinha et al., 2020). Thus, identifying variations and superior haplotype of genes controlling grain Zn-related traits in diverse panels will provide valuable targets for molecular marker-assisted selection (MAS).
Association analysis is a powerful approach to uncover genetic mechanism for such complex polygenic traits (Flint-Garcia et al., 2003;Zhu et al., 2008). The candidate-genebased association analysis targets genes within the functional regions of the genome, thereby increasing the resolution to detect significant gene-trait associations (Flint-Garcia et al., 2003;Zhu et al., 2008). For instance, using candidate gene analysis, genes governing complex traits were identified in Arabidopsis, wheat, peas, potato, rye, and perennial ryegrass (Zhao et al., 2015). The single-locus association analysis (SL-AA) and single-locus mixed linear model (SL-MLM) are the established analytical methods to detect genetic variants for agronomic traits (Zhu et al., 2008;Cui et al., 2018). The single locus-based approaches are limited in detecting marginal effects of quantitative trait nucleotides (QTNs) influenced by the polygenic background and often requires correction for multiple tests (Zhang et al., 2020). The stringent Bonferroni correction limited the detection of small-effect loci Zhang et al., 2020), which may cumulatively explain a significant amount of the observed phenotypic variation. On the other side, a more recent method known as multilocus association analysis (ML-AA) addresses these shortcomings by simultaneously scanning and estimating all the marker effects across the whole genome (Wen et al., 2018;Zhang et al., 2020). ML-AA slightly outperformed single locus-based methods in detecting significant QTNs associated with several complex traits (Tamba and Zhang, 2018;Wen et al., 2018).
Until now, many MAS breeding practices have been performed to disease-, resistance-, and yield-related genes . However, MAS for higher GZC is limited because of the rare information of genes and their haplotype for deploying MAS. In the present study, candidate genebased association analysis was used to test the effects of 65 genes (Supplementary Table 1), which have been previously characterized for their influence on rice GZC. Two panels of breeding lines and a multiparent advanced generation intercross (MAGIC) population were used. Three analytical methods including SL-MLM, multilocus random-SNP-effect mixed linear model (mrMLM) model, and haplotype-based association analysis (Hap-AA) were applied to maximize the probability of catching all the important genes, meanwhile controlling the false positives.

Genotyping, Population Structure, and Haplotype Analysis
Total genomic DNA for genotyping was extracted from young leaves using the Cetyl trimethyl ammonium bromide (CTAB) procedure. The accessions from SEA and SC were genotyped using the Illumina HiSeq 2000 (PE150) (50X) by Berry Genomics Corporation, Beijing, China 1 . Reads were aligned to the Nipponbare RefSeq (IRGSP-1.0) 2 using BWA-MEM V0.7.10 3 . The duplicated reads were sorted out using Picard tools 4 . The variants for each accession were called by the GATK V3.2.2 best practices 5 . A stringent filtering strategy was conducted to choose high-quality SNPs and InDels for subsequent analysis (QUAL < 30.0, QD < 10.0, FS > 200.0, MQRankSum < −12.5, and ReadPosRankSum < −8.0). Next, the DC1 was sequenced with the 55K Affymetrix Axiom Rice Genotyping Array at the Capital Bio-Technology 6 (Beijing, China), and the SNP data set was obtained according to Meng et al. (2017). Markers with minor allele frequency (MAF) < 0.05 and missing rate > 0.05 were removed.
The nucleotide variations (SNPs and InDels) from SEA and SC were annotated by ANNOVAR (Wang et al., 2010). The SNPs and Indels located in the CDS and the promoter (−1,500 bp) region of the 65 selected genes were extracted and used for subsequent association and haplotype analysis. Haplotype analysis for all the genes was carried out by considering the non-synonymous SNPs and Indels, which can lead to amino acid change. The correlated markers (r 2 = 1.0) for SNPs were excluded from the genotype data set. Besides, the SNPs were filtered according to the following requirements: (1) only two alleles, (2) exclude SNPs of missing data > 0.9, (3) MAF ≥ 0.05, and (4) mean depth values ≥ 5. The haplotype analysis was conducted by CandiHap V2.0 7 based on R 4.0.1.
Population structure, principal components analysis (PCA), and neighbor-joining (NJ) tree analysis were used to infer the population structure and kinship for SEA, SC, and DC1. Population structure was analyzed using 10,000 polymorphic SNP markers with Admixture 1.3.0 (Alexander and Lange, 2011). PCA and NJ trees were calculated by the software Tassel v5.1 8 (Bradbury et al., 2007). The structure and PCA for SEA and DC1 populations have been previously reported by Meng et al. (2017) and Liu et al. (2020), respectively.

Grain Zn Concentration Measurement
The grain samples from all three populations were dried at 65 • C for 3 days. Then, the dried grains were crushed, wet digested in concentrated HNO 3 at 120 • C (30 min), and further digested with HClO 4 at 180 • C until they became transparent. The samples were then diluted with ultrapure water, and the zinc concentrations were evaluated by inductively coupled plasma mass spectrometry (ICP-MS) according to Liu et al. (2020).

Association Analysis and Superior Haplotype Identification
The SNPs and Indels extracted from the CDS and promoter (−1,500 bp) regions were used to conduct association analysis for SEA and SC populations, whereas the SNPs located in the LD decay interval (150 kb) were selected for association analysis in DC1. Association analysis was carried out using the Tassel V5.1 (Bradbury et al., 2007) with a mixed linear model accounting for both population structure and kinship (Bradbury et al., 2007). The Manhattan and QQ plots were displayed using the R package CMplot 9 . Another R package, mrMLM V2.1, was used to realize the mrMLM algorithm . The critical threshold of significance for marker-trait association (MTA) was according to Bonferroni correction in SL-MLM and at an Logarithm of odds (LOD) value of 3 in mrMLM. A QTN was defined as a haplotype block possessing SNPs identified as significantly associated with GZC trait. For Hap-AA, false discovery rate < 0.1 was chosen as the significance threshold.
The resultant significant genes were further used to find superior haplotypes by conducting a Duncan analysis of GZC means (haplotype-wise) for each subgroup across SEA and SC panels. Furthermore, to ensure the accuracy of the results, only the haplotypes validated in at least five lines were considered for statistical analysis. In the present study, superior haplotypes were those with significantly higher average GZC (p < 0.05) but not the most frequent one in the subgroup.

Phenotypic Variation Analysis
Continuous variation was observed for GZC across the SEA, SC, and DC1 panels with approximately normal distributions (Supplementary Figure 1

Genotyping
A total of 3,530 SNPs and 153 Indels were identified in the CDS and promoter regions of 65 selected genes in SEA and SC. The SNPs and Indels ranged from 1 to 135 and from 0 to 34, respectively. The average numbers of SNPs and Indels were 53.3 and 2.3 for the 65 selected genes, respectively (Supplementary Table 7 and Supplementary Figure 2). For the DC1, a total of 9 https://github.com/YinLiLin/CMplot 1,753 SNPs were identified at the Linkage disequilibrium (LD) decay intervals for the selected genes (Supplementary Table 7 and Supplementary Figure 2).

Population Structure
According to Liu et al. (2020), the 207 accessions from SEA are divided into two subgroups, the Japonica subpopulation (SEA-Japonica) and the Indica subpopulation (SEA-Indica). The characterization of both subpopulations is consistent with their geographic origins. However, some levels of admixture between the Indica and Japonica subpopulations were detected in this study. The SEA-Indica accessions included the mega varieties from IRRI, South Asia, and SEA, and few released cultivars were from SC. On the other side, most of the accessions from the SEA-Japonica subpopulation were from the SC panel. Structure, PCA, and NJ tree analysis of the SC panel divided the 99 indica accessions into three subgroups, namely, SC-Indica1, SC-Indica2, and SC-Indica3 (Figure 1). The SC-Indica1 accessions included several landraces and other cultivars, which were released around the 1960s to 1980s (e.g., Dasuikuai, Jiuzhan, and Qinghuaai 6). The SC-Indica2 accessions mainly included cultivars released from the 1960s to 1990s (e.g., Guangchangai 6, Qingguaiai 5, and Aizhenzan). Lastly, majority of the accessions from the SEA-Indica3 subgroup were cultivars released from Guangdong and Guangxi provinces of China around the 1990s to 2000s, such as shanyou 836-1, Xiangsimiao 2, and Yuanzhen 397. Because of the multiple hybridizations and selfing that were used in developing the DC1 population, no strong population structure was found in this population, as also previously reported by Meng et al. (2017). The total variations of population structure explained by the top three PCs were 28.5, 8.2, and 3.2% (SEA panel); 24.5, 9.2, and 7.3% (SC panel); and 5.9, 5.3, and 4.1% (DC1 panel).

Haplotype Analysis
Haplotype analysis was conducted for all the 65 selected genes in both the SEA and SC panels. In total, the number of haplotypes for each gene ranged from 3 to 11, while mainly including four to nine haplotypes (57 genes). The largest number of haplotypes was 10 and was recorded for OsZIFL10 and OsIRO2. Further, OsZIFL2, OsOPT1, and NRAMP6 genes had only three haplotypes each. The frequency of each haplotypes for the selected genes ranged from 3.25 to 81.2% in all accessions. The range of haplotype frequency was 5.25-72.6% and 6.58-81.6% in SEA and SC, respectively. The highest haplotype frequencies in SEA were recorded in Hap3-ZIP7 (94.5%),

SL-AA and ML-AA Analyses
In SEA, three SNPs corresponding to OsIDEF1 and OsZIFL7 were found to be significantly associated with GZC by SL-MLM, and each explained the phenotypic variation of 12.3 and 8.3-9.1%, respectively (Figure 2 and Table 1). Besides, five significant QTNs (LOD ≥ 3) corresponding to three genes (OsYSL15, OsIDEF1, and OsZIFL7) were simultaneously found to be associated with the GZC by mrMLM in the SEA panel. Each of the five QTNs explained phenotypic variation ranging from 7.70 to 15.39% (Figure 3 and Table 2). As shown by SL-MLM, OsZIP7 (one SNP) and OsIRT1 (four SNPs) were significantly associated with GZC in the SC panel and explained phenotypic variations of 18.9 and 17.9%, respectively (Figure 2 and Table 1). Besides, mrMLM showed that three significant QTNs (LOD ≥ 3) corresponding to three genes (OsFRDL1, OsIRT1, and OsZIP7) were significantly associated with GZC in the SC panel and explained a phenotypic variation of 16.08% (OsFRDL1), 17.99% (OsIRT1), and 11.87%   (OsZIP7), respectively (Figure 3 and Table 2). In the DC1 panel, no significant GZC genes were identified with SL-MLM. However, mrMLM detected two significant QTNs (LOD ≥ 3) corresponding to two genes (OsYSL7 and OsZIP7). The amounts of phenotypic variation defined by these two genes were 9.85% (OsYSL7) and 10.87% (OsZIP7), respectively (Figure 3 and Table 2).

DISCUSSION
Although dozens of Zn response genes have been cloned and validated, their usefulness in breeding remains unclear (Rose et al., 2013;Swamy et al., 2016). Deeper insights into the complex relationship among GZC and genes would greatly aid in the selection of appropriate genes and haplotypes to enhance Zn biofortification in rice. In the present study, associations of GZC with variations and haplotype of selected genes were separately conducted to find genes and their superior haplotypes.
The distributions of haplotypes were different in the SEA and SC panels. Previous studies have also reported that haplotype distributions differ across populations Wang et al., 2019). In the present study, Hap1-OsZIFL2, Hap3-OsZIP49, Hap2-OsNRAMP4, Hap3-OsZIP8, and Hap3-OsNAAT1 were only identified in the SEA panel, whereas Hap4-OsZIFL12 and Hap7-OsNRAMP8 were only detected in the SC panel. The frequencies of haplotype distribution in the SEA and SC panels were also different. The Hap3-OsOPT1, Hap3-OsNAAT1, and Hap2-OsNRAMP5 accounted for 40.5, 23.9, and 48.2% of total GZC content variation in the SEA panel, respectively. Yet, these haplotypes only accounted for 17.2% (Hap3-OsOPT1), 0 (Hap3-OsNAAT1), and 6.0% (Hap2-OsNRAMP5) in the SC panel, respectively. Similar findings were also observed for OsFRDL1, OZIFL12, and OsNRAMP1. We also observed that differences in haplotype frequencies within subgroups of the SEA and SC panels. The Hap1-OsOPT1 with 71.2% of haplotype frequency was the major haplotype in the SEA-Indica subgroup, while it only accounted for 23.5% in the SEA-Japonica subgroup. The frequencies of Hap1-OsNRAMP4 and Hap1-OsYSL14 were 53.0 and 56.0% in the SEA-Indica subgroup, respectively. However, in the SEA-Japonica subgroup, the haplotype frequencies were 4.2% (Hap1-OsNRAMP4) and 18.2% (Hap1-OsYSL14). In the SC panel, the haplotype frequencies of Hap1-OsYSL14 was 70.0% (SC-Indica1 subgroup), 46.7% (SC-Indica3), and 23.5% (SC-Indica2); the Hap1-DMAS1 accounted for 35.0% in SC-Indica1 but was not found in SC-Indica2 and SC-Indica3. Thus, identifying the genes significantly associated with GZC and its superior haplotypes by Hap-AA is crucial for rice breeding because of the various differences of haplotype distribution.
Among the genes significantly associated with GZC, OsIRT1, and OsZIP7 were detected in both the SEA and SC panels. This imply that these genes play a stabilizing role in diverse accessions and could be widely used in rice breeding. Besides, OsIDEF1, OsIRT1, and OsZIP7 genes explained the highest phenotypic variations in the SEA, SC, and DC1 panels, respectively. OsIDEF1 involved in Zn synthesis and metabolism could be used for Zn accumulation by upregulating the expression of metal homeostasis genes (Ogo et al., 2008;Ishimaru et al., 2011;Swamy et al., 2016). OsIRT1 and OsZIP7 are involved in Zn uptake and translocation in plants and can be used for enhancing micronutrient levels in grains (Lee and An, 2009;Swamy et al., 2016). OsNRAMP6 (SEA panel), OsYSL15 (SEA panel), OsZIFL7 (SC panel), OsFRDL1 (SC panel), and OsYSL7 (DC1 panel) are also significantly associated with GZC. Several studies have demonstrated that OsYSL family proteins are involved in the pathway of phloem transport and longdistance transport of metals (Inoue et al., 2009;Kakei et al., 2012). OsYSL15 and OsYSL7 are iron-regulated iron (III)deoxymugineic acid transporters expressed in the root tissues and essential for Zn uptake in rice seedlings (Inoue et al., 2009;Lee and An, 2009;Sperotto et al., 2010). ZINC-INDUCED FACILITATOR-LIKE (ZIFL) family genes are essential for grain Fe and Zn accumulation. It was shown that ZIFL genes interact with OsSPL14 and OsSPL16 to increase grain yield (Swamy et al., 2018). OsZIFL7 is a crucial metallic element transporter Zn homeostasis gene. The loss-of-function OsZIFL7 mutant has altered Zn distribution. Also, the transcription of OsZIFL7 is upregulated by Zn excess (Ricachenevsky et al., 2011). Both OsNRAMP5 and OsNRAMP6 are important for Zn transport and accumulation (Sasaki et al., 2012;Swamy et al., 2016). Li et al. (2014) reported that OsFRDL1 significantly increased Zn content in the wild type compared to mutants with OsFRDL1 loss of function. OsZIP4 and OsMTP1 were significantly associated with GZC in the SEA panel as shown by Hap-AA. Meanwhile, the haplotype variations of OsOPT7, OsSAMS1, and OsIRT2 were found to have significant associations with GZC in SC. OsMTP1 is a metal tolerance protein (MTP) family gene with Zn transport functions in plants. It localizes at the tonoplast and plays crucial roles in Zn accumulation (Dräger et al., 2004;Yuan et al., 2012;Swamy et al., 2016). Like OsIRT1, OsIRT2 is responsible for Zn uptake from soil, translocation distribution, and redistribution in root and shoot. OsIRT2 is also essential for grain Zn storage in rice (Ramesh et al., 2003;Lee et al., 2010a,b). OsNAS, OsOPT, and OsSAMS are involved in the biosynthesis, transport, and secretion of phytosiderophores in the root zone, thereby increasing Zn uptake in roots (Bashir et al., 2006;Inoue et al., 2008;Johnson et al., 2011;Swamy et al., 2016).
Although some significant genes were related with high Phenotypic variation explained (PVE), some were related with low GZC in several lines. This may be due to the following reasons: firstly, each gene contains multiple haplotypes, ranging from 3 to 12 in our study. Although the superior haplotypes are often with high GZC, there was no significant difference for most of the other haplotypes. Also, the difference of the haplotypes and GZC caused significant PVE. Besides, GZC is a typical complex trait and controlled by numerous genes. Until now, over 60 genes and 100 loci have been reported associated with GZC. Previous studies have reported the interaction among the genes related to Zn homeostasis and GZC (Swamy et al., 2016).
The conventional single-locus methods have been widely implemented to identify genetic variants in many cereals (Liu et al., 2017;Liu et al., 2020). However, these models have certain shortcomings as they ignore the overall effects of multiple loci and thus suffer from multiple test corrections for critical values Zhang et al., 2020). Differing from the singlelocus methods, mrMLM is a two-stage method. In this method, the SNP effect is viewed as being random, and all the potentially associated markers are selected by a random-SNP-effect MLM with a modified Bonferroni correction for significance test. At the second stage, all the selected markers are placed into one model, and all the non-zero effects are further detected by a likelihood ratio test for QTN identification (Wen et al., 2018;Zhang et al., 2020). In the present study, mrMLM outperformed the SL-MLM for the number of significant QTNs detected in both the SEA and SC panels. Genes that play crucial roles in plant growth and development such as OsYSL15, OsIRT1, OsZIFL3, and OsFRDL1 were not detected by the SL-MLM method (Inoue et al., 2009;Lee and An, 2009;Swamy et al., 2016). These data illustrate that the mrMLM approach is more effective and powerful to detect smalleffect QTNs from complex traits (Segura et al., 2012;Cui et al., 2018;Wen et al., 2018;Zhang et al., 2020). One potential reason is that the mrMLM method improves the power and accuracy for QTN detection due to the nature of the statistical model (Zhang et al., 2020). Another possible reason was the relatively stringent threshold inherent with the SL-MLM method. For both the SEA and SC panels, most of the loci identified in Hap-AA were also observed with both the SL-MLM and mrMLM methods. These loci were, for instance, OsNRAMP6, OsYSL15, and OsIRT1 from the SEA panel, and OsIRT1 and OsZIP7 from the SC panel. Furthermore, several genes that were not identified in SL-MLM and mrMLM could be identified with the Hap-AA method. These were OsNRAMP5, OsZIP4, and OsMTP1 (SEA panel), and OsOPT7, OsSAMS1, and OsIRT2 (SC panel). Likewise, OsIDEF1 and OsZIFL7 (SEA panel) and OsFRDL1 (SC panel) loci identified with SL-MLM and mrMLM could not be identified with the Hap-AA. Based on the above analysis, Hap-AA is an effective and reliable approach for genetic analysis of complex traits. The various methods lead to the differences observed in results. We believe that the results should be interpreted according to the purpose of the study. In this study, we want to provide more potential genes for rice breeding. Thus, we analyzed all the genes found in these methods. However, if we want to identify new genes and clone them, the genes shared by all the three methods may be the best. In summary, combining the merits of SL-MLM, mrMLM, and Hap-AA methods is an effective approach to uncover the genetic architecture for complex agronomic traits (Wu et al., 2016;Zhang et al., 2020).
The use of certain germplasms with higher Zn as donors and an elite line as recurrent parents is commonly adopted in biofortification breeding (Sinha et al., 2020). Normally, at least two steps should be taken during parental selection. Firstly, the existing haplotypes of the target genes in the recurrent parents and donors should be clarified by genotyping and haplotype analysis. Secondly, the favorable alleles/haplotypes transferred from donors to the recurrent parents need to be determined. In the present study, OsIRT1 and OsZIP7 were identified in two unique breeding populations and play a stabilizing role in diverse accessions. In the high-GZC breeding progresses of SC and SEA, OsIRT1 and OsZIP7 should be considered.
As the distribution of favorable genes and superior haplotypes is different, different genes and haplotypes should be selected from different regions. According to the results from the association analysis, OsIDEF1 (explained 12.3% of the phenotypic variance) and OsZIFL7 (8.3-9.1%), OsZIP7 (18.9%), and OsIRT1 (17.9%) were identified by SL-MLM in SEA and SC, respectively.
OsYSL15, OsIDEF1, and OsZIFL7 should be applied in the SEA, whereas OsFRDL1, OsIRT1, and OsZIP7 is first choice in SC. OsZIP7 was also identified in the MAGIC-DC1 panel. This indicates high stability of OsZIP7 across different breeding populations. Furthermore, superior haplotypes from different subgroups can be selected to improve the GZC of different accessions. The SEA and South Asia regions are major rice production areas. The accessions from the SEA are important germplasm resources for rice breeding. Among the SEA-Indica accessions, the Hap3-OsNRAMP5, Hap2-OsZIP4, Hap3-OsNRAMP6, Hap7-OsMTP1, and Hap8-OsIRT1 haplotypes are recommended for GZC improvement. Whereas for the cultivars from the SEA-Japonica subgroup, Hap3-OsNRAMP5, Hap3-OsNRAMP6, Hap6-OsMTP1, and Hap1-OsYSL15 are the best candidate genes and haplotypes. Most of the cultivars from SC, particularly Guangdong province, originated from the traditional breeding methods. For the accessions from the SC-Indica1 subgroup, Hap1-OsZIP7 is the first choice; for the SC-Indica2 accessions, Hap2-OsOPT7, Hap5-IRT1, Hap4-OsIRT2, and Hap4-ZIP7 haplotypes are the most valuable for GZC improvement; and for the SC-Indica3 subgroup, priority genes and haplotypes are Hap2-OsOPT7, Hap5-IRT1, Hap4-OsIRT2, and Hap4-ZIP7. In summary, the superior haplotypes identified in this study for each subgroup would provide reliable guidelines for future breeding. Lines carrying multiple superior haplotypes, such as IR64, IR72, and IR36 in SEA, Guangchangai 6, Sanhuangzhan 2, and Shuangai 11 can be used to rapidly combine several superior target haplotypes into one background.

CONCLUSION
In the present study, 65 grain Zn-related genes were selected to evaluate their effects and identify superior haplotypes in three different populations. In total, seven unique genes were identified for GZC in various populations. Of these, OsIRT1 and OsZIP7 were identified in two populations. Also, different superior haplotypes for GZC were identified in the SEA and SC panels. Introgression of these superior haplotypes by the haplotype-based breeding is a promising strategy. This study used robust populations and analytical approaches to identify superior genes and haplotypes, which may pave the way for future breeding efforts for grain Zn content in rice.

DATA AVAILABILITY STATEMENT
The original contributions presented in the study are included in the article/Supplementary Material, further inquiries can be directed to the corresponding author.

AUTHOR CONTRIBUTIONS
JL and JC designed the research, analyzed the physiology data, and drafted the manuscript. JZ, XL, and SZ performed the experiments. GY revised the manuscript. All authors have read, edited, and approved the current version of the manuscript.