Identification and Validation of New Stable QTLs for Grain Weight and Size by Multiple Mapping Models in Common Wheat

Improvement of grain weight and size is an important objective for high-yield wheat breeding. In this study, 174 recombinant inbred lines (RILs) derived from the cross between Jing 411 and Hongmangchun 21 were used to construct a high-density genetic map by specific locus amplified fragment sequencing (SLAF-seq). Three mapping methods, including inclusive composite interval mapping (ICIM), genome-wide composite interval mapping (GCIM), and a mixed linear model performed with forward–backward stepwise (NWIM), were used to identify QTLs for thousand grain weight (TGW), grain width (GW), and grain length (GL). In total, we identified 30, 15, and 18 putative QTLs for TGW, GW, and GL that explain 1.1–33.9%, 3.1%–34.2%, and 1.7%–22.8% of the phenotypic variances, respectively. Among these, 19 (63.3%) QTLs for TGW, 10 (66.7%) for GW, and 7 (38.9%) for GL were consistent with those identified by genome-wide association analysis in 192 wheat varieties. Five new stable QTLs, including 3 for TGW (Qtgw.ahau-1B.1, Qtgw.ahau-4B.1, and Qtgw.ahau-4B.2) and 2 for GL (Qgl.ahau-2A.1 and Qgl.ahau-7A.2), were detected by the three aforementioned mapping methods across environments. Subsequently, five cleaved amplified polymorphic sequence (CAPS) markers corresponding to these QTLs were developed and validated in 180 Chinese mini-core wheat accessions. In addition, 19 potential candidate genes for Qtgw.ahau-4B.2 in a 0.31-Mb physical interval were further annotated, of which TraesCS4B02G376400 and TraesCS4B02G376800 encode a plasma membrane H+-ATPase and a serine/threonine-protein kinase, respectively. These new QTLs and CAPS markers will be useful for further marker-assisted selection and map-based cloning of target genes.


INTRODUCTION
Wheat (Triticum aestivum L.) is one of the most important cereal crops and provides approximately 20% of the dietary calories for humans worldwide. The demand for wheat is continually increasing due to rapid population growth (Organization of Food and Agriculture, 2009). Grain weight and size, including thousand grain weight (TGW), grain length (GL), and grain width (GW), are positively associated with wheat grain yield (Börner et al., 2002;Dholakia et al., 2003;Breseghello and Sorrells, 2007). Therefore, dissecting the genetic basis of these grain-related traits contributes to the improvement of wheat yield.
Numerous quantitative trait loci (QTLs) underlying grain weight and size were detected on almost all wheat chromosomes by linkage mapping and genome-wide association studies (GWAS) (Cui et al., 2014;Simmonds et al., 2014;Wu et al., 2015;Thorwarth et al., 2019). Several candidate genes for TGW, GW, and GL have been isolated in wheat. TaCKX6-D1 encodes a cytokinin oxidase/dehydrogenase (CKX2) associated with grain number and grain weight (Ashikari et al., 2005;Li et al., 2011Li et al., , 2013. TaGW2-6A encodes a RING-type E3 ubiquitin ligase and functions as a negative regulator of GW (Su et al., 2012;Yang et al., 2012;Vandana et al., 2015). TaCwi-A1 encodes a cell wall invertase for carbon partitioning during early grain filling (Wang et al., 2008;Ma et al., 2012). TaSAP-A1 belongs to a gene family of stress-associated proteins significantly related to TGW, number of grains per spike, and spike length (Chang et al., 2013). TaGS1a encodes a cytosolic glutamine synthetase underlying grain size (Ying et al., 2013). TaGS-D1 is associated with TGW and GL . TaSus1 and TaSus2 encode two isoforms of sucrose synthase (Jiang et al., 2011;Hou et al., 2014). TaGS5-3A regulates grain size (Ma L. et al., 2015). 6-SFT-A2 is involved in fructan biosynthesis and significantly associated with TGW under rainfed conditions (Yue et al., 2015). TaSnRK2.3, TaSnRK2.9, and TaSnRK2.10, three members of the SnRK2 family, are associated with TGW (Miao et al., 2017;Rehman et al., 2019). More recently, Cheng et al. (2020) isolated the Tasg-D1 gene, which encodes serine/threonine protein kinase glycogen synthase kinase 3 (STKc_GSK3), leading to formation of round grains in Indian dwarf wheat (Triticum sphaerococcum Perc.). TaSDIR1-4A, which encodes a RING-type E3 ubiquitin ligase is associated with TGW in well-watered and heat-stress environments . The physical interaction of the proteins encoded by TaDA1-A and TaGW2-6B was significantly associated with grain size and weight Liu J. et al., 2019). TaGS3-7A, a homologous gene of OsGS3, was shown to be associated with grain weight (Yang et al., 2019b). In addition, TaTGW6 (Hu et al., 2016a), Tabas1 , TaCKX4 , TaFlo2 (Sajjad et al., 2017), TaTGW-7A (Hu et al., 2016b), TaCYP78A3 (Ma M. et al., 2015), and TaGL3 (Yang et al., 2019a) are also significantly associated with grain weight or size. All of the genes listed above are useful in genetic improvement of wheat yield. However, wheat grain weight and size are complex traits controlled by multiple genes. Identification and validation of more QTLs for grainrelated traits will not only promote our understanding of the genetic basis of the two traits, but also accelerate the process of pyramiding favorable alleles in high-yield wheat breeding.
In recent years, several approaches for high-throughput molecular markers supported by next-generation sequencing technologies and SNP chips have developed rapidly, including DNA sequencing (RAD-seq) (Baird et al., 2008), genotypingby-sequencing (GBS-seq) (Elshire et al., 2011), restriction siteassociated specific locus amplified fragment sequencing (SLAFseq) , and wheat 90K , 55K, and 660K SNP arrays (designed by the Chinese Academy of Agricultural Sciences) (Cui et al., 2017;Huang et al., 2019). These high-throughput genotyping methods obviously accelerate the identification of novel loci and candidate genes underlying wheat yield-related traits. Cui et al. (2017) constructed a genetic map with 4959 bin markers using a wheat 660K SNP array in a recombinant inbred line (RIL) population derived from Kenong 9204 × Jing 411 and detected a major QTL for kernel number per spike on chromosome 4A. Fiedler et al. (2017) identified a QTL for test weight on chromosome 1B by GBS-seq of 1184 lines collected from the North Dakota State University durum wheat breeding program. Wang F. et al. (2020) constructed a highdensity genetic map (HDGM), including 3556 SLAF-based SNP and SSR markers, and detected 37 QTLs related to yield traits in an RIL population derived from cotton cultivars LMY 22 (highyield) and LY 343 (superior fiber quality). Notably, SLAF-seq has been widely applied to construct a HDGM for QTL mapping as well as narrowing target regions in several crops, such as maize (Liu et al., 2016;Li et al., 2018), rice Quan et al., 2018), soybeans Han et al., 2019), and peppers (Guo et al., 2017;Zhang et al., 2019). However, few HDGMs have been constructed by SLAF-seq for common wheat.
The aims of this study were to (1) identify a large number of SNP markers to construct a HDGM by SLAF-seq in an RIL population derived from a cross between high-TGW cultivar Jing 411 and low-TGW landrace Hongmangchun 21, (2) identify QTLs for grain weight and size using SNP markers, and (3) design cleaved amplified polymorphism sequences (CAPS) markers closely linked to new stable QTLs for grain weight and size and validate them in 192 wheat varieties and 180 Chinese mini-core wheat accessions.
The above three populations were planted at the experimental station of Anhui Agricultural University in Hefei (31 • 58 N, 117 • 240 E), Anhui Province, China. Field trials were conducted in 2-m-long plots, each consisting of double rows spaced 0.25 m apart in randomized complete blocks with two replications.

TGW, GW, and GL Tests
Sixty spikes per plot were harvested and threshed at physiological maturity, which is characterized by yellow color on the whole plant including leaves, stems, and spikes (Kulwal et al., 2005). Three hundred threshed seeds were used to measure TGW, GL, and GW in triplicate using the SC-G wheat grain appearance quality image analysis system (Hangzhou WSeen Detection Technology Co., Ltd, Hangzhou, China) .

Statistical Analysis
Pearson's correlation analysis and Mann-Whitney U-tests were performed by the software SPSS 20.0 (IBM Corporation, Armonk, NY, United States).

Genomic DNA Isolation
Dry seeds were collected to extract genomic DNA using the SDS method (Kang et al., 1998). DNA quality was tested by ND5000 spectrophotometer (NanoDrop, Wilmington, DE, United States) and in 1% agarose gels.

Construction of SLAF Library
The JH-RILs were genotyped by a modified SLAF-seq strategy to develop genome-wide SNP markers . The Chinese Spring reference genome IWGSC RefSeq v1.0 1 was utilized to simulate in silico the number of markers digested by different restriction enzymes and design a pilot SLAF experiment . A restriction enzyme (RsaI, New England Biolabs, United States) was used to digest the genomic DNA. The fragments ranging from 464 to 484 bp were selected and purified after agarose gel electrophoresis for 100-bp paired-end sequencing. To assess the accuracy of the experiment for SLAF library construction, Oryza sativa L. japonica was chosen as a control in comparison to mapping populations with the same experimental treatment and sequencing.

SLAF-Seq Data Analysis and Genotyping
In order to ensure the accuracy of data analysis, quality control for the raw sequencing data were performed according to the following criteria: (i) reads containing adaptor sequences were filtered out, (ii) reads with unknown bases exceeding 10% in length were filtered out, (iii) reads unaligned to the wheat reference genome IWGSC RefSeq v1.0 (see footnote 1) were filtered out, and (iv) reads with a single end aligned to the wheat reference genome IWGSC RefSeq v1.0 were also filtered out because of the inaccurate physical position.
SLAF paired-end mapped reads were clustered based on sequence similarity identified by alignments to the wheat reference genome IWGSC RefSeq v1.0 with BWA software (Chong et al., 2003). SNPs in all SLAF loci were identified between parents by the Genome Analysis Toolkit (McKenna et al., 2010) 2 . SLAFs with 8 or more SNPs, which are considered as high-frequency variable regions of wheat, were filtered out, and this influenced the accuracy of following steps. Minor allele frequency evaluation was used to define alleles in each SLAF. SLAF markers were generated following Sun et al. (2013). Polymorphic markers were selected by genotyping the parents and then classified into eight segregation types (ab × cd, ef × eg, hk × hk, lm × ll, nn × np, aa × bb, ab × cc, and cc × ab). For the RIL population, SLAF markers in the segregation pattern of aa × bb were selected. The quality of SLAF markers used to build genetic maps was controlled by the following criteria: (i) average sequence depths must be >6-fold in parents, (ii) markers with more than 25% missing data were filtered out, (iii) markers with significant segregation distortion (P < 0.001) were rejected, and (iv) markers were required to have a logarithm of odds (MLOD) exceeding 3.

Bulked Segregant Analysis of JH-RILs by 660K SNP Arrays
Two high-TGW bulks (HB1 and HB2) and two low-TGW bulks (LB1 and LB2) were made by the same amount of DNA from extremely high and low TGW lines (5 lines per bulk) of JH-RILs based on phenotypic values. The above four bulks and two parents were genotyped using the wheat 660K SNP array by the CapitalBio Corporation (Beijing, China). SNP genotyping and clustering were performed by Genome Studio Polyploid Clustering v1.0 3 . Consistent SNPs without missing data in HB1, HB2, and Jing 411 were classified as group I, and consistent SNPs without missing data in LB1, LB2, and Hongmangchun 21 were classified as group II. Different SNPs between groups I and II were selected for subsequent analysis and development of PCR-based molecular markers. Physical positions of these SNPs were searched by sequence BLAST with IWGSC RefSeq v1.0 (see footnote 1).

Development and Genotyping of PCR-Based Molecular Markers
Seventy-four SNPs with polymorphisms between groups I and II were converted to CAPS markers by Primer Premier 5.0 4 and used to genotype JH-RILs to construct genetic maps (Supplementary Table 1). Representative CAPS markers closely linked with new stable QTLs were used to genotype 180 CMCCs. A PCR mixture with a total volume of 10 µL comprised 1.0 µL of 10 × PCR buffer, 200 µM dNTPs, 4 pmol of each primer, 0.5 U Taq DNA polymerase, and 50-100 ng template DNA. The reaction was performed in a C1000 Thermal Cycler (Bio-Rad, United States) with the following program: denaturation at 94 • C for 5 min followed by 40 cycles at 94 • C for 30 s, touchdown starting at 62 • C for 30 s (decreasing 0.3 • C per cycle), and 72 • C for 30 s that were followed by a final extension at 72 • C for 8 min. The PCR products were digested with the corresponding restriction enzymes (Supplementary Table 1 5 ) for 5 h and separated on 2.5% agarose gels.

Genetic Linkage Map Construction and QTL Mapping
SLAF markers distributed on 21 wheat chromosomes were selected by mapping using IWGSC RefSeq v1.0 (see footnote 1). To ensure the accuracy of genetic linkage maps, the MLOD scores between markers were estimated, and markers with MLOD scores less than 5 were removed. Then, SLAF, CAPS, and gene-specific markers were collectively used to construct a high-density genetic bin map (designated SLAF-map) without redundant markers by IciMapping v4.1 using the Kosambi mapping function in JH-RILs.
The inclusive composite interval mapping (ICIM) program of QTL IciMapping v4.1 (Li et al., 2007) 6 was used to detect QTLs for TGW, GL, and GW in JH-RILs with a walk speed of 1 cM and a window size of 10 cM. The genome-wide composite interval mapping (GCIM) of QTL.gCIM.GUI v1.0  7 in R version 3.6.0 software was applied to identify QTLs with a random model and a walk speed of 1 cM. For these two methods, an LOD score of 2.5 was used for claiming the presence of QTLs. The QTLnetwork v2.0 software (Yang J. et al., 2007;Yang et al., 2008), which is based on a mixed linear model (MLM), was used for forward-backward stepwise (NWIM) analysis with a threshold of P = 0.05 to select cofactors, multiple linear regression with a 1 cM walk speed, and a window size set at 10 cM.
Adjacent QTLs with the same sign of additive effects satisfying at least one of the following criteria were defined as the same QTL: (1) positions of QTL peaks within 10 cM (Kumar et al., 2016) and (2) QTLs with overlapped confidence intervals (Cui et al., 2017). QTLs for TGW detected in more than five environments and those for GW or GL in more than three environments were defined as stable loci (Cui et al., 2014).

Genotyping and Association Analysis of 192 WVs Using 90K SNP Array
The DNA samples of 192 WVs were genotyped by the Illumina iSelect 90K Infinium SNP array, including 81,587 SNPs , by the Beijing Compass Biotechnology Co., Ltd. Genotypic clusters for each SNP were determined by the Genome Studio version 2011.1 software (Illumina, see footnote 3). The physical positions of SNPs were obtained from IWGSC RefSeq v1.0 (see footnote 1). The SNPs with less than 20% missing data and a minor allele frequency exceeding 5% were used for association analysis.
Linkage disequilibrium and population structure were analyzed for 192 WVs following methods in our previous study (Zhu et al., 2019). The same K and Q matrix data were used in the present study to identify significant marker-trait associations (MTAs) for grain weight and size by the MLM. The MLM was performed by TASSEL 5.0 to detect MTAs at a significance level of P < 0.001 (Bradbury et al., 2007;Chen et al., 2016).

Gene Annotation
Databases from the IWGSC gene annotation 8 were used to perform gene function annotations by BLAST.

Cloning of TraesCS4B02G376400
Primer pair PMA-1 was designed to amplify the partial sequence of TraesCS4B02G376400 based on the putative sequence from IWGSC RefSeq v1.0 (see footnote 1) (Supplementary Table 3). PCR amplifications were performed in 10-µL volumes that included 0.25 µM of each primer, 0.25 mM dNTPs, 100 ng genomic DNA, 0.5 unit Fastpfu polymerase, and 1 µL of 10 × Fastpfu PCR buffer (Beijing TransGen Biotech, Beijing, China). The amplification program consisted of an initial denaturation at 94 • C for 5 min followed by 36 cycles of denaturation at 94 • C for 45 s, annealing at 52 • C for 50 s, and extension at 72 • C for 1 min that were followed by a final extension at 72 • C for 12 min. Amplified PCR fragments were separated on 1.5% agarose gels. Target fragments were recovered and cloned into a Blunt-zero vector and transformed into T1 competent cells (Beijing TransGen Biotech, Beijing, China).
Sequencing was carried out on an ABI 3500 genetic analyzer (Applied Biosystems, Shanghai, China). Sequence alignments were performed using DNAMAN 6.0 9 .

Development of CAPS Marker for TraesCS4B02G376400
The SNP distinguished by restriction enzyme StyI found in TraesCS4B02G376400 between Jing 411 and Hongmangchun 21 was converted to a CAPS marker PMA-2 by Primer Premier 5.0 (see footnote 4) and used to genotype 180 CMCCs (Supplementary Table 3). The PCR conditions for the marker PMA-2 were consistent with those for 74 CAPS markers mentioned above.

Validation of QTLs for TGW, GW, and GL by GWAS
Association analysis was performed to validate QTLs for TGW, GW, and GL detected in JH-RILs using 192 WVs in three environments based on an MLM (Supplementary Figure 2).

Reliability of QTL Mapping
The reliability of QTL mapping is crucial for subsequent fine mapping and map-based cloning of target genes and is affected by the density and genotyping of markers used to construct genetic linkage maps (Cui et al., 2017;Han et al., 2019). Recently, RAD-seq (Baird et al., 2008), GBS-seq (Elshire et al., 2011), SLAF-seq , and SNP arrays  10 were performed as low-cost and efficient strategies to develop abundant SNPs to increase the density of markers. In the present study, SLAF tags (4820), CAPS (74), and gene-specific markers (11) were collectively used to build a high-density genetic linkage map harboring 1614 bin markers spanning 2014.43 cM in length with an average marker interval of 1.22 cM (Supplementary Figure 1). It is well known that different mapping methods are based on specific genetic models as well as their corresponding statistical hypothesis, from which the result is in fact a probability statement. Therefore, it is better to use multiple mapping methods to reduce the risks of identifying ghost QTLs or missing real QTLs (Su et al., 2010). Zhang et al. (2016) identified 63 and 16 additive QTLs for fiber strength of cotton using a composite interval mapping (CIM) method with WinQTLCartographer2.5 and an ICIM model by QTL IciMapping4.1, respectively, and four QTLs detected by both CIM and ICIM were considered as stable loci for the fiber strength of cotton. In our previous study, based on both single-and multi-locus MLMs, 23 and 39 MTAs for preharvest sprouting resistance were detected by GWAS, respectively, and 6 loci were jointly identified by the two models and were, thus, considered stable loci (Zhu et al., 2019).
In addition, a combination of QTL analysis and GWAS is also an important strategy to detect reliable loci. Wang et al. (2018) identified 41 QTLs for maize tocopherol content by linkage mapping in 6 RIL populations and 32 significant loci by GWAS in a diverse panel of 508 inbred lines, and a major QTL colocalized in both linkage analysis and GWAS was finely mapped and characterized as a non-tocopherol pathway gene involved in the modulation of natural tocopherol variation.
Candidate Gene Prediction of Qtgw.ahau-4B.2 The new stable locus Qtgw.ahau-4B.2 spanning a physical interval of 0.31 Mb was detected in 10 of 11 environments in which 19 genes were annotated (Table S10). Among them, TraesCS4B02G376400 and TraesCS4B02G376800 encode a plasma membrane H + -ATPase and a serine/threonine-protein kinase, respectively. The plasma membrane (PM) H + -ATPase is an important ion pump in the plant cell membrane. By extruding protons from the cell and generating a membrane potential, this pump energizes the PM, which is a prerequisite for plant growth. Thus, the PM H + -ATPase is regarded as a driver of growth (Falhof et al., 2016). Rober-Kleber et al. (2003) indicated that the PM H + -ATPase is involved in auxin-mediated cell elongation during wheat embryo development. Auxin activates the proton pump, resulting in apoplastic acidification that contributes to cell wall loosening and elongation of the scutellum. Therefore, the PM H + -ATPase is a component of the auxin-signaling cascade that may direct pattern formation in embryos. Moreover, several reported genes related to TGW belong to the serine/threonineprotein kinase family, such as TaSnRK2.3 (Miao et al., 2017), TaSnRK2.9 (Rehman et al., 2019), TaSnRK2.10 , and Tasg-D1 (Cheng et al., 2020), implying that the serine/threonine-protein kinase proteins play important roles in regulation of wheat grain development. Taken together, TraesCS4B02G376400 and TraesCS4B02G376800 are likely to be candidate genes of Qtgw.ahau-4B.2.

CONCLUSION
A total of 30, 15, and 18 putative additive QTLs for TGW, GW, and GL, respectively, were identified by SLAF-map in JH-RILs using three mapping methods. Particularly, five novel QTLs with stable and significant effects, including Qtgw.ahau-1B.1, Qgl.ahau-2A.1, Qtgw.ahau-4B.1, Qtgw.ahau-4B.2, and Qgl.ahau-7A.2 were identified by all three mapping methods and further validated in a natural population. In addition, TraesCS4B02G376400 and TraesCS4B02G376800 were considered as potential candidate genes underlying Qtgw.ahau-4B.2. The novel QTLs and CAPS markers developed will be helpful for map-based cloning of the target regions and gene pyramiding in breeding for wheat PHS resistance.

DATA AVAILABILITY STATEMENT
The datasets presented in this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found in the article/Supplementary Material.

AUTHOR CONTRIBUTIONS
JC and YS conceived the study, put into effect the main linkage analyses, and drafted the manuscript. DX, KX, and XC took part in the experiments and drafting of the manuscript. XP, XL, ML, CG, SY, HY, and WG processed the experimental data and helped to draft the manuscript. JL, HZ, CC, XX, SX, and CM conceived and guided the experiments, and helped in coordinating the project and drafting the manuscript. All the authors read and accepted the final manuscript.