QTL Analysis and Fine Mapping of a Major QTL Conferring Kernel Size in Maize (Zea mays)

Kernel size is an important agronomic trait for grain yield in maize. The purpose of this study was to map QTLs and predict candidate genes for kernel size in maize. A total of 199 F2 individuals and their derived F2:3 lines from the cross SG5/SG7 were developed. Composite interval mapping (CIM) was used to detect QTLs in three environments of the F2 and F2:3 populations. A total of 10 QTLs for kernel size were detected, five for kernel length (KL) and five for kernel width (KW). Two stable QTLs, qKW-1 and qKL-2, were mapped in all three environments. Three QTLs, qKL-1, qKW-1, and qKW-2, overlapped with QTLs identified in previous studies. To validate and fine map qKL-2, near-isogenic lines (NILs) were developed by continuous backcrossing with SG5 as the donor parent and SG7 as the recurrent parent. Marker-assisted selection was conducted from the BC2F1 generation with molecular markers near qKL-2. A secondary linkage map with six markers around the qKL-2 region was developed and used for fine mapping of qKL-2. Finally, qKL-2 was confirmed in a 1.95 Mb physical interval on maize chromosome 9, as determined by BLAST searches against the Zea_Mays_B73 v4 genome and supported by selected overlapping recombinant chromosomes. Transcriptome analysis showed that 11 of the 40 protein-coding genes in the identified qKL-2 interval were differentially expressed between the two parents. GRMZM2G006080, encoding the receptor-like protein kinase FERONIA, was predicted as a candidate gene controlling kernel size. This work will not only help to understand the genetic mechanisms of kernel size in maize but also lay a foundation for further fine mapping and even cloning of the promising loci.


INTRODUCTION
Maize is an important agricultural crop. It serves as food, animal feed, and industrial material and plays a special role in food security. High grain yield has always been the most important goal of maize breeding, but most yield traits are quantitative traits controlled by multiple genes (Lynch and Walsh, 1998; Xu, 2010). KL and KW are both considered to be important yield traits (Doebley et al., 2006). Kernel size traits, especially KW, have been shown to be significantly correlated with grain yield in maize. The improvement of kernel size is therefore of great significance in maize breeding.
To date, numerous studies on maize grain yield traits have been reported at the phenotypic level (Rafiq et al., 2010; Nzuve et al., 2014). However, the genetic architecture and molecular mechanisms underlying natural quantitative variation in kernel yield have not been completely elucidated (Chen et al., 2016). Since the first genetic linkage map of maize was published in 1986 (Helentjaris et al., 1986), molecular markers based on polymerase chain reaction (PCR) technology have been greatly developed and applied to the construction of genetic maps. An increasing number of QTLs controlling important agronomic traits in maize have since been detected by analyzing phenotypic values against these genetic maps. The identified QTLs are distributed on all 10 maize chromosomes (Qiu et al., 2011). Many QTL mapping or fine mapping studies of kernel size or weight have been carried out in recent years (Zhang et al., 2014; Chen et al., 2016). To date, more than 150 QTLs for kernel size or weight have been identified by using different maize populations (Gramene QTL database). Liu et al. (2020) detected 50 QTLs for kernel size traits in the intermated B73 × Mo17 (IBM) Syn10 doubled haploid (DH) population, of which eight were repeatedly identified in at least three environments. A total of 55 and 28 QTLs for kernel traits were identified by using composite interval mapping (CIM) for single-environment analysis and mixed linear model-based CIM for joint analysis, respectively, with 270 F2:3 families derived from the cross V671 (large kernel) × Mc (small kernel) in five environments. It is critically important that QTLs be validated and fine mapped before they are applied in marker-assisted breeding. The near-isogenic line (NIL) population is one of the most widely accepted populations for QTL fine mapping.
NILs have been successfully used to confirm and fine map QTLs in many species, such as rice (Lin et al., 2003; Li et al., 2004; Xie et al., 2006) and wheat (Xue et al., 2013; Zheng et al., 2015). In maize, a major QTL, qkrnw4, associated with kernel row number was mapped by using a NIL population (Nie et al., 2019). Gao et al. (2019) mapped qLRI4, which confers leaf rolling index, by using NIL populations. Yang et al. (2018) used NILs to map a major QTL for kernel cracking, qkc7.03, to a 416.27 kb physical interval.
Great achievements in QTL mapping or isolating underlying genes for kernel size have been made in many species such as rice (Wan et al., 2006; Song et al., 2007; Li et al., 2011; Qiu et al., 2012; Kang et al., 2018), Arabidopsis thaliana (Xia et al., 2013; Du et al., 2014), soybean (Xu et al., 2011; Han et al., 2012), and wheat (Sun et al., 2009; Ramya et al., 2010). In particular, genes controlling rice kernel size or weight, such as GS3 (Fan et al., 2006), GS5, qGL3 (Zhang et al., 2012), GW2 (Song et al., 2007), GW8, GS2 (Hu et al., 2015), and qGW7/GL7, have been successfully cloned. The identification and cloning of kernel-size-related genes has lagged in maize. To a certain extent, this may be because the maize genome is large and complicated, with many transposable elements and repetitive sequences (Gaut et al., 2000; Feuillet and Eversole, 2009). In addition, most complex traits, such as kernel yield and kernel size, are controlled by many genes with small effects (Edwards et al., 1987; Tian et al., 2011). QTLs identified in different genetic backgrounds across multiple environments have a higher chance of being positionally cloned (Chen et al., 2016).
Based on previous studies, the purposes of this study were as follows: (1) to map QTLs for kernel size in three environments by using F2 and F2:3 populations from the same cross SG5/SG7; (2) to validate and fine map the identified major QTL qKL-2 by using BC3F1 NILs; and (3) to reveal differentially expressed genes (DEGs) between SG5 and SG7 by RNA-seq technology and predict candidate genes responsible for KL. In this study, we constructed an F2 and an F2:3 population from the two maize inbred lines SG5 and SG7 and evaluated them in three environments to map QTLs for kernel size. Furthermore, we fine mapped a major QTL by using NILs from the cross of SG5 and SG7 and used RNA-seq technology to reveal the DEGs between the parental lines SG5 and SG7. Finally, the candidate genes for qKL-2 were predicted.

RESULTS
Phenotype Evaluation for Segregation Populations
Two kernel size traits, KL and KW, were evaluated. The trait values of the F2 population were investigated in 2016, while the phenotypic values of the F2:3 populations were collected in 2018 and 2019 and recorded as F2:3-2018 and F2:3-2019, respectively. Table 1 presents the mean values of KL and KW in the F2 and F2:3 populations. The two inbred lines SG5 and SG7 differed significantly in both KL and KW. The difference in KL between SG5 and SG7 was highly significant (P < 0.01, Figures 3A,B). The data for both kernel size traits followed a normal distribution (Supplementary Figure 1). The two traits were positively correlated with each other, with Pearson's correlation coefficients of 0.20, 0.25, and 0.24 in F2-2016, F2:3-2018, and F2:3-2019, respectively.

QTL Mapping
The CIM procedure was applied to map QTLs conferring KL and KW. Manhattan plots are shown in Figure 1. A total of 10 QTLs were mapped for KL and KW in the F2 and F2:3 populations, and the information is summarized in Table 2. For KL, two major QTLs were mapped on maize chromosome 9 in the F2 population. Four QTLs were mapped on chromosomes 3, 7, and 9 in the F2:3-2018 population, while three QTLs were mapped on chromosomes 7 and 9 in the F2:3-2019 population. For KW, three QTLs were mapped on maize chromosomes 3 and 8 in the F2 population, and three QTLs were likewise mapped on chromosomes 3 and 8 in each of the F2:3-2018 and F2:3-2019 populations. The phenotypic variation explained by these QTLs ranged from 8.4 to 23.0%, with mean values of 14.25 and 14.46%, 14.03 and 12.97%, and 10.83 and 13.67% for KL and KW in F2-2016, F2:3-2018, and F2:3-2019, respectively. The LOD scores ranged from 4.0 for qKL-7 to 9.5 for qKW-1. Among the 10 QTLs, qKL-2 for KL and qKW-1 for KW were detected in all three environments (Figure 2, highlighted with green circles); that is, they were stable QTLs in this study. Four QTLs (qKW-2, qKL-7, qKW-3, and qKL-10) were detected in two environments (highlighted with blue circles in Figure 2). In addition, three QTLs, qKL-1, qKW-1, and qKW-2, overlapped with QTLs identified in a metaQTL analysis (Chen et al., 2017).

Fine Mapping qKL-2 With NILs
From 2017 to 2019, a NIL population consisting of 998 BC3F1 lines was developed by introgressing the qKL-2 genomic region of SG5 into the SG7 genetic background. A secondary linkage map with six markers (Supplementary Table 1) around qKL-2 was generated. The six markers were located at 115.23, 130.51, 133.34, 135.29, 139.75, and 153.88 Mb on chromosome 9 according to BLAST searches against maize B73 RefGen_v4 (Figure 3C). The secondary linkage map was 43.35 cM in length, and the genetic distances between adjacent markers were 16.75, 8.39, 0.80, 5.67, and 11.74 cM. The major QTL qKL-2 was then detected on the secondary linkage map of the NILs by the CIM method in QTL Cartographer v2.5. qKL-2 had an additive effect of 0.97 mm and explained 16% of the phenotypic variation. The LOD profile indicated that qKL-2 was most likely located between SSR3 and SSR5, with the LOD peak positioned between SSR3 and SSR4 (Figure 3C). To confirm the narrowed qKL-2 interval, five recombinant types, Class 1 to Class 5, were selected from the 998 NILs. Class 1 comprised 28 recombinants with SSR1 and SSR2 homozygous and SSR3-SSR6 heterozygous. Class 2 comprised 33 recombinants with SSR1 and SSR2 heterozygous and SSR3-SSR6 homozygous. Class 3 comprised three recombinants with SSR1-SSR3 heterozygous and SSR4-SSR6 homozygous. Class 4 comprised 20 recombinants with SSR1-SSR4 heterozygous and SSR5-SSR6 homozygous. Class 5 comprised 47 recombinants with SSR1-SSR5 heterozygous and SSR6 homozygous. At the SSR3 and SSR4 loci, Classes 2 and 3 were homozygous while Classes 1, 4, and 5 were heterozygous. There was a significant difference in phenotypic values between the two sets of recombinants, Classes 2 and 3 versus Classes 1, 4, and 5 (Figure 3D). The progeny test of homozygous segregants indicated that qKL-2 was located in a 1.95 Mb interval (133.34-135.29 Mb) flanked by SSR3 and SSR4 (Table 3). The selected overlapping recombinant chromosomes also supported this location of qKL-2.
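The arithmetic behind the secondary map and the recombinant classes can be checked with a short sketch. It uses only the marker positions, genetic distances, and class definitions reported above; it assumes that SSR1-SSR6 are named in physical order, and the variable names are illustrative. Class 3, as defined, is heterozygous at SSR3 and homozygous at SSR4, so the sketch groups classes by whether they are heterozygous at both flanking markers.

```python
# Marker positions (Mb, chromosome 9) and adjacent genetic distances (cM) from the text.
# Assumption: SSR1-SSR6 are listed in physical order.
positions_mb = {"SSR1": 115.23, "SSR2": 130.51, "SSR3": 133.34,
                "SSR4": 135.29, "SSR5": 139.75, "SSR6": 153.88}
adjacent_cm = [16.75, 8.39, 0.80, 5.67, 11.74]

map_length_cm = round(sum(adjacent_cm), 2)
interval_mb = round(positions_mb["SSR4"] - positions_mb["SSR3"], 2)

# Each class: (number of recombinants, markers that are heterozygous).
markers = list(positions_mb)
classes = {
    "Class 1": (28, markers[2:]),   # SSR3-SSR6 heterozygous
    "Class 2": (33, markers[:2]),   # SSR1-SSR2 heterozygous
    "Class 3": (3, markers[:3]),    # SSR1-SSR3 heterozygous
    "Class 4": (20, markers[:4]),   # SSR1-SSR4 heterozygous
    "Class 5": (47, markers[:5]),   # SSR1-SSR5 heterozygous
}

total_recombinants = sum(count for count, _ in classes.values())

# Classes heterozygous across the whole SSR3-SSR4 interval versus the rest.
het_across_interval = [name for name, (_, het) in classes.items()
                       if "SSR3" in het and "SSR4" in het]
other_classes = [name for name in classes if name not in het_across_interval]

print(map_length_cm, interval_mb, total_recombinants)
print(het_across_interval, other_classes)
```

The grouping reproduces the contrast used in the phenotypic comparison (Classes 1, 4, and 5 versus Classes 2 and 3).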

Candidate Genes for qKL-2 Prediction
RNA-seq was conducted on 18 grain RNA samples from different developmental stages. The results showed that the 1.95 Mb physical interval of qKL-2 encompassed 40 protein-coding genes (Figure 3E). After DEG analysis, a total of 11 protein-coding genes in the qKL-2 physical interval were found to be differentially expressed (Table 4). Previous studies indicated that the FERONIA receptor kinase controls seed size in Arabidopsis thaliana (Yu et al., 2014). GRMZM2G006080 encodes the receptor-like protein kinase FERONIA and was predicted as a candidate gene of qKL-2 that is most likely responsible for KL.

DISCUSSION
Kernel size, which is controlled by multiple genes, is an important component of grain yield in maize. Grain yield is influenced significantly by kernel size, especially KL (Li et al., 2009). Stable QTLs are of great significance for marker-assisted breeding, while false-positive QTLs are of no use. Normally, two steps, primary mapping and fine mapping, are needed for QTL analysis unless experiments are conducted in multiple environments with sufficiently large sample sizes and marker numbers. In this study, primary mapping was carried out in three environments, and two kernel-size QTLs, qKL-2 and qKW-1, were detected in all three environments and were therefore stable. These two QTLs could be beneficial for marker-assisted breeding. Chen et al. (2017) conducted a metaQTL analysis based on QTLs conferring maize yield-related traits collected from 33 published studies. A total of 76 MQTLs for maize yield and its related traits were identified across the whole genome, with the number per chromosome ranging from four on chromosome 4 to 10 on chromosome 5 (Chen et al., 2017). Compared with the metaQTL analysis results, qKL-1, qKW-1, and qKW-2 detected in this study all overlapped with MQTLs for kernel-related traits but had narrower physical intervals (Table 2).
For the qKL-2 locus, primary mapping placed the physical interval at 133.20-135.75, 131.55-134.75, and 127.55-130.05 Mb on chromosome 9 in the three environments, respectively. To confirm and fine map qKL-2, a NIL population was developed by continuous backcrossing with marker-assisted selection. Finally, qKL-2 was mapped to a 1.95 Mb (133.34-135.29 Mb) interval on maize chromosome 9. In the metaQTL analysis of Chen et al. (2017), MQTL-66, which includes 16 QTLs related to grain yield, ear-related traits, and kernel-related traits, is located in the 120.2-133.6 Mb physical interval on chromosome 9. There was only a 0.26 Mb physical overlap between qKL-2 (133.34-135.29 Mb) and MQTL-66 (120.2-133.6 Mb). It is therefore very likely that qKL-2 is a new locus controlling KL.
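The 0.26 Mb figure follows directly from the two sets of interval coordinates. A minimal sketch of the comparison, using only the coordinates quoted above (the function name `overlap_mb` is illustrative):

```python
def overlap_mb(a, b):
    """Length (Mb) of the overlap between two (start, end) intervals; 0 if disjoint."""
    start = max(a[0], b[0])
    end = min(a[1], b[1])
    return max(0.0, end - start)

qkl2 = (133.34, 135.29)    # fine-mapped qKL-2 interval, Mb on chromosome 9
mqtl66 = (120.2, 133.6)    # MQTL-66 interval reported by Chen et al. (2017)

shared = round(overlap_mb(qkl2, mqtl66), 2)
fraction_of_qkl2 = shared / (qkl2[1] - qkl2[0])

print(shared)                        # shared length in Mb
print(round(fraction_of_qkl2, 3))    # share of the qKL-2 interval that lies inside MQTL-66
```

Only about 13% of the fine-mapped interval lies inside MQTL-66, which is the basis for treating qKL-2 as a likely new locus.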
For map-based cloning, the fewer genes in the target QTL interval, the better. In this study, RNA-seq technology was applied to analyze DEGs between SG5 and SG7 grains at different developmental stages, and the DEGs located in the qKL-2 interval were identified. After DEG analysis, only 11 protein-coding genes were left in the qKL-2 interval (Table 4). The number of potential functional genes in the QTL physical interval thus decreased significantly after DEG analysis. According to BLAST annotation against the Swiss-Prot database, the functions of the 11 genes include endoglucanase, 17.0 kDa class II heat shock protein, phospholipid-transporting ATPase 1, receptor-like protein kinase FERONIA, calcium-binding protein, selenium-binding protein 2, NAC domain-containing protein, and thioredoxin-like 1-2, chloroplastic. Comparative genomics analysis was then applied to predict candidate genes. Evidence from studies of rice and Arabidopsis thaliana shows that kernel size is regulated by multiple signaling pathways, including ubiquitin-proteasome degradation (Verma et al., 2004), the transcription factor pathway, the phytohormone signaling pathway, and the G protein independent pathway. Yu et al. (2014) concluded that the receptor kinase FERONIA is involved in a signaling pathway that negatively regulates the elongation of integument cells and thereby controls seed size in A. thaliana. Based on the above functional analysis of the 11 protein-coding genes, GRMZM2G006080, which encodes the receptor-like protein kinase FERONIA, was predicted as a candidate gene for kernel size. The predicted candidate gene will not only help to elucidate the genetic mechanism underlying kernel size but also provide a basis for improving kernel size traits in maize.

MATERIALS AND METHODS
Segregation Population Development and Phenotypic Evaluation
Two maize inbred lines, SG5 and SG7, were used in the study. The seeds were provided by the Institute of Grain and Oil, Liupanshui Academy of Agricultural Sciences, Liupanshui, China. We developed an F2 population by crossing SG5 and SG7 in Liupanshui, Guizhou province of China, in the summer of 2013 and 2014. A total of 199 F2 individuals were planted at the Panxian Maize Breeding Station in Sanya, China, in the winter of 2014. An F2:3 segregation population containing 199 lines was then developed by selfing each F2 individual. The F2:3 population was planted at the Panxian Maize Breeding Station in Sanya for kernel size evaluation in the summer of 2018 and 2019. The field experiment was performed in a randomized block design with three replications. Single-row plots with a row spacing of 50 cm were adopted, and each plot contained 15 plants with a plant spacing of 35 cm. Kernel size traits, including KL and KW, were investigated in both the F2 and F2:3 populations after ears were harvested and dried naturally. For the F2 generation, the traits were estimated as the mean value of three repeats of 10 kernels from an ear. For F2:3, kernel size evaluation was based on eight ears from the middle part of each plot.
KL and KW were estimated as the mean value of three repeats of 10 kernels randomly selected from the bulked kernels of eight ears. The measured kernels were all sampled from the middle part of an ear. Young leaves were collected from each F2 individual for DNA extraction. The methods of genomic DNA extraction, genotype sequencing and grouping, single nucleotide polymorphism (SNP) identification, and high-density linkage map construction were presented in our previous study (Su et al., 2017). The forward regression model of the CIM method in QTL Cartographer v2.5 was applied for QTL mapping with a walking speed of 1 cM. A logarithm of odds (LOD) value of 3.86, determined by 1,000 permutations, was used as the threshold to declare a QTL. QTL statistics were also reported for those in which the LOD score exceeded 2.5. LOD peaks were used to determine the position of a significant QTL on the chromosomes. A positive additive effect of a QTL indicates that the increase in phenotypic value is contributed by SG5 alleles, whereas a negative value indicates that the decrease in phenotypic value is contributed by SG7 alleles. MapChart 2.32 software (Voorrips, 2002) was used for the graphical presentation of QTLs. The QTLs mapped in the F2 and F2:3 populations were compared, and those detected consistently were regarded as stable QTLs.
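How a permutation-based LOD threshold such as 3.86 is obtained can be illustrated with a deliberately simplified sketch. It is not the CIM procedure of QTL Cartographer: it uses single-marker regression without cofactors, on simulated and unlinked markers, and every name and parameter value in it (marker count, effect size, seeds) is an assumption made for illustration. The LOD of a regression is computed as (n/2)·log10(RSS0/RSS1), and the threshold is the 95th percentile of the genome-wide maximum LOD over 1,000 shuffles of the phenotypes.

```python
import math
import random

def center(values):
    """Return the values with their mean subtracted."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]

def marker_lod(gc, sxx, yc, syy):
    """LOD of a single-marker regression: (n/2) * log10(RSS0 / RSS1)."""
    sxy = sum(g * y for g, y in zip(gc, yc))
    r2 = min(sxy * sxy / (sxx * syy), 1 - 1e-12)   # guard against log10(0)
    return -(len(yc) / 2) * math.log10(1 - r2)

def permutation_threshold(markers, phenotypes, n_perm=1000, alpha=0.05, seed=1):
    """(1 - alpha) quantile of the maximum LOD across markers under permutation."""
    rng = random.Random(seed)
    centered = [center(m) for m in markers]
    sxx = [sum(g * g for g in gc) for gc in centered]
    yc = center(phenotypes)
    syy = sum(y * y for y in yc)
    maxima = []
    for _ in range(n_perm):
        rng.shuffle(yc)   # break the genotype-phenotype association
        maxima.append(max(marker_lod(gc, s, yc, syy)
                          for gc, s in zip(centered, sxx)))
    maxima.sort()
    return maxima[min(n_perm - 1, math.ceil((1 - alpha) * n_perm) - 1)]

# Simulated F2-like data (hypothetical): 199 lines, 10 unlinked markers coded 0/1/2,
# one marker with an additive effect of 0.5 on the trait.
rng = random.Random(42)
n_lines, n_markers, causal = 199, 10, 3
markers = [[rng.choice((0, 1, 1, 2)) for _ in range(n_lines)]
           for _ in range(n_markers)]
phenotypes = [10.0 + 0.5 * markers[causal][i] + rng.gauss(0, 0.5)
              for i in range(n_lines)]

threshold = permutation_threshold(markers, phenotypes)
yc = center(phenotypes)
syy = sum(y * y for y in yc)
observed = []
for m in markers:
    gc = center(m)
    observed.append(marker_lod(gc, sum(g * g for g in gc), yc, syy))

print("permutation threshold:", round(threshold, 2))
print("markers above threshold:",
      [i for i, lod in enumerate(observed) if lod > threshold])
```

Taking the maximum LOD over all markers in each permutation is what makes the threshold genome-wide rather than per-marker; a real analysis on a dense linkage map would yield a higher value than this toy example.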

NILs Development and qKL-2 Fine Mapping
NILs for the qKL-2 locus were developed by continuous backcrossing combined with marker-assisted selection. SSR molecular markers that are near qKL-2 and polymorphic between the donor parent SG5 and the recurrent parent SG7 were used for marker-assisted selection from the BC2F1 generation. These SSR markers, based on maize genome resequencing results, had all been developed previously. We chose SSRs near the qKL-2 position with high polymorphism information content (PIC) values and used them to screen for polymorphism between our parental lines SG5 and SG7. SSRs that gave clear bands in polyacrylamide gel electrophoresis (PAGE) and were polymorphic between SG5 and SG7 were selected for developing the secondary linkage map and for further fine mapping. Phenotypic values of the BC3F1 lines were investigated in the same way as for the F2 population. Young healthy leaves were collected from each of the 998 BC3F1 lines for genomic DNA extraction. The Plant Genomic DNA Kit (TIANGEN, Beijing, China) was used according to the manufacturer's protocols. DNA purity was checked on a 1% agarose gel and with a NanoPhotometer® spectrophotometer (IMPLEN, CA, United States). DNA concentration was then measured using a Qubit® DNA Assay Kit in a Qubit® 2.0 Fluorometer (Life Technologies, CA, United States). The secondary linkage map around qKL-2 was generated with JoinMap 3.0 software (Van Ooijen and Voorrips, 2001). QTL Cartographer v2.5 was applied for QTL mapping with the CIM method, a walking speed of 1 cM, and a LOD threshold of 10.0.

Candidate Gene for qKL-2 Prediction
Grains of SG5 and SG7 were sampled on the 5th, 10th, and 15th days after selfing, with three biological replicates. All collected samples were immediately frozen in liquid nitrogen and then stored at −80 °C until RNA extraction. In total, 18 grain samples were obtained. All samples were sequenced on the Illumina NovaSeq platform. Raw reads in FASTQ format were first processed with in-house Perl scripts. Clean reads were obtained by removing reads containing adapters or poly-N and low-quality reads from the raw data. At the same time, the GC content, Q20, and Q30 of the clean reads were calculated. The high-quality clean data were used for all downstream analyses. The reference genome and the corresponding gene annotation files were downloaded directly from the genome website. Bowtie v2.2.3 was used to build the reference genome index, and TopHat v2.0.12 (Trapnell et al., 2013) was used to align paired-end clean reads to the reference genome. The number of reads mapped to each gene was counted with HTSeq v0.6.1. For each gene, the expected number of fragments per kilobase of transcript sequence per million base pairs sequenced (FPKM) was calculated from the gene length and the number of reads mapped to the gene. FPKM is currently a widely accepted measure of gene expression level because it accounts for the effects of sequencing depth and gene length on the read count simultaneously (Trapnell et al., 2010). The DEGSeq R package (1.20.0) was applied to analyze differential expression between two conditions. P-values were adjusted using the Benjamini and Hochberg method. A corrected P-value of 0.005 and an absolute log2(fold change) of 1 were set as the thresholds for significantly differential expression.
More details on reference genome index construction, paired-end clean read alignment and counting, FPKM calculation, and DEG analysis can be found in our previous study (Zhao and Su, 2019).
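The expression and significance calculations described above can be sketched in a few lines. This is not the DEGSeq implementation: the raw P-values and fold changes below are hypothetical inputs, the gene labels are placeholders, and the use of strict inequalities at the 0.005 and 1 cut-offs is an assumption. The sketch shows the FPKM formula, the Benjamini and Hochberg adjustment, and the two-part DEG filter.

```python
def fpkm(read_count, gene_length_bp, total_mapped_fragments):
    """Fragments per kilobase of transcript per million mapped fragments."""
    return read_count * 1e9 / (gene_length_bp * total_mapped_fragments)

def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted P-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

def is_deg(padj, log2_fold_change, padj_cut=0.005, lfc_cut=1.0):
    """Thresholds from the text; strict inequalities are an assumption."""
    return padj < padj_cut and abs(log2_fold_change) > lfc_cut

# Hypothetical example values, not data from this study.
example_fpkm = fpkm(read_count=500, gene_length_bp=2000,
                    total_mapped_fragments=20_000_000)

genes = ["geneA", "geneB", "geneC", "geneD"]
raw_p = [0.0001, 0.0004, 0.0009, 0.5]
log2fc = [2.3, 0.4, -1.8, 3.0]

padj = benjamini_hochberg(raw_p)
degs = [g for g, p, lfc in zip(genes, padj, log2fc) if is_deg(p, lfc)]

print(example_fpkm)
print([round(p, 4) for p in padj])
print(degs)
```

In the example, geneB passes the adjusted P-value cut-off but fails the fold-change cut-off, and geneD shows a large fold change without statistical support; both filters must be satisfied for a gene to be called a DEG.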
DEGs between SG5 and SG7 that were located within the physical interval of qKL-2 were considered candidate genes for kernel size in maize. The detected DEGs were further annotated by BLAST searches against the Swiss-Prot database.
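The positional filter described here amounts to intersecting the DEG list with the fine-mapped interval. A minimal sketch follows, in which the gene identifiers and coordinates are hypothetical placeholders rather than annotations from the B73 genome, and only the interval bounds (133.34-135.29 Mb on chromosome 9) come from the text:

```python
from typing import List, NamedTuple

class Gene(NamedTuple):
    gene_id: str
    chrom: str
    start_mb: float
    end_mb: float
    is_deg: bool

def candidates_in_interval(genes: List[Gene], chrom: str,
                           start_mb: float, end_mb: float) -> List[str]:
    """IDs of differentially expressed genes that overlap the given interval."""
    return [g.gene_id for g in genes
            if g.is_deg and g.chrom == chrom
            and g.end_mb >= start_mb and g.start_mb <= end_mb]

# Hypothetical genes for illustration only.
genes = [
    Gene("gene_1", "9", 133.50, 133.51, True),    # DEG inside the interval
    Gene("gene_2", "9", 134.10, 134.12, False),   # inside, but not a DEG
    Gene("gene_3", "9", 140.00, 140.02, True),    # DEG outside the interval
    Gene("gene_4", "3", 134.00, 134.01, True),    # DEG on another chromosome
]

candidates = candidates_in_interval(genes, "9", 133.34, 135.29)
print(candidates)
```

Applying both criteria together is what reduced the 40 protein-coding genes in the interval to the 11 reported candidates.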

AUTHOR CONTRIBUTIONS
GW and YZ developed the F2, F2:3, and BC3F1 populations. WM and XM performed the phenotype investigation. CS developed the genotyping of F2:3 progeny, analyzed the data, and drafted the manuscript. All authors read and approved the final manuscript.