Genome-wide association study of cooking-caused grain expansion in rice (Oryza sativa L.)

Cooking-caused rice grain expansion (CCRGE) is a critical trait for evaluating the cooking quality of rice. Previous quantitative trait locus (QTL) mapping studies on CCRGE have been limited to bi-parental populations, which restrict both the exploration of natural variation and the mapping resolution. To comprehensively and precisely dissect the genetic basis of CCRGE, we performed a genome-wide association study (GWAS) on three related indices: grain breadth expansion index (GBEI), grain length expansion index (GLEI), and grain length-breadth ratio expansion index (GREI), using 345 rice accessions grown in two years (environments) and 193,582 SNP markers. By analyzing each environment separately using seven different methods (3VmrMLM, mrMLM, FASTmrMLM, FASTmrEMMA, pLARmEB, pKWmEB, ISIS EM-BLASSO), we identified a total of 32, 19 and 27 reliable quantitative trait nucleotides (QTNs) associated with GBEI, GLEI and GREI, respectively. Furthermore, by jointly analyzing the two environments using 3VmrMLM, we discovered 19, 22 and 25 QTNs, as well as 9, 5 and 7 QTN-by-environment interactions (QEIs) associated with GBEI, GLEI and GREI, respectively. Notably, 12, 9 and 15 QTNs for GBEI, GLEI and GREI were found within the intervals of previously reported QTLs. In the vicinity of these QTNs or QEIs, based on analyses of mutation type, gene ontology classification, haplotype, and expression pattern, we identified five candidate genes that are related to starch synthesis and endosperm development. The five candidate genes, namely, LOC_Os04g53310 (OsSSIIIb, near QTN qGREI-4.5s), LOC_Os05g02070 (OsMT2b, near QTN qGLEI-5.1s), LOC_Os06g04200 (wx, near QEI qGBEI-6.1i and QTNs qGREI-6.1s and qGLEI-6.1t), LOC_Os06g12450 (OsSSIIa, near QTN qGLEI-6.2t), and LOC_Os08g09230 (OsSSIIIa, near QTN qGBEI-8.1t), are predicted to be involved in the process of rice grain starch synthesis and to influence grain expansion after cooking. Our findings provide valuable insights and will facilitate genetic research and improvement of CCRGE.


Introduction
Rice (Oryza sativa L.) is a crucial cereal crop that serves as a staple food for over half of the global population. It is the only cereal crop that is primarily consumed as whole grains, which underscores the importance of grain quality in rice breeding (Hossain et al., 2009). The quality of rice is assessed based on several factors, including appearance, milling, cooking, sensory properties, and nutrition (Cheng et al., 2005; Feng et al., 2017). Among these factors, cooking quality is a critical determinant of the economic value of rice. The cooking quality of rice refers to the characteristics of cooked rice, including its texture, tenderness, stickiness, and overall palatability. As starch accounts for up to 95% of the dry weight of a polished rice grain (Fitzgerald et al., 2009), the cooking quality of rice is mainly determined by starch. During the cooking process, rice grains absorb water and undergo gelatinization, leading to a noticeable expansion in volume (Golam and Prodhan, 2013). The extent of this cooking-caused rice grain expansion (CCRGE) can affect the texture, tenderness and overall quality of cooked rice, and is significantly influenced by the properties of starch (Pang et al., 2016). In general, rice varieties with a higher amylose content (AC) tend to absorb more water and exhibit a greater increase in volume after cooking (Frei et al., 2003). Hence, CCRGE is a complex trait closely related to the cooking quality of rice. As the desired cooking quality can vary depending on the type of rice and the culinary preferences of individuals or cultural cuisines (Suwannaporn and Linnemann, 2008), the suitable degree of CCRGE is also diverse. To meet the varying demands for the cooking quality of rice, different goals should be established in rice breeding. Dissecting the genetic basis of CCRGE will facilitate efforts toward these goals.
All the QTLs reported for CCRGE were identified through conventional linkage analysis methods utilizing various populations derived from bi-parental crosses, including F2 (Arikit et al., 2019), F3 (Ahn et al., 1993), F2:3 (Jiang et al., 2008; Thi et al., 2020), BC2F2 (Swamy et al., 2012), BC3F1 (Li et al., 2004), doubled haploid (DH) (Zhang et al., 2004; Tian et al., 2005; Govindaraj et al., 2009), and recombinant inbred lines (RILs) (He et al., 2003; Malik et al., 2022). Linkage-based QTL mapping is limited by two main factors. First, it can only investigate the variation between two parents. Second, it has a low mapping resolution due to strong linkage disequilibrium in the mapping population used. Consequently, the mapped QTLs can only account for a small portion of the related genetic variation in the rice germplasm. Therefore, further studies are necessary.
During the domestication process, rice germplasm resources have accumulated a rich array of natural variations in the genome. The advent of high-throughput DNA sequencing technologies has facilitated the use of genome-wide association study (GWAS) as an effective method for identifying natural genomic variations associated with quantitative traits (Huang et al., 2010; Zhao et al., 2011). Unlike the linkage-based QTL mapping method, GWAS utilizes high-density single nucleotide polymorphisms (SNPs) as genetic markers and is performed on diverse natural populations. As linkage disequilibrium is much weaker in natural populations, GWAS achieves higher resolution in QTL mapping (Huang and Han, 2014; Burghardt et al., 2017). GWAS has been successfully employed to map genes or QTLs for numerous important traits in rice, such as flowering time (Huang et al., 2012), grain yield components (Eizenga et al., 2019), and grain qualities (Misra et al., 2017; Wang et al., 2020). However, to date, no GWAS has been conducted to identify QTLs underlying CCRGE.
In this study, we performed GWAS on three traits of CCRGE based on two replicated experiments conducted in two different years (environments) and using seven different methods to analyze the data. We detected 165 related quantitative trait nucleotides (QTNs), including some exhibiting only the effect of QTN-by-environment interaction (QEI). Based on the detected QTNs, we identified five candidate genes through gene ontology (GO), haplotype, and expression pattern analyses. Our findings will facilitate further genetic research and the genetic improvement of CCRGE.

Plant materials and field experiments
A set of 345 rice accessions from the 3K Rice Genomes Project (2014) was used for this research (Table S1). These accessions included 108 japonica, 177 indica, 48 circum-Aus group (cA), 2 circum-Basmati group (cB), and 10 admixed (between major groups) according to Wang et al. (2018). All accessions were grown at the Experimental Farm of Fujian Agriculture and Forestry University in Yangzhong (E118.485841, N26.287161) during the normal growing season (April to October) in 2017 (E1) and 2018 (E2). In both years, 20 seeds of each accession were sown on a seedbed after pregermination, and 14 seedlings were transplanted to the paddy field 25 days later with a 20-cm spacing between plants and between rows. Field management followed standard agronomic procedures. Mature seeds were harvested from each accession, sun-dried, and then stored at room temperature. The newly harvested seeds were used for the measurement of CCRGE traits in each year.

Measurement of cooking-caused grain expansion
Cooking-caused rice grain expansion was quantified according to Thi et al. (2020). The experiment was conducted in three replicates for each accession. In each replicate, 30 intact white rice grains were soaked (for 30 min) and boiled (for 45 min), and the average length and average breadth of 30 uncooked grains (L0 and B0) and 15 unbroken and straight cooked grains (L1 and B1) were measured. Subsequently, the grain breadth expansion index (GBEI), grain length expansion index (GLEI) and grain length-breadth relative expansion index (GREI) of each accession were calculated according to the formulae described by Thi et al. (2020), where GLEI = L1/L0, GBEI = B1/B0, and GREI = (L1/B1)/(L0/B0) = (L1/L0)/(B1/B0) = GLEI/GBEI.
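The three indices defined above can be sketched in a few lines of Python. This is only an illustration of the formulae: the function name `expansion_indices` and the example measurements are hypothetical and are not taken from the study.

```python
from statistics import mean

def expansion_indices(raw_lengths, raw_breadths, cooked_lengths, cooked_breadths):
    """Return (GLEI, GBEI, GREI) for one replicate from per-grain measurements."""
    l0, b0 = mean(raw_lengths), mean(raw_breadths)        # uncooked averages (L0, B0)
    l1, b1 = mean(cooked_lengths), mean(cooked_breadths)  # cooked averages (L1, B1)
    glei = l1 / l0       # grain length expansion index
    gbei = b1 / b0       # grain breadth expansion index
    grei = glei / gbei   # equals (L1/B1) / (L0/B0)
    return glei, gbei, grei

# Made-up measurements (any length unit, as long as it is the same before and after cooking).
glei, gbei, grei = expansion_indices([6.0, 6.0], [2.0, 2.0], [9.0, 12.0], [3.5, 3.5])
print(glei, gbei, grei)
```

Because all three indices are ratios, they are dimensionless, and a GREI above 1 means that the grain became relatively more slender during cooking.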

Collection of SNP data
The SNP data of the 345 rice accessions were obtained from the 3K Rice Genomes Project (http://iric.irri.org/resources/3000genomes-project). The core genome set of 404K SNPs (https://snp-seek.irri.org/download.zul, accessed on 1 September 2021) was employed for the analysis. A stringent quality control process was performed, in which SNPs with more than 20% missing calls or a minor allele frequency (MAF) smaller than 5% were removed. As a result, a total of 193,582 SNPs were retained for subsequent analysis.
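The quality-control rule can be sketched as a per-SNP filter. This is a minimal illustration, not the pipeline used in the study: the function name `passes_qc`, the 0/1/2 genotype coding with `None` for a missing call, and the handling of values exactly at the thresholds are all assumptions.

```python
def passes_qc(genotypes, max_missing=0.20, min_maf=0.05):
    """Decide whether one SNP is kept.

    genotypes: alternate-allele counts (0, 1 or 2) per accession,
    with None marking a missing call (assumed coding).
    """
    called = [g for g in genotypes if g is not None]
    missing_rate = 1 - len(called) / len(genotypes)
    if not called or missing_rate > max_missing:
        return False
    alt_freq = sum(called) / (2 * len(called))
    maf = min(alt_freq, 1 - alt_freq)  # frequency of the rarer allele
    return maf >= min_maf

# Hypothetical SNPs across 10 accessions.
monomorphic = [0] * 10
rare_but_ok = [0, 0, 0, 0, 0, 0, 0, 0, 1, None]
too_missing = [None, None, None, 0, 1, 2, 0, 1, 2, 0]
print(passes_qc(monomorphic), passes_qc(rare_but_ok), passes_qc(too_missing))
```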

Clustering, population structure and linkage disequilibrium analyses
The genetic distances between the 345 accessions were calculated based on SNP data, and a phylogenetic tree was constructed using the MEGA 11 software. Population structure was analyzed using principal component analysis (PCA) plots and the Admixture program as described by Alexander and Lange (2011). The linkage disequilibrium (LD) between pairwise SNPs located within 1 megabase (Mb) on each chromosome or across the entire genome was estimated by computing the coefficient of determination (R²) using the PLINK software (Purcell et al., 2007).
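As an illustration of the LD statistic, the sketch below computes R² between two SNPs as the squared Pearson correlation of their allele counts. This is one common genotype-based definition and is assumed here; the exact PLINK options used in the study are not stated, and the function name `ld_r2` and the example genotypes are hypothetical.

```python
def ld_r2(x, y):
    """Squared Pearson correlation between two SNPs coded as 0/1/2 allele counts.

    Accessions with a missing call (None) at either SNP are skipped.
    Returns NaN when either SNP shows no variation among the remaining accessions.
    """
    pairs = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    n = len(pairs)
    if n == 0:
        return float("nan")
    mx = sum(a for a, _ in pairs) / n
    my = sum(b for _, b in pairs) / n
    sxy = sum((a - mx) * (b - my) for a, b in pairs)
    sxx = sum((a - mx) ** 2 for a, _ in pairs)
    syy = sum((b - my) ** 2 for _, b in pairs)
    if sxx == 0 or syy == 0:
        return float("nan")
    return sxy * sxy / (sxx * syy)

snp_a = [0, 2, 0, 2, None, 2]
print(ld_r2(snp_a, snp_a))                     # complete LD with itself
print(ld_r2([0, 2, 0, 2], [0, 0, 2, 2]))       # uncorrelated genotypes
```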

Genome-wide association studies
GWAS was performed on GLEI, GBEI and GREI with two strategies: (1) single-environment analysis, namely, analyzing each environment separately; and (2) two-environment analysis, namely, analyzing the two environments jointly. For single-environment analysis, we employed two R packages: 3VmrMLM (Li et al., 2022; https://github.com/YuanmingZhang65/IIIVmrMLM) and mrMLM v4.0.2 (Zhang et al., 2020). The former includes the method 3VmrMLM, while the latter contains six methods, namely, mrMLM (Wang et al., 2016), FASTmrMLM (Tamba and Zhang, 2018), FASTmrEMMA (Wen et al., 2018), pLARmEB (Zhang et al., 2017), pKWmEB (Ren et al., 2018), and ISIS EM-BLASSO (Tamba et al., 2017). The option "method=Single_env" was chosen in 3VmrMLM, while default parameters were used for the other methods. Two-environment analysis was conducted using 3VmrMLM only, with the option set to "method=Multi_env". This method allowed for the estimation of the main effect of a QTN and the effect of QTN-by-environment interaction. For distinction, a QTN showing only the effect of QTN-by-environment interaction was denoted as QEI. Each QTN or QEI was named following the nomenclature "q + trait + chromosome + number + s/t/i", where "s" and "t" indicate that the QTN was detected based on single- or two-environment analysis, respectively, and "i" indicates a QEI. According to Zhang et al. (2019), QTNs identified by multiple methods were deemed reliable QTNs, with particular emphasis on those identified in multiple environments, which were considered stable QTNs.
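The naming rule and the reliable/stable criteria can be sketched as follows. This is an illustrative reading of the description above, not the authors' code: the function names, the "at least two" thresholds, and the example detections are assumptions.

```python
from collections import defaultdict

def name_qtn(trait, chromosome, number, kind):
    """Build a name such as 'qGREI-4.5s'; kind is 's', 't' or 'i'."""
    if kind not in ("s", "t", "i"):
        raise ValueError("kind must be 's', 't' or 'i'")
    return f"q{trait}-{chromosome}.{number}{kind}"

def classify_qtns(detections):
    """detections: iterable of (snp_id, method, environment) tuples.

    Returns (reliable, stable): SNPs found by >= 2 methods, and SNPs found
    in >= 2 environments (assumed thresholds).
    """
    methods, envs = defaultdict(set), defaultdict(set)
    for snp, method, env in detections:
        methods[snp].add(method)
        envs[snp].add(env)
    reliable = {s for s, m in methods.items() if len(m) >= 2}
    stable = {s for s, e in envs.items() if len(e) >= 2}
    return reliable, stable

# Hypothetical detections (SNP positions are placeholders).
detections = [
    ("2:100", "mrMLM", "E1"), ("2:100", "pLARmEB", "E1"),
    ("5:200", "3VmrMLM", "E1"), ("5:200", "3VmrMLM", "E2"),
    ("7:300", "pKWmEB", "E2"),
]
reliable, stable = classify_qtns(detections)
print(name_qtn("GREI", 4, 5, "s"), reliable, stable)
```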

Prediction of candidate genes
Based on the distinct LD decay on each rice chromosome, the left and right R² half-decay regions flanking each QTN or QEI were searched for potential candidate genes. The following sequential steps were executed: (1) the SNP effect prediction software snpEff v1.9 (Cingolani et al., 2012) was employed to evaluate the effects of SNPs on the genes in each region, and annotated genes with effective mutation types, such as non-synonymous substitution, splice site, and 5'-UTR mutation, were selected; (2) GO classifications related to starch synthesis or endosperm development were searched in the rice database (https://www.ricedata.cn/ontology/), and all genes with these GO classifications were retrieved; and (3) genes that met the criteria of both steps 1 and 2 were subjected to haplotype analysis, and those whose haplotypes differed significantly in a t-test were considered candidate genes.
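The screening logic can be sketched as a set intersection followed by a two-sample comparison between haplotypes. This is a minimal sketch under assumptions: the gene labels and trait values are placeholders, and Welch's unequal-variance t statistic is used here only as one possible form of the t-test, since the text does not specify which variant was applied.

```python
from math import sqrt
from statistics import mean, variance

def screen_candidates(in_ld_window, effective_mutation, target_go):
    """Genes inside an LD half-decay window that pass both step 1 and step 2."""
    return sorted(set(in_ld_window) & set(effective_mutation) & set(target_go))

def welch_t(a, b):
    """Welch's t statistic for the trait values of two haplotype groups."""
    return (mean(a) - mean(b)) / sqrt(variance(a) / len(a) + variance(b) / len(b))

# Placeholder gene labels, not real annotations.
genes = screen_candidates(
    in_ld_window=["GENE_A", "GENE_B", "GENE_C"],
    effective_mutation=["GENE_B", "GENE_C", "GENE_D"],
    target_go=["GENE_C", "GENE_E"],
)
# Placeholder trait values for two haplotypes of one screened gene.
t_stat = welch_t([2.0, 4.0, 6.0], [1.0, 2.0, 3.0])
print(genes, round(t_stat, 3))
```

Converting the statistic into a p-value additionally requires the t distribution (for example from a statistics library), which is omitted here.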

Trait performance
The traits GBEI, GLEI, and GREI exhibited a continuous unimodal distribution in both environments, suggesting that these traits are quantitative and controlled by multiple genes (Figure 1). The Brown-Forsythe test for homogeneity of variances showed that the error variances of the accessions in both environments were homogeneous for all three traits, indicating that the collected data are suitable for subsequent analysis of variance (ANOVA). Although the population means of these traits were similar in the two environments (GBEI: 1.822 and 1.765; GLEI: 1.752 and 1.740; GREI: 0.990 and 1.016), ANOVA revealed statistically significant variation between the two environments as well as significant genotype-by-environment interaction (Table 1). These results indicated that all three traits varied significantly across macro-environments. However, there were still significant correlations between the two environments in these traits, particularly in GLEI and GREI (Table 2).
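For readers unfamiliar with the Brown-Forsythe test, the sketch below computes its F statistic as a one-way ANOVA on absolute deviations from the group medians. It is a generic textbook implementation, not the authors' procedure; the function name and example data are hypothetical. Treating each accession-by-environment combination as one group of three replicates is an assumption, although it does reproduce the degrees of freedom reported with Table 1 (689 and 1380).

```python
from statistics import mean, median

def brown_forsythe(groups):
    """Return (F, df1, df2) for the Brown-Forsythe homogeneity-of-variance test."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    # Absolute deviation of every observation from its own group median.
    z = [[abs(x - median(g)) for x in g] for g in groups]
    z_means = [mean(zi) for zi in z]
    z_grand = sum(sum(zi) for zi in z) / n_total
    between = sum(len(zi) * (zm - z_grand) ** 2 for zi, zm in zip(z, z_means))
    within = sum((v - zm) ** 2 for zi, zm in zip(z, z_means) for v in zi)
    df1, df2 = k - 1, n_total - k
    return (between / df1) / (within / df2), df1, df2

f_stat, df1, df2 = brown_forsythe([[1.0, 2.0, 3.0], [2.0, 4.0, 6.0]])
print(f_stat, df1, df2)

# 345 accessions x 2 environments = 690 groups of 3 replicates (assumed layout).
layout = [[float(i), i + 1.0, i + 3.0] for i in range(690)]
print(brown_forsythe(layout)[1:])
```

Homogeneity is not rejected when the F statistic stays below the critical value at the chosen significance level.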
GREI exhibited significant positive and negative correlations with GLEI and GBEI, respectively (Table 2). This is understandable, as GREI is a composite trait that is influenced by both GLEI and GBEI. However, the correlation between GLEI and GBEI was found to be low (-0.155 in E1 and -0.101 in E2) (Table 2), implying that grain length expansion and breadth expansion during cooking are two relatively independent processes with potentially distinct genetic bases.

Population structures and linkage disequilibrium
A set of 193,582 SNPs meeting the requirements of MAF > 5% and missing data < 20% were obtained. The SNPs were not evenly distributed in the genome (Figure 2). SNPs were densest on chromosome 11 and sparsest on chromosome 3 (Table 3). On average, there was one SNP every 1928 bp in the genome.
The results of phylogenetic analysis (Figure 3A), PCA (Figure 3B), and admixture analysis (Figures 3C, D) all indicated that the population of the 345 rice accessions could be broadly divided into three groups (subpopulations), namely, the indica group, the japonica group, and the aus group.
The average LD (mean R²) decreased with increasing physical distance on every chromosome as well as in the whole genome (Figure 4). The average LD half-decay distance (HDD) and the average distance of LD decay to 0.1 (DD0.1) in the whole genome were about 378 kb and 196 kb, respectively (Table 3). However, the HDD and DD0.1 varied greatly among chromosomes, ranging from 158.4 kb and 62.1 kb on chromosome 2 to 715.7 kb and 712.1 kb on chromosome 7, respectively (Table 3). Therefore, LD decayed fastest on chromosome 2 and slowest on chromosome 7.
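One way to read a decay distance off an LD curve is sketched below. It assumes that the curve is available as mean R² values in distance bins, that the half-decay distance is the point where mean R² falls to half of its maximum, and that linear interpolation between bins is acceptable; the text does not state how HDD and DD0.1 were actually estimated, and the function names and example curve are hypothetical.

```python
def decay_distance(curve, threshold):
    """Distance (same unit as the curve) at which mean R2 first drops to threshold.

    curve: list of (distance, mean_r2) pairs sorted by increasing distance.
    Returns None if the threshold is never reached.
    """
    if curve and curve[0][1] <= threshold:
        return curve[0][0]
    for (d0, r0), (d1, r1) in zip(curve, curve[1:]):
        if r0 > threshold >= r1:
            # Linear interpolation between the two flanking bins.
            return d0 + (r0 - threshold) * (d1 - d0) / (r0 - r1)
    return None

def half_decay_distance(curve):
    """Distance at which mean R2 falls to half of its maximum (assumed definition)."""
    return decay_distance(curve, max(r for _, r in curve) / 2)

# Made-up LD curve: (distance in kb, mean R2).
curve = [(0, 0.8), (100, 0.6), (200, 0.4), (300, 0.2), (400, 0.1)]
print(half_decay_distance(curve), decay_distance(curve, 0.1))
```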

QTNs detected by single-environment analysis
In total, 386 QTNs were detected by single-environment analysis using seven different methods, with 145, 127 and 128 QTNs found to be associated with GBEI, GLEI and GREI, respectively (Table 4; Figures S1, S2). However, only 78 (19.5%) QTNs were identified as reliable (Tables 4, S2). The total number of QTNs detected by each method varied greatly, ranging from 32 (FASTmrEMMA) to 131 (3VmrMLM; Table 4). The number and the percentage of reliable QTNs detected by each method also differed considerably (Table 4). Interestingly, across methods, the number of reliable QTNs was positively correlated with the total number of QTNs detected (correlation coefficient 0.805), whereas the percentage of reliable QTNs was negatively correlated with the total number of QTNs detected (correlation coefficient -0.882). This indicates that a method detecting more QTNs in total, and more reliable QTNs, does so at the cost of a lower percentage of reliable QTNs. Among the three traits, GBEI had the most reliable QTNs, followed by GREI, and GLEI had the fewest (Table 5). Consistently, GBEI had the highest proportion of phenotypic variance explained (PVE) by the reliable QTNs, followed by GREI, and GLEI had the lowest (Table 5). More reliable QTNs were detected, and the PVEs were therefore higher, in E1 than in E2 for GLEI and GREI, but the results in the two environments were similar for GBEI (Table 5).
The two-environment analysis also detected 6, 4 and 5 significant QEIs and 3, 1 and 2 suggested QEIs associated with GBEI, GLEI and GREI, respectively. These QEIs accounted for 24.83%, 14.79% and 21.22% of the total phenotypic variation in GBEI, GLEI and GREI, respectively (Figures 5D-F; Table S4). Notably, there was no common site between the QTNs and QEIs detected, indicating that none of the SNPs exhibiting significant main (additive and/or dominance) effects in the two-environment analysis showed significant interaction with the environment, and vice versa (namely, none of the SNPs exhibiting significant interaction with the environment showed significant main effects). Nonetheless, the SNP markers of two QEIs, qGREI-2.3i (SNP 2:19642336) and qGLEI-5.6i (SNP 5:25726382), were also detected as the QTNs qGREI-2.2s and qGREI-5.8s in the single-environment analysis, respectively.

Test of HOV (homogeneity of variance) was performed using the Brown-Forsythe test, in which F0.05 = 1.1134 (df1 = 689, df2 = 1380).

Prediction of candidate genes for GBEI, GLEI and GREI
In total, the two-environment analysis detected 66 QTNs and 21 QEIs for the three traits. Together with the 78 reliable QTNs detected in the single-environment analysis, this study detected a total of 165 QTNs/QEIs. These QTNs/QEIs were mainly located on chromosomes 5, 11, 12, 3 and 2, and were very rare on chromosomes 1 and 10 (Figure 6).
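The totals above follow directly from the per-trait counts reported in the abstract, as the short tally below shows. The variable names are illustrative; the counts are simply added, as in the text, without checking for loci shared between analyses.

```python
# Per-trait counts as reported in the abstract.
reliable_single = {"GBEI": 32, "GLEI": 19, "GREI": 27}  # reliable QTNs, single-environment
qtn_joint = {"GBEI": 19, "GLEI": 22, "GREI": 25}        # QTNs, two-environment
qei_joint = {"GBEI": 9, "GLEI": 5, "GREI": 7}           # QEIs, two-environment

per_trait = {
    trait: reliable_single[trait] + qtn_joint[trait] + qei_joint[trait]
    for trait in reliable_single
}
total = sum(per_trait.values())

print(sum(qtn_joint.values()), sum(qei_joint.values()), sum(reliable_single.values()))
print(per_trait, total)
```

The per-trait sums are 60 for GBEI, 46 for GLEI and 59 for GREI, giving 165 in total.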
Considering that CCRGE may be largely determined by the starch in the endosperm, we tried to predict candidate genes involved in starch metabolism and endosperm development. By searching 20 related Gene Ontology/Trait Ontology (GO/TO) classifications on the China Rice Data Center's website (https://www.ricedata.cn/ontology/), 119 genes were found, of which 26 were located within the R² half-decay distance around the detected QTNs/QEIs (Table S5). By analyzing the SNP variations in these genes with the software snpEff v1.9, five genes were found to carry effective mutations, including non-synonymous, splice site and 5'-UTR mutations (Table 6; Figure S3). These genes were therefore considered candidate genes.
We then performed haplotype analysis to assess the reliability of the candidate genes. LOC_Os04g53310 (OsSSIIIb), LOC_Os06g04200 (wx) and LOC_Os08g09230 (OsSSIIIa) exhibited significant haplotype differences for GBEI; LOC_Os04g53310, LOC_Os05g02070 (OsMT2b) and LOC_Os06g12450 (OsSSIIa) displayed significant haplotype differences for GLEI; and all of the genes except for LOC_Os06g12450 showed significant haplotype differences for GREI (Figure 7). These findings strongly suggested a close association of these five genes with CCRGE.
To further verify the potential impact of these candidate genes on the regulation of starch synthesis and endosperm development, we analyzed the expression patterns of the five candidate genes in various tissues based on data from the Rice Genome Annotation Project database (Figure 8). The results showed that LOC_Os04g53310 (OsSSIIIb) was expressed mainly in leaf and pre-emergence inflorescence but not in seed or endosperm; LOC_Os05g02070 (OsMT2b) was expressed mainly in post- and pre-emergence inflorescence and in embryo at 25 DAP (days after pollination), but not in endosperm. This suggests that these two genes may be not closely, or only indirectly, associated with endosperm

Discussion
In the single-environment analysis, only the QTNs qGREI-5.2s and qGREI-5.6s were commonly detected in the two environments. This may be due to changes in the relative effects of different genes for these traits in different environments, indicating that the genes controlling these traits interacted with the environments. Joint analysis of the two environmental datasets using the 3VmrMLM method revealed 21 QEIs for the three traits, also indicating interaction between QTNs and environment. Indeed, ANOVA showed significant genotype-by-environment interaction in the three traits (Table 1). However, no sites overlapped between the QEIs and the QTNs detected in the two-environment analysis, indicating that the QEIs had no significant additive or dominance effect but only additive-by-environment or dominance-by-environment interaction effects, whereas the QTNs from the joint analysis showed the opposite pattern. In previous studies using the same 3VmrMLM method, the overlapping sites between QEIs and QTNs were also few, ranging from one to three (Han et al., 2022; He et al., 2022; Yu et al., 2022; Zhang et al., 2022; Jiang et al., 2023; Zhao et al., 2023), except for the study of Zou et al. (2022), which found 13 overlapping sites. Since most QEIs do not have a significant additive or dominance effect, their reliability needs to be further confirmed.

FIGURE 4
Genome-wide LD decay across 12 chromosomes. The x-axis represents the physical distance and the y-axis represents the average pairwise R². The color of each chromosome is shown in the top right corner.
In this study, among the 78 QTNs detected by single-environment analysis, only four QTNs were detected in both environments (Table 4; Table S2), indicating that only a small proportion (~5%) of QTNs exhibited stable significant effects across the environments. Interestingly, these four stable QTNs appear to represent four different types in terms of how they were detected (Table S2). The first type is qGREI-5.6s, which was detected by the same method in both environments, and no other methods detected it in either environment. The second type is qGBEI-5.4s, which was detected by one method in one environment, but by another method in the other environment. The third type is qGREI-5.2s, which was detected by one method in one environment, but by multiple other methods in the other environment. The fourth type is qGLEI-3.3s, which was also detected by one method in one environment and by multiple methods in the other environment, but with one method being the same in the two environments. It is noticeable that three of the four stable QTNs were detected in two different environments only because multiple methods were used. These findings highlight the advantages of employing multiple GWAS methods to analyze data collected from diverse environmental conditions.
According to the definitions, GREI is a composite trait that comprises various levels of component traits, which are correlated with grain length and grain breadth before cooking (L0 and B0) or after cooking (L1 and B1), and it is directly proportional to GLEI and inversely proportional to GBEI. Evidently, genes governing GBEI and GLEI may in principle also impact GREI. In other words, the QTLs for GREI may exhibit pleiotropic effects on its component traits or correlated traits. In this study, we did identify 4 QTNs that simultaneously influence GREI and GLEI, and 2 QTNs that simultaneously affect GREI and GBEI (Table 7). This was consistent with the significant correlations of GREI with GLEI and GBEI (Table 2). As expected, there were no QTNs with pleiotropic effects on both GLEI and GBEI, which is in line with the conclusion that GLEI and GBEI are relatively independent traits with different genetic bases. Moreover, 3 QTNs controlling GLEI or GREI were detected in both the single-environment and the two-environment analyses, demonstrating the stability of these QTNs.

FIGURE 6
Locations of QTNs/QEIs for GBEI, GLEI and GREI in the rice genome. The QTNs detected by single-environment analysis are indicated by solid circles. The QTNs and QEIs detected by two-environment analysis are indicated by filled triangles and solid diamonds, respectively.
As mentioned above in the introduction, there were 10, 47 and 15 reported QTLs controlling length, width and length-width expansion caused by cooking in rice grain. Upon comparing these QTLs with the QTNs mapped in this study, we observed that 12, 9, and 15 QTNs for GBEI, GLEI, and GREI detected in this study were located within the intervals of one or more previously reported QTLs (Table S6). These comparisons provide evidence for the reliability of the QTNs detected in this study. Notably, four of the putative genes (LOC_Os05g02070, LOC_Os06g04200, LOC_Os06g12450, and LOC_Os08g09230) identified in this study were found to be in close proximity to four of the aforementioned QTLs.
Due to the swelling of starch granules during cooking, cooking-caused expansion traits of rice grain, such as GBEI, GLEI and GREI, are expected to be influenced by starch-related traits, two typical examples of which are chalkiness rate and amylose content. Chalkiness rate is a crucial parameter for assessing the visual quality of rice, as a high chalkiness rate can lead to easy breakage of grains during processing, low amylose content, and poor eating quality. Thi et al. (2020) utilized a genetic population to map GREI and discovered a positive correlation between amylose content and GREI, with high AC leading to increased GREI. OsMT2b encodes a metallothionein that binds to metal ions and scavenges reactive oxygen species (ROS). Wu et al. (2022) reported that WCR1, a negative regulator of rice chalkiness rate, functions to regulate the OsMT2b (LOC_Os05g02070) transcription level and inhibit 26S proteasome-mediated OsMT2b protein degradation, thereby facilitating ROS clearance, delaying programmed cell death (PCD) of endosperm cells, and ultimately increasing the accumulation of storage substances and reducing chalkiness rate. In this study, a SNP site is present in the 5'-UTR of OsMT2b near qGLEI-5.1s (Figure S3), which may disrupt the expression of OsMT2b, thereby affecting cooking-caused grain expansion in the analyzed population. Furthermore, considering the expression pattern of OsMT2b, it is noteworthy that its expression level exhibits a significant reduction in the endosperm. This observation implies its potential indirect influence on starch synthesis or endosperm development.
wx (LOC_Os06g04200), OsSSIIa (LOC_Os06g12450), and OsSSIIIa (LOC_Os08g09230) are crucial genes involved in the biosynthesis of starch in rice grains. The wx gene encodes granule-bound starch synthase (GBSS), a major enzyme responsible for amylose synthesis (Kharshiing and Chrungoo, 2021). It exerts a direct influence on the amylose content in the endosperm and pollen of rice, as well as the gel consistency of grains (Su et al., 2011). OsSSIIa encodes soluble starch synthase II, and mutations in this gene may affect the activity of starch synthase, which in turn affects the synthesis of medium-length branched chains of amylopectin, changes the crystal layer structure, and ultimately alters the gelatinization temperature (Gao et al., 2003). OsSSIIIa encodes soluble starch synthase III, the second key enzyme involved in rice starch synthesis (Zhou et al., 2016). Mutations in OsSSIIIa can affect the structure of amylopectin, amylose content, and physicochemical properties of starch in rice grains. Double mutants of OsSSIIa and OsSSIIIa exhibited increased chalkiness and amylose content, increased gelatinization temperature, and decreased viscosity (Zhang et al., 2011). In this study, these three genes carried SNP loci with genetic effects. In haplotype analysis, significant differences in GBEI, GLEI, or GREI were observed among the haplotypes defined by SNPs within these genes. In expression pattern analysis, these three genes were highly expressed in the endosperm and seeds 10 days after pollination. All the evidence supported the hypothesis that these three genes are candidate genes controlling CCRGE.
In addition, OsSSIIIb (LOC_Os04g53310) is a gene that encodes soluble starch synthase in rice. Its expression level and activity directly impact the synthesis and quality of starch in rice endosperm. OsSSIIIb can interact coordinately with OsSSIIIa, and loss of function of both genes leads to an increase in resistant starch content in cooked rice (Wang et al., 2023). Although its protein function is redundant with that of OsSSIIIa, its expression pattern differs significantly from that of OsSSIIIa, which is expressed in the endosperm. OsSSIIIb is mainly expressed in leaves but not in endosperm (Figure 8). In this study, the five haplotypes generated by the four SNP loci in the OsSSIIIb gene exhibited significant differences in the three traits. This evidence suggests that OsSSIIIb may indirectly participate in starch synthesis and subsequently affect CCRGE.

Conclusion
In this study, data on GBEI, GLEI and GREI, three traits related to cooking-caused rice grain expansion, were collected from 345 rice accessions in two distinct environments. Utilizing 193,582 SNP markers, seven methods were employed to identify QTNs based on single-environment data, while the 3VmrMLM method was utilized to identify QTNs and QEIs based on two-environment data. A total of 165 reliable QTNs/QEIs were detected, with 60, 46 and 59 of them being associated with GBEI, GLEI and GREI, respectively. Additionally, 26 genes related to starch synthesis or endosperm development were found to be located around these QTNs/QEIs. Further haplotype and expression pattern analyses led to the identification of five candidate genes, namely LOC_Os04g53310 (OsSSIIIb), LOC_Os05g02070 (OsMT2b), LOC_Os06g04200 (wx), LOC_Os06g12450 (OsSSIIa), and LOC_Os08g09230 (OsSSIIIa). These findings can be instrumental in identifying genes and conducting in-depth genetic research on CCRGE.

FIGURE 1
Frequency distribution of GBEI, GLEI and GREI in two environments. Values in the top right corner of each diagram are mean ± standard deviation (cm).

FIGURE 2
Distribution of 193,582 SNPs in the rice genome.

FIGURE 3
Genetic structure analysis of the population of 345 rice accessions. (A) Phylogenetic tree. (B) Population structure estimated by the software Admixture. (C, D) PCA plots of the first three principal components.

FIGURE 8
Heatmap of candidate gene expression based on RNA-seq data from the RGAP database. Red boxes indicate high transcript levels and blue boxes indicate low transcript levels. DAP, days after pollination.

TABLE 2
Coefficients of correlation between different traits in each environment and between different environments in each trait.
The data in the diagonal are correlations between the two years.The data in the lower triangle and the upper triangle are correlations between the three traits in E1 (2017) and in E2 (2018), respectively.** indicates p-value < 0.01.

TABLE 1
ANOVA of GBEI, GLEI and GREI on genotypes and environments, and their interactions.

TABLE 3
Number and density of SNPs and LD decay distances in the rice genome.

TABLE 4
Numbers of QTNs for GBEI, GLEI and GREI detected by seven methods in two different environments.
1. Redundancy was removed in the totals. 2. The number and proportion of reliable QTNs among the total detected by each method or in the whole experiment.

TABLE 5
Statistics of reliable QTNs for GBEI, GLEI and GREI detected in each environment. PVE, proportion of phenotypic variance explained.

TABLE 7
Common QTLs between GLEI, GBEI and GREI, or between different analysis aspects.