Seed development-related genes contribute to high yield heterosis in integrated utilization of elite autotetraploid and neo-tetraploid rice

Introduction: Autotetraploid rice is highly resistant to abiotic stress and holds substantial promise for yield increase, but it has not been used commercially because of its low fertility. Our team therefore developed neo-tetraploid rice, which has high fertility and shows hybrid vigor when crossed with indica autotetraploid rice. Despite these advances, the molecular mechanisms underlying this heterosis remain poorly understood.
Methods: An elite indica autotetraploid rice line (HD11) was crossed with neo-tetraploid rice lines, and 34 hybrids were obtained to evaluate yield-related agronomic traits. WE-CLSM, RNA-seq, and CRISPR/Cas9 were employed to observe endosperm structure and identify candidate genes from two representative hybrids.
Results and discussion: These hybrids showed high seed setting and an approximately 55% increase in 1000-grain weight, and some of them achieved grain yields comparable to those of the diploid rice variety. Endosperm observations indicated that the starch grains in the hybrids were more compact than those in the paternal lines. A total of 119 differentially expressed seed heterosis related genes (SHRGs) were identified, which might contribute to the high 1000-grain weight heterosis in neo-tetraploid hybrids. Among them, 12 genes had previously been found to regulate grain weight formation: OsFl3, ONAC023, OsNAC024, ONAC025, ONAC026, RAG2, FLO4, FLO11, OsISA1, OsNF-YB1, NF-YC12, and OsYUC9. Haplotype analyses of these 12 genes revealed that different haplotypes had different effects on grain weight. The hybrids could pyramid more dominant haplotypes of these grain weight regulators than any homozygous cultivar. Moreover, mutants of two SHRGs (OsFl3 and SHRG2) displayed a significant reduction in 1000-grain weight and an increase in grain chalkiness, indicating that OsFl3 and SHRG2 positively regulate grain weight.
Our research has identified a valuable indica autotetraploid germplasm for generating strong yield heterosis in combination with neo-tetraploid lines and provides molecular insights into the regulatory processes of heterosis in tetraploid rice.

Autotetraploid rice (ATR) is a useful germplasm developed by genome duplication of diploid rice, and its intersubspecific hybrids show great biological vigor and high yield potential (Koide et al., 2020). However, the limited reproductive capacity of autotetraploid rice and its hybrids has impeded their widespread commercial cultivation (Wu et al., 2015). Prior studies have indicated that autotetraploid sterility may be attributed to irregular meiotic chromosomal behaviors, changes in DNA methylation, and disrupted gene or non-coding RNA expression (He et al., 2011a; Wu et al., 2014, 2015; Li et al., 2016b, 2018, 2020). To resolve this "bottleneck" problem (polyploidization sterility), Chinese scientists developed several highly fertile tetraploid rice types through many years of effort, including PMeS polyploid rice and neo-tetraploid rice (NTR, 80% seed setting) (He et al., 2011b; Guo et al., 2017; Ghaleb et al., 2020; Liu et al., 2023). NTR lines were able to overcome polyploidization sterility when crossed with typical autotetraploid rice with low fertility (Guo et al., 2017; Ghaleb et al., 2020; Yu et al., 2020). In a comparative genomic study, NTR lines clustered into one independent group adjacent to the japonica subspecies (Yu et al., 2021). In addition, NTR lines harbored the wide compatibility gene S5n and the pollen fertility "neutral gene" Scn (Ghaleb et al., 2020). Thus, hybrids derived from NTR and indica autotetraploid lines showed no hybrid sterility and significant yield heterosis (Guo et al., 2017; Chen et al., 2019; Ghaleb et al., 2020), indicating that NTR can serve as the primary parental lines for restorer lines in future intersubspecific tetraploid hybrid breeding.
In the past 20 years, our group has developed more than 100 ATR lines. The most notable of these lines, HD11, was derived from self-pollinated progenies of Huanghuazhan-4x (HHZ-4x), and its hybrids showed significant heterosis and good plant performance. In this study, HD11, 34 NTR lines, and their hybrids were used to evaluate intersubspecific tetraploid heterosis, and two of the hybrids were used to identify the genes associated with heterosis in grain weight. Our study aims to provide a case of yield improvement in polyploid rice through the use of superior genetic resources and to offer a distinct perspective on the mechanisms behind heterosis regulation.

Plant materials
The autotetraploid rice line HD11 was developed from the 8th generation of self-pollination of Huanghuazhan-4x (HHZ-4x). HHZ-4x was developed by genome duplication of the diploid cultivar Huanghuazhan (Oryza sativa L. ssp. indica) through colchicine treatment in our lab. Two neo-tetraploid lines with high fertility, Huaduo1 (H1) and Huaduo8 (H8), were used as the paternal lines of two tetraploid hybrids, 1HF1 and 8HF1. Moreover, 34 hybrids were developed by crossing HD11 with 34 neo-tetraploid rice lines. The OsFl3 and SHRG2 mutants were generated in the ZH11 background using the CRISPR/Cas9 system.

Investigation of agronomic traits and evaluation of heterosis
Yield-related traits, including panicle number, total grain number, seed setting rate, 1000-grain weight, and grain yield per plant, were investigated. High-parent heterosis (HPH) was calculated as described by Guo et al. (2017): HPH = (F1 - HP)/HP × 100%, where F1 is the performance of the hybrid plants and HP is the performance of the better parent.
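The HPH formula can be written as a short function. This is a minimal sketch rather than the authors' analysis code: the function name and the numbers are illustrative, and it assumes that a larger trait value is the "better" one (as for 1000-grain weight or grain yield).

```python
def high_parent_heterosis(f1: float, parent_a: float, parent_b: float) -> float:
    """Return HPH (%) = (F1 - HP) / HP * 100, where HP is the better parent.

    Assumption: a higher trait value is better, so HP = max of the two parents.
    """
    hp = max(parent_a, parent_b)
    if hp == 0:
        raise ValueError("better-parent value must be non-zero")
    return (f1 - hp) / hp * 100.0


# Hypothetical 1000-grain weights in grams (not data from this study).
print(high_parent_heterosis(f1=36.0, parent_a=30.0, parent_b=24.0))  # 20.0
```

A negative value means the hybrid performs below its better parent, even if it exceeds the mid-parent mean.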

Whole-mount eosin B-staining confocal laser scanning microscopy observations
To characterize the endosperm structure of mature seeds, WE-CLSM observations were performed as follows: brown rice was cut with a sharp blade, stained with 4% eosin B solution for 5 min, and hyalinized with pure methyl salicylate before observation under WE-CLSM. WE-CLSM observations were also performed to characterize endosperm and embryo development in 5P ovaries (5 days after pollination), as described in our previous study (Li et al., 2023). The collected samples were fixed in FAA solution (70% ethanol: acetic acid: methanal = 89:5:5, v/v), rehydrated through a gradient series, stained with 4% eosin B solution, dehydrated through an ethanol gradient, and hyalinized with 50% and then pure methyl salicylate before observation under WE-CLSM.

Bioinformatics tools
Candidate genes were annotated using the National Rice Data Center website (Kawahara et al., 2013). The global expression profiles of target genes were predicted using the Rice eFP expression profile analysis website (Winter et al., 2007). Venn analyses, upset plot analyses, and heatmap diagrams were produced with TBtools (Chen et al., 2023). Haplotype analyses were performed with RFGB v2.0 tools (Wang et al., 2018; Wang et al., 2020a).

Identification of CRISPR/Cas9 mutants
A single target site in the coding sequence of OsFl3 (5'-GCACTAGCCATCACAAC-3') or SHRG2 (5'-ACATATCTTGTTCTAGT-3') was designed for the CRISPR/Cas9 system to obtain transgenic lines. All transgenic seedlings were grown under natural conditions at the experimental station of South China Agricultural University, Guangzhou, China. The targeted sites of OsFl3 and SHRG2 were amplified from transgenic plants and Sanger sequenced to select homozygous mutations. The PCR primers were designed with Primer Premier 5.0 (Supplementary Table S1).
Re-sequencing was employed to analyze the genomic DNA polymorphisms of HD11 compared with HHZ, 5 ATR lines, and 3 NTR lines. The proportion of Q30 bases, the average depth, and coverage_10× showed that the quality of these resequencing data was sufficiently high (Yu et al., 2021). A total of 1321 genes with specific variations were detected in HD11 compared to HHZ, of which 28 have known functions (Supplementary Table S3). Gene ontology (GO) enrichment analysis identified 22 prominent terms in the biological process category associated with the mutant genes (Supplementary Table S4). A total of 14371 genes with specific variations were detected in HD11 compared to other ATR lines, of which 212 have known functions, including 60 resistance or tolerance-related genes and 54 physiological trait genes (Supplementary Table S5); these genes were enriched in 14 GO biological process terms (Supplementary Table S6). A total of 8260 genes with distinct variations were found in HD11 compared with NTR, and these were enriched in 16 prominent GO terms in the biological process category (Supplementary Table S7). Among the genes with specific variations relative to NTR, 190 have known functions, including 45 physiological trait genes, 10 genes associated with yield components, 7 heading date genes, and 1 seed morphology gene (Supplementary Table S8).

Grain weight formation among intersubspecific tetraploid hybrids and parental lines
Relative to diploid HHZ, chalkiness increased in HD11 grains. Interestingly, chalkiness in 1HF1 and 8HF1 grains was lower than in HD11, H1, and H8, suggesting that improved grain weight formation plays an important role in the yield heterosis of 1HF1 and 8HF1 (Figures 2A, B). WE-CLSM observation confirmed denser starch grains in 67.00~75.00% of the endosperms of hybrids, while severe interstices were observed in 87.00~99.00% of the endosperms of paternal lines (Figure 2C). We further characterized the grain development of HHZ, H1, H8, 1HF1, and 8HF1 to identify differences during the formation of grain weight heterosis. The developing grain weights of 1HF1 and 8HF1 were significantly higher than those of HHZ, H1, and H8 at 3 days after pollination. The increase in grain weight in 1HF1 and 8HF1 reached a plateau at 15 days after pollination, which was clearly earlier than in H1 and H8 (Figures 2D, E). When the grain weight gained in each two-day interval from 3 to 17 days after pollination was evaluated, HHZ, H1, H8, 1HF1, and 8HF1 all showed a 36~53% gain in grain weight in the first two days, and 1HF1 and 8HF1 accumulated more grain weight (52~53%) than HHZ (41%), H1 (46%), and H8 (36%) (Figure 2F). WE-CLSM observation revealed detailed information about ovary development before and after fertilization. Before pollination, the egg cell, synergid, central cell, and antipodal cells were observed in the embryo sac (Figure 2G); at 5 days after pollination, the embryo had differentiated from the zygote and the endosperm cells had filled the cavity of the ovary (Figures 2H, I). These results indicate that 5 days after pollination is an important stage for the grain weight differences among HHZ, H1, H8, 1HF1, and 8HF1.
RNA-seq analyses detected the genes with higher expression level in tetraploid intersubspecific hybrids than parental lines
To reveal the genes related to strong heterosis formation during seed development of tetraploid intersubspecific hybrids, RNA-seq was performed to assess global gene expression in developing seeds at two stages (0P, flowering; 5P, 5 days after pollination) in 1HF1, 8HF1, and three parental lines. More than 39.8 million clean reads were produced from each library, covering 91.85~96.08% of the reference genome (MSU7.0). When the genes expressed in each sample (FPKM>10) were counted, each material expressed 7475 to 7986 genes in 0P seeds and 6265 to 7850 genes in 5P seeds (Figure 3). Among them, 68 (0P) and 122 (5P) genes were expressed in the three parental lines but not in the two hybrids, while 32 (0P) and 67 (5P) genes were expressed in the two hybrids but not in the three parental lines (Supplementary Table S9). These specific genes might contribute to the formation of strong yield heterosis in tetraploid hybrids.
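The group-specific gene sets described here can be sketched with plain set operations. The snippet is illustrative only: the function names, the toy FPKM values, and the placeholder parent labels (P1 to P3) are assumptions, and "expressed in the hybrids but not in the parents" is read here as "above the cutoff in every hybrid and in no parent", which the text does not spell out.

```python
FPKM_CUTOFF = 10.0  # threshold quoted in the text (FPKM > 10)


def expressed_genes(fpkm_by_gene, cutoff=FPKM_CUTOFF):
    """Return the set of genes whose FPKM exceeds the cutoff in one sample."""
    return {gene for gene, value in fpkm_by_gene.items() if value > cutoff}


def group_specific(hybrids, parents, cutoff=FPKM_CUTOFF):
    """Return (hybrid_only, parent_only) gene sets.

    hybrid_only: expressed in every hybrid and in no parent.
    parent_only: expressed in every parent and in no hybrid.
    """
    hyb_sets = [expressed_genes(s, cutoff) for s in hybrids.values()]
    par_sets = [expressed_genes(s, cutoff) for s in parents.values()]
    hyb_all, hyb_any = set.intersection(*hyb_sets), set.union(*hyb_sets)
    par_all, par_any = set.intersection(*par_sets), set.union(*par_sets)
    return hyb_all - par_any, par_all - hyb_any


# Toy FPKM values for four hypothetical genes (not data from this study).
hybrids = {
    "1HF1": {"g1": 50, "g2": 30, "g3": 2, "g4": 12},
    "8HF1": {"g1": 40, "g2": 25, "g3": 1, "g4": 3},
}
parents = {
    "P1": {"g1": 45, "g2": 0.5, "g3": 20, "g4": 1},
    "P2": {"g1": 60, "g2": 3, "g3": 15, "g4": 2},
    "P3": {"g1": 55, "g2": 1, "g3": 18, "g4": 0},
}
hybrid_only, parent_only = group_specific(hybrids, parents)
print(sorted(hybrid_only), sorted(parent_only))  # ['g2'] ['g3']
```

Gene g4 is excluded from both sets because it passes the cutoff in only one of the two hybrids.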
Finally, we sought to identify genes with high-parent heterosis of expression level in both 1HF1 and 8HF1, which may play a crucial role in facilitating robust yield heterosis in tetraploid hybrids. In 0P seeds, seven genes were identified, including 1 common up-regulated and 6 common down-regulated genes in both 1HF1 and 8HF1 (Supplementary Figure S2E). In 5P seeds, 112 genes were identified, including 107 common up-regulated and 5 common down-regulated genes in both 1HF1 and 8HF1 (Supplementary Figure S2F). These 119 candidate genes with high-parent heterosis in expression level were designated as seed heterosis related genes (hereafter referred to as SHRGs) (Supplementary Table S13).
FIGURE 3 Upset plot analyses of expressed genes in tetraploid lines and their hybrids. (A) Upset plot analyses of expressed genes in 0P ovaries of tetraploid lines and their hybrids; (B) Upset plot analyses of expressed genes in 5P ovaries of tetraploid lines and their hybrids. Orange groups indicate that gene sets exhibit specificity in either parental lines or hybrids.
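One way to picture "high-parent heterosis of expression level in both hybrids" is the following sketch. It is not the authors' pipeline: the two-fold threshold, the function names, and the toy values are assumptions, and each hybrid is compared with its own two parents (HD11 and the corresponding neo-tetraploid line), which is inferred from the crossing scheme rather than stated for this analysis.

```python
# Hybrid -> (maternal, paternal) parents; assumed from the crossing scheme.
CROSSES = {"1HF1": ("HD11", "H1"), "8HF1": ("HD11", "H8")}


def direction(hybrid_value, parent_values, fold=2.0):
    """Return 'up' if the hybrid exceeds the higher parent by `fold`,
    'down' if it is below the lower parent by `fold`, otherwise None.
    The fold threshold is an illustrative assumption."""
    high, low = max(parent_values), min(parent_values)
    if hybrid_value > 0 and hybrid_value >= fold * high:
        return "up"
    if low > 0 and hybrid_value * fold <= low:
        return "down"
    return None


def common_high_parent_genes(expr, crosses=CROSSES, fold=2.0):
    """Keep genes that get the same non-None call in every hybrid."""
    result = {}
    for gene, values in expr.items():
        calls = {
            direction(values[hyb], [values[p] for p in pars], fold)
            for hyb, pars in crosses.items()
        }
        if len(calls) == 1:
            call = calls.pop()
            if call is not None:
                result[gene] = call
    return result


# Toy expression values (not data from this study).
expr = {
    "geneA": {"1HF1": 80, "8HF1": 90, "HD11": 10, "H1": 20, "H8": 30},
    "geneB": {"1HF1": 2, "8HF1": 3, "HD11": 20, "H1": 30, "H8": 40},
    "geneC": {"1HF1": 80, "8HF1": 25, "HD11": 10, "H1": 20, "H8": 30},
}
print(common_high_parent_genes(expr))  # {'geneA': 'up', 'geneB': 'down'}
```

In practice such calls would come from a differential expression test with replicate-aware statistics; the fold rule above only illustrates the "common to both hybrids" logic.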
Haplotype analyses were conducted on the above-mentioned 12 known SHRGs using the RFGB database. Focusing on the five main haplotypes of each known SHRG, the 1000-grain weight was compared among cultivars carrying distinct haplotypes. Hap1-OsFl3 (25.(26.85 g), Hap2-RAG2 (26.25 g), Hap4-OsISA1 (26.43 g) were the haplotypes with the highest 1000-grain weight (Figure 4B). If the haplotype with the highest 1000-grain weight is taken as the most elite haplotype of each known SHRG and the cultivars are grouped by the number of elite haplotypes they carry, none of the 3024 cultivars has more than 7 elite haplotypes across these 12 known SHRGs (Figure 4C). When two cultivars were randomly paired to construct hypothetical hybrids and the elite haplotypes of the 12 known SHRGs in each hybrid were counted, 128658, 7516, and 78 hybrids could pyramid 8, 9, and 10 elite haplotypes, respectively, combinations that never occur in the parental cultivars (Figure 4D). These findings indicate that no single rice cultivar carries all of the most superior genetic variations (haplotypes). Hybrids, however, offer greater possibilities for pyramiding superior genetic variations of grain weight regulators and forming grain weight heterosis.
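The pairing of cultivars into hypothetical hybrids can be sketched as follows. This is a simplified illustration, not the RFGB workflow: it assumes each cultivar is homozygous, that a hybrid "carries" an elite haplotype whenever at least one parent does, and it uses made-up cultivar names and haplotype assignments.

```python
from collections import Counter
from itertools import combinations


def hybrid_elite_counts(elite_by_cultivar):
    """For every unordered pair of cultivars, count the genes at which at
    least one parent carries the elite haplotype, and tally those counts."""
    tally = Counter()
    for (_, genes_a), (_, genes_b) in combinations(elite_by_cultivar.items(), 2):
        tally[len(genes_a | genes_b)] += 1
    return tally


# Hypothetical cultivars: each maps to the genes where it has the elite haplotype.
cultivars = {
    "cv1": {"OsFl3", "RAG2", "FLO4"},
    "cv2": {"OsISA1", "FLO11"},
    "cv3": {"OsFl3", "OsISA1"},
}
best_parent = max(len(genes) for genes in cultivars.values())
tally = hybrid_elite_counts(cultivars)
print(best_parent, max(tally))  # 3 5
```

Here no single cultivar has more than three elite haplotypes, yet the cv1 × cv2 pairing reaches five, which mirrors the pattern reported for Figure 4C versus Figure 4D. With 3024 cultivars there are 4,570,776 possible pairs, so the exhaustive loop remains feasible.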

Functional verification of two selected SHRGs
To evaluate the biological relevance of the 119 candidate SHRGs, we selected two SHRGs that overlapped with the 67 genes specifically expressed in hybrids for functional verification in grain weight formation: LOC_Os01g33350 (OsFl3) and LOC_Os02g55210 (hereafter referred to as SHRG2). Like OsFl3, SHRG2 was strongly expressed in the 5P samples of both 1HF1 and 8HF1 but almost completely suppressed in the 5P samples of the three parental lines (Figures 4A, 5A). Expression pattern analyses with eFP tools revealed that OsFl3 is mainly expressed in S3 developing seed and SHRG2 is primarily expressed in S2 developing seed (Supplementary Figure S3; Figure 5B). Haplotype analyses showed that the haplotypes of SHRG2 differed between indica and japonica cultivars: japonica cultivars mainly carried SHRG2-Hap2, whereas indica cultivars carried SHRG2-Hap1 or SHRG2-Hap3 (Figures 5C, D). The haplotypes of SHRG2 affected 1000-grain weight in cultivars, with SHRG2-Hap2 showing the highest 1000-grain weight (Figure 5E).
Here, we focused on the regulation of grain weight heterosis in intersubspecific tetraploid hybrids and used comparative RNA-seq analyses of developing ovaries from intersubspecific autotetraploid hybrids and their parental lines to identify a key gene set (119 SHRGs) that might contribute to high grain weight heterosis in tetraploid hybrids. This gene set contains 13 known grain weight regulators, namely OsFl3, ONAC023, OsNAC024, ONAC025, ONAC026, RAG2, FLO4, FLO11, OsISA1, OsNF-YB1, NF-YC12, OsYUC9, and SHRG2. Mutation of OsFl3 (Guo et al., 2022), ONAC023 (Li et al., 2022), RAG2 (Zhou et al., 2017), FLO4 (Chastain et al., 2006), FLO11 (Zhu et al., 2018; Tabassum et al., 2020), OsISA1 (Chao et al., 2019), OsNF-YB1 (Bai et al., 2016; Bello et al., 2019; Xu et al., 2021), NF-YC12 (Bello et al., 2019; Xiong et al., 2019), or OsYUC9 (Xu et al., 2021), or double mutation of OsNAC20 and OsNAC26 (Wang et al., 2020b), causes incomplete grain filling and significantly increases seed chalkiness. Correspondingly, overexpression of OsFl3 (Guo et al., 2022), ONAC023 (Li et al., 2022), RAG2 (Zhou et al., 2017), or NF-YC12 (Xiong et al., 2019) increases grain weight. OsNAC024 and ONAC025 contain SNPs that are significantly correlated with grain weight in rice, and their proteins interact with OsMED15a to govern the expression of grain weight genes such as GW2, GW5, and DR11 (Dwivedi et al., 2019). The flo11 mutant exhibits a temperature-sensitive phenotype (Tabassum et al., 2020). Sugar levels influence the expression of OsNAC23, and its protein directly inhibits the transcription of TPP1, thereby controlling sugar homeostasis and grain yield in rice (Li et al., 2022). Besides regulating grain weight, the NAC transcription factors OsNAC20 and OsNAC26 also positively regulate the expression of glutelin (GluA1/B4/B5), α-globulin, and 16 kD prolamin (Wang et al., 2020b). In this study, phenotypic observations indicate that mutations in OsFl3 or SHRG2 lead to a decrease in grain weight due to impaired filling (Figure 6), further contributing to the understanding of the 119 potential regulators of grain weight heterosis. These results suggest that all 13 grain weight regulators mentioned above function as positive regulators of grain weight. The elite genotypes of these grain weight heterosis-associated genes are dispersed among different varieties, whereas the combination of more elite genotypes in hybrids results in higher expression of grain weight regulators in the hybrids than in their parental lines and promotes grain weight heterosis. Taken together, our study presents a comprehensive analysis of gene expression patterns in tetraploid rice, focusing on intersubspecific seed heterosis. We have found a set of genes associated with grain weight heterosis, thereby contributing to our understanding of the mechanisms underlying heterosis generation in neo-tetraploid rice.

Breeding strategy for the utilization of multi-generation heterosis in neo-tetraploid rice
Our group has also focused on exploiting the unique advantages of neo-tetraploid rice, such as multi-generation heterosis. Autotetraploid rice hybrids possess four homologous chromosomes, and their heterozygotes require more generations to become homozygous. As a result, these hybrids demonstrate robust heterosis for multiple generations. Previously, we demonstrated that hybrids of neo-tetraploid lines and indica autotetraploid lines exhibited nearly similar yields from the F2 to the F4 generation, indicating that high levels of heterosis were maintained for several generations in hybrids of neo-tetraploid rice crossed with autotetraploid rice (Chen et al., 2022). The multi-generation heterosis of tetraploid rice has great potential for producing hybrid seeds and reducing cost. In contrast to diploid rice, the key tetraploid progenitors must be able to overcome polyploid sterility, as our neo-tetraploid rice does. We have now bred a series of neo-tetraploid lines and identified an indica tetraploid germplasm, HD11, with high combining ability with neo-tetraploid lines; these can serve as the japonica backbone parent and the indica backbone parent, respectively, in our future breeding. Thus, we proposed a strategy for utilizing multi-generation heterosis and intersubspecific heterosis based on neo-tetraploid rice (Chen et al., 2022; Liu et al., 2023). Referring to the "two-line" hybrid heterosis utilization in diploid rice, which involves a temperature-sensitive male sterile line (TMSL) and a restoring line (RL), our key strategy for future intersubspecific tetraploid hybrid rice breeding is as follows (Supplementary Figure S4):
(1) Creation of a new indica tetraploid TMSL with elite genes using HD11. The current study focuses on utilizing an exceptional tetraploid line, HD11, to enhance the productivity of tetraploid hybrid rice, and this line could be used for developing a tetraploid TMSL. Previously, we confirmed the feasibility of creating a tetraploid TMSL by editing the temperature-sensitive male sterile gene TMS5 (Chen et al., 2022). In this case, we can use CRISPR/Cas9 to target TMS5 to develop an HD11-derived TMSL.
(2) Breeding strong restorer lines based on neo-tetraploid rice. Neo-tetraploid rice can be used as the recurrent parent in crosses with various autotetraploid lines, followed by 5-6 backcrosses assisted by molecular markers to select target genes (such as "wide compatibility genes" and "neutral genes" for pollen fertility) and finally self-crossing to select excellent neo-tetraploid restorer lines. Robust restorers need to retain their capacity to overcome sterility caused by polyploidization while also increasing the number of grains and panicles.
(3) Selection of super vigor combinations of HD11-derived TMSL and neo-tetraploid restorer lines. The HD11-derived TMSL can be crossed with various neo-tetraploid restorer lines, and the yields of their F1 to F4 hybrids would be assessed to identify super vigor combinations with high yield and multi-generation heterosis. Our group has created several temperature-sensitive HD11-tms5 lines and identified several hybrids with high heterosis based on HD11 and neo-tetraploid restorer lines. Meanwhile, we are also exploring ways to "fix" the heterosis through apomixis using gene editing techniques, and additional efforts are needed in this respect. It is important to acknowledge that there is significant room for genetic improvement in tetraploid hybrids regarding grain number, panicle number, and tolerance to both biotic and abiotic stress. This implies a substantial potential for increasing grain yield.
To obtain high yields, direct seeding and dense planting could be tried in neo-tetraploid rice.
Furthermore, tetraploid rice possesses the distinctive benefit of multi-allelic heterosis, which can be effectively harnessed in future breeding programs for tetraploid rice. Tetraploid hybrids can contain multiple alleles at the same locus, while only two alleles are possible in diploid hybrids. Our understanding of the additional heterosis in tetraploid hybrids with multiple alleles remains inadequate. Further investigation into the utilization of related traits is necessary for future tetraploid rice breeding.
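The statement in this section that tetraploid heterozygotes need more generations to become homozygous can be illustrated with a single-locus selfing model. The sketch below is a textbook-style simplification and not an analysis from this study: it assumes one locus with two alleles, an F1 from two homozygous parents (Aa or AAaa), random chromosome segregation without double reduction, and no selection. The function names are hypothetical.

```python
from fractions import Fraction
from math import comb


def gamete_dist(n_a, ploidy):
    """P(gamete carries k copies of allele A) for a plant with n_a copies,
    assuming random chromosome segregation (hypergeometric sampling)."""
    half = ploidy // 2
    total = comb(ploidy, half)
    dist = {}
    for k in range(half + 1):
        ways = comb(n_a, k) * comb(ploidy - n_a, half - k)
        if ways:
            dist[k] = Fraction(ways, total)
    return dist


def heterozygous_fraction(ploidy, generations):
    """Fraction of plants still carrying both alleles in F1, F2, ... under selfing."""
    pop = {ploidy // 2: Fraction(1)}  # F1: Aa (diploid) or AAaa (tetraploid)
    fractions = [Fraction(1)]
    for _ in range(generations):
        nxt = {}
        for n_a, freq in pop.items():
            gametes = gamete_dist(n_a, ploidy)
            for k1, p1 in gametes.items():
                for k2, p2 in gametes.items():
                    nxt[k1 + k2] = nxt.get(k1 + k2, Fraction(0)) + freq * p1 * p2
        pop = nxt
        homozygous = pop.get(0, Fraction(0)) + pop.get(ploidy, Fraction(0))
        fractions.append(1 - homozygous)
    return fractions


for ploidy in (2, 4):
    print(ploidy, [str(x) for x in heterozygous_fraction(ploidy, 3)])
```

Under these assumptions, half of the diploid F2 plants are already homozygous at the locus, whereas 17/18 of the tetraploid F2 and 29/36 of the tetraploid F3 plants still carry both alleles. Retaining both alleles is only a proxy for heterosis, so the model indicates the direction of the effect rather than its size.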

Conclusions
Yield assessment of intersubspecific autotetraploid hybrid rice offers empirical evidence for our tetraploid breeding strategy, which combines elite indica autotetraploid lines with japonica neo-tetraploid lines. Intersubspecific autotetraploid hybrids still have considerable yield potential that can be realized by improving grain number, panicle number, elite haplotypes of grain weight regulators, and cultivation patterns. These results provide important germplasm for intersubspecific tetraploid hybrid rice breeding and new insights into the underlying mechanism of heterosis.
FIGURE 4 Twelve known grain weight regulatory genes with higher expression levels in hybrids than in parental lines. (A) Expression levels and functional annotation of 12 known SHRGs in 1HF1, 8HF1, and parental lines; (B) 1000-grain weight comparison among haplotypes of 12 known SHRGs; (C) Distribution of cultivars that carry different numbers of elite haplotypes in 12 known SHRGs; (D) Distribution of putative hybrids that carry different numbers of elite haplotypes in 12 known SHRGs.
FIGURE 5 Haplotype and expression analyses of SHRG2. (A) Expression levels of SHRG2 in 1HF1, 8HF1, and parental lines. (B) Expression pattern of SHRG2 via eFP. (C) Haplotype distribution of SHRG2 in japonica and indica cultivars of the RFGB database. (D) Five main haplotypes of SHRG2 in the cultivars of the RFGB database. (E) 1000-grain weight comparison among cultivars with different SHRG2 haplotypes.
FIGURE 6 Functional verification of OsFl3 and SHRG2. (A, B) Schematic diagrams of the OsFl3 and SHRG2 genes. The sequences of the CRISPR/Cas9 target sites are given with protospacer adjacent motifs (PAMs) underlined and the resulting mutations highlighted in red. The grain length and grain width of wild and mutant types (C, D, F, G), brown rice grains (E), grain thickness (H), and 1000-grain weight (I) of ZH11, fl3, and shrg2. Bars = 1 cm (C-E). Different lowercase letters indicate significant differences (P < 0.05, one-way ANOVA, least significant difference (LSD) test). Error bars indicate the standard error (SE).