- 1 Guangdong Key Laboratory of Plant Adaptation and Molecular Design, Guangzhou Key Laboratory of Crop Gene Editing, Innovative Center of Molecular Genetics and Evolution, School of Life Sciences, Guangzhou University, Guangzhou, China
- 2 School of Environmental Science and Engineering, Guangzhou University, Guangzhou, China
- 3 Northeast Agricultural University, Harbin, China
Soybean is highly sensitive to photoperiod, which influences the growth period, including flowering time (R1) and maturity (R8), and ultimately affects yield. In this study, we used a population of chromosome segment substitution lines (CSSLs), generated by introgressing segments of Glycine soja ZYD00006 into the Glycine max cultivar Suinong 14, to identify quantitative trait loci (QTLs) associated with growth period and yield. A total of 130 QTLs were identified across three environments, including 88 QTLs for growth period and 42 QTLs for grain-related traits. Based on the distribution of these QTLs, we identified 16 QTL clusters across 12 chromosomes. Among these, Chr09-cluster-1 harbored three stable QTLs associated with R1, R8 and 100-seed weight (SW). The gene GmFTIP09 was identified as the target gene. Gmftip09 mutants showed delayed flowering and maturity, as well as reduced SW. The allele conferring early flowering, early maturity and large seed weight was under strict artificial selection during the early stage of modern soybean breeding. This research provides valuable insights into the genetic control of flowering time and seed traits, offering potential targets for soybean breeding.
1 Introduction
Soybean (Glycine max (L.) Merr.) is a primary source of dietary protein and vegetable oil (Hartman et al., 2011). Several traits contribute to soybean yield, including growth period and seed weight. The reproductive phase, which affects adaptability, productivity and seed quality, is divided into eight stages (R1-R8) (Fehr et al., 1971). The R1 stage is defined as the point at which the plant has one open flower on the main stem. The R2 stage occurs when an open flower is present at one of the two uppermost nodes on the main stem. Stages R3 to R6 are associated with pod and seed development: R3 is beginning pod, R4 is full pod, R5 is beginning seed and R6 is full seed. R7 is defined as the stage at which one normal pod on the main stem has reached its mature pod color, and R8 as the stage at which 95% of the pods have reached their mature pod color (Clay et al., 2013). Among these stages, R1 marks the transition from the vegetative phase to the reproductive phase, R8 marks full maturity, and the interval from R1 to R8 (RP) represents the whole reproductive period. All three traits are particularly important for maximizing soybean yield. Soybean grain yield is a typical quantitative trait that is ultimately determined by the number of seeds per unit area and the seed weight (Pedersen and Lauer, 2004). Seed weight in cultivated soybean (often expressed as hundred-seed weight) is not only an important target for breeders aiming to improve yield (Tayade et al., 2023), but also a trait that determines the grade of soybean and the quality of soy-based foods (Friedman and Brandon, 2001).
Growth period and seed weight are both controlled by many genes with additive and epistatic effects (Gao et al., 2024). In previous studies, biparental mapping populations were generally used for mapping quantitative trait loci (QTLs). More than 600 genetic loci for growth period (including R1, RP and R8) and over 500 genetic intervals for seed traits (including seed weight, seed weight per plant and seed yield) are listed in SoyBase (https://www.soybase.org). Due to the inherently narrow genetic base of biparental populations, QTL detection efficiency is substantially reduced. As a high-throughput alternative to traditional linkage mapping, the genome-wide association study (GWAS) achieves efficient QTL detection by correlating genome-wide single-nucleotide polymorphisms (SNPs) with phenotypic variation. GWAS has begun to be used effectively to analyze the genetic basis of growth period and seed weight in soybean (Huang et al., 2010; Zhang et al., 2015, 2016; Yan et al., 2017; Zhao et al., 2019; Kim et al., 2020; Qi et al., 2020; Cao et al., 2022; Yao and Zhang, 2024; Perfilev et al., 2024). However, both approaches can only map association signals to large genomic regions in complicated genetic backgrounds, making it hard to identify candidate genes. Chromosome segment substitution lines (CSSLs), each carrying one or a few specific marker-defined donor segments in the genetic background of the adapted cultivar (Surapaneni et al., 2017), not only improve the accuracy of QTL mapping (Ando et al., 2008; Ookawa et al., 2016), but can also be used directly in breeding by design. Therefore, CSSLs serve as a valuable platform for breeding by design through target chromosome segment substitutions (Zhu et al., 2009; Li et al., 2019; Zhang et al., 2021).
Domestication was the earliest form of plant breeding and played an essential role in the rise of agriculture (Diamond, 2002; Salamini et al., 2002; Doebley et al., 2006). Modern crops differ from their wild relatives in various agronomically important traits that are collectively referred to as the domestication syndrome, including the loss of seed dormancy and pod shattering, a decrease in branching, an increase in fruit or seed dimensions, modifications in photoperiod sensitivity, and the advancement and synchronization of flowering and maturation (Darwin, 1859). It is widely believed that cultivated soybean was domesticated from wild soybean (Glycine soja Sieb. & Zucc.) in China approximately 6000–9000 years ago (Doebley et al., 2006; Kim et al., 2010; Wang and Li, 2011). Compared with wild soybean, cultivars show earlier flowering and larger seed weight (Lu et al., 2020; Goettel et al., 2022; Li et al., 2024b), but very few domestication-related genes regulating these traits are currently known; examples include Tof12 for early flowering and maturity, and POWR1 and GmCYP82C4 for seed weight. Selection for domestication traits resulted in a genome-wide reduction of genetic diversity, as well as the loss of useful traits retained in wild relatives. During domestication, severe genetic bottlenecks occurred, resulting in the loss of more than 70% of rare alleles, which highlights the high allelic diversity in wild soybeans (Hyten et al., 2006; Zhuang et al., 2022). Unraveling the domestication genetics of soybean will facilitate the discovery and utilization of rare but potentially important alleles.
To further identify key genes regulating soybean flowering time and seed weight during domestication, and to exploit superior genetic resources from wild soybean, this study employed a chromosome segment substitution line (CSSL) population, with wild soybean as the donor parent and cultivated soybean as the recurrent parent. By integrating phenotypic data collected over three years, we identified 88 quantitative trait loci (QTLs) related to growth period and 42 QTLs associated with seed-related traits. The results revealed distinct regulatory mechanisms for flowering time and maturity, with the post-flowering stage playing a significant role in soybean yield. We also detected 16 QTL clusters and analyzed the candidate genes within Chr09-cluster-1, the most stable cluster. Interestingly, no previously reported loci associated with flowering time or seed weight were found within this region. The results indicated that GmFTIP09 may be the candidate gene regulating both soybean flowering time and 100-seed weight. Notably, the early-flowering haplotype of GmFTIP09 underwent strong selective pressure during domestication. These findings provide valuable genetic resources for understanding the regulation of soybean growth periods and improving crop yield.
2 Materials and methods
2.1 Plant materials and phenotypic analysis
QTL detection: The seeds of the CSSL population, which consisted of 170 lines constructed from ZYD00006 (ZY06, donor parent) and Suinong 14 (SN14, recurrent parent), were provided by Northeast Agricultural University (Xin et al., 2016). In 2017, 2018 and 2019, the CSSL population was cultivated in experimental fields in Harbin, China (45.75°N, 126.63°E), where the soil is a Mollisol. The growing season ran from sowing in May to harvest in October. Each line was cultivated in 2-meter-long rows with 60 cm inter-row spacing, and 20 plants were sown per row. Five plants per row were randomly sampled to record R1 (the time at which the first flower had appeared on 50% of the plants), R8 (the time at which the pods of 50% of the plants had a mature color), RP (the interval from R1 to R8), PM (the percentage of RP within R8), SW (100-seed weight) and GW (total seed weight per plant); the line means were then used for subsequent analyses. In addition, SN14 and CSSL77 were grown in growth chambers under long-day photoperiods (16 h light/8 h dark) for sample collection and RT-qPCR assays.
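The two derived traits defined above follow directly from R1 and R8. The sketch below is a minimal illustration, assuming that R1 and R8 are both counted in days from the same starting point; the function names and the five-plant values are hypothetical and are not data from this study.

```python
def line_mean(values):
    """Average the plants sampled from one row to obtain the line mean."""
    return sum(values) / len(values)


def derived_traits(r1_days, r8_days):
    """Return RP (days) and PM (%) from flowering time (R1) and maturity (R8)."""
    if not 0 < r1_days < r8_days:
        raise ValueError("expected 0 < R1 < R8")
    rp = r8_days - r1_days        # interval from R1 to R8
    pm = 100.0 * rp / r8_days     # percentage of RP within R8
    return {"RP": rp, "PM": pm}


# Hypothetical (R1, R8) records for five plants sampled from one row.
plants = [(40, 110), (42, 112), (41, 111), (39, 109), (43, 113)]
r1_mean = line_mean([r1 for r1, _ in plants])
r8_mean = line_mean([r8 for _, r8 in plants])
traits = derived_traits(r1_mean, r8_mean)
print(traits)
```

Computing RP and PM from the line means, rather than per plant, is one of several reasonable choices and is an assumption of this sketch.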
Phenotypic screening of mutant libraries: Williams 82 (Wm82) and a mutant library induced by ethyl methanesulfonate (EMS) were provided by Nanjing Agricultural University (Zhang et al., 2022). Wm82 and the mutants were sown in Ningxia, China (38°23′N, 106°23′E), where Fluvo-Aquic soils dominate. Sowing took place in May and harvesting in October 2023. Each line was cultivated in 2-meter-long rows with 60 cm inter-row spacing, and 20 plants were sown per row. Five plants per row were randomly sampled to record R1, R8 and SW. In addition, Wm82 and homozygous Gmftip09-1 EMS mutants were grown in growth chambers under long-day photoperiods (16 h light/8 h dark) to further characterize the phenotype.
The QTLNetwork software was employed to estimate the heritability of traits (Yang et al., 2008).
The three-year phenotypic means were used for collinearity analysis, and R software was used for correlation analysis and visualization.
The statistical analysis and visualization were performed in SPSS and GraphPad Prism, respectively.
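As an illustration of the correlation analysis described above, the following sketch averages each line over the three years and then computes Pearson's correlation coefficient between two traits. It is a plain-Python stand-in for the R workflow, and it assumes that the correlations are computed on these per-line means; the function names and the example values are hypothetical.

```python
import math


def three_year_mean(by_year):
    """by_year: one list of line values per year, with lines in the same order."""
    return [sum(values) / len(values) for values in zip(*by_year)]


def pearson_r(x, y):
    """Pearson's correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical values for four lines over three years.
r1 = three_year_mean([[40, 45, 50, 55], [42, 44, 52, 54], [41, 46, 51, 56]])
gw = three_year_mean([[20, 18, 15, 12], [21, 17, 16, 11], [19, 19, 14, 13]])
print(round(pearson_r(r1, gw), 3))
```

A significance test for each coefficient would be added on top of this in practice; it is omitted here to keep the sketch short.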
2.2 Detection of introgressive chromosome segments
To identify SNP markers, the CSSLs were re-sequenced using the Illumina HiSeq 2000 platform (20x coverage), and a total of 3895 high-quality polymorphic bin markers distributed across the 20 chromosomes were obtained. Based on the physical map constructed in this study, markers evenly distributed on the physical map were selected to genotype the 170 individuals of the CSSL population. GGT2.0 software (Van Berloo, 2008) was applied with default parameters to analyze the characteristics of the introgressed chromosomal segments (the background recovery rate of the CSSLs, and the number and length of introgressed segments).
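The three summary statistics named above can be illustrated with a small sketch. This is not the GGT2.0 algorithm; it assumes a hypothetical input in which each bin marker is a tuple of chromosome, start, end and genotype code ('A' for the recurrent parent SN14, 'B' for the donor ZY06, 'H' for heterozygous), that adjacent non-recurrent bins on one chromosome form a single segment, and that the background recovery rate is the fraction of the covered length that is homozygous for the recurrent parent.

```python
def summarize_introgression(bins):
    """Summarize one line from bins sorted by chromosome and position.

    bins: list of (chrom, start_bp, end_bp, genotype) tuples.
    """
    segment_lengths = []
    total = recurrent = 0
    prev_chrom, prev_introgressed = None, False
    for chrom, start, end, genotype in bins:
        length = end - start
        total += length
        introgressed = genotype in ("B", "H")
        if genotype == "A":
            recurrent += length
        if introgressed:
            if prev_introgressed and chrom == prev_chrom:
                segment_lengths[-1] += length   # extend the current segment
            else:
                segment_lengths.append(length)  # start a new segment
        prev_chrom, prev_introgressed = chrom, introgressed
    return {
        "n_segments": len(segment_lengths),
        "introgressed_bp": sum(segment_lengths),
        "recovery_rate": recurrent / total if total else float("nan"),
    }


# Hypothetical bins for one line.
example = [
    ("Chr01", 0, 100, "A"), ("Chr01", 100, 150, "B"), ("Chr01", 150, 200, "B"),
    ("Chr01", 200, 400, "A"), ("Chr02", 0, 50, "B"), ("Chr02", 50, 300, "A"),
]
summary = summarize_introgression(example)
print(summary)
```

Treating heterozygous bins as introgressed is a design choice of the sketch; counting only homozygous donor bins would give smaller segment totals.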
2.3 Identification of maturity- and grain-related QTL
MapQTL 6.0 was used to map QTLs with the multiple-QTL model (MQM) at a threshold of LOD ≥ 2.0 (Van Ooijen, 2009). Positive additive effects indicated that the trait-increasing allele was contributed by ZY06, whereas negative additive effects indicated that it was contributed by SN14. QTLs were named according to their trait and their order on the chromosome. A QTL cluster was defined as a chromosomal region containing three or more QTLs for different traits whose confidence intervals shared an overlapping genomic segment (Yang et al., 2022).
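The LOD threshold and the cluster definition above can be expressed as a short procedure. The sketch below is only an illustration: it reads "three or more QTLs for different traits" as requiring at least three distinct traits, groups QTLs on a chromosome greedily while their confidence intervals still share a common segment, and uses hypothetical QTL names, positions and LOD scores.

```python
from collections import defaultdict

LOD_THRESHOLD = 2.0   # threshold stated in the text
MIN_TRAITS = 3        # assumed reading of "three or more QTLs for different traits"


def _keep_if_cluster(group, clusters):
    """Record the group if it covers enough distinct traits."""
    if len({q["trait"] for q in group}) >= MIN_TRAITS:
        clusters.append([q["name"] for q in group])


def find_clusters(qtls):
    """Group QTLs whose confidence intervals share a common overlapping segment."""
    by_chrom = defaultdict(list)
    for q in qtls:
        if q["lod"] >= LOD_THRESHOLD:
            by_chrom[q["chrom"]].append(q)
    clusters = []
    for chrom in sorted(by_chrom):
        group, shared_end = [], None
        for q in sorted(by_chrom[chrom], key=lambda item: item["start"]):
            if group and q["start"] <= shared_end:
                group.append(q)                       # still overlaps every member
                shared_end = min(shared_end, q["end"])
            else:
                _keep_if_cluster(group, clusters)
                group, shared_end = [q], q["end"]
        _keep_if_cluster(group, clusters)
    return clusters


# Hypothetical QTLs (positions in arbitrary units).
qtls = [
    {"name": "qA", "trait": "R1", "chrom": 9, "start": 10, "end": 30, "lod": 3.1},
    {"name": "qB", "trait": "R8", "chrom": 9, "start": 20, "end": 40, "lod": 2.5},
    {"name": "qC", "trait": "SW", "chrom": 9, "start": 25, "end": 35, "lod": 4.0},
    {"name": "qD", "trait": "GW", "chrom": 9, "start": 28, "end": 50, "lod": 1.5},
    {"name": "qE", "trait": "R1", "chrom": 5, "start": 5, "end": 15, "lod": 2.2},
    {"name": "qF", "trait": "R8", "chrom": 5, "start": 12, "end": 20, "lod": 2.9},
]
clusters = find_clusters(qtls)
print(clusters)
```

Because QTLs are visited in order of start position, every member of a group starts before the smallest end position in that group, so the whole group shares one overlapping segment.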
2.4 Sequence alignment and phylogenetic analysis
The AtFTIP1 (AT2G45660), AtFTIP3 (AT3G57880), AtFTIP4 (AT1G51570), AtFLD (AT3G10390) and AtFLD-Like (AT1G62830) protein sequences from Arabidopsis were retrieved from TAIR (https://www.arabidopsis.org/). The FTIP and FLD protein sequences from Glycine max are available at Phytozome (https://phytozome-next.jgi.doe.gov). Multiple sequence alignment was performed with Muscle in MEGA version 7.0 using default parameters. A neighbor-joining (NJ) phylogenetic tree was constructed with MEGA 7.0. The amino acid sequences were aligned and shaded with DNAMAN software.
2.5 RNA extraction and RT-qPCR
The trifoliate leaves of SN14 and CSSL77 were sampled at 20 DAE at Zeitgeber time (ZT) 16 in long-day conditions (16 h light/8 h dark) for expression analysis. Total RNA was extracted from samples using an Ultrapure RNA Kit (CWBIO, Jiangsu, China). First-strand cDNA synthesis and removal of genomic DNA contaminants were performed using HiScript® III RT SuperMix for qPCR (+gDNA wiper) (Vazyme, Nanjing, China). Quantitative PCR (qPCR) was performed on a Roche LightCycler 480 system (Roche, Mannheim, Germany) using a qPCR kit (Roche). β-Tubulin (TUB, Glyma.05g157300) was used as the internal control. Three independent biological replicates were analyzed, with three technical replicate reactions for each sample. The statistical analysis and visualization were performed in SPSS and GraphPad Prism, respectively. All qPCR primers are listed in Supplementary Table S1.
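The text does not state how relative expression was calculated. A common choice is the 2^(-ΔCt) method against the internal control, and the sketch below assumes that method purely for illustration; the function names and Ct values are hypothetical.

```python
import statistics


def relative_expression(ct_target, ct_reference):
    """2^(-dCt) for one biological replicate (assumed method, not stated in the text).

    ct_target, ct_reference: Ct values of the technical replicates for the
    target gene and the internal control (here Tubulin).
    """
    delta_ct = statistics.mean(ct_target) - statistics.mean(ct_reference)
    return 2.0 ** (-delta_ct)


def summarize(biological_replicates):
    """Mean and sample SD of relative expression across biological replicates."""
    values = [relative_expression(t, r) for t, r in biological_replicates]
    return statistics.mean(values), statistics.stdev(values)


# Hypothetical Ct values: three biological replicates, three technical replicates each.
replicates = [
    ([22.0, 22.0, 22.0], [20.0, 20.0, 20.0]),
    ([21.0, 21.0, 21.0], [20.0, 20.0, 20.0]),
    ([22.0, 22.0, 22.0], [20.0, 20.0, 20.0]),
]
mean_expr, sd_expr = summarize(replicates)
print(mean_expr, sd_expr)
```

Technical replicates are averaged before the power transform, and the mean ± SD is taken across biological replicates, which matches the replicate structure described above.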
3 Results
3.1 Introgressive segments analysis of the CSSLs
A population of chromosome segment substitution lines (CSSLs) was constructed in a previous study (Xin et al., 2016). However, the limited number of molecular markers available in this population restricted the efficiency of candidate gene discovery. To address this, we re-sequenced the population using the Illumina HiSeq 2000 platform (20x coverage) and identified 3895 high-quality polymorphic bin markers distributed across all 20 chromosomes.
As shown in Figure 1, the introgressed fragments covered all 20 chromosomes. The proportion of the introgressed region relative to each chromosome ranged from 1.20% to 23.80%, with an average of 8.87% (Supplementary Figure S1A). For each CSSL, the number of introgressed fragments ranged from 30 to 105 (Supplementary Figure S1B), and the total physical length of the introgressed fragments ranged from 5.60 to 214.77 Mb, with an average of 74.58 Mb (Supplementary Figure S1C). Large gaps were observed on chromosomes 2, 6, 9, 12, 17, and 19, indicating an uneven distribution of markers across chromosomes. These gaps may reflect significant variations in recombination rates at different chromosomal locations.

Figure 1. Genetic constitution and introgressive segments of CSSLs. Distribution of introgressed segments in the CSSLs on the 20 chromosomes. Blue: Homozygous for the allele from donor parent ZY06; Green: Homozygous for the allele from recurrent parent SN14; Red: Heterozygous.
3.2 Characterization of phenotypic performance
All traits fluctuated across years (Supplementary Figure S2). The interannual variation of flowering time (R1), from 29.72 to 57.66 days, was smaller than that of maturity (R8), from 94.66 to 136 days. R8 and RP (the interval from R1 to R8) showed similar interannual trends, with 2019 being the highest, followed by 2017 and 2018. However, the interannual trend of R1 was different, with 2017 being the highest, followed by 2018 and 2019. This divergence suggests that distinct regulatory mechanisms may control R1 and R8. The 100-seed weight (SW) and seed weight per plant (GW) ranged from 8.59 g to 27.52 g and from 5.81 g to 28.76 g, respectively. A comparative analysis of the six traits showed that R1, R8, and SW are mainly influenced by genetic factors, while the other three traits are more strongly affected by environmental factors (Supplementary Table S2).
Correlation analysis of all traits revealed a significant positive correlation between GW and SW, though the correlation coefficient was relatively low (Figure 2). This suggests that additional factors, beyond SW, are involved in regulating GW. Interestingly, we found a negative correlation between GW and R1, but a positive correlation between GW and the proportion of RP within R8 (PM) (Figure 2). These results imply that a long vegetative period may not be favorable for yield, whereas an optimal PM is a critical factor in determining yield.

Figure 2. Correlation analysis among maturity- and grain-related traits. The values in the correlation matrix represent Pearson's correlation coefficient. Positive value represents positive correlation and negative value represents negative correlation. * and *** represent significant differences at the 0.05 and 0.001 probability levels, respectively. R1: Flowering time (day); R8: Maturity (day); RP: The interval from R1 to R8 (day); PM: The percentage of RP within the R8 (%); SW: 100-seed weight (g); GW: Grain weight per plant (g).
3.3 QTL mapping for growth period and grain traits
A total of 130 quantitative trait loci (QTLs) were detected, randomly distributed across the 20 chromosomes. The fewest QTLs (only one each) were detected on chromosomes 8 and 9, and the most (twelve) were located on chromosome 6. The majority of QTLs were distributed at both ends of chromosomes, which may be because the regions near telomeres are typically hotspots for recombination.
Of the 130 QTLs, 88 were associated with the growth period (Figure 3; Supplementary Table S3), including 25 QTLs for R1, 23 QTLs for R8, 18 QTLs for RP, and 22 QTLs for PM. Among these, 18 QTLs were detected in two environments and 38 QTLs were detected in three environments, all of which are considered stable QTLs. Almost all of the R1 and R8 QTLs were stable, indicating strong genetic control over these traits. For R1 and R8, the additive effect of most QTLs was positive, meaning that the allele derived from SN14 resulted in earlier flowering and maturation. However, around the marker Block3364, one locus controlled R1, R8 and RP simultaneously, and the allelic variation from the wild soybean ZY06 at this locus promoted earlier flowering and maturation. This result indicates that late-flowering wild soybeans also carry early-flowering and early-maturing loci. Compared with R1 and R8, the additive effect of RP showed a different trend, with half of the QTLs having positive effects and the other half negative. Additionally, we found that the QTL near the marker Block4028 regulated RP and R8, but not R1, and that the QTL near the marker Block8903 regulated R1 and R8, but not RP. These findings suggest that the molecular mechanisms underlying R1, R8 and RP may differ.
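The stability criterion used above (detection in two or three environments) amounts to a simple count per QTL. The sketch below illustrates that rule; the QTL names and detection records are hypothetical placeholders rather than results from this study.

```python
def classify_stability(detections, min_envs=2):
    """Label each QTL by the number of distinct environments in which it was detected.

    detections: mapping of QTL name to the environments where it was detected.
    """
    labels = {}
    for name, envs in detections.items():
        n_envs = len(set(envs))
        labels[name] = "stable" if n_envs >= min_envs else "single-environment"
    return labels


# Hypothetical detection records across the three years.
detections = {
    "qR1-example-a": ["2017", "2018", "2019"],
    "qR8-example-b": ["2018", "2019"],
    "qRP-example-c": ["2019"],
}
labels = classify_stability(detections)
print(labels)
```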

Figure 3. QTLs detected on all chromosomes. Red, green, purple, blue, brown and yellow fonts represent R1, R8, RP, PM, SW and GW, respectively. An outlined box indicates that the QTL was detected in one environment; an outlined box filled with a grid of outlined diamonds indicates that the QTL was detected in two environments; a solid box indicates that the QTL was detected in three environments.
Furthermore, 42 QTLs associated with seed traits were identified (Figure 3; Supplementary Table S3), including 13 QTLs for seed weight per plant (GW) and 29 QTLs for 100-seed weight (SW). Most of the SW QTLs were stable, being detectable in two or three environments, indicating that genetic factors significantly influence SW. In contrast, GW was influenced by more factors, making it more complex. Most of the GW QTLs were detected in only one environment, with the exception of qGW-14.4, which also regulated SW and may represent a major QTL for seed traits in the CSSLs. The SN14 allele at this QTL is favorable for increasing both SW and GW.
Based on the distribution of the QTLs, we identified 16 QTL clusters across 12 chromosomes (Table 1). Among these, one (Chr02-cluster-1) contained no stable QTL, five contained one stable QTL, and ten contained more than two stable QTLs. Growth period was consistently linked to seed traits. We found that 13 of the clusters regulated both growth period and grain-related traits, indicating coordinated genetic control of these two groups of traits. However, Chr05-cluster-1, Chr11-cluster-2 and Chr16-cluster-1 regulated only growth period, suggesting that the candidate genes for these loci could be used to improve ecological adaptability without compromising yield.
In most of the identified clusters, the additive effect for growth period was positive, while the additive effect for seed traits was negative; that is, the favorable alleles from SN14 promoted earlier flowering and increased grain yield. This aligns with the goals of crop selection: early flowering, which facilitates mechanized harvesting, and high yield. However, Chr01-cluster-1 and Chr06-cluster-1 showed a different pattern: the additive effects for the growth period and seed trait QTLs had the same sign, meaning that the alleles from SN14 delay flowering while increasing SW and GW.
Among all the clusters, Chr09-cluster-1 is particularly noteworthy. All QTLs in this cluster were detected across three environments, emphasizing the importance of identifying the candidate genes within this cluster. These genes would provide valuable loci for further functional research and soybean breeding. Therefore, we focused on characterizing the candidate genes for this cluster.
3.4 Candidate genes prediction in Chr09-cluster-1
Focusing on Chr09-cluster-1, we grew SN14 and three CSSLs (CSSL-77, CSSL-91 and CSSL-161) carrying introgressed segments covering Chr09-cluster-1 in Harbin to analyze their phenotypes. Compared to SN14, the three lines exhibited delayed flowering and maturity, as well as reduced SW. This indicates that the segment contains key genes regulating R1, R8 and SW (Figures 4A-C).

Figure 4. Candidate gene prediction in Chr09-cluster-1. (A-C) The phenotypic performance of SN14 and three CSSLs carrying introgressed segments from Chr09-cluster-1 in Harbin. (A) Flowering time (R1); (B) Maturity (R8); (C) 100-seed weight (SW). All data are means ± SD (n=5). (D) Phylogenetic tree based on the amino acid sequences of FTIP proteins and their homologous proteins, constructed using the neighbor-joining method. (E) Phenotypes of wild-type plants (WT, Wm82) and homozygous Gmftip09 EMS mutants in growth chambers under long-day photoperiods (16 h light/8 h dark). Scale bar, 5 cm. (F) Gene structure of GmFTIP09 in Wm82 and Gmftip09-1. +, coding regions (CDS). AA, amino acid. The blue bars represent the C2 domain and the green bars represent the PRT_C domain. The triangle symbol represents the position of the base mutation. (G-I) The phenotypic performance of Gmftip09 EMS mutants in Ningxia. The EMS mutants of Gmftip09 comprise three mutation types, named Gmftip09-1, Gmftip09-2, and Gmftip09-3. Compared with Wm82, Gmftip09-1 harbors one non-synonymous variation at coding sequence (CDS) position 1146, which converts the 382nd amino acid from glycine to aspartic acid (G-D); Gmftip09-2 harbors one non-synonymous variation at CDS position 1714, which converts the 572nd amino acid from glutamic acid to lysine (E-K); Gmftip09-3 harbors one non-synonymous variation at CDS position 1822, which converts the 608th amino acid from alanine to threonine (A-T). (G) Flowering time (R1); (H) Maturity (R8); (I) 100-seed weight (SW). All data are means ± SD (n=5). (J, K) Relative expression levels of GmFTIP09 (J) and GmFT2a (K) in SN14 and CSSL-77 at ZT16. Data shown are relative to the control gene Tubulin and are means ± SD from three independent biological replicates. All plants were grown in growth chambers under long-day photoperiods (16 h light/8 h dark).
The two-sided Student’s t-test was performed to determine statistically significant differences in (A-C) and (G-K) *, ** and *** represent significant differences at the 0.05, 0.01 and 0.001 probability levels, respectively.
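The amino acid positions quoted in the caption are consistent with the usual mapping from a 1-based CDS nucleotide position to a 1-based residue index, in which each codon spans three nucleotides. The following sketch shows that arithmetic; the function name is hypothetical.

```python
def residue_index(cds_position):
    """Map a 1-based CDS nucleotide position to its 1-based amino acid index."""
    if cds_position < 1:
        raise ValueError("CDS positions are 1-based")
    return (cds_position + 2) // 3   # integer form of ceil(position / 3)


# CDS positions reported for the three Gmftip09 mutations.
for position in (1146, 1714, 1822):
    print(position, residue_index(position))
```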
According to the Wm82 soybean reference genome from the Phytozome database, Chr09-cluster-1 contains six genes homologous to flowering-related genes in Arabidopsis: Glyma.09G143500, the ortholog of Arabidopsis TERMINAL FLOWER 1 (TFL1); Glyma.09G149000, the ortholog of Arabidopsis AGAMOUS-LIKE 6 (AGL6); Glyma.09G161300, the ortholog of Arabidopsis POLYMERASE-ASSOCIATED FACTOR 2 (PAF2); Glyma.09G185800, the ortholog of Arabidopsis FLOWERING LOCUS D LIKE (FLD-like); Glyma.09G187900, the ortholog of Arabidopsis FT INTERACTING PROTEIN (FTIP); and Glyma.09G188000, the ortholog of Arabidopsis VERNALIZATION INSENSITIVE 3 LIKE (VIN3-like) (Supplementary Table S4). Of these six genes, only three showed coding sequence differences between ZY06 and SN14 (Supplementary Table S4). We then screened for mutants of these three genes in an EMS-induced Wm82 mutant library and grew them in Ningxia to evaluate flowering time. The results showed that Glyma.09G185800 and Glyma.09G187900 mutants exhibited significantly delayed flowering and maturity, as well as reduced SW, consistent with the phenotype observed in the CSSLs (Figures 4D-I; Supplementary Figure S3), indicating that Glyma.09G185800 and Glyma.09G187900 are potential candidate genes. Based on the phylogenetic analysis, Glyma.09G185800 and Glyma.09G187900 were named GmFLD-like1 and GmFTIP09, respectively (Figure 4D; Supplementary Figure S4A). We further analyzed natural variation in the GmFLD-like1 and GmFTIP09 coding sequences using re-sequencing data from a panel of 3118 soybean accessions. The variation in GmFLD-like1 defined 10 haplotypes. Surprisingly, none of these haplotypes matched the allele found in SN14 (Supplementary Figures S4B, C). In this case, association analysis cannot be used to determine whether the two parental haplotypes affect flowering time, so we do not focus on GmFLD-like1 here.
For GmFTIP09, sequence comparisons identified a total of eight haplotypes, with H1 and H4 corresponding to the SN14 and ZY06 alleles, respectively (Supplementary Figures S5A, B). The H1 haplotype was found in most landraces and improved cultivars, suggesting that H1 was under strict artificial selection during the early stage of modern soybean breeding (Supplementary Figure S5C). To further explore the functional significance of H1 and H4, we examined their association with R1 and SW. Since H4 was only found in wild soybeans, the analysis was conducted using wild soybean accessions. In three different locations, accessions with the H1 haplotype flowered significantly earlier and had larger SW than those with the H4 haplotype, supporting a role for GmFTIP09 in R1 and SW (Supplementary Figures S5D-G).
The above results suggest that GmFTIP09 may be a candidate gene for Chr09-cluster-1. In Arabidopsis thaliana, FTIP1 is involved in the movement of florigen (FT) to the shoot apex, and its loss is associated with late flowering. Studies have also shown that FT expression is downregulated in the ftip1 mutant (Liu et al., 2012). To determine whether GmFTIP09 affects the expression of GmFT2a (the ortholog of Arabidopsis thaliana FT), we quantified FT2a transcription in SN14 and CSSL-77. FT2a expression was higher in SN14, indicating that the ZY06 allele (H4) repressed FT2a expression and resulted in later flowering (Figures 4J, K).
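A haplotype comparison of this kind reduces to a two-sample test on the phenotype of the two groups. The sketch below computes the pooled-variance Student's t statistic for two haplotype groups, assuming equal variances; the flowering-time values are hypothetical, and the p-value lookup against the t distribution is left out to keep the example free of external dependencies.

```python
import math
import statistics


def student_t(group_a, group_b):
    """Pooled-variance two-sample t statistic and its degrees of freedom."""
    n_a, n_b = len(group_a), len(group_b)
    if n_a < 2 or n_b < 2:
        raise ValueError("each group needs at least two observations")
    var_a = statistics.variance(group_a)   # sample variance (n - 1 denominator)
    var_b = statistics.variance(group_b)
    df = n_a + n_b - 2
    pooled = ((n_a - 1) * var_a + (n_b - 1) * var_b) / df
    std_err = math.sqrt(pooled * (1.0 / n_a + 1.0 / n_b))
    t = (statistics.mean(group_a) - statistics.mean(group_b)) / std_err
    return t, df


# Hypothetical flowering times (days) for accessions carrying each haplotype.
h1_r1 = [38.0, 40.0, 41.0, 39.0, 42.0]
h4_r1 = [47.0, 49.0, 46.0, 50.0, 48.0]
t_stat, dof = student_t(h1_r1, h4_r1)
print(round(t_stat, 3), dof)
```

A negative statistic here means the first group (H1 in this example) has the smaller mean, that is, earlier flowering.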
4 Discussion
Cultivated soybean was domesticated from wild soybean in the Huang-Huai region (Jia et al., 2024). Compared to wild soybean accessions, cultivars exhibit significantly earlier flowering (Lu et al., 2020). In this research, we found that the allelic variations of cultivated soybeans SN14 at most QTLs associated with R1, RP, and R8 promote early flowering and maturity. This finding is consistent with previous research that early flowering time was selected during the domestication (Lu et al., 2020). However, for some QTLs, such as qR1-6.1/qR8-6.1 and qR1-16.1/qR8-16.1, the allele derived from wild soybean ZY06 promote flowering time and maturity. Compared to the cultivated soybean, its wild counterparts exhibit significantly higher genetic diversity due to the not gone through the artificial selection and population bottlenecks (Hyten et al., 2006; Lam et al., 2011; Li et al., 2010). This loss of diversity in cultivated soybeans may have resulted in the loss of some valuable genes or alleles crucial environmental adaptation. Consequently, the favorable alleles identified in ZY06 serve as a good source that can be re-introduced into soybean cultivars to breeding elite soybeans. In wild soybeans, only E1, E1La, E1Lb, Tof5, and Tof12 were reported to condition wild soybean for early flowering in high latitude (Lu et al., 2020; Dong et al., 2022, 2023; Fang et al., 2024b). Our reported QTLs, such as qR1-6.1/qR8-6.1 and qR1-16.1/qR8-16.1, were not overlap with the above known genes, indicating novel genetic elements. Further fine mapping the candidate for these QTLs will expand our understanding of the molecular mechanisms underlying wild soybean’s early flowering and maturity, providing valuable genetic resources for early-flowering breeding in cultivated soybean.
Based on the position of 130 QTLs, we found 16 QTL clusters. The emergence of favorable alleles in QTL clusters explains the strong phenotypic connection between crucial traits and linkage and suggests that genes in QTL clusters may be pleiotropic or deeply interconnected (Wang et al., 2020; Zhang et al., 2020). Among these clusters, Chr05-cluster-1, Chr11-cluster-2, and Chr16-cluster-1only contained the growth period QTL. This will facilitate the ecology adaptability without thinking about the yield. The other 13 clusters controlled flowing time, maturity and grain-related traits. The Chr09-cluster-1 was the most stable one, because three QTLs regulating R1, R8 and SW in this cluster were all detected in three environments. This interval contains no previously identified genes or loci known to regulate R1, R8 and SW, suggesting that the identified locus may represent a novel gene. Within this region, the coding sequence of GmFTIP09, a homolog of Arabidopsis FTIP1 (Liu et al., 2012), exhibits variations between the parental lines. In rice, Orchidaceae, mango, sugarcane, the homolog of Arabidopsis FTIP1 were all reported to regulate flowering, which provides evidence for a conserved role of FTIP1 in mediating flowering (Song et al., 2017; Wang et al., 2017; Yadav et al., 2020; Fang et al., 2024a). In soybean, association analysis in a natural population (based on haplotypes) revealed that GmFTIP09 regulates R1, R8 and SW, implicating it as a promising candidate gene for this locus.
In Arabidopsis, FTIP1 regulated flowering time, but FTIP1 expression was not regulated by known flowering genetic pathways (Liu et al., 2012). This regulatory pattern suggests that FTIP1 is less influenced by the environment, which is consistent with the stable results detected for the cluster under different environmental conditions. Interestingly, we found that the early-flowering haplotype of GmFTIP09 underwent strong selection and fixation during domestication. Therefore, this locus can only be detected in a genetic population derived from the cross between cultivated and wild soybean.
FTIP1 regulated flowering time by two ways in Arabidopsis, one is that FTIP1 affecting FT transport through the phloem to the shoot apical meristem, the other is that FTIP1 regulated the FT mRNA expression. In this research, we detected the transcriptional level between the parental haplotype and found that the SN14 haplotype relieves the inhibitory effect of GmFTIP09 on FT, promoting soybean flowering. In addition, we found that FTIP09 regulate seed weight, but the molecular mechanism remains unclear. FT is the member of the functional evolution of phosphatidylethanolamine binding proteins (PEBP) family, which can be divided into three subfamilies, TERMINAL FLOWER1(TFL1)-like, FLOWERING LOCUS T(FT)-like, and MOTHER OF FT AND TFL1(MFT)-like (Danilevskaya et al., 2008). Both of the GmMFT (Cai et al., 2023), the number of MFT subfamily, and Dt1 (Li et al., 2024a), the number of TFL1-like subfamily, were reported to regulated seed weight in soybean. In rice, FT-like 9 (FTL9), the member of the FT protein family, regulates grain size (Ta et al., 2023). These results lead us to guess FTIP regulated seed weight through the FT in soybean. However, FT exhibited only minor effect on seed size, but a significantly larger surface in Arabidopsis (Bigas et al., 2025). Therefore, further investigation is needed to explore the molecular mechanisms of FTIP in regulating grain size and to clarify whether FT participates in the regulation of soybean grain weight.
In addition, soybean roots bear nodules in which symbiotic bacteria fix nitrogen for the host plant. GmFT proteins can move from the shoots to the roots, and they enhance nodule formation in soybean hairy roots (Wang et al., 2021). In Arabidopsis, FTIP1 affects FT transport through the phloem to the shoot apical meristem. Whether FTIP also affects FT transport to the root, and whether it regulates symbiotic nitrogen fixation, deserves further research.
5 Conclusion
In this study, 88 QTLs for growth period and 42 QTLs for seed-related traits were detected, and 16 QTL clusters were defined. By focusing on the most stable QTL cluster and combining homologous alignment analysis, haplotype analysis and RT-qPCR, we identified GmFTIP09 as the target gene associated with growth period and grain traits. The newly detected QTLs and the candidate genes mined from them provide valuable insight for breeding soybean cultivars with high yield and broad geographical adaptability.
Data availability statement
The datasets presented in this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found in the article/Supplementary Material.
Author contributions
LC: Data curation, Formal Analysis, Funding acquisition, Software, Writing – original draft. JC: Data curation, Formal Analysis, Methodology, Writing – original draft. CL: Data curation, Formal Analysis, Writing – original draft. HL: Data curation, Methodology, Writing – original draft. MY: Investigation, Methodology, Writing – review & editing. QC: Data curation, Formal Analysis, Writing – review & editing. SL: Data curation, Supervision, Writing – review & editing.
Funding
The author(s) declare that financial support was received for the research and/or publication of this article. This work was supported by the National Natural Science Foundation of China (No. 32372112 to SL), the National Key Research and Development Program (No. 2021YFF1001100 to SL) and the Graduate Student Innovation Program of Guangzhou University (JCCX2024–019 to LC).
Acknowledgments
We thank Qingxin Song (Nanjing Agricultural University) for providing EMS mutated seeds.
Conflict of interest
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Generative AI statement
The author(s) declare that no Generative AI was used in the creation of this manuscript.
Publisher’s note
All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.
Supplementary material
The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fpls.2025.1640116/full#supplementary-material
References
Ando, T., Yamamoto, T., Shimizu, T., Ma, X. F., Shomura, A., Takeuchi, Y., et al. (2008). Genetic dissection and pyramiding of quantitative traits for panicle architecture by using chromosomal segment substitution lines in rice. Theor. Appl. Genet. 116, 881–890. doi: 10.1007/s00122-008-0722-6
Bigas, J. N., Fiers, M., van der Wal, F., Willems, L. A. J., Willemsen, V., Nijveen, H., et al. (2025). The PEBP genes FLOWERING LOCUS T and TERMINAL FLOWER 1 modulate seed dormancy and size. J. Exp. Bot. 76, 1049–1067. doi: 10.1093/jxb/erae466
Cai, Z. D., Xian, P. Q., Cheng, Y. B., Zhong, Y. W., Yang, Y., Zhou, Q. H., et al. (2023). MOTHER-OF-FT-AND-TFL1 regulates the seed oil and protein content in soybean. New Phytol. 239, 905–919. doi: 10.1111/nph.18792
Cao, Y. C., Jia, S. H., Chen, L. X., Zeng, S. N., Zhao, T. J., and Karikari, B. (2022). Identification of major genomic regions for soybean seed weight by genome-wide association study. Mol. Breed. 42, 38. doi: 10.1007/s11032-022-01310-y
Clay, D. E., Carlson, C. G., Clay, S. A., Wagner, L., Deneke, D. L., and Hay, C. (2013). “iGrow soybeans: best management practices for soybean production,” in Agronomy, horticulture, and plant science books (South Dakota State University, USA).
Danilevskaya, O. N., Meng, X., Hou, Z. L., Ananiev, E. V., and Simmons, C. R. (2008). A genomic and expression compendium of the expanded PEBP gene family from maize. Plant Physiol. 146, 250–264. doi: 10.1104/pp.107.109538
Darwin, C. (1859). On the origin of species by means of natural selection, or the preservation of favored races in the struggle for life (John Murray).
Diamond, J. (2002). Evolution, consequences and future of plant and animal domestication. Nature. 418, 700–707. doi: 10.1038/nature01019
Doebley, J. F., Gaut, B. S., and Smith, B. D. (2006). The molecular genetics of crop domestication. Cell. 127, 1309–1321. doi: 10.1016/j.cell.2006.12.006
Dong, L. D., Cheng, Q., Fang, C., Kong, L. P., Yang, H., Hou, Z. H., et al. (2022). Parallel selection of distinct Tof5 alleles drove the adaptation of cultivated and wild soybean to high latitudes. Mol. Plant 15, 308–321. doi: 10.1016/j.molp.2021.10.004
Dong, L. D., Li, S. C., Wang, L. S., Su, T., Zhang, C. B., Bi, Y. D., et al. (2023). The genetic basis of high-latitude adaptation in wild soybean. Curr. Biol. 33, 252–25+. doi: 10.1016/j.cub.2022.11.061
Fang, J. L., Chai, Z., Huang, C. L., Huang, R., Chen, B. S., Yao, W., et al. (2024a). Functional characterization of sugarcane ScFTIP1 reveals its role in Arabidopsis flowering. Plant Physiol. Biochem. 210, 108629. doi: 10.1016/j.plaphy.2024.108629
Fang, C., Sun, Z. H., Li, S. C., Su, T., Wang, L. S., Dong, L. D., et al. (2024b). Subfunctionalization and self-repression of duplicated E1 homologues finetunes soybean flowering and adaptation. Nat. Commun. 15, 6184. doi: 10.1038/s41467-024-50623-3
Fehr, W. R., Caviness, C. E., Burmood, D. T., and Pennington, J. S. (1971). Stage of development descriptions for soybeans, Glycine max (L.) Merrill. Crop Sci. 11, 929–931. doi: 10.2135/cropsci1971.0011183X001100060051x
Friedman, M. and Brandon, D. L. (2001). Nutritional and health benefits of soy proteins. J. Agric. Food Chem. 49, 1069–1086. doi: 10.1021/jf0009246
Gao, H. T., Wu, G. J., Wu, F. F., Zhou, X. J., Zhou, Y. G., Xu, K. H., et al. (2024). Genome-wide association analysis of yield-related traits and candidate genes in vegetable soybean. Plants-Basel. 13, 1442. doi: 10.3390/plants13111442
Goettel, W., Zhang, H. Y., Li, Y., Qiao, Z. Z., Jiang, H., Hou, D. Y., et al. (2022). POWR1 is a domestication gene pleiotropically regulating seed quality and yield in soybean. Nat. Commun. 13, 3051. doi: 10.1038/s41467-022-30314-7
Hartman, G. L., West, E. D., and Herman, T. K. (2011). Crops that feed the World 2. Soybean-worldwide production, use, and constraints caused by pathogens and pests. Food Secur. 3, 5–17. doi: 10.1007/s12571-010-0108-x
Huang, X. H., Wei, X. H., Sang, T., Zhao, Q. A., Feng, Q., Zhao, Y., et al. (2010). Genome-wide association studies of 14 agronomic traits in rice landraces. Nat. Genet. 42, 961–976. doi: 10.1038/ng.695
Hyten, D. L., Song, Q. J., Zhu, Y. L., Choi, I. Y., Nelson, R. L., Costa, J. M., et al. (2006). Impacts of genetic bottlenecks on soybean genome diversity. Proc. Natl. Acad. Sci. U.S.A. 103, 16666–16671. doi: 10.1073/pnas.0604379103
Jia, K. H., Zhang, X. Y., Li, L. L., Shi, T. L., Liu, D., Yang, Y. Y., et al. (2024). Telomere-to-telomere genome assemblies of cultivated and wild soybean provide insights into evolution and domestication under structural variation. Plant Commun. 5, 100919. doi: 10.1016/j.xplc.2024.100919
Kim, K. H., Kim, J. Y., Lim, W. J., Jeong, S., Lee, H. Y., Cho, Y., et al. (2020). Genome-wide association and epistatic interactions of flowering time in soybean cultivar. PloS One 15, e0228114. doi: 10.1371/journal.pone.0228114
Kim, M. Y., Lee, S., Van, K., Kim, T. H., Jeong, S. C., Choi, I. Y., et al. (2010). Whole-genome sequencing and intensive analysis of the undomesticated soybean (Glycine soja Sieb. and Zucc.) genome. Proc. Natl. Acad. Sci. 107, 22032–22037. doi: 10.1073/pnas.1009526107
Lam, H. M., Xu, X., Liu, X., Chen, W. B., Yang, G. H., Wong, F. L., et al. (2011). Resequencing of 31 wild and cultivated soybean genomes identifies patterns of genetic diversity and selection. Nat. Genet. 43, 387–387. doi: 10.1038/ng.715
Li, X. M., Chen, Z. H., Li, H. Y., Yue, L., Tan, C. R., Liu, H. J., et al. (2024a). Dt1 inhibits SWEET-mediated sucrose transport to regulate photoperiod-dependent seed weight in soybean. Mol. Plant 17, 496–508. doi: 10.1016/j.molp.2024.02.007
Li, S. N., Cheng, P., Bai, Y. Q., Shi, Y., Yu, J. Y., Li, R. C., et al. (2019). Analysis of soybean somatic embryogenesis using chromosome segment substitution lines and transcriptome sequencing. Genes. 10, 943. doi: 10.3390/genes10110943
Li, Y. H., Li, W., Zhang, C., Yang, L. A., Chang, R. Z., Gaut, B. S., et al. (2010). Genetic diversity in domesticated soybean (Glycine max) and its wild progenitor (Glycine soja) for simple sequence repeat and single-nucleotide polymorphism loci. New Phytol. 188, 242–253. doi: 10.1111/j.1469-8137.2010.03344.x
Li, Y., Zhao, W. Q., Tang, J. J., Yue, X. L., Gu, J. B., et al. (2024b). Identification of the domestication gene GmCYP82C4 underlying the major quantitative trait locus for the seed weight in soybean. Theor. Appl. Genet. 137, 62. doi: 10.1007/s00122-024-04571-7
Liu, L., Liu, C., Hou, X. L., Xi, W. Y., Shen, L. S., Tao, Z., et al. (2012). FTIP1 is an essential regulator required for florigen transport. PloS Biol. 10, e1001313. doi: 10.1371/journal.pbio.1001313
Lu, S. J., Dong, L. D., Fang, C., Liu, S. L., Kong, L. P., Cheng, Q., et al. (2020). Stepwise selection on homologous PRR genes controlling flowering and maturity during soybean domestication. Nat. Genet. 52, 428–436. doi: 10.1038/s41588-020-0604-7
Ookawa, T., Aoba, R., Yamamoto, T., Ueda, T., Takai, T., Fukuoka, S., et al. (2016). Precise estimation of genomic regions controlling lodging resistance using a set of reciprocal chromosome segment substitution lines in rice. Sci. Rep. 6, 30572. doi: 10.1038/srep30572
Pedersen, P. and Lauer, J. G. (2004). Response of soybean yield components to management system and planting date. Agron. J. 96, 1372–1381. doi: 10.2134/agronj2004.1372
Perfilev, R., Shcherban, A., Potapov, D., Maksimenko, K., Kiryukhin, S., et al. (2024). Genome-wide association study revealed some new candidate genes associated with flowering and maturity time of soybean in Central and West Siberian regions of Russia. Front. Plant Sci. 15. doi: 10.3389/fpls.2024.1463121
Qi, Z. Y., Song, J., Zhang, K. X., Liu, S. L., Tian, X. C., Wang, Y., et al. (2020). Identification of QTNs controlling 100-seed weight in soybean using multilocus genome-wide association studies. Front. Genet. 11. doi: 10.3389/fgene.2020.00689
Salamini, F., Özkan, H., Brandolini, A., Schäfer-Pregl, R., and Martin, W. (2002). Genetics and geography of wild cereal domestication in the Near East. Nat. Rev. Genet. 3, 429–441. doi: 10.1038/nrg817
Song, S. Y., Chen, Y., Liu, L., Wang, Y. W., Bao, S. J., Zhou, X., et al. (2017). OsFTIP1-mediated regulation of florigen transport in rice is negatively regulated by the ubiquitin-like domain kinase OsUbDKγ4. Plant Cell. 29, 491–507. doi: 10.1105/tpc.16.00728
Surapaneni, M., Balakrishnan, D., Mesapogu, S., Addanki, K. R., Yadavalli, V. R., Venkata, V. G. N. T., et al. (2017). Identification of major effect QTLs for agronomic traits and CSSLs in rice from Swarna/Oryza nivara derived backcross inbred lines. Front. Plant Sci. 8. doi: 10.3389/fpls.2017.01027
Ta, K. N., Shimizu-Sato, S., Agata, A., Yoshida, Y., Taoka, K. I., Tsuji, H., et al. (2023). A leaf-emanated signal orchestrates grain size and number in response to maternal resources. Plant J. 115, 175–189. doi: 10.1111/tpj.16219
Tayade, R., Imran, M., Ghimire, A., Khan, W., Nabi, R. B. S., and Kim, Y. (2023). Molecular, genetic, and genomic basis of seed size and yield characteristics in soybean. Front. Plant Sci. 14. doi: 10.3389/fpls.2023.1195210
Van Berloo, R. (2008). GGT 2.0: Versatile software for visualization and analysis of genetic data. J. Hered. 99, 232–236. doi: 10.1093/jhered/esm109
Van Ooijen, J. W. (2009). MapQTL 6.0, software for the mapping of quantitative trait loci in experimental populations of dihaploid species (Netherlands: Wageningen: Kyazma BV), 64.
Wang, T., Guo, J., Peng, Y. Q., Lyu, X. G., Liu, B., Sun, S. Y., et al. (2021). Light-induced mobile factors from shoots regulate rhizobium-triggered soybean root nodulation. Science. 374, 65. doi: 10.1126/science.abh2890
Wang, K. J. and Li, X. H. (2011). Interspecific gene flow and the origin of semi-wild soybean revealed by capturing the natural occurrence of introgression between wild and cultivated soybean populations. Plant Breed. 130, 117–127. doi: 10.1111/j.1439-0523.2010.01815.x
Wang, Y., Liu, L., Song, S., Li, Y., Shen, L., and Yu, H. (2017). DOFT and DOFTIP1 affect reproductive development in the orchid Dendrobium Chao Praya Smile. J. Exp. Bot. 68, 5759–5772. doi: 10.1093/jxb/erx400
Wang, F. R., Zhang, J. X., Chen, Y., Zhang, C. Y., Gong, J. W., Song, Z. Q., et al. (2020). Identification of candidate genes for key fiber-related QTLs and derivation of favorable alleles in Gossypium hirsutum recombinant inbred lines with G.barbadense introgressions. Plant Biotechnol. J. 18, 707–720. doi: 10.1111/pbi.13237
Xin, D. W., Qi, Z. M., Jiang, H. W., Hu, Z. B., Zhu, R. S., Hu, J. H., et al. (2016). QTL location and epistatic effect analysis of 100-seed weight using wild soybean (Glycine soja Sieb. & Zucc.) chromosome segment substitution lines. PloS One 11, e0149380. doi: 10.1371/journal.pone.0149380
Yadav, A., Jayaswal, P. K., Raman, K. V., Singh, B., Singh, N. K., and Usha, K. (2020). Transcriptome analysis of flowering genes in mango (Mangifera indica L.) in relation to floral malformation. J. Plant Biochem. Biotechnol. 29, 571–572. doi: 10.1007/s13562-019-00541-z
Yan, L., Hofmann, N., Li, S. X., Ferreira, M. E., Song, B. H., Jiang, G. L., et al. (2017). Identification of QTL with large effect on seed weight in a selective population of soybean with genome-wide association and fixation index analyses. BMC. Genomics 18, 529. doi: 10.1186/s12864-017-3922-0
Yang, J., Hu, C. C., Hu, H., Yu, R. D., Xia, Z., Ye, X. Z., et al. (2008). QTLNetwork: mapping and visualizing genetic architecture of complex traits in experimental populations. Bioinformatics. 24, 721–723. doi: 10.1093/bioinformatics/btm494
Yang, P., Sun, X. T., Liu, X. Y., Wang, W. W., Hao, Y. S., Chen, L., et al. (2022). Identification of candidate genes for lint percentage and fiber quality through QTL mapping and transcriptome analysis in an allotetraploid interspecific cotton CSSLs population. Front. Plant Sci. 13. doi: 10.3389/fpls.2022.882051
Yao, X. D. and Zhang, D. Y. (2024). Genome-Wide Association Analysis of Active Accumulated Temperature versus Flowering Time in Soybean [Glycine max (L.) Merr.]. Agronomy-Basel. 14, 833. doi: 10.3390/agronomy14040833
Zhang, Z., Li, J. W., Jamshed, M., Shi, Y. Z., Liu, A. Y., Gong, J. W., et al. (2020). Genome-wide quantitative trait loci reveal the genetic basis of cotton fiber quality and yield-related traits in a Gossypium hirsutum recombinant inbred line population. Plant Biotechnol. J. 18, 239–253. doi: 10.1111/pbi.13191
Zhang, Q., Li, P. T., Liu, A. Y., Li, S. Q., Lu, Q. W., Ge, Q., et al. (2021). Rapid identification of a candidate gene related to fiber strength using a superior chromosome segment substitution line from Gossypium hirsutum x Gossypium barbadense via bulked segregant RNA-sequencing. Phyton-Int. J. Exp. Bot. 90, 837–858. doi: 10.32604/phyton.2021.014437
Zhang, J. P., Song, Q. J., Cregan, P. B., and Jiang, G. L. (2016). Genome-wide association study, genomic prediction and marker-assisted selection for seed weight in soybean (Glycine max). Theor. Appl. Genet. 129, 117–130. doi: 10.1007/s00122-015-2614-x
Zhang, J. P., Song, Q. J., Cregan, P. B., Nelson, R. L., Wang, X. Z., Wu, J. X., et al. (2015). Genome-wide association study for flowering time, maturity dates and plant height in early maturing soybean (Glycine max) germplasm. BMC. Genomics 16, 217. doi: 10.1186/s12864-015-1441-4
Zhang, M. Z., Zhang, X. Y., Jiang, X. Y., Qiu, L., Jia, G. H., Wang, L. F., et al. (2022). iSoybean: A database for the mutational fingerprints of soybean. Plant Biotechnol. J. 20, 1435–1437. doi: 10.1111/pbi.13844
Zhao, X., Dong, H. R., Chang, H., Zhao, J. Y., Teng, W. L., Qiu, L. J., et al. (2019). Genome wide association mapping and candidate gene analysis for hundred seed weight in soybean [Glycine max (L.) Merrill]. BMC. Genomics 20, 648. doi: 10.1186/s12864-019-6009-2
Zhu, W. Y., Lin, J., Yang, D. W., Zhao, L., Zhang, Y. D., Zhu, Z., et al. (2009). Development of chromosome segment substitution lines derived from backcross between two sequenced rice cultivars, Indica recipient 93–11 and Japonica donor Nipponbare. Plant Mol. Biol. Rep. 27, 126–131. doi: 10.1007/s11105-008-0054-3
Keywords: soybean, chromosome segment substitution lines, growth period, yield, GmFTIP09
Citation: Chen L, Chen J, Liu C, Li H, You M, Chen Q and Lu S (2025) GmFTIP09 regulated flowering time and seed weight. Front. Plant Sci. 16:1640116. doi: 10.3389/fpls.2025.1640116
Received: 03 June 2025; Accepted: 18 July 2025;
Published: 11 August 2025.
Edited by:
Alejandro Isabel Luna-Maldonado, Autonomous University of Nuevo León, Mexico
Reviewed by:
Anuradha Singh, Michigan State University, United States
Liu ShuLin, Chinese Academy of Sciences (CAS), China
Copyright © 2025 Chen, Chen, Liu, Li, You, Chen and Lu. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
*Correspondence: Sijia Lu, lusijia@gzhu.edu.cn; Mengyang You, myyou@gzhu.edu.cn; Qingshan Chen, qshchen@126.com
†These authors have contributed equally to this work