Construction of Chromosome Segment Substitution Lines and Inheritance of Seed-Pod Characteristics in Wild Soybean

Genetic populations provide the basis for genetic and genomic research, and chromosome segment substitution lines (CSSLs) are a powerful tool for the fine mapping of quantitative traits, new gene mining, and marker-assisted breeding. In this study, 213 CSSLs were obtained by self-crossing, backcrossing, and marker-assisted selection between cultivated soybean (Glycine max [L.] Merr.) variety Suinong14 (SN14) and wild soybean (Glycine soja Sieb. et Zucc.) ZYD00006. The genomes of these 213 CSSLs were resequenced and 580,524 single-nucleotide polymorphism markers were obtained, which were divided into 3,780 bin markers. The seed-pod-related traits were analyzed by quantitative trait locus (QTL) mapping using CSSLs. A total of 170 QTLs were detected, and 32 QTLs were detected stably for more than 2 years. Through epistasis analysis, 955 pairs of epistasis QTLs related to seed-pod traits were obtained. Furthermore, the hundred-seed weight QTL was finely mapped to the region of 64.4 Kb on chromosome 12, and Glyma.12G088900 was identified as a candidate gene. Taken together, a set of wild soybean CSSLs was constructed and upgraded by a resequencing technique. The seed-pod-related traits were studied by bin markers, and a candidate gene for the hundred-seed weight was finely mapped. Our results have revealed the CSSLs can be an effective tool for QTL mapping, epistatic effect analysis, and gene cloning.


INTRODUCTION
Cultivated soybean (Glycine max [L.] Merr.) is an important food crop and oil crop in the world, and it is the most important source of plant protein (Nadeem et al., 2021). Therefore, the development of soybean industry is of great significance. Good variety is an important prerequisite for high yield and high yield of soybean. According to various traits of soybean, cultivating good varieties and improving yield and quality of soybean are the goals of breeding workers (Reinprecht et al., 2006). The emergence of molecular marker-assisted selection (MAS) and QTL mapping has significantly shortened the breeding time (Hasan et al., 2021). Some June 2022 | Volume 13 | Article 869455 Zheng et al.
CSSLs Construction and Inheritance Study excellent loci and genes that are of great significance to breeding have been identified by populations, such as F 2 , BC 1 , DH, RIL, and IF 2 (Mian et al., 1996;Song et al., 2005;Reinprecht et al., 2006;Alfred et al., 2012;Botero-Ramirez et al., 2020). The genetic composition of each member in the above population is diverse, which can provide abundant genetic variation information and calculate multiple genetic effects. However, the diversity of genetic background will also bring some interference to QTL mapping and affect its accuracy Qiao et al., 2016). The CSSLs constructed based on backcrosses have similar genetic backgrounds, which can eliminate interference from genetic noise and improve the accuracy of QTL mapping (Ali et al., 2010;Balakrishnan et al., 2019). CSSLs are constructed by continuous backcross between recurrent parent and donor parent. The genetic background is clean and uniform, and only a few or a single donor genome introgressed fragments are different among the lines. CSSL population can accurately locate QTL loci for target traits. In addition, the process of backcrossing can introduce valuable genes into cultivated varieties from wild and distant species to broaden the genetic base of cultivated species. For more than 20 years, since Eshed and Zamir identified QTLs related to fruit solids, plant height, and green fruit weight in tomato (Alpert and Tanksley, 1996), CSSLs have received extensive attention. At present, more than 100 sets of CSSLs have been constructed in 20 crops. At the same time, a large number of QTLs were identified by these CSSLs, and fine mapping and gene mining of these QTLs was carried out (Tian et al., 2006;Gu et al., 2015;Wang et al., 2016b;Xin et al., 2016;Zhang et al., 2019).
As high-quality genetic populations, the above CSSLs can help to understand the genetic basis of crop traits and mining important candidate genes. At the same time, marker density has become the limiting factor in gene detection efficiency. Resequencing technology can provide high-density single-nucleotide polymorphism (SNP) markers. Employing SNP data to divide bin markers in the whole population is a feasible marker strategy that can detect QTLs more effectively than traditional methods. At present, it has been employed for fine mapping of QTLs for yield traits in rice, maize, and other crops (Fonceka et al., 2012;Chen et al., 2014;Gu et al., 2015;Wu et al., 2020).
Annual wild soybean is considered to be the wild ancestor of cultivated soybean (Broich and Palmer, 1980). In the process of domestication and improvement, pods, seeds, and other important yield traits have been under constant selection. At present, a large number of seed-pod trait QTLs have been detected. For example,  carried out QTL analysis and locus interaction effect analysis on the number of threeseed pods in a high-generation recombinant inbred line by a SLAF-seq high-density genetic map, and a total of 17 QTLs were detected. Han et al. (2020) performed additive QTL and epistatic QTL mining on the number of seeds, seed weight, number of pods, three-seed pods, and four-seed pods of recombinant inbred lines derived from the crossing of Hefeng 25 and Lmuri 28 by a molecular genetic linkage map, which was constructed by the Bin-Map mapping technique, and 27 related QTLs were obtained. However, only Ln, GA20OX, SWEET10, and PP2C-1 were confirmed to be associated with seed-pod traits (Lu et al., 2016(Lu et al., , 2017Sayama et al., 2017;. Furthermore, the allelic differences between wild and cultivated soybeans for these traits are largely unknown.
In this study, based on the original CSSLs, the genome was resequenced to QTL mapping of seed-pod traits. The purpose of this study was to broaden the genetic base of cultivated soybean, explore the genetic mechanisms of the formation of soybean pod traits, and mine the excellent genes of wild soybean by wild soybean resources. It is expected that the CSSLs will provide excellent breeding and genetic material and support for breeders and geneticists. Revealing the genetic mechanism of traits and mining important candidate genes could provide important theoretical support for the analysis of seed-pod trait formation in soybean.

Field Planting and Management of Soybean
The experimental materials used in this study included 213 CSSLs. The parents were SN14 (recurrent parent) and ZYD0006 (donor parent), constructed with the aid of simple sequence repeats (SSR) markers (Jiang et al., 2020). The materials of CSSLs consist of eight generations (6 BC 3 F 5 , 37 BC 3 F 6 , 61 BC 3 F 7 , 2 BC 4 F 4 , 12 BC 4 F 5 , 22 BC 4 F 6 , 13 BC 5 F 4 , and 60 BC 5 F 5 ). During 2016-2020, the CSSLs were planted at Xiangyang Farm, Harbin, Heilongjiang Province (Harbin, latitude 45°45′N, longitude 126°38′E) in May every year. A randomized complete block design with three replicates was used for 5 years. The lines were grown in a one-row plot with 60 cm row spacing and 6 cm between plants. The rows were 5 m long, with approximately 80 plants per row. The germplasm resource population was planted in Xiangyang Farm, and the planting method is the same as CSSLs. To fine-map the HSW trait, a CSSL (R183) with small HSW was selected as male parent to cross with SN14. Then, the F 1 plants were selfed in Hainan Experimental Station with the false hybrids removed. After selfing, 74 F 2:3 and 144 RHL lines, which comprised the secondary population named R183-F 2:3 and RHL, were harvested. Farmland management of all the above materials follows common agricultural practices. weight (HSW), and seed-pod weight (SPW). Pod-related traits include the proportion of one-seed-pod-numbers (PoOSPN), the proportion of two-seed-pod-numbers (PoTSPN), the proportion of three-seed-pod-numbers (PoTHSPN), and the proportion of four-seed-pod-numbers (PoFSPN). Seed length is the sum of the longest distance parallel to the hilum of 10 seeds. Seed width is the sum of the longest distance perpendicular to the hilum of 10 seeds. The HSW is the weight of 100 seeds of uniform size. Seed-pod weight is the weight of all seeds and pod shells on the whole plant .

Material Resequencing and Bin Marker Acquisition
Genomic DNA was extracted from young leaves of Suinong 14 (SN14), wild soybean ZYD00006, and CSSLs by a modified CTAB method (Wu et al., 2020). Then, the extracted genomic DNA was randomly interrupted by ultrasonic fragmentation, and the sequencing library was constructed by terminal repair, adding an a at 3′ end, adding a sequencing connector, purification, and PCR amplification. After passing a quality inspection, the library was sequenced with an Illumina HiSeqTM sequencing platform, and the sequence read length was 150 bp. By BWA and GATK software, a sequence was compared with the reference genomic Glycine max Wm82.a2.v1 to identify the original SNP (Li and Durbin, 2009;McKenna et al., 2010;Schmutz et al., 2010), and high confidence SNPs were obtained according to the method of Guo et al. (2015).
A developed approach of sliding-window (Huang et al., 2009) was used for CSSLs genotyping with a group of 17 consecutive SNPs and a step size of 1. The position of a SNP that switched one genotype to another consecutive genotype was recorded as the recombination breakpoint. A bin map was constructed according to the recombination breakpoint information. All of the breakpoints along the same chromosome were sorted by physical position.

Method of Date Analysis
Phenotypic data of CSSLs and secondary segregation population (F 2 , F 2:3 , and RHL) have been statistically analyzed by SPSS 26. For example, average value, maximum value, minimum value, etc. The best linear unbiased estimate (BLUE) can more effectively eliminate the deviation caused by the environment and improve the accuracy of QTL mapping. The BLUE was calculated by BLUE calculation option in the AOV module of the ICIMapping 4.2 software. The parameters used were each line of CSSLs as fixed factor, years and repetition as random factors.

QTL Mapping of Pod-Seed Traits
Bin markers were used for QTL mapping of the seed-pod-related traits in CSSLs, and a likelihood ratio test based on stepwise regression model (RSTEP-LRT-ADD) with ICIMapping 4.2 software was used to determine QTLs. Parameters setting was "By condition number" = −1,000 (equivalent to deleting duplicate markers), "PIN" = 0.001 (PIN: the largest-value for entering variables in stepwise regression of residual phenotype on marker variables), and LOD value ≥2.5 (Wang et al., , 2016aLi et al., 2007Li et al., , 2008. The linear model of the RSTEP-LRT method was as follows: where i = 0 (for the recurrent parent), 1, 2, …, n, b 0 is the intercept, b j (j = 1, …, t) is the partial regression coefficient of phenotype on the jth marker, which represents QTL additive effect on each segment, x ij is the indicator variable for the jth marker in the ith CSSL, which is equal to −1 if the marker type is the same as in P 1 (the recurrent parent) and 1 if the marker type is the same as in P 2 (the donor parent), and e i is the random experimental error following a normal distribution. DNA was extracted from the secondary segregated population by the CTAB method. SSR markers were used as molecular markers (the marker obtained from Soybase). 1 After polymerase chain reaction (PCR) of Panaud, the genotype of secondary segregated population was then detected by 8% PAGE separation (Doyle and Doyle, 1990). Genetic map is constructed by MAP module in ICIMapping 4.2 software. QTL mapping employed the ICIM-ADD model with 1,000 permutations calculation of ICIMapping 4.2 software. And the threshold of ICIMapping for QTL detection was chose p ≤ 0.05. The epistasis of the seed-pod-related traits by the RSTEP-LRT-EPI model with PIN = 0.000001calculation of ICIMapping 4.2 software . And the threshold of ICIMapping for epistasis QTL detection were set chose LOD > 10.

qRT-PCR Analysis of Candidate Genes
The seed of qRT-PCR material was sampled according to different developmental stages of soybean seeds. The Cotyledon Stage (COT, 10 days after flowering (DAF)) and Early Maturity Stage 1(EM1, 20 DAF) mainly carry out cotyledon development and endosperm assimilation. Then, the seed size and weight are increasing at Early Maturity Stage 2(EM2, 30 DAF) and Mid Seed Maturity (MM, 50 DAF). 2 During the COT, EM1, EM2, and MM of seed in parents SN14, CSSL-R183, and wild soybean ZYD00006 (Severin et al., 2010;Collakova et al., 2013;Lu et al., 2016), soybean seed RNA was extracted by TRIzol (Invitrogen, 15596-026, Carlsbad, CA, United States). qRT-PCR was carried out on a Roche LightCycler 480 II fast real-time PCR system using SYBR qPCR Mix (Vazyme, Q711, Vazyme Biotech, Nanjing, China). Actin (GLYM18G52780) was used as the internal reference (Gu et al., 2017), and the relative amount of gene expression (target gene/actin) was calculated by the 2 ΔCT method. Premier 5.0 3 was used to design qRT-PCR primer sequences for candidate genes.

Haplotype Analysis
A total of 129 germplasm resources from China were used for haplotype analysis due to their rich genetic variation and large 1 https://soybase.org/ 2 https://soybase.org/amigo/amigo/term/SOY:0001283#display-lineage-tab 3 http://www.premierbiosoft.com/primerdesign/index.html range of HSWs. The genomic sequences of the candidate genes were proposed on the Phytozome13 website, 4 including the 5′UTR, upstream 2000 bp promoter region sequence, CDS, intron sequence, and 3′UTR sequence information. The sequence information for the candidate gene and the genomic sequencing information from the resequencing of 129 soybean germplasm resources were subjected to local BLAST analysis to obtain SNP information from the candidate genes in the germplasm resource population (unpublished). In this study, Dnasp5.0 software was used to analyze the haplotype distribution of candidate gene SNP sequences in germplasm resource populations, and excellent haplotypes (the number of cultivars belonging to the haplotype exceeding 5% of the total) were screened out. 5 Using the Haps Format module in Haploview 4.2 software (Cambridge, Massachusetts, United States) to analyze the excellent haplotype sequence information from candidate genes, ANOVA of candidate gene haplotypes and phenotypes was performed by SPSS 26.0 software (Armonk, New York, USA), to determine the effect of each haplotype on phenotype. Plant cis-acting regulatory DNA elements (PLACE) 6 and plant promoters and cis-acting regulatory elements (Plant Care) 7 were used to query the function of plant promoter elements.

Resequencing of CSSLs (Parents) and Division of Bin Markers
The sizes of the sequenced data from SN14, ZYD00006, and the 213 CSSLs were 30.08 Gb, 33.38 Gb, and 1410 (Schmutz et al., 2010), and the results showed that the average alignment efficiency of SN14, ZYD00006, and the CSSLs was more than 90%. The average genome coverage depth and percentage of parents were more than 27× and 95% (covering at least 1×), respectively. The average genome depth and percentage of the CSSLs were more than 5.79× and 95.74% (covering at least 1×), respectively (Tables 1, 2). At the same time, 580,524 SNPs were identified between SN14 and ZYD00006, which indicated that there were genetic differences between cultivated soybean SN14 and wild soybean ZYD00006. These SNPs in the parents of the CSSLs also facilitate further fine mapping of QTLs.
The 580,524 SNP was divided into 3,780 bin markers. The range of bin marker length was 20 Kb-22.91 Mb, and the average marker length was 225.12 Kb. 79.26% of bin markers were less than 0.2 Mb in length, and 111 markers were more than 1 Mb in length ( Figure 1A). There were 6 bin markers greater than 10 Mb, which were distributed on Chr. 04 (Bin2350), Chr. 08 (Bin4681), Chr. 09 (Bin4865), Chr. 12 (Bin6269), Chr. 18 (Bin9389), and Chr. 19 (Bin9688). At the same time, there were differences in the number and density of bin markers on 20 soybean chromosomes ( Table 3). The highest and lowest bin marker densities were on chromosome 9 (2.33 markers/Mb) and chromosome 3 (5.50 markers/Mb), respectively. The average marker density was 4.04 markers/ Mb, and the average distance between bin markers was 25.45 Kb. These CSSLs carried 2441 introgressed segments (including 459 introgressed segments with a single bin marker). The length of these introgressed segments ranged from 0.02 Mb to 51.74 Mb. Among the above introgressed segments, 60.43% of the introgressed segments were less than 1 Mb, and large segments greater than 10 Mb accounted for 7.87% ( Figure 1B). The CSSLs many large introgressed fragments, so they need to be further expanded.

Phenotypic Variation and Statistical Analysis of CSSLs
The phenotypic data of CSSLs from 2016 to 2020 were statistically analyzed and the BLUE was calculated.  Table 2). The above results showed that the phenotypic variation in the CSSLs was plentiful, indicating that the CSSLs were suitable for genetic mapping research. The correlation between seed-pod-related traits was analyzed. PoOSPN and PoTSPN were positively correlated;  PoTHSPN and PoFSPN were positively correlated; PoOSPN were also negatively correlated with PoTHSPN and PoFSPN; and PoTSPN were also negatively correlated with PoTHSPN and PoFSPN. SL was positively correlated with SW, SL and SW were positively correlated with HSW. SL, SW, and HSW were significantly positively correlated with SPW. The above results showed that the pod-seed-related traits of soybean were not isolated, and the traits were closely related (Supplementary Table 3).

Screening Materials for Further Construction of CSSLs
Based on the information of sequencing and introduction of donor fragments, 129 lines which can be achieve the maximum coverage of CSSLs were selected from 213 CSSLs as the basis for future construction of CSSLs (Figure 3). These 129 CSSL carried 1772 introgressed segments (including 359 introgressed segments with a single bin marker; Figure 4). The average number of introgressed segments per chromosome in these lines was 88.6. The length of these introgressed segments ranged from 0.02 Mb to 47.14 Mb, and their average length was 3.3 Mb. Among the above introgressed segments, 62.19% of the introgressed segments were less than 1 Mb, 24.49% were between 1 and 5 Mb, and large segments greater than 10 Mb accounted for only 7.45%. The total length of introgressed segments in the CSSLs was 5589.34 Mb, which was 5.71 times larger than the soybean genome and covered 89.65% of the genome of wild soybean (Table 4). In the future, we will further expand the CSSLs on this basis of 129 lines to complete the construction of whole-genome substitution lines.    Table 5). The PoOSPN and SPW were no QTL detected steadily for more than 2 years. Compared with total QTLs mapped for each trait, 23, 7, 1, and 1 QTLs were repeatedly detected in 2, 3, 4, and 5 traits, respectively. Among the QTLs that were detected steadily for more than 2 years, QTLs with the highest proportions of phenotypic variation explained were qPoTSPN13-1, qPoTHSPN13-1, qPoFSPN20-3, qSL17-1, qSW17-1, and qHSW17-3, respectively ( Table 5). qPoTSPN13-1 and qPoTHSPN13-1 are the same locus detected on chromosome 13, and the proportions of phenotypic variation explained were 20.48 and 16.14%, respectively. qPoFSPN20-3 is a QTL detected on chromosome 20, and the proportion of phenotypic variation explained was 30.69%. qSL17-1 and qSW17-1 are the same locus detected on chromosome 17, and the proportions of phenotypic variation explained were 28.76 and 16.54%, respectively. qHSW17-3 is a QTL detected on chromosome 17, and the proportion of phenotypic variation explained was 37.58%. When comparing QTLs related to different traits that were stably detected for more than 2 years, we found that two QTLs, qPoTSPN18-1  and qSL18-1, were detected at locus Bin9163; three QTLs, qPoTSPN10-1, qPoFSPN10-1, and qSW10-2, were detected at locus Bin5399; two QTLs, qPoTSPN13-1 and qPoTHSPN13-1, were detected at locus Bin6769; three QTLs, qPoTSPN20-3, qPoTHSPN20-1, and qPoFSPN20-3, were detected at locus Bin10791; and the locus Bin8866 was detected as qSL17-1, qSW17-1 and qHSW17-4 ( Figure 5). Compared with the QTL detected only once, the QTL repeatedly mapped between different traits or between different years of the same trait has higher feasibility. The above QTLs will be the focus of further research.

Establishment of a Secondary Segregating Population and Fine Mapping
R183 was a material in CSSLs with extremely small HSW; several HSW QTLs stably detected more than 2 years were located in the introgressed segment of R183 (bin2808, bin5553, and bin6188). The secondary segregating population was constructed by backcrossing R183 (12.74 g) with SN14 (19.73 g; Supplementary Figures 2A,B) to identify candidate genes regulating soybean HSW. Forty-five pairs of SSR markers with polymorphisms between parents were screened. Preliminary mapping of QTLs employing the secondary segregating population R183-F 2 and the 45 SSR markers detected 5 QTLs (Supplementary Figure 2C; Table 6). Among them, qHSW-12-1 was located between two flanking markers, M371 (7.27 Mb) and M570 (11.68 Mb). qHSW-12-1 had the highest proportion of phenotypic variation explained (21.42%). Next, fine mapping of qHSW-12-1 was performed ( Figure 6A).
One heterozygous plant in the target interval was screened from the secondary segregating population R183-F 2 , and 74 R183-F 2:3 plants were obtained to narrow the target interval. Fine mapping of qHSW-12-1 was performed by the secondary segregating population R183-F 2:3 and 5 SSR markers in the target interval. qHSW-12-1 was mapped between two flanking markers, M371 (7.27 Mb) and M454 (8.50 Mb; Supplementary Figures 2D,E). The fine mapping QTL interval was larger, so individuals with heterozygous target segments were selected to construct the RHL population, and a total of 144 individuals were obtained. Ten SSR markers were employed to increase the density of primers in the target interval. Finally, qHSW-12-1 was narrowed down to a 64 kb region between M371 (7.27 Mb) and M374 (7.33 Mb; Figure 6B; Supplementary Figures 2F,G).
According to the Williams 82 soybean reference genome sequence in the Phytozome database (See Footnote 4), the genome interval of qHSW-12-1 contains a total of 5 genes (Glyma.12G088700, Glyma.12G088800, Glyma.12G088900, Glyma.12G089000, and Glyma.12G089100; Figure 6B). Glyma.12G088700, a transcription factor related to drought resistance and salt tolerance (Liao et al., 2008;, showed no difference between the parents in the sequences of the upstream 2000 bp promoter or the CDS region. Glyma.12G088800 has no genetic annotation, and quantitative real-time PCR (qRT-PCR) showed that the gene was not expressed during seed development periods. Glyma.12G089000 (annotated as transmembrane 9 superfamily member 5) is a gene that has not been studied at present. qRT-PCR showed that there was a significant difference in gene expression between parents at the COT and EM1 stages, but there was no significant difference at the EM2 and MM stages. Glyma.12G089100 is also called GmPHR20. Its homologous gene (GmPHR25) in soybean is an important regulator of the phosphorus signal network (Xue et al., 2017), and its homologous gene (AT3G04030) in Arabidopsis may be involved in nitrogen reuse (Nakano et al., 2017). However, qRT-PCR showed that there was no difference in the relative expression of the gene between parents.  There were 7 SNP differences in the promoter region and 2 SNP differences in the CDS region in Glyma.12G088900 between parents (Supplementary Figures 3-5). qRT-PCR showed that there were extremely significant differences in the gene during the seed development periods of both parents (Figure 7A; Supplementary Table 5). Dnasp5.0 software was used to analyze the haplotype distribution of Glyma.12G088900 SNP sequences in germplasm resource populations. There was an extremely significant difference in HSW between two excellent haplotypes, Hap1, with the same genotype as R183, and Hap2, FIGURE 5 | The QTLs were detected steadily for more than 2 years. The QTLs detected steadily for 2 years were yellow; The QTLs detected steadily for 2 years were green; The QTLs detected steadily for 5 years were red.

Analysis of Epistasis Between Loci
The epistatic effects were analyzed by the BLUE values of 8 traits in 5 years. The results showed that 955 pairs of epistatic QTLs were obtained from the other 6 traits except PoFSPN and SPW (Figure 8; Supplementary Table 7). Among these epistatic QTLs, the LOD values were between 10.01 and 36.08, and the proportions of phenotypic variation explained were between 0.11 and 12.85%. There were 171, 79, 6, 38, 282, and 379 pairs of epistasis in HSW, PoOSPN, SL, SW, PoTSPN, and PoTHSPN, respectively. Among them, the maximum proportions of phenotypic variation explained were 0.63, 0.46, 12.85, 1.60, 0.53, and 0.35%, respectively, and the minimum proportions were 0.14, 0.28, 5.52, 1.12, 0.16, and 0.11%, respectively. Except for SL, the proportions of phenotypic variation explained of epistasis about other traits were lower, indicating that the main effect QTL played an important role in the regulation of target traits. In HSW, 19 pairs of epistasis occurred between two major QTL, the other occurred between a major QTL and a locus, and a total of 129 pairs played a positive role in increasing HSW. qHSW17-3 not only had epistatic interaction with 5 major QTL (qHSW3-2, qHSW5-3, qHSW4-1, qHSW18-2, qHSW20-3), but also had epistatic interaction with 110 motifs, indicating that the QTL may play an important role in the regulation of HSW.
Seventy-eight pairs of epistasis related to PoOSPN occurred between one major QTL and one locus, only one pair of epistasis occurred between two major QTL. Among them, 59 pairs played a positive role in reducing PoOSPN. All the above 79 pairs of epistasis were related to qPoOSPN14-1, indicating that qPoOSPN14-1 played an important role in regulating PoOSPN.
Six pairs of epistasis related to SL occurred between one major QTL and one locus, in which the pairs of epistasis between qSL16-1 and 2 loci had negative effects on SL increase and the pairs of epistasis between qSL17-1 and 4 loci played positive roles in SL increase. The results showed that qSL17-1 may play an important role in regulating SL.
Thirty-eight pairs of epistatic effects related to SW occurred between a major QTL and a locus, in which 20 pairs of epistasis had negative effects on the increase of SW. There were epistatic interactions between qSW1-2 and 34 loci, indicating that qSL17-1 may play an important role in regulating SW.
Eighty-eight pairs of epistasis related to PoTSPN occurred between two major QTL, and the other occurred between one A B FIGURE 6 | QTL fine mapping. (A) The qHSW-12-3 locus was detected on chromosome 12. Positional mapping narrowed the qHSW-12-3 locus to a 64.4 kb region between M371 and M374. The numbers above the bars indicated the number of recombinants. (B) Fine-map the genotype of the population and the significance of the corresponding phenotype. Gray indicates Suinong14 genotype, black indicates ZYD00006 genotype. a, b: Different letters represent significant differences between each other at the 0.01 level.
major QTL and one locus. A total of 261 pairs of epistatic played positive roles in reducing PoTSPN. qPoTSPN11-1, qPoTSPN13-1 and qPoTSPN20-3 not only had epistatic interactions with 6, 4, and 4 major QTLs, respectively, but also had epistatic interactions with 14, 27, and 28 loci, respectively, indicating that the three QTLs may play important roles in regulating PoTSPN.
One hundred sixty pairs of epistasis related to PoTHSPN occurred between two major QTLs, and the other occurred between one major QTL and one locus. qPoTHSPN11-1, qPoTHSPN13-1, and qPoTHSPN20-1 had epistatic interactions not only with 4, 2, and 4 major QTLs, respectively, but also with 51, 56, and 45 loci, respectively, indicating that the three QTLs may play important roles in regulating PoTHSPN.

DISCUSSION
Wild soybean was domesticated approximately 6,000-9,000 years ago (Zhou et al., 2015), and is recognized as the wild ancestor of cultivated soybean (Zhang et al., 2018). In the process of domestication, due to purposeful selection by human beings, the genetic diversity of cultivated soybean became much lower than that of wild soybean (Hyten et al., 2006;Lam et al., 2010). In this study, CSSLs were constructed between cultivated soybean SN14 and wild soybean ZYD00006. Through multigeneration backcrossing, the introduction of the wild soybean genome greatly enriched the genetic base of cultivated soybean, which led to rich phenotypic variation in the CSSLs. For example, the maximum HSW is 25.69 g, the minimum is 9.83 g; the maximum SPW is 38.68 g, the minimum is 11.62 g; and the maximum PoTSPN is 59.06%, the minimum is 15.08% (Figure 2; Supplementary Table 1). In this study, we also employed SNP markers to divide the soybean genome into 3,780 bin markers. High-density bin markers can improve the efficiency and accuracy of QTL mapping, which makes it possible to perform fine mapping of QTLs and gene mining.
In the next step of research, 129 lines covering 89.65% of wild soybean genome were selected from CSSLs and hybridized with SN14. MAS was used to shorten the introgressed fragments, purify the genetic background of the CSSLs and improve the genome coverage of wild soybean. Finally, we will complete the construction of whole-genome substitution lines so that they have greater utilization value.
In order to increase the accuracy of QTL mapping, we integrated the phenotypes in different years to calculate the BLUE value of each trait. BLUE, derived from the linear mixed model, has been popularly used to estimate animal and plant breeding values for a few decades (Kim, 2020). BLUE can more effectively eliminate the deviation caused by the environment and improve the accuracy FIGURE 8 | Epistasis analysis of CSSLs. Epistasis of seed-pod traits that have repeatedly appeared over the years. Dark blue represents the epistatic effect among HSW loci, red represents the epistatic effect among PoOSPN loci, black represents the epistatic effect among SL loci, yellow represents the epistatic effect among SW loci, green represents the epistatic effect among PoTHSPN loci, gray represents the epistatic effect among PoTSPN loci. of QTL mapping. Muqaddasi et al. (2017) studied anther extrusion of wheat accessions by genome-wide association studies with BLUE of anther extrusion. Then, they predicted the candidate genes which could be directly or indirectly involved in anther extrusion.  employed the BLUE of soybean plant height and the number of nodes on the main stem for QTL mapping, and predicted two candidate genes for simultaneous regulation of these two traits. In addition, we selected appropriate method to improve the accuracy of QTL mapping. Because most CSSLs used for QTL mapping are non-idealized CSSLs (There is only one import fragment in each CSSL.; Li et al., 2015;Bux et al., 2019). There are redundant markers, and there may be multicollinearity between markers. The goal of QTL mapping is to explore the impact of different markers on target traits and quantitatively describe their impact on target traits. Multicollinearity between markers will reduce the positioning ability and calculation accuracy, so multicollinearity must be controlled. Wang et al. (2006) developed RSTEP-LRT model for QTL mapping of non-idealized CSSLs. Firstly, the model will evaluate the multicollinearity between different markers and delete redundant markers. However, there is no theoretical method to find an appropriate threshold to calculate whether the multicollinearity is high. In this study, we used the condition number −1,000 as the threshold of multicollinearity, which was one preset value in the software. It is equivalent to deleting duplicate markers, to reduce the adverse effect of multicollinearity on QTL mapping.
QTL mapping is based on the difference in natural alleles among parent varieties (Ma et al., 2019), and its mapping accuracy is limited by the mapping population. CSSLs can eliminate background interference during QTL mapping and maintain high mapping accuracy when QTLs are located on very small fragments. In recent years, CSSLs have been widely used in QTL mapping and candidate gene mining (Balakrishnan et al., 2019). Huo et al. (2017) used CSSLs to fine map the number of grains per panicle and verify the function of the candidate gene NOG1. Yang et al. (2017) fine mapped soybean HSW, protein content, and oil content using CSSLs. The bidirectional sucrose transporter gene Glyma.15 g049200 was predicted to be a pleiotropic regulatory gene that regulates HSW, protein content, and oil content. In our study, QTL mapping of seed-pod traits in wild soybean was carried out by CSSLs of wild soybean. Eight QTLs related to HSW were identified for more than 2 years, of which four QTLs coincided with previous mapping results (Hyten et al., 2004;Gai et al., 2007;Kuroda et al., 2013), namely qHSW9-2, qHSW11-2, qHSW17-3, and qHSW17-3. Five-seed length QTLs and four seed width QTLs were reported steadily for more than 2 years, and two of them were seed width QTLs located on chromosome 1 for the first time. qPoTSPN20-3, qPoTHSPN20-1, and qPoFSPN20-3 are the same locus detected on chromosome 20. This is the only QTL detected that has been stable for more than 2 years and is related to the POTSPN, PoTHSPN, and PoFSPN. Compared with the results of previous studies, we found that the QTL detected on chromosome 20 coincided with the cloned Ln (Sayama et al., 2017), and Ln had been demonstrated to have a polymorphism to control the leaf shape and pod number for soybean. The above results reflect the accuracy and effectiveness of this set of CSSL materials in determining the positions of QTLs.
To explore the excellent alleles regulating HSW, we constructed a secondary segregation population by backcrossing CSSL-R183 (which had an extremely small HSW) with SN14. The HSW QTL was mapped to the 64.4 Kb region of chromosome 12 by two fine mapping. Glyma.12G088900 was preliminarily identified as a candidate gene by sequence analysis of genes in the 64.4 kb region, relative expressivity analysis at important developmental stages, and haplotype analysis. Glyma.12G088900 belongs to the F-box/LRR family of proteins, which can regulate a variety of activities, such as regulating plant floral organ development, responding to biotic and abiotic stresses, and regulating growth, development and crop yield (Chen et al., 2013;Jia et al., 2015;Baute et al., 2017;He et al., 2017). At the same time, the homologous gene in Arabidopsis (AT1G67190) can interact with SKP1-like protein to form the SCF complex and regulate grain development, in which ASK1 and ASK2 have the function of regulating embryogenesis and seed development (Liu et al., 2004;Kuroda et al., 2012).
Many studies have shown that epistatic effects were important factors affecting agronomic traits of soybean and important genetic components of quantitative trait variation, and the epistatic effects between major QTLs had very important effects on quantitative traits (Jiang et al., 2018). Qi et al. (2017) used RIL and CSSLs to carry out QTL mapping and epistatic analysis of soybean oil traits, and a total of 11 QTLs and 4 pairs of epistatic QTLs were detected, which provided important information for soybean molecular-assisted breeding. Balakrishnan et al. (2019) used CSSLs to carry out QTL mapping and epistatic analysis. A total of 22 QTLs and 17 pairs of significant epistatic QTLs for 11 traits were detected, which provided valuable resources for gene discovery and yield improvement. In this study, 955 pairs of epistatic QTLs were detected by CSSLs to analyze seed-pod traits of soybean. Among them, qHSW17-3, qPoOSPN14-1, qPoTSPN11-1, qPoTSPN13-1, qPoTSPN20-3, qPoTHSPN11-1, and qPoTHSPN20-1 have epistatic interactions not only with other major QTLs, but also with a large number of loci. But the proportions of phenotypic variation explained of above epistasis were low, the highest is only 0.62%. However, when a large number of epistatic effects were aggregated, it was still an important part that cannot be ignored in actual production. For breeding programs to improve seed-pod traits, in addition to the most important process of aggregating reliable QTLs (especially the main QTLs), attention should also be paid to the loci and interaction effects that interact with main QTLs, then the QTLs can make the greatest contribution to target traits (Jiang et al., 2011). In addition, bin5858 (qPoTSPN11-1, qPoTHSPN11-1) had epistasis interactions with bin6769 (qPoTSPN13-1, qPoTHSPN13-1) and bin10791 (qPoTSPN20-3, qPoTHSPN20-1) in PoTSPN, PoTHSPN, respectively. Interestingly, the two pairs epistasis resulted in a decrease in PoTSPN, while an increase in PoTHSPN. The results showed that seed-pod traits were often closely related, and the same locus can not only directly affect different pod characters, but also indirectly regulate them through epistasis (Fang et al., 2017;Yang et al., 2021). The loci controlling several pod traits and the loci with a large number of epistasis found in this study can provide valuable information for molecular-assisted breeding of soybean.

CONCLUSION
In this study, 580,542 SNPs were identified between SN14 and ZYD00006, and were divided into 3,780 bin markers. A total of 3,780 bin markers were used for QTL mapping of seed-podrelated traits in CSSLs. A total of 170 QTLs were detected, of which 32 QTLs appeared stably in different years. A secondary segregation population was constructed for HSW, and qHSW-12-1 was mapped to the 64.4 Kb region of chromosome 12. qRT-PCR showed that there were significant differences in the gene during seed development in the two parents. Further analysis showed that there was a difference in Glyma.12G088900 in the promoter region and an amino acid difference in the coding region, which may affect HSW. A total of 955 pairs of epistatic sites were detected in epistatic analysis. Among them, 161 QTL mapped by CSSLs have epistatic effect. The construction of wild soybean CSSLs provides excellent genetic materials for broadening the genetic base of cultivated soybean and mining important candidate genes, and the genetic study of related traits lays a theoretical foundation for the study of seed-pod traits in soybean.

DATA AVAILABILITY STATEMENT
The datasets presented in this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found in the article/Supplementary Material.

AUTHOR CONTRIBUTIONS
CL, HJ, and QC conceived the study and designed and managed the experiments. MY, CL, and ZQ provided plant lines. HZ, FC, and RW performed trials and collected data. ZZ, RZ, DX, and HZ completed statistical analyses of phenotypic data. HZ, LH, and JX contributed to writing the paper. All authors contributed to the article and approved the submitted version.