Identification and Mapping of a New Soybean Male-Sterile Gene, mst-M

Sterility is common in plants, and multiple loci for hybrid sterility have been identified in crops such as rice. In soybean, fine-mapping of male-sterility loci and research on the molecular mechanism of male sterility are limited. Here, we identified a male-sterile soybean line that produces larger, abnormal pollen grains that stain poorly with I2-KI. In an inheritance test, all F1 plants were fertile, and the F2 and F2:3 populations conformed to the expected segregation ratio of 3:1 (fertility:sterility) (p = 0.82) and showed a 1:2:0 ratio of homozygous fertile:heterozygous fertile:homozygous sterile genotypes (p = 0.73), suggesting that the sterility was controlled by a single recessive gene (designated "mst-M"). Bulked segregant analysis showed that almost all single-nucleotide polymorphisms (SNPs; 95.92%) were distributed on chromosome 13, and 868 SNPs (95.81%) were distributed in the physical region of Chromosome 13.21877872 to Chromosome 13.22862641. Genetic mapping revealed that mst-M was flanked by W1 and dCAPS-1 with genetic distances of 0.6 and 1.8 cM, respectively. The order of the consensus markers and known sterility genes was: Satt146 – (5.0 cM) – st5 – (2.5 cM) – Satt030 – (15.3 cM) – ms6 – (5.0 cM) – Satt149 – (39.5 cM) – W1 – (0.6 cM) – mst-M – (14.1 cM) – Satt516 – (7.5 cM) – ms1 – (16.3 cM) – Satt595. These results suggest that mst-M is a newly identified male-sterility gene, which represents an alternative genetic resource for developing a hybrid seed production system for soybean.


INTRODUCTION
Sterility is a common phenomenon among plants. On the basis of the mode of inheritance, two main types of sterility have been identified in plants: cytoplasmic sterility and nucleus-dependent sterility (Zhang et al., 2008; Chen and Liu, 2014; Yang et al., 2014; Speth et al., 2015; Bohra et al., 2016; Chang et al., 2016; Liu et al., 2018; Xie et al., 2018). In plant breeding, hybrid seeds/lines are advantageous because they produce high yields. In comparison to female sterility, male sterility, including cytoplasmic male sterility (CMS) and genetic male sterility (GMS), has wide applications in commercial crop hybrids because male sterility greatly increases the effectiveness of F1 hybrid seed production without manual pollination and can dramatically reduce production costs (Cheng et al., 2007; Xu et al., 2007; Chen et al., 2011; Huang et al., 2014). The most successful application of male sterility in crop hybrid seed production is in rice (Oryza sativa; Cheng et al., 2007; Huang et al., 2014). In recent years, many male-sterility genes have been identified and cloned, and the underlying genetic and molecular mechanisms have been described. For example, MS1, a newly evolved gene in wheat (Triticum aestivum; Poaceae), is specifically expressed in microsporocytes and is essential for microgametogenesis. OsPKS2 encodes a polyketide synthase that is involved in pollen wall formation in rice, and mutation of OsPKS2 can cause male sterility (Zou et al., 2018). Two-line hybrid rice was developed based on the discovery of photoperiod-sensitive male sterility (PSMS) germplasm, which is male-sterile under long-day conditions but fertile under short-day conditions (Fan et al., 2016). Recently, the molecular mechanism of a PSMS gene, Pms1, which encodes a long non-coding RNA, PMS1T, was elucidated (Fan et al., 2016). In addition, other male-sterility genes, such as ZmMs33 in maize (Zea mays; Xie et al., 2018) and MSH1 in Brassica juncea, have been cloned and well characterized.
Soybean is an economically important crop that is grown worldwide for the high contents of oil (20-25%) and protein (42-45%) in its seeds (Adak and Kibritci, 2016). Although soybean is extremely important for human consumption, it has a lower yield than other important food crops, such as wheat, rice, and maize (Wen et al., 2016; Gao et al., 2018; Ma et al., 2018). Although a three-line system based on CMS has been developed in soybean, large-scale hybrid seed production is difficult to achieve because of the extremely low frequency of natural cross-pollination (Ray et al., 2003). Therefore, the application of male sterility in soybean hybrid seed-production research has typically lagged behind other crops. At least three CMS restorer loci have been identified in soybean. The CMS restorer locus Rfm was fine-mapped to chromosome 16 between the flanking markers GmSSR1602 and GmSSR1610 with genetic distances of 0.11 and 0.2502 cM, respectively, and a pentatricopeptide repeat gene was predicted to be the candidate restorer gene. Yang et al. (2010) mapped two independent Rf loci linked to Satt626 in molecular linkage group (MLG) M and Satt300 in MLG A1 at genetic distances of 9.75 and 11.18 cM, respectively. With regard to GMS, more than 20 male-sterility loci have been reported, including ms1-ms9, msMOS, msp, st1-st8, ASR-7-206, A03-2137, A05-133, A06-204, and st_A06-2/6 (Owen, 1928; Hadley and Starnes, 1964; Palmer, 1974, 2000; Palmer et al., 1978, 2008; Delannay and Palmer, 1982; Palmer and Kaul, 1983; Palmer, 1985, 1988; Graybosch et al., 1987; Horner and Palmer, 1995; Jin et al., 1997, 1998; Palmer and Horner, 2000; Cervantes-Martinez et al., 2007; Rebeccaa et al., 2011; Baumbach et al., 2012), the majority of which have been mapped to a linkage group (Yang et al., 2014; Speth et al., 2015). However, none of these loci have been successfully cloned or fine-mapped. Therefore, the molecular mechanisms of male sterility in soybean are largely unknown.
In this study, we identified a soybean male-sterile mutant line from a soybean breeding line. We investigated the genetic mechanism of the mutant line and fine-mapped the locus for male sterility. Our results provide a foundation for cloning male-sterility genes and elucidating the molecular mechanism of male sterility in soybean.

Plant Materials
In our soybean breeding program, we observed an advanced-generation soybean breeding line (here designated "Fertility wild type", "F-wt") that showed segregation for sterility. The sterile line is here designated "Sterility-Mutant" (St-M). The St-M line was preserved in the heterozygous F-wt/St-M line. Soybean 'Jidou No. 12' (JD12) was used as the male or female parent in crosses with St-M. F1, F2, F2:3, BCF1, BCF1:2, and BCF2 populations and an F5:6 residual heterozygous (RH) line were developed to investigate the genetic mechanism of sterility and fine-map the sterility gene in St-M.

Morphological Analysis
Anthers from unopened flowers were excised and stained in 1% iodine-potassium iodide (I2-KI) solution. The morphology of stained pollen grains, including the shape, size, and color of the grains, was observed under a Leica EZ4 HD microscope. Images of stained pollen grains were captured using a Leica DM2000 LED microscope. The total number of pollen grains of each anther in the visual field was counted manually, with nine replicates for each phenotype.
Mature pollen grains of St-M and F-wt plants were dissected and immediately examined under a scanning electron microscope (SEM) to determine morphological differences between the two lines following Willingham and Rutherford (1984). The diameter of the observed pollen grains was measured according to the scale of the pollen SEM images.

Statistical Analysis
The segregation of sterility phenotypes was evaluated in the F1, F2, F2:3, BCF1, BCF1:2, and BCF2 populations. Goodness of fit to theoretical ratios was assessed using the chi-square (χ²) test in accordance with methods described by Bailey (2012), i.e., $\chi_c^2 = \sum (|E_O - E| - 0.5)^2 / E$ or $\chi^2 = \sum (E_O - E)^2 / E$, where $E_O$ and $E$ are the observed and expected frequencies of each phenotypic class, respectively, and the sum runs over all classes. The Yates correction factor (0.5) was applied in the $\chi_c^2$ calculation with one degree of freedom (Brown, 2004).
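The goodness-of-fit calculation described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' script: the function names are hypothetical, the counts are invented for demonstration and are not the study's data, and the p-value shortcut via the complementary error function holds only for one degree of freedom (two phenotypic classes).

```python
import math


def chi_square_gof(observed, ratio, yates=False):
    """Chi-square statistic for observed class counts against an expected ratio.

    With yates=True, 0.5 is subtracted from each |observed - expected|
    before squaring (Yates continuity correction).
    """
    total = sum(observed)
    expected = [total * part / sum(ratio) for part in ratio]
    correction = 0.5 if yates else 0.0
    return sum(
        (abs(o - e) - correction) ** 2 / e for o, e in zip(observed, expected)
    )


def p_value_df1(chi2):
    """Upper-tail p-value of a chi-square statistic with ONE degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2.0))


# Hypothetical counts (NOT from the study): 290 fertile and 110 sterile plants.
observed = [290, 110]
chi2 = chi_square_gof(observed, ratio=(3, 1))
chi2_c = chi_square_gof(observed, ratio=(3, 1), yates=True)

print(f"chi2 = {chi2:.3f}, p = {p_value_df1(chi2):.3f}")
print(f"Yates-corrected chi2 = {chi2_c:.3f}, p = {p_value_df1(chi2_c):.3f}")
```

A p-value well above 0.05 means the observed counts do not deviate significantly from the tested ratio, which is how the 3:1 and 1:1 fits reported in this study are interpreted.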

Development of dCAPS Markers
Primers for the derived cleaved amplified polymorphic sequence (dCAPS) markers used in this study were designed using the dCAPS Finder 2.0 program (Neff et al., 2002). Because no restriction site adjacent to the mutation site was available, a single nucleotide mismatch was introduced adjacent to the single-nucleotide polymorphism (SNP) position to create a restriction enzyme recognition site in the amplicon of one allele but not the other. The genotypes of the tested soybean plants were determined with the dCAPS marker. The PCR amplification reactions were carried out as follows. The reaction mixture contained 1 µL (50 ng) template DNA, 2.5 µL of 10× Ex Taq™ HS buffer, 2 µL of 0.2 mM dNTPs, 0.125 µL (1 U) Ex Taq HS DNA polymerase, 0.5 µL (15 pmol) of each forward and reverse primer, and water to make up the volume to 25 µL. The following PCR amplification procedure was adjusted according to the primer set used: 95 °C for 5 min, and then 30 cycles of 95 °C for 30 s, 53-58 °C for 30 s, and 72 °C for 30 s, followed by a final extension at 72 °C for 10 min. The amplicons were digested using suitable restriction endonucleases in a final volume of 25 µL in accordance with the manufacturer's instructions. The digested products were separated by electrophoresis in a 3% agarose gel and visualized by staining with ethidium bromide.
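The reaction recipe above leaves the water volume implicit. The short sketch below computes it from the listed component volumes; the dictionary keys and the helper name `water_volume` are illustrative choices, and only the volumes themselves come from the protocol.

```python
REACTION_VOLUME_UL = 25.0

# Component volumes (µL) as listed in the protocol text.
components_ul = {
    "template DNA": 1.0,
    "10x Ex Taq HS buffer": 2.5,
    "dNTPs": 2.0,
    "Ex Taq HS DNA polymerase": 0.125,
    "forward primer": 0.5,
    "reverse primer": 0.5,
}


def water_volume(components, total_ul):
    """Return the water volume needed to bring the mix up to total_ul."""
    used = sum(components.values())
    if used > total_ul:
        raise ValueError("components exceed the total reaction volume")
    return total_ul - used


water_ul = water_volume(components_ul, REACTION_VOLUME_UL)
print(f"Water per reaction: {water_ul} µL")
```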

DNA Extraction and SSR Marker Analysis
Total genomic DNA was extracted from soybean leaves using the cetyltrimethylammonium bromide (CTAB) method following Doyle (1991) with minor modifications. Sixteen simple sequence repeat (SSR) markers located on soybean chromosome 13 flanking the male-sterility gene were selected from the SoyBase database. Marker polymorphism between St-M and JD12 was evaluated. Finally, three polymorphic SSR markers, namely Satt516, Satt146, and Satt149, were selected for genotyping of progenies. For SSR analysis, amplification reactions were carried out as described for the dCAPS marker analysis. The PCR products amplified by the SSR primers were visualized after electrophoresis in an 8% polyacrylamide gel followed by silver staining. All primers were synthesized by Shanghai Invitrogen Biotechnology Co., Ltd. (Shanghai, China).

Bulked Segregant Analysis Based on Genomic DNA Resequencing
Based on the phenotypes of the offspring, a single RH plant segregating for sterility and fertility was identified and selected from the F6 generation. Seeds from the RH plant were individually harvested to form a sub-F2 population, which consisted of 135 plants. Equal amounts of leaves from male-sterile F2 plants were sampled and homogenized, whereas leaf samples from male-fertile F2 plants were collected individually. The genotypes of the male-fertile F2 plants were determined based on their progenies. Equal amounts of leaves from homozygous F2 plants were sampled and homogenized. Finally, two pools were formed from leaves of 34 plants homozygous for male sterility and leaves of 36 plants homozygous for male fertility, respectively. Genomic DNA was extracted from the homogenized sample pools using the established CTAB protocol (Doyle, 1991). The genomic DNA was used to construct sequencing libraries with the TruSeq Library Construction Kit following the manufacturer's protocol, and the libraries were sequenced using an Illumina HiSeq platform. The sequencing reads were aligned to the soybean reference genome Williams 82.a2.v1 using the BWA software with default parameters. Subsequent processing, including removal of duplicate reads, was performed using SAMtools and PICARD. The raw SNP sets were called by SAMtools with the parameters '-q 1 -C 50 -m 2 -F 0.002 -d 1000'. We then filtered the data sets using the following criteria:

Scoring and Linkage Analysis
The genotype and phenotype for each F2 plant were recorded based on SSR or dCAPS allelic patterns. A plant was scored as homozygous for the male-sterile parent alleles (A), homozygous for the male-fertile parent alleles (B), or heterozygous for the F1 alleles (H). An asterisk (*) was used to represent missing data. The phenotype of male sterility and white flowers was coded as "A," and other phenotypes were coded as "D." Based on the genotype and phenotype scores, a linkage map was constructed using JoinMap 4.1 with the Kosambi mapping function (Kosambi, 2016). Linkage maps were drawn using the MapChart 2.2 software (Voorrips, 2002).
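For readers unfamiliar with the Kosambi mapping function, the sketch below shows the standard conversion between a recombination fraction r and a map distance in centimorgans, d = 25 ln[(1 + 2r)/(1 − 2r)], together with its inverse. It illustrates the formula only; the function names are hypothetical, and it does not reproduce how JoinMap estimates recombination fractions from F2 genotype data.

```python
import math


def kosambi_cm(r):
    """Map distance in cM for a recombination fraction r (0 <= r < 0.5)."""
    if not 0.0 <= r < 0.5:
        raise ValueError("recombination fraction must satisfy 0 <= r < 0.5")
    return 25.0 * math.log((1.0 + 2.0 * r) / (1.0 - 2.0 * r))


def kosambi_inverse(d_cm):
    """Recombination fraction corresponding to a Kosambi distance in cM."""
    return 0.5 * math.tanh(2.0 * d_cm / 100.0)


# For tightly linked loci the distance in cM is close to 100 * r.
for r in (0.006, 0.05, 0.10, 0.25):
    print(f"r = {r:.3f} -> {kosambi_cm(r):.2f} cM")
```

At short distances such as those reported around mst-M, the correction is negligible; it matters mainly for widely spaced markers, where double crossovers make the raw recombination fraction underestimate the distance.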

Phenotypic Characterization of Pollen Grains From Sterile and Fertile Plants
The growth and development of the sterile line (St-M) and the fertile wild-type parent (F-wt) were compared over an entire growing season. No difference in phenotype between St-M and F-wt was observed before flowering. From the R1 growth stage, St-M plants showed early abscission of flowers or development of small, fleshy but seedless pods (Figure 1b), whereas F-wt plants showed normal flower and fruit development (Figure 1a). Observation of about 3400 St-M sterile plants showed that sterility was absolute with no seed development observed. Given that St-M plants could produce small pods, we speculated that the sterility of St-M was not caused by the pistil, but rather by abnormal pollen development.
To test this hypothesis, we observed pollen grains microscopically. The pollen grains from St-M plants varied greatly in size and stained poorly with I2-KI, whereas pollen grains from F-wt plants were uniform in size and stained intensely with I2-KI (Figures 1e,f). These observations indicated that the St-M pollen grains were aborted. Observation of pollen grains from St-M and F-wt with a SEM showed that F-wt produced small, rounded pollen grains full of cytoplasm, whereas St-M pollen grains were shriveled or collapsed (Figures 1c,d). In a previous study, cytological analysis suggested that male sterility in ms1 soybean was caused by the failure of cytokinesis after telophase II of meiosis (Albertsen and Palmer, 1979). Lipids and starch were deposited in the enlarged but non-functional pollen grains (Albertsen and Palmer, 1979). To determine whether the mst-M gene was involved in a similar process, the number and diameter of pollen grains in F-wt and St-M plants were evaluated. The average total number of pollen grains per F-wt flower was about 580, whereas St-M flowers had about 150, roughly one-quarter as many (Figure 2A). The average diameter of St-M pollen grains was about 36 µm, which was approximately 1.6-fold greater than that of F-wt pollen grains (Figure 2B). We concluded that the sterility of St-M was caused by abortion of the pollen grains.
Table S1). This result implied that the pistil of St-M was normal and the sterility of St-M was caused by pollen abortion, which was consistent with the microscopic observations. The inheritance of male sterility was evaluated in six progeny populations generated from the crosses St-M (♀) × JD12 (♂) and JD12 (♀) × F1 [St-M (♀) × JD12 (♂)]. All F1 plants were fertile, and the F2 populations, which consisted of 418 plants, exhibited a good fit to the expected segregation ratio of 3:1 (fertility:sterility) (p = 0.82; Table 1).
Progeny testing of the 316 F2 plants showed a 1:2:0 ratio of homozygous fertile:heterozygous fertile:homozygous sterile genotypes (p = 0.73). The segregation results for the F1, F2, and F2:3 populations suggested that the sterility of St-M was controlled by a single recessive gene. We designated the gene that controlled sterility "male sterile-Mutant" (mst-M).
Backcrosses were also performed to determine whether cytoplasm could affect sterility in the crosses. All 22 BCF1 progeny showed normal fertility like the female parent JD12 (Table 1). The 22 BCF1 progeny displayed a 1:1 ratio of homozygous fertile:heterozygous fertile genotypes (p = 0.52). The BCF2-Seg population, which consisted of 487 BCF2 individuals from nine segregating BCF1:2 lines, showed a good fit to a theoretical ratio of 3:1 (fertility:sterility) (p = 0.20). These results supported the conclusion that the sterility of St-M was controlled by a single recessive gene, and further implied that mst-M was not affected by the cytoplasm.

Bulked Segregant Analysis Through Next-Generation Sequencing
Fertile and sterile DNA bulks, which were developed from a heterozygous fertile F5:6 line, were used for bulked segregant analysis to localize the mst-M gene to an individual chromosome through whole-genome next-generation sequencing. A total of 906 variable SNPs of high quality (>100) were identified between the fertile and sterile bulks (Supplementary Table S2). More than 76% of the SNP variation was distributed in introns, intergenic regions, and downstream of coding regions, and was unlikely to cause mst-M function deficiency, whereas 1.5, 4.53, 11.48, 0.11, and 5.85% of the SNP variation was distributed in 5′-UTR, 3′-UTR, upstream, splicing, and exonic coding regions, respectively (Figure 3A). Almost all of the SNPs (95.92%) were distributed on chromosome 13, and 868 SNPs (95.81%) were distributed in the physical region of Chromosome 13.21877872 to Chromosome 13.22862641 (Figure 3B). These results strongly suggested that the mst-M gene was located on chromosome 13, approximately in the physical region of Chromosome 13.21877872 to Chromosome 13.22862641. All of the original re-sequencing data for the fertile and sterile DNA bulks have been submitted to the SRA database (SRA accession number PRJNA509511; BioSample accession numbers SAMN10583719 and SAMN10583720, respectively).
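The percentages reported above can be cross-checked with simple arithmetic. The sketch below uses only figures stated in the text (906 SNPs in total, 868 in the candidate interval, 95.92% on chromosome 13, and the five functional-category percentages); the variable names are illustrative, and the SNP count on chromosome 13 is derived from the stated percentage rather than reported directly.

```python
TOTAL_SNPS = 906
SNPS_IN_INTERVAL = 868
PCT_ON_CHR13 = 95.92
INTERVAL_START_BP = 21_877_872
INTERVAL_END_BP = 22_862_641

# Share of all SNPs that fall inside the candidate interval.
pct_in_interval = 100.0 * SNPS_IN_INTERVAL / TOTAL_SNPS

# SNP count on chromosome 13 implied by the reported percentage.
snps_on_chr13 = round(TOTAL_SNPS * PCT_ON_CHR13 / 100.0)

# Physical length of the candidate interval.
interval_bp = INTERVAL_END_BP - INTERVAL_START_BP

# Percentages for 5'-UTR, 3'-UTR, upstream, splicing, and exonic SNPs.
functional_pct = [1.5, 4.53, 11.48, 0.11, 5.85]
remaining_pct = 100.0 - sum(functional_pct)

print(f"SNPs in interval: {pct_in_interval:.2f}%")
print(f"Implied SNPs on chromosome 13: {snps_on_chr13}")
print(f"Interval length: {interval_bp} bp")
print(f"Intronic, intergenic, and downstream SNPs: {remaining_pct:.2f}%")
```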

Genetic Mapping and Allele Analysis
To determine the genetic location of the mst-M gene and the relationship of mst-M with three previously identified male-sterility genes (ms1, ms6, and st5) (Figures 4A-C), the marker distribution on chromosome 13 was analyzed. Three polymorphic SSR markers, three developed dCAPS markers, and a morphological marker (the flower color locus W1) were used to detect the genotypes of a large F2 population consisting of 1138 individual plants (Supplementary Table S3). Genetic mapping revealed that mst-M was flanked by W1 and dCAPS-1, at genetic distances of 0.6 and 1.8 cM from mst-M, respectively (Figure 4D). Five SSR markers, namely Satt146, Satt149, Satt030, Satt516, and Satt595, were used as consensus markers to project all four male-sterility genes on an integrated map of linkage group F. The integrated linkage map showed that the order of the consensus markers and sterility genes was as follows: Satt146 – (5.0 cM) – st5 – (2.5 cM) – Satt030 – (15.3 cM) – ms6 – (5.0 cM) – Satt149 – (39.5 cM) – W1 – (0.6 cM) – mst-M – (14.1 cM) – Satt516 – (7.5 cM) – ms1 – (16.3 cM) – Satt595. These results suggested that mst-M was a newly identified male-sterility gene.
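To make the separation between mst-M and the known sterility genes explicit, the interval distances of the integrated map (as listed in the abstract) can be accumulated into map positions. The sketch below is illustrative only: it places Satt146 at 0 cM by convention, treats adjacent Kosambi distances as additive, and uses hypothetical helper names.

```python
# Adjacent intervals (left locus, right locus, distance in cM), in map order.
ORDERED_INTERVALS = [
    ("Satt146", "st5", 5.0),
    ("st5", "Satt030", 2.5),
    ("Satt030", "ms6", 15.3),
    ("ms6", "Satt149", 5.0),
    ("Satt149", "W1", 39.5),
    ("W1", "mst-M", 0.6),
    ("mst-M", "Satt516", 14.1),
    ("Satt516", "ms1", 7.5),
    ("ms1", "Satt595", 16.3),
]


def cumulative_positions(intervals):
    """Accumulate interval lengths into positions, first locus at 0 cM."""
    positions = {intervals[0][0]: 0.0}
    for left, right, distance in intervals:
        positions[right] = positions[left] + distance
    return positions


def map_distance(positions, locus_a, locus_b):
    """Absolute map distance between two loci on the same linkage group."""
    return abs(positions[locus_a] - positions[locus_b])


positions = cumulative_positions(ORDERED_INTERVALS)
for locus, position in positions.items():
    print(f"{locus:8s} {position:6.1f} cM")

for gene in ("st5", "ms6", "ms1"):
    print(f"mst-M to {gene}: {map_distance(positions, 'mst-M', gene):.1f} cM")
```

On this map, each of the three known genes lies more than 20 cM from mst-M, which is the positional basis for treating mst-M as a distinct locus.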

DISCUSSION
Two main kinds of male sterility have been well documented: CMS and GMS. For CMS, male-sterility genes are usually maintained through crossbreeding with a maintainer line, and the CMS system has been widely used in rice hybrid production. A three-line hybrid system based on CMS has also been developed in soybean, but large-scale hybrid seed production is difficult to achieve; the main limiting factor is the extremely low frequency of natural cross-pollination. In the last 10 years, many researchers have tried to find external factors that can improve the frequency of natural cross-pollination. For example, insects such as alfalfa leaf-cutting bees and honey bees can not only improve the soybean yield but also significantly enhance the outcrossing pod-set rate of CMS lines (Ortiz-Perez et al., 2008; Yang et al., 2008; Zhao et al., 2009; Wang et al., 2010; Milfont et al., 2013; Blettler et al., 2017; Dai et al., 2017), and a relatively low temperature before the flowering stage can also enhance the soybean outcrossing rate (Shimamura et al., 2010).
In contrast, GMS, in which the male-sterility genes are mainly preserved in heterozygous plants, has been successfully applied in the recurrent selection (RS) breeding system. The RS breeding system is an effective strategy that can accelerate the breeding process by pyramiding multiple elite genes. Although the need for emasculation and manual pollination limited the utility of RS in breeding processes for autogamous species, the introduction of male-sterility genes to the RS breeding system changed the situation. In wheat, more than 40 new varieties have been developed based on the DMS wheat-based RS breeding platform, which uses the dominant dwarf gene Rht-D1c and male-sterile gene Ms2 together (Liu and Yang, 1991; Ni et al., 2017; Xia et al., 2017). In soybean, an RS breeding platform has been developed using the ms1 gene, and several commercial cultivars have been developed using this RS breeding system, such as Jidou 19, Jidou 20, and Jidou 21 (Zhao et al., 2010a,b, 2011). However, ms1 plants produce a high frequency of twin seedlings, which considerably disrupts the genetic diversity of the RS breeding platform, and eliminating these useless twin seedlings is difficult and time-consuming. In contrast, the newly identified mst-M gene causes absolute sterility, which is very useful for maintaining the genetic diversity of an RS breeding platform. More importantly, mst-M is closely linked to flower color, and this linked morphological trait will help breeders accelerate the breeding process.
[Figure caption, partially recovered] Song et al. (2004). Green, bold, and italic fonts represent loci for identified genes. Blue, bold, and italic fonts with underlines are the consensus markers used for projection onto Integrated-map-F.
The exploitation of heterosis through the CMS system and the breeding of new varieties through the RS system are both economic, effective, and feasible approaches to improving crop yields. Male sterility plays an important role in the process of improving soybean yields. Unfortunately, although more than 20 genetic loci for male sterility have been identified in soybean, none of these loci have been developed for use in hybrid seed production. In comparison to research on maize or rice, there has been little fine-mapping and research on the molecular mechanism of male sterility in soybean, and none of the known male-sterility loci have been cloned using a genetic map-based approach. Therefore, newly identified male-sterility loci could provide alternative genetic material for developing hybrid seeds, and localization of the chromosomal positions of male-sterility genes will promote map-based cloning. In the present study, we identified and fine-mapped a novel male-sterility locus designated "mst-M". Genetic analysis indicated that mst-M was closely linked with the flower color locus W1, and it was comapped to chromosome 13 with ms1, ms6, and st5. Because there is no sterile maintainer line for ms1, ms6, or st5, an allelism test of mst-M with ms1, ms6, or st5 could not be performed. Therefore, the genetic locations and pollen morphology of mst-M, ms1, ms6, and st5 are compared and discussed below.
The ms1 gene was first reported in 1971, and a high frequency of twin seedlings was observed in male-sterile plants (Brim and Young, 1971; Kenworthy et al., 1973). Cytological analysis suggested that male sterility in ms1 soybean was caused by the failure of cytokinesis after telophase II of meiosis (Albertsen and Palmer, 1979). However, lipid and starch deposits in the enlarged but non-functional pollen grains developed as in normal pollen grains (Albertsen and Palmer, 1979). In the present study, the pollen grains produced by mst-M male-sterile plants stained poorly with I2-KI (Figure 1f), which suggested that few or no starch grains were formed in the pollen grains. This result implied that the molecular mechanism of sterility caused by mst-M was different from that of ms1. An additional notable difference between mst-M and ms1 male-sterile plants is the occurrence of twin seedlings; absolute sterility was observed in mst-M male-sterile plants, and no twin seedlings were seen based on observation of thousands of individual plants. Thus, we speculate that mst-M is non-allelic to ms1.
Unlike ms1, the process of meiosis in ms6 male-sterile plants is similar to that of fertile plants, with normal meiocytes forming in metaphase I and telophase II progressing normally but without microspore wall formation (Skorupska and Palmer, 1989). These observations imply that the pollen grains of ms6 male-sterile plants will not vary greatly in size or I2-KI staining intensity, which differs markedly from the pollen grains of mst-M male-sterile plants (Figures 1d,f). Furthermore, a unique pleiotropic effect of the ms6 allele is that it may cause a smaller flower size (Skorupska and Palmer, 1989), which differs from the phenotypic effects of the other male-sterility genes. These characteristics strongly suggest that mst-M is non-allelic to ms6.
Plants with the st5 gene, which was identified as a desynaptic mutant gene, may show an absence of chromosome pairing at diakinesis (Palmer and Kaul, 1983). A notable characteristic of pollen grains in st5 male-sterile plants is the wide variation in pollen grain size and poor staining with I2-KI (Palmer and Kaul, 1983). Both features are similar to those of pollen grains of mst-M male-sterile plants (Figure 1f). However, recombination percentages between st5 and W1 of about 26.7% were reported based on observation of two different F2 populations (Palmer and Kaul, 1983), and st5 was mapped to the interval between the markers Satt146 and Satt030 with genetic distances of 5.0 and 2.5 cM, respectively (Speth et al., 2015). In the present investigation, male sterility was almost always accompanied by white flowers in the 1138 individual plants used for genetic mapping, and only six plants showed recombination between mst-M and W1. We localized mst-M to the interval between locus W1 and the marker dCAPS-1 with genetic distances of 0.6 and 1.8 cM, respectively. Therefore, mst-M is distinctly non-allelic to st5 based on the predicted physical positions for mst-M and st5.
The novel male-sterility gene identified in the present study provides an alternative genetic resource for development of a hybrid seed production system for soybean. Most importantly, the closely linked W1 locus enables breeders to distinguish male-sterile plants at the seedling stage through hypocotyl color. In addition, the present study lays a foundation for map-based cloning of male-sterility genes in soybean. Elucidation of the molecular mechanism of reproductive pathways would help to facilitate the cloning of male-sterility genes.