Identification of the Important Genes of Bradyrhizobium diazoefficiens 113-2 Involved in Soybean Nodule Development and Senescence

Legume nodule development and senescence directly affect nitrogen fixation efficiency and involve a programmed series of molecular events. These molecular events are carried out synchronously by legumes and rhizobia. The characteristics and molecular mechanisms of nitrogen fixation at soybean important developmental stages play critical roles in soybean cultivation and fertilizer application. Although the gene expression of soybean were analyzed in nodules at five important soybean developmental stages, information on the expression of rhizobial genes in these nodule samples is limited. In the present study, we investigated the expression of Bradyrhizobium diazoefficiens 113-2 genes in the nodule samples from five developmental stages of soybean (Branching stage, flowering stage, fruiting stage, pod stage and harvest stage). Similar gene expression patterns of B. diazoefficiens 113-2 were existed during optimal symbiotic functioning, while different expression patterns were found among early nodule development, nitrogen fixation progress and nodule senescence. Besides, we identified 164 important different expression genes (DEGs) associated with nodule development and senescence. These DEGs included those encoding nod, nif, fix proteins and T3SS secretion system-related proteins, as well as proteins involved in nitrogen metabolism, ABC transporters and two-component system pathways. Gene Ontology, KEGG pathway and homology analysis of the identified DEGs revealed that most of these DEGs are uncharacterized genes associated with nodule development and senescence, and they are not core genes among the rhizobia genomes. Our results provide new clues for the understanding of the genetic determinants of soil rhizobia in nodule development and senescence, and supply theoretical basis for the creation of high efficiency soybean cultivation technology.

Legume nodule development and senescence directly affect nitrogen fixation efficiency and involve a programmed series of molecular events. These molecular events are carried out synchronously by legumes and rhizobia. The characteristics and molecular mechanisms of nitrogen fixation at soybean important developmental stages play critical roles in soybean cultivation and fertilizer application. Although the gene expression of soybean were analyzed in nodules at five important soybean developmental stages, information on the expression of rhizobial genes in these nodule samples is limited. In the present study, we investigated the expression of Bradyrhizobium diazoefficiens 113-2 genes in the nodule samples from five developmental stages of soybean (Branching stage, flowering stage, fruiting stage, pod stage and harvest stage). Similar gene expression patterns of B. diazoefficiens 113-2 were existed during optimal symbiotic functioning, while different expression patterns were found among early nodule development, nitrogen fixation progress and nodule senescence. Besides, we identified 164 important different expression genes (DEGs) associated with nodule development and senescence. These DEGs included those encoding nod, nif, fix proteins and T3SS secretion system-related proteins, as well as proteins involved in nitrogen metabolism, ABC transporters and two-component system pathways. Gene Ontology, KEGG pathway and homology analysis of the identified DEGs revealed that most of these DEGs are uncharacterized genes associated with nodule development and senescence, and they are not core genes among the rhizobia genomes. Our results provide new clues for the understanding of the genetic determinants of soil rhizobia in nodule development and senescence, and supply theoretical basis for the creation of high efficiency soybean cultivation technology.

INTRODUCTION
The symbiotic relationship between legume and soil rhizobia leads to the formation of root nodule and fixation of atmospheric nitrogen (Ferguson et al., 2010). Depending on the persistence of the nodule meristem and morphology characteristics of nodules, nodules in legumes can be divided into determinate nodules and indeterminate nodules (Hirsch, 1992). Nodules formed in Glycine max and Lotus japonicus are determinate nodules, they have non-persistent meristems and predominately contain one specific form of the bacterial development, which is determined by the nodule's age (Karr and Emerich, 1988;Karr et al., 1990). Indeterminate nodules, such as those in Medicago truncatula and Astragalus sinicus, contain persistent meristems and a mixture of bacteroid forms in all stages of development (Vasse et al., 1990). In both determinate and indeterminate nodules, nodule development directly affects nitrogen fixation activity, and the premature senescence of nodules negatively regulates the efficiency of symbiotic nitrogen fixation (Serova et al., 2018). Delaying nodule senescence is a practical measure to increase nitrogen fixation amount and solve the problem of late defertilization in legume crops. Therefore, studies on the root nodule development and senescence are of great significance in the efficient use of nitrogen sources and legume crop yield.
Legume nodule development is initiated by an exchange of chemical signals between legumes and soil rhizobia, and included lots of protein-protein interactions between legumes and soil rhizobia proteins . The leguminous molecular events involved in nodulation and nodule development are quite well understood (Li Q. G. et al., 2013;Mergaert et al., 2020;Roy et al., 2020). The characteristics and the corresponding detection methods, the key leguminous genes currently known to be involved in the regulation of nodule senescence and various abiotic factors affecting nodule senescence were summarized in our previous review (Zhou et al., 2021). Except for legumes, the soil rhizobia also play critical roles in symbiotic nitrogen fixation progress (Lindström and Mousavi, 2020). Because Nod factors (NFs), surface polysaccharides and secreted proteins are three main determinants of host specificity in most rhizobia, the rhizobial genes that affect the biological synthesis of these signaling molecules were explored in nodulation, including nod, nif, fix genes, surface polysaccharides biosynthesis genes and secretion system-related genes Li et al., 2020). Besides, it has been demonstrated that several rhizobial genes play key roles in nodule senescence. SmgshB is a critical gene for the synthesis of glutathione, which is a carbon source that can ensure the normal development of nodules and can suppress the early senescence of nodules (Cheng et al., 2017;Yang et al., 2020). lrpL-acdS genes, which encode an ACC deaminase enzyme, are associated with nitrogen fixation and delayed nodule senescence (Tittabutr et al., 2015). The sigma factors RpoH1 and RpoH2 are involved in different stress responses and negatively regulate nodule senescence (Martínez-Salazar et al., 2009). Anabaena variabilis flavodoxin, a protein involved in the response to oxidative stress, can significantly delay nodule senescence in alfalfa (Redondo et al., 2009). Therefore, both leguminous genes and rhizobial genes are critical to nodule development and senescence.
Soybean (Glycine max) is an important leguminous crop and has been grown worldwide for edible oil, food and feed material (Palander et al., 2005). Large amounts of N from the atmosphere fixed by soybean symbiotic nitrogen fixation can be widely available for growth of soybean and other intercropping and rotation crops (Biswas and Gresshoff, 2014). Branching stage, flowering stage, fruiting stage, pod stage and harvest stage are five important developmental stages for soybean cultivation and fertilizer application studies, and the nitrogen fixation characteristics of these stages have been well studied (Fehr et al., 1971). In our previous study, the expression of nodule genes at above-mentioned five stages was assessed quantitatively using RNA-Seq, we only analyzed the important different expression genes (DEGs) of soybean in nodule development and senescence, but not of Bradyrhizobium diazoefficiens 113-2 (Yuan et al., 2017).
B. diazoefficiens 113-2 was collected from paddy fields in Hengyang area of Hunan Province, China in 1972 by Xuejiang Zhang, and its genome shared a large synteny blocks and a high ANI value with that of B. diazoefficiens USDA110 (Li et al., 2020). In this report, we investigated the expression of B. diazoefficiens 113-2 genes in the nodule samples from abovementioned five developmental stages of soybean and identified 164 important DEGs associated with nodule development and senescence. These DEGs included those encoding nod, nif, fix proteins and T3SS secretion system-related proteins, as well as proteins involved in nitrogen metabolism, ABC transporters and two-component system pathways. Our results firstly connected the action mode of rhizobial genes during nodule development with the developmental stages of soybean, which should provide new clues for the understanding of the genetic determinants of soil rhizobia in nodule development and senescence, and shed new light on the molecular mechanisms of nitrogen fixation at above-mentioned five developmental stages of soybean.

RESULTS
Expression Analysis of the Genes of B. diazoefficiens 113-2 We previously performed RNA-Seq analysis for five different nodule samples from five important development stages of soybean (Yuan et al., 2017) and sequenced the genome of B. diazoefficiens 113-2 (Li et al., 2020). In this report, we want to investigate the expression of the genes of B. diazoefficiens 113-2 in nodule development and senescence by using the above-mentioned RNA-Seq data. The genome and gene mapping details of B. diazoefficiens 113-2 were shown in Supplementary  Table 1, and the results of sequencing saturation, reads coverage and reads random analyses were displayed in Supplementary  Figures 1-3. As shown in Table 1, 2234 (about 25.38%) genes of B. diazoefficiens 113-2 were not identified in these five nodule samples, and more than 30% of the genes (except for Pod stage_N) contained meaningless fpkm values (0∼1). For the rest genes, most of the fpkm values were between 1 and 1,000, and a very small number of the fpkm values were above 10,000 ( Table 1). Besides, the numbers of genes with meaningful fpkm values (>1) at one or more nodule samples were shown in Figure 1A. There are 966 genes consistently expressed at five nodule samples, 1085 genes were consistently found at four nodule samples, and 1252, 1552, and 1771 genes expressed at three, two and only one nodule sample, respectively. The detail fpkm information was shown in Supplementary Table 2.
To study the correlation of the expressions of B. diazoefficiens 113-2 genes among the five nodule samples, we calculated the Pearson correlation coefficients of each two nodule samples based on the total gene expression levels ( Figure 1B). The Pearson correlation coefficients between Branching stage_N and the other four nodule samples were 0.54∼0.63, indicating that the expression of the genes of B. diazoefficiens 113-2 in Branching stage_N was very different from other four nodule samples. The Pearson correlation coefficients between harvest stage_N and flowering stage_N, fruiting stage_N or pod stage_N were less than 0.9, suggesting that relative lower correlation between these nodule samples. While the Pearson correlation coefficients among flowering stage_N, fruiting stage_N and pod stage_N were more than 0.9, indicating that relative higher correlation were existed in these three nodule samples.

Different Expression Gene Identification Between Different Nodule Samples
To screen the important genes of B. diazoefficiens 113-2 involved in nodule development and senescence, a false discovery rate (FDR) ≤ 0.001 and | log2 ratio| ≥ 1 were used to identify the DEGs in the ten Groups. The numbers of up-regulated and down-regulated DEGs in these ten Groups were shown in Figure 2A, and the ID information on these DEGs was provided in Supplementary Table 3 According to different reference developmental stages N, above-mentioned ten Groups were divided into four Classes (Figure 2A). The number of DEGs consistently found in all four Classes was 15 and the total number of DEGs identified during the soybean nodule development was 164 ( Figure 2B).
To investigate the expression profile of the 164 DEGs during nodule development and senescence, we analyzed the RNA-Seq results of these DEGs at above-mentioned five nodule samples, and the result showed that these DEGs had various expression patterns (Figure 3). 64 DEGs reached their peaks at branching stage_N or flowering stage_N, and most of them were down-regulated during nodule development and senescence ( Figure 3A), indicating that these DEGs mainly play roles in the early stages of nodule development. 45 DEGs reached their peaks at flowering stage_N or fruiting stage_N, and had relative low expression at branching stage_N or harvest stage_N ( Figure 3B), suggesting that these DEGs mainly participate in nitrogen fixation progress. About 55 DEGs reached their peaks at harvest stage_N or pod stage_N, and most of them were up-regulated during nodule development and senescence  (3), pod stage (4) and harvest stage (5), and the pheatmap packages in R were used to product this heatmap.
( Figure 3C), indicating that these DEGs mainly play roles in nodule senescence.

DEG Functional Ontology and Kyoto Encyclopedia of Genes and Genomes Pathway Enrichment Analysis
To evaluate the potential functions of the 164 DEGs, we used Gene Ontology (an internationally standardized gene function classification system) to classify these DEGs. Among these DEGs, 62 genes had no functional GO term, and a total of 150 GO terms were assigned for the rest genes, while about 2/3 (101 out of 150) functional GO terms were specific (only for one gene) (Supplementary Table 6). The rest49 Go terms were listed in Figure 4 and were divided into three categories: cellular components, biological process and molecular function. The cellular components mainly included integral component of membrane and membrane. The main biological processes of the DEGs were oxidation-reduction process, nitrogen fixation, metabolic process, transport and regulation of transcription. The molecular functions associated with the DEGs mainly focused on oxidoreductase activity, catalytic activity, ATP binding, nucleic acid binding and DNA binding. KEGG is the major public database for pathway enrichment analysis (Kanehisa et al., 2016). In this report, we used KEGG database to classify these DEGs. Among these DEGs, 74 genes had no classified KEGG pathways, and a total of 149 KEGG pathway subgroups were assigned for the rest genes, while about 76% (113 out of 149) pathway subgroups were specific (only for one gene) (Supplementary Table 7). The other 36 pathway subgroups associated with the DEGs were shown in Figure 5. Most of these pathways were metabolic-related pathways. The pathways with the greatest numbers were metabolic pathway, followed by FIGURE 5 | KEGG pathway enrichment analyses of DEGs of B. diazoefficiens 113-2 between different nodules from different developmental periods of soybean. 36 KEGG pathways were analyzed and the x-and y-axes represent pathway categories and the number of genes in each pathway, respectively. microbial metabolism in diverse environments, biosynthesis of secondary metabolites and biosynthesis of antibiotics.

DEGs Associated With Nitrogen Metabolism, ABC Transporters and Two-Component System Pathways
To investigate the regulation of the DEGs of 113-2 during nodule development and senescence, we analyzed nitrogen metabolism (K00910), ABC transporters (K02010) and Two-Component system (K02020) pathways (obtained by KEGG) 1 in more detail (Figure 6). Two KEGG gene sets in K00910, seven in K02010 and six in K02020 were differentially expressed during nodule development and senescence. One gene that matched K00910 pathway gene NirK (113-2GL002109) was 1 http://www.kegg.jp/kegg/kegg1.html up-regulated in Harvest vs. Branching, Harvest vs. Flowering, Harvest vs. Fruiting and Harvest vs. Pod, suggesting that this gene has relative higher expression in harvest stage nodules and plays key roles in nodule senescence. Among the three DEGs that matched k02010 pathway gene nifDKH, 113-2GL007881 has different expression patterns with 113-2GL007904 or 113-2GL007905 in Flowering vs. Branching, Fruiting vs. Branching, Pod vs. Branching and Harvest vs. Branching groups, indicating that nifDKH gene may play diverse roles in nodulation and nodule development. For K02010 pathway gene sets, three genes that matched modC, modA, gltI, and aatJ were down-regulated in all of the detected groups, the other gene that matched pstS was up-regulated in Pod vs. Branching and Pod vs. Flowering groups, and the rest two genes lptG and urtA were down-regulated then up-regulated during nodule development and senescence, suggesting that they may have roles in both nodulation and nodule senescence. Three K02020 pathway gene sets (pstS, gltI and aatJ) were the same as that of K02010 pathway, and among the rest three gene sets, one DEG that matched fixK (113-2GL006786) was up-regulated in all of the detected groups, one gene that matched ccoN (113-2GL006780) was up-regulated in most of the detected groups and only down-regulated in Fruiting vs. Flowering group, while the rest one DEG that matched atoB (113-2GL005802) was down-regulated in most of the detected groups and only up-regulated in Flowering vs. Branching group.

DEGs Encoding Nod and T3SS Secretion System-Related Proteins
In most rhizobia, NFs, surface polysaccharides and secreted proteins are the main determinants of host specificity (Putnoky et al., 1988;Lorkiewicz, 1997;Li et al., 2014). To examine whether these three signaling molecules also play roles during nodule development and senescence, we explored DEGs that required for the biological synthesis of these signaling molecules between different nodules from different developmental periods of soybean, including five Nod genes, eight nif genes, three fix genes and 27 T3SS-related genes (Figure 7). 16 DEGs (two Nod genes, seven nif genes, one fix gene and six T3SSrelated genes) were down-regulated in all of the detected groups, suggesting that these genes mainly play roles in nodule development rather than nodule senescence. Ten DEGs (one Nod gene, one fix gene and eight T3SS-related genes) were up-regulated in all of the detected groups, and most of these genes (except for 113-2GL007701) have relative higher expression in pod stage nodules or harvest stage nodules, meaning that these genes may participate in regulating nodule senescence. Seven DEGs (113-2GL007881, 113-2GL007641, 113-2GL007564, 113-2GL007604, 113-2GL007623, 113-2GL007879, and 113-2GL007892) have highest expression in flowering stage nodules or fruiting stage nodules, reflecting that these genes may be critical to nitrogen fixation progress. Four DEGs (113-2GL001050, 113-2GL008064, 113-2GL002555, and 113-2GL004188) were down-regulated then up-regulated during nodule development and senescence, suggesting that these genes may play roles in both in the early stages of nodule development and nodule senescence. Five genes (113-2GL000406, 113-2GL005174, 113-2GL006780, 113-2GL007717, and 113-2GL008548) have relative high expression in flowering, fruiting or pod stage nodules, indicating that they mainly play roles in nitrogen fixation progress. The rest one gene (113-2GL007717) was only down-regulated in Harvest/Flowering group.

Orthologs of the DEGs
To further study the roles of these rhizobial DEGs during nodule development and senescence, we identified their orthologs in 113-2 itself, USDA110, Mesorhizobium loti MAFF303099, Mesorhizobium huakuii 7653R and Sinorhizobium meliloti 2011. Four homologous pairs (113-2GL000406/113-2GL001850, 113-2GL004736/113-2GL006371, 113-2GL004195/113-2GL002328, and 113-2GL003941/113-2GL001604) were existed in these 164 DEGs. 16 genes have no orthologs, and most of them have no COG functions (Supplementary Table 8), indicating that these DEGs were mainly strain-specific. 84 DEGs have only orthologs in USDA110, suggesting that these DEGs were mainly species-specific or host specific, and among these DEGs, about 13.1% (11 out of 84) have orthologs in 113-2 itself and more than one orthologs in USDA110 ( Table 2). Among the rest 62 DEGs, 35 DEGs have one or more orthologs in all of the five rhizobia, 11 DEGs have orthologs in four rhizobia and 16 DEGs have orthologs in three rhizobial, reflecting that not all of these nodule development and senescence-related DEGs are core genes among these four rhizobial ( Table 3). Among the core 35 DEGs, about 60% (21 out 35) were mainly expressed in the early stages of nodule development, six DEGs showed highest expression in the progress of nitrogen fixation, and eight DEGs may play critical roles in nodule senescence, indicating that these core genes have diverse roles during nodule development and senescence ( Table 3). Among these four rhizobia, 113-2 or USDA110 and M. loti MAFF303099 can form determinate nodules with soybean and Lotus, respectively, and nine DEGs of 113-2 have orthologs only in USDA110 and M. loti MAFF303099 (Table 3). Besides, we selected four gene groups with more orthologs to perform phylogeny analyses, and the results revealed closer phylogenetic relationships between the two strains in the same genus (Supplementary Figures 4-7).
Our previous study has indicated that 113-2 has an extremely close phylogenetic relationship to USDA110 (Li et al., 2020). An interspecies protein interactome was constructed and 290 USDA110 proteins associated with soybean symbiosis were provided . In this report, about 203 USDA110 proteins were searched from the orthologs of the DEGs (Tables 2, 3). We compared these two USDA110 protein groups and discovered six same proteins (bll7322, blr1971, blr1091, blr1516, blr0462, and bll7600), which were detected to be expressed in rhizoidal of root nodules during symbiosis in previous studies Delmotte et al., 2010;  Cuklina et al., 2016;Zhang et al., 2018). Then we predict the interaction proteins of the six DEGs (orthologs of the six USDA110 proteins) according to the resources from the constructed protein interactome , and the results were showed in Table 4. These rhizoidal proteins have more than two predicted interacting proteins in soybean, especially for bll7322 or 113-2GL000406 or 113-2GL001850, and blr1971 or 113-2GL007693; they have 21 and 25 predicted interacting proteins in soybean, respectively.

DISCUSSION
Symbiotic nitrogen fixation system is of great significance in agriculture and ecology, and nodule development and senescence directly affects nitrogen fixation efficiency. Nodule development and senescence involves a programmed series of molecular events that are carried out synchronously by both legumes and rhizobia (Vasse et al., 1990;Cermola et al., 2000;Oldroyd et al., 2011;Roux et al., 2014;Alemneh et al., 2020). The nitrogen fixation characteristics of five soybean developmental stages (Branching stage, flowering stage, fruiting stage, pod stage and harvest stage) have been well studied (Fehr et al., 1971), the expression of nodule genes at these developmental stages was assessed quantitatively using RNA-Seq, and the important DEGs of soybean that associated with nodule development and senescence were identified in our previous study (Yuan et al., 2017). In the present study, we firstly investigated the expression of rhizobial genes in the nodule samples from above-mentioned five developmental stages of soybean by using our previous RNA-Seq data (Yuan et al., 2017)and identified 164 important rhizobial DEGs associated with nodule development and senescence.

The Gene Expression of B. diazoefficiens 113-2 at Different Nodule Samples From Different Soybean Developmental Stages
Previous comparative transcriptomic analyses have explored the gene expression patterns of rhizobia during chemoautotrophic growth (Franck et al., 2008) or symbiotic non-growth (Vercruysse et al., 2011), under different stresses (Ampe et al., 2003;Chang et al., 2007;Cytryn et al., 2007), in determinate   , and in different zones of the indeterminate nodule (Roux et al., 2014). For the bacteroids in soybean symbiosis, several transcriptomic analyses of bacteroids focused on symbiotic gene region (Hauser et al., 2006) and single time point during optimal symbiotic functioning Pessi et al., 2007; Franck et al., 2014). It is worth noting that a multiple time-point microarray study in B. diazoefficiens bacteroids isolated from soybean nodules from nodulation to nodule senescence, which revealed a shift gene expression patterns of B. diazoefficiens during symbiotic process, especially in nodule senescence (Franck et al., 2018). In this report, we also performed a multiple time-point transcriptomic ananlysis, the difference is that the RNA samples were collected from different nodule samples from five important developmental stages of soybean (Yuan et al., 2017). We firstly investigated the expression of rhizobial genes in symbiosis at above-mentioned five developmental stages of soybean, which should shed new light on the molecular mechanisms of nitrogen fixation during the growth and development of soybean.   Zhang et al., 2018 In the five soybean nodule samples, about 25.38% (2234 out of 8801) of B. diazoefficiens 113-2 genes were not detected, indicating that these genes may not participate in symbiotic progress. The pod stage nodules possessed the highest number of genes with meaningful fpkm values (>1), suggesting the initiation of a series of new processes, which might be associated with nodule senescence. The correlation analysis of the gene expressions in the five nodule samples showed that the expression of B. diazoefficiens 113-2 genes at Branching stage or harvest stage has relative lower correlation with the other soybean development stages, but high correlations were existed among flowering stage, fruiting stage and pod stage (Figure 1B), indicating that similar gene expression patterns of B. diazoefficiens 113-2 were found during optimal symbiotic functioning, but different expression patterns were existed among early nodule development, nitrogen fixation progress and nodule senescence. These results also revealed a shift gene expression pattern of B. diazoefficiens 113-2 in symbiosis during the growth and development of soybean.

DEGs of B. diazoefficiens 113-2 Associated With Nodule Development and Senescence
Previous study showed that > 800 USDA110 genes were up-or down-regulated at each time point of soybean nodule development (Franck et al., 2018), while in this report, the highest number of DEGs at the ten groups was 83 (Figure 2A). The major reason may be that we used our previous soybean nodule RNA-Seq data, in which the RNA samples were mixed with both soybean RNA and B. diazoefficiens 113-2 RNA (Yuan et al., 2017). Our main purpose is to screen the important DEGs that can interact with host and are mainly involved in nodule development and senescence. The gene expression of soil rhizobia inside of the host was different from free live state (Vercruysse et al., 2011), so we did not isolate B. diazoefficiens 113-2 from the nodule samples.
Lots of important rhizobial genes have been showed to participate in nodule development and senescence (Voroshilova et al., 2001;Tittabutr et al., 2015;Franck et al., 2018). Recently, we sequenced the genome of B. diazoefficiens 113-2 (Li et al., 2020) and identified 164 DEGs of B. diazoefficiens 113-2 during the soybean nodule development in this study. Among these DEGs, 49 DEGs had both no functional GO term and no classified KEGG pathway (Supplementary Tables 6, 7), and about 13 DEGs were not clearly annotated (Supplementary  Table 3), suggesting that the nodule development and senescence in determinate nodules or indeterminate nodules was also mainly determined by host legume plants, which is similar to our previous study (Li et al., 2020).

The Potential Roles of Host Specificity-Related DEGs in Nodule Development and Senescence
In most rhizobia, the genes that affect the biosynthesis of nodulation factors, secretion system and surface polysaccharides are often play critical roles in host specificity (Horvath et al., 1986;Philip-Hollingsworth et al., 1989;Wang et al., 2014). In this report, five Nod genes, eight nif genes, three fix genes and 27 T3SS-related genes (no genes associated with surface polysaccharides biosynthesis) were identified (Figure 7). The Nod genes primarily played roles in nodulation (Moulin et al., 2004), and among the identified five Nod genes, only Noek (113-2GL001050) and NolG (113-2GL008064) may play roles in the early state of nodule development. NoeJ (113-2GL007583) and NolG (113-2GL007643) may participate in nitrogen fixation progress and NodG (113-2000663) may have roles in nodule senescence. The nif and fix genes are crucial for nitrogen fixation progress (Fischer and Hennecke, 1984;Hennecke, 1990). NifH, NifD, NifK, NifE, and NifN proteins are the core components of nitrogenase (Raymond et al., 2004), and the FixABCX genes were found to encode a membrane complex participating in electron transfer to nitrogenase (Earl et al., 1987;Edgren and Nordlund, 2004). The two main regulatory cascades that can be found in rhizobia are the RpoN−NifA and the oxygen−responsive two−component FixL−FixJ system, together with FixK (Lindström and Mousavi, 2019), which was found to regulate positively and negatively nitrogen fixation genes in Rhizobium meliloti (Batut et al., 1989). Among the identified eight Nif genes, only NifH (113-2GL007881) and NifW (113-2GL007879) reached the expression peaks at flowering stage_N or fruiting stage_N, the rest six nif genes NifB (113-2GL007888), NifD/K (113-2GL007904, 113-2GL007095), NifE (113-2-GL007903), NifN (113-2GL007902), NifQ (113-2007880), and NifX (113-2GL007901) may mainly involved in regulating the early nodule development. For the three fix genes, FixA (113-2GL007641) and FixC (113-2GL007877) had relative higher expression at branching stage_N and flower stage_N, implying that they may mainly play roles in nitrogen fixation progress, while FixK (113-2GL006786) may mainly have roles in nodule senescence. The rhizobia T3SS and its secreted effectors has been reported to modulate nodulation and host range (Okazaki et al., 2013). For the 27 T3SS-related genes, only seven DEGs had relative higher expression at branching stage_N, and the rest 20 genes may participate in nitrogen fixation progress or nodule senescence, meaning that the T3SS-related genes have roles in the whole nodule development and senescence rather than only in nodulation, which is similar to our previous study (Yuan et al., 2017). Together, these results revealed that the identified host specificity-related DEGs have diversified roles in symbiosis, meaning that host specific regulation maybe existed in the whole progress of nodule development and senescence, not only in nodulation.
In summary, we firstly combined the expression characteristics of rhizobial genes in symbiosis with the growth and development of soybean, and identified 164 important rhizobial DEGs associated with nodule development and senescence. Among these DEGs, many are firstly identified and showed to associate with nodule development and senescence. Besides, we focused on the DEGs encoding nod, nif, fix proteins and T3SS secretion system-related proteins, as well as proteins involved in nitrogen metabolism, ABC transporters and two-component system pathways. Our results supply valuable basises for studying the genetic determinants of soil rhizobia in symbiosis and the creation of high efficiency soybean cultivation technology.

Genome and Gene Mapping
The clean reads for the five nodule samples at five developmental stages of soybean (branching stage, flowering stage, fruiting stage, pod stage and harvest stage) were obtained from our previous study (Yuan et al., 2017). The raw sequence reads have been submitted to NCBI under the assigned accession number PRJNA765164, and the BioSample accessions included SAMN21545854, SAMN21545855, SAMN21545856, SAMN21545857, and SAMN21545858. The information of the reference genome (B. diazoefficiens 113-2) was previously described (Li et al., 2020). We map clean reads to the reference genome using HISAT (Cock et al., 2010), and map clean reads to reference transcripts using Bowtie2 (Kim et al., 2015). The mapping details are shown in Supplementary Table 1.

Gene Expression Analysis and Correlation Analysis Between Samples
We calculate the gene expression level for each nodule sample with RSEM (Li and Dewey, 2011). Besides, in order to reflect the gene expression correlation between the nodule samples, we calculated the Pearson correlation coefficients for all gene expression levels between each two samples using cor function in R software, and reflected these coefficients in the form of heat maps. Heatmap was performed using TBtools software .

Screening of DEGs Between Different Nodule Samples
To screen the DEGs between nodule samples, in combination with the false discovery rata (FDR) ≤ 0.001 and | log2 ratio| ≥ 1, the fold changes in gene expression were used to evaluate the significance of gene expression differences in the ten Groups which are classified according to our previous study (Yuan et al., 2017). Data of the DEGs in the ten Groups between five nodule samples are provided in Supplementary Table 3.

Gene Ontology Functional and Kyoto Encyclopedia of Genes and Genomes Pathway Analyses of DEGs
Gene Ontology (GO) is a database created by the Gene Ontology Consortium and is a classification system for genes' biological functions. The method of GO functional enrichment analysis first maps all DEGs to terms in the GO database 2 and the computing method as previously described (Yuan et al., 2017). Kyoto Encyclopedia of Genes and Genomes (KEGG) is a public pathway-related database, 3 and the computing method also were described in our previous study (Yuan et al., 2017).

Orthologs Analysis of the DEGs
Firstly, we performed Core/Pan gene analysis for B. diazoefficiens 113-2, M. loti MAFF303099, M. huakuii 7653R, and S. meliloti 2011. We used the CD-HIT 4.66 4 rapid clustering of similar proteins software (Edgar, 2004)with a threshold of 50% pairwise identity and 0.7 length difference cut off in amino acid to cluster the Core/Pan genes of these four strains, and obtained the pan gene pool. Then we used the identified DEGs as queries to search the orthologs in the pan gene pool.

Phylogenetic Analysis
Phylogenetic trees were conducted in MEGA X (Kumar et al., 2018). The evolutionary history was inferred using the Neighbor-Joining method (Saitou and Nei, 1987). The percentage of replicate trees in which the associated taxa clustered together in the bootstrap test (1000 replicates) are shown next to the branches.

DATA AVAILABILITY STATEMENT
The original contributions presented in the study are included in the article/Supplementary Material, further inquiries can be directed to the corresponding author/s.

AUTHOR CONTRIBUTIONS
SY, HC, and XZ designed this work. SY wrote the manuscript. SY, SZ, and YF performed most of the analysis, figures, and tables. CZ, YH, ZS, SC, WG, HY, ZY, and DQ contributed substantially to the completion of this work. All authors contributed to the article and approved the submitted version.