Genome-wide analysis of the R2R3-MYB gene family in Spatholobus suberectus and identification of its function in flavonoid biosynthesis

Spatholobus suberectus Dunn (S. suberectus), a plant species within the Leguminosae family, has a long history of use in traditional medicine. The dried stem of S. suberectus exhibits various pharmacological activities because it contains various flavonoids. The R2R3-MYB gene family is associated with diverse functions in plants, including the biosynthesis of flavonoids. Nonetheless, its role in S. suberectus remains unelucidated. Therefore, the newly sequenced S. suberectus genome was used to conduct a systematic genome-wide analysis of the R2R3-MYB gene family. In total, 181 R2R3-SsMYB genes were identified and categorized by phylogenetic analysis into 35 subgroups. Among the R2R3-SsMYB genes, 174 were mapped to 9 chromosomes, and 7 genes were not located on any chromosome. Moreover, most genes in the same subgroup exhibited similar exon-intron structures and motifs. Collinearity analysis demonstrated that the expansion of the gene family was primarily driven by segmental duplication events. Notably, Ka/Ks analysis showed that most of the duplicated genes underwent purifying selection. In this study, 22 R2R3-SsMYB genes were significantly correlated with flavonoid content. qRT-PCR data showed that these genes were more highly expressed in tissues with flavonoid accumulation than in other tissues. These data elucidate the structural and functional elements of R2R3-SsMYB genes and present genes that could potentially be used to enhance flavonoid biosynthesis in S. suberectus.

Most R2R3-MYB genes are conserved across species, and their sequence similarity allows them to be assigned to the same subgroups. Nevertheless, interspecies variation exists: comparative analyses of these genes across plant species have revealed extensive evolutionary expansion of this gene family (Ito, 2005; Zhuang et al., 2021). The expanded R2R3-MYB genes are involved in a variety of developmental and growth processes in plants, as well as in disease resistance, hormone signal transduction, and responses to biotic and abiotic stress (Aoyagi et al., 2014; Anwar et al., 2019; Wang et al., 2021; Zhuang et al., 2021; Yang et al., 2022). Increasingly, data from various plant species have implicated R2R3-MYB transcription factors (TFs) in the modulation of secondary metabolism, particularly flavonoid biosynthesis and metabolism. In Erigeron breviscapus, MYBP1 acts as a positive regulator of flavonoid accumulation; it activates the transcription of flavonoid-associated genes by directly binding to their promoters (Zhao et al., 2022). EsMYB9, an R2R3-MYB TF, regulates the flavonoid biosynthesis pathway in Epimedium sagittatum by activating the chalcone synthase promoter (Huang et al., 2017).
Numerous studies of R2R3-MYB gene family members in crop and horticultural plants have added to the existing data on their functions, evolutionary history, and transcriptional regulatory mechanisms. Nonetheless, their functions in Spatholobus suberectus Dunn (S. suberectus), a traditional Chinese medicinal herb known as jixueteng, are not well understood. The dried stem of this plant exhibits diverse pharmacological activities, and its primary bioactive constituents were determined to be flavonoids (Wang et al., 2017; Song et al., 2022). Catechin, the flavonoid with the highest content, can promote the proliferative capacity of hematopoietic progenitor cells. Additionally, genistein, isoliquiritigenin, and formononetin have demonstrated efficacy in cancer preventive or therapeutic strategies (Wang et al., 2008; Wang, 2011; Peng et al., 2016). At present, the biosynthetic pathways of these four flavonoids have been well elucidated in S. suberectus, and over 70% of the genes involved in flavonoid biosynthesis have MYB binding sites in their promoter regions (Qin et al., 2020). This result supports a role for MYB TFs in regulating intermediates in the flavonoid biosynthesis pathway. However, how members of the R2R3-SsMYB gene family modulate flavonoid biosynthesis remains to be investigated.
As described in our previous study, the genome of S. suberectus has been sequenced (Qin et al., 2019), facilitating a genome-wide analysis and identification of the functions of genes in the R2R3-SsMYB family. In the present study, we conducted a genome-wide analysis of the R2R3-SsMYB family, including sequence features, phylogenetic relationships, gene structure, motif recognition, collinearity, and chromosomal location. Candidate R2R3-SsMYB genes associated with flavonoid biosynthesis in the correlation analysis were identified and assessed by qRT-PCR. Our study not only serves as a comprehensive analysis of various characteristics of the R2R3-SsMYB family but also provides valuable insights for further functional assessment of the genes involved in flavonoid biosynthesis.

Identification of Spatholobus suberectus R2R3-MYB genes
The Hidden Markov Model (HMM) profile of the MYB DNA-binding domain (accession number PF00249) was obtained from the Pfam database (http://pfam.xfam.org/) (Finn et al., 2016). This profile was used as a query for an HMM search of the S. suberectus genome with default parameters in HMMER version 3.0 (Finn et al., 2011) to identify MYB genes. NCBI's Conserved Domains Database (CDD) (http://www.ncbi.nlm.nih.gov/Structure/cdd/wrpsb.cgi) and the Simple Modular Architecture Research Tool (SMART) (http://smart.embl-heidelberg.de/) were used to confirm the acquired R2R3-MYB protein sequences in S. suberectus.
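As an illustration of how HMM search results can be screened before domain confirmation, the following minimal Python sketch counts MYB domain hits per protein in the per-domain table that HMMER3 writes with the --domtblout option. It is not the pipeline used in this study: the independent E-value cutoff (1e-5), the rule that exactly two hits mark an R2R3 candidate, and the function names are assumptions made for this example only, and candidates would still need confirmation with CDD and SMART.

```python
from collections import defaultdict


def count_domain_hits(domtbl_lines, max_ievalue=1e-5):
    """Count per-protein domain hits in HMMER3 --domtblout text.

    max_ievalue is an illustrative cutoff (an assumption of this sketch),
    not a parameter reported in the study.
    """
    hits = defaultdict(int)
    for line in domtbl_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment/header lines
        fields = line.split()
        target = fields[0]            # column 1: target (protein) name
        i_evalue = float(fields[12])  # column 13: independent E-value of the domain
        if i_evalue <= max_ievalue:
            hits[target] += 1
    return dict(hits)


def r2r3_candidates(hits):
    """Return proteins with exactly two MYB repeats (assumed R2R3 heuristic)."""
    return sorted(protein for protein, n in hits.items() if n == 2)
```

A typical use would be to pass the lines of a domtblout file to count_domain_hits and then submit the sequences returned by r2r3_candidates to the domain databases for manual confirmation.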

Sequence analysis and structural characterization of R2R3-SsMYB genes
All R2R3-SsMYB protein sequences were submitted to the ProtParam tool (http://web.expasy.org/protparam/) to assess the isoelectric point, molecular weight, number of amino acids, and aliphatic and instability indices. The structures of all R2R3-SsMYB genes, including the lengths and numbers of exons and introns, were visualized with TBtools version 1.045 using the genomic sequences and coding regions of the R2R3-SsMYB genes (Chen et al., 2020). The conserved motifs of the R2R3-MYB protein sequences were identified using the motif-based sequence analysis tool MEME Suite (https://meme-suite.org/meme/tools/meme) (Bailey et al., 2009). The analysis was set to identify a maximum of 20 motifs with an optimum motif width range of 6 to 100 amino acids (Zhuang et al., 2021).
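To make one of the reported physicochemical properties concrete, the sketch below computes the aliphatic index with the standard formula of Ikai (1980), which is the definition documented for ProtParam: AI = X(Ala) + 2.9 × X(Val) + 3.9 × (X(Ile) + X(Leu)), where X is the mole percent of each residue. The function name and the toy sequence are assumptions for illustration; the values reported in this study were obtained from the ProtParam web tool, not from this code.

```python
def aliphatic_index(seq):
    """Aliphatic index of a protein sequence (Ikai, 1980).

    AI = X(Ala) + 2.9 * X(Val) + 3.9 * (X(Ile) + X(Leu)),
    where X(.) is the mole percent of the residue in the sequence.
    """
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")

    def mole_percent(residue):
        return 100.0 * seq.count(residue) / len(seq)

    return (mole_percent("A")
            + 2.9 * mole_percent("V")
            + 3.9 * (mole_percent("I") + mole_percent("L")))


# Toy sequence (hypothetical): 25% each of Ala, Val, Ile, and Leu.
print(round(aliphatic_index("AVIL"), 1))  # 25 + 72.5 + 195 = 292.5
```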

Phylogenetic analysis of R2R3-SsMYB genes
Phylogenetic trees were generated using the R2R3-MYB protein sequences of S. suberectus and A. thaliana. We performed multiple sequence alignment using MAFFT version 7.427 with the default parameters for better alignment speed and accuracy. A maximum likelihood (ML) phylogenetic tree was constructed using Molecular Evolutionary Genetics Analysis (MEGA) version 7.0 with the following parameters: Poisson model, partial deletion, and 1000 bootstrap replicates. Visualization of the tree was executed through the software FastTree (Price et al., 2009). Furthermore, similar methods were employed to establish a separate phylogenetic tree with all the R2R3-SsMYB protein sequences. TBtools version 1.045 (Chen et al., 2020) was used to create a visual representation combining the gene structures, phylogenetic tree, and conserved motifs of the R2R3-SsMYB protein sequences.
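For readers unfamiliar with the Poisson model named above, the following sketch shows the quantity it is built on: the proportion of differing sites p between two aligned protein sequences and the Poisson-corrected distance d = −ln(1 − p). This is a simplified pairwise illustration only, not a reimplementation of the ML tree search in MEGA; it skips gapped positions pair by pair, which differs from the partial deletion option used in the study, and the function name and sequences are hypothetical.

```python
import math


def poisson_distance(seq_a, seq_b, gap="-"):
    """Return (p, d) for two aligned sequences.

    p is the proportion of differing sites among ungapped positions and
    d = -ln(1 - p) is the Poisson-corrected distance.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    different = 0
    for a, b in zip(seq_a, seq_b):
        if a == gap or b == gap:
            continue  # pairwise gap removal (a simplification of this sketch)
        compared += 1
        if a != b:
            different += 1
    if compared == 0:
        raise ValueError("no ungapped positions to compare")
    p = different / compared
    if p >= 1.0:
        raise ValueError("distance is undefined when every site differs")
    return p, -math.log(1.0 - p)


# Hypothetical aligned fragments differing at one of four sites.
p, d = poisson_distance("ACDE", "ACDF")
print(p, round(d, 4))  # 0.25 0.2877
```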

Genomic localization, collinearity analysis, and gene duplication of R2R3-SsMYB genes
The physical positions of the identified R2R3-SsMYB genes were mapped to the nine chromosomes of S. suberectus using MapGene2Chrom (MG2C) (http://mg2c.iask.in/mg2c_v2.0/), a tool for quickly drawing physical gene maps in SVG format based on the input data (Chao et al., 2021). Collinearity among interspecific and intraspecific genes was determined with the Multiple Collinearity Scan toolkit (MCScanX) (Wang et al., 2012) and visualized with a Circos multiple synteny plot. The MCScanX parameters were as follows: gap_penalty, −1; E-value, 1e-10 (Chao et al., 2021). To estimate duplication events, the nonsynonymous (Ka) and synonymous (Ks) substitution rates and the evolutionary constraint (Ka/Ks) between the duplicated pairs of R2R3-SsMYB genes were calculated using KaKs_Calculator 2.0 (Wang et al., 2010). Circos version 0.69 was used to graphically present the synteny blocks of the orthologous R2R3-MYB genes between S. suberectus and A. thaliana and between S. suberectus and G. max (Krzywinski et al., 2009).

Expression analyses by RNA−seq and correlation analysis of R2R3-SsMYB genes
The expression patterns of R2R3-SsMYB genes in the root, stem, and other tissues were examined by retrieving the transcriptomic data of the putative R2R3-SsMYB genes in the various tissues from prior research, using the gene IDs as queries (Qin et al., 2020). The abundance of R2R3-SsMYB transcripts was expressed as fragments per kilobase (of exon model) per million mapped reads (FPKM). To cluster genes with the same or similar expression, we conducted hierarchical clustering on the log2(FPKM + 1) values from the RNA-seq data using Cluster version 3.0 and visualized the resulting data with Java TreeView (Zhuang et al., 2021). To further investigate the R2R3-SsMYB genes related to flavonoid components, the contents of flavonoid, catechin, genistein, isoliquiritigenin, and formononetin in the different tissues were obtained from previous research (Qin et al., 2020). The relationship between R2R3-SsMYB genes and flavonoid content was analyzed by Spearman rank correlation analysis in R version 3.6.2, and p < 0.05 was considered statistically significant.
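The two calculations described above, the log2(FPKM + 1) transform and the Spearman rank correlation between expression and metabolite content, can be sketched in plain Python as follows. The study performed the correlation in R; this version is only an illustration, it returns the coefficient without a p-value, and the expression and content values are invented for the example.

```python
import math


def log2_fpkm(fpkm_values):
    """Transform FPKM values to log2(FPKM + 1)."""
    return [math.log2(v + 1.0) for v in fpkm_values]


def average_ranks(values):
    """Rank values from 1..n, assigning tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1  # extend the block of tied values
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / math.sqrt(sxx * syy)


def spearman_rho(x, y):
    """Spearman correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Invented values for five tissues (root, stem, leaf, flower, fruit).
expression = log2_fpkm([0.0, 3.0, 7.0, 15.0, 1.0])
content = [0.1, 0.5, 0.9, 2.0, 0.2]
print(spearman_rho(expression, content))  # 1.0 (perfectly monotonic toy data)
```

Because ranks are unchanged by any monotonic transform, the log2 step matters for clustering and visualization rather than for the Spearman coefficient itself.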

Expression analyses of R2R3-SsMYB genes by qRT-PCR
The S. suberectus plants were cultivated at the Guangxi Botanical Garden of Medicinal Plants, and samples of leaves, stems, roots, fruits, and flowers were collected from 12-year-old plants. Each plant tissue was sampled in triplicate. The plant samples were frozen in liquid nitrogen and stored at −80°C until RNA extraction. Total RNA was extracted with the FastPure Universal Plant Total RNA Isolation Kit (Vazyme, China). Gel electrophoresis and a NanoDrop 2000 spectrophotometer (Thermo Scientific, United States) were used to assess the quality and concentration of the RNA samples. HiScript III RT SuperMix for qPCR (Vazyme, China) was used for cDNA synthesis. Amplification was performed in a final volume of 20 µL on the StepOne real-time PCR system (Applied Biosystems, United States) with the ChamQ Universal SYBR qPCR Master Mix (Vazyme, China). The qRT-PCR reactions involved three biological replicates, and 18S was chosen as the endogenous reference gene (Ferradas et al., 2016). Amplification specificity was examined by melting curve analysis, and the relative expression level of each gene was calculated using the 2^(−ΔΔCT) method. To test for significant differences between samples, we employed Duncan's multiple range test (SPSS version 17.0). Primer version 5.0 was used to design the specific R2R3-SsMYB primers (Supplementary Table 1).
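The 2^(−ΔΔCT) calculation can be written out as a short worked example. In this sketch the target gene CT is first normalized to the reference gene (18S in this study) within each sample, and the result is then compared with a calibrator sample. The function name, the choice of calibrator, and the CT values are assumptions for illustration and are not data from the study.

```python
def relative_expression(ct_target, ct_reference,
                        ct_target_calibrator, ct_reference_calibrator):
    """Relative expression by the 2^(-ddCT) method.

    dCT(sample)     = CT(target) - CT(reference) in the sample of interest
    dCT(calibrator) = CT(target) - CT(reference) in the calibrator sample
    ddCT            = dCT(sample) - dCT(calibrator)
    """
    delta_sample = ct_target - ct_reference
    delta_calibrator = ct_target_calibrator - ct_reference_calibrator
    delta_delta = delta_sample - delta_calibrator
    return 2.0 ** (-delta_delta)


# Invented CT values: the target amplifies two cycles earlier (relative to
# the reference) in the sample than in the calibrator, so the fold change is 4.
print(relative_expression(20.0, 15.0, 22.0, 15.0))  # 4.0
```

The method assumes that the target and reference amplify with similar, near-100% efficiency, which is why amplification specificity and primer performance are usually checked beforehand.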

Statistical analysis
Three biological replicates were used in the experiments. Statistical significance was assessed in R with Student's t-tests, with the significance level set at p < 0.05.

Identification and characterization of Spatholobus suberectus R2R3-MYB family genes
The assessment of the plant genome resulted in identifying 272 candidate genes that coded for MYB domain-containing proteins.

Spatholobus suberectus
The functions associated with the R2R3-SsMYB genes and their evolutionary history were examined in an ML tree established through FastTree. This tree comprised 181 R2R3-SsMYB and 138 R2R3-AtMYB genes (Figure 1). The 181 R2R3-SsMYB genes were categorized into 35 subgroups (A1-A35), with 22 of the groups (comprising 117 R2R3-SsMYB genes) congruent with the previously established phylogenetic tree of R2R3-AtMYB proteins. Additionally, 13 specific subgroups in S. suberectus did not cluster with A. thaliana. Furthermore, subgroups A5 and A24 included only R2R3-SsMYB genes and no R2R3-AtMYB genes, suggesting that these genes arose in S. suberectus during evolution. Subgroups S10 and S12 contained only R2R3-AtMYB genes and no R2R3-SsMYB genes, indicating that some evolutionary alterations occurred in the genome and that these R2R3-MYB genes could have been acquired in A. thaliana or lost in S. suberectus during evolution. The gain and loss of species-specific R2R3-MYB genes could have resulted in functional divergence.

Conserved gene structure and motif composition of R2R3-SsMYBs
Structural analysis of genes provides information on gene function and evolution. The exon-intron structure of the R2R3-SsMYB genes was examined to gain insight into their structural diversity and motif composition (Figures 2A, B). The resulting data indicated a total of 181 R2R3-SsMYB genes with exons varying from 1 to 11, constituting 98.3% of the total R2R3-SsMYB genes. The majority (105 of 181) of the R2R3-SsMYB genes had the typical splicing pattern (two introns and three exons), whereas 8 R2R3-SsMYB genes had a single exon and no intron: SsMYB14, SsMYB15, SsMYB26, SsMYB27, SsMYB39, SsMYB72, SsMYB73, and SsMYB109. Characteristics of gene structure, including the number of introns, were included in the phylogenetic analysis of the R2R3-SsMYB family. Genes within the same subgroup exhibited similar exon-intron structures due to the fully conserved position(s) of the intron(s). For example, SsMYB38, SsMYB85, SsMYB100, SsMYB114, and SsMYB164 in subgroup A7 each contained two exons; SsMYB8, SsMYB105, SsMYB81, and SsMYB131 in subgroup A18 each contained three exons; and SsMYB1, SsMYB25, SsMYB34, SsMYB127, and SsMYB148 in subgroup A6 each contained six exons (Supplementary Table 2).
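Exon counts of the kind summarized above can be tallied directly from a genome annotation. The following sketch counts exon features per transcript in GFF3-formatted lines; it assumes that each exon record carries a Parent attribute naming its transcript, which is the usual GFF3 convention but was not verified against the S. suberectus annotation. The function name and identifiers are hypothetical, and the gene structures in this study were drawn with TBtools rather than with this code.

```python
from collections import Counter


def exon_counts(gff3_lines):
    """Count exon features per parent transcript in GFF3 text lines."""
    counts = Counter()
    for line in gff3_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines, directives, and comments
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9 or cols[2] != "exon":
            continue  # keep only well-formed exon records
        # Column 9 holds semicolon-separated key=value attributes.
        attrs = dict(item.split("=", 1)
                     for item in cols[8].split(";") if "=" in item)
        # An exon may list several parents separated by commas.
        for parent in attrs.get("Parent", "").split(","):
            if parent:
                counts[parent] += 1
    return dict(counts)
```

For a transcript with n exons the intron count is n − 1, so a three-exon gene corresponds to the typical two-intron structure described above.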
The 15 conserved motifs of all the R2R3-SsMYB proteins were predicted to further reveal the diversification of S. suberectus (Figure 2C, Supplementary Table 3). Motifs 1-5 were commonly found together in most of these proteins. Motif 3 was present in most genes, with the exception of SsMYB23, SsMYB111, and SsMYB175. In general, the composition of structural motifs varied among different subgroups but exhibited similarity within the same subgroup. Motif 13 was found only in SsMYB40, SsMYB41, SsMYB42, SsMYB46, SsMYB47, and SsMYB161, and these genes were clustered in subgroup A24. Motif 14 was found only in SsMYB14, SsMYB15, SsMYB26, SsMYB27, SsMYB93, SsMYB94, SsMYB109, and SsMYB119, and these genes were clustered in subgroup A1 (Figure 2C, Supplementary Table 2). The conserved motifs observed in specific subgroups suggest that R2R3-SsMYB proteins within the same subgroup, sharing these motifs, may perform similar functions, as evidenced by the results of the phylogenetic analysis.

Chromosomal location of R2R3-MYB genes in Spatholobus suberectus
The R2R3-SsMYB gene sequences were used to determine their chromosomal locations. The distribution was uneven, with 174 of the 181 R2R3-SsMYB genes located on the 9 chromosomes: 16 on each of chromosomes 1 and 8; 20 on chromosome 2; 24 on each of chromosomes 3 and 4; 15 on chromosome 5; 25 on chromosome 6; 21 on chromosome 7; and 13 on chromosome 9. Chromosome 6 carried the most genes (25), followed by chromosomes 3 and 4 (24 each), and chromosome 9 carried the fewest (13). Additionally, 7 R2R3-SsMYB genes were located on unassembled scaffolds, while most of the mapped genes were located near the ends of the chromosomes (Figure 3).

Duplication events of Spatholobus suberectus R2R3-MYB genes
Gene duplication, especially segmental and tandem duplication events, is a vital driver of the evolution and diversification of gene families. Establishing the collinearity of the R2R3-MYB genes in S. suberectus facilitated the assessment of their potential associations and duplication events (Figure 4). We identified 100 duplicated R2R3-SsMYB gene pairs in the S. suberectus genome, most of which were categorized as segmental duplications: 99 gene pairs were segmental duplications, while only one pair (SsMYB55/SsMYB56) was a tandem repeat. To evaluate the selection acting on the duplicated R2R3-SsMYB gene pairs, the ratios of nonsynonymous to synonymous substitutions (Ka/Ks) were calculated. In the present study, the Ka/Ks ratios of all the duplicated R2R3-SsMYB gene pairs were lower than one, which suggests that these genes were subjected to purifying selection. This indicates that these duplicated genes are important for maintaining the functions of the R2R3-SsMYB family in S. suberectus (Supplementary Table 4).
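The interpretation of Ka/Ks ratios used above follows the conventional rule that a ratio below one indicates purifying selection, a ratio of one indicates neutral evolution, and a ratio above one indicates positive selection. A minimal sketch of that rule is given below; the function name, the tolerance used to call a ratio neutral, and the substitution rates are assumptions for illustration, and the Ka and Ks values in this study were computed with KaKs_Calculator 2.0.

```python
def classify_selection(ka, ks, tolerance=1e-9):
    """Return (Ka/Ks, label) for a duplicated gene pair.

    Labels follow the conventional reading of the ratio:
    < 1 purifying, = 1 neutral, > 1 positive selection.
    """
    if ks <= 0:
        raise ValueError("Ks must be positive to form the ratio")
    ratio = ka / ks
    if abs(ratio - 1.0) <= tolerance:
        label = "neutral"
    elif ratio < 1.0:
        label = "purifying"
    else:
        label = "positive"
    return ratio, label


# Invented substitution rates for one duplicated pair.
print(classify_selection(0.2, 0.8))  # (0.25, 'purifying')
```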
To shed light on the evolutionary history of the R2R3-SsMYB family, S. suberectus was compared with two other representative species, A. thaliana and Glycine max, through two orthologous analyses (Figure 5). There were 73 orthologs between S. suberectus and A. thaliana and 336 orthologs between S. suberectus and G. max (Supplementary Tables 5, 6). Previous studies reported that G. max contains 20 chromosomes with a genome size of 950 Mb, while A. thaliana contains only five chromosomes with a genome size of 125 Mb (The Arabidopsis Genome Initiative, 2000; Schmutz et al., 2010). The higher chromosome number and larger genome size likely contributed to the greater number of orthologous events between SsMYB and GmMYB than between SsMYB and AtMYB.

FIGURE 3 Chromosomal locations of S. suberectus R2R3-MYB genes. 174 R2R3-SsMYB genes mapped on 9 chromosomes, and the other 7 R2R3-SsMYB genes belonged to unassembled scaffolds. Chromosomal locations of R2R3-SsMYB genes were mapped based on the S. suberectus genome. The chromosome number is indicated at the top of each chromosome.

Determination of the upstream regulatory R2R3-SsMYB genes of flavonoid biosynthesis
The expression of the R2R3-SsMYB genes was assessed in the root, stem, and other tissues. Previously procured data were used to assess the expression levels of R2R3-SsMYB genes in the various tissues (Figure 6). Transcriptome analysis showed that the members of the R2R3-SsMYB family were differentially expressed in diverse tissues (Supplementary Table 7). Many R2R3-SsMYB transcription factors were expressed specifically in the stem, such as SsMYB38, SsMYB47, SsMYB95, SsMYB106, SsMYB114, and SsMYB179. The expression level of SsMYB106 in the stem was 30.19 times that in the leaf and 31.57 times that in the flower.
The percentage content of flavonoids varies across tissues (root, stem, leaf, flower, and fruit). The contents of flavonoid and catechin were the highest in the stem, the content of genistein was the highest in the fruit, and the contents of isoliquiritigenin and formononetin were the highest in the root (Qin et al., 2020). To further identify the upstream regulatory R2R3-SsMYB genes of flavonoid biosynthesis, correlation analysis was conducted with the previous RNA-Seq and content data. R2R3-SsMYB genes significantly correlated with flavonoid, catechin, genistein, isoliquiritigenin, and formononetin contents were labeled (Figure 7). In detail, SsMYB38, SsMYB86, SsMYB120, SsMYB126, and SsMYB135 were significantly correlated with flavonoid content. SsMYB10, SsMYB50, SsMYB106, and SsMYB133 were significantly correlated with catechin content. SsMYB2, SsMYB26, SsMYB48, SsMYB64, SsMYB77, SsMYB84, SsMYB105, and SsMYB156 were significantly correlated with genistein content. Furthermore, SsMYB109, SsMYB121, SsMYB124, SsMYB128, and SsMYB172 were significantly correlated with isoliquiritigenin and formononetin contents.

Expression analyses of R2R3-SsMYB genes by qRT−PCR analysis
The expression patterns of the candidate R2R3-SsMYB genes related to flavonoid biosynthesis were examined in five tissues of S. suberectus (Figure 8). Almost all R2R3-SsMYB genes that were significantly correlated with flavonoid and catechin content, such as SsMYB10, SsMYB38, SsMYB50, SsMYB86, SsMYB106, SsMYB120, SsMYB133, and SsMYB135, were specifically and highly expressed in the stem.
SsMYB2, SsMYB26, SsMYB48, SsMYB64, SsMYB77, SsMYB84, SsMYB105, and SsMYB156, which were significantly correlated with genistein content, were specifically and highly expressed in the fruit of S. suberectus. Most genes significantly correlated with isoliquiritigenin and formononetin content, such as SsMYB121, SsMYB128, and SsMYB172, were specifically and highly expressed in the root of S. suberectus. The proportion of flavonoids differs among these tissues, which is likely affected by the varying expression patterns of the relevant genes. The constant and high expression of flavonoid biosynthesis-linked R2R3-SsMYB genes is the most probable cause of the increased accumulation of flavonoids in S. suberectus tissues.

Discussion
Identification and phylogenetics of R2R3-MYB genes in Spatholobus suberectus
As the largest subfamily among MYB TFs, the R2R3-MYB group is involved in various aspects of secondary metabolism in plants (Anwar et al., 2019; Yang et al., 2022). Many R2R3-MYB genes have been identified in various plant species. Previous data indicated that 138, 106, and 244 R2R3-MYB genes are present in A. thaliana, P. axillaris, and G. max, respectively (Du et al., 2012; Katiyar et al., 2012; Chen et al., 2021). In the current research, 181 R2R3-SsMYB genes were detected in the S. suberectus genome. R2R3-SsMYBs accounted for 66.5% of the identified SsMYB gene family, which is similar to the proportion of R2R3-MYB genes in P. axillaris (68.3%) (Chen et al., 2021). In this study, R2R3-SsMYB genes were categorized into 35 subgroups. Although the number of observed subgroups was higher than that of Pogostemon cablin (31 subgroups; Zeng et al., 2023), it was lower than those of Fragaria × ananassa (37 subgroups; Liu et al., 2021) and Musa acuminata (42 subgroups; Pucker et al., 2020). The evolutionary origins and conserved functions of R2R3-SsMYB members are thought to be shared within specific clades. As a result, the putative functions of S. suberectus R2R3-MYB proteins can be inferred from the functional clades of the R2R3-AtMYBs. Phylogenetic analyses and evolutionary relationships of the R2R3-SsMYB genes have been systematically studied among different species.

Spatholobus suberectus
The pattern of gene structure is a useful tool for studying the evolutionary associations within a gene family. The 181 R2R3-SsMYB genes were found to have varying numbers of exons, ranging from 1 to 11. Most of the R2R3-SsMYBs, like those in other plant species, had three exons and two introns (Jiang et al., 2004; Liu et al., 2014). In this research, the exon/intron patterns of R2R3-SsMYB genes exhibited similarity within the same subfamily, with the number of introns not exceeding two in most of the genes. These data were congruent with prior research that showed the presence of at most two introns in most land plant R2R3-MYB genes (Yin et al., 2022). The majority of R2R3-MYB genes belonging to the same subgroup had similar functions and were likely to exhibit similar motif compositions, but there were significant differences among subgroups. For example, subgroup A4 contained motifs 5 and 8, which are involved in inhibiting anthocyanin synthesis, while subgroup A6 contained motifs 7 and 9, which help to promote anthocyanin synthesis (Liu et al., 2021). Therefore, these motifs were conserved within specific subgroups, and proteins in the same subgroup that share these motifs likely have similar functions.

Gene duplication of R2R3-MYB genes in Spatholobus suberectus
Gene duplication is a major factor in the expansion of gene families and the generation of new genes. Gene duplication in plants takes the form of whole-genome duplication (WGD) events as well as tandem and segmental duplication (Zheng et al., 2015). Based on our previous study, two WGD events in S. suberectus and three in G. max were identified (Qin et al., 2020), which might explain the considerably lower number of R2R3-SsMYB genes (181) than R2R3-GmMYB genes (244). Gene duplication that occurs across different chromosomes can be termed segmental duplication, whereas duplication within the same chromosome is termed tandem duplication. These duplication types are a driving factor behind the diversity of species, which in turn might be crucial for enabling plants to adapt to continuously changing environments (Qiao et al., 2019; Schilling et al., 2020). The current research documented that the expansion of R2R3-SsMYB could be linked to both of the aforementioned duplication types, similar to the pattern in Nicotiana tabacum (Yang et al., 2022). There were 99 segmental duplications and one tandem duplication for R2R3-MYB genes in S. suberectus, implying that segmental duplication was the primary cause of the expansion of R2R3-SsMYB genes. Ka/Ks analysis implied that most of the R2R3-SsMYB genes were subjected to purifying selection, indicating high conservation during evolution. Nonetheless, some R2R3-SsMYB genes were found to be under positive selection, implying that they may have acquired new functions during evolution. Comparative orthologous analysis showed that there was a large amount of collinearity between S. suberectus and A. thaliana, and between S. suberectus and G. max, indicating the presence of gene duplication at the level of chromosomes.

FIGURE 6 Expression levels of R2R3-SsMYB genes in roots, stems, leaves, flowers, and fruits using RNA-seq.

FIGURE 7 Correlation analysis between R2R3-SsMYB genes and flavonoid content. Correlations were analyzed by Spearman rank correlation analysis in R (v3.6.2), and p < 0.05 (*) was considered statistically significant.

Candidate R2R3-SsMYB genes significantly associated with flavonoid synthesis
Prior research has shown that R2R3-MYB genes play essential roles in flavonoid biosynthesis in a variety of plants; for example, EbMYBP1 regulates flavonoid accumulation in Erigeron breviscapus (Zhao et al., 2022). In A. thaliana, AtMYB44, AtMYB123, and AtMYB112 are involved in the modulation of flavonoid biosynthesis (Lepiniec et al., 2006; Jung et al., 2010; Lotkowska et al., 2015). In this study, 22 candidate R2R3-SsMYB genes were significantly associated with flavonoid synthesis, with SsMYB26, SsMYB126, and SsMYB133 clustered into the same groups as AtMYB44, AtMYB123, and AtMYB112, indicating that these genes might participate in flavonoid synthesis. The possible functions of these genes should be assessed in depth through further research.

Conclusion
This research provided a thorough genome-wide analysis of the R2R3-SsMYB family. In total, 181 R2R3-SsMYBs were identified in S. suberectus and categorized into 35 subgroups, among which 174 R2R3-SsMYB genes were mapped to 9 chromosomes. R2R3-SsMYB genes in the same subgroup displayed conserved motif compositions and similar exon-intron structures, reinforcing the outcomes of the phylogenetic analysis. The expansion of the R2R3-SsMYB gene family was primarily driven by segmental duplication events, as indicated by the synteny analysis. The Ka/Ks analysis suggested that the R2R3-SsMYB gene family underwent purifying selection. In total, 22 R2R3-SsMYB genes were significantly correlated with the contents of flavonoid, catechin, genistein, isoliquiritigenin, and formononetin. These results provide insights into the roles of R2R3-SsMYB TFs in flavonoid biosynthesis and a foundation for further research characterizing the functions of R2R3-MYB genes in S. suberectus.
FIGURE 2 Analysis of gene structure and conserved motifs according to the phylogenetic relationships of R2R3-SsMYB genes. (A) Phylogenetic tree established using 181 R2R3-SsMYB proteins with the ML method. (B) Exon/intron structure analysis of R2R3-SsMYB genes. Black lines, orange boxes, and blue boxes denote introns, exons, and untranslated regions (UTRs), respectively. (C) Conserved motifs of R2R3-SsMYB genes identified by MEME Suite (represented by the various colored boxes). The scale bar of each R2R3-SsMYB gene is shown at the bottom.

FIGURE 4 Collinearity analysis of the R2R3-MYB gene family in S. suberectus. All syntenic R2R3-SsMYB gene pairs are presented as curved lines set in the same color.

FIGURE 5 Collinearity analysis of R2R3-MYB genes between S. suberectus and two representative plant species (A. thaliana and G. max). Gene pairs and syntenic R2R3-MYB gene pairs are presented as curved lines set in the same color.

FIGURE 8 Expression analysis of 22 selected candidate R2R3-SsMYB genes in diverse tissues of S. suberectus. Error bars indicate the mean ± SE of three independent replicates. Different lowercase letters (a, b, c, d, and e) indicate significant differences (p < 0.05).