New Insights Into the Plastome Evolution of the Millettioid/Phaseoloid Clade (Papilionoideae, Leguminosae)

The Millettioid/Phaseoloid (MP) clade from the subfamily Papilionoideae (Leguminosae) consists of six tribes and ca. 3,000 species. Previous studies have revealed some plastome structural variations (PSVs) within this clade. However, many deep evolutionary relationships within the clade remain unresolved. Due to limited taxon sampling and few genetic markers in previous studies, our understanding of the evolutionary history of this clade is limited. To address this issue, we sampled 43 plastomes (35 newly sequenced) representing all the six tribes of the MP clade to examine genomic structural variations and phylogenetic relationships. Plastomes of the species from the MP clade were typically quadripartite (size ranged from 140,029 to 160,040 bp) and contained 109–111 unique genes. We revealed four independent gene losses (ndhF, psbI, rps16, and trnS-GCU), multiple IR-SC boundary shifts, and six inversions in the tribes Desmodieae, Millettieae, and Phaseoleae. Plastomes of the species from the MP clade have experienced significant variations which provide valuable information on the evolution of the clade. Plastid phylogenomic analyses using Maximum Likelihood and Bayesian methods yielded a well-resolved phylogeny at the tribal and generic levels within the MP clade. This result indicates that plastome data is useful and reliable data for resolving the evolutionary relationships of the MP clade. This study provides new insights into the phylogenetic relationships and PSVs within this clade.

Some species of Leguminosae, especially those of the subfamily Papilionoideae, have acquired significant plastome structural variations (PSVs) during their evolution. These PSVs includes loss of IR (e.g., Lavin et al., 1990;Doyle et al., 1996), gene or plastome segment inversion (Choi and Choi, 2017), IR expansion, and/or contraction (Choi and Choi, 2017), and gene loss Sabir et al., 2014;Asaf et al., 2017). Most members of papilionoids, with the exception of a few early diverging lineages, share a 50-kb inversion in the LSC (Doyle et al., 1996). Previous studies have reported multiple inversions of 23, 24, or 36-kb in the Genistoid clade (Martin et al., 2014;Choi and Choi, 2017;Feng et al., 2017;Keller et al., 2017), a 39kb inversion in Robinia , and a large 78-kb inversion in the subtribe Phaseolinae of tribe Phaseoleae (Bruneau et al., 1990). However, only a few studies have examined PSV in the Millettioid/Phaseoloid clade (hereafter referred as the MP clade), one of the most species-rich clades within subfamily Papilionoideae.
The MP clade consists of more than 3,000 extant species with a global distribution (Schrire, 2005a;Schrire, 2005b;Schrire, 2005c andSchrire, 2005d;Schrire et al., 2009). Many species of this clade are economically important (Simpson and Ogorzaly, 2001;Baker, 2004) Some previous studies based on nuclear ribosomal ITS (Hu et al., 2002) and a few plastid loci (Hu et al., 2000;Kajita et al., 2001;Pennington et al., 2001;Wojciechowski et al., 2004;Cardoso et al., 2013;LPWG, 2017) have made progress in clarifying evolutionary relationships of the MP clade. However, some deep relationships, particularly at tribe level, have not been fully resolved, perhaps due to limited phylogenetic signals in these gene loci. Whole plastome sequences have been successfully applied to resolve plant evolutionary relationships , and therefore they might be of use for clarifying unresolved relationships in the MP clade. A few recent studies using limited samples have detected multiple types of PSV in this clade, such as a 78-kb inversion in Vigna radiata (L.) R. Wilczek and P. vulgaris, a 36-kb inversion in Lupinus luteus L. (Martin et al., 2014), the loss of rps16 gene in Cajanus Adans. (Guo et al., 2007;Schwarz et al., 2015), the loss of rpl2 and clpP introns (Kaila et al., 2016), and IR contraction/expansion in G. max (Saski et al., 2005;Kim et al., 2015). Investigation of the plastome of more taxa of this clade is essential for a better understanding of PSVs across this clade. In this study, we analyzed plastomes of 43 species (35 newly sequenced) representing all the six tribes of the MP clade. We investigated plastome structural diversification, and conducted phylogenetic reconstruction of the clade using plastome sequences. Deep phylogenetic relationships of the MP clade were investigated using the coding genes (CDs), noncoding regions (NCDs) and complete plastomes (CP). Our study provides important new insights into both phylogenetic relationships and PSVs within the MP clade.

Taxon Sampling, DNA Extraction, and Genome Sequencing
For this study, we used a total of plastomes of 43 species from the MP clade including one plastome from NCBI, seven plastomes from Zhang et al. (2020)'s phylogenetic study of the whole family, and newly sequenced plastomes of 35 species from 35 genera ( Supplementary Table S1). These species were selected based on the availability of tissues for sampling and their representation of previously recognized tribes in the clade (LPWG, 2013). Total genomic DNA (gDNA) was extracted from either fresh or silica-gel dried leaves using the modified CTAB method (Doyle and Doyle, 1987). The genome skimming method was used to obtain the plastome data (Zeng et al., 2018). The gDNA was fragmented and libraries size were selected for 350 bp inserts. Sequencing with 2 × 150-bp paired-end (PE) reads was performed on the Illumina Hiseq 2500/X-Ten at the Novogene (Tianjin, China) or Illumina Hiseq 2000/2500/4000/ X-Ten at the Beijing Genomics Institute (BGI) in Shenzhen, China.

Plastome Assembly and Annotation
The clean-up and quality control checks of the raw reads were performed using the Next Generation Sequencing (NGS) QC Tool Kit with default settings (Patel and Jain, 2012). Then, we assembled contigs from the PE reads via de novo assembly using GetOrganelle (Jin et al., 2019) with K-mer values 21, 45, 65, 85, 105, and127 calling SPAdes version 3.10 (Bankevich et al., 2012), using a reference genome from subfamily Papilionoideae (Arachis hypogaea L., NC_026676). Bandage v.0.80 (Wick et al., 2015) was used to visualize and filter the assembled contigs to generate a complete circular plastome. For incomplete plastomes, we filled the gaps between the contigs with consensus sequences of raw reads that were initially mapped to the reference plastome in order to obtain the complete plastome. The number of the mapped PE reads and the coverage depth were determined by mapping the paired reads against the plastome using Bowtie2 (Langmead and Salzberg, 2012) incorporated in Geneious v. 8.1.4 (Kearse et al., 2012).
The locations of the single copy (SC) and IR boundaries in the newly sequenced plastomes were determined using the same methods as Qu et al. (2019). The 'find repeat' function in Geneious was used to flank the IR regions. Then, the paired reads were remapped to the assembled plastomes to validate the SC/IR regions using Bowtie2. Finally, we visualized the read stacks of the newly assembled plastomes and compared the marked SC/IR boundaries in Geneious. The new plastomes were annotated using Dual Organellar Genome Annotator (DOGMA) web-interface (Wyman et al., 2004). We manually checked the consistency of start/stop codons and intron/exon boundaries in Geneious. The 'Find ORFs' function in Geneious was used to re-confirm the PCGs annotations, while tRNAscan-SE web service was applied to determine the tRNA genes (Schattner et al., 2005). The OrganellarGenomeDRAW [web server, (Lohse et al., 2013)] was used to draw the physical genomic map (Supplementary Figure S1). Finally, the complete newly assembled plastomes (35 in the MP clade and four outgroup species) were deposited in GenBank (Supplementary Table S1).

Plastome Structural Analysis
To investigate the patterns of genomic evolution, we analyzed and compared the structural characteristics of the 43 annotated plastomes. We examined structural characteristics such as plastome size (bp), LSC length (bp), SSC length (bp), IR length (bp), GC content (%), and gene distributions of all studied plastomes (Li et al., 2013;Supplementary Table S2; Table 1). For the contraction and expansion analysis, we compared the newly sequenced plastomes of the species from the MP clade with the A. hypogaea plastome. Afterward, we examined the variation of the genes located at the plastome termini and the boundary shifts (IR-SC) in the four junctions (J LB -LSC/IR B , J SB -IR B /SSC, J SA -SSC/IR A , and J LA -IR A /LSC) (Supplementary Figure S2). To confirm inversions, we aligned the 43 plastomes of species from the MP clade with the A. hypogaea plastome using the progressiveMauve algorithm (Wang et al., 2017). We used default settings to automatically calculate the seed weight (15), and calculated Locally Collinear Blocks (LCBs) with the minimum LCB score of 30,000 (Darling et al., 2004). The detected inversions were illustrated in Figure 3 and Supplementary Figure S3.

Phylogenetic Analysis
A total of 49 plastomes (including 43 species of the MP clade and six outgroups) were used for the phylogenetic analysis. The outgroups included two loosely related species of the subfamily Caesalpinioideae (Tamarindus indica L., NC026685, and Ceratonia siliqua L., NC026678) with plastome data downloaded from GenBank, and four more closely related species (

Protein of unknown function
Other genes ycf1 k , ycf2 We could not perform whole plastome alignment due to high PSVs in the legume plastomes. For this reason, we used the python script "get_annotated_regions_from_gb" (https://github. com/Kinggerm/PersonalUtilities) to extract the CDs and NCDs from the plastomes. We performed individual gene/region alignment in MAFFT v.7.4.0 (Katoh and Standley, 2013) with LINSI algorithm. All alignments were visualized and manually adjusted in Geneious. To reduce systematic error, we excluded noncoding loci with less than 70% taxon occupancy or alignment lengths less than 100 bp. We generated three data matrices for the phylogenetic analyses that included the CDs (81 genes for all species), NCDs (113 loci for all species), and CP (concatenated CDs and NCDs for all species). The substitution models for the three data matrices were determined using PartitionFinder2 v.2.1.1 (Lanfear et al., 2017). The evolutionary best fit models and data partitioning schemes (Supplementary Table S3) were selected using the corrected Akaike Information Criterion (AICc). Phylogenetic relationships were reconstructed using Maximum Likelihood (ML) and Bayesian Inference (BI). The ML analysis was performed using the IQ-TREE (Nguyen et al., 2015;Chernomor et al., 2016). We used the best partitioning schemes, -spp option (allowing partition-specific rates), and the ultrafast bootstrap replicates at 1000 for the analyses. The BI was performed using MrBayes v.3.2.6 ( Ronquist and Huelsenbeck, 2003). The Bayesian posterior probability (PP) was estimated with two independent Markov Chain Monte Carlo (MCMC) runs, which included one cold chain and three hot chains for 10,000,000 generations and the tree sampling frequency at every 1,000 generations. The MCMC convergence was determined, and the first 20% were discarded as burn-in using TRACER v.1.6 ( Rambaut and Drummond, 2004). Each parameter for each run obtained a sufficient effective sample size (ESS > 250). The majority-rule consensus tree was generated from the post burn-in trees. The resulting trees (ML and BI) were viewed and edited in FigTree v.1. 3.1 software (Rambaut, 2009).

Plastome Organization and Size
The mean plastome coverage ranged between 162.0 × (Philenoptera violacea ( Table S2). We observed only marginal variation in the GC content, which ranged from 34.2% in Dolichos falciformis E.Mey. of Phaseoleae to 35.8% in Indigofera spp. of Indigofereae (Supplementary Table S2).

Phylogenetic Relationships of the MP Clade
The phylogenies of the MP clade inferred from the three data matrices and two methods (ML and BI) yielded largely similar topologies, including well-resolved deep relationships of the MP clade (Figure 4). Our phylogenetic analyses strongly supported (BS ≥ 95 %, and PP = 1.0) the monophyly of the MP clade and most lineages. However, the lineage consisting of Butea monosperma (Lam.) Kuntze and Spatholobus Hassk sp. has different phylogenetic position in trees of CP and NCDs, and that of CDs, but both relationships were weakly supported. Also, the tribe Desmodieae was weakly supported to bemonophyletic in CP and NCDs datamatrices whereas strongly supported by CDs data. The tribe Indigofereae was strongly supported as sister to the remainder of the MP clade (BS = 100%, and PP = 1.0). Based on the current sampling, it is not sure if the tribe Desmodieae is monophyletic, while the tribes Millettieae and Phaseoleae appear non-monophyletic. Psoralea onobrychis Nutt. of the tribe Psoraleeae was nested within a big clade of the tribe Phaseoleae.

Gene Loss and Pseudogenization Events
Previous studies documented the loss of the genes rpl22 and infA in Lotus japonicus (Regel) K.Larsen of the Robinioid clade (Kato et al., 2000), Trifolium subterraneum L. of the IRLC (Cai et al., 2008), and G. max of the MP clade (Saski et al., 2005); this study confirmed the loss of both genes in all studied species of this clade. These two genes (rpl22 and infA) were reported lost in all the previously studied legume species (Saski et al., 2005) and almost all rosids (Millen et al., 2001). The functional copies of rpl22 and infA might have been transferred into nuclear genome [e.g., Pisum sativum L., (Gantt et al., 1991); Lupine L. species, (Martin et al., 2014)]. Previous studies suggested the loss of the ycf4 gene in Cicer L. sp., Glycine Wild. sp., and Medicago L. sp. (Magee et al., 2010;Kaila et al., 2016), or as pseudogene in P. sativum (Smith et al., 1991). Interestingly, we found ycf4 to be a normal gene in all newly sequenced plastomes of the species from the MP clade. We therefore attribute the absence of this gene in previous studies to inaccurate genome annotation, as the ycf4 gene is highly divergent (Kaila et al., 2016). The loss of the rps16 gene has been reported in some legumes   (Doyle et al., 1995). Again, we detected the loss of this gene in C. pubescens, C. ternatea, D. schlechteri, S. erecta, M. uniflorum and M. axillare of the tribe Phaseoleae of the MP clade ( Figure 1). The loss of introns (e.g., rpl2 intron 1) has occurred frequently in the plastomes of some angiosperm families as Convolvulaceae, Menyanthaceae, and Saxifragaceae (Downie et al., 1991), Leguminosae (Lee and Hymowitz, 2001;Jansen et al., 2008), and Lythraceae (Gu et al., 2016). Introns, especially those located at specific regions, are momentous in the transformational functionality and regulation of gene expressions (Xu et al., 2003). According to this study, with the exception of the loss of the clpP introns 1 and 2 in a single species of S. vestita (Phaseoleae) and the loss of ndh A and ndh B intron 1 in a single species of L. domingensis (Millettieae), two other introns (rps16 and rps12) have experienced multiple independent loss during the plastome evolution of the species from the MP clade. This finding agrees with the previous studies on the independent loss of rpsl2, rps16, and clpP introns in the MP clade (Guo et al., 2007;Schwarz et al., 2015;Kaila et al., 2016).
Consistent with previous studies in legumes, we observed the rps12 gene to have been trans-spliced (located in LSC region and the duplicated end in IR A ) during the plastome evolution of the species from the MP clade (Fonseca and Lohmann, 2017;Wang et al., 2017). Our results showed the expression of two distinct transcripts from a single gene. Previously, the rps12 gene ligation between exon 1 and 2 had been affirmed through complementary DNA sequencing of rps12 messenger RNA (mRNA) (Sharp, 1985). Thus, this evidence suggests that the rps12 gene was trans-spliced (exon 1 and exons 2-3) because of separate transcription. Trans-spliced events of a single gene during evolution are linked with two distinct transcripts encoding protein structural domains (Sharp, 1985) and reverse transcription of the trans-spliced, sequel to the insertion in the plastome (Baltimore, 1985). The exon-rearrangement paradigm during gene evolution propounds that gene fragments coding for protein structural domains (exon) are affected by reorganization into other genes (Gilbert et al., 1986). Also, RNA trans-splicing coding for rpsl2 exon 1 with transcripts from other genes may yield polypeptide variations in the plastome. These may be the underlying factor responsible for the rps12 gene trans-splicing event in the plastomes of the species from the MP clade.
Previous studies have documented pseudogenes in some species of the MP clade, for example rps16 and rpl33 in P. vulgaris (Guo et al., 2007); ycf15, rpl33, rps16, ycf68 and ycf1 in Cajanus scarabaeoides (L.) Thouars (Kaila et al., 2016); and rps16 in Lupinus (Keller et al., 2017). Our study identified rpl2, rps19, and ycf1 as pseudogenes (based on the presence premature stop codons and their reduced length) in most species of the MP  Table 1), while the rps16 and rpl33 genes were detected as normal genes in the species of the MP clade. The pseudogenization of these genes has been reported in other species, e.g. Melianthus villosus Bolus in Melianthaceae (Weng et al., 2014), Phalaenopsis aphrodite Rchb.f. in Orchidaceae (Chang et al., 2006), and Tylosema spp. in Mimosoideae (Wang et al., 2017). Pseudogenization of some genes is common in the plastomes of some plant taxa (Kim et al., 2015;Naumann et al., 2016;Keller et al., 2017). In previous studies, gene loss/pseudogenization in the plastome is attributed to rate of sequence evolution, gene transfer to the nucleus, or substitution by a nuclear-encoded protein for a plastid gene product (Ueda et al., 2008;Magee et al., 2010;Jansen and Ruhlman, 2012;Williams et al., 2015).

IR Contraction and Expansion
IR-SC boundary shifts played a significant role in the plastome size variation of the species from the MP clade ( Figure 1; Supplementary Figure S2). Significantly, a substantial expansion of the IR to include six ribosomal protein genes (rps3, rps8, rps11, rpl14, rpl16 and rpl36) resulted in the large plastome of C. cathartica (Phaseoleae) (Figure 2; Supplementary Figure S2). In contrast, in L. domingensis (Millettieae), the trnN and ycf1 genes have been relocated into the SSC following IR contraction, resulting in the smallest plastome studied of the MP clade. Additionally, the contraction/expansion of IR regions in the MP clade accounts for new positions of J LA between rps11 and rpoA; rps19 and rps3, and trnH and psbA. The IR contraction/expansions are frequent evolutionary events in angiosperm lineages, resulting in dramatic differences in the plastome length variations (e.g., Guisinger et al., 2011;Zeng et al., 2017). The rate of gene conversion during cell division/evolution and high content of short repeats (AT-rich) have also been noted as explanations for IR boundary shifts among several angiosperm lineages (Wang et al., 2008;Dugas et al., 2015;Wang et al., 2017). The same mechanisms might explain IR boundary shifts in plastomes of the species from the MP clade. The IR expansion to include the whole rps19 gene is a synapomorphic character for the M. axillare + M. uniflorum + S. erecta + D. schlechteri + L. purpureus clade. Most other IR contractions/expansions occurred independently across the MP clade.
Gene relocation within plastome has been reported in multiple previous studies (e.g., Lee et al., 2007;Kaila et al., 2016;Mower et al., 2019). For instance, the intragenomic transfer of ycf2 from the LSC region to the SSC region in lycophytes (Mower et al., 2019), the relocations of ycf3 and ycf4 within the LSC region of Menodora longiflora Engelm. ex A.Gray (Oleaceae, Lee et al., 2007), and the transfer of a block of ribosomal protein genes (rps19-rps8) from one end of the LSC region to the other end in the legumes-e.g. Vigna Savi (Perry et al., 2002), Phaseolus L. (Bruneau et al., 1990) and Cajanus spp. (Kaila et al., 2016). Similarly, our study detected translocation of genes within the LSC region in the plastomes of multiple species from the MP clade ( Figure 3). Additionally, we documented the relocation of a single gene (rpoA) in C. cathartica, and three ribosomal protein genes (rpl14, rpl16 and rps3) in a clade of Phaseoleae from one end of the LSC region to the other. Gene relocation can be associated with the subsequent contraction and expansion of the IR as observed in Pelargonium L'Hér. ex Aiton (Bruneau et al., 1990;Chumley et al., 2006). Alternatively, overlapping inversions and IR direction have been applied to explain the relocation of genes in the plastome of Oleaceae (Lee et al., 2007) and lycophytes (Mower et al., 2019), respectively. The IR expansion to include these genes is followed by the IR contraction at another end to relocate these genes into the SSC region. This appears to represent a more parsimonious explanation for the relocation of the rpoA gene and the segment comprising the genes rpl14, rpl16 and rps3.

Inversions
Several inversions including a 421-bp inversion in the mimosoid species (Wang et al., 2017), a 7.5-kb inversion in the Cercioideae (Kim and Cullis, 2017), and a large inversion of 50-kb in the subfamily Papilionoideae (Guo et al., 2007;Cai et al., 2008;Keller et al., 2017) occur in legumes. A few studies have documented the presence of inversions in species of the MP clade, such as V. radiata , L. luteus (Martin et al., 2014), and P. vulgaris (Bruneau et al., 1990). Importantly, an early molecular investigation (Bruneau et al., 1990) on plastome DNA inversions in Papilionoideae detected a large inversion (78-kb in size) between the psbA and rps11 genes in nine species of the tribe Phaseoleae. Also, prior studies documented a 50-kb inversion that spans the genes rbcL and rps16 in the plastomes of C. cajan and C. scarabaeoides (Kaila et al., 2016) and Cyamopsis tetragonoloba (L.) Taub. (Kaila et al., 2017)  Inversions might be linked with IR contraction/expansion (Bruneau et al., 1990), as shown by IV-A, B, and F in the study. The regions flanking three inversions (IV-C, D, and E) contain tRNA genes, which is consistent with the assumption that tRNA activity may influence inversion in plastome (Walker et al., 2014). Also, recombination through repeated sequences can induce inversions in plastome (Rogalski et al., 2006). We failed to detect any repeats in the breakpoint regions of these six inversions. Rearrangements such as inversions in plastid genomes of land plants are considered a useful marker to infer evolutionary relationships (Doyle et al., 1992). Large inversions have been considered informative for defining clades in legumes (Bruneau et al., 1990;Doyle et al., 1996;Dugas et al., 2015). For example, the inversion (IV-E) is synapomorphy of the monophyletic tribe Desmodieae excluding S. vestita. The IV-D occurs multiple times in tribes Millettieae and Phaseoleae. The other four inversions (IV-A, B, C, and F) occur in multiple separate lineages of Phaseoleae.

Phylogenetic Relationships in the MP Clade
Appropriate data partitioning is important for achieving accurate phylogenetic result in simultaneous utilization of multiple genes (Li et al., 2013;Saarela et al., 2018;He et al., 2019), a way may greatly abate the erroneous phylogenetic inferences caused by unequal rates and patterns of nucleotide substitutions in plastomes (Li et al., 2008). Our results indicated that ML and BI analyses with multiple genes partitioned models (CDs, NCDs, and CP) presents well-resolved evolutionary relationships of the MP clade. This study underscores the utility of plastid phylogenomics for resolving intertribal and intergeneric relationships within the MP clade (Figure 4). Evolutionary relationships among the major lineages, tribes, and genera were resolved with high support values. Consisted with previous studies (Hu et al., 2000;Wojciechowski et al., 2004;Cardoso et al., 2013;de Queiroz et al., 2015;LPWG, 2017), our analyses supported the tribe Indigofereae as sister to the remaining members of the MP clade. Desmodieae was supported as monophyletic group in previous studies (Bruneau et al., 1994;Doyle et al., 1997;Kajita et al., 2001;Stefanovic et al., 2009;Cardoso et al., 2013;de Queiroz et al., 2015;Egan et al., 2016), however this tribe was strongly supported as monophyletic by CDs but weakly supported by CP and NCDs (Figure 4). Our phylogenetic analyses suggested the polyphyly of the tribes Millettieae and Phaseoleae, which are consistent with previous studies (Hu et al., 2000;Wojciechowski et al., 2004;de Queiroz et al., 2015;Vatanparast et al., 2018). Previous studies (Wojciechowski et al., 2004;Cardoso et al., 2013;de Queiroz et al., 2015;LPWG, 2017) included multiple genera and supported the monophyly Psoraleeae. The phylogenetic analysis of Stefanovic et al. (2009) based on eight plastid genes supported the tribe Psoraleeae as sister to Phaseoleae, whereas it is nested within the Phaseoleae in this study and several other studies (e.g., Hu et al., 2000;de Queiroz et al., 2015;Vatanparast et al., 2018). Our study benefits from having a more comprehensive taxon sampling and involving whole plastome sequences for phylogenetic analysis; thus, it marks the beginning of a better understanding of evolutionary relationships in the MP Clade. For instance, our study highly supported the relationships of (1) C. ternatea + C. pubescens (BS = 100/PP = 1) and (2) A. blackii + C. ternatea + C. pubescens (BS = 100/PP = 1); these relationships were only weakly supported in previous studies (Kajita et al., 2001;Vatanparast et al., 2018). Notably, our multi-locus plastome data suggested (BS = 100%, PP = 1) the evolutionary position of S. vestita as sister to the tribe Desmodieae, in contrast with previous placement close to the subtribe Kennediinae of the tribe Phaseoleae (e.g., de Queiroz et al., 2015). Formerly, the genus Shuteria was included in the tribe Phaseoleae based on flower structures shared with core Phaseoleae species (e.g., Amphicarpaea Elliott ex Nutt., Cologania Kunth, and Dumasia DC., Lackey et al., 1981). It is noteworthy that a similar phylogenetic placement in the MP clade has been shown from analysis based on the single plastid region matK (de Queiroz et al., 2015). Therefore, our phylogeny supports the placement of S. vestita as sister to the tribe Desmodieae. Nevertheless, we expect that future phylogenetic studies would improve the understanding of the phylogenetic relationships of the genus Shuteria within the clade. Collectively our results provide important insights on the backbone relationships of the MP clade. However, additional phylogenetic study, perhaps integrating additional molecular data with morphological traits, will be necessary to fully clarify the evolutionary relationships of this clade.

Insights Into the Plastomic Evolution of the MP Clade
Some large inversions in the MP clade seem to have phylogenetic signal for the MP clade (Figure 1). The IV-A was only found in C. ternatea, IV-B and IV-C only in C. cathartica, and IV-F in the clade of L. purpureus + M. axillare + M. uniflorum + S. erecta + D. schlechteri. The IV-E was only detected in the tribe Desmodieae, which supports the monophyly of the tribe. Of note, the IV-D is a synapomorphy of one subclade of the tribes Phaseoleae and core-Millettieae, which is congruent with their closely related evolutionary relationships. Consistent with some previous studies (Martin et al., 2014;Dugas et al., 2015;Choi and Choi, 2017), our results suggest that significant plastome structural rearrangements such as inversion may provide useful information about phylogenetic relationships. However, some previous studies have suggested caution in using inversions in phylogenetic analysis. For example, a 36-kb inversion has been documented in distantly related lineages of papilionoids . Also, a 29-kb inversion has been reported from distantly related species of Ranunculaceae (Anemone L. and Clematis L., Hoot and Palmer, 1994). Additional sampling is necessary to better evaluate the utility of large PSVs for phylogenetic reconstruction in the MP clade. The independent loss of genes, exons, and introns was observed across different lineages of the MP clade. These results are consistent with previous studies that have shown multiple independent losses of specific genes in plastomes of different plant groups (e.g., Gu et al., 2016;Kaila et al., 2016). These kinds of PSV therefore seem to have low phylogenetic signal. Similarly, pseudogenization events have occurred independently across the MP lineages, indicating that these as well are likely not useful for inferences of phylogenetic relationships. Many observed PSVs in the MP clade plastomes suggest significant structural variation following the diversification of this lineage. In total, this study provides new insights into the phylogenetic relationships and PSVs within the MP clade.

DATA AVAILABILITY STATEMENT
The datasets generated for this study can be found in Genbank; the list of accession can be found in Supplementary Table 1.

AUTHOR CONTRIBUTIONS
T-SY, OO, and RZ designed the research. OO and RZ performed the experiments and assembled the plastomes. OO, RZ, and S-YC conducted the analysis. OO and T-SY wrote the manuscript. All authors revised the manuscript and approved the final manuscript.

FUNDING
This study was supported by grants from the Large-scale Scientific Facilities of the Chinese Academy of Sciences (No. 2017-LSF-GBOWS-02), the Strategic Priority Research Program of Chinese Academy of Sciences (XDB31010000), the National Natural Science Foundation of China [key international (regional) cooperative research project No. 31720103903].