Phylogenomics and phylogeography of Menispermum (Menispermaceae)

Introduction Phylogenomics have been widely used to resolve ambiguous and controversial evolutionary relationships among plant species and genera, and the identification of unique indels in plastomes may even help to understand the evolution of some plant families. Menispermum L. (Menispermaceae) consists of three species, M. dauricum DC., M. canadense L., and M. mexicanum Rose, which are disjuncly distributed among East Asia, Eastern North America and Mexico. Taxonomists continue to debate whether M. mexicanum is a distinct species, a variety of M. dauricum, or simply a synonym of M. canadense. To date, no molecular systematics studies have included this doubtful species in phylogenetic analyses. Methods In this study, we examined phylogenomics and phylogeography of Menispermum across its entire range using 29 whole plastomes of Menispermaceae and 18 ITS1&ITS2 sequences of Menispermeae. We reconstructed interspecific relationships of Menispermum and explored plastome evolution in Menispermaceae, revealing several genomic hotspot regions for the family. Results and discussion Phylogenetic and network analyses based on whole plastome and ITS1&ITS2 sequences show that Menispermum clusters into two clades with high support values, Clade A (M. dauricum) and Clade B (M. canadense + M. mexicanum). However, M. mexicanum is nested within M. canadense and, as a result, we support that M. mexicanum is a synonym of M. canadense. We also identified important molecular variations in the plastomes of Menispermaceae. Several indels and consequently premature terminations of genes occur in Menispermaceae. A total of 54 regions were identified as the most highly variable plastome regions, with nucleotide diversity (Pi) values > 0.05, including two coding genes (matK, ycf1), four introns (trnK intron, rpl16 intron, rps16 intron, ndhA intron), and 48 intergenic spacer (IGS) regions. Of these, four informative hotspot regions (trnH-psbA, ndhF-rpl32, trnK-rps16, and trnP-psaJ) should be especially useful for future studies of phylogeny, phylogeography and conservation genetics of Menispermaceae.

Menispermum, a genus of deciduous climbing woody lianas, is disjuntcly distributed in East Asia, Eastern North America and Mexico (Xiang et al., 2000;Hina et al., 2020; Figure 1). Unlike most members of Menispermaceae, which are endemic to the tropics and subtropics, the ranges of Menispermum extend well into the northern temperate zone. For example, M. dauricum DC. (described in 1817) grows in central to northeastern China, southern Siberia, Korea, and Japan, growing amongst roadside vegetation and/or within open forests (Wu et al., 2008). Menispermum canadense L. (described in 1753) occurs in temperate eastern North America, growing in deciduous woods and thickets (Purrington and Horn, 1993;Morin, 1997). Menispermum mexicanum Rose (published in 1911) is considered to be a third species in the genus because of its larger drupes, glaucous lower leaves and disjunct distribution (northern Mexico) from M. canadense. Although the two species inhabit differ hemispheres, Kundu and Guha (1980) stated that M. mexicanum is most similar to M. dauricum and proposed that M. mexicanum is a variety of M. dauricum, based on detailed morphological and comparative anatomical studies. However, Calderón de Rzedowski (1999) proposed that M. mexicanum is a synonym of M. canadense. Despite its taxonomic uncertainty, M. mexicanum has never been sampled in any published molecular studies.
Today there are numerous molecular tools available to plant systematists, and analyses of whole plastomes have become widely used to resolve the circumscription of taxa at different levels of classification within angiosperms, including the families of earlydiverging eudicots (Sun et al., 2016), the genera of Ranunculaceae (Liu et al., 2018b;Zhai et al., 2019), and even among species of Aconitum L. (Kong et al., 2017). Comparative plastome analyses also have helped to explore the structural variation of plastomes in angiosperm, such as gene indel (insertion/deletion) events, gene rearrangements, and/or inverted repeat expansion-contraction (Sun et al., 2016;Sun et al., 2018;He et al., 2019;Song et al., 2022). Plastome data can also be a good tool to analyze the phylogeographic patterns among plants (Sun et al., 2018;Demenou et al., 2020;Duan et al., 2020;Xiang et al., 2021). Due to characteristics of plastomes such as maternal inheritance, low to moderate evolutionary rate, their haploid nature which enhances genetic drift, plastome sequences may show a stronger phylogeographical pattern compared to nuclear DNA (Sugiura, 1992;Moore et al., 2010;Demenou et al., 2020;Zhang et al., 2021).

Plant material and DNA extraction
Fresh leaves from three Menispermum species and Sinomenium acutum were collected and then dried in silica-gel. In total twelve samples of M. canadense, four of M. dauricum, and one Sinomenium acutum were included. Voucher specimens were mostly deposited at the Herbarium of Zhejiang University (HZU), with the two M. mexicanum specimens (one of which is an isotype specimen) were sampled from the Gray Herbarium (GH) and Arnold Arboretum Herbarium (A) of Harvard University (Table 1). Total genomic DNA was extracted using DNA Plantzol Reagent (Invitrogen, Carlsbad, CA, United States) according to the manufacturer's protocol. Agarose gel electrophoresis and an ultraviolet spectrophotometer (K5800, KAIAO, Beijing, China) were used to check the quality and quantity of genomic DNA, respectively.

Genome sequencing, assembly, and annotation
Short-inserts of 500-bp paired-ends were used to construct the libraries by Genomic DNA Sample Prep Kit (Illumina, San Diego, CA, United States). We used tags to index DNA from each species and pooled samples together for sequencing on a HiSeqTM 2500 platform at the Beijing Genomics Institute (BGI, Shenzhen, China) to obtain

Species and sample code
Collection locality Voucher information clean reads of each sample. Then, these reads were assembled into contigs or plastid sequences using GetOrganelle v 1.7.5.3 (Jin et al., 2020), and visualized in Bandage v 0.8.1 (Wick et al., 2015) to identify whether these contigs or plastid sequences were the whole plastome for each sample. Additionally, due to low sequencing depth of two M. mexicanum herbarium samples, only small size of contigs were obtained by GetOrganelle. To obtain longer plastid sequences, these short contigs were mapped to the reference M. dauricum (MH298220, Hina et al., 2018). Yet, there are still a lot of gaps. To obtain these gapsequences, NOVOPlasty v 4.2 (Dierckxsens et al., 2017) was applied to assemble these sequences,with the neighboring protein-coding or tRNA genes in the plastome reference corresponding these gap sequences as seeds, and using the same plastome reference above. At last, these newly assembled gap-sequences were inserted into the above longer plastid sequences of M. mexicanum using software Geneious Prime ® 2021.2.2 (www.geneious.com) and we eventually generated their whole plastomes. Plastomes were annotated using '2544-plastome' dataset of CPGAVAS2 (Shi et al., 2019), with default setting. Then, these whole plastomes were illustrated with the online tool OrganellarGenome DRAW v1.3.1 (Greiner et al., 2019) and deposited in GenBank (Table 1). We also downloaded ten additional whole plastomes of Menispermaceae from NCBI, i.e., A. gusanlung , Sinomenium acutum (MN626719) and reannotated these plastomes to perform a comparative plastome analysis of these Menispermaceae. The 18 ITS1&ITS2 were also assembled using GetOrganelle, but M. mexicanum (M2) were failed.

Comparative plastome analyses
Altogether, by combining new and previously published plastomes (Table 1), a total of 12 plastomes representing 12 species of Menispermaceae were used to investigate the plastome evolution in this family. Interspecific variation was documented using mVISTA (Frazer et al., 2004) in Shuffle-LAGAN mode. Rearrangements of plastomes were checked by Mauve 2.3.0 (Darling et al., 2004). Gene and structure differences in the junctions of single copies and inverted repeat regions were visualized and compared through the software IRscope (Amiryousefi et al., 2018). The protein-coding sequences (CDS), intergenic spacer (IGS) and intron regions were extracted sequentially according to the criterion that alignment length > 200 bp and containing at least one mutation to estimate the nucleotide diversity (Pi) of plastid sequences of Menispermaceae. Pi value of each sequence was calculated in DNASP v 6.12.03 (Rozas et al., 2017).

Phylogenetic and phylogeographic analyses
The phylogeny of Menispermaceae based on whole plastome/ nrDNA was inferred using maximum likelihood (ML) and Bayesian inference (BI) methods via RAxML-HPC2 on XSEDE v 8.2.12 (Stamatakis, 2014) on CIPRES Science Gateway website (https://www.phylo.org) and MrBayes v 3.2.6 (Ronquist et al., 2012), respectively. For ML analysis, we set 1000 bootstrap replicates but used defaults for the other parameters. For BI analysis, we first evaluated the proper model of evolution (GTR +F+I+G4 for ptDNA, GTR+F+I for nrDNA) based on the Akaike Information Criterion (AIC) in ModelFinder (Kalyaanamoorthy et al., 2017), then ran two independent Markov chain Monte Carlo (MCMC) chains, with a set of 1,000,000 generations for each chain, and sampled one time for every 1,000 generations; the first 25% of the trees were discarded.
For phylogeographic analyses, we focused only on the 18 samples of Menispermum. The 18 plastomes were aligned with defaulted parameters by the plug-in of MAFFT in Geneious Prime ® 2021.2.2, and inferred the number of haplotypes (Nh) (excluding sites with gaps/missing data) in DNASP. The genealogical relationships of haplotypes were identified via TCS haplotype network using the software PopART v 1.7 (Leigh et al., 2015). The 17 ITS1&ITS2 sequences of Menispermum were analyzed using the same methods.

Plastome features
Total coverage of the 19 newly sequenced plastomes ranged from 9 × [M. mexicanum (M2)] to 536.9 × [M. canadense (C10)] (Table 2). These whole plastomes of Menispermaceae show a typical angiosperm quadripartite structure, including a pair of IR regions (IRa and IRb) and two single copy regions (LSC and SSC) ( Figure 2). All the plastomes contain 114 unique genes, consisting of 80 CDS genes, 30 tRNA genes, and four rRNA genes (Table S1). The sizes of these newly assembled whole plastomes of Menispermeae range from

Phylogenetic and the phylogeographic analyses
The phylogenetic trees we generated using ML and BI methods resulted in similar topologies with high support (Figure 3). The Coscinieae + Burasaieae clade is the first diverging lineage of the family. This is followed by the clade of Anomospermeae + Cissampelideae as sister to tribe Menispermeae, within which Sinomenium is sister to the species of Menispermum. Menispermum dauricum (Clade A) is monophyletic and sister to the Clade B of M. canadense + M. mexicanum. Clade B is divided into two highly supported clades, Clade B1 (BS = 90, PP = 1) and Clade B2 (BS = 90,  (Figure 4). In clade b, M. mexicanum (M1) was nested in clade b1 which is sister to clade b2. The 18 whole plastomes of Menispermum were identified to have 15 haplotypes ( Figure 5A). All of these are unique except that haplotypes H3, H7 and H10 are shared in two samples ( Figure 5A). The 17 ITS1&ITS2 sequences of Menispermum were identified to have 15 ribotypes ( Figure 5B). All of these are unique except that ribotype R5 is shared by three samples (C3, C4, C8) ( Figure 5B). The haplotypes and ribotypes cluster significantly into two lineages, i.e. M. dauricum and M. canadense + M. mexicanum. Yet, the haplotypes and ribotype of M. mexicanum sampled are all embedded within M. canadense.

Comparative plastome analyses
Due to minor variations in each species, we performed a comparative plastome analysis of Menispermaceae using only 12 plastomes representing 12 species. The Stephania species have the shortest plastomes (~157 kb), whereas species of Menispermeae have the longest (~163 kb). Global visualization using mVISTA and MAUVE showed that all of these 12 plastomes of Menispermaceae have a consistent gene order, except that the rpoC2 gene in the plastome of T. sinensis is inverted (Figures S1, S2). The rps19 gene of F. recisa is located in LSC, with a 32 bp-distance away from the junction of LSC and IRb (JLB), whereas the other eleven species have their rps19 located at the boundary of LSC/IRb (JLB) and the genes copied 71-174 bp sequences (yrps19) in IRa ( Figure 6). The junction of SSC/IRa (JSA) in the all 12 plastomes are located in ycf1 gene (5574 bp-5615bp), and the gene copied different length of sequences (yycf1) in IRb, which ranged from 18 bp in F. recisa to 483 bp in S. dielsiana ( Figure 6). Especially, only two species (S. dielsiana and S. epigaea) have their ndhF gene across the SSC/IRB boundary (JSB).
We also calculated the length of all genes across 29 plastomes in Menispermaceae. Of all 114 genes, 26 genes varied in length, and only minor variations (less than 42 bp) were documented among 18 genes, i.e. matK, atpF, rpoC1, rpoB, ndhK, atpE, accD, rps18, rpl20, petD, rpoA, infA, rpl23, ndhF, rpl32, ccsA, rps15, ycf1 (Table S2). The rpoC2 gene in the plastome of T. sinensis occurred a base replace from "T" to "C" in the locus of 2137 bp, which results in a premature termination of that gene, about 2,000 bp shorter than the other plastomes. The ycf2 gene in all Stephania plastomes has an approximately 417 bp deletion at locus 1787/1788 bp, compared to the plastomes from the other genera. In the plastome of P. glaucus, the ycf15 gene is about 200 bp shorter than the other 28 Menispermaceae plastomes because of a 10-bp insertion of "TATTCTATTA" in the locus of 211 bp, resulting in a premature termination. The sequences in the second extron of rps16 gene in S. dielsiana and S. epigaea, are significantly different from the other 27 plastomes, with "AGAATAAAA" and "AGAATAAAT" replacing "GT" or "AT" in the loci of 103-111 bp, which results in the rps16 gene being almost halved in the two plastomes, compared to the remaining plastomes. Two single-copy (SSC, LSC) regions exhibit a higher level of sequence variation than the IR regions ( Figure 2). In the alignment of the 12 plastomes of Menispermaceae, a total of 147 regions were extracted to calculate the value of nucleotide diversity (Pi values). They included 65 intergenic spacer (IGS) regions, 63 protein-coding (CDS) regions, 17 intron regions (of CDS/tRNA genes), and two rRNA gene (Figure 7). A total of 54 regions with Pi more than 0.05 in LSC or SSC were revealed (Figure 7). The Pi in CDSs and intron regions showed significantly lower than those in IGS regions. These CDSs ranged from 0.00317 (rps7) to 0.06347 (ycf1), only matK and ycf1 genes showed high values more than 0.05. For the intron regions, Pi ranged from 0.00436 (trnA intron) to 0.06359 (ndhA intron), four introns (trnK, rpl16, rps16 and ndhA) gene showed high diversity (Pi > 0.05). Besides, for the 65 IGS regions, the value of Pi ranged from 0.00631 (ndhB-rps7) to 0.14317 (trnS-trnG), 48 regions showed high diversity, and six showed remarkably high diversity (Pi > 0.1; i.e., trnS-trnG, trnH-psbA, ndhF-rpl32, trnK-rps16, ccsA-ndhD, trnP-psaJ; see Figure 5). The sizes and Pi values of the six hotspot regions are shown in Table S3, and corresponding phylogenetic trees based on each region are shown in Figure S3.
Within tribe Menispermeae, Sinomenium is sister to Menispermum with maximum support (BS = 100, PP = 1, Figure 3), consistent to the  Maximum likelihood (ML) and Bayesian inference (BI) phylogeny based on 29 complete plastomes of Menispermaceae, with Ranunculus as outgroup. Numbers at each node represent ML bootstrap support (BS) and BI posterior probability (PP) values, respectively. Hyphens indicate the nodes not found in the strict consensus BI tree. The phylogram on the upper left shows the relative branch lengths. Sample codes are the same as in Table 1. previous results based on plastid and/or combined rDNA sequences (Hoot et al., 2009;Jacques et al., 2011;Wang et al., 2012;Wefferling et al., 2013;Ortiz et al., 2016;Lian et al., 2020). The species M. dauricum (Clade A) from East Asia is monophyletic, and the two accessions from eastern (D1) and central China (D2) clustered into a subclade, whereas the remaining two from northern China (D3) and Japan (D4) clustered into another subclade ( Figure 1 and Table 1). A similar lineage divergence between north and south has been found in unrelated plants such as Saxifraga sect. Irregulares  indicating that this pattern is not random. Clade B (Figure 3) contains the accessions of M. canadense from the eastern United States and Canada plus two accessions of M. mexicanum from northern Mexico, which are not sister to each other. Based on the holotype specimen (U. S. National Herbarium no. 462662) collected by Dr. C. G. Pringle in the Sierra Madre, Mexico on July 9 th , 1907, Rose (1911) proposed a new species, M. mexicanum, because of its leaves not being glaucous beneath, larger drupes, and a much more southern range than M. canadense. However, Kundu and Guha (1980) considered M. mexicanum to be a variety of Asian M. dauricum based on their detailed morphological and comparative anatomical studies. Our study is the first molecular systematics study that includes all three species. Both accessions of M. mexicanum are from different herbarium specimens collected in Mexico (they cannot be misidentifications of M. canadense) and one of them M. mexicanum (M1) was sampled directly from a historical isotype specimen. These two accessions of M. mexicanum are nested within M. canadense in both phylogenetic tree (Figures 3, 4) and TCS diagram ( Figure 5), rendering neither species monophyletic. Hence, we here propose that M. mexicanum is merely a geographic disjunct population and a synonym of M. canadense. This result supports the taxonomic treatment published by Calderoń de Rzedowski (1999).
The rpoC2 gene in T. sinensis contains an inversion and significant truncation of about 2 kb (Table S2 and Figure S2), which is possibly due to the base substitution of "T/C" in the locus 2,137 bp and resulting in Maximum likelihood (ML) and Bayesian inference (BI) phylogeny based on 17 ITS1&ITS2 sequences of Menispermum, with Sinomenium as outgroup. Numbers at each node represent ML bootstrap support (BS) and BI posterior probability (PP) values, respectively. Hyphens indicate the nodes not found in the strict consensus BI tree. The phylogram on the upper left shows the relative branch lengths. Sample codes are the same as in Table 1.

FIGURE 5
Haplotype (A) and ribotype (B) networks of Menispermum. C, D, and M represent M. canadense, M. dauricum, and M. mexicanum, respectively. Sample codes are the same as in Table 1. Comparison of the LSC/IRb/SSC/IRa junctions among the 12 complete plastomes of Menispermaceae. Sample codes are the same as in Table 1. Nucleotide diversity (Pi) values of 12 Menispermaceae plastome sequences. Song et al. 10.3389/fpls.2023.1116300 Frontiers in Plant Science frontiersin.org the premature termination of that gene. The pseudogene yycf1 of S. dielsiana and S. epigaea, which may have resulted from an IR expansion, is~200 bp longer than that of the ten other Menispermaceae species (Song et al., 2022). The rps19 gene of F. recisa is located in the LSC region, resulting in the fact that yrps19 does not appear in IRa. This phenomenon also has been observed in other plant families, such as some species of Aconitum (Ranunculaceae), Dicorynia paraensis Benth.
(Fabaceae), and at least three species of Oxalis L. (Oxalidaceae) (Kong et al., 2017;Bai et al., 2021;Li et al., 2021). The plastome length of our twelve focus species shows significant differences (Figure 6), ranging from 156,843 bp (Stephania dielsiana) to 163,095 bp [M. canadense (C1)], which is mainly ascribed to insertions/deletions (indels) in the intergenic spacer regions. Yet, some gene length differences are significant between tribes or genera, and even among species. For example, an~417 bp deletion within the ycf2 gene is present in all Stephania plastomes. A 12 bp deletion in the atpF gene, which may be a molecular synapomorphy of the tribe Menispermeae, occurs only in the genera Sinomenium and Menispermum. For Menispermum, the ndhF gene in four M. dauricum plastomes has a 6 bp insertion compared with those from M. canadense and M. mexicanum. Indels and consequently premature termination of various genes were also found in the plastome evolution of Menispermeae, such as an about 200 bp truncation of P. glaucus caused by the 10-bp insertion of "TATTCTATTA". In brief, indels and premature termination events of genes are usual phenomenon of plastome evolution in angiosperms and we have documented them within Menispermaceae as well (Wang et al., 2007a;Fu et al., 2017;Wang et al., 2020;Song et al., 2022).

Highly variable regions of Menispermaceae plastomes
Several plastid gene and spacer sequences (atpB, matK, ndhF, rbcL, and trnL-F) have been used to resolve the backbone phylogeny of Menispermaceae (Ortiz et al., 2007;Jacques et al., 2008;Hoot et al., 2009;Jacques et al., 2011;Wang et al., 2012;Wefferling et al., 2013;Ortiz et al., 2016;Lian et al., 2020). However, only 63 of 72 genera and fewer than 150 of 526 species were sampled in those previous studies (Hong et al., 2001;Ortiz et al., 2007;Wang et al., 2007b;Hoot et al., 2009;Jacques et al., 2011;Wang et al., 2012;Wefferling et al., 2013;Lian et al., 2020), suggesting that additional sampling of both taxa and data might be needed to get a full picture of evolutionary relationships within the family. In this study, of the six newly proposed hotspot regions, with nucleotide diversity (Pi) values > 0.1 just four loci (trnH-psbA, ndhF-rpl32, trnK-rps16 & trnP-psaJ) are adequate to recover the monophyly of all genera and species that are represented by more than one individual (Table S3). This is encouraging and suggest that these markers will be useful candidate DNA barcodes for further studies on phylogeny and phylogeography of Menispermaceae (Barbosa-Filho et al., 2000;Singh and Bedi, 2016;He et al., 2018;Hina et al., 2020;Hao et al., 2022).

Author contributions
PL and ZY designed this study. PL and SW collected plant materials. SS and ZY assembled and analyzed the data, and prepared the figures and tables. ZY, SS, and PL wrote the original draft, KC, YW, and XJ modified the manuscript. All authors contributed to the article and approved the submitted version.