Analyses of Plastome Sequences Improve Phylogenetic Resolution and Provide New Insight Into the Evolutionary History of Asian Sonerileae/Dissochaeteae

Sonerileae/Dissochaeteae (Melastomataceae) comprises ca. 50 genera, two thirds of which occur in Southeast Asia. Phylogenetic relationships within this clade remain largely unclear, which hampers our understanding of its origin, evolution, and biogeography. Here, we explored the use of chloroplast genomes in phylogenetic reconstruction of Sonerileae/Dissochaeteae, by sampling 138 species and 23 genera in this clade. A total of 151 complete plastid genomes were assembled for this study. Plastid genomic data provided better support for the backbone of the Sonerileae/Dissochaeteae phylogeny, and also for relationships among most closely related species, but failed to resolve the short internodes that likely resulted from rapid radiation. Trees inferred from plastid genome and nrITS sequences were largely congruent regarding the major lineages of Sonerileae/Dissochaeteae. The present analyses recovered 15 major lineages well recognized in both the nrITS and plastid phylogenies. Molecular dating and biogeographical analyses indicated a South American origin for Sonerileae/Dissochaeteae during late Eocene (stem age: 34.78 Mya). Two dispersal events from South America to the Old World were detected in late Eocene (33.96 Mya) and Mid Oligocene (28.33 Mya) respectively. The core Asian clade began to diversify around early Miocene in Indo-Burma and dispersed subsequently to Malesia and Sino-Japanese regions, possibly promoted by global temperature changes and East Asian monsoon activity. Our analyses supported the previous hypothesis that Medinilla reached Madagascar by transoceanic dispersal in Miocene. In addition, the generic limits of some of the genera concerned are discussed.

The estimated age and biogeographical history of Sonerileae/Dissochaeteae also remain controversial. Several studies have explored these issues using fossil-calibrated phylogenies based on DNA sequences of one or a few regions (ndhF, rbcL, rpl16, nrITS, ETS, 18S, 26S) (Morley and Dick, 2003;Renner, 2004a;Renner, 2004b;Berger et al., 2016;Veranso-Libalah et al., 2018). The stem age of Sonerileae/Dissochaeteae was estimated to be 19 Mya, 38 Mya (Berger et al., 2016), 39.63 Mya (Veranso-Libalah et al., 2018), or 73 Mya (Morley and Dick, 2003). Previous authors had proposed three competing hypotheses on the biogeographical history of Sonerileae/Dissochaeteae: (1) diversification in SEA during Miocene and subsequent transoceanic dispersal of two sublineages from SEA to Madagascar and Africa; (2) dispersal from South America to Africa ca. 74 Mya followed by diversification and rafting on the Indian Plate to SEA ("Indian Ark" hypothesis); and (3) trans-Atlantic dispersal from South America to Africa during Late Eocene. The merit of these hypotheses needs further testing.
Previous studies were based on sequence data of nuclear ribosomal ITS (nrITS) and/or a few chloroplast markers (trnV-trnM, ndhF, rbcL, rpl16), both of which have their own limitations. nrITS, although proven useful in phylogenetics of Melastomateae (Veranso-Libalah et al., 2017;Veranso-Libalah et al., 2018), failed to resolve the backbone of Sonerileae/Dissochaeteae. Chloroplast DNA sequences have been extensively used in phylogenetic analyses of angiosperms because of their conserved structure, high copy numbers, and uniparental inheritance (Birky, 1995). However, the use of only a few chloroplast genes is often insufficient to resolve genus- or species-level relationships because of their low mutation rate (Shaw et al., 2007;Dong et al., 2012;Shaw et al., 2014). The advent of next-generation sequencing technologies offers a cost-effective means to obtain chloroplast genomic data, which have been successfully used to tackle difficult phylogenetic questions of plants from deep to shallow taxonomic levels (Ma et al., 2014;Stull et al., 2015;Zhang et al., 2016;Heckenhauer et al., 2018;Niu et al., 2018;Wen et al., 2018). A chloroplast phylogenomic approach may help to better elucidate the relationships and biogeography of Sonerileae/Dissochaeteae. This paper aims to (1) reconstruct the phylogenetic relationships in Sonerileae/Dissochaeteae, (2) infer the divergence times and biogeographical history, and (3) reassess the current generic delimitations based on the resulting phylogeny. To this end, we include in this study 138 species representing 23 genera in Sonerileae/Dissochaeteae, with a special emphasis on the widely distributed Phyllagathis.
Two phylogenomic datasets were assembled. (1) The Melastomataceae dataset, which contained all chloroplast genomes available in the family, was assembled for phylogenomic analysis. The resulting tree was also used as the input tree in divergence time estimation and ancestral range reconstruction. The dataset was pre-analyzed using an outgroup from Myrtaceae (Eucalyptus grandis W. Mill ex Maiden). The most basal clade of Melastomataceae, Memecylon-Pternandra, was then selected as the outgroup for this dataset. (2) The Sonerileae/Dissochaeteae dataset was used to compare the phylogenies generated by chloroplast genome and nrITS sequence data. It comprised the chloroplast genomes of this clade and Blakea schlimii (Naudin) Triana (Blakeeae), with the latter selected as an outgroup.
To facilitate discussion, two additional datasets were assembled. An nrITS dataset parallel to the Sonerileae/Dissochaeteae genomic dataset (excluding Opisthocentra clidemioides Hook.f. as its nrITS sequence was not available) was analyzed for comparison. Another dataset of five concatenated plastid regions (rbcL, rpl16, ndhF, psbK-psbL, accD) (hereafter referred to as the cp-5 gene dataset) with partially missing data was constructed to test the phylogenetic position of those African and South American species without available plastid genomic data. See Table S2 for the detailed sampling list and GenBank accession numbers for nrITS and plastid markers.

DNA Isolation, Chloroplast Genome Sequencing
Total DNA was extracted from silica-gel dried leaves or fresh leaf tissue (when available) using the modified CTAB procedure (Doyle and Doyle, 1987) or using HiPure Plant DNA Mini Kit (Magen, Guangzhou, China) following the manufacturer's protocols. Libraries were prepared from the total genomic DNA of 151 samples using Next Ultra II DNA Library Construction Kit (NEB, Beijing, China) following the manufacturer's protocols. Shotgun sequencing was then performed on an Illumina HiSeq™ 2500 platform (150 bp paired-end reads) at Vazyme (Nanjing, China)/Novogene (Beijing, China).
The nrITS region was amplified and sequenced using universal primers (White et al., 1990) or assembled and extracted from our genomic shotgun sequencing reads. For polymerase chain reaction (PCR) amplification and sequencing, we followed the same procedure described in Zou et al. (2017). A mapping-based method was used to extract nrITS sequences from NGS sequencing data. First, nrITS sequences of the most closely related species were used as references to construct a BWA index (Li and Durbin, 2010). Short reads were then mapped to the reference with BWA-MEM. The resulting aligned SAM files were sorted and converted to BAM format. Single nucleotide polymorphism (SNP) and indel calling was conducted with SAMtools mpileup (Li et al., 2009) and BCFtools (https://github.com/samtools/bcftools). Finally, the consensus option of BCFtools was used to replace the corresponding positions of the reference with the called variants, yielding a synthetic nrITS sequence in FASTA format. Eighty nrITS sequences were newly generated.
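The mapping-based extraction described above can be sketched as an ordered list of command lines. This is only an illustration under stated assumptions: the file names and the helper `build_nrits_commands` are hypothetical, the exact options used in the study are not reported, and `bcftools mpileup` is substituted here for the `samtools mpileup` step named in the text. The function only builds the commands; it does not run them.

```python
def build_nrits_commands(reference, reads_1, reads_2, prefix):
    """Return command lines (argument lists) for a mapping-based nrITS
    consensus. Hypothetical helper: nothing is executed here."""
    sam = f"{prefix}.sam"
    bam = f"{prefix}.sorted.bam"
    bcf = f"{prefix}.pileup.bcf"
    vcf = f"{prefix}.calls.vcf.gz"
    fasta = f"{prefix}.nrITS.fasta"
    return [
        # 1. Index the nrITS reference of a closely related species.
        ["bwa", "index", reference],
        # 2. Map paired-end reads with BWA-MEM, writing SAM output.
        ["bwa", "mem", "-o", sam, reference, reads_1, reads_2],
        # 3. Sort the alignments and convert them to BAM, then index.
        ["samtools", "sort", "-o", bam, sam],
        ["samtools", "index", bam],
        # 4. Pile up and call SNPs/indels (assumption: bcftools mpileup
        #    stands in for the samtools mpileup step in the text).
        ["bcftools", "mpileup", "-Ou", "-f", reference, "-o", bcf, bam],
        ["bcftools", "call", "-mv", "-Oz", "-o", vcf, bcf],
        ["bcftools", "index", vcf],
        # 5. Write the called variants into the reference sequence.
        ["bcftools", "consensus", "-f", reference, "-o", fasta, vcf],
    ]


# Hypothetical input names, for illustration only.
for command in build_nrits_commands(
    "relative_nrITS.fasta", "sample_R1.fq.gz", "sample_R2.fq.gz", "sample"
):
    print(" ".join(command))
```

Each list could be passed to `subprocess.run(command, check=True)` in order; keeping the steps as data makes it easy to log or inspect them before anything is executed.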

Plastid Genome Assembly and Annotation
To assemble the chloroplast genomes of the 151 samples, the total sequencing output, approximately 13 Gb of paired-end (PE = 150 bp) sequence data per sample, was used as input into NOVOPlasty v1.2.4 (Dierckxsens et al., 2017). The partial sequence of rbcL (ribulose-1,5-bisphosphate carboxylase/oxygenase large subunit) of Melastoma candidum D. Don (GenBank accession number GQ436728) was adopted as the seed sequence in the seed-and-extend algorithm implemented in NOVOPlasty. Annotation of the chloroplast genomes was performed using the DOGMA online tool (Wyman et al., 2004), and the start/stop codons and intron/exon junctions were then checked manually. The circular chloroplast genome maps were drawn with OGDRAW v1.3 (Lohse et al., 2007).

Sequence Alignment
Chloroplast genome sequences were aligned using MAFFT v7.042 (Katoh and Standley, 2013) with default settings. Only one copy of the IRs was used in the final alignment to avoid overrepresentation of duplicated sequences. Dubiously aligned regions may bias phylogenetic inferences (Misof and Misof, 2009), and previous phylogenetic analyses of Melastomataceae based on plastid genomes showed that among all the analytical schemes explored, only the non-coding regions without filtering of ambiguously aligned base pairs resulted in a conflicting topology (Reginato et al., 2016). Therefore, we removed the poorly aligned regions from all phylogenomic datasets before subsequent analyses using trimAl v1.2 in "-gappyout" mode (Capella-Gutiérrez et al., 2009). Sequences of nrITS and the five plastid markers were aligned using SeqMan v7.1.0 (DNASTAR, Madison, WI, USA) and manually adjusted.

Data Partitioning
We explored the issue of data partitioning using the Melastomataceae phylogenomic dataset. Five partitioning schemes were employed in the maximum likelihood (ML) and Bayesian analyses: (1) no partitions, (2) two partitions, coding and noncoding sequences, (3) three partitions corresponding to large single copy (LSC) region, small single copy (SSC) region, and inverted repeat region (IR), (4) six partitions, viz. protein coding genes divided by three codon positions, tRNAs, rRNAs, and noncoding sequences, and (5) 15 partitions, which was determined as the best-fit partition scheme by PartitionFinder v2.1.1 (Lanfear et al., 2012), based on the following strategy. First, all protein coding genes were divided into 12 clusters by function as shown in Figure S1 (each cluster was colored uniquely). For two of these clusters, "other genes" (accD, ccsA, and cemA) and "hypothetical chloroplast reading frame" (ycf1, ycf2, ycf3, and ycf4), which comprised several genes with different or unknown functions, each gene was further treated as a separate cluster. Each of the resulting 17 gene clusters was then divided into three subsets by codon position. With noncoding sequences, rRNA genes, and tRNA genes treated as another three subsets, the Melastomataceae phylogenomic dataset was split into 54 subsets in total, and these subsets were used as input to PartitionFinder to select the best-fit partitioning scheme and corresponding nucleotide substitution models.
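The subset count used as PartitionFinder input follows directly from the numbers given above, and the arithmetic can be checked with a short sketch. The function name `count_subsets` is introduced here for illustration only; the gene lists are those named in the text.

```python
# Counts taken from the text: 12 functional clusters, two of which are
# split gene by gene (3 genes and 4 genes respectively).
FUNCTIONAL_CLUSTERS = 12
SPLIT_CLUSTERS = {
    "other genes": ["accD", "ccsA", "cemA"],
    "hypothetical chloroplast reading frame": ["ycf1", "ycf2", "ycf3", "ycf4"],
}


def count_subsets(n_clusters, split, codon_positions=3, extra_subsets=3):
    """Return (gene clusters, total subsets) for the partitioning input.

    extra_subsets covers noncoding sequences, rRNA genes, and tRNA genes.
    """
    kept = n_clusters - len(split)                      # 12 - 2 = 10
    gene_clusters = kept + sum(len(g) for g in split.values())  # 10 + 7
    total = gene_clusters * codon_positions + extra_subsets
    return gene_clusters, total


gene_clusters, total_subsets = count_subsets(FUNCTIONAL_CLUSTERS, SPLIT_CLUSTERS)
print(gene_clusters, total_subsets)  # 17 54
```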

Model Selection
The best-fitting models for each partition in the first four partitioning schemes, as well as for the nrITS dataset and the cp-5 gene dataset, were determined using the Akaike information criterion (AIC) (Posada and Buckley, 2004) in Modeltest version 3.7 (Posada and Crandall, 1998). For the fifth partitioning scheme, the model for each partition was determined using PartitionFinder. For a summary of the model selection, see Table S3.

Bayesian Inference Analyses
Bayesian inference (BI) analyses were carried out in MrBayes 3.2.6 (Huelsenbeck and Ronquist, 2001) on the CIPRES cluster (Miller et al., 2010). When the model selected by Modeltest was not available in MrBayes, a more parameterized model was used (TVM was replaced by GTR, Table 2 and Table S3). A recent empirical study has demonstrated that in certain situations, using both parameters I and G to accommodate rate variation across sites could lead to non-optimal values for both parameters (Moyle et al., 2012). Therefore, we also ran a parallel analysis replacing the selected model GTR+I+G with GTR+G and compared the results to detect potential parameter interaction. Two independent Markov chain Monte Carlo (MCMC) analyses were run each with four simultaneous chains (three heated and one cold) for 3,000,000 generations with the temperature parameter set to 0.08. Trees were sampled every 100 generations, with the first 7,500 trees (25%) discarded as burn-in, and the remaining trees were used to construct a 50% majority-rule consensus tree with Bayesian posterior probabilities (PP). Convergence was considered reached when the average standard deviation of split frequencies fell below 0.01. The effective sample sizes (ESS) were also assessed for all parameters and statistics using Tracer v1.7.1 (Rambaut et al., 2018). All ESS were obtained with values higher than 200, indicating that all parameters were sampled sufficiently for all chains to converge.
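The sampling figures quoted above are mutually consistent, as a quick calculation shows. The helper `mcmc_samples` is a hypothetical name used only for this check; it ignores the extra sample that some programs record at generation 0.

```python
def mcmc_samples(generations, sample_every, burnin_percent):
    """Return (sampled trees, burn-in trees, retained trees) for one run."""
    total = generations // sample_every
    burnin = total * burnin_percent // 100  # integer arithmetic, no rounding
    return total, burnin, total - burnin


# Values from the text: 3,000,000 generations, sampled every 100, 25% burn-in.
total, burnin, kept = mcmc_samples(3_000_000, 100, 25)
print(total, burnin, kept)  # 30000 7500 22500
```

With two independent runs, the consensus tree would therefore be summarized from roughly 45,000 retained trees under these settings.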

Maximum Likelihood Analyses
Maximum likelihood analyses were conducted in RAxML version 8.2.10 (Stamatakis, 2014) using the GTR+G model as recommended by the author. Node support was estimated with 1000 bootstrap replicates using a fast bootstrapping algorithm in RAxML (Stamatakis et al., 2008).

Comparisons of Partitioning Strategies
To compare the five partitioning strategies employed, we calculated marginal likelihood and Bayes factor (the ratio of marginal likelihoods from two competing models) using Tracer v1.7.1 (Rambaut et al., 2018) as described in Ma et al. (2014). We also used PartitionFinder to choose the best partition scheme by constraining the substitution model to GTR+G/GTR+I+G. The partition scheme with lowest AIC was considered to be the most fitting one. The best partition scheme was then applied to the final analyses of the two plastid phylogenomic datasets.
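As a minimal sketch of the two criteria mentioned above, the snippet below computes AIC from a log-likelihood and a parameter count, and a log Bayes factor from two log marginal likelihoods. All numeric values are hypothetical placeholders chosen for illustration; they are not results from this study, and the function names are introduced here.

```python
def aic(log_likelihood, n_params):
    """Akaike information criterion: AIC = 2k - 2 ln L (lower is better)."""
    return 2 * n_params - 2 * log_likelihood


def log_bayes_factor(log_marginal_1, log_marginal_0):
    """ln BF of model 1 over model 0; positive values favor model 1."""
    return log_marginal_1 - log_marginal_0


# Hypothetical (log-likelihood, number of free parameters) per scheme.
schemes = {
    "partition1": (-100000.0, 10),
    "partition3": (-99500.0, 30),
}
best = min(schemes, key=lambda name: aic(*schemes[name]))
print(best)  # partition3

# Hypothetical log marginal likelihoods for the same two schemes.
print(log_bayes_factor(-99600.0, -100050.0))  # 450.0
```

The extra parameters of a richer scheme are penalized by the 2k term, so a more partitioned scheme is preferred only when its likelihood gain outweighs that penalty.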

Dating Priors
Three dating priors were utilized: one secondary calibration from a recent study of Myrtales (Berger et al., 2016), and two fossils of Melastomataceae widely used in previous biogeographical studies of the family (Morley and Dick, 2003;Renner, 2004a;Renner, 2004b;Veranso-Libalah et al., 2018). The secondary calibration from Berger et al. (2016) was used to constrain the age of Melastomataceae s.l. (including Memecylon, node a) at 64.5 .1 Ma, 95% highest posterior density (HPD)]. An Eocene fossil leaf from North America (Hickey, 1977) had the basic venation of Melastomataceae s.s. (excluding Memecylon). Conservatively, we used it to constrain the age of node b (excluding the most basal clade Memecylon-Pternandra) at 53 Ma. The other fossil prior is a Miocene seed characteristic of Melastomateae and Rhexieae (Collinson and Pingen, 1992). It was used to constrain the Melastomateae-Marcetieae-Rhexieae node (node c) at 26-23 Ma.

BEAST Analyses
Divergence time estimation was performed in BEAST 2.5.2 (Bouckaert et al., 2014), using an uncorrelated lognormal relaxed clock with a birth-death speciation process (Kendall, 1948;Nee et al., 1994;Gernhard, 2008). Due to limited computational budget, sequences of nine protein coding genes (atpB, matK, ndhF, psaB, psbB, rbcL, rpl2, rpoC2, rps4; aligned length: 15984 bp) from nine gene clusters were extracted from the Melastomataceae dataset and assembled into a combined matrix as input alignment to BEAST. For the secondary calibration, we used a normal distribution with a standard deviation equivalent to the 95% HPD estimate of Berger et al. (2016). For the two fossil priors, a lognormal distribution with a mean of 1.5 and a standard deviation of 1 was adopted to allow for the possibility that the nodes are older than the fossils themselves (Sauquet, 2013;Berger et al., 2016). We ran two independent MCMC analyses, each of 350,000,000 generations sampling every 1,000 generations. The effective sampling of all parameters and convergence of independent chains were examined using Tracer version 1.7.1 (Rambaut et al., 2018). The obtained parameters and distribution of effective priors were broadly similar to those of corresponding specified priors, indicating that the calibration strategy was relatively reliable (Figure S2). LogCombiner v.2.4.5 (Bouckaert et al., 2014) was then used to combine the output files of independent runs, after the removal of 10% of samples as burn-in. Finally, estimated divergence time information was annotated to a constrained ML tree generated by the Melastomataceae dataset under best partition scheme using TreeAnnotator v.2.4.5 (Bouckaert et al., 2014).
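To illustrate what a lognormal fossil prior of this kind implies, the sketch below computes quantiles of an offset lognormal distribution. It rests on assumptions not stated in the text: that the mean (1.5) and standard deviation (1) are specified in log space, and that the prior is offset by the fossil age (53 Ma for node b). The function name is hypothetical.

```python
import math
from statistics import NormalDist


def offset_lognormal_quantile(p, offset, mu, sigma):
    """Quantile of offset + LogNormal(mu, sigma), with mu and sigma in log space."""
    z = NormalDist().inv_cdf(p)  # standard normal quantile
    return offset + math.exp(mu + sigma * z)


# Assumed parameterization for node b: offset 53 Ma, mu = 1.5, sigma = 1.
for p in (0.025, 0.5, 0.975):
    age = offset_lognormal_quantile(p, offset=53.0, mu=1.5, sigma=1.0)
    print(f"{p:.3f} quantile: {age:.2f} Ma")
```

Under these assumptions the prior places no mass below the fossil age and has a long tail toward older ages, which is the behavior the text describes: the node may be older than the fossil but not younger.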

Ancestral Range Estimation
Ancestral range estimation was carried out using RASP (Yu et al., 2015). We used the annotated tree generated from BEAST analysis as input of the ancestral range estimation (ARE). The best-fit model Bayarealike+j was selected from the six models implemented in the software based on the AIC and likelihood ratio test results. We identified five geographical areas modified from Good (1974), Myers et al. (2000), and recent studies (Berger et al., 2016;Veranso-Libalah et al., 2018): (A) North America; (B) South America; (C) Indo-Burma, also including part of southern and western Yunnan, southernmost Guangxi and Guangdong, and Hainan Island; (D) Sundaland; and (E) Sino-Japanese region, including most of central and southern mainland China, Taiwan, and Ryukyu. All the species in the dataset were coded as present or absent for each of the five areas (Table S4) based on herbarium specimens, literature (Chen, 1984a;Chen, 1984b;Hansen, 1985;Maxwell, 1989;Regalado, 1990;Cellinese, 2002;Cellinese, 2003;Chen and Renner, 2007;Kartonegoro and Veldkamp, 2010;Lin et al., 2015;Lin et al., 2017), and online database (Global Biodiversity Information Facility, http://www.gbif.org). No dispersal scenario and ancestral areas were assumed a priori for the analysis. In RASP we allowed the inferred ancestor to occupy up to two areas, corresponding to the maximum number of areas occupied by any extant species.
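The effect of the two-area limit on the state space can be sketched by enumerating the candidate ranges for the five areas defined above. The helper names and the example coding are hypothetical; the null (empty) range is left out of this count.

```python
from itertools import combinations

# Area codes as defined in the text.
AREAS = {
    "A": "North America",
    "B": "South America",
    "C": "Indo-Burma",
    "D": "Sundaland",
    "E": "Sino-Japanese region",
}


def allowed_ranges(areas, max_areas):
    """List every non-empty range of at most max_areas areas."""
    codes = sorted(areas)
    return [
        "".join(combo)
        for size in range(1, max_areas + 1)
        for combo in combinations(codes, size)
    ]


def presence_vector(range_code, areas):
    """Encode a range such as 'CE' as a presence/absence string over all areas."""
    return "".join("1" if code in range_code else "0" for code in sorted(areas))


ranges = allowed_ranges(AREAS, max_areas=2)
print(len(ranges))                   # 15 (5 single areas + 10 pairs)
print(presence_vector("CE", AREAS))  # 00101
```

Restricting ancestors to two areas reduces the candidate ranges from 31 non-empty combinations of five areas to 15.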

Characteristics of Chloroplast Genomes
One hundred and fifty-one complete chloroplast genomes in Melastomataceae were newly sequenced and assembled in this study (Table S1). All newly obtained genomes are evolutionarily conserved and similar to those previously published for Melastomataceae (Reginato et al., 2016;Ng et al., 2017;Zhou et al., 2017;Tan et al., 2019). Their genome sizes range from 153,291 to 158,960 bp with an average length of 155,986 bp. A total of 129 genes were annotated, viz. 84 protein-coding genes, 37 tRNA, and 8 rRNA, in all chloroplast genomes except that the rpoC1 gene of Sonerila cantonensis Stapf (Liu 510) is pseudogenized. A gene map for the chloroplast genome of Phyllagathis rotundifolia (Jack) Blume is shown in Figure S1 as a representative.

Comparison of Different Partitioning Schemes
For each of the five partitioning schemes, trees generated by ML and BI analyses were broadly similar irrespective of the model used (GTR+G and GTR+I+G). Generally, BI analyses under the selected GTR+I+G model recovered better-supported relationships than those under the GTR+G model (data not shown). Also, partition3 with the best-fit GTR+I+G model was favored over the other partition schemes in comparisons of both AIC values and Bayes factors (Table 1). Therefore, we applied partition3 to the final analyses of the two plastid phylogenomic datasets. For BI analyses, all three subsets (LSC, IR, and SSC) in the partition3 scheme were analyzed with GTR+I+G (Table 2).

Phylogenetic Analyses
Statistics of sequences sampled in Melastomataceae and Sonerileae/Dissochaeteae phylogenomic datasets are Analyses of the Melastomataceae dataset revealed that all genera sampled in Sonerileae/Dissochaeteae, except Ochthocharis Blume, formed a strongly supported clade (PP = 1.00, BS = 83%) (Figure 1). Ochthocharis, previously placed in Sonerileae, showed closer relationship with Rhexieae, Marcetieae, Melastomateae, and Microlicieae instead of other genera in Sonerileae/Dissochaeteae. Phylogenetic relationships within Sonerileae/Dissochaeteae inferred from the two chloroplast phylogenomic datasets were nearly identical (Figures 1 and S3). The Asian Dissochaeta-Pseudodissochaeta (PP = 1.00, BS = 100%) was recovered as the most basal clade in the complex, followed by a split (PP = 1.00, BS = 100%) between the South American Opisthocentra Hook.f. and a large clade consisting of the remaining Asian species (PP = 1.00, BS = 100%) (Figures 1 and S2). As shown in Figure 1, 17 species clusters were recovered in Sonerileae/Dissochaeteae. The backbone phylogenies of the complex were only partially supported. Relationships among major lineages in node H remained unresolved (Figure 1). The details of trees based on the nrITS and cp-5 gene datasets were shown in Table 2 and Figures S4 and S5.

Divergence Time Estimates
The dated phylogeny obtained from the BEAST analysis is shown in Figure 2. Divergence

Biogeographical History
Ancestral range reconstruction for Melastomataceae using RASP is shown in Figure 2, with the area probability inferred for each node represented by pie charts. The ancestral areas of Sonerileae/Dissochaeteae (node A) and the core Asian Sonerileae/Dissochaeteae (node C) were estimated to be South America with moderate support (area B, p = 0.64; Figure 2, Table 3) and Indo-Burma with strong support (area C, p = 0.98; Figure 2, Table 3), respectively. Thirty-three dispersal events were detected within Sonerileae/Dissochaeteae, of which two were intercontinental dispersals from South America to Asia [33.96 Mya (95% HPD: 28.33-39.8 Mya), node A; 27.79 Mya (95% HPD: 21.91-33.94 Mya), node B] and the remaining 31 were dispersals among different regions of SEA. The most common dispersals in SEA were those from Indo-Burma to the Sino-Japanese region (19 out of 31) and from Indo-Burma to Sundaland (5 out of 31). Age estimation showed ongoing dispersals from Indo-Burma to the Sino-Japanese region during the past 20 Mya (19.64-0.87 Mya), whereas those from Indo-Burma to Sundaland were relatively ancient, ranging from 17.31 to 9.46 Mya. Ancestral ranges and relative probabilities of clades of interest are given in Table 3.

DISCUSSION
Comparison of Trees Generated by Chloroplast Genome and nrITS Sequence Data
The phylogenetic tree generated from chloroplast genomic data is generally better resolved than the nrITS tree in terms of relationships both within and among major clades (Figures 1, S3, and S4). As shown in Figure 3, 87% of the nodes in the plastid tree received moderate to strong support compared with 55% in the nrITS tree. The phylogenetic affiliations of several species, unresolved in the nrITS phylogeny, were recovered by plastid phylogenomic data. For example, Phyllagathis rotundifolia, the type of Phyllagathis, was recovered as the sister group of the Anerincleistus clade (Figures 1 and S3). Nevertheless, plastid phylogenomic analyses failed to fully resolve the backbone phylogeny of Sonerileae/Dissochaeteae (Figures 1 and S3). The weakly supported short internodes following node C and node H, together with our divergence time estimations, indicate putative rapid radiation around early (20.25 Mya) and middle Miocene (13.22 Mya). Therefore, even chloroplast genomic sequences cannot satisfactorily resolve the relationships among clades of Sonerileae/Dissochaeteae that evolved through rapid radiation. Comparison of the plastid and nrITS trees revealed several strongly supported incongruences regarding the interspecific relationships within some species clusters, e.g. the Anerincleistus clade, Bredia clade, Medinilla clade, and Tashiroea clade. The factors commonly invoked as the potential causes of incongruence between plastid and nrITS phylogenies include sampling error, long-branch attraction, incomplete lineage sorting, hybridization, and subsequent introgression (Rieseberg and Soltis, 1991;Soltis and Kuzoff, 1995;Soltis et al., 1996;Wendel and Doyle, 1998). Sampling error can be ruled out as the main factor based on the high PP and BS values of the incongruent topologies (PP = 1.00, BS > 80%). The possibility of long-branch attraction (LBA) is also rejected for two reasons: the data were analyzed using model-based methods that are less sensitive to LBA, and no long terminal branches were involved in these incongruences. Ancestral polymorphism may survive recent speciation, leading to discordant gene trees at interspecific levels (Wendel and Doyle, 1998;Knowles and Carstens, 2007). Hybridization and the transfer of alleles across the species barrier are also widespread in plants, especially among closely related species. Therefore, both incomplete lineage sorting and hybridization could be the cause of the incongruence at the interspecific level.
The plastid genome tree and nrITS tree are congruent in most major lineages of Sonerileae/Dissochaeteae. Of the 17 major clades recognized in the nrITS tree, 15 were also strongly supported by plastid genomic data (Figures 1, S3, and S4). Incongruence lies in two clades, viz. the Blastus clade and unnamed clade 2. Blastus Lour. is morphologically highly homogeneous and distinct from other genera. This genus, although recovered as monophyletic in the nrITS phylogeny (Figure S4), formed two separate clades in the plastid genome tree corresponding to axillary and terminal inflorescences (Figures 1 and S3). The unnamed clade 2 contains species sampled in Driessenia Korth, Phyllagathis, and Heteroblemma (Blume) Cámara-Leret, Ridd.-Num. & Veldkamp. It was well recognized in the nrITS phylogeny (Figure S4) but was revealed to be paraphyletic in the plastid phylogeny (Figures 1 and S3), forming a larger lineage together with the Medinilla clade and two species of Phyllagathis from Vietnam, P. prostrata C. Hansen, and an undescribed new species. However, detection of strongly supported discordance and assessment of the potential causes such as hybridization and introgression are hampered by poor resolution of some parts of the phylogenetic trees.
Chloroplast phylogenomic data together with the even less informative nrITS sequence data failed to fully tackle the phylogenetic relationships within Sonerileae/Dissochaeteae. In future analyses, plastid genomic data should be combined with sequence data from multiple nuclear genes to unravel the phylogeny of Sonerileae/Dissochaeteae and the underlying evolutionary processes.

Origin and Biogeography
Sonerileae/Dissochaeteae exhibits a disjunct distribution between South America and the Old World, with its distribution centered in SEA. Molecular dating and biogeographical analyses indicated a South American origin for this clade during late Eocene (stem age: 34.78 Mya; 95% HPD: 29.07-40.53 Mya; Figure 2, Table 3). Our result agrees with Berger et al. (2016) and Veranso-Libalah et al. (2018), who estimated the age of Sonerileae/Dissochaeteae to be 38 and 39.63 Mya respectively based on limited sampling of this clade. Two previous age estimations for this clade, 19 Mya and 73 Mya (Morley and Dick, 2003), are not supported by our data.
FIGURE 1 | Maximum likelihood (ML) phylogenetic tree of Melastomataceae based on chloroplast genome sequences with 112 genes included shown as a cladogram (phylogram inset). Bootstrap values obtained from ML analyses (left) and Bayesian posterior probabilities resulting from Bayesian inference (BI) (right) are given on the branches. An asterisk denotes a branch collapsed in BI. The types of the genera sampled in Sonerileae/Dissochaeteae are indicated with boxes. Clades A, B, C, and H represent the nodes discussed in the text. "Unnamed clade 2" denotes a group of species which comprised the unnamed clade 2 in the nrITS tree (Figure S4).
At the base of Sonerileae/Dissochaeteae, an Asian clade Dissochaeta-Pseudodissochaeta (stem age: 33.96 Mya; 95% HPD: 28.33-39.8 Mya) branched off first followed by a split at node B (27.79 Mya; 95% HPD: 21.91-33.94 Mya) between the South American clade Opisthocentra and the remaining Asian species (node C) (Figure 2). Ancestral range estimation indicated two dispersal events from South America to the Old World during late Eocene (33.96 Mya; 95% HPD: 28.33-39.8 Mya) (node A) and Mid Oligocene (27.79 Mya; 95% HPD: 21.91-33.94 Mya) (node B) respectively. Phylogenetic analyses of the cp-5 gene dataset revealed that within node B the South American Opisthocentra, Boyania Wurdack, Phainantha Gleason and the African clade comprising Gravesia Naudin, Calvoa Hook.f., Dicellandra Hook.f., and Amphiblemma Naudin diverged successively, followed by a subsequent split of the Asian clade (node C) (Figure S5). The same pattern is also observed in a most recent phylogenetic study of Bertolonieae and Sonerileae/Dissochaeteae (Bacci et al., 2019). These data contradicted the previous view that Gravesia, Calvoa, and Amphiblemma arrived in Africa and Madagascar via long-distance dispersal from SEA (Clausing and Renner, 2001b;Renner, 2004a;Renner, 2004b), clearly indicating a second dispersal event from South America to Africa and Asia in node B. Based on the inferred age of the two dispersal events [33.96 Mya (95% HPD: 28.33-39.8 Mya) and 27.79 Mya (95% HPD: 21.91-33.94 Mya)], the "Indian Ark" hypothesis (Morley and Dick, 2003) is refuted. There are two alternatives. Direct trans-Atlantic dispersal of the lineages to the Old World is possible via oceanic stepping stones. Alternatively, the basal lineages might have migrated from South America to North America and then entered Eurasia through the North Atlantic land bridge and spread to Africa and SEA during Eocene when the global temperature was still high. The latter hypothesis is supported by Eocene and Miocene Melastomataceae fossils discovered from North America and Europe (Hickey, 1977;Collinson and Pingen, 1992;Wehr and Hopkins, 1994). Both scenarios were proposed for Melastomateae (Veranso-Libalah et al., 2018), another tribe with transtropical disjunction in the family. The core Asian Sonerileae/Dissochaeteae (node C) began to diversify around early Miocene (crown age: 20.25 Mya; 95% HPD: 15.71-25.24 Mya) in Indo-Burma and dispersed subsequently southward to Malesia and northward to the Sino-Japanese region (Figure 2). This result is congruent with the findings of previous meta-analyses, which showed that Indo-Burma's biota (Indochina) had been predominantly characterized by in situ diversification and subsequent emigration since at least the early Miocene (De Bruyn et al., 2014). A series of short internodes with unsatisfactory resolution were observed following node C and node H, indicating the onset of rapid radiation around early (20.25 Mya) and middle Miocene (13.22 Mya) respectively (Figures 1 and 2).
The early Miocene collision of Australia with the eastern margin of Sundaland (Hall, 2009), the onset of East Asian monsoons in early Miocene and their strengthening to a maximum in Mid Miocene (Sun and Wang, 2005;Guo et al., 2008), and the Mid Miocene Climate Optimum (MMOC) resulted in increased topographic complexity, a wet and warm climate, and the development of widespread evergreen rainforest. These factors might in turn have promoted the Miocene radiation of Asian Sonerileae/Dissochaeteae in Indo-Burma and their subsequent dispersal into Malesia and the Sino-Japanese region. The link between a warmer and wetter climate and higher speciation was confirmed in a recent study (Kong et al., 2017), showing that global temperature changes and East Asian monsoons had played crucial roles in floristic diversification.

Frontiers in Plant
Finally, our analyses estimated a stem age of 13.87 Mya (95% HPD: 10.13-17.76 Mya) for the bird-dispersed Medinilla clade (Figure 2). Analyses of the cp-5 gene dataset showed that the Madagascan species of Medinilla were nested within Asian species, branching off after M. fengii-M. petelotii (Figure S5). Therefore, we agree with  and Renner (2004a)

Taxonomic Implications
Of the 23 genera sampled from Sonerileae/Dissochaeteae, only Ochthocharis fell out of this clade in the present analyses. As shown in Figure 1, it was close to Rhexieae, Marcetieae, Melastomateae, and Microlicieae, conforming to the finding of Veranso-Libalah et al. (2018). Ochthocharis is a distinct genus with an African-Asian distribution, comprising nine species mostly found in the coastal lowland in wet and riverine habitats. Morphologically, it is readily distinguished by the indumentum, the structure of ovary, fruit, and seeds, showing no close resemblance to any other Asiatic genus (Hansen and Wickens, 1981). Ochthocharis should be excluded from Sonerileae/Dissochaeteae based on molecular data. However, a broader sampling of the genus and its close relatives (Rhexieae, Microlicieae, etc.) is still needed to resolve their phylogenetic relationships. A dozen major lineages in Sonerileae/Dissochaeteae, well recognized in both the nrITS and plastid genomic phylogenies (Figures 1, S3 and S4), are uncovered in this study, which provides some important insights for taxonomy. A comparison of these lineages is shown in Table S5.
Ancestral ranges and relative probabilities (p > 10%) of these clades estimated under the Bayarealike+j model are also shown. North America (A); South America (B); Indo-Burma (C); Sundaland (D); Sino-Japanese region (E).

Pseudodissochaeta
Pseudodissochaeta Nayar is a small genus of shrubs or small trees endemic to Indochina. Nayar (1969) proposed that this genus is close to Dissochaeta, but differs in the erect habit, connectives hardly produced and ventrally biauriculate, and extra-ovarial chambers descending to the middle or the base of the ovary. However, Chen (1983) considered Pseudodissochaeta as a congener of Medinilla and reduced the former. Chen (1984a) and Chen and Renner (2007) Figures 1, S3 and S4). The generic status of Pseudodissochaeta should be retained.

Anerincleistus
The Anerincleistus clade comprises five species of Phyllagathis and eight species of Anerincleistus, including A. macrophyllus Bakh.f., a species morphologically close to A. hirsutus Korth., the type of this genus. Species of this clade occur in Borneo, the Malay Peninsula, and Sumatra. They are similar in having eight isomorphic stamens with minute connective appendages, but are quite diverse in habit, leaf morphology, inflorescence morphology, and capsule morphology (Table S5). Analyses of nrITS sequence data failed to resolve the phylogenetic affiliation of this clade (Figure S4), but chloroplast phylogenomic analyses recovered it as sister to the type of Phyllagathis with strong support (Figures 1 and S3). Based on this result, Phyllagathis should be recircumscribed to include this clade. Nevertheless, this relationship needs to be further tested using nuclear sequence data other than nrITS before formal taxonomic treatment.

Bredia and Tashiroea
Molecular phylogenetic analyses reveal that the type of Bredia is nested in a clade of 21 species, while Tashiroea, a genus previously synonymized under Bredia, falls in another distantly related clade of 10 species (Figures 1, S3, and S4). Both clades are distributed in southern mainland China, Taiwan, and the Ryukyu Islands. Phylogenetically, the Tashiroea clade is sister to Scorpiothyrsus and the Bredia clade is nested in an internally unresolved larger branch with Blastus, Fordiophyton, etc. Morphologically, they differ in a series of characters including indumentum, texture of leaves, and capsule morphology (Table S5). Molecular and morphological evidence confirm that the Bredia and Tashiroea clades represent distantly related lineages. Bredia should be recircumscribed to include the former clade and Tashiroea should be resurrected for the latter.

Cyphotheca and Sporoxeia
The Cyphotheca clade comprises the monotypic Cyphotheca, Phyllagathis fengii C. Hansen, and P. tentaculifera C. Hansen. The Sporoxeia clade contains Sporoxeia, P. hispidissima (C. Chen) C. Chen, and P. longicalcarata C. Hansen. Both clades are nested in the internally poorly resolved node H with the Bredia clade, Fordiophyton clade, Blastus and Styrophyton clade, etc. (Figures 1, S3, and S4). The two clades, Cyphotheca and Sporoxeia, are morphologically distinct from other clades within node H in the mature ovary crown enclosing an obpyramidal space and the 4-horned placental column (Table S5). From each other, they differ in anther morphology (connectives thickened without dorsal appendage vs. connectives dorsally spurred). Although the phylogenetic relationships in node H need to be further tested, both Cyphotheca and Sporoxeia may have to be expanded to include additional species currently placed in Phyllagathis.

Unnamed Clade 1
This clade consists of ten species of Phyllagathis occurring in southernmost mainland China (6 spp.), Vietnam (2 spp.), and Borneo (2 spp.) (Zhou et al., 2019; this study). The close relationship among some of these allopatric species in Phyllagathis, proposed by Hansen (1992), is confirmed here. Species in this clade are similar in the isomorphic stamens, dorsally spurred connectives, terminal, umbellate or cymose inflorescences (rarely a terminal or axillary solitary flower), enlarged ovary crown forming an obpyramidal depression on the top, horned placental column, and thready placenta. The phylogenetic affiliation of unnamed clade 1 is poorly resolved, but this clade showed no close relationship with the type of Phyllagathis in any analysis. If these results reflect the true relationships, species in this clade should be excluded from Phyllagathis and treated as a distinct genus.

DATA AVAILABILITY STATEMENT
All the sequencing data generated in this study have been deposited in GenBank with accession numbers MK994778-MK994928 (complete chloroplast genome sequences) and MN031159-MN031238 (ITS sequences).
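
For readers who want to retrieve the deposited records, the accession ranges above can be expanded into individual accession numbers. The following is a minimal sketch, not part of the original study: the helper name `expand_accessions` is hypothetical, and it assumes that each range is consecutive and inclusive and that each accession consists of a letter prefix followed by a fixed-width number.

```python
def expand_accessions(first: str, last: str) -> list[str]:
    """Expand an inclusive accession range, e.g. MK994778-MK994928.

    Assumption (not stated in the paper): accessions in a range share the
    same letter prefix and are numbered consecutively.
    """
    # Separate the alphabetic prefix from the trailing digits.
    prefix_first = first.rstrip("0123456789")
    prefix_last = last.rstrip("0123456789")
    if prefix_first != prefix_last:
        raise ValueError("both accessions must share the same prefix")

    digits_first = first[len(prefix_first):]
    digits_last = last[len(prefix_last):]
    if not digits_first or not digits_last:
        raise ValueError("accessions must end in digits")

    start, stop = int(digits_first), int(digits_last)
    if stop < start:
        raise ValueError("range end precedes range start")

    # Preserve zero padding (e.g. MN031159 keeps its leading zero).
    width = len(digits_first)
    return [f"{prefix_first}{n:0{width}d}" for n in range(start, stop + 1)]


# Ranges quoted in the Data Availability Statement.
plastomes = expand_accessions("MK994778", "MK994928")
its = expand_accessions("MN031159", "MN031238")

print(len(plastomes), len(its))  # prints: 151 80
```

Under these assumptions the plastome range expands to 151 accessions, which agrees with the 151 complete plastid genomes reported in the abstract, and the ITS range expands to 80 accessions.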