Population Differentiation and Demographic History of the Cycas taiwaniana Complex (Cycadaceae) Endemic to South China as Indicated by DNA Sequences and Microsatellite Markers

Historical geology, climatic oscillations, and seed dispersal capabilities are thought to influence the population dynamics and genetics of plants, especially for distribution-restricted and threatened species. Investigating the genetic resources within and among taxa is a prerequisite for conservation management. The Cycas taiwaniana complex consists of six endangered species that are endemic to South China. In this study, we investigated the relationship between phylogeographic history and the genetic structure of the C. taiwaniana complex. To estimate the phylogeographic history of the complex, we assessed the genetic structure and divergence time, and performed phylogenetic and demographic historical analyses. Two chloroplast DNA intergenic regions (cpDNA), two single-copy nuclear genes (SCNGs), and six microsatellite loci (SSR) were sequenced for 18 populations. The SCNG data indicated a high genetic diversity within populations, a low genetic diversity among populations, and significant genetic differentiation among populations. Significant phylogeographical structure was detected. Structure and phylogenetic analyses both revealed that the 18 populations of the C. taiwaniana complex have two main lineages, which were estimated to diverge in the Middle Pleistocene. We propose that Cycas fairylakea was incorporated into Cycas szechuanensis and that the other populations, which are mainly located on Hainan Island, merged into one lineage. Bayesian skyline plot analyses revealed that the C. taiwaniana complex experienced a recent decline, suggesting that the complex probably experienced a bottleneck event. We infer that the genetic structure of the C. taiwaniana complex has been affected by Pleistocene climate shifts, sea-level oscillations, and human activities. In addition to providing new insights into the evolutionary legacy of the genus, the genetic characterizations will be useful for the conservation of Cycas species.

Historical geology, climatic oscillations, and seed dispersal capabilities are thought to influence the population dynamics and genetics of plants, especially for distribution-restricted and threatened species. Investigating the genetic resources within and among taxa is a prerequisite for conservation management. The Cycas taiwaniana complex consists of six endangered species that are endemic to South China. In this study, we investigated the relationship between phylogeographic history and the genetic structure of the C. taiwaniana complex. To estimate the phylogeographic history of the complex, we assessed the genetic structure and divergence time, and performed phylogenetic and demographic historical analyses. Two chloroplast DNA intergenic regions (cpDNA), two single-copy nuclear genes (SCNGs), and six microsatellite loci (SSR) were sequenced for 18 populations. The SCNG data indicated a high genetic diversity within populations, a low genetic diversity among populations, and significant genetic differentiation among populations. Significant phylogeographical structure was detected. Structure and phylogenetic analyses both revealed that the 18 populations of the C. taiwaniana complex have two main lineages, which were estimated to diverge in the Middle Pleistocene. We propose that Cycas fairylakea was incorporated into Cycas szechuanensis and that the other populations, which are mainly located on Hainan Island, merged into one lineage. Bayesian skyline plot analyses revealed that the C. taiwaniana complex experienced a recent decline, suggesting that the complex probably experienced a bottleneck event. We infer that the genetic structure of the C. taiwaniana complex has been affected by Pleistocene climate shifts, sea-level oscillations, and human activities. In addition to providing new insights into the evolutionary legacy of the genus, the genetic characterizations will be useful for the conservation of Cycas species.

INTrODUCTION
Geological events and climate fluctuations during the Pleistocene have profoundly shaped the distribution and speciation of organisms (Hewitt, 2000;Hewitt, 2004). Most tropical species experienced a relatively moderate temperature in the Pleistocene. However, climatic oscillations have also significantly affected the genetic structure, population dynamics, and divergence of many extant global flora and fauna (Pauls et al, 2013;Yang et al., 2016;Mekonnen et al., 2018). Historical processes (e.g., the Ice Age) have left marked imprints on the genetic structure of extant species, especially for long-evolved organisms (Lo´pez-Pujol et al., 2011;Liu et al., 2015). In South China, which is recognized as a biodiversity hotspot, endemic species are currently confronted by dramatic habitat loss due to climate change and anthropogenic activity (Myers et al., 2000;Lo´pez-Pujol et al., 2006;Bax and Francesconi, 2018). Explaining the distribution of endemism plays the important role of understanding historical, evolutionary, and conservation implications of the organisms (Mittermeler, 2017).
Cycads are the oldest and the most primitive group of gymnosperm. They share evolutionary history and morphological characteristics with ferns (spermatophytes with flagella) and gymnosperms (naked seeds) (Zheng et al., 2017). Therefore, cycads are an ideal objective to study the evolutionary history of seed plants. The issues to infer the demographic history of cycads endemic to South China are currently of much interest. Cycads originated before the mid-Permian and reached their greatest diversity during the Jurassic-Cretaceous (Jones, 2002). However, molecular phylogenetic analyses indicated that the extant cycads underwent a synchronous global re-diversification and are not much more than 12 million years old (Nagalingum et al., 2011). In recent years, researchers have studied cycad phylogeny Wei et al., 2015), conservation (Feng et al., 2014;Feng et al., 2016;Feng et al., 2017;Gregory and Lopez-Gallego, 2018;Tang et al., 2018a;Zheng et al., 2017), biogeography (Gutiérrez-Ortega et al., 2017;Meerow et al., 2018;Cabrera-Toledo et al., 2019), pollination biology (Tang et al., 2018b;Santiago-Jimnez et al., 2019), and physiology (Vovides and Galicia, 2016). Cycads comprise two families (Cycadaceae and Zamiaceae), with 10 genera and 355 accepted species (Calonje et al., 2019). Unfortunately, 88% of cycads are on the International Union for Conservation of Nature (IUCN) Red List of Threatened Plants (Xiao et al., 2018). Cycas L. is the sole genus of the Cycadaceae, and its 117 species are mainly distributed in tropical and subtropical regions of Southeast and East Asia, East Africa, Oceania, and Madagascar (Fragniere et al., 2015). Cycas is monophyletic and is sister to all of the other extant cycad genera (Salas-Leiva et al., 2013;Liu et al., 2018). South China is viewed as the origin of Cycas (Xiao and Moeller, 2015). Cycas species in China are all facing potential endangerment challenges and in need of protection due to the over-collection in wild for the cultivation of commercial plants as well as their ornamental attributes. However, there are still some difficulties in putting the Cycas protection action into practice, including the controversial taxonomic status of some Cycas species and lack of genetic resources within and among taxa, which is very important for conservation management.  (Figure 1). They are mainly distributed in low-/middle-elevation tropical rainforest and hills. Within the complex, C. hainanensis C. J. Chen (Cheng et al., 1975), C. changjiangensis N. Liu (Liu, 1998), and C. lingshuigensis Fu (2004) are endemic to Hainan Island. C. taiwaniana Carruthers (Carruther, 1893) has only one wild population, which is in Fujian, and is distinguished by the denticulate margin of its megasporophylls. C. szechuanensis Cheng et L. K. Fu (Cheng et al., 1975), which currently occurs in Guangdong and Sichuan provinces, is distinguished from other taxa by its loose megasporophyll cone and its densely dark-brown tomentum in young leaves. C. fairylakea D. Y. Wang (Chen and Wang, 1995), which is also distributed in Guangdong, Fujian, differs from C. szechuanensis in having a megasporophyll with a distinct apical segment. These species are endangered and their wild populations are dramatically declining due to habitat loss and overexploitation by the ornamental and medical trade. An improved understanding of their population genetic structure and evolutionary history is required for assessing the conservation status of the taxa and for establishing effective management strategies (Funk et al., 2012;Allendorf et al., 2013). Previous studies of the C. taiwaniana complex have considered morphological characteristics (Chen and Wang, 1995;Chen and Stevenson, 1999;Wang, 2000;Huang, 2001;Liu and Qin, 2004), geographical distribution (Jian et al., 2005a), karyotype (Nayak et al., 2005), and phylogeography (Nong et al., 2011). Previous studies of their population genetics, however, were based on either limited taxa (Jian et al., 2005b;Jian et al., 2006;Huang et al., 2013a) or insufficient genetic data (Mo et al, 2008). A comprehensive understanding of the population genetics of the C. taiwaniana complex based on different molecular markers is needed.
In this study, we used two chloroplast intergenic spacers (chloroplast DNA, cpDNA), two single-copy nuclear genes (SCNGs), and six microsatellite markers (simple sequence repeats, SSR) to explore how the Quaternary climate oscillations may have influenced the current population structure of the C. taiwaniana complex. The research had three objectives: 1) to investigate the patterns of genetic variation of the C. taiwaniana complex; 2) to examine the historical population demography relative to climate fluctuations; and 3) to reconstruct phylogenetic relationships and estimate divergence time among populations. The genetic data can contribute to the management of the endangered species and can provide insights into the genetic structure and history of the C. taiwaniana complex.

Sampling and DNA Extraction
A total of 354 individuals from six closely related species of the C. taiwaniana complex were obtained from 18 populations in Hainan, Guangdong, and Fujian provinces of South China ( Figure 1 and Table 1). Fresh leaves were collected in silica gel and stored at −20°C until further processing. Within the 354 samples, 8-11 individuals from each population were randomly selected for chloroplast and single-copy nuclear DNA sequencing. All 354 individuals were used for the microsatellite study. One individual of Cycas pectinata was sampled and used as outgroup in the phylogenetic analysis. Total genomic DNA was extracted from dried leaves using a modified cetyltrimethyl-ammonium bromide (CTAB) method (Doyle and Doyle, 1987).

DNA Sequencing and Microsatellite genotyping
Two chloroplast intergenic spacers (trnS-trnG and atpB-rbcL) and two unpublished single-copy nuclear genes (EX929503  Table 1. Frontiers in Genetics | www.frontiersin.org December 2019 | Volume 10 | Article 1238 and FJ393265) were sequenced for complete analysis (primer information is provided in Supplementary Table 1). The two single-copy nuclear genes were developed from the transcriptome between C. hainanensis and C. changjiangensis. We queried the orthologous pairs with best hits against 959 sets of single-copy nuclear genes shared by Arabidopsis, Populus, Vitis, and Oryza (known as APVO genes) (Duarte et al., 2010) using TBLASTN and then designed exon-anchoring and intron-spanning primer using Primer Premier 5. The PCR amplifications for cpDNA were performed in a 35-μl reaction volume containing 2 μl of DNA template(10-40 ng), 3.5 μl of 10× buffer (Takara), 1.75 μl of dNTPs (Takara), 0.35 μl of each primer (forward and reverse), 1.75 μl of MgCl 2 , and 0.35 μl of Taq DNA polymerase (5 U/μl; rTaq, Takara). For SCNGs, the PCR reactions contained 5 μl of DNA template, 0.8 μl of each primer (forward and reverse), 10 μl of 2× Taq Master Mix (Takara), and 3.4 μl of double-distilled water. The PCR amplification of cpDNA began with an initial denaturation of 4 min at 95°C; followed by 35 cycles of 94°C for 45 s, annealing at 53-55°C for 60 s, and 72°C for 90 s; and a final extension at 72°C for 10 min. For SCNGs, the procedure began with an initial denaturation of 4 min at 95°C; followed by 34 cycles of 95°C for 30 s, 55-58°C for 60 s, and 72°C for 60 s; and a final extension at 72°C for 10 min. The product sizes were determined on 1.5% agarose gels in TBE buffer stained with ethidium bromide, and the products were then purified with the Gel Extraction Mini Kit. The PCR products were sequenced using an ABI 3730XL automated sequencer with the BigDye(r) Terminator v3.1 Cycle sequencing kit (Applied Biosystems). The sequences in this study have been deposited in GenBank (accession numbers MK613454-MK613787, and MK620342-MK620685).
Microsatellite markers were selected from developed nuclear microsatellites in Cycas (Li et al., 2009;Zhang et al., 2009). Six polymorphic and cross-transferable microsatellites loci were chosen for genotyping (Supplementary Table 2). PCR amplifications were carried out in 20 μl reaction volume containing 5 μl of DNA template, 2 μl of 10× buffer (Takara), 0.4 μl of dNTPs (Takara), 0.6 μl of each primer (forward and reverse), 0.2 μl of Taq DNA polymerase (5 U/μl; rTaq, Takara), and 11.2 μl of double-distilled water. The PCR amplification was conducted with the following conditions: an initial denaturation of 4 min at 95°C; followed by 35 cycles of 95°C for 30 s, annealing temperature of 50-56°C for 30 s, and 72°C for 1 min; and a final extension at 72°C for 10 min. PCR products were electrophoretically separated and screened on an ABI 3730XL automated sequencer (Applied Biosystems). The profiles were read using GeneMapper, version 4.0.

DNA Sequence Analyses
Sequences were edited in SeqMan, version 7.10 (Swindell and Plasterer, 1997) and were multiple aligned with BioEdit, version 7.0.9 (Hall, 1999). Two cpDNA fragments were concatenated with a congruency assessment (Farris et al., 1994) using PAUP* 4.0b10 (Swofford, 2002). For the two SCNGs, heterozygous sites were phased using DnaSP, version 5.0 (Librado and Rozas, 2009). The combined cpDNA matrix and phased SCNGs were used in the following analyses. In addition, cpDNA and SCNG datasets were combined for the phylogenetic analysis.
Nucleotide diversity (P i ) and haplotype diversity (H d ) were calculated with DnaSP, version 5.0 (Librado and Rozas, 2009). Total genetic diversity (H T ), average within-population diversity (H S ), and two parameters of genetic differentiation (G ST and N ST ) were estimated using Permut version 2.0 (Pons and Petit, 1996). Permut with 2000 permutations was then used to compare G ST and N ST at the DNA sequence level. The genetic variation within and between populations was assessed by the hierarchical analysis of molecular variance (AMOVA) with Arlequin, version 3.0 (Excoffier et al., 2005) with 10,000 permutations based on two partitions: 1) populations were grouped according to species and 2) populations were grouped according to distribution regions (Guangdong, Fujian, and Hainan). Haplotype networks were conducted with Network version 4.5 using the median-joining method (Bandelt et al., 1999). Indels were treated as single mutation events. We inferred the demographic dynamics of the C. taiwaniana complex at the gene level by performing three tests. First, we used DnaSP, version 5.0 to examine the following neutrality tests: Fu and Li's D* and Fu and Li's F* (Fu, 1997) and Tajima's D (Tajima, 1989) in F S value significantly less than zero indicates directional selection. The pairwise mismatch distributions were then examined in DnaSP, version 5.0. Mismatch distribution with a unimodal shape indicates that populations have experienced recent expansion. The sum of squared deviations (SSD) and raggedness index as well as their significance values were calculated using Arlequin, version 3.0. Finally, Bayesian skyline plot (BSP) analyses were performed with BEAST, version 2.3.0 (Drummond and Rambaut, 2007). The Markov chain Monte Carlo (MCMC) analysis was run for 10 8 generations with sampling for every10 4 iterations. Tracer, version 1.7 (Rambaut et al., 2014) was used to estimate the effective sample size (ESS). An ESS value greater than 200 indicates acceptable mixing and sufficient sampling. The coalescent Bayesian skyline was selected as priori. The other related parameter settings (evolutionary rates, substitution model, and molecular clock) are specified in the following sections.

Phylogenetic and Molecular Dating Analyses
Bayesian inference (BI) as implemented in MrBayes, version 3.2.6 (Ronquist et al., 2012) was used to construct a phylogenetic tree of the 18 populations in the C. taiwaniana complex with the combined cpDNA-single-copy nuclear DNA (scnDNA) dataset. Prior to combining the sequences, the congruence was examined using the partition-homogeneity test (Farris et al., 1994). The best substitution models for each marker were inferred under the Akaike information criterion (AIC) in MrModeltest, version 3.2 (Nylander, 2004). Furthermore, partition-specific DNA evolution models were used for each gene. Two parallel MCMC analyses started from a random tree and sampled every 1,000 generations for 6 × 10 7 generations.
Divergence times among populations were estimated using BEAST, version 2.3.0 (Drummond and Rambaut, 2007). A Yule model was set as prior for the tree model. We selected the F81 model with a relaxed uncorrelated molecular clock for combined cpDNAs and an HKY+I model with a strict molecular clock for joint nDNAs. Because an accurate evolutionary rate for Cycas was unavailable, we applied the evolutionary rates for seed plants of 1.01 × 10 −9 substitutions per site per year for synonymous sites for cpDNA and 6.1 × 10 −9 (5.1-7.1 × 10 −9 ) substitutions per site per year for synonymous sites for SCNGs (Graur and Li, 2000;Feng et al., 2017). The MCMC analyses were run for 10 8 generations and was sampled every 10,000 steps. Convergence of all parameters was checked using Tracer, version 1.7 (Rambaut et al., 2014) with ESS (value >200). The log and tree files from Tracer were combined with LogCombiner. The consensus tree was generated with TreeAnnotator, version 1.8.0 after removal of the first 10% of the trees. The results were visualized with FigTree version 1.4.2 (http://tree.bio.ed.ac.uk/Software/figtree).

Microsatellite Data Analyses
Genetic diversity indices were calculated using GenAlex, version 6.5 (Peakall and Smouse, 2012), including the number of alleles (N A ), effective number of alleles (A E ), expected heterozygosity (H E ), observed heterozygosity (H O ), fixation index (F), and percentage of polymorphic loci (PPB). Gene flow (Nm) between pairwise populations was estimated based on Nm = (1 − F ST )/4 F ST (Wright, 1931), and the differentiation index (F ST ) was calculated with Arlequin, version 3.0 (Excoffier et al., 2005). Exact tests for deviation from Hardy-Weinberg equilibrium (HWE) were computed in each locus and population using Genepop, version 4.5.1 (Rousset, 2008). AMOVA was conducted in Arlequin, version 3.0 (Excoffier et al., 2005) according to different species and distribution regions. To test for the correlation between genetic distance and geographic distance, the isolated by distance (IBD) model implemented in a mental test (Mantel, 1967) was performed in GenALEx, version 6.3. Genetic distance was calculated with Genepop, version 4.1.4 (Rousset, 2008) according to the formula F ST /(1 − F ST ). The geographic distance between sampling locations was determined using the R package (Viechtbauer, 2010).
The genetic structure of populations in the C. taiwaniana complex was assessed in several ways. First, the individualbased principal coordinate analysis (PCoA) was performed with GenALEx, version 6.5 (Peakall and Smouse, 2012) based on genetic distance between pairs of individuals. Next, unweighted pair group mean analysis (UPGMA) of populations was carried out using NTSYS-pc, version1.4 (Jensen, 1989). Finally, Bayesian model-based clustering analysis of the 18 populations was conducted using STRUCTURE, version 2.2 (Pritchard et al., 2000). The number clusters (K) from K = 1 to 18 were run 20 times. Each run contained a burn-in of 10 5 iterations and 10 5 replications of MCMC. The most likely number of K was estimated in Structure Harvest, version online, version 0.6.9 (Earl and Holdt, 2012) using ΔK (Evanno et al., 2005), where the modal value of the distribution is located at the real K. The STRUCTURE results were plotted with DISTRUCT (Rosenberg, 2004).
We performed bottleneck effect test of 18 populations in the C. taiwaniana complex to investigate demographic dynamic. The two-phased model (TPM) implemented with Wilcoxon test method was selected in bottleneck analysis. We also tested for a bottleneck by using a mode shift model (Piry et al., 1999) in BOTTLENECK, version 1.2.02 (Cornuet and Luikart, 1996).

rESUlTS genetic Diversity and Differentiation
The combined and aligned cpDNA was 1,594 bp long (877 bp for trnS-trnG and 717 bp for atpB-rbcL) with a rate of homogeneity (P = 0.08, > 0.05). A total of 17 polymorphic sites and nine haplotypes (H1-H7) across 177 individuals were identified (Supplementary Table 3). For cpDNA, haplotype diversity was detected only in populations FAIRY5, CHA1, CHA2, CHA3, and CHA4. Population CHA1 of C. changjiangensis had the highest haplotype diversity (H d = 0.64), and CHA2 had the highest nucleotide diversity (P i = 0.0011) (Supplementary Table 3). The average within-population diversity (H S = 0.095) was much lower than the total diversity (H T = 0.772) ( Table 2). N ST was significantly larger than G ST , suggesting that haplotype similarities were correlated with geographic distribution ( Table 2). The hierarchical AMOVA (Supplementary Table 3) indicated that the genetic variations were mainly due to species differences (87.97%) and interregional differences (80.02%), suggesting a regional population substructure.
The aligned SCNGs EX and FJ were 941 and 784 bp long, respectively. EX and FJ had eight and three recombination events, respectively. Two SCNGs showed variable genetic diversity. In EX, 56 haplotypes were obtained based on 33 polymorphism sites (Supplementary Table 3). At the species level, C. lingshuigensis had the highest genetic diversity (H d = 0.93, P i = 0.0045). For FJ, 29 haplotypes with 32 polymorphism loci were identified (Supplementary Table 3). C. changjiangensis had the highest species-scale haplotype and nucleotide diversity (H d = 0.86, P i = 0.0034). The overall estimated genetic diversity was larger for EX (H d = 0.94, P i = 0.0048) than for FJ (H d = 0.83, P i = 0.0028). The total genetic diversity in both EX and FJ was high. N ST was significantly larger than G ST , indicating a phylogeographic structure of haplotype distribution ( Table 2). The difference among species explained 28.69% of the EX variation and 32.48% of the FJ variation. Among species, variations were higher within populations (50.44% for EX and 49.43% for FJ) than among populations (20.87% for EX and 18.09% for FJ). Among regions, higher variances (49.16% for EX and 46.76% for FJ) were also distributed within populations than among populations based on SCNG data. The values of F ST ranged from 0.330 to 0.962, indicating a highly significant genetic differentiation among populations in the C. taiwaniana complex ( Table 3).
A total of 117 alleles ranging from 16 to 24 per locus were identified based on SSR across the 354 individuals ( Supplementary  Table 4). At the species level, C. changjiangensis, C. hainanensis, and C. lingshuigensis had high genetic diversity according to the   Table 7). The AMOVA revealed that 66.96% and 63.56% of variation were attributed to differences among individuals within populations at the species and regions levels, respectively ( Table  3). Genetic distance was significantly correlated with geographic distance (r = 0.421, P = 0.001), supporting an overall pattern of IBD in the C. taiwaniana complex (Supplementary Figure 2).

Haplotype Network and genetic Clustering
The cpDNA network indicated that H1 was the most common and widespread haplotype in seven populations across four species and is candidate ancestral haplotype. H2 was the second most frequent haplotype, which was found in six populations across two species. The remaining haplotypes occurred in only one species (Figure 2A). For EX, haplotype H1 was the most frequently shared in four species and was located in the interior position of the network, suggesting that it might be an ancestral haplotype ( Figure 2B). For FJ, H4 was the most widely shared haplotype. However, haplotype relationship remained ambiguous due to the presence of multiple "loops" (Figure 2C). The UPGMA dendrogram for microsatellite data demonstrated that populations FAIRY1, FAIRY2, FAIRY3, FAIRY4, and SZE1 grouped into one clade and that the other populations formed a second clade (Figure 3A). Structure analysis ( Figure  3B) supported two genetic clusters based on ΔK statistics (Supplementary Figure 1). The optimal K value was 2, indicating that the 18 populations from the C. taiwaniana complex were separated into two clusters. The second fittest value (K = 3) indicated that the populations were grouped into three clades. However, there were some admixtures when the K value was 3. Low genetic composition was detected among C. lingshuigensis, C. hainanensis, C. taiwaniana, and C. changjiangensis. The UPGMA and structure analyses were supported by the PCoA (Figure 3C).

Historical Demography
Regarding cpDNA data, we only conducted mismatch analyses for the three species that had haplotype genetic diversity: C. changjiangensis (Supplementary Figure 3A), C. fairylakea (Supplementary Figure 3B), and C. hainanensis (Supplementary Figure 3C). The mismatch analysis of cpDNA revealed a unimodal distribution in C. hainanensis (Supplementary Figure 3C), suggesting a recent population expansion. This inference was supported by the significant negative values of Fu and Li's D* and F* (Supplementary Table 8). For SCNGs, C. taiwaniana showed significant positive values for Fu and Li's F* and for Tajima's D, suggesting a stable demographic history. Furthermore, multimodal distribution patterns were detected in C. taiwaniana, which indicated that the species has not undergone recent population expansion (Supplementary Figures 3I, K). For EX, multimodal distribution patterns were also detected in C. fairylakea, C. hainanensis, C. lingshuigensis, C. szechuanensis and C. changjiangensis (Supplementary Figure 3D-H). For FJ, C. szechuanensis, C. fairylakea and C. changjiangensis showed multimodal distribution patterns (Supplementary Figure 3J, L and M). However, unimodal distributions were detected in C. hainanensis and C. lingshuigensis (Supplementary Figure 3N and O).  The Bayesian skyline plot analyses revealed a complex demographic history in the C. taiwaniana complex. The results derived from cpDNA, EX, and FJ were inconsistent. Based on cpDNA data, the population size experienced a long history of equilibrium and rapid contraction since approximately 8,000 years ago (Figure 4A). For EX, slow population growth with a subsequent slight decline since 5,000 years ago was detected ( Figure 4B). The Bayesian skyline plot of FJ indicated three dynamic events: the C. taiwaniana complex was stabile from 1.25 to 0.07 Ma, subsequently experienced a population expansion beginning about 0.07 Ma, and then underwent a slow decline from 0.015 Ma to the present (Figure 4C).
According to the bottleneck analysis, four populations (HAI3, TAI1, FAIRY3, and FAIRY5) had a significant excess of heterozygosity and deviated from mutation-drift equilibrium (Supplementary Table 9). In addition, the mode shift tests revealed that most of the populations had a normal L-shaped distribution, except for populations TAI1, FAIRY2, FAIRY3, FAIRY4, and FAIRY, which suggested that only these four populations had experienced a severe bottleneck (Supplementary Table 9).

Phylogenetic relationship and Divergence Time
The partition-homogeneity test analysis indicated that the cpDNA and scnDNA sequences could be combined for phylogenetic analysis (P = 0.1). The phylogenetic tree constructed from the concatenated cpDNA-scnDNA dataset was partly unresolved with C. pectinata as the outgroup (Figure 5). However, two main clades in the C. taiwaniana complex were obtained with high support: one clade comprised C. fairylakea and C. szechuanensis (clade A), and the second clade (clade B) comprised C. hainaniana, C. lingshuigensis, C. changjiangensis, and C. taiwaniana (Figure 5). Within the first clade, C. szechuanensis was placed as the sister group of C. fairylakea with robust support. The relationships among C. hainaniana, C. lingshuigensis, C. changjiangensis, and C. taiwaniana were weakly supported.
The species trees for cpDNA and SCNG datasets had similar topologies and corroborated the presence of two wellsupported clades (Supplementary Figure 4). The populations based on cpDNA sequences were divided into two lineages that diverged at approximately 1.790 Ma (95% HPD = 0.85-2.87 Ma) (Supplementary Figure 4A). For SCNGs, populations also clustered into two lineages that diverged about 0.311 Ma (95% HPD = 0.11-0.71 Ma) (Supplementary Figure 4B). Dating analyses suggested that six taxa of the C. taiwaniana complex diverged in the Middle Pleistocene.

DISCUSSION genetic Diversity and Differentiation
Population size, life history traits, breeding, and eco-landscape can affect genetic variation within or among species (Nybom, 2004;Reisch and Bernhardt-Roemermann, 2014;Reisch et al., 2017). Most Asian Cycas are narrowly distributed and have similar life history traits, such as dioecy, long life spans, anemophilous, or entomophilous (Feng et al., 2014). At the species level, C. fairylakea had lower genetic diversity based on cpDNA and SSR than other Asian Cycas species (Zheng et al., 2017). C. changjiangensis had high genetic diversity based on cpDNA and SCNGs, which is consistent with previous results based on allozyme analysis by Jian et al. (2005c). The total genetic diversity of the C. taiwaniana complex among all populations FIgUrE 4 | Bayesian skyline plots based on cpDNA (A) and single-copy nDNA: EX (B) and FJ (C). The X-axis indicates time in thousands of years ago and the Y-axis represents the effective size multiplied by generation time on a log scale. The black line indicates the estimated median and the area between the blue lines represents the 95% confidence interval. was higher than the mean value (H T = 0.67) of 170 plant species (Petit et al., 2005). Probable explanations for the high genetic variation include retention of ancestral polymorphisms (Yang et al., 2015), refugia (Feng et al., 2014;Lin et al., 2018), genetic drift (Van et al., 2010), or genome accumulation during a long evolutionary history (Feng et al., 2014).
In the current study of 18 populations in the C. taiwaniana complex, the total genetic diversity (H T = 0.772 based on cpDNA) was much larger than the average within-population diversity (H S = 0.095), suggesting that the level of genetic differentiation among populations is high (G ST = 0.977 and N ST = 0.952). In addition, N ST was significantly greater than G ST , which indicates the presence of a distinct phylogeographical structure in the C. taiwaniana complex. The F ST values obtained by cpDNA were also much greater than the average value of 0.67 previously estimated from other maternally inherited markers (Petit et al., 2005), indicating significant genetic differentiation among populations. SCNG-and SSR-based F ST values greater than 0.25 also demonstrate significant genetic differentiation (Wright, 1978). In Cycas species, the cpDNA is maternally inherited by seed, whereas SCNG is biparentally inherited through seed flow and pollen flow. In this study, the higher F ST detected by cpDNA than those in SCNG and SSR suggested that the pollen flow of C. taiwaniana complex was greater than the seed flow Xiao et al., 2018).

Phylogeographical Pattern
In the STRUCTURE analysis, the low genetic admixture and robust population clusters also suggest high genetic differentiation. The Cycas species, with similar life history and reproductive characteristics, are thought to have a high level of genetic variation within populations and significant genetic differentiation among populations (Liu et al., 2015). The significant genetic differentiation and clear genetic structure observed in the C. taiwaniana complex may be explained by limited gene flow (Feng et al., 2014;James et al., 2018), which had also been reported for other Cycas species (Zhao and Gong, 2015;Feng et al., 2017). The effective propagation distance of Cycas species is 2-7 km (Yang and Meerow, 1996). The seeds of Cycas species are too large to FIgUrE 5 | Phylogenetic relationships among 18 populations in the Cycas taiwaniana complex inferred from combined two single-copy nuclear genes and two chloroplast intergenic spacers based on Bayesian inference. disperse for a long distance and drop near the mother plants, enhancing the probability of inbreeding (Hall and Walter, 2011).
Our STRUCTURE analysis of SSR detected two distinct clusters, which correspond to the morphological characteristics and geography. The two lineages were also supported by a phylogenetic tree, UPGMA and PCoA clustering. C. szechuanensis and C. fairylakea from inland China formed one clade, whereas C. hainanensis, C. changjiangensis, and C. lingshuigensis from Hainan Island formed the other clade. There was a low level of admixture for Cycas species between Hainan Island and the Mainland. Island populations that were isolated from Mainland relatives could exhibit genetic divergence, local adaptation, and incipient speciation. Furthermore, the limited dispersal abilities of Cycas on islands and their adaptation to the local environments are likely to produce distinct lineages (Huang et al., 2013b). However, C. taiwaniana with only one population from Fujian were clustered into Hainan Island clade. The exact origin of C. taiwaniana remains unclear. If the C. taiwaniana was ever part of the flora of Mainland China, it is probably now extinct in that habitat. The C. taiwaniana complex is dioecious and allogamous. Their seeds are heavy and contain the toxic compound, which hampers long distance disposal by water or animals (Schneider et al., 2002). Based on our BEAST analysis, C. taiwaniana diverged more recently than C. hainanensis, C. changjiangensis, and C. lingshuigensis from Hainan Island (Supplementary Figure 4). Therefore, C. taiwaniana is likely originated on Hainan Island, then migrated to the Mainland and survived in Fujian now.
Both the BI analysis from the combined cpDNA-SCNGs dataset and the two Bayesian species trees revealed similar relationships among the two major lineages within the C. taiwaniana complex. The results presented here are mostly consistent with the recent molecular phylogeny of Cycas . Clade A (C. szechuanensis and C. fairylakea) is well supported with PP = 1. The other clade B (C. hainanensis, C. changjiangensis, C. lingshuigensis, and C. taiwaniana) located in Hainan Island and Fujian Province is supported by PP = 0.97. However, the relationship within C. hainanensis, C. changjiangensis, C. lingshuigensis, and C. taiwaniana is not very clear. The clade including species from Hainan Island is geographically isolated from species from South China, indicating that limited genetic exchange may have promoted the divergence of the two clades. Hainan Island was separated from Mainland China by the Qiongzhou Strait, which formed approximately 2.5-25 million years ago (Wu et al., 2012). The land bridge of Qiongzhou Strait has experienced cyclic upheaval and submergence with sea-level changes (Zhao et al., 2007). The low level of gene exchange between island taxa and continental relatives might have been facilitated by the periodic presence and absence of the land bridge (Zhao et al., 2013).

Divergence Time and Demographic Dynamics
Phylogenetic analyses revealed two lineages (A and B) in C. taiwaniana complex. The divergence time of the two lineages based on cpDNA and SCNG sequences is about 1.790 and 0.311 Ma, respectively, corresponding to a period of glacial cycle during the Middle Pleistocene, which verifies recent divergence in the former research (Nagalingum et al., 2011). The divergence time from cpDNA and SCNG was inconsistent, which is caused by different evolutionary rates and migration modes between organelle (pollen migration) and nuclear markers (pollen migration and seed migration; seed migration is very little) (Wolfe et al., 1987;Zheng et al., 2016). The recent divergence between the two lineages indicated that the species in clade B colonized Hainan Island in the Middle Pleistocene when land bridges formed (Shi et al., 2006;Huang et al., 2013b). A compilation of molecular data for divergence in subspecies and species complexes showed that species were formed through the Pliocene and Pleistocene (Hewitt, 1993). We suspect that the geographic isolation and low seed dispersal ability in the C. taiwaniana complex have generated genetic isolation among populations, i.e., have promoted a "terrestrial island speciation" (Wang et al., 2017).
Although the Pleistocene ice events were evident in Europe (Taberlet et al., 1998), the climate also changed in China (Harrison et al., 2001). Climate oscillations during the Pleistocene had effects on the demographic history of plants (Dorsey et al., 2018;Zhang et al., 2019). Different Cycas species had different population dynamics to respond to glacial and interglacial influences. Cycas debaoensis (Zhan et al., 2011), Cycas simplicipinna (Feng et al., 2014), Cycas multipinnata , and Cycas diannanensis (Liu et al., 2015) have experienced population contractions, while Cycas revoluta and Cycas taitungensis (Chiang et al., 2009) have expanded their geographical distributions. In this study, a recent population contraction in the C. taiwaniana complex is suggested based on three markers (cpDNA, SCNGs, and SSR). It is possible that refugia, temperature changes, and recent human activities caused the contractions in the C. taiwaniana complex.

Implications for Conservation
To successfully protect threatened species, it is necessary to preserve the maximum genetic diversity (Montalvo et al., 1997;Jump and Penuelas, 2005). Species with unique haplotypes should be given the highest priority protection (Petit et al., 1998). In this study, we found a higher genetic diversity in the C. taiwaniana complex than many other Cycas species in China. However, severe habitat loss and illegal exploitation for trading and ornament have severely threatened the existing populations of C. taiwaniana complex. C. hainanensis, C. changjiangensis, C. lingshuigensis, and C. taiwaniana are regarded as endangered (EN), while C. szechuanensis and C. fairylakea are critically endangered (CR) species (IUCN, Species Survival Commission, Cycad Specialist Group, Source: https://www.iucnredlist.org/ search). Therefore, we proposed to protect all the existing wild populations of C. taiwaniana complex and their habitats in situ. Some species and populations with unique haplotypes, such as C. hainanensis, C. changjiangensis, C. lingshuigensis, C. szechuanensis, C. fairylakea, and populations LING1, HAI2, FAIRY1, FAIRY5, and CHA1-4 should be given the highest priority protection. We also recommend the ex situ conservation for those critically endangered species (C. szechuanensis and C. fairylakea) and important populations (such as small population size or with unique haplotypes) by collecting seeds or seedlings from wild (Griffith et al., 2015;Griffith et al., 2017). Furthermore, conservation awareness of local people and law establishment by the government should be improved to protect C. taiwaniana complex from extinction.

CONClUSION
Our findings demonstrate that the C. taiwaniana complex has a high level of genetic diversity and a high degree of genetic differentiation among populations. Our findings also suggest that the C. taiwaniana complex have two main lineages. C. fairylakea was incorporated into C. szechuanensis and the other populations merged into one lineage. The genetic structure of the C. taiwaniana complex has been greatly affected by Pleistocene climate shifts, sea-level oscillations, and human activities, which may increase our ability to identify the phylogeographic patterns in these species and to detect other potential evolutionary lineages within the C. taiwaniana complex across South China. In addition to providing insight into the evolution of Cycas species, the findings of this study will be useful for the conservation of these endangered plants.

DATA AVAIlABIlITY STATEMENT
The datasets generated for this study can be found in the GenBank with accession numbers MK613454-MK613787, MK620342-MK620685.

AUTHOr CONTrIBUTIONS
S-GJ designed the research and collected the study materials. X-HW performed the SCNG sequencing, data analyses, and wrote the manuscript. JL and L-MZ contributed to material collection, DNA extraction, SSR and cpDNA sequencing. S-GJ, Q-MM, Z-WH, and XG contributed to manuscript revisions. All authors approved the final manuscript.

ACKNOWlEDgMENTS
We are grateful to Quan Chen and Zhengfeng Wang for their assistance in maintaining and collecting the samples used in this work.