Evolutionary History of Rhus chinensis (Anacardiaceae) From the Temperate and Subtropical Zones of China Based on cpDNA and Nuclear DNA Sequences and Ecological Niche Model

To explore the origin and evolution of local flora and vegetation, we examined the evolutionary history of Rhus chinensis, which is widely distributed in China’s temperate and subtropical zones, by sequencing three maternally inherited chloroplast DNAs (cpDNA: trnL-trnF, psbA-trnH, and rbcL) and the biparentally inherited nuclear DNA (nuDNA: LEAFY) from 19 natural populations of R. chinensis as well as the ecological niche modeling. In all, 23 chloroplast haplotypes (M1–M23) and 15 nuclear alleles (N1–N15) were detected. The estimation of divergence time showed that the most recent common ancestor dated at 4.2 ± 2.5 million years ago (Mya) from cpDNA, and the initial divergence of genotypes occurred at 4.8 ± 3.6 Mya for the nuDNA. Meanwhile, the multimodality mismatch distribution curves and positive Tajima’s D values indicated that R. chinensis did not experience population expansion after the last glacial maximum. Besides, our study was also consistent with the hypothesis that most refugia in the temperate and subtropical zones of China were in situ during the glaciation.

The temperate and subtropical region of China is a model area for studying plant species in response to past climate changes (Chen S.C. et al., 2012;Li X.H. et al., 2012;Qi et al., 2012;Zhao et al., 2013;Fu et al., 2014). Up to date, many phylogeographic studies have been used to elucidate the impacts of the uplifts of the QTP on the climate within the modern-day temperate and subtropical zones, or warm temperate zones in China (e.g., Yellow River Basin, Chen K.M. et al., 2008 andChen Y.Y. et al., 2008;Yunnan-Guizhou Plateau, Fan et al., 2013;Wang W. et al., 2014;Yangtze River, Sun et al., 2013;Wang H. et al., 2015;Qinling Mountains, Liu J.Q. et al., 2014;Lu et al., 2016;QTP, Liu D. et al., 2015 andLiu Y.P. et al., 2015); i.e., 23.5 • -42.0 • N and 98.0 • -124.0 • E (Gao et al., 2003;Shangguan et al., 2009). The results showed that the QTP acted as a barrier against glaciation within the warm temperate zones of China and resulted in the arid climate for thousands of years within the Quaternary period, which has been widely accepted nowadays Yu et al., 2013;Meng et al., 2014). Thus, the present warm temperate region probably served as a glacial refugia for plant species in the past time, and this hypothesis has been tested and advanced through phylogeographic studies (e.g., Liu et al., 2012;Qi et al., 2012;Wan et al., 2016). However, it is less well known whether population genetic diversification of plants within the warm temperate zone or within the glacial refugia is due to isolation on a heterogeneous landscape or adaptation and selection along ecological gradients Zhao et al., 2016). Therefore, more phylogeographic studies of additional plant species within the warm temperate refugial regions are necessary in order to detect their spatial geographic patterns and to assess the underlying causes.
Rhus chinensis belongs to the plant family Anacardiaceae and is a common deciduous tree that is endemic to the warm temperate zone of Asia. It widely occurs at the elevation of 170-2700 m above sea level in Shaanxi, Shanxi, Hebei, Sichuan, Hunnan, and Yunnan of China (Zheng and Min, 1980). Due to its commonality and widespread distribution within the warm temperate zone, R. chinensis is thus an ideal study case for phytogeography within this region. In this study, we used three cpDNA regions (trnL-trnF, psbA-trnH, and rbcL) and one nuDNA region (LEAFY) to examine (1) the genetic diversity and structure of R. chinensis populations in China and (2) how is the demographic history of R. chinensis during the Quaternary climate oscillations, and further to explore the origin and evolution of local flora and vegetation.

Population Sampling
In total, leaf samples of 312 individuals were collected from 19 natural populations of R. chinensis, representing its whole geographic distribution within the warm temperate zone of China (see Figure 1 and Table 1). Eight to 20 individuals were collected for each population, and all individuals were at least 15 m apart. We obtained several voucher specimens for each population, which were deposited at the School of Life Sciences, Shanxi University, Taiyuan, Shanxi, China. The information of latitude, longitude, and altitude of each population were recorded using an Etrex GIS (Garmin, Taiwan, China).

Data Analysis
We aligned sequences with Clustal_X (Thompson et al., 1997) and coded indels following the method of Simmons and Ochoterena (2000). Indels within mononucleotide repeat regions were deleted for phylogenetic analyses, because the homology of these indels could not be verified (Chen S.C. et al., 2012).  The levels of inter-and intra-population genetic diversity (h: haplotype diversity and π: nucleotide diversity) were calculated for the cpDNA and nuDNA using DnaSP version 5.0 (Rozas et al., 2003). We compared G ST and N ST using the U-statistic, which is approximated by a Gaussian variable by taking into account the covariance between G ST and N ST , and a one-sided test (Pons and Petit, 1996). The former considers only haplotype frequencies while N ST also takes into account differences between haplotypes. When N ST is larger than G ST , phylogeographic structure is obvious, which indicates that closely related haplotypes were found more often in the same area than less closely related haplotypes (Pons and Petit, 1996). We also estimated genetic differentiation among all populations with AMOVA and inferred population growth and expansion according to Tajima's D using Arlequin version 3.0 (Excoffier et al., 2005), with 1000 random permutations to test for significance of partitions. Genealogical relationships among cpDNA and nuDNA haplotypes were constructed using TCS version 1.21 (Clement et al., 2000).
The phylogenetic relationships among haplotypes and genotypes of cpDNA and nuDNA were reconstructed with Bayesian inference (BI) methods in MrBayes version 3.1.2 (Ronquist and Huelsenbeck, 2003). We applied the best fit model, GTR + I + G, which was inferred by Modeltest 3.7 under the Akaike information criterion (Posada and Crandall, 1998;Ronquist and Huelsenbeck, 2003). The BI consisted of two parallel runs with four incrementally heated chains and three million generations sampled every 1000 generation. The output was diagnosed for convergence using TRACER v.1.3 (Rambaut and Drummond, 2007), and summary statistics and trees were generated using the last two million generation in MrBayes version 3.1.2 (Ronquist and Huelsenbeck, 2003). In order to distinguish the haplotypes and genotypes clearly, the branches with high bootstrap value (>0.95) were classified as new clades based on the phylogenetic trees (Porter et al., 2005;Pyron, 2011;Ye et al., 2017). The divergence times within R. chinensis were estimated using a molecular clock and fossil data. Three fossils of Rhus were used to calibrate the node ages of R. typhina and R. glabra (6.0 Mya), R. typhina and R. virens (38.1 Mya), and R. typhina and P. vera (49.1 Mya) for cpDNA data, respectively (Yi et al., 2004), while one fossil (49.1 Mya) was used as the divergence time between Rhus and Pistacia for nuDNA data. Both the strict and relaxed molecular clock rates were tested in MEGA6 (Tamura et al., 2013) using the BI summary tree, and they could not be rejected for either the cpDNA or nuDNA data. Therefore, the strict and relaxed clocks were both applied to the two datasets in the BASEML and MCMCTREE programs of PAML (Yang, 2007), and used our BI summary tree as the guide tree.

Ecological Niche Modeling
We compared the current distributions of R. chinensis with its inferred distributions during the last glacial maximum  (Hijmans et al., 2005) representing the present (averaged from 1950) and the LGM according to the Community Climate System Model (CCSM; Collins et al., 2006). We employed 20 replicates based on 80% of the distribution coordinates for training and 20% for testing, and adopted the model with the best AUC values (Phillips et al., 2006). We performed a jackknife test to estimate the percent contributions of bioclimatic variables to the prediction for the distributional models. Meanwhile, we also employed the "10 percentile presence" threshold logistic approach as determined by Maxent in order to distinguish the threshold between suitable and unsuitable habitats for further analyses. We drew Graphics for each predicted SDM using DIVA-GIS 7.5 (Hijmans et al., 2005).

Genetic Diversity and Structure
Aligned cpDNA dataset consisted of 2051 bp with 70 nucleotide substitutions and two indels. We detected 23 different haplotypes (M1-M23) based on combined cpDNA dataset from 19 3 http://www.worldclim.org/ populations. The LEAFY gene region varied from 412 to 645 bp and had an aligned length of 682 bp, which contained 14 nucleotide substitutions. Our sequences of LEAFY comprised 15 genotypes (N1-N15). Based on cpDNA and nuDNA sequences, the total haplotype diversity of R. chinensis was estimated to be 0.738 and 0.614, and the total nucleotide diversity was inferred to be 6.910 × 10 −3 and 3.050 × 10 −3 , respectively ( Table 2). We found the highest levels of haplotype and nucleotide diversity in four populations: P2, P11, P14, and P16 ( Table 2). The most widespread haplotypes and genotypes were M1 (in 11 of 19 populations, cpDNA) and N3 (in 18 of 19 populations, nuDNA; Table 2), respectively. Based on cpDNA and nuDNA sequences, M1 and N3 were the primary haplotype and genotype, respectively (Figure 1). AMOVA analysis indicated that genetic variation in R. chinensis was greater within populations than among them (P < 0.01; Table 3). The mismatch distribution (Figures 1A1,B1) and positive values of Tajima's D value (1.19, 0.05 < P < 0.10 for cpDNA; 2.37, P < 0.01 for nuDNA) of all populations rejected a sudden expansion model, and positive Tajima's D may indicate population admixture. Phylogeographic structure is not obvious at the species level for both sets of genetic markers. For the cpDNA data, N ST (0.382) was slightly higher than G ST (0.375), while for the LEAFY, the difference between the two indices was not significant (N ST = 0.321, G ST = 0.319, P > 0.05).

Ecological Niche Modeling
The inferred past (LGM) and current distributions of R. chinensis are illustrated in (Figure 3). The AUC values based on both training and test presence data for the present and at the LGM were all higher than expected (not shown), which demonstrated good model performance. It was notable that the current distribution model indicated that R. chinensis mainly occurred in the warm temperate zone of China, which also suggested that it should occur in the same region during the LGM period ( Figure 3A). In the comparison with the two simulated distributions, the LGM distribution model predicted that the species was mainly concentrated in Yunnan and central China including Shaanxi, Sichuan, Hubei, and Jiangxi provinces, and it had slightly shrunk in these regions during the LGM period ( Figure 3B).

DISCUSSION
We did not detect a clear phylogeographic structure among the 19 populations of R. chinensis sampled in the present study. We found a somewhat lower differentiation among R. chinensis populations (N ST = 0.382 for cpDNA, N ST = 0.321 for nuDNA) compared to sympatric species such as Platycarya strobilacea (Chen S.C. et al., 2012) and Cotinus coggygria (Wang W. et al., 2014). Limited phylogeographic structure within a metapopulation may be due to high levels of geneflow and/or of geophysical connectedness (Avise et al., 1987). High levels of gene flow among R. chinensis populations may be due to the seed dispersal mechanism, which has been implicated in high levels of gene flow in many other plant species (e.g., Lopez et al., 2007;Song et al., 2013;Johnson et al., 2017). R. chinensis can produce 1000 seeds per plant on average, and the seeds are dispersed by animals, including mammals and birds, and by water (Huang and Qiu, 1994;Wang W. et al., 2014). Therefore, it is possible that relatively limited population differentiation may be due to the movement of seeds, including maternal and bi-parental genetic material, throughout the warm temperate zone. Geophysical connectedness within the range of R. chinensis may also be responsible for high levels of gene flow among populations. Stated another way, there may be limited barriers to dispersal. In the distributional area of R. chinensis, no obvious geographic barriers have been observed. Therefore, R. chinensis does not appear to be geographically isolated, allowing ecological niche modeling to be used in the assessment of species status (Li X.H. et al., 2012;Liu L. et al., 2014;Wang W. et al., 2014). Ecological niche models suggested the suitable habitats of R. chinensis were continuous in the present time while compressed during the LGM period, demonstrating multiple possible isolated glacial refugia (Figure 3). The response to impact of cold and warm times on the distribution of R. chinensis was validated in the simulation of ecological niche modeling, although we only used the simulated environment of current and LGM period (Figure 3). This pattern of range Frontiers in Genetics | www.frontiersin.org shifts indicated a likely scenario of repeated glacial compressions followed by interglacial expansions for R. chinensis during the Quaternary climatic oscillations. It is interesting that the geographic distribution of the cpDNA haplotypes differs from the nuDNA genotypes (Figure 2). Mismatch distributions between organellar DNA haplotypes and nuclear DNA genotypes have been reported in other groups such as Sophora davidii (Fan et al., 2013), Cycas diannanensis , and Osteomeles schwerinae (Wang Z.W. et al., 2015). Therefore, we thought that the forest birds and mammals were known as seed dispersers for many species in Anacardiaceae (Wang W. et al., 2014), which might have directly impacted the genetic structure with biparental inheritance.
The populations originated from Yunnan occurred at the China-Vietnam border and split from other clades at 4.2 ± 2.5 and 3.8 ± 3.0 Mya according to the cpDNA (clade VI) and nuDNA (clade III), respectively (Figures 2A,B). Early diverging populations in Yunnan have been detected in other genera or species such as Ceratotropis (3.62 Mya, Javadi et al., 2011), Incarvillea sinensis (4.4 Mya, , and Stuckenia filiformis (3.93 Mya, Du and Wang, 2016). Within these species, the uplift of the QTP has been implicated as the main mechanisms of driving diversifications, but the estimated divergences were more recent than the last phase of the uplift (7-8 and 13-15 Mya; Harrison et al., 1992;Shi et al., 1998;Spicer et al., 2003). So, we thought that the geographical isolation of Yunnan populations was caused by the isolation of the QTP uplift in late Pliocene. Furthermore, the suitable climate in the temperate and subtropical zone could have subsequently facilitated the Pliocene-Pleistocene diversification of R. chinensis into different eco-geographic populations (Javadi et al., 2011).
Previous phylogeographic studies have widely supported hypotheses that climatic changes during the LGM forced plants into refugia within Central China, where they were protected by the QTP from the brunt of the ice age (Tian et al., 2009;Liu et al., 2012). After the glaciers retreated, the plants expanded their ranges rapidly (Hewitt, 2000;Li Z.H. et al., 2012;Qi et al., 2012). Our results showed that the range of R. chinensis had increased since the LGM (Figure 3) but did not support a rapid expansion based on the mismatch distribution (Figure 1) and Tajima's D (1.19, 0.05 < P < 0.10 for cpDNA; 2.37, P < 0.01 for nuDNA). Refugia in the warm temperate China may have been dominated by evergreen forest or temperate deciduous forest during the LGM (Liu, 1988). Thus, southern Shaanxi, northern Sichuan, Yunnan, and Jiangxi could have supported R. chinensis during the LGM and been its main center of diversity. Just as P. strobilacea (Chen S.C. et al., 2012), Cercidiphyllum (Qi et al., 2012), and C. coggygria (Wang W. et al., 2014), the plants were slightly affected and were able to survive in situ at the period of the glaciation. So, the characterized phylogeographic structure of R. chinensis was consistent with the second hypothesis, which was that they survived in situ and occupied multiple localized glacial refugia during the glaciation.

CONCLUSION
We used cpDNA and nuDNA sequences, and ecological niche modeling to investigate the evolutionary history of R. chinensis distributed in the warm temperate zone of China. The cpDNA and nuDNA data separately revealed six and five clades corresponding to the geographic regions. The divergence among haplotypes and genotypes of R. chinensis occurred at the Pliocene based on cpDNA and nuDNA data. Our ENMs showed enlarged potential distributions in the present compared to LGM, but we did not detect a sudden demographic expansion after the glaciation according to the molecular data. Our results suggest that R. chinensis was not affected by glacial cycles seriously and survived in situ and occupied a few main refugia.

AUTHOR CONTRIBUTIONS
ZR conceived and designed the research. YL and YZ collected the samples, performed the experiments, and conducted data analyses. XS and ZR drafted the manuscript. JW polished the manuscript. All authors read and approved the final manuscript.