Molecular Phylogeography and Ecological Niche Modeling of Sibbaldia procumbens s.l. (Rosaceae)

The phylogeographical analysis and ecological niche modeling (ENM) of the widely distributed Northern Hemisphere Sibbaldia procumbens s.l. can help evaluate how tectonic motion and climate change helped shape the current distribution patterns of this species. Three chloroplast regions (the atpI-atpH and trnL-trnF intergenic spacers and the trnL intron) were obtained from 332 (156 from present study and 176 from the previous study) individuals of S. procumbens s.l. An unrooted haplotype network was constructed using the software NETWORK, while BEAST was used to estimate the divergence times among haplotypes. ENM was performed by MAXENT to explore the historical dynamic distribution of S. procumbens s.l. The haplotype distribution demonstrates significant phylogeographical structure (NST > GST; P < 0.01). The best partitioning of genetic diversity by SAMOVA produced three groups, while the time to the most recent common ancestor of all haplotypes was estimated to originate during the Miocene, with most of the haplotype diversity having occurred during the Quaternary. The MAXENT analysis showed S. procumbens s.l. had a wider distribution range during the last glacial maximum and a narrower distribution range during the last interglacial, with predictions into the future showing the distribution range of S. procumbens s.l. shrinking.


INTRODUCTION
Intercontinental disjunct distributions of flowering plants are believed to result from the opening and closing of land bridges (Beringia and the North Atlantic) between Eurasia and North America, with both vicariance and long-distance dispersal proposed to play important roles in shaping intercontinental disjunctions (Boufford and Spongberg, 1983;Wen and Shi, 1999;Manos and Donoghue, 2001;Ivanov et al., 2002;Riggins and Seigler, 2012).
During the period of about 15-5.5 Ma, due to persistent cooling and aridification, the Beringia was hypothesized to act as the land connection between North America and Eurasia allowing floristic exchange (Riggins and Seigler, 2012;Krasnov et al., 2015). The Bering Strait disappeared during the late Miocene, and reemerged multiple times during Pleistocene glacial periods when sea levels dropped (Marincovich and Gladenkov, 1999;Gladenkov et al., 2002).
In comparison, the North Atlantic land bridge is thought to have been the connection between eastern North America and northwestern Europe across Greenland during the late Eocene and Plio-Pleistocene (Qian and Ricklefs, 2000;Ian Milne, 2006).
Large-scale shifts of distribution range are believed to have taken place in response to the development of Quaternary continental ice-sheets, especially the cycles of glacial and interglacial during the Pleistocene (Ujiié and Ujiié, 1999;Zhang et al., 2007;Yang et al., 2008). Many mountain ranges have been strongly glaciated during the Quaternary ice age, the icefree mountain top and the edge of ice sheet offering the only refugia for alpine plants (Briner et al., 2003;Provan and Bennett, 2008). The refugia were characterized with special, buffering environments, in which species can persist one or more glacialinterglacial cycles (Rowe et al., 2004;Provan and Bennett, 2008). Re-colonization in post-glacial regions may have occurred either through long-distance dispersal from unglaciated areas or from local refugia. Both ways have played important roles in shaping the distribution patterns of species. If dispersal was the predominant mechanism of re-colonization, the species would tolerate a wide range of environmental conditions (Martin et al., 2009). When species survived in local refugia, a certain relict component of this species should exist, mostly endemic and likely with a narrower tolerance range (Martin et al., 2009). High mountain ranges were especially significant for the long-term survival of plants because their height offered sufficient scope for altitudinal shifts when climate changed (Meng et al., 2007;Fu et al., 2016).
In the present study, we newly sequenced three chloroplast non-coding regions (atpI-atpH spacer, trnL intron, and trnL-trnF spacer) from 156 S. procumbens s.l. individuals primarily in Eastern Asia, in combination with sequences from populations in North America and Europe (Allen et al., 2015), with the goal (1) to examine the genetic structure of S. procumbens s.l. in the Northern Hemisphere and (2) to investigate the possible historical events that helped shape the current geographical and genetic distribution patterns of S. procumbens s.l.

MATERIALS AND METHODS
Population Sampling, DNA Isolation, Amplification, and Sequencing A total of 156 individuals representing 29 populations of S. procumbens s.l. were newly collected from China, Armenia, Austria, Russia, and Japan (Table 1), which covers nearly the entire geographical range of S. procumbens s.l. in Eurasia. Fresh leaves were collected from each individual and dried with silica gel. Total genomic DNA was extracted using the Plant Genomic DNA extraction kit (TIANGEN, Beijing, China).
An unrooted haplotype network was constructed using NETWORK 4.6.11 (Bandelt et al., 1999). Partitioned Bayesian analysis was carried out with MrBayes 3.1.2 (Huelsenbeck and Ronquist, 2001), under the general time reversible substitution model. The best partition scheme and substitution model was determined by PartitionFinder 1.1.1 (Lanfear et al., 2012). We executed one cold and three incrementally heated Monte Carlo Markov chains on two simultaneous runs, sampling every 1,000th generation, with a burn-in of 25%. Indels were included as (0, 1) characters in a separate data partition with software SeqState (Müller, 2005). The divergence time of haplotypes was estimated using BEAST 1.7.5 (Drummond and Rambaut, 2007), following the descriptions by Allen et al. (2015).

Biogeographic Reconstruction
For biogeographic reconstruction, individuals of S. procumbens s.l. were assigned to five areas (labeled A-E) based on their present distribution: Eastern Asia, Central Asia, Western Eurasia, eastern North America, or western North America. Ancestral distributions of S. procumbens s.l. were inferred by BioGeoBEARS (Matzke, 2013(Matzke, , 2014 as implemented in RASP 4.0 (Yu et al., 2015) using 10,000 randomly sampled trees from the posterior distribution generated in the BEAST analysis, all other parameters were set to their defaults.

Ecological Niche Modeling
Ecological niche modeling was carried out in Maxent 3.3.3k (Phillips and Dudík, 2008) to reconstruct potential geographic distribution of S. procumbens s.l. in different historical periods. A total of 225 records (Supplementary Table S1) spread across the distribution range were used for the analysis. Nineteen bioclimatic variables with the 2.5 arc-min spatial resolution were downloaded from WorldClim 1 , and the variables with | Pearson's R| ≥ 0.7 were excluded; 10 variables (BIO1, annual mean temperature; BIO2, mean diurnal range; BIO5, max temperature of warmest month; BIO7, temperature annual range; BIO8, mean temperature of wettest quarter; BIO9, mean temperature of driest quarter; BIO12, annual precipitation; BIO15, precipitation seasonality; BIO17, precipitation of driest quarter; BIO18, precipitation of warmest quarter) were selected to simulate the past, current, and future suitable areas of S. procumbens s.l. Model validation was performed using Maxent default settings with 10 independent replicates of subsamples. The present distribution of S. procumbens s.l. was established using the bioclimatic variables for the current climate . Future climate scenario data for 2070 (2061-2080) were obtained from global climate models (GCMs), with future climate projections based on IPCC 5th assessment data, with calibration and statistically down-scaling using the data for the "current" climate. The community climate system model 4 (CCSM4) and NCAR-CCSM were chosen for the last glacial maximum (LGM, 0.021 Ma) and the last interglacial (LIG, 0.12-0.14 Ma), respectively.
A total of 41 haplotypes were detected, including 19 (Haplotypes 23-41) newly identified haplotypes in this study and 22 previously identified haplotypes by Allen et al. (2015). Haplotypes 1-18 correspond to haplotype A-R in Allen et al. (2015); haplotypes 19-22 correspond to haplotype T, U, V and W in Allen et al. (2015). Among the 28 haplotypes in Eurasia, 16 were found only in a single population, two haplotypes were found in two populations, and five were found in more than two populations (Table 2 and Figure 1). The most frequent and widespread haplotype (H19) was found in 30% of individuals and 17.6% of populations, which was widely distributed in the QTP and adjacent regions of Taiwan and Georgia. Fourteen populations contained only one haplotype (Table 2 and Figure 1). Twelve populations contained two haplotypes, six populations contained three haplotypes, and one population contained four haplotypes (Table 2 and Figure 1).

Genetic Diversity and Structure
The haplotype diversity (Janssens et al., 2016) and the nucleotide diversity (π) was 0.8977 and 0.6 × 10 −2 , respectively, for all populations in the Northern Hemisphere. The haplotype diversity and nucleotide diversity for populations in Eurasia was 0.8394 and 0.46 × 10 −2 , respectively. For the populations in Eurasia, h S and h T were 0.426 and 0.871. The coefficients of differentiation measured over the 39 populations in Eurasia were G ST = 0.515, N ST = 0.826; and the Permutation test revealed that N ST was significantly higher than G ST (P < 0.01), which indicates phylogeographic structure for Eurasian populations. AMOVA results showed that a large proportion (92.97%) of the chloroplast variation was among populations ( Table 3). The SAMOVA analysis identified three population groups (Figures 1, 2).
PartitionFinder determined one partition (trnL-trnF + trnL intron + atpI-atpH) as the best scheme. The phylogenetic analysis of all haplotypes revealed three clades (Figure 3). The early diverging clade (clade 1) includes eight haplotypes, H21-H24 and H31-H34, which are all endemic to the QTP and adjacent regions. Clade 2 consists of 11 haplotypes (H15-H20, H25-H30, H36-H41), which are endemic to Asia with a wider current distribution range than clade 1. The haplotypes in clade 3 form two small subclades; the first subclade consists of 15 haplotypes (H1-H14, H37), with H1 and H3-H14 distributed in North America, with H2 having a wide distribution around North America and extending east to Northwest Europe. H37 is distributed in the Changbai Mountains in the northeast of China; the second subclade includes seven haplotypes, which are mostly distributed in Europe (H15-H18, H36, H41), with only H35 distributed in the Qilian Mountains (north of the QTP) in China.
RASP analysis resolved Eastern Asia as the most likely ancestral region of S. procumbens s.l. (Figure 4). Fifteen dispersal and four vicariance events were identified.

Past, Present, and Future Ecological Niche Models
The AUC (area under curve) value for the potential distribution of S. procumbens s.l. was high (0.978 for future, 0.979 for LGM, and 0.95 for LIG), indicating good predictive model performance.
The predicted distribution under current conditions was generally similar to its observed distribution across the Northern Hemisphere, with western North America and the QTP as the main distribution areas, and sparse distribution in Greenland, the Mediterranean, and Taiwan. At the LGM, the range of S. procumbens s.l. showed an expansion in distribution to the Caucasus Mountains and north of the Mediterranean (the Alps and Scandinavian Peninsula); the expansion was also detected in the Western Rocky Mountains and the Eastern Appalachian Mountains (Figure 5). At the LIG, the range of S. procumbens s.l. shrunk to the tallest mountain peaks in the QTP, western North America, and Scandinavia (Figure 5). The predicted distribution of S. procumbens s.l. in the future (2061-2080) is contracted, mainly concentrated in the QTP and adjacent regions (Figure 5). FIGURE 2 | Unrooted haplotype network with SAMOVA defined groups, the size of the circles represents the frequency of haplotypes, with larger circles representing haplotypes found more frequently than smaller circles.

Haplotype Divergence During Neogene and Quaternary
Climate change and tectonic motion are thought as important factors driving species formation and differentiation (Bruch et al., 2006;Fortelius et al., 2006;Johnson et al., 2006). According to our estimation, the diversification of S. procumbens s.l. is likely to have begun around 5.75 Ma (95% HPD 4.09-7.49 Ma) in the late Miocene (Figure 3). A global second cooling happened 11.6-5.3 Ma (Molnar et al., 2010), which could have driven the initial divergence of S. procumbens s.l. In addition, the early diverging haplotypes (clade 1 in Figure 3) of S. procumbens s.l. are distributed in the QTP (Figure 1), and the fifth stage uplift of the Himalayas occurred about 7-5 Ma (Harrison et al., 1992;Yin, 2006), which may have been another factor driving the initial diversification within S. procumbens s.l.
The divergence between the Asia clade (clade 2) and the North America + Europe clade (clade 3) was estimated to be around the early Pliocene (Figure 3). This divergence likely resulted from the spread of S. procumbens s.l. from the QTP to other regions. A series of mountain ranges extending from the Himalayas are thought to have facilitated the spread and diversification of this species (Allen et al., 2015).
The migration of S. procumbens s.l. from Asia into North America via the Bering Strait was estimated to be around 2.36 Ma (95% HDP 1.55-3.29 Ma), which generally coincides with the estimation by Allen et al. (2015). The reduction of the sea level between Asia and Alaska during the late Pliocene may have facilitated the spread of many species (Vermeij, 1991;Berggren, 1995). Long-distance dispersal could also play an important role in the distribution expansion of S. procumbens s.l. (Givnish et al., 2004;Nathan, 2006;Mummenhoff and Franzke, 2007); since the seeds are small achenes (Coker, 1966;Li et al., 2003) which can be transmitted with wind or germinate after passage through the digestive tract of birds and other herbivores (Myers et al., 2004;Nathan, 2006;Mummenhoff and Franzke, 2007).
Quaternary glacial-interglacial cycles played important roles in the formation and differentiation of species (Nie et al., 2005;Meng et al., 2007;Yang et al., 2008;Wang et al., 2009). Periodic  climatic fluctuations in the Quaternary were considered one of the main factors that formed geographical patterns of modern alpine plants (Yang et al., 2008;Wang et al., 2009;Zhang et al., 2014). In the current study, we found the greatest differentiation of S. procumbens s.l. occurred during the Quaternary (Figure 3), which indicate the glacial-interglacial cycles are likely factors contributing to the recent diversification within this species.
In general, our data suggest that it is likely the rapid landmass uplifts of the QTP and the global cooling during the Neogene that promoted the speciation and spread of S. procumbens s.l., while the Quaternary glacial-interglacial cycles have driven the recent differentiation of this species.
Even though the fossil evidence of S. procumbens s.l. is rare, our phylogenetic analysis (Figure 3) resolved an early diverging clade composed of eight haplotypes which were all distributed in the QTP and adjoining regions suggesting the origin of the species occurred in the QTP or adjoining regions. This corresponds with the results of RASP analysis (Figure 4), which indicate Eastern Asia as the most likely ancestral region of S. procumbens s.l. The QTP and adjoining regions is one of the world's most important centers of biodiversity due to its high species richness and abundance of endemic species (Myers et al., 2000;Luo et al., 2016). This region has been suggested as an important origin and differentiation area for taxa with intercontinental disjunct distribution (Nie et al., 2005;Wang et al., 2007;Zhang et al., 2014). Birkeland (2012) and Allen et al. (2015) proposed S. procumbens s.l. migrated both eastward and westward from Asia. The distribution patterns and phylogenetic analysis of haplotypes (Figures 1-3), and a series of dispersal events identified by the RASP analysis (Figure 4) support this proposal. More specifically, S. procumbens s.l. likely originated from the QTP and adjacent regions in the uplift phase during the late Miocene. The global cooling during this period provided a wider habitat, likely promoting the spread of this species, and the high mountain ranges extending from the QTP provided corridors for the spread outward: (1) to the west, S. procumbens s.l. dispersed along the Himalayas to the West Pamir Mountains (Tajikistan), across the northern Iranian Plateau to the Caucasus, continued to spread westward to the Balkan and Carpathian Mountains, then along the Alpine Mountains to the Scandinavian Mountains, finally arriving to lands and islands in the North Atlantic Ocean, and even to eastern North America and (2) to the east, the species likely first spread to the mountain ranges north of QTP (e.g., the Qilian Mountains and Kunlun Mountains), and moved further along the mountain range to the Tian Shan and Altai Mountains, then extended south and east to Siberia, finally reaching western North America via the Bering Strait. The North American clade was further divided into two subclades (Figure 3), the first is widespread and presents high haplotype diversity and the second consists of individuals in California and the Changbai Mountains. This supports the hypothesis of Allen et al. (2015) that S. procumbens s.l. shifted far southward after arriving to North America.
Additionally, S. procumbens s.l. could have also spread along the Qilian Mountains, Helan Mountains, Yin Mountains, and Yan Mountains to the Changbai Mountains in the northeast of China, and later arriving to Japan. A series of mountains (e.g., Qinling Mountains, Daba Mountains, Dabie Mountains, and Wuyi Mountains) could have been the route for the spread of S. procumbens s.l. from the QTP to Taiwan. Currently we cannot rule out the possibility that S. procumbens s.l. found in Taiwan are from Japan, or vice versa.

Refugia in Eastern Asia
Refugia are geographical areas of sheltered topography that provided suitable stable microclimates allowing species to persist throughout climatic oscillations (Rowe et al., 2004;Provan and Bennett, 2008). The refugia can be identified from fossils, paleo environmental data (Comes and Kadereit, 1998;Petit et al., 2002), and phylogeographic information (Stehlik, 2003;Tribsch and Schönswetter, 2003;Provan and Bennett, 2008;Eidesen et al., 2013). Multiple Pleistocene refugia for S. procumbens s.l. in North America and Europe have been reported (California and the Southern Rocky Mountains; Eidesen et al., 2013;Allen et al., 2015). Similarly, several refugia in Asia for S. procumbens s.l. during the LGM have been suggested in the current study.
As indicated by Allen et al. (2015) and Eidesen et al. (2007), genetic differences between populations in different regions suggest long separation through glacial intervals. In Asia, scattered distribution of S. procumbens s.l. are found in the Changbai Mountains in the northeast of China and in Japan. Both regions possess unique private haplotypes, implying both regions have been geographically separated for one or more glacial cycles, and could have served as refugia for S. procumbens s.l. during the LGM.
High haplotype diversity was discovered in multiple populations (Bomi, Dingqing, Mangkang, Xiaojin, and Daofu) in the eastern QTP (Table 1 and Figures 1, 4), implying microrefugia may have existed along the mountains. We hypothesize S. procumbens s.l. originated in the eastern QTP and then colonized other mountain ranges surrounding the QTP. The mountains with high elevation provided the conditions necessary for differentiation of the species. There is only one haplotype (Haplotype 19) in the island of Taiwan; while haplotype 19 is also widespread across Asia. This implies Taiwan may have been recolonized by S. procumbens s.l. during interglacial periods, rather than serving as one refugium for S. procumbens s.l. during the LGM.

CONCLUSION
Our study implies current geographic and genetic distribution of S. procumbens s.l. is likely to have been shaped by changing climates in both the Neogene and Quaternary periods. Multiple regions in Eastern Asia may have served as important refugia for S. procumbens s.l. during extreme climatic events. Longdistance dispersal and vicariance may have played an important role in shaping extant distribution patterns of S. procumbens s.l. The distribution range of S. procumbens s.l. appeared to have experienced expansion and contraction during the LGM and LIG, respectively; in the future, when the global climate becomes warmer with rising carbon dioxide, the distribution of S. procumbens s.l. will shrink and be limited to the QTP and adjacent regions.

DATA AVAILABILITY
The datasets generated for this study can be found in National Center for Biotechnology Information, Please see Table 1 in the manuscript.