Big data help to define climate change challenges for the typical Mediterranean species Cistus ladanifer L.

Climate change’s huge impact on Mediterranean species’ habitat suitability and spatial and temporal distribution in the coming decades is expected. The present work aimed to reconstruct rockrose (Cistus ladanifer L.) historical and future spatial distribution, a typically Mediterranean species with abundant occurrence in North Africa, Iberian Peninsula, and Southern France. The R ensemble modeling approach was made using the biomod2 package to assess changes in the spatial distribution of the species in the Last Interglacial (LIG), the Last Glacial Maximum (LGM), and the Middle Holocene (MH), in the present, and in the future (for the years 2050 and 2070), considering two Representative Concentration Pathways (RCP 4.5 and RCP 8.5). The current species potential distribution was modeled using 2,833 occurrences, six bioclimatic variables, and four algorithms, Generalized Linear Model (GLM), MaxEnt, Multivariate Adaptive Regression Splines (MARS), and Artificial Neural Networks (ANN). Two global climate models (GCMs), CCSM4 and MRI-CGCM3, were used to forecast past and future suitability. The potential area of occurrence of the species is equal to 15.8 and 14.1% of the study area for current and LIG conditions, while it decreased to 3.8% in the LGM. The species’ presence diaminished more than half in the RCP 4.5 (to 6.8% in 2050 and 7% in 2070), and a too low figure (2.2%) in the worst-case scenario (RCP 8.5) for 2070. The results suggested that the current climatic conditions are the most suitable for the species’ occurrence and that future changes in environmental conditions may lead to the loss of suitable habitats, especially in the worst-case scenario. The information unfolded by this study will help to understand future predictable desertification in the Mediterranean region and to help policymakers to implement possible measures for biodiversity maintenance and desertification avoidance.


Introduction
The impact of the going on climate and land use changes on plants, species interactions, and ecosystem functioning was yet unpredictable. Climate impacts on species were estimated to be severe, considering changes predictable in near-surface temperature, precipitation, frequency and intensity of heat waves, forest fires, and pest outbreaks. In the Mediterranean region, an increase in seasonality is also expected, as well as in late frost, heat waves, intensive summer drought, punctual heavy rains, and fire risks (Alcamo et al., 2007). Therefore, bioclimatic envelopes are expected to shift considerably within one century (Loarie et al., 2009). The future climate change effect on plant ecosystems is predicted to affect the ability of plants to adapt to stresses, particularly in the Mediterranean region. The warming up is 20% faster than the global average, and land-use fragmentation might result in species' collapse, considering that the plants' adaptation to fast-changing climatic conditions is also unknown. How quickly plant species can adjust to global change, despite sizeable genetic variation detected in outcrossed species (Gomulkiewicz and Houle, 2009), is a question yet to be answered. The populations might go extinct before adapting to the new environment if genetic modifications are too slow (Jump et al., 2006). Climate change impact on well-adapted Mediterranean species was rarely studied considering the full range of their distribution (Serra-Varela et al., 2015;Viruel et al., 2020;Almeida et al., 2022), and including genetic and palynological data. The current study aimed to forecast this phenomenon in a species that is particularly adapted to low precipitation and high-temperature summers, the sclerophyllous shrub Cistus ladanifer L. (rock rose). Indeed, an increased risk of desertification and the need for biodiversity conservation in this region, a biodiversity hotspot, might be at stake, even for droughtresistant species.
The complex history of the Mediterranean flora has been influenced by three main factors: geological history, climate, and human activities (Thompson, 2005). This region is characterized by several mountain ranges taken as physical dispersal barriers for lowland species, but also as significant refugia during Pleistocene glaciations [starting at approximately 2.6 million years (Ma)] (Médail and Diadema, 2009). These extreme climatic events were decisive for the current floristic composition of the Mediterranean Basin, just as were the Messinian Salinity Crisis (6-5 Ma) or the intensification of climate seasonality during the Middle Miocene (16-14 Ma) (Thompson, 2005;Rundel et al., 2016). Through these periods, tropical and subtropical elements were progressively replaced by the droughttolerant woody sclerophyllous plants that dominate the region at present (Suc et al., 2018).
The Mediterranean region is the third-world plant biodiversity hotspot (Buira et al., 2021, and references therein). Despite having less species density than other Mediterranean-climate regions (Cowling et al., 1996), it has the highest number of species compared to the other regions with similar climate, and harbors a global hotspot for recent rapid speciation (Valente and Vargas, 2013); being the adaptive radiation of Cistus (Cistaceae) a representative example . The Cistus diversification appears to have been prompted by the onset of the Mediterranean climate (Fernández-Mazuecos and Vargas, 2010), whose early stage took place, probably, in the western Mediterranean (Guzmán and Vargas, 2005) during the warmest periods of the Pleistocene (Thompson, 2005). Cistus ladanifer is one of the most recent species in the genus, which diverged into three subspecies (Demoly and Montserrat, 1993): africanus [the basal one, occurring in the north of Africa and the southern tip of the Iberian Peninsula (IP)], sulcatus (restricted to the southwestern coast of Portugal), and ladanifer (the most widespread subspecies, spanning the west and center of the IP, south of France and north of Africa).
Cistus ladanifer is a perennial sclerophyllous scrub with resin, due to labdanum production. The species occurs mainly in oak or pine forests understory and tends to dominate disturbed and low-biodiversity scrublands. The populations constitute early successional stages adapted to disturbances operating in Mediterranean ecosystems, particularly fire. Moreover, this species have high production of seeds with considerable longevity, leading to persistent seed bank formation, and seeds have high physical dormancy, broken by high temperatures generated by a fire in soil top layers (Bastida and Talavera, 2002, and references therein). Seed dispersion is mainly local due to barochorous dispersion (Bastida and Talavera, 2002), though long dispersal is possible by animals that feed on seeds and fruits (Malo and Suárez, 1998). Additionally, seed production with different degrees of dormancy might help the species' fitness in harsh environments, with germination happening after a fire but harder seeds germinating afterward (Delgado et al., 2001). Cistus ladanifer is often perceived as a prejudicial plant, though with valuable applications in the pharmacological and aromatic industries due to its resin, the labdanum (Quintela-Sabarís et al., 2011;Frazão et al., 2018), and creating associations with other species of commercial interest (Alonso Ponce et al., 2011). Hence, increasing the scientific knowledge on this vast and valuable resource is crucial to improve current and future species management and to capture information about the predictable influence of climate change on very well dry-adjusted species.
Currently, the C. ladanifer native area occupies a large part of the IP, especially the occidental and central regions, but it is also present in southern France, northern Morocco, and Algeria, mostly in poor and acidic soils and, infrequently, in ultramafic and calcareous soils (Demoly and Montserrat, 1993). This species is well adapted to hot and dry landscapes and is expected to be highly resilient to climate change, which is starting to increase the aridity of the Mediterranean Basin (Guiot and Cramer, 2016). Nonetheless, doubts remain about the subspecies' sulcatus future resilience since the distribution range will decrease considerably over the current century (Ferreira et al., 2021).
A phylogeographic analysis for C. ladanifer suggests the existence of glaciation refugia in southern Iberia and northern Africa, enclosed by the Rif and Betic mountain ranges, in North Africa and South of Spain (Quintela-Sabarís et al., 2011). This resulted, besides climatic oscillations, as a partial consequence of human activities, which have been historically prevalent in the Mediterranean Basin (Thompson, 2005), since this species is an avid colonizer of degraded lands. Wildfire intensification (Pérez-García, 1997) and land abandonment (Mendes et al., 2015) are both beneficial for the emergence of C. ladanifer scrublands, whose allelopathic behavior hinders the occurrence of other plant species (Sosa et al., 2010;Gallego et al., 2020). Apart from some indications of potential glacial refugia and dispersal routes Quintela-Sabarís et al., 2011), there was little information concerning C. ladanifer historical range. Improving the knowledge about this species' past biogeography and how it will respond to climate change across the entire native Frontiers in Ecology and Evolution 03 frontiersin.org range is needed. Species distribution models (SDMs) are increasingly employed to unveil those biogeographic patterns (Svenning et al., 2011;Wróblewska and Mirski, 2018). The general objective was to investigate past species dynamics using molecular and palynological analyses and SDM modeling information, as the past data might help predict future range changes. However, future changes are foreseen to happen at a much faster pace. With the example of a well-adapted species, the climate change impact on the Mediterranean flora could be anticipated. A claim to policymakers for ecosystem management and maintenance could be targeted to minimize near-future predictable desertification in this region. Additionally, the rock rose has potential economic value due to the labdanum production, a resin from where valuable essential oils are extracted, with medical, cosmetic, and food industry putative applications (Frazão et al., 2018, and references therein) and improvement programs could be designed due to the gathered information.
Employing species distribution models methodology, this study aimed at (i) predicting the potential current C. ladanifer distribution in its native range area, (ii) identifying the most suitable areas and potential glacial refugia for this species in three past periods based on SDM modeling, phylogeography, and fossil data, and (iii) projecting the model into future periods, analyzing the effect of climate change on its current distribution.

Study area
The study area included the western part of the Mediterranean Basin (latitude from 29.88°N to 49.05°N and longitude from 13.18°W to 13.00°E), from Portugal to southern Germany and western Italy, in Europe, and, from Morocco to northwestern Libya, in Africa ( Figure 1). The Mediterranean climate prevails in the study area, with harsh summers, generally high temperatures and low levels of precipitation and climate heterogeneity, a biodiversity trigger; thus, this region includes one of the most remarkable non-tropical biodiversity hotspots (Myers et al., 2000).

Occurrence data
The species' presences were collected from different data sources (Supplementary Table S1). The dataset was cleaned to retain highquality distribution records: some were dubious or had coordinates with large uncertainty values. All the records located outside the study area or in regions where C. ladanifer is not autochthonous were eliminated. Thus, the data found in Macaronesia, the Mediterranean islands, the northern half of France, where the Mediterranean biome is not predominant, and the Moroccan Atlantic coast south of Larache (Mohammed Ater, personal communication) were removed.
The cleaned dataset, including 560,915 records, was further reduced to one occurrence per grid cell of ca. 5 km 2 to attenuate spatial autocorrelation, resulting in a final set of 2,833 records (Figure 1), to avoid the adverse effects of the spatial model bias. The selection of a single record per grid cell was sufficient, as already demonstrated elsewhere (Gorelick et al., 2017).

Environmental data
Nineteen bioclimatic variables (BIO1 to BIO19), with a spatial resolution of 30-s (~1 km 2 ) (Supplementary Table S2), were downloaded from the Worldclim v1.4 (Hijmans et al., 2005), a database of high spatial resolution global weather and climate data. They were downloaded for the past, the Mid-Holocene [MH: ~6,000 BP (years before present)], the Last Interglacial (LIG: ~120,000-140,000 BP), the present, and the future. The future climate data comprised the years 2050 and 2070 and two Representative Concentration Pathways (RCP 4.5 and RCP 8.5). The RCP 4.5 is a moderate emission scenario for the stabilization of radiative forcing by 2,100 , and the RCP 8.5 is the high-emissions scenario, called "business as usual" (van Vuuren et al., 2011), the worst-case scenario.
The CHELSA v1.2 (Karger et al., 2017) was chosen to download the bioclimatic variables for the Last Glacial Maximum (LGM ~22,000 BP), given the 30-s resolution unavailability for variables for this period in the Worldclim v1.4. The Global Climate Models (GCMs) CCSM4 (Community Climate System Model 4.0) and the MRI-CGCM3 (Meteorological Research Institute-Coupled Global Climate Model 3.0) were selected for all the periods under study, but the LIG, since the extracted information was only available, for this period, according to the displayed source (Otto-Bliesner et al., 2013).
The extracted variables were narrowed to an adequate subset of non-collinear variables using the function removeCollinearity from R package virtualspecies v1.5.1 (Leroy et al., 2016). The collinearity among the 19 bioclimatic variables was verified using Pearson's correlation coefficient (Dormann et al., 2013), and the multicollinearity cutoff = 0.8 was used to select the subset of non-collinear variables.
The final set included six variables, half of them temperaturerelated (BIO1 -Annual Mean Temperature, BIO5 -Maximum Temperature of Warmest Month, BIO10 -Mean Temperature of Warmest Quarter), and another half precipitation-related (BIO14 -Precipitation of Driest Month, BIO17 -Precipitation of Driest Quarter, and BIO18 -Precipitation of Warmest Quarter). The summary statistics considered for each selected variable, the GCMs, and across all the considered periods, past, current, and future, were shown in Supplementary Figure S1.

Species distribution modeling
The current species distribution models, including the ensemble, were implemented in R (R-Core-Team, 2021) with the package biomod2 v3.5.1 (Thuiller et al., 2021), using the species occurrences, the six environmental variables previously selected, and four different algorithms. The ensemble model was further projected into the past (LIG, LGM, and MH) and future (RCP 4.5 2050 and 2070; RCP 8.5 2050 and 2070) scenarios. The algorithms involved in the study included two regression methods and two machine-learning methods. The regression-based methods were the Generalized Linear Model (GLM) (McCullagh and Nelder, 1983) and the Multivariate Adaptive Regression Splines (MARS) (Friedman, 1991), and the machinelearning ones were the MaxEnt (Phillips et al., 2006) and the Artificial Neural Networks (ANN) (Hopfield, 1982). The biomod2 default settings were applied for all the models, according to the typical usage of biomod2 (Hao et al., 2019). The resulting ensemble model based on Frontiers in Ecology and Evolution 04 frontiersin.org different methods was suggested in the literature as a proxy for predicting species range due to the considerable variation in different model types' performance and consensus inexistence on the best technique. Thus, individual models' averaging results may increase the overall accuracy of predictions, and compromise among different algorithms could help constrain prediction error (e. g., Jones et al., 2010;Ochoa-Ochoa et al., 2016). Moreover, ensembles tend to perform better than most single models by quantifying model congruence, recognizing the highest uncertainty regions, and moderating inaccurate predictions risk (Stewart et al., 2022). Three sets of 100,000 random pseudo-absences (or background) were randomly generated to guarantee that environmental variation in the study area was considered. Five-fold cross-validation was performed to test the algorithms' accuracy by splitting presence records into training (75%) and test datasets (25%). Summing up, 15 models (3 plus 5) were executed in the biomod2 using the four algorithms, resulting in a total of 60 models (see Supplementary Figure S2 for a graphical overview of the methods).
The obtained models' performance was evaluated using two criteria, (i) the Area Under the Curve (AUC) of the Receiving Operator Characteristics (ROC) and (ii) the True Skill Statistic (TSS), using the BIOMOD_Modeling function, and the argument -models. eval.meth. The AUC ranges from 0 to 1, with higher values corresponding to better models' prediction. The AUC has the advantage of being both prevalence and threshold-independent. The TSS is also prevalence independent and ranges from −1 to 1, where 1 indicates perfect agreement and inferior to 0 indicates a performance no better than random (Swets, 1988;Fielding and Bell, 1997;Allouche et al., 2006;Supplementary Figure S2). Moreover, the importance of the variables for each model was calculated using the variables_ importance function. This function returns the variable importance value for each variable in the model. The highest the value, the more influence the variable has on the model.
The ensemble model was obtained using the BIOMOD_ EnsembleModeling function, and the algorithms produced models with a TSS value equal to or above 0.6 (Jones et al., 2010), considering the mean probabilities across predictions. This ensemble model was further projected into the future and the past using the BIOMOD_ EnsembleForecasting function. For each period (except the LIG), the results of the two GCMs were averaged to reduce uncertainty.
The potential distribution maps obtained for all scenarios, with values between 0 and 1 for each pixel, were reclassified into five suitability categories using the equal interval approach: (0.0-0.2; 0.2-0.4; 0.4-0.6; 0.6-0.8; 0.8-1.0). To predict changes, expansions, and contractions in the species' range from the past to the present and from the present to the future distributions, the ensemble model and Location of the study area and spatial representation of the occurrence dataset. Light green dots represent the species' presence (see Supplementary Table S1 for details). Map of the pollen sites in the Mediterranean region with Cistus ladanifer and C. ladanifer Type records. The map was built with ArcGIS® Desktop v10.8.1.
Frontiers in Ecology and Evolution 05 frontiersin.org each projection were converted to binary presence-absence maps of estimated presences (1 -suitable habitat) and absences (0 -unsuitable habitat) through the maximization of the TSS (Thuiller et al., 2021). The ArcGIS Combine tool was used to compare differences between pairs of binary maps to identify potential shifts and trends in suitable habitats. This function allowed us to determine if the C. ladanifer distribution area increased | habitat gained (0 → 1), decreased | habitat loss (1 → 0), remained suitable area | habitat maintained (1 → 1), and remained unsuitable area | unsuitable habitat (0 → 0).

Palaeobotanical and genetic data
The models projected in the past were integrated with the paleobotanical and genetic data to check their plausibility. Twentyfour sites corresponding to fossil pollen records were found in the literature and the Neotoma Paleoecology Database (Williams et al., 2018), further compiled in Supplementary Table S3 and mapped in Figure 1.
The Quintela-Sabarís et al. (2011,2012) studies were used as a source of genetic data. In the first study, the authors gathered results from 38 C. ladanifer populations, covering almost the entire natural range of the species, including the three subspecies, screened for length variation of polymorphic chloroplast simple sequence repeats, (cpSSRs), to investigate the species phylogeographic structure to locate Quaternary refugia, to reconstruct its recolonization patterns and to assess the role of geographical features (mountain ranges, rivers and the Strait of Gibraltar) as barriers to seed flow and expansion through the Western Mediterranean. In the second study, nuclear markers were used (AFLPs) and 33 populations were screened. Four clusters were unveiled by using Bayesian analysis [ Figure 3 in Quintela-Sabarís et al. (2012)].

Variable importance and model performance
The precipitation variables BIO18 (1), BIO14 (2), and BIO17 (3) had decreasing importance in the ensemble model, followed by the temperature ones BIO10 (4), BIO5 (5), and BIO1 (6) ( Table 1). Additionally, in the study area, all future scenarios predicted a temperature increase, considering the temperature-related variables (BIO1, BIO5, and BIO10) and a decrease in precipitation in the related ones (BIO14, BIO17, and BIO18). The temperature variables had a reduction from LIG to LGM and an increase from LGM to present. On the other hand, the precipitation increased from the past to the present (Supplementary Figure S1), and those results were consistent in both GCMs. Figure 2 displays how each variable's predicted probability of occurrence is affected by the ensemble model, considering all the other constants. A medium to high suitability (>0.6) for the species is observed when (i) the annual mean temperature (BIO1) ranged between 15 and 17°C, (ii) the maximum temperature of the warmest month (BIO5) ranged from 31 to 36°C, (iii) the mean temperature of warmest quarter (BIO10) around 23-24°C, (iv) the precipitation of driest month (BIO14) decreased bellow 5 mm, (v) the precipitation of driest quarter (BIO17) ranged from 29 to 300 mm, and (vi) the precipitation of warmest quarter (BIO18) ranged from 19 to 32 mm. The precipitation-related variables response curves suggested that the species tolerate relatively low precipitation content, and, in general, the probability of occurrence decreased when those variables increased.
The GLM displayed the worst performance (AUC = 0.871 and TSS = 0.648), and the ANN the best one (AUC = 0.951 and TSS = 0.789), based on the evaluation statistics ( Table 2). The ensemble model showed higher predictive accuracy since the AUC = 0.944 was close to the theoretical maximum AUC value (Phillips et al., 2006), and the TSS = 0.755, indicated that the model had a better performance than 0.6, the value that defined a good model (Jones et al., 2010) than the three algorithms (GLM, MaxEnt, and MARS), but not when compared with the Artificial Neural Networks.

Current potential distribution range
In the ensemble model (Supplementary Figure S3E, green and mustard), the potential C. ladanifer distribution with medium to high suitability (suitability >0.6) included the south and southwestern part of the Iberian Peninsula, the center and north-east of Portugal, and spotted areas in Morocco and Algeria. The center and southern part of the IP, south-east France, a few sites in Italy (including Sardinia), Tunisia, and the Atlantic coast of Morocco (Supplementary Figure S3E, orange and yellow) were characterized as low to generally suitable regions (0.2 > suitability >0.6). These patterns were also observed in the GLM, ANN, and MARS models (Supplementary Figures S3A,C,D, respectively), with some differences, particularly in northern Africa. These three models showed a high agreement in their predictions, though projecting more high suitability areas (Supplementary Figures S3A,C,D, green)  Figure S3B) was the most disruptive algorithm with less suitable areas predicted, namely the mediumhigh suitability. Figure 3 displays the past (LIG, LGM, and MH), the current, and the future potential C. ladanifer distribution, under the global climate change scenarios RCP 4.5 and RCP 8.5, for the years the 2050s and 2070s. Regarding the LIG ( Figure 3A), a considerable part of the Mediterranean North Africa area and the southern and western IP were found suitable. In the LGM ( Figure 3B), an enormous retraction in the species distribution occurred, which remained spotted nearby the sea and absent inland. The model predicted refugia, in this period, on the eastern coast of Spain, the Sardinian west coast, and along the Algerian and Tunisian Mediterranean regions. The MH environmental conditions were more suitable for the species development than during the LGM, with a consequent C. ladanifer range expansion from that period until the current conditions ( Figures 3C,D).
The future projections ( Figures 3E-H) unveil a considerable habitat loss across the North of Africa and the central part of the IP,  including the regions with the highest suitability values, where a higher number of presence points were recorded (Figure 1). The RCP 4.5 in the years 2050 and 2070 displayed similar scenarios ( Figures 3E,F), with the species' putative existence in the western part of the IP. Contrarily, in the RCP 8.5 scenario ( Figures 3G,H), a large area of central and northeastern Portugal and western Spain will be unsuitable or low-suitable, particularly for the RCP 8.5-2070 scenario. From the LIG to the current species' presence, the species' disappearance in the LGM was well discernible. Those losses in suitability across time can be seen in the binary presence-absence maps (Supplementary Figure S4), where the past, present, and future distributions of C. ladanifer are displayed. In the present, the species had the broadest distribution considering all the periods and scenarios (Supplementary Figures S4A-D).

Rock rose distribution changes
The current binary presence-absence map (Supplementary Figure S4D) was used as a reference to identify potential shifts and trends in suitable habitat areas between the current and the other periods and scenarios. Suitability changes in past and future scenarios relative to current were classified into: (i) unsuitable habitat, (ii) habitat loss (contraction), (iii) habitat maintained, and (iv) habitat gained (expansion) (Figure 4 and Supplementary Figure S5D). In general, two patterns were observed: (1) habitat gained from past scenarios to the present and (2) habitat loss from the present to the future scenarios ( Figure 5).
From the LIG period to the present ( Figure 4A), habitat gain is superior in the area to habitat loss. As expected, the changes in suitable habitat across time from the LGM to the present ( Figure 4B) were characterized by a significant increase in the area due to improved climatic conditions (Supplementary Figure S5D).
For the future scenarios ( Figures 4D-G, 5), the results showed that habitat loss will increase according to the climate scenario severity, and, consequently, the species' presence will decrease. Under RCP 4.5 ( Figures 4D,E), the habitat loss is similar for the years 2050 and 2070; in contrast, the habitat gain is higher in 2070 due to the emergence of newly suitable areas. The RCP 8.5 (Figures 4F,G), the Frontiers in Ecology and Evolution 08 frontiersin.org worst climate scenario, will provoke a sounding suitable area reduction, particularly in the 2070s. Habitat loss will be more evident in North Africa and, in central and southern IP, persisting small isolated populations across North Africa and South Iberia (increase in fragmentation) and in a large area of Central Portugal. The species will remain in southern France in the RCP 4.5 scenario but will disappear there in the RCP 8.5 scenario, suggesting that this area will become unsuitable in the future (Figure 4). The model indicated the occurrence of some habitat shifts in the past, and it predicted that the shifts would have a shallow extent in the future.

Palaeobotanical and genetic data
The palynological record (Figure 1 and Supplementary Table S3) showed very little evidence from the Late Pleistocene to the Holocene, all of which are in the IP (24). The two oldest records, dating from the Pleistocene (ca.142 and 90 ka) in Bajondillo and Padul-Granada (numbers 1 and 2), confirmed the presence of C. ladanifer in the south of Spain before the LGM. The other two Pleistocene records dated ca. 13 ka, after the LGM, originated from sites nearby the sea by the southern Portuguese coast (number 3) and in Galicia, North of the IP (PRD-4, number 4). These presences are congruent with refugia on the western coast of Portugal, as indicated in the model produced for the LGM by the current study ( Figure 3B). The pollen records from the Holocene were distributed along a strip across the middle of the IP, from the central Portuguese coast to the northeastern Spanish coast (Figure 1). This distribution was under the species expansion area during the MH and onwards. Interestingly, the most recent C. ladanifer fossil pieces of evidence (late Holocene, around 1.3 ka) were found in the northeastern part of Spain, indicating the species expansion, both in this area and on the southern French coast, were recent events in this shrub history. Nevertheless, inferences from some palynological pieces of evidence should be made with caution: the 'C. ladanifer-Type. ' The North of Africa was under-represented in the palynological records, probably a less studied region than Europe.
Moreover, C. ladanifer pollen is mainly dispersed by gravity at short distances, leading to a reduced presence in the pollen records. The North of Africa's lack of palynological data might be partially overcome by the information given by the genetic data. Chloroplast and nuclear DNA information indicate that rockrose populations from the IP belong to two main population clusters that originated from independent colonization events from the North of Africa (Figure 1). Therefore, according to the genetic data, the presence of C. ladanifer in Africa should preclude the oldest indications of rockrose presence in the IP Quintela-Sabarís et al., 2011). The present distribution of the central populations' genetic lineages indicated separated putative glacial refugia in the southeastern and western IP. This information is congruent with the past changes in habitat suitability from LIG through LGM to the MH, inferred from the models produced in the current study ( Figures 3A-C).
The rock rose suitable habitat was predicted to occur in a wide area of the western Mediterranean region, particularly in the Iberian Peninsula, southern France, and northern Morocco and Algeria. Those findings confirmed the species' ability to adapt to various climatic conditions, though the Mediterranean climate is prevalent. Indeed, the species distribution foreseen by the ensemble model is a good representation of the C. ladanifer current distribution (Figure 1, light green dots). The species' medium-high probability of occurrence (>0.6) happens mostly in its native range, in the center and southwest of the IP, northern Morocco, and Algeria. These medium-high probability of occurrence were more frequent under the current conditions compared to the other scenarios, thus confirming the current climatic conditions as the most suitable for the species occurrence. Additionally, there is a good match between two divides, a bio-climatological and a lithological one, and the potential distribution of the species. A significant lithological divide exists between western acidic and eastern basic substrates [see Figure 1 in Buira et al. (2021)]. The other divide is due to a temperate climate north and northwest of the IP (the Pyrenees and the Cantabrian Mountains). At the same time, the rest of the region is characterized by the Mediterranean climate, with a noticeable summer drought (Buira et al., 2021, and references therein). In the binary ensemble map (Supplementary Figure S4D), these two divides frame the C. ladanifer potential distribution, south and westward oriented, avoiding basic soils and temperate climate.
The results showed that the species distribution range would be affected by climate change, regardless of the severity of the scenario. Indeed, heat waves are expected shortly for the Mediterranean region, but also a decrease in precipitation, in particular in this region, compared to global change worldwide (Vescovi et al., 2010;Grímsson et al., 2020), and future climate conditions will become less suitable for the species occurrence. Nevertheless, it seems to cope further with temperature excess than with the precipitation decrease, since the species' distribution was mainly determined by the precipitationrelated variables. Future temperature increase impact on rock rose distribution will probably be moderated by the species' temperature resistance ecophysiology. Indeed, the production of labdanum, a sticky resin, increasingly during summer, photoprotects the leaves from UV light and helps control leave temperature and evapotranspiration decrease (Chaves et al., 1993;Nunez-Olivera et al., 1996).
According to the results, under the moderate emission scenario, the potentially suitable area will decrease both in 2050 and 2070 to a similar degree, yet, under the higher emission scenario, the suitable area will have a significant reduction in 2070 compared to the present. This reduction will be principally due to medium-high suitability areas decrease, from the present to the future, especially for the worstcase scenario, with substantial habitat loss in the center of the IP and northern Africa, a retraction westward oriented, with a potential fragmentation increase. This species' habitat suitability will change considerably, even enduring drought and temperature increase, and the geographic distribution patterns will be affected by future climate change. Other authors uncovered that global change would negatively affect the Mediterranean species (e.g., Quercus sp.) distribution areas (Nicol-Pichard and Dubar, 1998;Çoban et al., 2020). Likewise, a decrease in suitability area is predicted for many other Mediterranean species' geographical distribution, leading to shifts in geographical ranges and habitat loss, like with strawberry tree and others (e.g., Peñuelas et al., 2017;Almeida et al., 2022). Nevertheless, the rock rose Frontiers in Ecology and Evolution 09 frontiersin.org will not migrate northerly; the suitable areas will contract toward the Portuguese coast, particularly in the worst-case scenarios. The northward migration of the species was a consistent result obtained by several authors, including Arbutus unedo (Santos et al., 2002;Abel-Schaad and López-Sáez, 2013;Rocha et al., 2014;Ribeiro et al., 2019;Almeida et al., 2022). The species' response to climate change in the past could further help in understanding its response in the future, which will be further explored in the coming point.

Past scenarios
The projections into the past revealed a range contraction from the LIG to the LGM and a continuous range expansion from the LGM through MH to the present (see Figure 5). Thus, during the LIG, the model projected a wide suitable area for C. ladanifer in the North of Africa and two main suitable areas in the South of the IP separated by the Guadalquivir basin: one to the southeast, encircled by Betic ranges and other to the west, spreading from the Algarve in the SW to the Tejo valley ( Figure 3A) The pollen record partially supports these projections, with confirmed presence of C. ladanifer in the Betic area around 100 ka (Pons and Reille, 1988;Cortés-Sánchez et al., 2008). Moreover, the genetic data indicate that the African populations of C. ladanifer were the origin of the Iberian ones, which further supports the model projections about habitat suitability in the North of Africa (Quintela-Sabarís et al., 2011, 2012. The results indicated that the climatic conditions in the LGM were less suitable for the rockrose, with the disappearance of the areas with a medium-high probability of occurrence (dark green in Figure 3B) and a general contraction of the areas with a general probability of occurrence (orange in Figure 3B). Thus, three main areas emerge as potential refugia for C. ladanifer during LGM. First, a circle comprising East Spain (Valencia and Murcia regions) and the NE coast of Algeria and Tunisia. This circle is a typical hotspot of biodiversity and common putative refugia during this period (Médail and Diadema, 2009, and references therein;Feliner, 2011). Second, the central coastal region of Portugal has also been found to be a refugium for the Mediterranean species maritime pine and strawberry tree, based on the fossil record (Ribeiro et al., 2001;Almeida et al., 2022). Third, a cryptic refugium in the Betic and Rif coastal regions (South of the IP, North of Morocco, respectively). The fossil evidence for this period supports some of the model results, with the registered presence of C. ladanifer-Type pollen 13 ka in palynological sites in the South of Portugal and on the Galician coast (northwestern IP).
The IP is considered one of Europe's most important Pleistocene glacial refugia. The 'refugia within-refugia hypothesis' (Gómez and Lunt, 2007) claims that this region is unlikely to be a unique continuous refugium area due to its high environmental variability. Instead, it is divided into multiple and isolated refugia of different sizes. The 'refugia within refugia' have been demonstrated to be a common pattern, which explains the current distribution of several species of animals and plants in intraspecific lineages in the IP (Gómez and Lunt, 2007, and references therein). Indeed, the LGM refugia identification should help understand the species' postglacial migration during the Holocene, from those regions and its adaptation to local conditions (Gavin et al., 2014).
The scenario depicted by our model considering C. ladanifer distribution during the LGM also fitted this pattern and, despite the Changes in the C. ladanifer distribution area between scenarios.
Frontiers in Ecology and Evolution 10 frontiersin.org limited amount of pollen evidence from that period, supported by the current species' genetic studies. Thus, four intraspecific lineages in C. ladanifer (whose distribution coincides with the refugia inferred by the model approach: Figure 3B) have been identified based on chloroplast and nuclear markers (Quintela-Sabarís et al., 2011, 2012: four lineages, (1) Rif region, (2) Betic area, (3) central coastal Portugal, and (4) East of the IP. Since the LGM, the C. ladanifer suitable area has progressively increased, both in the North of Morocco and in the IP, and the species' expansion reached its current distribution range. This model projection is also supported by the increasing number of pollen occurrences in the center of the IP and a more recent expansion to the northeastern IP region (around 1.3 ka, López-Sáez et al., 2009) and South of France. Viruel et al. (2020), in their study, found support for the carob tree persistence in Moroccan and Iberian (Betic region) refugia during the LGM, confirmed both with molecular and potential historical range dynamics by SDMs reconstruction. However, this species had multiple domestications events from native populations throughout the Mediterranean basin. Besides rock rose fossil pieces of evidence, the existence of a putative central coastal refuge in Portugal (4) was also supported by similar findings in other thermophilous Mediterranean taxa in this sheltered area due to the oceanic influence, such as olive tree, maritime pine, strawberry tree, and mastic gum (Ribeiro et al., 2001;Figueiral and Terral, 2002;Almeida et al., 2022).
Species' postglacial expansion is dominated by the refugia placed in the 'leading edge, ' whereas the populations' expansion behind that front diffuses at a slower rate (Hewitt, 2001). This seems to be also the case in C. ladanifer, as lineages with refugia in Portugal and, especially East of the IP, expanded to the north and currently occupy most of the distribution area of this species. In contrast, the expansion from the Betic refugium was limited to the North by the presence of a physical barrier (the Betic mountain ranges) and to the West by the contact with the area occupied by other lineages. Indeed, , in their phylogeography study using organellar sequences, claimed that the C. ladanifer origin and the subspecies divergence happened in the Upper Pleistocene, before the Strait of Gibraltar onset and the Mediterranean climate establishment. Their study indicated that the Cistus species arose after the Mediterranean Sea re-emergence (5.59-5.33 Ma) and the Mediterranean climate launch (3.2 Ma). The sea possibly sheltered the species during ice ages, and the current species distribution is the result of migration by animals eating the capsules and from the different detected putative refugia, seen both in the molecular and the past projected modeled suitability of occurrence (LGM).
Indeed, this species moved westwards from the eastern refugia (3), northerly from North Africa (1), and in a fanned way from the central Portugal refugia (4), during the Holocene warming up, in a relatively quick spread pace since the MH model predicted that the habitat suitability matched, in a certain extent, with the current species occurrence (Figures 3B-D). Those findings are in agreement with the LGM-modeled map information ( Figure 3B), consistent with the hypothesis of the two refugia unveiled in eastern and southern IP, by the sea, another in North Africa, and the fourth one in central Portugal, also nearby the sea. The oceanic influence could have played a protective role against the last glaciation's most arduous climate and was ultimately responsible for the species' survival, as found in a study with the strawberry tree (Ribeiro et al., 2017).
As mentioned previously, the C. ladanifer seeds' primary dispersal strategy is barochory, and this species' gene flow is mainly local and that the seed dispersal is a significant component of gene flow, with the pollination dropping to zero at ca. 30 m (Quintela-Sabarís et al., 2012, and references therein). However, a secondary dispersal strategy made by animals (ants and deers) is an essential source of the species' long-distance dispersal, with ready-togerminate seeds spread by deer (Malo and Suarez, 1996;Bastida and Talavera, 2002). Wind and water are not to be discarded as additional seed spread vectors since the seed is very light and small. Therefore, this species' high structure and genetic differentiation was consistent with low gene flow, compared to species with generally high gene flow -as in wind-pollinated species -such as pines (e.g., Ribeiro et al., 2001), which promotes genetic drift and the formation of clear lineages/genetic clusters (Quintela-Sabarís et al., 2011, 2012. In summary, the distribution projections to the past offered by the models are congruent with the other two independent approaches (fossil records and genetics). This confirms the validity of our modeling approach to infer the effect of past changes in climate on C. ladanifer distribution and supported projections into the future under global change scenarios. This species will be dramatically affected by global warming, despite its resilience to drought and temperature. As commonly observed in other Mediterranean species, North migration is not the picture. Instead, the species will collapse westwards toward the Portuguese coast. The area of this species was predicted to be much reduced in the future, and the information given by this paper might help to understand the consequences for other equally drought-resistant species in this region. Policymakers should be put under pressure for urgent measures to mitigate the warming-up tendency and biodiversity depletion in this region, as well as expected for other thermophilous Mediterranean species. Certainly, habitat loss and spatial distribution conversion are predictable effects of climate change for Mediterranean species, and species acclimation, adaptation, and altitudinal and latitudinal migrations are expected (Carrión and Van Geel, 1999;Barredo et al., 2016). The Mediterranean region has many rare and locally endemic taxa that survived as small populations, many of which are threatened by habitat transformation and climate change (Peñuelas et al., 2017).
Additionally, the use of big data was also crucial, in this case for the species' range-wide modeling, due to the immense amount of information requested: presence points, layers of bioclimatic variables, different periods in the past and future, different algorithms, and GCMs. Moreover, the current study is an example of how big data allows us to predict the future and reconstruct the past.

Data availability statement
The original contributions presented in the study are included in the article/Supplementary material, further inquiries can be directed to the corresponding author.

Author contributions
MR and PF conceived the study idea. MRF and AMA handled the species occurrence data preparation and bias correction, conducted the analyses using the GIS and the R package biomod2, and wrote the first draft of the manuscript. All authors significantly contributed to ideas and revision and editing and revising the manuscript.