Model emulators for the assessment of regional impacts and risks of climate change: A case study of rainfed maize production in Mexico

The collection of publicly available databases about climate change and its impacts on natural and human systems is unprecedented and ever-growing. However, information requirements can vary widely among users depending on their region, socioenvironmental context, and interests. Moreover, in the current era of active mitigation and adaptation policies, information needs are frequently not satisfied even by these massive and varied collections of databases. The development and use of emulators can help close this information gap by allowing users to approximate the output from complex models and create user-defined experiments, without being technically or computationally demanding for the user. Here, a simple emulator of the EPIC biophysical crop model is presented which is able to adequately reproduce the changes in rainfed maize yields and to create projections for user-defined scenarios. Moreover, it makes it possible to produce risk measures that are not available with the original model. The proposed methodology is illustrated with a case study of rainfed maize production in Mexico for a reference emissions scenario (SSP370) and two user-defined international mitigation policy scenarios. These scenarios represent 1) current international mitigation commitments and 2) a scenario in which China withdraws from international mitigation efforts. Results showed that, under the reference scenario, climate change could have widespread consequences on rainfed production all over the country, with decreases in yields reaching up to 80% in the southeast and northeast of the country. These impacts can be partially modulated by the moderately ambitious mitigation commitments assumed in recent international agreements if all countries comply. However, a potential withdrawal of China from these efforts would significantly reduce any benefits from international mitigation. Under all scenarios, changes in productivity impose increasing risks for already vulnerable populations and considerable economic costs at the state and national levels. These results suggest the urgent need for critical planning for adaptation in the agricultural sector of the country.

Introduction
Climate change poses a serious risk to agriculture worldwide, potentially compromising food security both globally and locally (Altieri and Nicholls, 2017). All domesticated crops, and particularly cereals, have been adapted to constrained climatic requirements that rely on predictable and recurrent climatic patterns. Deviations from these conditions have differential impacts on agricultural production around the world. It is expected that negative changes will be more evident in tropical regions where temperature is already close to the high temperature thresholds for suitable cereal production (Rosenzweig et al., 2014; Betts et al., 2018). Hazards of an altered climate on agriculture include the rise in global temperatures (Betts et al., 2018), the increase in the frequency of extreme climatic events (Lesk et al., 2016; Cook et al., 2018) and a shift in precipitation seasonality (Zaveri et al., 2020). To understand and prevent the worst potential impacts of climate change on agriculture, a plethora of investigations have been conducted at different geographical scales from global to subnational (Ziska et al., 2012; Deryng et al., 2014; Rosenzweig et al., 2014; Lesk et al., 2016; Kukal and Irmak, 2018; Agovino et al., 2019; Jägermeyr et al., 2021; Kogo et al., 2021). Nevertheless, the availability of the main sources of information (satellites, censuses, surveys and models) and the spatial and temporal resolutions of their data products do not match. In addition, these resources are frequently beyond the computational abilities of different types of users, including policymakers. As a result, there is an urgent need to close the gap between the generation of sound scientific information and its application in decision making to manage climate risks for global food systems facing climate change.
At least three approaches have been adopted to address the evaluation of climate change impacts on agriculture: empirical studies on observed climate variability and change and crop production; field experiments; and process-based computational models. The first consists of case studies of observed anomalous climatic events to exemplify the potential impact if similar conditions were to happen in the future. This type of approach is also used to extrapolate the impacts on crops under future climate conditions (Estrada et al., 2012; Iizumi and Ramankutty, 2015). This methodology has the advantage of being applicable at any spatial scale, thus potentially generating direct information for decision makers. However, it usually involves using statistical models to extrapolate the effects of climate conditions beyond the range of observations in which the model was calibrated. In addition, climatic events used as analogs of future climate conditions can offer little insight into how crops can respond to persistent climate conditions (Dell et al., 2014). A second approach is the use of field trials, such as rain-exclusion and warming experiments (Robertson and Hamilton, 2015), which evaluate how crop yields could change in locally constructed future climatic conditions. Although they offer greater insights into the potential response to future scenarios, their spatial extrapolation is limited to similar local conditions. Also, these studies are usually costly, limited to case studies and not feasible at the national or regional scales. The final approach has been the construction of computational models based on the processes that govern agricultural systems and their relationships with climate. Despite limitations of its own (Rosenzweig et al., 2014), including that potential yields in general do not reflect the observed yields, this approach has enough flexibility to provide information at multiple spatiotemporal scales, including past events and future climatic scenarios.
Several global agricultural models have been developed in the last decades for generating climate change impact projections. The Environmental Policy Integrated Climate model (EPIC-TAMU) is an open model which originally estimated the effects of soil erosion on crop productivity (Williams et al., 1984, 1989). The major components of this model are weather simulation, hydrology, erosion-sedimentation, nutrient cycling, pesticide fate, crop growth, soil temperature, tillage, economics, and plant environment control. Although EPIC operates on a daily time step, it is capable of simulating hundreds of years (Williams et al., 2015). The EPIC-IIASA global gridded crop model (Balkovič et al., 2014) is not an open model; it is based on the EPIC-TAMU version 081. It assesses the impacts on yields, water availability, and soil degradation in the main global agricultural systems with different management options, such as cropping, fertilization, irrigation practices, and organic options, under future climate change conditions. The EPIC-IIASA model estimates plant growth and yield based on temperature and soil moisture (Balkovič et al., 2014). Another popular and open model is pDSSAT, which comprises an assortment of survey-based and geospatial data sources and field-scale crop models, including those based on the Decision Support System for Agrotechnology Transfer (DSSAT) framework (CROPGRO103 and CERES (Crop Environment Resource Synthesis)) (Jones et al., 2003). The pDSSAT model simulates food, fiber and biomass production systems at high spatial resolution and continental or global extents (Müller et al., 2019). The Lund-Potsdam-Jena managed Land model (LPJmL) originated from the LPJ-Dynamic Global Vegetation Model (Sitch et al., 2003) and is associated with biogeochemical processes (mainly carbon cycling) (Bondeau et al., 2007). The LPJmL model simulates the growth and geographical distribution of natural plant and crop functional types. Other crop models include WOFOST, a simulation model for analyzing the growth and production of field crops under a wide range of weather and soil limiting conditions (Diepen et al., 1989), CLM-Crop (Levis et al., 2012), ORCHIDEE-CROP, and PEGASUS (Deryng et al., 2011).
Considering the wide variety of crop models, some international efforts have been developed to compare their projections, such as the Inter-Sectoral Impact Model Intercomparison Project (ISIMIP) (https://www.isimip.org/) and the Agricultural Model Intercomparison and Improvement Project (AgMIP) (https://agmip.org/). Their main goal is to facilitate the evaluation and improvement of the models. They also aim to improve the estimates of the biophysical and socio-economic impacts of climate, to provide knowledge for enhancing technological capabilities, and to tackle food security and poverty at local to global scales.
How countries adapt to climate change impacts and prevent further risks poses critical questions about the countries' capacity to produce climate knowledge on a relevant scale for local decision makers. This is a key aspect of the climate services discussions (Vaughan and Dessai, 2014; Soares and Buontempo, 2019) and the analysis of how to bridge the climate policy and science interface (Lemos et al., 2012; Tang and Dessai, 2012; Knutti, 2019). Unequal climate knowledge infrastructures have been identified as one of the key dimensions for better understanding how climate information is locally produced, circulated, selected, and used by policymakers in different development sectors (Edwards, 2010; Mahony and Hulme, 2016). This geographical imbalance can be modulated by how countries customize the information from all parts of the world and integrate it into their processes. Here, novel modelling approaches are proposed to help different regions customize and use model output and data. At the same time, this approach can help build stronger technical capacities and provide adequate information for decision making.
The increasing availability of crop projection databases from leading modelling groups allows proposing simple model emulators based on statistical techniques (Blanc, 2017; Estrada et al., 2020). These emulators have low technical and computing requirements and aim to help a variety of users who have no access to biophysical crop models, but who have information needs that may go beyond publicly available datasets. The requirements of information can vary widely among users and include tailor-made, user-defined policy or reference climate change scenarios to address specific information needs. The low computational and technical costs of emulators also facilitate the use of simulation and resampling methods that allow generating probabilistic scenarios and estimating various risk measures to facilitate communication of results and help decision-making processes. Examples of such risk measures include the social and economic time of emergence, and multivariate risk indices (Hawkins and Sutton, 2012; Estrada and Botzen, 2021; Ignjacevic et al., 2021).
Limited access to complex models and technical and computational challenges are much more common for policy makers and stakeholders in developing countries (Blicharska et al., 2017). In these regions, the development of alternative modelling approaches and tools, such as emulators, can be of high scientific and policy relevance. For instance, biophysical crop models require high levels of expertise and programming to use and adapt in a way that they can address relevant national, subnational and local needs. Due to the lack of modelling alternatives, policymakers in these countries are frequently left with model runs created for other users' needs that do not respond to their specific demands. However, the usefulness and benefits of emulators of complex models are not limited to cases in which technical or computational resources are scarce or not available. A variety of emulators, such as MAGICC (Meinshausen et al., 2011), are used in high-impact publications on climate change (Fawcett et al., 2015;IPCC, 2021) and play a central role in integrated assessment modelling (Tol and Fankhauser, 1998;van Vuuren et al., 2011;Nordhaus, 2013;Meinshausen et al., 2020;Estrada and Botzen, 2021).
The main objective of this work is to offer a simplified mathematical framework to produce emulators that can approximate complex climate change impact assessment models for agriculture. These emulators can extend the original models' results to a wide range of user-defined intervention/inaction greenhouse gas emissions scenarios. They can also be extended to a probabilistic setting in which a variety of risk measures can be developed to address the user's information needs. Specifically, emulators of the EPIC model are developed based on the simulations that are freely available in the AgMIP7 database. Some performance metrics and a test to evaluate their performance are proposed. This builds upon concepts such as "weather typing" (Hay et al., 1991), the "time-shift approach" (Herger et al., 2015) and "time sampling" (James et al., 2017), developed for climate simulations and for downscaling. These concepts have been applied to interpolate impact projections for the Network for Greening the Financial System (NGFS).¹ Here we extend previous work to a formal mathematical model and provide metrics and tests for evaluating its performance.
The proposed methodology is illustrated with a case study of rainfed maize production in Mexico, which has a particular cultural and socioeconomic importance for the country. Although agricultural activities account for 3.4% of the Mexican GDP (INEGI, 2020), there are about 6 million people who depend directly on this sector (SIAP, 2019) and up to 26.9 million people considering their relatives (INEGI, 2021). Farmers in Mexico have been cataloged as among the most vulnerable to climate change because of a mosaic of conditions (Monterroso et al., 2014; Murray-Tortarolo et al., 2018; Donatti et al., 2019), and their incomes are highly dependent on crop yields. The vast majority of agricultural land is rainfed (69.7%) (SIAP, 2021) and farmed in small patches of land (usually <2 ha) (Ibarrola-Rivas et al., 2020), with traditional management practices (e.g., milpa). The proposed emulators are used to estimate the changes in yields at the grid (0.5° × 0.5°) and state levels for three emissions scenarios that are not available in the AgMIP7 database. These scenarios explore the benefits of strict compliance with the Nationally Determined Contributions (NDC) and what consequences would arise if a large emitter dropped out of the NDC and the Paris Agreement. In the current context of China's cancellation of the US-China talks, a modified NDC scenario in which China decides not to participate in the NDC efforts is selected for the analysis. The economic costs and benefits of such scenarios are evaluated, and two risk measures are used to identify the regions that are more exposed to climate change impacts on rainfed maize production.
The remainder of this article is structured as follows. Section 2 presents the data and methods used. It develops the modelling framework for constructing the emulators and presents an approach to evaluate the emulators' adequacy and accuracy. Section 3 presents and discusses model evaluation and the projected changes in yields and their implications for rainfed maize production in Mexico. Section 4 summarizes the results and concludes.

Model description
A simple methodology for constructing crop model emulators is presented. The proposed emulators can closely reproduce the outcome of complex models and address information needs that cannot be covered with the simulations that are publicly available. This methodology is illustrated using the EPIC crop model projections that are available in the AgMIP7 database. The emulator is based on a simple non-parametric modeling strategy described in the following paragraphs.
The evolution of crop yields $y_{i,j,t}$ can be represented by means of a signal plus noise model, such as:

$$y_{i,j,t} = S_{i,j,t} + u_{i,j,t} = \hat{S}_{i,j,t} + \varepsilon_{i,j,t}, \qquad \hat{S}_{i,j,t} = f(H)$$

where $S_{i,j,t}$ is the true, unobserved systematic component of $y_{i,j,t}$, which can be approximated by a general function $f(H)$ of a set of information $H$, and $i, j$ are geographical coordinates (latitude and longitude). $\hat{S}_{i,j,t}$ is the resulting approximation of $S_{i,j,t}$. The noise term $\varepsilon_{i,j,t} = \eta_{i,j,t} + u_{i,j,t}$ has two components, one related to bias and the other to non-systematic variation. $\eta_{i,j,t} = S_{i,j,t} - \hat{S}_{i,j,t}$ is an error term which absorbs the influence on the systematic part of $y_{i,j,t}$ of all other factors not included in $H$. In addition, $\varepsilon_{i,j,t}$ includes $u_{i,j,t}$, which is a zero mean, stationary noise that accommodates the non-systematic effects of other factors, such as some short-term variability in climate variables. Note that the error term $\eta_{i,j,t}$ can become a non-stationary process if $\hat{S}_{i,j,t}$ has systematic biases due to the omission of important determinants in $H$. If no biases are present in $\hat{S}_{i,j,t}$, $\eta_{i,j,t}$ is a zero mean, stationary variable (i.e., $\eta_{i,j,t} \sim I(0)$).

The first step in the methodology is to construct an information set $H$ that contains a library of simulations from the biophysical model and the corresponding global temperature change, both for a range of emissions scenarios, such as those in the Representative Concentration Pathways (RCP):

$$H = \{L, T^g\}$$

where $L = \{l_1, l_2, \ldots, l_k\}$ and each $l_k$ is a three-dimensional matrix (latitude, longitude, time) of yield projections for a particular crop, production system and emissions scenario. $T^g = \{T^g_1, T^g_2, \ldots, T^g_k\}$ is a vector of annual global temperature change for a total of $k$ different emissions scenarios, each one covering a horizon of $n$ years and expressed with respect to a reference period (e.g., preindustrial times).
The second step consists of indexing the information contained in $L$ with respect to the associated changes in annual global temperature $T^g$ and proposing the following specification for the systematic component of $\hat{y}_{i,j,t}$:

$$\hat{S}_{i,j,t} = \frac{1}{|\theta_t|} \sum_{m \in \theta_t} m_{i,j} \tag{4}$$

where $\theta_t$ is the subset of yield maps $m$ in $L$ whose associated global temperature change $T^g$ satisfies $T^g_t - \omega \le T^g \le T^g_t + \omega$, $|\theta_t|$ is the number of maps in that subset, and $\omega$ is a parameter that defines a rolling window around $T^g_t$. Eq. 4 consists in calculating the average of yield maps across the elements of $\theta_t$ which satisfy the condition of being associated with a global temperature change in the range of $T^g_t \pm \omega$, regardless of the date of occurrence and/or the emissions scenario they belong to. Averaging over this range of values has the effect of minimizing the effects of factors that are not common across the elements of $\theta_t$ and reinforcing those that are common. In particular, $\hat{S}_{i,j,t}$ would preserve the effects on yields of changes in external forcing, as well as other determinants (e.g., soil properties, fertilizers, among others), and dilute the effects of those that are different (e.g., natural variability, differences in regional forcing). As such, by analyzing the error term $\hat{\varepsilon}_{i,j,t}$ it can be inferred whether the proposed model has important biases $\hat{\eta}_{i,j,t}$ and thus whether the information set $H$ provides a representative sample to adequately emulate the outcome $y_{i,j,t}$. The existence of important biases in $\hat{y}_{i,j,t}$ can be evaluated by testing if $\varepsilon_{i,j,t} \sim I(0)$. Since $\hat{y}_{i,j,t}$ is an average, a centered running mean of $y_{i,j,t}$ is used to calculate $\hat{\varepsilon}_{t}$ and to compute in-sample and out-of-sample forecast evaluation measures (RMSE, nRMSE). This running mean is closer to what $\hat{y}_{i,j,t}$ represents and minimizes the effects of natural variability.
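The windowed average in Eq. 4 can be sketched for a single grid cell in plain Python. This is only an illustration under stated assumptions: the function name `emulate_yield`, the layout of the library as pooled `(T_g, yield)` pairs, the value of `omega` and the toy numbers are all hypothetical and are not taken from the EPIC or AgMIP7 data.

```python
from statistics import mean

def emulate_yield(library, target_tg, omega):
    """Average the library yields whose global warming lies in target +/- omega.

    library   : list of (tg, yield) pairs pooled over all scenarios and years
                for ONE grid cell (hypothetical layout).
    target_tg : annual global temperature change of the user-defined scenario.
    Returns one emulated yield per target year, or None when the window is
    empty (the target lies outside the warming range covered by the library).
    """
    emulated = []
    for tg_t in target_tg:
        window = [y for tg, y in library if tg_t - omega <= tg <= tg_t + omega]
        emulated.append(mean(window) if window else None)
    return emulated

# Toy library: two made-up scenarios in which yield falls 1 t/ha per degree.
library = [(step / 10, 6.0 - step / 10) for step in range(0, 41)]   # 0.0 to 4.0 degC
library += [(step / 10, 6.0 - step / 10) for step in range(0, 21)]  # 0.0 to 2.0 degC

result = emulate_yield(library, target_tg=[1.0, 2.5, 5.0], omega=0.25)
print(result)
```

The last target (5.0 °C) exceeds the warming covered by the toy library, so the sketch returns `None` rather than extrapolating, in line with the limitation that projections beyond the calibration range are likely not valid.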
Some relevant properties and limitations of the proposed methodology include:
• There are no assumptions about the functional form relating the effects of changes in climate over yields.
• Spatial patterns produced by the original biophysical crop model are preserved. Noise is introduced to these patterns by the mismatch between $\hat{S}_{i,j,t}$ and $S_{i,j,t}$, as well as by $u_{i,j,t}$. However, if $\hat{\varepsilon}_{i,j,t} \sim I(0)$ any mismatch is transitory, and no systematic biases are present.
• As with many other models, projections beyond the range of values used for calibration are likely not valid.
• The proposed methodology is not appropriate for scenarios which involve large changes in spatial climate patterns (i.e., spatial stationarity of changes does not hold, see below) such as those that correspond to climatic catastrophes (e.g., thermohaline circulation collapse).
• At each time $t$ the emulator in Eqs 4, 5 constructs a library of realizations that represents the response of the biophysical climate models to similar levels of warming. These collections of realizations can be used to approximate the empirical distributions of crop yields conditional on the level of warming $T^g$ at time $t$, through resampling and simulation methods as illustrated in the following section. This makes it possible to explore, for example, the probabilities of exceeding thresholds and other risk measures.
Finally, note that $T^g$ provides a succinct representation of changes in climate as it implicitly offers an approximation of how temperature and precipitation vary at a spatially explicit scale. A variety of studies have provided strong evidence in favor of stationarity in the spatial patterns of change in variables such as monthly and annual temperatures (mean, maximum and minimum), as well as in precipitation (Tebaldi and Arblaster, 2014). This implies that changes in temperature and precipitation at the grid cell are proportional to changes in annual global temperature:

$$v_{i,j,t} = P^v_{i,j}\, T^g_t + \xi_{i,j,t}$$

where $v_{i,j,t}$ is the change in variable $v$, and $P^v_{i,j}$ is a matrix of scaling values that are fixed in time but that vary across space. The scaling pattern $P^v_{i,j}$ represents the response of the climate system in variable $v$ to changes in external forcing, while $\xi_{i,j,t}$ is a noise term that includes the effects of natural variability. This means that while changes in the climate variable $v$ (or in multiple climate variables $V = \{v_1, v_2, \ldots, v_p\}$) are heterogeneous in space, they all scale linearly with $T^g$ at a fixed proportion. Thus, $S_{i,j,t} = f(V, \cdot) \propto f(T^g, \cdot)$. In consequence, the proposed methodology requires the assumption of spatial stationarity to hold.
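As a rough illustration of the proportionality between local and global changes, the scaling value of a single grid cell could be estimated with a least-squares fit through the origin. The text does not state how the scaling patterns are estimated, so the estimator, the function name `scaling_pattern` and the toy series below are assumptions made only for this sketch.

```python
def scaling_pattern(local_change, global_tg):
    """Least-squares slope through the origin of local change on global warming.

    Estimates the scaling value P for ONE grid cell under the assumed model
    v_t = P * Tg_t + noise (no intercept).
    """
    numerator = sum(v * t for v, t in zip(local_change, global_tg))
    denominator = sum(t * t for t in global_tg)
    return numerator / denominator

# Toy data: local warming is 1.5 times global warming plus small noise.
global_tg = [0.5, 1.0, 1.5, 2.0, 2.5]
noise = [0.05, -0.05, 0.0, 0.05, -0.05]
local_temp = [1.5 * t + n for t, n in zip(global_tg, noise)]

p_hat = scaling_pattern(local_temp, global_tg)
print(round(p_hat, 3))
```

If the residuals around the fitted line showed a trend rather than noise, that would be a sign that spatial stationarity does not hold for that cell.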

Computation of risk measures estimates
A relevant application of the proposed methodology is to provide risk measures that are currently not available from the original biophysical crop models. The library of realizations in the information set H, in combination with resampling methods, can be used to approximate the empirical distribution of crop yields conditional on the level of warming T g at time t. Specifically, for each time step t, the set of realizations in H are resampled with replacement n times and the resulting four-dimensional matrix (latitude, longitude, time, resampled realizations) can be used to approximate the probability of exceedance of a risk threshold defined by the user (Estrada et al., 2020;Estrada and Botzen, 2021). A risk threshold based on the percent change in yields is defined by the user and the probabilities of exceedance are computed from the four-dimensional matrix of yields. A Boolean function is used to assign the value 1 to the entries in the four-dimensional matrix that exceed the chosen threshold and 0 otherwise. The average value of the resulting matrix is calculated to approximate the probability of exceedance per grid cell and time step. Once the probabilities of exceedance have been computed, the date of exceedance can be estimated by selecting a probability threshold at which the occurrence of exceedance is declared.
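The resampling steps above can be sketched for one grid cell as follows. The function names, the toy percent changes, the number of draws and the probability threshold are hypothetical; in addition, "exceedance" is assumed here to mean a yield change at or below a negative threshold (a loss at least as large as the threshold), which is one possible reading of the text.

```python
import random

def exceedance_probability(realizations, threshold, n, rng):
    """Share of n bootstrap draws whose yield change is at or below threshold."""
    draws = rng.choices(realizations, k=n)               # resample with replacement
    hits = [1 if d <= threshold else 0 for d in draws]   # Boolean indicator
    return sum(hits) / n

def date_of_exceedance(years, probabilities, p_crit):
    """First year whose exceedance probability reaches p_crit, else None."""
    for year, p in zip(years, probabilities):
        if p >= p_crit:
            return year
    return None

rng = random.Random(42)  # fixed seed so the sketch is reproducible

# Toy realizations (% change in yield) found in the warming window of each year.
windows = {
    2030: [-5.0, -8.0, -12.0, -3.0],
    2055: [-15.0, -22.0, -25.0, -18.0],
    2085: [-30.0, -35.0, -28.0, -40.0],
}
years = sorted(windows)
probs = [exceedance_probability(windows[y], -20.0, 1000, rng) for y in years]
first_year = date_of_exceedance(years, probs, p_crit=0.9)
print(probs, first_year)
```

Applying the same two functions to every grid cell would yield maps of exceedance probabilities and of dates of exceedance.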

Calculation of economic losses
For calculating the economic losses, the following steps were carried out. First, because modelled and observed yields are not directly comparable (Rosenzweig et al., 2014), the change in modelled yields is applied to observed yields as follows (see Estrada et al., 2022):

$$Y_{fut} = Y^{obs}_{ref} \cdot \frac{Y^{mod}_{fut}}{Y^{mod}_{ref}}$$

where $Y_{fut}$ is the future yield, $Y^{obs}_{ref}$ is the observed yield in the reference period, $Y^{mod}_{ref}$ is the yield from the biophysical model emulator in the reference period and $Y^{mod}_{fut}$ is the yield calculated from the model emulator for the future period. Second, the change in yields ($Y^{obs}_{ref} - Y_{fut}$) is obtained and multiplied by the number of hectares in each state devoted to the production of the crop to provide an estimate of the tons of crop lost due to climate change. Third, the estimated annual loss in production is multiplied by the price per ton of the crop in each state. Finally, the present value of losses is calculated with a user-defined discount rate.
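The four steps can be sketched for a single state as follows. The function names and all numbers are hypothetical, and the discounting formula (annual compounding from a base year) is an assumption, since the text only states that a user-defined discount rate is applied.

```python
def future_yield(y_obs_ref, y_mod_ref, y_mod_fut):
    """Apply the modelled relative change to the observed reference yield."""
    return y_obs_ref * (y_mod_fut / y_mod_ref)

def present_value_of_losses(y_obs_ref, y_mod_ref, y_mod_fut_by_year,
                            hectares, price_per_ton, discount_rate, base_year):
    """Discounted sum of annual production losses for one state."""
    total = 0.0
    for year, y_mod_fut in sorted(y_mod_fut_by_year.items()):
        y_fut = future_yield(y_obs_ref, y_mod_ref, y_mod_fut)
        lost_tons = (y_obs_ref - y_fut) * hectares          # step 2
        annual_loss = lost_tons * price_per_ton             # step 3
        total += annual_loss / (1.0 + discount_rate) ** (year - base_year)  # step 4
    return total

# Toy inputs: observed 2 t/ha, emulated 6 t/ha in the reference period.
emulated_future = {2021: 5.4, 2022: 4.8}   # t/ha from the emulator (made up)
pv_undiscounted = present_value_of_losses(2.0, 6.0, emulated_future,
                                          hectares=1000, price_per_ton=100.0,
                                          discount_rate=0.0, base_year=2020)
pv_discounted = present_value_of_losses(2.0, 6.0, emulated_future,
                                        hectares=1000, price_per_ton=100.0,
                                        discount_rate=0.04, base_year=2020)
print(pv_undiscounted, pv_discounted)
```

With a zero discount rate the toy losses simply add up (200 t and 400 t of maize valued at 100 per ton); a positive rate reduces the present value.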

Evaluating the adequacy and accuracy of models based on different information sets H
As mentioned in the previous subsection, the adequacy of the information set $H$ used to calculate Eq. 5 can be evaluated by analyzing the properties of the error term $\hat{\varepsilon}_{i,j,t}$. If differences between $S_{i,j,t}$ and $\hat{S}_{i,j,t}$ are transitory then $\hat{\varepsilon}_{i,j,t} \sim I(0)$, while if they are persistent, they will make the error term non-stationary. The Augmented Dickey-Fuller (ADF) test is commonly used to distinguish between stationary and non-stationary variables (Dickey and Fuller, 1979; Said and Dickey, 1984). It involves estimating the following regression for any time series $x_t$:

$$\Delta x_t = \delta x_{t-1} + \sum_{j=1}^{J} \beta_j \Delta x_{t-j} + e_t$$

where $\sum_{j=1}^{J} \beta_j \Delta x_{t-j}$ are additional terms to correct for autocorrelation and $e_t$ is the regression error term. Under the null hypothesis, $\delta = 0$ and $x_t$ contains a unit root; the alternative is that it is stationary around zero. It is important to note that 1) the power of the ADF test goes to zero when a deterministic trend is omitted, 2) when an intercept is not included, the power is adversely affected and decreases with the magnitude of the omitted constant, and 3) in the case of structural changes in the trend function, $\delta$ will be biased towards zero (the non-rejection of the null). As such, when applied to $\hat{\varepsilon}_{i,j,t}$, the null hypothesis of the ADF test would likely not be rejected if a persistent bias is present in $\hat{S}_{i,j,t}$, regardless of the non-stationarity being caused by the presence of a unit root, an omitted trend/intercept or the existence of structural breaks (Perron, 1989).
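To make the logic of the test concrete, the sketch below computes the t-statistic of δ in plain Python for the simplest case with no lagged difference terms (J = 0), no intercept and no trend. It is a simplified Dickey-Fuller statistic, not the full ADF procedure, and the toy series are made up. The comparison value of about −1.95 is the approximate 5% critical value for the no-constant case; in practice a library routine such as `adfuller` in statsmodels would normally be used.

```python
import math
import random

def df_t_statistic(x):
    """t-statistic of delta in dx_t = delta * x_{t-1} + e_t (J = 0, no constant)."""
    lagged = x[:-1]
    diffs = [b - a for a, b in zip(x[:-1], x[1:])]
    sxx = sum(v * v for v in lagged)
    delta = sum(l * d for l, d in zip(lagged, diffs)) / sxx
    residuals = [d - delta * l for l, d in zip(lagged, diffs)]
    s2 = sum(r * r for r in residuals) / (len(diffs) - 1)   # one estimated parameter
    return delta / math.sqrt(s2 / sxx)

CRITICAL_5PCT = -1.95   # approximate value, no-constant case

# Transitory errors: zero-mean white noise (seeded for reproducibility).
rng = random.Random(1)
transitory = [rng.gauss(0.0, 1.0) for _ in range(200)]

# Persistent bias: errors that drift upward over time (deterministic toy series).
drifting = [0.1 * t + 0.5 * (-1) ** t for t in range(200)]

t_transitory = df_t_statistic(transitory)
t_drifting = df_t_statistic(drifting)
print(t_transitory < CRITICAL_5PCT, t_drifting < CRITICAL_5PCT)
```

The white-noise errors reject the unit-root null, while the drifting errors do not, which mirrors the point that an omitted trend in the emulator's errors leads to non-rejection.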
To evaluate the accuracy of the projections obtained using different $H$ sets, the root mean square error (RMSE) and the normalized RMSE (nRMSE) are calculated, with the RMSE normalized by the mean yield of the original model for the projected period. These metrics also help to assess how much additional realizations of the biophysical crop model contribute to improving the emulator's projections.
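A minimal sketch of these two metrics for one grid cell follows, together with a centered running mean of the kind the methodology applies to the original yields before comparison. The function names, the window length and the toy series are hypothetical.

```python
import math

def centered_running_mean(series, window):
    """Centered running mean with an odd window; edge values are dropped."""
    half = window // 2
    return [sum(series[k - half:k + half + 1]) / window
            for k in range(half, len(series) - half)]

def rmse(original, emulated):
    """Root mean square error between original and emulated yields."""
    squared = [(o - e) ** 2 for o, e in zip(original, emulated)]
    return math.sqrt(sum(squared) / len(squared))

def nrmse(original, emulated):
    """RMSE divided by the mean of the original model yields."""
    return rmse(original, emulated) / (sum(original) / len(original))

# Toy yields in t/ha.
original = [6.0, 6.0, 6.0, 6.0]
emulated = [5.0, 7.0, 5.0, 7.0]
error = rmse(original, emulated)
relative_error = nrmse(original, emulated)
smoothed = centered_running_mean([1.0, 2.0, 3.0, 4.0, 5.0], window=3)
print(error, relative_error, smoothed)
```

Here the emulated series is off by 1 t/ha every year, so the RMSE is 1 t/ha and the nRMSE is one sixth of the mean yield.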

Data description and sources
The proposed methodology is illustrated using the output from the EPIC crop model for rainfed maize (Williams et al., 1984, 1989) forced with the climate projections of the HadGEM2-ES climate model under the RCP8.5, RCP6.0, RCP4.5 and RCP2.6 emissions scenarios.² This information constitutes the $L$ component of the information set $H$. All data were obtained from the AgMIP7 dataset using the Geoshare AgMIP Tool (Villoria et al., 2016). The geographical domain chosen for this study is Mexico, and the period is 2005-2100. For the $T^g$ component, the ensemble average of annual mean global temperature projections from the HadGEM2-ES climate model was computed for each of the four RCP scenarios. Four simulations were available for the RCP2.6, RCP4.5 and RCP8.5, while only three were available for the RCP6.0. All HadGEM2-ES output was downloaded from the KNMI's Climate Explorer tool (https://climexp.knmi.nl/). To illustrate the usefulness of the proposed emulators, the ensemble average (one member per model) of the Coupled Model Intercomparison Project phase 6 (CMIP6) dataset for the SSP370, also from the KNMI's Climate Explorer tool, and two simulations from the CLIMRISK and MAGICC6 (Meinshausen et al., 2011) models were obtained. These two simulations represent 1) the strict compliance with the Nationally Determined Contributions (NDC) of all countries and 2) the NDC scenario but with China dropping out from this international effort (NDCnoCHINA). Observed yields, cultivated area and prices for rainfed maize were obtained for the period 2000-2010 from SIACON.³

² These emissions scenarios are named after the radiative forcing they would produce by the end of the present century, ranging from 8.5 W/m² to 2.6 W/m². They can also be interpreted as a very high emissions trajectory (RCP8.5), two scenarios that are similar to what current policies would achieve (RCP6.0) and to what strict fulfilment of Nationally Determined Contributions (NDC) would produce (RCP4.5), and a stringent international mitigation scenario (RCP2.6) that is consistent with the Paris Agreement goals of keeping global temperature increase well below 2°C by 2100.

Frontiers in Environmental Science frontiersin.org

Evaluation of adequacy and accuracy of the proposed emulators
The adequacy and accuracy of different emulators based on all possible combinations of RCP simulations to integrate the information set $H$ were evaluated. Supplementary Tables S1-S4 in the Supplementary Material show the RMSE, nRMSE and the significance of the ADF test statistic. Bold figures denote which emulators provide no evidence of non-stationarities in $\hat{\varepsilon}_{i,j,t}$ and produce the lowest errors. When $H$ is composed of only one RCP scenario, the RCP8.5 is the only one that produces stationary residuals. This emulator has an out-of-sample RMSE (averaged over all simulations except those included in $H$; in this case, the RCP8.5 is excluded) of 0.89 t/ha and a nRMSE of 13.6%.⁴ The nRMSE is reduced by about 30% when the RCP4.5 is added to $H$, and the errors are also stationary. Including the RCP8.5, RCP4.5 and RCP6.0 in $H$ decreases the average out-of-sample nRMSE by about 4% and produces stationary errors. The average out-of-sample RMSE is 0.64 t/ha and the nRMSE is 9%. When $H$ includes all four RCPs, errors are stationary, the average in-sample RMSE is 0.44 t/ha and the average nRMSE is 7%. Supplementary Figures S1-S9 in the Supplementary Material compare the original yield projections obtained from the EPIC model and those produced with the proposed methodology. These figures show the spatial patterns of the nRMSE and the temporal evolution of the yield projections from the EPIC model and the proposed emulators for a randomly chosen grid cell.
The results in Supplementary Tables S1-S4 in the Supplementary Material show that: 1) the errors produced by the proposed emulators are relatively small, as the RMSE is in general below 1 t/ha, compared with an average yield of about 6 t/ha for the area of study; 2) because the RCP8.5 spans a wider range of global temperature change, it is the scenario that adds the most information to the set H, whereas the RCP2.6 adds the least because all other RCP scenarios provide information on changes in yields over a range of global temperature change that encompasses that of the RCP2.6; 3) the out-of-sample RMSE values averaged over the different RCPs decrease as more RCP scenarios are added to the set H, suggesting that the emulator becomes better at producing projections for scenarios that are not in the training set.
Furthermore, most of the combinations of RCPs in H that include the RCP8.5 produce stationary errors at each grid cell, for all RCPs that are evaluated. This suggests that differences between S_{i,j,t} and Ŝ_{i,j,t} are indeed transitory and that RCP-specific differences, such as those in regional forcing and other factors, do not produce a systematic bias in the emulator's projections.
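The error measures used in this evaluation can be illustrated with a minimal sketch. It assumes that the nRMSE is the RMSE expressed as a percentage of the mean reference yield; the normalization is not stated in this section, so this choice, the function names and the yield values below are hypothetical and serve only as an illustration.

```python
import math


def rmse(reference, emulated):
    """Root-mean-square error between two equally long yield series (t/ha)."""
    if len(reference) != len(emulated) or not reference:
        raise ValueError("series must be non-empty and of equal length")
    squared = [(r - e) ** 2 for r, e in zip(reference, emulated)]
    return math.sqrt(sum(squared) / len(squared))


def nrmse(reference, emulated):
    """RMSE as a percentage of the mean reference yield.

    Assumption: normalization by the mean; other choices (range,
    standard deviation) are also common.
    """
    mean_reference = sum(reference) / len(reference)
    return 100.0 * rmse(reference, emulated) / mean_reference


# Hypothetical yields (t/ha) for one grid cell; not taken from EPIC output.
epic_yields = [6.0, 5.8, 5.5, 5.1, 4.6]
emulated_yields = [6.2, 5.6, 5.6, 4.9, 4.9]

cell_rmse = rmse(epic_yields, emulated_yields)
cell_nrmse = nrmse(epic_yields, emulated_yields)
print(f"RMSE = {cell_rmse:.2f} t/ha, nRMSE = {cell_nrmse:.1f}%")
```

In an out-of-sample evaluation, these functions would be applied to simulations that were left out of H and the results averaged over grid cells and simulations.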

An illustration of the proposed emulators for generating user-defined scenarios
The usefulness of the proposed methodology is illustrated by projecting rainfed maize yields under three emissions scenarios that are not considered in AgMIP7. Furthermore, risk estimates that are not directly available using current biophysical crop models are provided. The emissions scenarios that were selected are: 1) the SSP370 used in the CMIP6, which is similar to a "business-as-usual" scenario; 2) a strict compliance NDC scenario; and 3) the NDCnoCHINA scenario, which consists of the NDC scenario but excluding China's participation.

Climate change impacts on rainfed maize yields
Results of the simulations of rainfed maize yields for the selected emissions scenarios are presented in Supplementary Figure S10 included in the Supplementary Material. This figure shows the changes in yields (%) with respect to 1980-2010 for the SSP370, NDC and NDCnoCHINA emissions scenarios for the time horizons 2055 and 2085. The SSP370 scenario implies large reductions in rainfed maize yields in Mexico by mid-century. These reductions are highly heterogeneous in space and particularly large for part of the northeast and most of the southeast of Mexico. This is also the case for the south-center region of the US, where the yield changes can exceed −40%. The reductions in yields become much larger and more widespread near the end of the century, reaching over 70% in the southeast and northeast of Mexico (and in the southeast of the US), and close to 40%-50% in some regions of the Pacific coast where some of the largest producers of rainfed maize are located.
Aggregating yield changes at the state level (Figure 1) shows that under the SSP370 all states, with the exception of the Baja California peninsula, would experience important decreases in rainfed maize yields during this century. The largest reductions occur in Nuevo Leon, reaching close to 50% during the 2050s and about 80% at the end of the century, followed by those in Campeche, which exceed 40% by mid-century and 60% by the 2080s. Other states with decreases in yields exceeding 50% by the end of the century are Coahuila, Quintana Roo, Tabasco, Tamaulipas, and Yucatán. States such as Chiapas, Guerrero, and Oaxaca, which are characterized by high levels of poverty and small-scale producers that depend on rainfed maize production for subsistence, would also experience large reductions in yields. For these states, the expected reductions in yields are about 30%-45% by the end of the century and 15%-30% in the following three decades. The largest producers of rainfed maize in the country (e.g., Mexico, Jalisco, and Nayarit) would see reductions between 5% and 15% by the 2050s, and between 15% and 25% at the end of the century. Figures 2A, 3A show these results as maps for the short (2025) and medium (2055) time horizons, respectively.

3 SIACON is a query system for agricultural information created by the Mexican government. SIACON is available at https://www.gob.mx/siap/documentos/siacon-ng-161430
4 Note that there is no consensus about what an acceptable range of RMSE or nRMSE values is. This measure is intended to compare the accuracy of alternative models in relative terms (Blanc, 2017; Estrada et al., 2020).

Frontiers in Environmental Science frontiersin.org
If an international mitigation effort consistent with the NDC commitments were implemented, a significant fraction of these reductions in yields could be avoided. Figure 1B shows the yield changes obtained for the NDC scenario and reveals that there would be important benefits for most states if such an international mitigation effort were implemented, in comparison with a "business-as-usual" type of scenario (SSP370; Figures 2B, 3B). Thirteen states would avoid losing at least 10% of their current yields by the end of the century, and seven states would avoid reductions in yields of 5% or more by the 2050s. However, if China, one of the key actors for limiting greenhouse gas emissions, decided not to participate in the NDC effort, these benefits would be significantly reduced (Figures 1D, 2C, 3C). In comparison with the SSP370, implementing the NDCnoCHINA scenario would more than halve the benefits that would be obtained under the full compliance of all participant countries (NDC): only three states would avoid reductions of at least 10% by the 2080s and no state would see benefits exceeding 5% by mid-century.

Risk measures estimates for rainfed maize in Mexico
In this subsection, the library of realizations in the information set H is used in combination with resampling methods to approximate the empirical distribution of crop yields conditional on the level of warming T_g at time t. Specifically, for each time step, the set of realizations in H is resampled with replacement 10,000 times and the resulting four-dimensional matrix (latitude, longitude, time, resampled realizations) is used to approximate the probability of exceedance of a risk threshold defined by the user. For the results presented below, a 30% reduction in yields is chosen as the user-defined risk threshold and the probability threshold is set at 50%. In other words, it is required that at least 50% of the realizations exceed the risk threshold defined by the user to declare that the risk threshold has been exceeded. Once the threshold has been declared as exceeded, the date when this first occurs is retrieved. These risk measures can help decision-makers develop critical path planning for adaptation, in which dates of exceedance provide a time frame for designing and implementing a sequence of adaptation activities.
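The resampling procedure described above can be sketched for a single grid cell as follows. This is an illustration under stated assumptions, not the implementation used in the paper: the function names, the fixed random seed and the yield changes are hypothetical, and the realizations for each time step are represented simply as a list of yield changes (%).

```python
import random


def exceedance_probability(changes, threshold=-30.0, n_resamples=10_000, seed=0):
    """Bootstrap estimate of P(yield change <= threshold) for one time step.

    `changes` holds the yield changes (%) of the available realizations;
    they are resampled with replacement `n_resamples` times.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    draws = rng.choices(changes, k=n_resamples)
    return sum(1 for c in draws if c <= threshold) / n_resamples


def first_exceedance_year(years, changes_by_year, threshold=-30.0,
                          prob_threshold=0.5):
    """First year in which the exceedance probability reaches prob_threshold."""
    for year, changes in zip(years, changes_by_year):
        if exceedance_probability(changes, threshold) >= prob_threshold:
            return year
    return None  # threshold never declared as exceeded


# Hypothetical yield changes (%) for four realizations at one grid cell.
years = [2030, 2040, 2050, 2060]
changes_by_year = [
    [-5.0, -10.0, -12.0, -8.0],
    [-20.0, -25.0, -31.0, -18.0],
    [-28.0, -35.0, -40.0, -33.0],
    [-45.0, -50.0, -38.0, -41.0],
]
print(first_exceedance_year(years, changes_by_year))
```

Applying the same two functions to every grid cell would give a map of dates of exceedance; both the risk threshold and the probability threshold are left as parameters because they are user-defined.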
Animated Supplementary Figures S12-S14 in the Supplementary Material show the evolution of the probabilities of exceeding 30% reductions in rainfed maize yields for the SSP370, NDC and NDCnoCHINA scenarios for the period 2005-2100. Under the SSP370 (Supplementary Figure S12), the probabilities of exceedance increase rapidly in the second part of the century, reaching values above 80% in the southeast region of the country and all along the Pacific coast. In contrast, for most of the central region these probabilities remain below 60%. Although the NDC is not a very ambitious international mitigation effort, the probabilities of exceeding the selected risk threshold for rainfed maize yields are much lower for most of the country. The exceptions are the states located in the southeast of the country, as well as Nuevo Leon, Coahuila, Tamaulipas, and Veracruz.
Figure 4 shows the dates of exceedance for each of the three scenarios considered. Under the SSP370 scenario (Figure 4A), most of the area devoted to rainfed maize production in Mexico would likely experience reductions of 30% in yields during this century. During the present decade, states such as Nuevo Leon and Coahuila would reach this risk threshold, as well as some parts of Campeche. In the 2030s, Tamaulipas, parts of Sonora and the rest of Campeche would exceed 30% decreases in yields. The rest of the southeast of Mexico would exceed 30% decreases in yields during the 2040-2060 period, and a large fraction of the remaining area devoted to this crop would exceed the threshold as early as the 2070s. Strict compliance with the NDCs would not be enough to delay exceeding the risk threshold for Nuevo Leon, Coahuila, Tamaulipas or Campeche (Figure 4B). Nevertheless, it would provide about 10 extra years for adaptation in parts of the southeast of Mexico, and about 20 years in Nayarit, Sinaloa, and Sonora. Furthermore, such a mitigation scenario would push the date for exceeding the risk threshold into the next century for most of the central part of Mexico.
If China dropped out of the NDCs (NDCnoCHINA), most regions in Mexico would still experience some benefits in terms of delaying the date of exceedance of the risk threshold (Figure 4C). These areas include the southeast of the country, where the dates of exceedance would be similar to those obtained in the NDC scenario. For most of the Pacific coast there would be a delay of 10-15 years in comparison with the SSP370 scenario. The central part of the country would also experience a delay of about 20 years in reaching the risk threshold with respect to the SSP370 scenario. These delays would provide additional time for designing and implementing adaptation strategies to minimize the impacts of climate change on this crop and for addressing the challenges of the population that depends on it.

Estimates of the economic costs of climate change for rainfed maize in Mexico
In this section, estimates of the economic costs of climate change at the national and state levels are provided. For this purpose, the official statistics about yields, crop area and prices collected over the period 2000-2010 by the Ministry of Agriculture and Rural Development of Mexico are used, as well as the projections of changes in yields obtained for the SSP370, NDC and NDCnoCHINA scenarios.
To represent the reference yields and area devoted to rainfed maize, the state average values of these variables during the 2000-2010 period are used. For each scenario, the future yields are obtained by multiplying one plus the projected changes (%) by the observed average yield of each state. Assuming the rainfed maize area remains constant for the rest of this century, the losses/gains from climate change in rainfed production are calculated as the difference between future and reference yields in each state, multiplied by the rainfed maize area in each state. The resulting quantity of tons is multiplied by the state-level price to approximate the costs or benefits of climate change for this crop under a particular emissions scenario. For the results in this subsection, state prices in 2012 pesos and a 4% discount rate are used for calculating present values.
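The calculation described above can be sketched for a single state as follows. The function name, the input values and the use of 2012 as the base year for discounting are assumptions made for illustration; the text specifies only the 2012 prices and the 4% discount rate.

```python
def present_value_of_changes(ref_yield, area, price, changes_pct, years,
                             base_year=2012, discount_rate=0.04):
    """Present value (pesos) of yield-driven production changes in one state.

    ref_yield   : reference average yield (t/ha)
    area        : rainfed maize area (ha), held constant over time
    price       : state-level price (pesos per ton)
    changes_pct : projected yield change (%) for each year in `years`
    base_year   : year to which values are discounted (assumed here)
    """
    total = 0.0
    for year, change in zip(years, changes_pct):
        future_yield = ref_yield * (1.0 + change / 100.0)
        tons = (future_yield - ref_yield) * area  # negative values are losses
        value = tons * price
        total += value / (1.0 + discount_rate) ** (year - base_year)
    return total


# Hypothetical state: 2 t/ha, 100,000 ha, 4,000 pesos per ton.
pv = present_value_of_changes(
    ref_yield=2.0, area=100_000.0, price=4_000.0,
    changes_pct=[-10.0, -20.0], years=[2013, 2014],
)
print(f"Present value of changes: {pv:,.0f} pesos")
```

Summing this quantity over states would give a national figure, and the benefit of a mitigation scenario relative to a reference scenario would be the difference between their present values.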
At the national level, the present value of the cumulative losses in rainfed maize yields over this century amounts to $130,000 million pesos, which is comparable to three times the value of rainfed production of Mexico in 2012 (Table 1). These losses are highly heterogeneous at the state level, with Chiapas, Jalisco, Veracruz, Oaxaca, and Guerrero accounting for about 60% of the total national losses (Figure 5A). In comparison with the SSP370, the present value of the cumulative benefits over this century of the NDC scenario (Figure 5B) would be about $25,000 million pesos, with the largest benefits in Chiapas ($5,600 million), Jalisco ($3,300 million) and Veracruz ($3,000 million). The decision of China to drop out of the NDC agreement would represent a loss of $8,600 million pesos for Mexico in rainfed maize production in comparison with the strict implementation of the NDC. About 46% of these lost benefits would occur in Chiapas, Jalisco and Veracruz (Figure 5B).

Conclusion
The amount of data about climate change and its impacts on natural and human systems that is available to decision-makers and researchers all over the world is unprecedented and ever-growing. Moreover, a large fraction of these databases is publicly available through international efforts of the climate change modelling community. However, information needs are highly dynamic in an era of active mitigation and adaptation policies and are very heterogeneous among users. As such, these needs can hardly be satisfied even by such impressive and varied collections of databases. This is particularly true in the case of complex models, for which runs are typically available for a limited number of scenarios (e.g., RCP, SSP) and experiments. Limited access to these models and the lack of technical and computational capacities to run them constitute significant barriers for a variety of users and preclude them from creating tailor-made scenarios to address their specific information needs.
This information gap can be addressed through the development of emulators, which can approximate the output from complex models using simple methods that are neither technically demanding on the user nor costly in computational terms. Moreover, such methods can be easily implemented and made publicly available. In this paper, a simple emulator of the EPIC model applied to rainfed maize is presented. It is shown that it can adequately reproduce the output of this complex biophysical crop model and create projections for user-defined scenarios, as well as risk measures that are not available with the original model.
The proposed emulator is illustrated with an application for rainfed maize production in Mexico under three scenarios that are not available in the AgMIP7 database: the SSP370 and two user-defined scenarios that represent the strict compliance of the NDC commitments and a hypothetical case in which China drops out of this international mitigation effort. It is shown that under the baseline scenario (SSP370), rainfed maize yields could decrease by at least 40% in 11 states of the country and by up to 60%-80% in some regions by the end of the century. These results are consistent with the range of yield reductions reported in Estrada et al. (2022), which analyzes yield changes of the EPIC model under the RCP8.5 and RCP2.6 scenarios