The combined effects of VPD and soil moisture on historical maize yield and prediction in China

Understanding the effects of thermal and water stress on maize yield in the context of climate change is crucial to ensure food security in China. However, very few studies looked into the combined effects of heat and water stress on maize yield in China. Here, we utilized historical reanalysis data from ERA5 and four future shared socioeconomic pathway scenarios (SSP1-2.6, SSP2-4.5, SSP3-7.0, and SSP5-8.5) of the Coupled Model Intercomparison Project 6 (CMIP6) models to predict the maize yield. We used the linear mixed-effects model to quantify the grid cell sensitivity of vapor pressure deficit (VPD) and root-zone soil moisture to maize yield in China during 2010–2016. The results infer that VPD and root-zone soil moisture are excellent representatives of heat and moisture stress. Maize yield is beneficial only when the atmospheric moisture demand and soil moisture are in relative balance. Based on the historical results’ polynomial function for VPD and soil moisture, we predict the maize yield response to soil moisture and VPD in the four SSPs. The results show that considering soil moisture in the future the projected yield estimates reduce the overestimated yield loss by half compared to considering only atmospheric moisture requirements. Maize yield will decrease under representative SSPs due to an increase in temperature (1.5, 2.0, 2.5, 3.0, 3.5, and 4.0°C). This study suggests that both atmospheric moisture demand and supply need to be considered when analyzing the specific influence of climate change on crop yield to secure and assure global food supplies.


Introduction
Maize is one of the most valuable food crops to ensure food security in the world (Jones and Thornton, 2003;Lobell et al., 2013;Abera et al., 2018). Increased maize production is essential for human livelihood, welfare, and the development of the national agriculture and livestock industry. The maize yield in China is probably 2.61 × 10 8 tons, accounting for 22.7% of the total global maize production (FAOSTAT, 2020), making it the world's second-largest maize producer. Global warming, accompanied by an increase in the frequency and intensity of extreme weather and climate extremes (droughts, heat waves, floods, etc.), will irreversibly affect plant physiological and yield parameters (IPCC, 2021). Conversely, the impacts of climate change and the intensification of associated hydrometeorological extremes will exacerbate the maize production crisis, affecting food security and sustainability (Li et al., 2021a). Vegetation growth and development at different stages are subject to changes in temperature degree days (Schlenker and Roberts, 2009), vapor pressure deficit (VPD) (Arve et al., 2011), soil water (Siebert et al., 2014;Webber et al., 2016), CO 2 (Lobell et al., 2013) and solar radiation (Mercado et al., 2009). Moreover, large-scale climatic phenomena such as the El Niño Southern Oscillation (ENSO) (Zhu et al., 2018) and anthropogenic interventions in the natural earth climate system can affect the global vegetation growth (Reichstein et al., 2013). Air temperature and soil moisture are two main controlling factors for rainfed maize yield (Carter et al., 2016;Ortiz-Bobea et al., 2019). Therefore, the specific air temperature and soil moisture range corresponding to each growth stage contribute to the physiological growth of maize. Fluctuation beyond the optimal threshold range adversely affects biological production and grain yield . Excessively increasing air temperature triggers heat stress, reducing maize yield (Schlenker and Roberts, 2009;Butler and Huybers, 2013;Lobell et al., 2013;Butler and Huybers, 2015). For another, excess soil moisture (e.g., flood, over-irrigation) also produces waterlogging, damaging the plant root system and loss of essential nutrients, causing maize yield damage (Rosenzweig et al., 2002;Beier et al., 2012;Li et al., 2019;Lesk et al., 2020). On the contrary, an increase in soil moisture also mediates the passive influence of heat stress from an increase in temperature-related heat losses or amplifies the losses when the soil moisture decrease. Air temperature also has been shown to have similar influences on maize yield due to excess (deficit) soil moisture (Lobell et al., 2013). These imply that air temperature and soil moisture jointly impact maize yield.
However, predicting maize yield using the relationship between maize yield, air temperature, and soil moisture may not be the best choice. Rigden et al. (2020) found that the collective influence of soil moisture and VPD can accurately predict maize yield in the US, rather than soil moisture and air temperature, demonstrating that VPD is more suitable for maize yield prediction than the air temperature. This is because VPD, determined by air temperature and specific humidity, can reflect atmospheric demand more than air temperature (Hsiao et al., 2019). Thus, the effect of VPD on vegetation growth has received more attention. Meanwhile, Yuan et al. (2019) indicated that the negative contribution of VPD limits global terrestrial vegetation growth since 1998 even more than the CO 2 fertilization effects. Sulman et al. (2016) and Novick et al. (2016) pointed out that VPD substantially affects vegetation growth more than soil moisture. VPD affects vegetation growth because increasing VPD leads to an increase evapotranspiration (Li et al., 2011;Li et al., 2022). Consequently, vegetation prevents water loss by closing its stomata, thus limiting vegetation growth. Rigden et al. (2020) validated the accuracy of maize yield prediction based on soil moisture and VPD. However, the results of this predicted method have not been validated in China, where maize is widely grown. China has a vast territory with diverse topography and climate, and maize is grown in 90% of the provinces in the country at a preliminary estimate . As the largest grain crop, maize accounts for about 25% of grain sown area and 40% of grain output in China (NSBC, 2021). Meanwhile, China is an important maize producer in the world. Based on these aspects, we need to conduct studies on soil moisture and VPD to preferable realize the effect of atmospheric moisture requirement and supply on maize production in China.
There are three classic approaches to quantify the effects of climatic variables on crop production: (1) process-based biophysical crop simulation models (Whish et al., 2015;Chen et al., 2020), (2) field trials (Nandram et al., 2013;Feng et al., 2020), and (3) statistical models (Tao et al., 2012;Leng and Hall, 2020). Compared with field trial methods and process-based crop models, yield estimation methods by statistical regression are widely used and have a long history because of inexpensive cost and easy application (Feng et al., 2018). In addition, statistical regression models can explain specific correspondence between independent (precipitation, etc.) and dependent variables (maize yield, etc.), which can more directly and accurately understand and quantify the relationship between climate change and crop production (Peng et al., 2004). The linear mixed-effects model belonging to the statistical model not only eliminates the requirement for data independence and other requirements of traditional linear statistical models but also retains its normality assumptions, which greatly improves the applicability of linear models.
In recent decades, few studies have quantified the influence of climate variables on maize production at the grid scale in China due to the limitation of maize yield availability. Some previous studies were mainly carried out at the provincial level (Li et al., 2010;Chen et al., 2015). Other site scales were on the basis of major maizegrowing belts of China in Northeast China (Zhao et al., 2015;Zhao et al., 2016), Huang-Huaihai region (Liu et al., 2010), Southwest region (Li et al., 2014). However, the trend of climatic factors and warming affecting maize yields, especially in the future, are still unclear. This study attempts to analysis climate variability effect on maize production in the major maize-growing belts of China at the grid point level using high-resolution data with a linear mixed-effect model. We also estimate the optimal water balance of maize over the historical period. Then a polynomial model is used to estimate how maize yield will change under the different shared socioeconomic pathway (SSP) in the warming future, compared to the preindustrial period. This work is a reference for estimating the climatic variability of maize production in China.

Study area
The study focuses on mainland China, which is a well-known maize producer and ranks second in the world. Its land coverage is vast, and the terrain is complex and diverse, with geographical locations between 73°40′-135°20′E and 3°52′-53°33′N. The climate zones of China gradually transition from the southernmost subtropical monsoon to a temperate climate in the north. However, due to the wide area and diverse terrain, the actual climatic states (temperature, precipitation and etc.) vary greatly between regions (Liang et al., 2018;Yao et al., 2018). Under the promotion of climate and other factors, China's maize production has gradually formed three regional patterns with distinct characteristics after a long evolution process (Miao et al., 2014). These include the northeastern production area, the northern production area (Supplementary Figure S1), and the southwestern production area, which connect to a narrow maize growing belt Yang et al., 2017). The belt is the major maize-producing district in China and the section of interest here.

Maize yield data
The historical maize production data is derived from the Global Dataset of Historical Yields for major crops (GDHY), which integrates agricultural census statistics from the United Nations statistical database (FAOSTAT) and multiple satellite products. This global grid yield dataset with a spatial resolution of 0.5°contains four major crops -maize, rice, wheat and soybeans -from 1981 to 2016 (Iizumi and Sakai, 2020). Here, we use a calibrated version (v1.2 + v1.3) which had the data examined in agreement with other crop yield products (Ray et al., 2012) containing only from 1995 to 2005, but this data is also extended to 2016. Therefore, this historical yield data is of higher quality and spans a longer period to meet the needs of newer scientific studies. For this study, we treat this data as the actual yield value. The data was obtained from https://doi.org/10. 1594/PANGAEA.909132. Although the crop calendar for maize production in the dataset is divided into two seasons (major and second, the second mainly contains maize from eastern South Africa and central and southern Brazil), there is no data on maize production in the second season for our study area, which is the main maize producing area in China. Therefore, the variable we use for maize yield data is "maize major", which represents the maize production in the first season. This work used the global crop calendar to determine the growing season of the crops in China (Sacks et al., 2010). As this study is focused on the maize-producing areas of China, some pre-processing is required before this data can be used. The most important of these is the extraction of the Chinese maize-growing areas, and we show this part of the process in the methods part.

Climate data
The ERA5 dataset (Hersbach et al., 2020) we use in this study is from the European Centre for Medium-Range Weather Forecasts (ECMWF). The spatial resolution of the dataset is 0.25°× 0.25°with temporal coverage from 1959 to the present downloaded from https://www.ecmwf.int/en/forecasts/datasets/reanalysis-datasets/ era5. Evaluation studies have demonstrated its high skill in China (Hagan et al., 2019;Hersbach et al., 2020;Liu et al., 2022). ERA5 contains products with hourly and monthly time resolution. Since our study requires daily frequency data, we convert the hourly data to daily mean data. The meteorological variables used in this study are 2 m air temperature, 2 m maximum and minimum air temperature, 2 m dew point temperature, total precipitation, surface soil volumetric water content (top layer of soil moisture, 0-7 cm), and soil volumetric water content at a depth of the maize root system (third layer of soil moisture data, 28-100 cm). Since the main root water absorption of maize extends to less than 7 cm, to characterize the available water to the plants more accurately, we select soil moisture data of corresponding depth according to the distribution area of maize roots. The reanalysis data from 2010 to 2016 were interpolated into a regular cell of 0.5°u sing the nearest neighbor method to be consistent with yield data.
The daily vapor pressure deficit (VPD) was computed using the dew point, maximum and minimum temperature variables of 2 m per day.

CMIP6 data
The climate data for the pre-industrial and future periods used in this study are derived from CMIP6 of the World Climate Research Programme (WCRP) (https://esgf-node.llnl.gov/projects/cmip6/). The simulation of CMIP6 is from 1950 to 2100 and contains simulations of historical periods  and predictions of future periods (2015-2100). CMIP6 data approves 23 sub-program experiments, including a newly additional estimation project, the Scenario Model Intercomparison Project (ScenarioMIP). ScenarioMIP's eight sets of scenario experiments for future periods contain two levels (Tier-1 and Tier-2) on the basis of priority. The first level of experiments is the central experiment, including four scenarios for SSP1-2.6, SSP2-4.5, SSP3-7.0, and SSP5-8.5 (Zhang et al., 2019). The first scenario, SSP1-2.6, represents a sustainable development scenario that is more stable and has lower pressure and radiative forcing (van Vuuren et al., 2017). Both stability and radiative forcing in the SSP2-4.5 scenario are moderate. SSP3-7.0 is a new radiative forcing scenario that has high challenges to mitigation and adaptation (Fujimori et al., 2017). And SSP5-8.5 is a rapidly developing high-emission shared socio-economic pathway based on high fossil fuel consumption. To more accurately analyze the climate-yield relationship in future climates, we use a collection of multiple model ensembles and scenarios to study.
Given the availability of the data, we select nine models covering the meteorological elements of the main climate scenarios for the future period (soil moisture content, 2 m temperature, and nearsurface relative humidity), along with the temperature and relative humidity data for the preindustrial period. Information on the names, research institutions, and spatial resolutions of each mode can be found in Supplementary Table S1. The simulated data needs to be re-gridded to a resolution of 0.5°to ensure consistency with the reanalysis dataset and yield data. In addition, to reduce the error between the model-simulated data and the reanalysis data, we also corrected the model-simulated data. Data for future scenarios for all models cover the period 2041-2100. In calculating the future maize yield response to elevated VPD, we counted the number of days of exposure to specific VPD and SM r for each model under different SSPs, and then averaged the number of days across all models. The ensemble-mean number of days of exposure was input into the model to obtain the yield response. However, in calculating the future yield response to temperature increase, this study first calculated the time period of the particular temperature increase for each model. The mean is the height of the bar in Figure 8 and the error bar indicates the difference between each model.

Data pre-processing
Maize is grown in about 31 provinces in China, due to the suitability of climatic conditions, the districts with the largest planting area are mainly concentrated in the maize belt from the northeast to the southwest, which is the main maize producing area in China Yu, 2014;Yang et al., 2017). We first select the grid points where the maize yield of the GDHY dataset has values from 1982 to 2016 to Frontiers in Environmental Science frontiersin.org determine the main maize-growing areas in China during these 35 years.
Our study area is planted with spring maize and summer maize with different growing seasons. Northeastern China is mainly dominated by spring maize, and the northern region is dominated by summer maize, while spring and summer maize coexist in the southwest mountainous areas. For the general applicability of the findings, the study uses meteorological variables from April to September, which includes the growing seasons of spring and summer maize (Huai et al., 2020;Luo et al., 2020;Guo et al., 2021;Zheng et al., 2022). The daily VPD in the study area is calculated using temperature data from the ERA5 dataset and CMIP6. VPD is the saturated water vapor pressure minus the actual water vapor pressure. The daily average saturated water vapor pressure can be estimated using the mean of saturated vapor pressure at the maximum and minimum temperature or directly using the mean temperature (Allen et al., 1998;Allen et al., 2000). The actual water vapor pressure can be calculated by various approaches using one or more of the following parameters: dew point temperature, relative humidity, and air temperature (Allen et al., 1998;Upreti and Ojha, 2017). In this study, the observed VPD for the historical period in the study area is calculated using dew point temperature and maximum and minimum temperatures. The distribution of daily VPD and soil moisture in the maize root zone is shown in Figure 1. Due to the data availability, the simulated VPD for the future period is calculated from relative humidity and air temperature. We expect some deviations between the climate models and the actual situation in our study area due to the low resolution of these models, the inter-model resolution differences, and different model sensitivities and parameterization schemes (Zhang et al., 2016;Chen et al., 2020;Richter and Tokinaga, 2020). To reduce these uncertainties, before using these data, we carried out bias corrections and spatial downscaling based on station observations from historical periods in China (Kazmi et al., 2015;Jiang et al., 2022;Mondal et al., 2022). The method used for bias correction is the Equidistant Cumulative Distribution Function (EDCDF) (Li et al., 2010), which is a mapping method based on the probability distribution of data. It is a mapping method used to adjust the distribution of simulated data sets and is the most commonly used bias correction method for model (ESM) output. EDCDF considers historical and future period variables using quantile-based cumulative distribution function (CDF) mapping. EDCDF takes into account the CDF of historical and future periods differences between them, and it assumes that the differences between the observations referenced during training and the model output remain during calibration. This differs from a simple CDF, where the relationship between the two time periods remains unchanged.
x correct is the bias between model outputs and observations; F is the value of CDF, F −1 denotes the reciprocal of F; oc and mc represent the observations and model outputs for the historical training period, and ms is the corrected model result. After correction, spatial resolution downscaling is required because the ESMs are inconsistent and rough in spatial resolution. The statistical downscaling was performed by spatial disaggregation (SD) methods (Wilby and Wigley, 1997;Wood et al., 2002). First, we applied inverse distance weighting (IDW) interpolation to change the coarse resolution of the ESMs based on a monthly spatially resolution of 0.5 observed climate dataset. Then the anomaly values of the climate variables of the ESMs were differenced to 0.5°× 0.5°resolution. Finally, the anomaly field based on the obtained interpolated values is used to obtain downscaled ESM simulation data.
The future soil moisture in the maize root-zone used in the study is extracted from the representative soil layer thickness for nonuniform thickness layers of CMIP6 using the weighted average method following Wang et al. (2015). And the relevant layer number information of the soil layer moisture content of each model used is listed in Supplementary Table S1. It must be noted that the unit of soil moisture for the CMIP6 is kg/cm 2 , which is inconsistent with the unit of ERA5, so we converted all the model units to be consistent with ERA5 soil moisture. The conversion is done following Zhu and Shi. (2014).
In Eq. 2, S t is the depth of each soil layer in millimeters. SM v represents the moisture content of the volume unit of soil moisture. Figure 6

Construction of yield calculation model
We estimate the 2010-2016 maize yield anomaly concerning the yield data during 1982-2009. The anomaly is calculated using a linear regression fit of yield on each grid point from 1982 to 2009; the fitted predicted yield is then subtracted from the grid observation in 2010-2016. First, we construct yield prediction models based on individual variables for historical periods to analyze yield variation to climatic conditions, and the variables used are listed in Table 1. After screening and processing, 9,205 grid points-years are preserved, covering all grid points in China's major maize-growing regions from 2010-2016. To be able to model one item of each climate variable on yield, this study assumes that during the selected maize growing season (April-September, all national growing seasons of maize), the response of yield accumulates over time, and the yield is proportional to the total exposure time. Therefore, this response can be replaced cumulatively over time (Schlenker and Roberts, 2009). We model yield as a function of exposure events under specific climatic conditions to analyze the impact of climate on production in this work. We fit grid-level yield anomalies under different climatic conditions with a step function based on a linear mixed-effects model as follows In Eq. 3, ρ h represents the number of days in each interval of n different climatic conditions at each grid point, g. β h represents the sensitivity of the maize yield to various conditions in tons per hectare per day, remained invariant in grid points and years; f g represents a fixed effect on a grid and ϵ g shows the error. A linear mixed-effects model is used to fit β h and f g values with no global constant. To calculate the confidence intervals for correlation (Table 1), the study uses a non-parametric resampling method in which data from each grid point are re-extracted over 10,000 implements. There is a little unsymmetric in these configuration terms due to fixed effects (Slaets et al., 2017).
To clarify the effect of water availability on thermal stress (Figure 3), we applied three sets of VPD data extracted from various soil moisture states to Eq. 3: all grid-year data, the first 40% and bottom 40% of the soil moisture sequence. Finally, for a more visually clear effect, we used eight equally spaced climate variables to estimate variation for maize production, with the range being 0.1-1.67 kPa. The numerical interval of daily climate data from April to September is divided into five equal-width bins to contrast yield variations to various climate conditions. We divided the variables in these climate combinations equally into five boxes and modeled to assess the combined effects of thermal and water stress, which increased the amount of box (from 5 to 25) by squaring. When fitting Eq. 3, which defines bin edges based on the 1% and 99% quartiles, we use all 9,205 grid points-years for each climate variable for all days from April to September to estimate the yield response. Surface soil water content is taken in the range of 0.2-0.42 cm 3 cm −3 , the value interval for root-zone soil moisture is 0.2-0.42 cm 3 cm -3 , VPD ranges from 0.1 to 1.67 kPa, temperature ranges from 10°C to 36°C, and precipitation ranges from 0 to 40 mm. The data interval of VPD and SM r used is the range in the dashed black box shown in Figure 1. Identify the five boxes and their individual value intervals, and ensure that the amount of data falling within each interval is sufficient.
To investigate the interaction between heat and water stress, the data are grouped according to water stress (soil moisture or precipitation) and thermal stress (VPD or temperature), producing 25 combinations (e.g., 5 precipitation groupings (h) multiplied by 5 VPD groups (i)), the form of the model is shown below In this two-dimensional equation expressing the interaction, ρ h,i denotes the exposure matrix of climatic conditions and β h,i denotes the sensitivity matrix of yield, and their products are summed between the rows and columns. The range of values of climatic variables is consistent with the previous section Supplementary Figure S2. Provides a visualization of the methods, data and assumptions employed in this study.

Applying yield models to future
To infer the yield response to the environment in future climatic states based on the constructed yield predictive model for the historical period (Figure 7 and Supplementary Figure S3), we fit the polynomial to the sensitivity (β h ) of the yields of Eqs. 3, 4, which is its response. In detail, the β h (h = 5) of the VPD obtained from Eq. 3 are input into the one-dimensional quadratic polynomial for corresponding coefficient estimation.
In Eq. 5, a weight (W h ) term is added. We compare the total number of days (N h ) of each interval of 13169 grid-years counted to the sum of the observed days ( N h ) for each of the 9,205 grid points, i.e., W h N h / N h . To illustrate the interaction between VPD and root zone soil moisture over future periods, we fit the twodimensional quadratic polynomials of VPD and SM r to the 25 yield sensitivities of Eq. 4.
Similar to Eqs. 5, 6 also includes weights, but it is built on the days within the 25 combined intervals. The W h is very meaningful as they balance the effect of each interval on the results. Therefore, it prevents intervals with a low number of observations from being overweight in the fitted coefficients, which is why the coefficients of the combination of the two differ from the output of the twodimensional polynomial when both atmospheric water supply and demand are low. The fitting of this multinomial enables yield estimates for future periods beyond the time horizon of currently available observations (Figure 7). To describe the uncertainty of the fitting results of the polynomials, we also resampled 10,000 times the sensitivity β h of the polynomial fitting to Eqs. 3, 4 using the nonparametric bootstrap technique. The yield response of each grid point to increasing VPD is predicted, and the average production TABLE 1 Correlation between the estimated and actual yield anomalies. The correlation (r) between estimated and actual yield anomalies and the 90% confidence interval for the corresponding correlation (from n = 10,000 bootstrap realizations). The variables in the table are vapor pressure deficit (VPD), root-zone soil moisture (SM r ), maximum temperature (T), precipitation (P), and surface soil moisture (SM s ), all of which are daily data.  Figure 7). We explain the uncertainty of the calculated yield response by using the 10th-90th percentiles of the yield response series as the confidence interval for the response (shaded part of Figure 7). In parallel, we propose specific changes in maize yields in the Chinese region for a warm future, 1.5°C-4.0°C warmer than preindustrial temperatures, in steps of 0.5°C. Nine models from CMIP6 are used to project 6 warming scenarios (1.5°C, 2.0°C, 2.5°C, 3.0°C, 3.5°C, and 4.0°C above pre-industrial levels) and 4 SSPs: SSP1-2.6, SSP2-4.5, SSP3-7.0, and SSP5-8.5 for 24 scenarios of yield response. Keeping global warming less than 2.0°C warmer than pre-industrial levels is a very important global climate policy goal that promises to reduce catastrophic consequences (Vautard et al., 2014

Model validation
Numerous studies on rainfed agriculture indicate that temperature has a non-negligible effect on maize yield (Butler and Huybers, 2015), while water is an essential and important condition for vegetation growth and development. Heat stress is strongly correlated with yield reduction in maize (Rigden et al., 2020). However, the connection between water availability and yield variability has not been fully demonstrated. Therefore, we first use a yield estimation model based on a linear mixed effects model to identify the climatic variables characterizing temperature and moisture availability that impact maize yield most. Table 1 shows the results of the correlation coefficients between yield anomalies estimated using the model and the observed yield anomalies for the historical (2010-2016) period. We first calculate the correlation between the model outputs and the actual results for the single variable characterizing moisture or temperature that impacted yield. The calculations show that the output results of the model inputs for soil moisture in the maize root zone performed best on average, with values as high as 0.8 (Confidence Interval (CI): 0.78-0.81). The results of soil moisture data at surface depths of 1-7 cm are only 0.1 lower than the results of root-zone soil moisture (CI: 0.77-0.81). This shows the impact of available water since soil moisture is also a variable that characterizes the water required for plant growth and development. The calculation results of precipitation are not entirely satisfactory. The average correlation coefficient is only 0.66, and the maximum value of the CI is only 0.70, which is lower than the CI of soil moisture in the root zone (0.78). This may be because precipitation contains changes in runoff, drainage, and evaporation (Basso and Ritchie, 2014), and the changes in precipitation are uneven (Fezzi and Bateman, 2015); as a result, the amount of water available to plants cannot be well determined. Rainfall is not a good representation of the water available to plants. On the other hand, in the variables representing temperature, the mean value of the model output of VPD (0.74) is better than that of the 2 m air temperature (0.56).
Subsequently, to investigate the effect of thermal and water stress on yield, we selected one of the variables that characterized temperature and water and substituted them into a two-dimensional model representing the interaction (Eq. 4). The results indicate that the hydrothermal combination of VPD and soil moisture in the root zone form the most suitable climatic combination of maize yield. The correlation coefficient between yield calculated based on this combination and actual yield reaches up to 0.85, with a mean value of 0.83. This is followed by the combination of temperature and root zone soil moisture, which correlated up to 0.82. The lowest correlation coefficient is obtained from the combination of precipitation and temperature (CI: 0.74-0.77, mean: 0.75), which is lower than the univariate results for surface soil moisture (CI: 0.77-0.81, mean: 0.79). Overall, the results for soil moisture (top soil moisture and root zone soil moisture) are better than precipitation for the single elements related to maize yield, and root-zone soil moisture yields better results than topsoil moisture. The comparison between the univariate and bivariate results shows that the latter results are overall significantly better than the former. The univariate best result (SM r ) ranks third in all results. It is noteworthy that the three sets of results with the highest correlations all include root-zone soil moisture, reflecting that root-zone soil moisture strongly influences yield, which is the meteorological variable representing the available water of maize over longer time scales.

FIGURE 2
Scattered density of yield anomalies between estimated and actual yield anomalies. We drew the scatter density plot of estimated and actual yield anomalies and calculated the fitting coefficient. R is 0.825, MAE is 0.38, RMSE is 0.48, and the Bias is −0.02. All deviations are measured in tons per hectare.

Frontiers in Environmental Science
frontiersin.org Next, the optimal results are selected based on Table 1 for verification. Our validation includes data distribution density and spatial distribution validation. The results of the scattered density of the estimated and the observed yield anomaly are shown in Figure 2. It shows the scatter density distribution of the estimated historical yield anomalies and the observed yield anomalies for 9,205 grid years from 2010 to 2016. The denser the data distribution, the warmer the colors (blue to red). The correlation coefficient of the scatter density fitting of the predicted and observed yield anomalies is 0.82, reflecting the high consistency between the predicted and observed yield anomalies. The overall MAE, RMSE, and Bias representing error are 0.380, 0.480, and −0.020, respectively, indicating subtle differences. The results of the spatial distribution verification are shown in Figure 3, where Figure 3A indicates the spatial distribution of the annual mean of the observed yield anomalies over the grid points from 2010 to 2016 and Figure 3B shows the spatial distribution of the estimated yield anomalies output by the model. By comparing the spatial distributions and values of the two, we find that the modes of the spatial distribution of the observed and estimated yield anomalies are very similar, but there are moderate differences. For example, the yield anomalies in the maize-growing areas in the southwest mountains of China are negative. It indicates that the actual maize yield in the southwest mountains has not reached the expected mean of linear trend fitting for 2010-2016, while positive values dominate the northeast region of China. It has been confirmed that maize in Northeastern China is rain-fed and susceptible to interannual meteorological conditions, with high yield volatility but an overall upward trend, while parts of Southwestern and Eastern China have shown a downward trend in recent years. Based on the above phenomena, maize production and cultivation's national center of gravity gradually moves to the northeast (Xu et al., 2013). The results shown in the Figure are consistent with the above facts, which also illustrate the robustness of the yield estimation model based on the linear mixed-effects model.

Impact of combined SM-VPD on historical maize yield in China.
The validation results of the maize yield model in the former part clearly show that the yield model can be well adapted to the study. A previous finding suggests that thermal stress is often accompanied by water stress and that the two are irreplaceable and not interchangeable. However, since there are still uncertainties concerning the role of moisture in regulating the loss of maize yield due to heat stress, we first conduct a simple analysis of the yield response to heat stress and how it depends on soil moisture. We use the model in Eq. 3 to estimate the yield sensitivity of VPD to show the impact of daily accumulated atmospheric moisture demand on yield (Butler and Huybers, 2013). Based on the different average soil moisture states during the maize growing season, we extracted data for the driest and wettest 40% of soil water content in the study area from April to September and fed them into the model separately. For comparison, we put in all grid-year data at the same time. Daily mean VPD rather than daily mean air temperature was used as an input to the model because VPD was shown to be more closely related to water stress to predict better yield loss (Hsiao et al., 2019). When comparing yield sensitivity results from three sets of data (Figure 4), it was found that high VPD was not consistently destructive. Only in wetter years, a low VPD (<0.6 kPa) was detrimental to maize yields, and when VPD was higher than 1.3 kPa, it would reduce maize yields in drier years. The 95% confidence intervals for yield sensitivity were also different at low and high VPD. The above analysis shows that if the state of different soil moisture is mixed together, it is easy to get misleading results, and it will be mistakenly believed that higher or lower VPD is not conducive to maize yield.   Figure 5A is 0.1-0.414 kPa, and so on. Although the yield contribution to 2010-2016 is negative for the entire Chinese maize growing region in Figure 5A (VPD: 0.1-0.414 kPa), a positive contribution gradually emerges with increasing VPD. When VPD increases to 0.728-1.042 ( Figure 5C), all of northeastern and northern China have changed to meaningful positive contributions to yield, with changes up to 1.21 t ha-1. Negative values then appear again as the numerical contribution of VPD increases, but when atmospheric water demand is highest ( Figure 5E), the overall negative impact in China is smaller than in Figure 5A, and positive impacts still dominate northeastern China. Similarly, Figures 5F-J shows the contribution of soil moisture to yield by equally dividing the interval 0.2-0.42 cm 3 cm −3 at 0.044 cm 3 cm −3 intervals. Although the trend in the contribution of soil moisture to yield in China from 2010 to 2016 is similar to that of VPD, the negative impact of relatively high soil moisture on yield is greater. In general, there is a trend of the negative-positive-negative contribution of VPD and SM r to maize yield, along with increasing the range of values taken. Moreover, the negative effect is more dominant in southwestern China for both Daily sensitivity of maize yields to VPD under different soil moisture conditions. Eq. 3 estimates the response of the driest 40% (red), the wettest 40% (blue), and all grid-years (black) to the daily yield anomaly of vapor pressure deficit (VPD). A constant width interval from 0.4 kPa to 2 kPa is specified, and the center of the interval is marked on the horizontal axis. The size of the circles in the plot represents the percentage distribution of data that belongs to the corresponding interval in each set of data, and the vertical bar represents the 95% confidence interval corresponding to each sensitivity.

FIGURE 5
Contributions of VPD and SM r with different intensities to maize yield in 2010-2016. (A-E), the VPD contribution to maize yield was divided into five intervals from 0.1 kPa to 1.67 kPa (F-J), same as in a-e, but driven by the root-zone soil moisture of maize (SM r ) from 0.2 cm 3 cm -3 to 0.42 cm 3 cm −3 .

Frontiers in Environmental Science
frontiersin.org VPD and root-zone soil moisture, which is consistent with the shift of maize center of gravity to the northeast, as analyzed in the previous section. The yield response to VPD is shown here to be affected by soil water content, suggesting that both conditions should be considered when constructing yield models. We construct the model assuming that the yield response to climatic variables accumulates daily within each growing season. Based on this, a two-dimensional correlation model is built to infer VPD and soil moisture interactions ( Figure 6). As previously described, the method is suitable for this study, in which yield is gradually accumulated according to changes in daily exposure to VPD and soil moisture. Applying the model Eq. 4 to 9,205 grid-year yields from 2010 to 2016 gives a yield-sensitivity model of the interaction between VPD and root-zone soil moisture ( Figure 6). Yield sensitivity indicates that a supersaturated state of soil water content reduced yield by 0.011 t d −1 ha -1 , while a dry state with high VPDs resulted in a yield loss of 0.006 t d −1 ha −1 . The yield loss estimate for wet conditions is the average of the four negatively affected squares at the bottom right of the diagonal line in Figure 6. Similarly, the loss value for dry and high VPD conditions is the average of the three squares at the top left of the diagonal line.

Future yield prediction based on VPD and soil moisture
The response of yield to the increase in VPD is closely related to the soil moisture in the root zone. However, it remains unclear how changes in the balance between the demand and supply of moisture would impact future yield. Additionally, there is no guarantee that the past covariance between increased atmospheric demand and lower moisture supply will remain consistent in future climate change. Relative to the preindustrial (1850-1900) climatology in the 9 CMIP6 models, the vapor pressure of maize-growing regions in China could increase by up to 0.5 kPa in 2041-2100. The increase in VPD indicates that the temperature and saturated vapor pressure will increase in the future climate state. Still, since the relative humidity change is not apparent, the change in actual vapor pressure can also be ignored (Ficklin and Novick, 2017). The increase in VPD leads to a decrease in soil moisture content, and we have estimated here that the changes in VPD and soil moisture in the 9 models of CMIP6 are inversely correlated (r = −0.33). We analyze the average performance of VPD-based 1D and VPD-SM r -based on 2D polynomials in the SSP1-2.6, SSP2-4.5, SSP3-7.0, and SSP5-8.5 scenarios for nine models (Figure 7). The dotted lines in the Figure 7 represent the one-dimensional model, while the solid lines represent the participation of soil moisture. When the VPD increases by 0.5kPa, the model using the combination of rootzone soil moisture and VPD Eq. 6 shows that the SSP5-8.5 scenario will reduce the yield of about 276.1718 million tons per season, which is about 1.17 times the average yield in 2010-2016 (236.9214 million tons). If only the VPD-based one-dimensional model is used to estimate yields for future periods, the SSP1-2.6 scenario shows increased yield loss by less than twice the former. In contrast, it will increase the yield loss estimate by about 2.7 times in the SSP5-8.5 scenario. We also roughly estimate the modalities of the optimal water balance state in favor of maize under four different future scenarios (Supplementary Figure S3). After comparison, we find that with the gradual increase of the emission scenario, the optimal water

FIGURE 6
The response of maize yields to moisture supply and demand in 2010-2016. Based on observational data, we estimated the maize yield responses to VPD and root-zone soil moisture. The color of each rectangle shows the estimated daily maize yield sensitivity (i.e., β h factor) and the radius of the square indicates the relative amounts of days falling in each value interval. The largest icon in the figure means 8.21% of the used data. The black dots in squares indicate significant fitting coefficients (p < 0.05). The surface of the interpolated 25 coefficients is drawn behind the square.

Frontiers in Environmental Science
frontiersin.org balance conditions for maize will gradually change relative to 2010-2016, and the scope tends to shrink. Under the projected warming of the climate, we calculate the specific future yield response of maize under 6 different warming conditions (1.5, 2.0, 2.5, 3.0, 3.5, 4.0°C) (Figure 8). In models where the warming degree of a certain emission scenario does not meet our calculation standard, we ignore it and average the remaining models. The highest and lowest ends of the error bars represent the difference in results for different modes, and in some cases, only one mode exists, so some columns do not have error bars (e.g., SSP1-2.6 at 1.5°C increase). We have listed the specific values of the average yield response of the 9 models for all 24 different emission paths or different warming states in Supplementary Table S2. When the future temperature increases by 4.0°C relative to the reference period, scenarios from SSP1-2.6 to SSP5-8.5 reduce production by about 0.0690, 1.5705, 2.1634, and 6.2310 million tons per season, respectively. That is about 0.003%, 0.663%, 0.913% (1/109) and 2.63% (1/38) of the average production in 2010-2016 (236.9214 million tons). Overall, maize yield loss gradually increases with rising temperature under either SSP scenario, although SSPs 1-2.6 are positive yield responses until temperature increases reach 4°C.

Discussion
The impact of climate change on agricultural production remains difficult to control, with the Fourth Assessment IPCC Report (Easterling et al., 2007) noting that moderate warming of the climate would benefit maize yields, while the Fifth Assessment Maize yield response to VPD and SM r in different SSPs. Increasing the observed VPD and using the fitted polynomial function of the VPD (dashed line) and the fitted polynomial function of VPD and SM r (active line) to predict the future response of maize yield to VPD elevation in China. The shaded confidence interval represents a significant coefficient of fit (p < 0.05) from the distribution of the mean yield estimated from the bootstrap data from the grid (n = 10,000).

FIGURE 8
Yield response under different temperature-rising conditions in different SSPs. We predict yield responses under different temperature increases in different social sharing economy pathways using a fitted polynomial surface for VPD and SM r . The error bar indicates the difference between the models.

Frontiers in Environmental Science
frontiersin.org Report (Porter et al., 2014) and the most recent the Sixth Assessment Report (Bezner et al., 2022) indicated yield losses. Accurately predicting future production is quite challenging. This uncertainty may be due to the fragmentation of existing climateagriculture studies. The nuances of the constructed models will also impact the experimental results. If the key variables relating to yield are omitted from the model, it can easily cause biases in the results. It is well known that moisture and thermal stress are important variables affecting yield. Many studies on rainfed agriculture have shown that heat stress is associated with reduced yields (Schlenker and Roberts, 2009;Butler and Huybers, 2013;Lobell et al., 2013;Porter et al., 2014;Butler and Huybers, 2015;Zhao et al., 2017;Rigden et al., 2020). Although water is clearly essential for crop growth and development, the relationship between water effectiveness and yield is still not well resolved. Many studies using precipitation to indicate water availability have not yielded good results (Schlenker and Roberts, 2009;Butler and Huybers, 2013;Lobell et al., 2013). Soil moisture in the root zone indicates better water stress that affects plant water use. VPD is not only an indicator of air dryness and temperature but also accurately captures moisture stress in the environment.
Our study uses fine-scale meteorological data and combines it with a flexible linear mixed-effects model linking yield to weather variability in each season, considering the specification of fixed effects for grids. We calculate the sensitivity of maize yield to VPD under different soil moisture conditions in historical periods based on a mixed linear effects model (Figure 4), the results of which demonstrate the sensitivity of maize yield to VPD in the form of an approximate quadratic curve. The model was validated to be highly accurate (Figures 2, 5). This study therefore began by using a quadratic polynomial to predict the effect of VPD on maize yield in future periods. To analyse the effect of water stress on yield, we therefore modified the polynomial into a binary quadratic polynomial. We retained only the more significant results for scientific certainty (Figure 7).The focus of this study was to analyse the effect of soil moisture and VPD on maize yield during the historical period, which has been demonstrated in the previous paper with high accuracy of results. Then we want to study and quantify the effect of this effect on yield in the future period for the purpose of yield prediction. We demonstrate that a VPD and soil moisture imbalance reduces the daily yield response. The balance between soil moisture and atmospheric demand in the root zone provides optimal water balance conditions for maize yield, corresponding to the higher positive yield sensitivity in Figure 6, a phenomenon consistent with the true physiological state of maize (Kramer and Boyer, 1995). When the water supply exceeds the demand or is lower than the conditions required for the growth and development of maize, it will cause a loss of yield. The yield decline shown in Figure 6 under drier soil conditions and higher atmospheric moisture requirements may be caused by insufficient raw materials for plant photosynthesis. Specifically, maize closes its stomata to prevent its water loss due to excessive transpiration under such conditions (Bennett et al., 1987), and also to limit the uptake of CO 2 from the atmosphere, thereby impeding photosynthesis (Farquhar and Sharkey, 1982). When the soil moisture is sufficient, and the atmospheric moisture demand is low, the negative response of the yield also indicates that the photosynthetic capacity of maize is decreasing. Numerous studies have shown that photosynthesis is sensitive to water stress and that the photosynthetic rate decreases with increasing water stress (Boyer, 1970;Ephrath and Hesketh, 1991;Dai et al., 1995). Water stress can lead to a decrease in chlorophyll content and even irreversible damage to chlorophyll if the water stress is too high (Zhao et al., 2003). The activity of the key enzyme (Bubisco), which fixes CO 2 in photosynthesis, will also decrease. If the soil is oversaturated with water, it will lead to hypoxia of plant roots, reduce the absorption of nutrients, contact and increase the exposure to toxic substances, and also lead to the proliferation of fungal pathogens and eventually lead to reduced yields (Harvell et al., 2002;Rosenzweig et al., 2002;Li et al., 2019). Eventually, when atmospheric water demand is low, often accompanied by poor light conditions, photosynthesis can also be limited. Within climate change conditions, VPD and root-zone soil moisture are still the main study variables in the future. The main reason is that they are among the most directly related environmental variables for plant growth and development (Hsiao et al., 2019). We hypothesize that the joint distributions of VPD and root-zone soil moisture are unchanged except for changes in the mean, while yield sensitivity to these variables also remained unchanged. While we believe that process-based crop models can also assess the problem of crop demand and supply of atmospheric water (Tao et al., 2009), the method used in this study uses fewer parameters and yet can accurately represent and analyze the response of maize-growing regions as an extension to increased confidence in assessing future climate impacts on major crops around the world.
There are some limitations to our study. First, spring and summer maize are two types. Due to the lack of data, the study chose the entire maize belt as the study area to analyze the impact of climate change on maize yields throughout China. Second, our basic statistical approach does not quantify the effect of CO 2 as a separate variable, but its effect is considered laterally in the study of atmospheric moisture demand and supply. Carbon dioxide is the basic raw material for plant photosynthesis, so its level affects maize growth and yield. The increase in global ambient CO 2 concentration in recent decades and the change in the correspondence between it and water are likely to change the sensitivity (Rigden et al., 2020). It is still controversial that higher CO 2 concentrations have the potential to increase yields (Lobell et al., 2013), which will need to be explored in future studies. Finally, our basic statistical approach cannot account for and quantify the effects of CO 2 in regression analyses of observed yields because CO 2 concentrations exist in concentration gradients throughout the atmosphere that flow endlessly, leaving only a slowly increasing time trend. Although crop growers have been adapting to climate change by introducing new technologies, the growth of global yields of major crops, including maize, has been gradually slowing or even remaining constant (Brisson et al., 2010;Ray et al., 2012). It is worth noting that the negative impact of global warming on yields, in addition to increasing VPD and reducing soil moisture, may also affect yields by shortening the phenological period of crops (Porter et al., 2014). For the purpose of the science and integrity of the study, we will later consider the effects of other natural elements (wind speed, solar radiation, etc.) and human influences (maize seed, management practices, etc.) on yield.
Since soil moisture plays an important role in regulating yield losses due to heat stress (Rigden et al., 2020), this study also implies Frontiers in Environmental Science frontiersin.org that optimizing and adjusting management programs for water balance in the crop root zone to maintain higher soil moisture content is particularly important to increase yield sensitivity to the environment. Increasing management intensity and better crop varieties may also inadvertently increase yield sensitivity to weather (Lobell et al., 2014). Therefore, there is an urgent need to focus on national research and extension programs to promote production, including through crop-region-specific adaptation strategies, to offset the negative impact of future climate change on yields and ensure food security for a growing world population.

Conclusion
We first used grid-level maize yield data from GDHY v3.0 from 2010 to 2016, combined with high-resolution ERA 5 reanalysis meteorological data, to develop a yield estimation model based on a linear mixed-effect statistical model to favor the optimal water balance conditions for maize yield in China. Our findings showed that historical yield estimation derived from the combination between VPD and maize root-zone soil moisture (SM r ) were obviously better than those from other fundamental variables alone or together. Only when the demand and supply of atmospheric moisture are relatively balanced can there be benefits to maize yields. If either side is too extreme, it can damage maize yields. Furthermore, soil moisture can regulate yield losses caused by thermal stress. We also extrapolated the above model results into the future using data from 9 global climate models from CMIP6, using a two-dimensional polynomial built on VPD and SM r to explore changes in maize yield under climatic conditions in future periods. We also estimated changes in maize under different temperatures (1.5, 2.0, 2.5, 3.0, 3.5, and 4.0°C) scenarios relative to the base period . We found that considering soil moisture in the future yield estimation reduces the overestimated yield loss by about a factor of two, compared to considering only atmospheric moisture requirements in SSP 1-2.6. This can even be reduced by a factor of 1.7 in SSP5-8.5. At the same time, under the different SSP scenarios, the reduction of maize yield gradually increased with the temperature rising. Moreover, for the same temperature increases, the yield loss also increases gradually with increasing emission concentration. The implications of these findings are vital for present and future food security purposes.

Data availability statement
The original contributions presented in the study are included in the article/Supplementary Material; further inquiries can be directed to the corresponding author.