Seasonal Predictability of Four Major Crop Yields Worldwide by a Hybrid System of Dynamical Climate Prediction and Eco-Physiological Crop-Growth Simulation

The purpose of this study was to evaluate the prediction accuracy of a newly developed crop yield prediction system, composed of a dynamical seasonal climate prediction model (SINTEX-F2) and an eco-physiological process-based crop growth model (PRYSBI2). We explored the 3-months lead prediction accuracy of year-to-year variations in yield of four major crops (maize, rice, wheat, and soybean) in global regions and evaluated for which crops and in which areas the system performs well. The results indicated the system is more accurate for wheat relative to the other crops. Also, we found that different strategies would be useful in improving the system, depending on the crop. For winter wheat and rice, we need to improve the temperature predictions, particularly over the mid-latitudes, whereas improving rainfall predictions was more important for maize. For spring wheat and soybeans, the crop growth simulation itself should be improved. Although this study is only a first step, we believe that additional efforts to improve the system by understanding and incorporating processes of climate and crop growth will potentially provide useful prediction information to big stakeholders like global agribusiness companies and countries for improving food security in regions where crop yield is vulnerable to extreme climate shocks and where food markets are isolated from international trade.


INTRODUCTION
Food production is highly sensitive to seasonal climate variability throughout the world (Iizumi et al., 2013b(Iizumi et al., , 2014a(Iizumi et al., , 2018bYuan and Yamagata, 2015;Oettli et al., 2018). Therefore, seasonal climate predictions linked to a crop simulation model can provide potentially useful information for stakeholders to reduce risks related to crop failure (Hansen and Indeje, 2004;Crane et al., 2010;Hayashi et al., 2018;Rodriguez et al., 2018). For example, an early warning of an abnormal seasonal climate-induced crop failure could serve as useful information for better management of national food balance by imports, insurance systems, and stabilization of commodity markets (FAO, 2016). The recent development of improved numerical seasonal climate prediction technologies has enhanced the possibility of predicting crop yields several months prior to harvest, but there is still room for improvement.
One possible approach to improving the prediction system is to better understand processes for abnormal climate events and their impacts on crop growth. Simple statistical methods are not suitable for this purpose because they implicitly link output (crop yields) and input (climate prediction) using available data (e.g., Lobell and Asseng, 2017). Therefore, developing hybrid systems of dynamical seasonal prediction systems and process-based crop growth models is a necessary step for potentially improving seasonal predictions of crop yields. Previous studies have evaluated the prediction accuracy and usefulness of dynamical seasonal prediction systems linked with process-based crop growth models. Brown et al. (2018) showed the benefits of using a climate model to predict wheat yield in the Australian cropping zone. Rodriguez et al. (2018), focusing on sorghum yield in Australia, showed that a dynamical seasonal prediction system linked with a crop simulation model can be used to inform optimum crop designs to increase farmers' profits and reduce risks. However, there still has not been sufficient discussion about which crops and areas have a high predictive ability on a global scale.
Our aim was to develop a scheme for forecasting worldwide crop yield variations based on seasonal climate predictions. We developed a hybrid system of a dynamical seasonal prediction model and an eco-physiological process-based crop growth model for four major crops (maize, rice, wheat, and soybean) and investigated its prediction accuracy in global regions. We chose these four crops because together they account for about two-thirds of the world's food calories and are crucial as agricultural commodities in international trade.

Seasonal Climate Prediction System
The SINTEX-F2 seasonal prediction system (Doi et al., 2016) was used. It is based on the fully coupled ocean-atmosphereland-sea-ice SINTEX-F2 climate model (Masson et al., 2012;Sasaki et al., 2013) and is an upgraded high-resolution version of the earlier SINTEX-F1 system (Luo et al., 2005). The atmospheric component has a horizontal resolution of 1.125 • (T106). System details and an overview are given in Doi et al. (2016). This system has been used to make accurate predictions of El Niño/Southern Oscillation (ENSO), the Indian Ocean Dipole (IOD), and associated seasonal climate variations (Doi et al., 2016Ratnam et al., 2017Ratnam et al., , 2018Ratnam et al., , 2019. In consideration of uncertainties of both initial conditions and model physics, the system has six ensemble members. Here, we used retrospective forecast experiments with a 3months lead period beginning on the first day of every month in 2000-2010.

Eco-Physiological Process-Based Crop-Growth Simulation Model
PRYSBI2, an eco-physiological process-based crop growth simulation model, was used. It can successfully describe the response of crop growth and yield to abnormal weather and climate in global regions at a large scale, taking soybean as the example (Sakurai et al., 2014;Müller et al., 2017). The model is sufficiently complex to describe important factors, such as the enzyme kinetics in photosynthesis, to simulate yield response to abnormal weather and climate conditions. However, it is simplified as much as possible to enable us to do the huge number of calculations (see the Supplementary Material) needed because we used a Bayesian statistical approach with a Markov-Chain Monte Carlo (MCMC) method based on field observations and statistical yield data to estimate the parameters of the process-based model. The horizontal resolution was 1.125 • (T106) to match the outputs of the climate prediction system.
The original model is a global gridded crop model (Müller et al., 2017) in which the representative value of crop yields of a target area is estimated. Our crop model simulates representative crop yield in each 1.125 • grid cell using daily meteorological data. The sowing date of Sacks et al. (2010) is used, and the harvest date is determined by accumulated temperature from sowing date (Sakurai et al., 2014). The parameter value relevant to harvesting day is also estimated using the data of Sacks et al. (2010) for each grid cell. More details and an in-depth overview of the original model are presented in Sakurai et al. (2014) and Okada et al. (2015). A description of the updated model used in this study is available in the Supplementary Material.

Reforecast Experiment
For the retrospective forecast (re-forecast) experiments, the daily outputs from the SINTEX-F2 seasonal prediction system (shortwave radiation, precipitation, relative humidity, wind speed, and daily average, maximum, and minimum temperature) were used as the climate inputs to the crop model after bias correction. In the bias correction, the daily climatology for 365 days for the period 2000-2010 was computed using a global retrospective meteorological forcing dataset (Iizumi et al., 2013a) with a spatial resolution of 1.125 • . The daily climatology for the same 11-years period was then derived using the SINTEX-F2 outputs, and the climate-model data were adjusted to have the same climatology as the forcing dataset. The day-to-day deviation in a climatic variable relative to the climate-model climatology was maintained, although the systematic error in the climatemodel climatology was removed. This procedure was conducted for each crop, 1.125 • grid cell, and six climate-model ensemble member. We also used the retrospective meteorological forcing dataset (Iizumi et al., 2013a) to estimate past crop yields to evaluate the upper limitation of the prediction accuracy of the crop model itself.

Prediction Accuracy Metrics
Country-level yield data from the Statistical Division of the UN Food and Agriculture Organization (FAOSTAT) (see www. fao.org/faostat/en/#home) were used for verification of global average yield in 2000-2010. Global average yield data were calculated from country-level yield and harvested area data. We also used grid-cell crop yield datasets in the years of 2000-2006 by Iizumi et al. (2014b), which are 1.125 • grid-scale yield data estimated using satellite data (NOAA/AVHRR), and countrylevel yield data by FAO (Iizumi et al., 2013b). Although data in the grid-scale dataset are estimates and not ground truth, they have been shown to be in good agreement with actual yield data when the dataset was compared to other yield datasets solely based on national and subnational yield statistics (Iizumi et al., 2018a).
The six-member ensemble mean of the yield and climate simulations was used to calculate anomaly correlation coefficients (ACCs) as a deterministic accuracy score. ACCs were calculated as follows. (2) crop yield variability tends to change according to the average yield (Hawkins et al., 2013). Although this derivation of ACC is slightly different from that generally used, the derived value has the same characteristic-it increases as the prediction improves, with an upper limit of 1. When calculating the predicted global average yield, we calculated the average yield weighted by harvested area (Monfreda et al., 2008) for each crop.
We also calculated the root-mean-square error (RMSE) as follows.
Although the RMSE formula is different from the classic formula, the derived value has the same characteristic-it decreases as the prediction improves.

Comparison of the Error Contributions
We conducted crop-growth simulations forced by the biascorrected re-analysis meteorological forcing dataset (Iizumi et al., 2014a) to compare the errors generated by the re-analysis with those simulated with the seasonal climate prediction. By calculating the difference in the RMSE between the crop yield predictions of the two analyses, we roughly estimated the degree of the contribution of the error of the seasonal climate prediction to the RMSE of the crop yield prediction for each grid cell.

RESULTS
Prediction Accuracy for Global Average Figure 1 shows the time series of year-to-year variations of historical global mean yield of four major crops. Maize yield had the largest year-to-year variation in 2003-2004 ( Figure 1A). Unfortunately, the prediction system cannot capture it, and the ACC is not statistically significant for maize. Because the ensemble spread of the prediction is much larger relative to the standard deviation of the year-to-year variation, the uncertainty is large. On the other hand, the historical rice yield was rather stable and did not have large year-to-year variations ( Figure 1B).
The prediction system is not accurate in predicting these small variations in the annual global mean rice yield (ACC = 0.10), and the uncertainty is large. The predictions for wheat are relatively better than those for the other crops (Figures 1C,D). As shown in the relatively small ensemble spreads for winter/spring wheat, the uncertainties are relatively small. The ACC values are 0.54 for winter wheat and 0.38 for spring wheat. The former value is statistically significant at 85% and very close to the 90% significant level, although the latter value is not significant at 90% because of the small sample size (n = 10). Soybean prediction is the most challenging of the four crops. The ACC score of the global mean yield is even negative, and the ensemble spread is the largest among the crops considered here ( Figure 1E).

Horizontal Distribution of Prediction Accuracy
Horizontal distributions of the ACC scores are shown in Figure 2. Note that an ACC value of >0.669 is necessary to conclude that our yield prediction is accurate when the sample size is 7 (5% significance level, one-sided). For maize, the prediction system is accurate over some parts of the United States (U.S.), Europe, northeastern China, and Indonesia, although the horizontal distribution is patchy over the South American continent and Southern Africa (Figure 2A). The values are very low over eastern Australia. Rice prediction shows higher ACC values over some parts of northeastern Brazil, Europe, and northeastern China ( Figure 2B). On the other hand, it is quite low over southern Brazil, West Africa, South Africa, Thailand, and Indonesia. The relatively better prediction of globally averaged wheat (section Prediction Accuracy for Global Average) is due mainly to the high scores in the eastern U.S., southern Africa, and western Russia ( Figure 2C). For spring wheat, accurate predictions are seen in some parts of western/eastern Australia, western U.S., Turkey, and Central Russia ( Figure 2D). Soybean accuracy scores are very low (negative) over the major soybean-producing regions, including the U.S., China, Brazil, and Argentina, although the scores are higher over some parts of India and northeastern China ( Figure 2E). We evaluated the statistical significance of ACCs by conducting 3,000 bootstrapping sampling in which ACCs were calculated with the observation values and randomly selected predicted values for each grid to estimate the percentage of the global harvested area with accurate prediction by chance (the number of grids with accurate prediction per number of grids).
For each bootstrapping sampling, the number of grids with accurate prediction was defined as those that have a significant (p < 0.05) correlation coefficient (ACC > 0.729 if n = 6, ACC = 0.669 if n = 7). In the bootstrap analysis, the average (±SD) rates of grids that have significant ACCs by chance were 5.7% (±0.4%), 5.6% (±0.4%), 7.2% (±0.5%), 5.0% (±0.6%), and 5.9% Frontiers in Sustainable Food Systems | www.frontiersin.org (±0.6%) for maize, rice, winter wheat, spring wheat, and soybean, respectively. For our prediction system, the corresponding rates were 18, 20, 21, 15, and 15%, respectively. All of the rates obtained using our prediction system therefore largely exceed the rates calculated under the random assumption, which means that the prediction system may have potential benefits.  were observed. These results indicate that, for these crops and grids, the hybrid prediction system would be improved by elaborating both the seasonal climate prediction and the crop growth prediction. On the other hand, in soybean and spring wheat, the classification of most of the grids as class X (negative) or A indicated that the prediction errors would be caused mainly by inaccuracy in the crop model. Figure S1 is same as Figure 2, but for the estimation forced by the atmospheric re-analysis data. Therefore, the errors are mainly attributable to the crop model. We can find that there is room of improvement for maize in eastern China, central North America, and central Asia and for winter wheat in central Asia by improving seasonal climate prediction. We also added horizontal maps of the RMSE for the estimation forced by the atmospheric re-analysis data ( Figure S2) and by the seasonal climate prediction (Figure S3). The result suggests that the prediction error of the crop yields are attributable to the prediction errors of the seasonal climate prediction particularly in central North America (for maize), Eastern Asia (for maize and rice) and central Asia (for maize and winter wheat). However, for spring wheat and soybean, the greater part of the prediction error of the crop yields are attributable to the error of the crop model.

Consistency With Previous Works
The results indicate that the prediction accuracy is high for wheat relative to the other crops. The results are partly consistent with previous studies (Iizumi et al., 2013b(Iizumi et al., , 2014aYuan and Yamagata, 2015), which also showed that wheat yield can be reliably predicted using simple regression methods based on the outputs of the SINTEX-F1 prediction system (Luo et al., 2005). Particularly, Yuan and Yamagata (2015) showed that the potential source of seasonal predictability of Australian wheat is due to the tropical Indo-Pacific climate modes, such as IOD, ENSO, and ENSO-Modoki. The high prediction accuracies of those climate variations and their associated teleconnection patterns by the SINTEX-F2 system (Doi et al., 2016) partly contribute to the successful prediction of Australian wheat yield, as shown in Figure 2D.

Improvement
The results suggest that the crop yield prediction errors are attributable to the prediction errors of the seasonal climate prediction for maize, rice, and winter wheat. This tendency was remarkable for maize, particularly the improvement of seasonal climate prediction in the central U.S. and Iran. It was also important in the southern U.S. for rice and southern Africa for winter wheat. Production of maize is sensitive to water stress, while production of wheat and rice is sensitive to temperature stress (Iizumi et al., 2013b). Temperature prediction by the SINTEX-F2 system is generally more accurate relative to rainfall prediction (Figure S4), which is a common attribute of dynamical seasonal climate prediction systems (e.g., the North American Multi-Model Ensemble: NMME; Kirtman et al., 2014). This feature can partly explain why the prediction error of maize yield is mostly due to the seasonal climate prediction errors, relative to those of rice and winter wheat. Low prediction accuracy of rice yield in Eastern Asia may be due to the difficulty of prediction of boreal spring-summer temperature there (Figures S4A,C), while low prediction skill of winter wheat in central Asia may be related to the difficulty of prediction of boreal winter temperature there ( Figure S4G).
As shown in Figure 1C, an extreme failure of the global average crop yield for wheat was experienced in 2007. This is successfully predicted 3 months ahead by our system. Figure 4A shows the country-level yield of wheat in 2007. Although the prediction is relatively better over the U.S., Russia, and India, predicted wheat yield in China is underestimated and that in Canada is overestimated (Figure 4B). This is also seen by the grid-level yield ( Figure S5). The error in Canada is due mainly to the crop model itself because the crop model forced by the biascorrected re-analysis dataset still has a similar error (Figure 4C). In contrast, the error in China is due mainly to seasonal climate prediction error (Figures 4B-D). From the latter half of 2006 through 2007, we experienced the growth and decay of El Niño and the positive Indian Ocean Dipole event, which may remotely influence the East Asia climate, including China (e.g., Behera et al., 2008). The seasonal prediction system needs to be improved to capture those teleconnections. We have been developing the SINTEX-F2 system to improve its prediction accuracy by updating the initialization scheme and increasing the ensemble size (Doi et al., , 2020Morioka et al., 2019).
For spring wheat and soybean, most of the crop yield prediction error can be attributed to crop model error (Figures 3D,E). Although identification of specific reasons for the crop model error is beyond the scope of this study, poor estimation of crop phenology may be in part responsible. Although daylength is not considered in our crop model, it is known that both temperature and daylength influence crop phenology (Yan and Wallace, 1998), and the influence of daylength on crop phenology differs among crops and cultivars (Major, 1980). For example, whereas soybean phenology is influenced largely by daylength (Major, 1974), maize phenology can be modeled using thermal time (Gilmore and Rogers, 1958;Tollenaar et al., 1979;Plett, 1992). Nutrient stress tolerances are also different among crops, which might affect the difference in the prediction accuracy among the crops. For example, soybean is a crop that establishes a symbiotic relationship with a nitrogenfixing bacterium and has high nitrogen stress tolerance. For the accurate estimation of the crop yield, it would be needed to consider nitrogen dynamics, the possible input of the nitrogen, and the difference of the response to the nitrogen input for each crop. In this study, soil water content and drought stress were calculated with the Soil & Water Assessment Tool (Neitsch et al., 2005). However, nitrogen dynamics was not included in our model (but included in the original model) because of avoiding the artificial effect on the estimation of the yield. It is possible that this simplification increased the error of the crop model in this study. Moreover, soil characteristics in each area were considered by using the ISLSCP II soil data (Scholes and Brown de Colstoun, 2011) but not in each crop in this study, although the soil characteristics may differ among the crops even in the same area. This simplification might also affect the prediction error in this study. Detailed analysis of these types of differences in the prediction accuracy of the crop model is an important future task. We will try to include this factor in future versions of the model.

Limitations
The prediction system in this study is accurate for some crops over some areas. However, a discussion based on statistical significance is still challenging because this study used data from a relatively short time period, i.e., 7 years (2000-2006) for the grid-wise assessment and 11 years (2000)(2001)(2002)(2003)(2004)(2005)(2006)(2007)(2008)(2009)(2010) for the global average. Although statistical significance is not always the best basis for making conclusions (Amrhein et al., 2019;Wasserstein et al., 2019), a longer time period is required for a more reliable assessment of yield prediction accuracy. For that, we may need more streamline yield data corrections and reduce the computational cost of the hybrid system of dynamical climate prediction and eco-physiological crop-growth simulation. Some other limitations also need to be addressed in future studies. The current crop model was calibrated using yield data for 1981-2006, and the data used for the accuracy assessment are included in this dataset. The bias-correction procedure does not divide climate data into calibration and independent subsets. The use of a more reasonable experimental setup, such as the "leaveone-out" cross validation technique, is necessary. In addition, calibration of the crop model with a Bayesian approach is currently computationally expensive and not feasible at the present time.
Global mean yield may not be a good metric when evaluating predictions of seasonal yield variability because crop durations (planting to harvesting) vary by geographic location. For example, maize is harvested from October to November in the U.S., whereas first-and second-season maize in Brazil is harvested from February to June and from June to September, respectively (U.S. Department of Agriculture, 1994). In the most of the countries in Central America and the Caribbean (such as Mexico) and African countries (such as Nigeria), second season maize is also planted (U.S. Department of Agriculture, 2020). Particularly in Brazil, the production of the second season maize is substantially increasing in recent years (van Benthem, 2013). In contrast, seasonal climate forecasts for the coming 3 months are issued at a single point in time (e.g., 1 January), and crops in some regions would be harvested within the 3-months period, but crops in other regions would not. For this reason, global mean yield can be computed in the re-forecast experiments, but it never appears in the operational forecast mode. Country mean yield forecasts are most likely better for use in the assessment of operational yield prediction service.
The computational cost of the model might be a serious limitation for some researchers in third-world countries, who would be some of the most benefited from such modeling. We are thinking to share our outputs in future. Also, we may reduce the horizontal resolutions and/or focus on some target regions from the worldwide calculation to reduce the computational cost.

Future Potential
The results indicated that the prediction accuracy of the system varies according to area and crop species. This indicates that an important direction of future study would be not only elaborating the prediction models but also searching for combinations of areas and crops that are appropriate for seasonal prediction. Further studies need to summarize the advantages and disadvantages of using statistical and processbased crop models in seasonal yield prediction and make careful comparisons based on coordinated reforecast experiments. One of the notable advantages with the process-based prediction system in this study is that it can provide prediction results supported by reasons. As an additional improvement, we need to carefully compare the processes to understand the system's predictive potential by focusing on particular regions and years. This type of information will provide clues on how to improve and optimize the system. This research approach is not possible by using only a simple statistical method, as shown in previous studies (Iizumi et al., 2013b(Iizumi et al., , 2014aYuan and Yamagata, 2015). Although this study is only a first step, we believe that further efforts to improve the system by more deeply understanding climate and crop growth processes can provide useful prediction information to society. Although a global prediction system is still challenging, the accurate information could be beneficial for agro-businesses (for example, a bank that issues credits worldwide for agriculture) and for improving food security in regions where crop yield is vulnerable to climate shocks and food markets are isolated from international trade. In addition, a process-based prediction system may provide information to country-level institutions responsible for agriculture, which in turn could inform farmers, to prevent possible yield losses, for example, choice of sowing date based on seasonal prediction. Such a research stream is also already underway (Sakurai et al., 2018).

CONCLUSIONS
This study presented a newly developed system combining an eco-physiological process-based crop model and a dynamical seasonal climate model to predict worldwide yields for four major crops and evaluated the system's results. The 3-months lead predictions of year-to-year variation in the wheat yield were more accurate relative to those of maize, rice, and soybean. Maize yield is difficult to predict because of its huge intrinsic variability. Conversely, small variations in rice yield also make it difficult to predict. Soybean yield is the most difficult to predict. For winter wheat and rice, temperature predictions need to be improved, particularly over the mid-latitudes, whereas for maize, rainfall predictions should be improved. For spring wheat and soybeans, the crop growth simulation itself needs to be improved. However, we don't completely buy that low prediction scores in maize, rice and winter wheat are just because errors of seasonal climate prediction. It may be also due to intrinsic genetic variation in those crops and other biological and management factors. We also believe that further efforts to improve the system can contribute to provide a possible tool to concern food security under future climate change because process-based models allow representing future climate and management conditions not sampled in the historical record and new locations to which cultivation may shift (Franke et al., 2019).

DATA AVAILABILITY STATEMENT
The datasets generated for this study are available on request to the corresponding author.

AUTHOR CONTRIBUTIONS
TD conducted the seasonal climate reforecast experiments. GS conducted the crop simulation. TI conducted the bias correction. All authors contributed to the article and approved the submitted version.

FUNDING
This study was supported by the Environment Research and Technology Development Fund (2-1801) of the Ministry of the Environment, Japan.