Toward Regional Marine Ecological Forecasting Using Global Climate Model Predictions From Subseasonal to Decadal Timescales: Bottlenecks and Recommendations

This perspective paper discusses how the research community can promote enhancement of marine ecosystem forecasts using physical ocean conditions predicted by global climate models (GCMs). We review the major climate prediction projects and outline new research opportunities to achieve skillful marine biological forecasts. Physical ocean conditions are operationally predicted for subseasonal to seasonal timescales, and multi-year predictions have been enhanced recently. However, forecasting applications are currently limited by the availability of oceanic data; most subseasonal-to-seasonal prediction projects make only sea-surface temperature (SST) publicly available, though other variables useful for biological forecasts are also calculated in GCMs. To resolve the bottleneck of data availability, we recommend that climate prediction centers increase the range of ocean data available to the public, perhaps starting with an expanded suite of 2-dimensional variables, whose storage requirements are much smaller than 3-dimensional variables. Allowing forecast output to be downloaded for a selected region, rather than the whole globe, would also facilitate uptake. We highlight new research opportunities in both physical forecasting (e.g., new approaches to dynamical and statistical downscaling) and biological forecasting (e.g., conducting biological reforecasting experiments) and offer lessons learned to help guide their development. In order to accelerate this research area, we also suggest establishing case studies (i.e., particular climate and biological events as prediction targets) to improve coordination. Advancing our capacity for marine biological forecasting is crucial for the success of the UN Decade of Ocean Science, for which one of seven desired outcomes is “A Predicted Ocean”.


INTRODUCTION
Marine ecosystem forecasting, often leveraging predictions of physical ocean conditions, is an emerging research area that has rapidly attracted significant attention (Payne et al., 2017; Tommasi et al., 2017; Hobday et al., 2018; Capotondi et al., 2019a; Park et al., 2019; Jacox et al., 2020; Bolin et al., 2021). The development and improvement of marine biological forecasts are motivated by a number of ecological and socioeconomic aims, including management of fisheries and aquaculture, conservation of endangered marine species, and protection of human health. At present, most marine ecosystem predictions are in the experimental stage, but in the future, they could be operationalized with a wide range of applications.
Many different statistical and dynamical methods can be used for ecological prediction on subseasonal to decadal timescales (Jacox et al., 2020). However, perhaps the most promising approach is to start with general circulation or global climate model (GCM) predictions of the physical environment and use them as the basis for ecological prediction. GCM predictions are conducted for a range of forecast lead times (i.e., the length of time between the time of the initial condition and the time for which conditions are being predicted 1). Forecasts with subseasonal to seasonal lead times (i.e., several weeks to a year) are operationally produced by a number of modeling centers, and multiannual predictions (i.e., 1-10 years) are also being examined. These global climate forecasts offer a foundation for an array of marine ecosystem predictions. While alternative methods may also be leveraged to generate biological predictions (for example, forecasting fish population dynamics by monitoring earlier life stages), they are beyond the scope of this paper.
Forecasting marine ecosystems using physical predictions is a multi-step process, typically including a GCM prediction, the dynamical or statistical downscaling of the GCM fields, and biological estimation (Figure 1) (e.g., Jacox et al., 2020). The data transfer between the tasks is an important consideration for the workflow. The most intensive data transfer is needed for dynamical downscaling, in which the three-dimensional (3D) output of a GCM prediction is needed to force a regional model (Figure 1A). A prime example of this workflow is J-SCOPE (JISAO's Seasonal Coastal Ocean Prediction of the Ecosystem), for which dynamical downscaling using the Regional Ocean Modeling System (ROMS) is conducted using surface and lateral boundary conditions taken from version 2 of the National Oceanic and Atmospheric Administration (NOAA) Climate Forecast System (CFSv2) (Siedlecki et al., 2016). This system is supported by publicly available, 6-hourly, 3D forecast outputs of ocean variables for CFSv2. However, such data availability is exceptional; 3D GCM forecast output at higher than monthly resolution is typically not publicly available for other projects. Thus, in most cases, this workflow requires a close collaboration between the climate prediction center and the user institute.
A more practical workflow for many researchers is to use two-dimensional (2D) GCM output for key fields such as SST (Figure 1B). This workflow was employed by a series of Australian studies in fisheries forecasting applications (Spillman et al., 2013; Spillman and Hobday, 2014; Eveson et al., 2015; Brodie et al., 2017). In this case, users may employ statistical downscaling rather than dynamical downscaling. A promising future extension of this workflow is to use multiple GCM outputs (Figure 1C) because a multi-model ensemble can better capture reality than a single model due to the reduction of model-specific errors, as found for SST (Yati and Minobe, 2021) and for sea-surface height (Widlansky et al., 2017; Long et al., 2021). Furthermore, the reduction of model-specific errors can lead to a better estimation of prediction uncertainty, which can be useful for applications using predictions.
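The error-reduction argument for a multi-model ensemble can be illustrated with a minimal synthetic sketch. Everything below is an assumption made for illustration only: the "truth" series, the per-model biases, the noise levels, and the function names are hypothetical and are not taken from any of the cited studies. The sketch covers the idealized case in which model errors are independent and model biases partly cancel; real GCM errors can be correlated across models, in which case the benefit is smaller.

```python
import random
import statistics


def rmse(forecast, truth):
    """Root-mean-square error between two equally long series."""
    return (sum((f - t) ** 2 for f, t in zip(forecast, truth)) / len(truth)) ** 0.5


def multi_model_mean(forecasts):
    """Average several model forecasts time step by time step."""
    return [statistics.fmean(values) for values in zip(*forecasts)]


rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Hypothetical "observed" SST anomalies (arbitrary units).
truth = [rng.gauss(0.0, 1.0) for _ in range(200)]

# Each synthetic "model" = truth + its own constant bias + independent noise.
biases = [0.5, -0.4, 0.3, -0.6]
models = [[t + b + rng.gauss(0.0, 0.5) for t in truth] for b in biases]

single_rmse = [rmse(m, truth) for m in models]
mme_rmse = rmse(multi_model_mean(models), truth)

print("single-model RMSE:", [round(v, 2) for v in single_rmse])
print("multi-model mean RMSE:", round(mme_rmse, 2))
```

In this idealized setting the multi-model mean has a lower error than any single synthetic model, because the model-specific biases and noise partly average out.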
For marine ecosystem forecasts based on physical predictions, some bottlenecks and gaps need to be resolved. In order to address those problems, coordination across institutes is needed, and a large body of research is required. Thus, researchers, managers, and funding agencies need a strategy to work across climate and oceanographic disciplines in pursuit of the larger goal. The purpose of this perspective paper is twofold: (1) to review major ongoing activities related to climate predictions at subseasonal to decadal lead times, and (2) to outline new research opportunities for marine ecosystem forecasting.

PRESENT STATUS OF PREDICTIONS OF OCEANIC PHYSICAL CONDITIONS
In this section, we review how predictions of oceanic physical conditions, which are the basis of marine ecosystem prediction, are conducted for subseasonal (two weeks to two months), seasonal (two months to one year), and multiannual (one year to ten years) lead times, including information on publicly available oceanic variables (Table 1). As noted above, our focus here is on physical predictions obtained using GCMs. Other prediction products may be suitable for some applications but are outside the scope of this paper (for example, forecasts produced with statistical methods such as linear inverse models 2 and ocean-only model forecasts such as 10-day ocean weather forecasts around Japan 3).
The subseasonal to seasonal (S2S) prediction project of the World Climate Research Programme (WCRP) (Vitart et al., 2017) provides S2S prediction datasets for forecast lead times up to 60 days (Table 1). The data are available at the European Centre for Medium-Range Weather Forecasts (ECMWF) 4 and at the Chinese Meteorological Administration (CMA) 5. Several modeling centers participating in the project provide various oceanic variables as 2D outputs, including SST, sea-surface salinity, surface currents, sea-surface height, mixed-layer depth, and 0-300 m averaged temperature and salinity. For other subseasonal-to-seasonal prediction projects, SST is currently the only oceanic variable made publicly available. Those projects include the Subseasonal Experiment (SubX) project, the North American Multi-Model Ensemble (NMME) (e.g., Becker et al., 2014; Kirtman et al., 2014), and seasonal prediction by the Copernicus Climate Change Service (C3S) (Table 1). However, it should be noted that ocean output from CFSv2 seasonal forecasts, which can be obtained from NOAA 6, includes a suite of 2D variables (temperature, salinity, and currents at fixed depths; isotherm depths, sea-level height, 0-300 m heat content) as well as 3D fields (temperature, salinity, and horizontal and vertical velocities) at monthly-mean or 6-hourly resolution.
Frontiers in Marine Science | www.frontiersin.org August 2022 | Volume 9 | Article 855965
Multiannual prediction, which is often called "decadal prediction" (Boer et al., 2016), is in its experimental stage. The first systematic collection of multiannual predictions was conducted in the context of the Coupled Model Intercomparison Project Phase 5 (CMIP5) (Taylor et al., 2012) and has been enhanced in CMIP6 (Eyring et al., 2016). The data of CMIP5 and CMIP6 are available via the Earth System Grid Federation (ESGF) 7. In CMIP6, multiannual prediction is coordinated under the Decadal Climate Prediction Project (DCPP), and DCPP experiments contain both reforecasts (i.e., forecasts simulated for a retrospective period; called dcppA) and near real-time forecasts (called dcppB) (Boer et al., 2016). Early evaluations of multiannual prediction skill have found that it mainly arises from initial conditions in the first few years, and at longer lead times is associated with the forced response to climate change, especially for temperatures (e.g., Branstator and Teng, 2010; Yeager et al., 2018). CMIP6/DCPP provides a suite of 2D ocean variables as well as 3D ocean temperature, salinity and currents, all available as monthly or annual means. Recently, the World Meteorological Organization has established the Lead Centre for Annual to Decadal Climate Prediction, which annually issues a Global Annual to Decadal Climate Update 8. The latest report documented forecasts for the target years from 2022 to 2026 (see also Hermanson et al., 2022).
7 https://esgf-node.llnl.gov/projects/esgf-llnl/ (access May 1, 2022).
FIGURE 1 | Typical workflows of biological forecasts based on physical GCM predictions. The squares indicate systems that produce predictions, downscaling or biological estimation, whereas the parallelograms indicate the outputs from the systems. Panels (A) and (B) indicate the workflows with dynamical and statistical downscaling, respectively, using outputs of a single GCM prediction, and panel (C) indicates the workflow using outputs of multiple GCMs with statistical downscaling. The stacked parallelograms in the middle of panel (C) indicate downscaling of different GCM predictions. "BGC" in the figure indicates information about biogeochemistry and lower-trophic level biology. Downscaling can be skipped if GCM predictions are of adequate resolution and acceptable bias. The physical prediction is assumed to be conducted by GCMs, but it can be made by ESMs, which also include BGC.
Most of these prediction projects use GCMs, but a few modeling centers use Earth System Models (ESMs), i.e., GCMs coupled with a biogeochemical and lower-trophic ecosystem model. The Seasonal-to-Multiyear Large Ensemble (SMYLE) (Yeager et al., 2022) and the Decadal Prediction Large Ensemble (DPLE) (Yeager et al., 2018), both produced by the National Center for Atmospheric Research (NCAR), use the Community Earth System Model (CESM). The outputs of DPLE are publicly available on the NCAR website 9. Also, at present, biogeochemical and biological variables under DCPP are available from one model (the Canadian Earth System Model version 5) for near real-time multiannual prediction (dcppB) and from six models for reforecasts (dcppA) 10.
In addition to considering the availability of data, it is also helpful to know whether a subset of the data for a selected region can be easily downloaded. The downloading of a selected region is possible for the C3S seasonal forecast data using the Application Programming Interface and for the SubX, NMME, and CFSv2 data via Open-source Project for a Network Data Access Protocol (OPeNDAP); but such an option is not available for the S2S and CMIP6/DCPP data.
This summary of available output from global climate forecast systems highlights both the considerable potential in ongoing efforts and several major bottlenecks for marine ecosystem prediction, specifically the availability of already-computed data and the ability to download them efficiently.

NEW OPPORTUNITIES
A wide range of new studies needs to be conducted to successfully develop marine ecological forecasts built on physical predictions. These studies can be broadly divided into two main categories: physical downscaling and biological prediction (Figure 1). The physical downscaling can be viewed as an intermediate goal that can be undertaken by physical researchers. An appropriate intermediate goal will allow researchers of physical oceanography to publish their own papers and obtain funding as principal investigators, and these prospects are important to attract young researchers (Minobe, 2014). To highlight the many specific opportunities for research in physical and biological aspects of marine ecosystem forecasting, we describe them separately below. But of course, even research in specialized areas can benefit from interdisciplinary collaboration.

Physical Research
GCM prediction skill should be examined for various oceanic variables that are useful for biological forecasts, because skillful ocean predictions of quantities that drive biological models are needed to achieve skillful ecological forecasts. To date, oceanic prediction skill has been examined mainly for SST (e.g., Becker et al., 2014; Doi et al., 2019; Hervieux et al., 2019) including marine heatwaves, sea-surface height (e.g., Widlansky et al., 2017; Long et al., 2021; Shin and Newman, 2021; Amaya et al., 2022), and upper-layer temperatures (e.g., Yeager et al., 2018; Doi et al., 2020), because these variables are important in describing physical climate variability and relatively easy to evaluate with observation-based products. However, for biological predictions, other variables (e.g., mixed-layer depth, upwelling, salinity, bottom temperature, vertical profiles of temperature and density) can also be important, as they impact nutrient availability and the habitat of marine species, and they may be associated with a higher degree of predictability (e.g., Siedlecki et al., 2016; Capotondi et al., 2019a). Furthermore, physical predictions from large ensembles and multi-model ensembles should be examined for their use in biological prediction. Recent studies identified an interesting bias in climate prediction systems known as the "predictability paradox" or "signal-to-noise paradox" (Eade et al., 2014; Dunstone et al., 2016; Smith et al., 2020). The basic idea of ensemble prediction is that the reality can be viewed as one member of an ensemble, and thus the difference between the ensemble mean and reality (as approximated by observations) should be similar to the differences between the ensemble mean and each ensemble member. However, when the predictability paradox occurs, the ensemble mean is more similar to reality than to other ensemble members. In this case, averaging over large ensembles is helpful to obtain a better prediction than those from smaller ensembles.
To increase the size of ensembles, it is generally effective to use output from multiple models, and as discussed in the Introduction, the use of multiple models also has the effect of reducing problems specific to individual models. Therefore, we suggest that using outputs of large ensembles from multiple GCMs can also have advantages for marine biological prediction (Figure 1C), and this possibility should be explored.
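One simple way to check for the predictability paradox described above is to compare how well the ensemble mean correlates with observations against how well it correlates with individual members. The following is a minimal sketch on synthetic data, not a published diagnostic: the function names, the ensemble size, and the choice to give the synthetic model a deliberately weak signal (amplitude 0.3) are all assumptions made for illustration.

```python
import random
import statistics


def pearson(x, y):
    """Pearson correlation coefficient of two equally long series."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def ensemble_mean(members):
    """Average ensemble members time step by time step."""
    return [statistics.fmean(values) for values in zip(*members)]


def paradox_diagnostic(members, obs):
    """Return (r_obs, r_members).

    r_obs: correlation of the ensemble mean with observations.
    r_members: average correlation of each member with the mean of the
    remaining members (the member plays the role of "reality").
    """
    r_obs = pearson(ensemble_mean(members), obs)
    r_each = []
    for i, member in enumerate(members):
        others = members[:i] + members[i + 1:]
        r_each.append(pearson(ensemble_mean(others), member))
    return r_obs, statistics.fmean(r_each)


rng = random.Random(1)  # fixed seed for reproducibility
n_years, n_members = 40, 50

# Hypothetical predictable signal shared by the real world and the model.
signal = [rng.gauss(0.0, 1.0) for _ in range(n_years)]
# The real world follows the signal closely ...
obs = [s + rng.gauss(0.0, 0.3) for s in signal]
# ... but each synthetic member carries only a weak copy of it plus noise.
members = [[0.3 * s + rng.gauss(0.0, 1.0) for s in signal] for _ in range(n_members)]

r_obs, r_members = paradox_diagnostic(members, obs)
print("ensemble mean vs observations:", round(r_obs, 2))
print("ensemble mean vs own members: ", round(r_members, 2))
```

When `r_obs` clearly exceeds `r_members`, as in this constructed case, the ensemble mean predicts reality better than it predicts its own members, which is the situation in which a larger ensemble is most helpful.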
For regional marine ecosystem prediction efforts, the resolution of the global ocean models may not be sufficient, and thus dynamical or statistical downscaling of the predicted data at higher spatial resolutions may be necessary. Dynamical downscaling is used for J-SCOPE, as mentioned above, and various machine learning techniques are used for statistical downscaling (Stengel et al., 2020;Kashinath et al., 2021). Statistical downscaling schemes can be constructed using observations at specific sites together with coarse outputs of numerical models, but they can also be built on the results of dynamically downscaled data from the coarse model outputs (Jacox et al., 2020). The latter approach should be especially useful for variables that are not well observed. Both dynamical and statistical downscaling should be investigated in detail, as they have advantages and limitations. Dynamical downscaling can provide a complete representation of the ocean at the needed resolution, but will inherit the biases of the climate model that was used for the lateral boundary conditions and the surface forcing.
The skill of downscaled forecasts should be compared to that of GCMs to quantify the added value of the downscaling procedure. Relative to dynamical downscaling, statistical downscaling can better capture observed relationships, but may be limited by data availability. Furthermore, dynamical downscaling is much more computationally expensive and slower than statistical downscaling, which can be important considerations for operational biological forecasting. Thus, depending on the specific application, different approaches may be more suitable.
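The added value of downscaling can be expressed, for example, as a skill score that uses the GCM forecast as the reference. The sketch below uses a mean-squared-error skill score; this particular metric, the function names, and the numbers are assumptions chosen for illustration, and the sketch assumes that the GCM and downscaled forecasts have already been brought to the same verification points as the observations.

```python
def mse(forecast, obs):
    """Mean squared error between a forecast and observations."""
    return sum((f - o) ** 2 for f, o in zip(forecast, obs)) / len(obs)


def added_value(downscaled, gcm, obs):
    """MSE skill score of the downscaled forecast relative to the GCM.

    Positive values mean the downscaled forecast is closer to the
    observations than the GCM forecast; zero means no added value.
    """
    return 1.0 - mse(downscaled, obs) / mse(gcm, obs)


# Hypothetical anomalies at four verification times (arbitrary units).
obs = [0.5, -0.2, 1.1, -0.8]
gcm = [0.1, 0.2, 0.6, -0.2]
downscaled = [0.4, -0.1, 0.9, -0.6]

score = added_value(downscaled, gcm, obs)
print("added value (MSE skill score):", round(score, 2))
```

The same comparison can be repeated with anomaly correlation or other metrics, since the preferred measure depends on the application.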

Biological Research
It is important to identify which ecological variables are promising targets for prediction. The target for prediction should be relevant to species valued by society and should be sensitive to physical conditions. As a first step, candidate target species can be identified by examining the statistical relationships between physical conditions and marine ecosystem status using observational data. A classic example is the relation between the Pacific Decadal Oscillation (PDO) and Pacific salmon catches (Mantua et al., 1997), and that between the PDO and the Japanese sardine population (Yasuda et al., 1999). A more systematic approach using principal component analysis of a large number of marine ecosystem indicators reported that many species are influenced by climate variability and change in the North Pacific and adjacent seas (Hare and Mantua, 2000; Tian et al., 2006; Litzow and Mueter, 2014; Ma et al., 2019; Yati et al., 2020) and in the northeast Atlantic (Brunel and Boucher, 2007). Of course, such statistical analysis can only identify correlations, which do not necessarily mean causality, and the relationships may be nonstationary. Furthermore, a causal relationship is not enough for prediction, because if physical conditions that influence marine species are unpredictable, then biological targets are also not predictable (Brodie et al., 2021).
For the potential marine ecosystem targets, prediction skill should be assessed by conducting retrospective forecasts (i.e., reforecasts) evaluated against observations. Reforecasts using global ESMs have demonstrated meaningful prediction skill with lead times of a year or more for certain regions and variables, including surface pH (Brady et al., 2020), ocean carbon uptake (Lovenduski et al., 2019), aragonite saturation state (Yeager et al., 2022), chlorophyll and net primary productivity (Séférian et al., 2014; Rousseaux and Gregg, 2017; Krumhardt et al., 2020), and even annual fish catch in large marine ecosystems (Park et al., 2019). The availability of the biogeochemical model output is very important for better understanding the links between physics and biology in the model context. These links are still poorly constrained by observations due to the sparsity of biogeochemical data (Turi et al., 2018), highlighting the need for expanded and sustained biogeochemical observational networks in support of biological prediction efforts (Capotondi et al., 2019a).
Ecological predictions are also challenged by shifts in the relationships between physical conditions and biological responses through time and with population sizes. Empirical relationships that appear robust for several years can decay over time (Myers, 1998; Deyle et al., 2013). Changes in climate conditions at the basin scale have been linked to shifts in relationships between local physical properties and fisheries recruitment (Litzow et al., 2019). An understanding of the underlying ecological mechanisms that explain empirical relationships between physical and biological properties and how those mechanisms may change over time is necessary to increase confidence in ecological predictions.
In the coming years, regional biological reforecasts should be widely examined. Biological reforecasting is a relatively new study area, but lessons learned from climate reforecasts can be useful to guide similar efforts for biology:
1. Forecast skill should be examined for anomalies, i.e., the differences between forecasted raw values and the forecasted mean seasonal cycle (or climatology). If the forecast skill is examined with raw values, then the skill is likely to be dominated by the seasonal cycle rather than the interannual variability of interest. Furthermore, since model biases tend to grow due to model drift at longer lead times, model climatologies should be lead-time dependent.
2. For the estimation of the statistical significance of a metric, it is important to take into account the serial correlation of the data to be examined. For example, if there are annually sampled predicted and observed data for a period of $N$ years, and the respective time series have autocorrelations at a one-year lag of $r_a$ and $r_b$, then the effective degrees of freedom of the data for Pearson's correlation can be estimated as $N(1 - r_a r_b)/(1 + r_a r_b)$ (Bretherton et al., 1999). The influence becomes strong when the lag-1 autocorrelation is large. If the lag-1 autocorrelation of both time series is 0.6 (0.3), then the effective degrees of freedom are 47% (83%) of the original number of samples. Serial correlation is generally not considered in widely used software packages or libraries. Therefore, the p-value obtained from such packages is inappropriate when the serial correlation cannot be ignored. In any case, how the degrees of freedom were estimated should be clarified.
3. The separation of training and verification data, known as "cross-validation", is crucial for assessing the performance of statistical estimation (e.g., Arlot and Celisse, 2010). For example, in V-fold cross-validation, all data are divided into V "folds," the prediction model is trained using data of V-1 folds, the remaining fold is used for validation, and the fold to be used for validation is successively changed. Cross-validation is especially important when using a statistical or machine-learning technique that can substantially overfit the training data.
4. Ensembles of predictions should be used appropriately. Since the biological responses to physical conditions may be nonlinear, it is desirable to use the individual ensemble members of physical forecasts to drive biological models, rather than using the ensemble mean of the physical forecasts. A large ensemble size is especially useful to evaluate the probability of extreme events and to evaluate whether the predictability paradox occurs, as mentioned above. Furthermore, if biological estimation involves uncertainty in variables other than the physical prediction, using ensembles for these variables may be useful to understand the uncertainty originating in biological processes.
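Several of the lessons listed above can be sketched in a few lines of code. The following is a minimal illustration under stated assumptions, not a reference implementation: all function names are hypothetical, the anomaly calculation assumes that every reforecast starts in the same calendar month (so that one climatology per lead time suffices), and the threshold-type biological response in the last part is invented purely to show the effect of nonlinearity.

```python
import statistics


def lead_dependent_anomalies(reforecasts):
    """Lesson 1: subtract a climatology computed separately for each lead time.

    reforecasts[i][lead] is the raw forecast value of the i-th start year
    at the given lead time.
    """
    n_leads = len(reforecasts[0])
    clim = [statistics.fmean([run[lead] for run in reforecasts])
            for lead in range(n_leads)]
    return [[run[lead] - clim[lead] for lead in range(n_leads)]
            for run in reforecasts]


def lag1_autocorr(x):
    """Lag-1 autocorrelation of a time series."""
    m = statistics.fmean(x)
    num = sum((a - m) * (b - m) for a, b in zip(x[:-1], x[1:]))
    den = sum((a - m) ** 2 for a in x)
    return num / den


def effective_dof(n, r_a, r_b):
    """Lesson 2: N(1 - r_a r_b)/(1 + r_a r_b), as given in the text."""
    return n * (1 - r_a * r_b) / (1 + r_a * r_b)


def mean_of_responses(members, response):
    """Lesson 4: apply the biological response to each member, then average."""
    return statistics.fmean([response(m) for m in members])


# Lesson 1: two start years, two lead times (hypothetical raw values).
print(lead_dependent_anomalies([[1.0, 2.0], [3.0, 6.0]]))

# Lesson 2: reproduces the 47% and 83% quoted in the text for N = 100.
print(round(effective_dof(100, 0.6, 0.6)), round(effective_dof(100, 0.3, 0.3)))

# Lesson 4: hypothetical threshold response to a temperature anomaly.
def response(t):
    return max(t - 1.0, 0.0)

sst_members = [-1.0, 0.0, 2.0]
print(response(statistics.fmean(sst_members)),
      mean_of_responses(sst_members, response))
```

In the last example the response of the ensemble mean is zero, while the mean of the member responses is positive, which is why driving the biological model with individual members can matter when the response is nonlinear. In practice, `lag1_autocorr` would be applied to the predicted and observed series to obtain the two autocorrelations passed to `effective_dof`.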

SUMMARY AND RECOMMENDATIONS
Existing subseasonal to decadal climate predictions can potentially be very valuable for the prediction of marine ecosystems. However, availability of forecast output for ocean conditions is generally rather poor; in particular, most seasonal prediction systems do not provide ocean variables other than SST. While there are technical hurdles to providing additional large datasets, one of the reasons for the limited data availability may be that the modeling centers do not see enough demand for those data to be shared. We suggest that the demand actually exists, and we make two recommendations for enabling the uptake of physical forecasts in marine ecosystem prediction:
1) Climate prediction projects should make more ocean prediction data available to the research community. Making an expanded suite of 2D variables available, as done by the S2S project and CFSv2, would be a good starting point. Some currently unavailable 2D variables, such as eddy kinetic energy, could be useful for marine biological forecasts (e.g., Brodie et al., 2018), and thus they would be candidate variables to be made available in the future.
2) Climate prediction projects should enable users to download data for selected regions. This capability is useful for a wide range of users who may be interested in specific regions and greatly reduces user requirements for data downloading, storage, and processing.
Combining physical and biological disciplines with the common goal of improved marine ecosystem prediction will be a fruitful area of research with clear applications to society. To facilitate this research area, it would be useful to develop a set of case studies for biological prediction. For example, a massive Northeast Pacific marine heatwave in 2013-2016 involved compound extremes of a heatwave, a low-oxygen extreme, and an ocean acidity extreme (Gruber et al., 2021). While the forecast skill and predictability of SST anomalies during this event have been explored (e.g., Hu et al., 2017; Jacox et al., 2019; Capotondi et al., 2019b; Capotondi et al., 2022), further research could investigate how well these various co-occurring extremes and their impacts on the marine ecosystem could be predicted. Since a regional phenomenon is generally studied by regional researchers, other case studies distributed across the global ocean can be identified to attract the international community's interest.
As the global community increasingly recognizes the sensitivity of marine ecosystems to climate variability and change and the potential consequences to human society, the time is ripe to enhance forecasts of marine ecosystems by pursuing the strategies proposed here. Such efforts are gaining some traction at the international level. For example, the North Pacific Marine Science Organization (PICES) is in the process of establishing a new working group on "Climate Extremes and Coastal Impacts in the Pacific" with a focus on climate and marine ecosystem predictions 11. Furthermore, the United Nations Decade of Ocean Science for Sustainable Development (UN Ocean Decade) has been launched for the 2021-2030 decade. The overarching theme of the UN Ocean Decade is "The Science We Need for the Ocean We Want." One of its seven expected outcomes is "A Predicted Ocean" 12. To know what ocean we can have in the future, the capability of marine biological predictions is essential.
11 https://meetings.pices.int/members/working-groups/wg49 (access January 15, 2022).
12 https://www.oceandecade.org/vision-mission/ (access January 10, 2021).

DATA AVAILABILITY STATEMENT
The original contributions presented in the study are included in the article. Further inquiries can be directed to the corresponding author.

AUTHOR CONTRIBUTIONS
SM wrote the first draft, and all authors actively contributed to its improvement. All authors contributed to the article and approved the submitted version.

FUNDING
SM was supported by JSPS KAKENHI Grant Numbers JP18H04129 and JP19H05704, and MN was supported by JP19H05701. MGJ was supported by NOAA Southwest Fisheries Science Center, RRR was supported by NOAA Pacific Islands Fisheries Science Center, and AC was supported by the NOAA Climate Program Office Modeling, Analysis, Predictions and Projections (MAPP) program.