A Dynamic Statistical Subseasonal Forecast Model for OLR Over Tropical Pacific Region

Improving the subseasonal forecast skill of dynamical models has long been an important issue in atmospheric science and operational services. This study proposes a new dynamical-statistical forecast method, the stable components dynamic statistical forecast (STsDSF), for subseasonal outgoing long-wave radiation (OLR) over the tropical Pacific region in January-February from 2004 to 2008. Compared with the daily forecasts of a multi-model ensemble (MME) of 11 advanced models, the STsDSF model captured the change characteristics of OLR better when the lead time was beyond 30 days in 2005 and 2006. The average pattern correlation coefficients (PCC) of STsDSF are 0.24 and 0.16 in 2005 and 2006, versus 0.10 and 0.05 for MME, respectively. In addition, the five-year average PCC of the STsDSF model is higher than that of MME in pentads 7–11. Although the STsDSF model and MME show a similar temporal correlation coefficient (TCC) pattern over the tropical Pacific region, the STsDSF model error grows more slowly than the MME error during pentads 8–12 in the January 2005 case. This demonstrates that STsDSF can reduce dynamical model error in some situations. According to the comparison of subseasonal forecasts between STsDSF and MME over the five years, STsDSF model skill depends strictly on the predictability of the dynamical model. The STsDSF model shows some advantages only once the skill of the dynamical model falls below a certain level. The STsDSF model can therefore serve as an effective reference for subseasonal forecasts and could feasibly be used in operational real-time forecasting in the future.


INTRODUCTION
The atmosphere is a giant, complex nonlinear system with external forcings and internal dissipation. Under the impact of external forcing factors, such as the sun, ocean, land, and human activities, a series of physical and chemical changes and interactions occur in the atmosphere, which makes weather and climate prediction very difficult [1][2][3]. A small error can be dramatically amplified over time, which imposes a limit on the predictability of weather forecasts. Moreover, some studies suggest that the predictability limits of weather systems can be established theoretically from their chaotic behavior [4,5]. It is now accepted that the upper limit on the predictability of actual weather systems is approximately 2 weeks [6,7]. However, studies have shown that the predictability of some large-scale components is significantly higher than that of small-scale components and can exceed 2 weeks [8,9]. At the same time, some low-frequency weather systems lasting more than 2 weeks have been found in observations.
Based on the theory of predictability, the prediction error of dynamic models mainly results from the following aspects: the initial error of the observation data, the error caused by the ensemble scheme in the dynamic model simulation, and the systematic error of the model in the single-variable simulation [10]. The initial error of the observation data is mainly caused by inaccurate observed values and the uneven distribution of observation stations. The error of the ensemble scheme in dynamic model simulations is mainly caused by the numerical ensemble that is used to reduce random error. The systematic error of the single-variable simulation is mainly caused by the performance of the model itself, such as the model resolution, the parameterization schemes for physical processes, and the discrete numerical difference scheme [11][12][13]. These error sources limit the forecast accuracy of dynamic models and are the main obstacles to further improvement of the subseasonal forecast.
Given the shortcomings of error growth theory based on the linear framework, Chen et al. and Ding et al. developed a new theory of nonlinear error growth, built on the nonlinear local Lyapunov exponent (NLLE), to measure the predictability of the atmosphere, revealing the temporal and spatial distribution of the predictable duration of different weather and climate variables [14,15]. Theoretical studies show that climate variables are composed of climate signals and noise, and that the climate signal is mainly affected by the external boundary conditions of the climate system [16,17]. For example, the heat capacity of subsurface water during an El Niño event can significantly enhance East Pacific tropical cyclones [18]. When the proportion of climate signal in a variable is large enough to overcome the destructive effects of noise, climate anomalies may show a certain degree of predictability [19]. Observational studies show that there are some slowly changing processes in atmospheric evolution above the level of weather noise. These processes are associated with large-scale atmospheric motions and have a timescale of several weeks, which is much longer than that estimated by nonlinear hydrodynamics [20][21][22]. Some meteorological characteristics therefore remain predictable on the subseasonal time scale, and the atmospheric stable component can be robustly predicted [23,24].
Although the development of extended-period forecast skill is remarkable, its technical difficulty cannot be overlooked. On the one hand, it is difficult to predict the intensity and duration of a weather process objectively with commonly used methods. On the other hand, the skill of the daily weather forecast is limited by the growth of initial error through chaotic motion in the atmosphere [25,26]. Based on the chaotic characteristics of the atmospheric system, Chou et al. expounded the extraction method of the predictable components in the 10-30 day extended period, and further suggested that different strategies and methods should be adopted for predictable components and random components [27]. However, if each time step needs to be checked in the actual forecast process, the computational cost will be very large. Ren et al. proposed a similar-evolution method, which assumes that similar initial conditions have similar prediction errors within the allowable time range [28]. The current prediction error is estimated by synthesizing and analyzing the prediction errors of historically similar initial conditions. The calculations are thereby greatly simplified and more easily incorporated into numerical prediction models. Zheng et al. discussed the properties of predictable components in the extended period and their application in numerical models [29].
Based on predictable components theory, model subseasonal forecasts can be divided into two parts: the predictable component and the random component. The numerical model error can also be reduced during integration by using historical observation data. This method is clearly model dependent: model performance differs between numerical models, and so do the definition and extraction of predictable components. When the focus is on predicting weather or climate at different temporal and spatial scales, the stable components deserve particular attention. To improve subseasonal process diagnosis and prediction skill, it is important to grasp the main characteristics of the subseasonal stable component [30,31]. Under the same initial conditions and external forcing, atmospheric systems with large spatial and temporal scales are more predictable. Therefore, it is necessary to distinguish the stable components from the atmospheric circulation on a subseasonal scale, analyze their properties, and use them to improve subseasonal prediction skill [32,33]. Moreover, because the extraction of stable components is based on historical observation data instead of model data, it can overcome the shortcoming of model dependence.
Subseasonal precipitation prediction is one of the most important aspects of weather forecasting [34,35]. The tropics account for about half of the total global area, of which the ocean accounts for about 3/4. The solar radiation energy received in the tropics is much greater than that reflected back into space, which makes the tropical ocean a vital energy source for the global atmospheric circulation [36,37]. In addition, the tropical ocean is one of the most important water vapor sources for the global atmosphere [38,39]. Because water vapor is abundant over tropical marine areas, precipitation generation there is closely related to the intensity of atmospheric vertical convection, and the intensity of precipitation in the tropics can be characterized by OLR. If the subseasonal variability of OLR can be forecasted well, tropical precipitation forecast skill will be robust [40,41]. Therefore, this study uses a dynamic statistical forecast model (STsDSF) to improve the subseasonal forecast skill of a multi-model ensemble of 11 advanced models for OLR over the tropical Pacific region (140°E-100°W, 30°S-30°N) and discusses the predictability of STsDSF in January and February from 2004 to 2008. It proposes a new application of the dynamic-statistical method in subseasonal forecasting.

DATA AND METHODOLOGY
The daily OLR data were published by the National Oceanic and Atmospheric Administration (NOAA) and cover the period from 1979 to 2008 with a horizontal resolution of 2.5° × 2.5° [42]. The external boundary forcing of sea surface temperature (SST) is represented by the Oceanic Niño Index (Niño 3.4), which is also published by NOAA. The hindcast data of the dynamical models are from the Intraseasonal Variability Hindcast Experiment (ISVHE), which is jointly supported by the Asian Pacific Climate Center (APCC), NOAA, the Climate Variability and Predictability (CLIVAR) Asian-Australian Monsoon Panel, and some other organizations [43]. The model products of ISVHE have been used to study the predictability of intraseasonal variability as a whole [44,45]. In total, ten one-tier hindcasts and one two-tier hindcast from ISVHE were used in this study, including the Australian Bureau [46][47][48][49][50][51]. The details of the model data and operation scheme are briefly shown in Table 1. Because the skill of MME is better than that of any single model in seasonal forecasting [52,53], the improvement of STsDSF is mainly based on the MME forecast. In this study, five daily OLR forecast cases were selected, each starting on January 1 of a year from 2004 to 2008.
Frontiers in Physics | www.frontiersin.org January 2022 | Volume 9 | Article 665828
STsDSF is a hybrid dynamical-statistical method for subseasonal prediction, which consists of four steps: 1) distinguish the stable components from the climatic state vector by training on the daily contribution rate of variance and the persistence time; 2) divide the dynamical model output into predictable components and random components by projection; 3) forecast the predictable components with the dynamical simulation and the random components with statistical estimation; and 4) form the ensemble predictand from the dynamical simulation and the statistical estimation. Figure 1 is the schematic diagram of the STsDSF model, and the details of the STsDSF model have been introduced in [32,33]. In this study, the Niño 3.4 index is used as the similarity criterion that represents the external boundary conditions of tropical SST, and three similar years are chosen from 1979 to 2003 for statistical estimation in the STsDSF model.
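As a rough illustration of the analog-year selection and of step 2, the following Python sketch picks the years whose index value is closest to a target value and splits a flattened field into a projected part and a residual. It is only a sketch under stated assumptions: the function names, the absolute-difference similarity measure, the orthonormal basis, and all numbers are hypothetical and are not taken from the STsDSF implementation described in [32,33].

```python
def select_analog_years(index_by_year, target_value, n_analogs=3):
    """Return the n_analogs years whose index value is closest to target_value.

    Assumption: similarity is the absolute difference of one index value
    per year; the paper does not state the exact distance measure.
    """
    ranked = sorted(index_by_year,
                    key=lambda year: abs(index_by_year[year] - target_value))
    return sorted(ranked[:n_analogs])


def split_by_projection(field, stable_basis):
    """Split a flattened field into a projected part and a residual.

    Assumption: stable_basis is a list of orthonormal vectors standing in
    for the stable components; how they are trained is not shown here.
    """
    predictable = [0.0] * len(field)
    for vector in stable_basis:
        # Projection coefficient of the field on this basis vector.
        coefficient = sum(f * b for f, b in zip(field, vector))
        for i, b in enumerate(vector):
            predictable[i] += coefficient * b
    residual = [f - p for f, p in zip(field, predictable)]
    return predictable, residual


# Placeholder index values, invented for illustration (not NOAA data).
toy_index = {1979: 0.0, 1980: 0.5, 1981: -0.5, 1982: 2.0, 1983: -1.0}
analog_years = select_analog_years(toy_index, target_value=0.4)

# Toy three-point "field" and a two-vector orthonormal basis.
toy_field = [3.0, 4.0, 5.0]
toy_basis = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
predictable, residual = split_by_projection(toy_field, toy_basis)
```

In this toy setting the projected part plays the role of the predictable component and the residual that of the random component, which would then be replaced by a statistical estimate built from the analog years.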
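In this study, the subseasonal forecast skill of OLR is measured by PCC and TCC. The PCC formula is expressed as follows:

$$\mathrm{PCC} = \frac{\sum_{i=1}^{n}\left(X_{f,i}-X_{c,i}-M_{f,c}\right)\left(X_{v,i}-X_{c,i}-M_{v,c}\right)\cos\varphi_i}{\sqrt{\sum_{i=1}^{n}\left(X_{f,i}-X_{c,i}-M_{f,c}\right)^{2}\cos\varphi_i \,\sum_{i=1}^{n}\left(X_{v,i}-X_{c,i}-M_{v,c}\right)^{2}\cos\varphi_i}} \tag{1}$$

where $X_f$, $X_v$, and $X_c$ are the predicted value, observed value, and climate mean, respectively; $M_{f,c}$ and $M_{v,c}$ represent the means of the deviations of $X_f$ and $X_v$ from $X_c$, respectively; $n$ represents the number of grids in the chosen area; and $\varphi_i$ is the latitude of grid $i$, so that $\cos\varphi_i$ weights each grid by latitude.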
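The PCC computation can be sketched in a few lines of Python. This is a minimal sketch, assuming the usual latitude-weighted, centered anomaly correlation that the symbol definitions above suggest; the function name, the choice of cosine-weighted deviation means, and the sample values are illustrative assumptions only.

```python
import math


def pattern_correlation(forecast, observed, climatology, lats_deg):
    """Latitude-weighted centered pattern correlation over grid points.

    All inputs are flat lists with one entry per grid point.
    Assumption: the deviation means are cosine-latitude weighted.
    """
    weights = [math.cos(math.radians(lat)) for lat in lats_deg]
    f_anom = [f - c for f, c in zip(forecast, climatology)]
    v_anom = [v - c for v, c in zip(observed, climatology)]
    weight_sum = sum(weights)
    m_f = sum(w * a for w, a in zip(weights, f_anom)) / weight_sum
    m_v = sum(w * a for w, a in zip(weights, v_anom)) / weight_sum
    cov = sum(w * (a - m_f) * (b - m_v)
              for w, a, b in zip(weights, f_anom, v_anom))
    var_f = sum(w * (a - m_f) ** 2 for w, a in zip(weights, f_anom))
    var_v = sum(w * (b - m_v) ** 2 for w, b in zip(weights, v_anom))
    return cov / math.sqrt(var_f * var_v)


# Invented OLR-like values in W/m^2 at four latitudes (not real data).
lats = [0.0, 10.0, 20.0, 30.0]
clim = [240.0, 250.0, 260.0, 270.0]
obs = [245.0, 248.0, 265.0, 262.0]
mirrored = [2 * c - o for c, o in zip(clim, obs)]  # anomalies with flipped sign

pcc_perfect = pattern_correlation(obs, obs, clim, lats)
pcc_mirrored = pattern_correlation(mirrored, obs, clim, lats)
```

A forecast identical to the observation gives a PCC of 1, and a forecast whose anomalies have the opposite sign gives -1, which is a quick sanity check for any implementation of the score.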
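The TCC formula is expressed as follows:

$$\mathrm{TCC} = \frac{\sum_{t=1}^{n}\left(X_{f,t}-\overline{X_f}\right)\left(X_{v,t}-\overline{X_v}\right)}{\sqrt{\sum_{t=1}^{n}\left(X_{f,t}-\overline{X_f}\right)^{2}\,\sum_{t=1}^{n}\left(X_{v,t}-\overline{X_v}\right)^{2}}} \tag{2}$$

where $X_f$ and $X_v$ are the predicted value and observed value; $\overline{X_f}$ and $\overline{X_v}$ represent the time means of $X_f$ and $X_v$, respectively; and $n$ represents the number of time steps. The TCC at every grid point is calculated by formula (2).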
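For a single grid point, the TCC reduces to the Pearson correlation between the forecast and observed time series. The Python sketch below assumes that reading of the definitions above; the function name and the sample series are hypothetical.

```python
import math


def temporal_correlation(forecast_series, observed_series):
    """Pearson correlation in time between two equally long series."""
    n = len(forecast_series)
    mean_f = sum(forecast_series) / n
    mean_v = sum(observed_series) / n
    cov = sum((f - mean_f) * (v - mean_v)
              for f, v in zip(forecast_series, observed_series))
    var_f = sum((f - mean_f) ** 2 for f in forecast_series)
    var_v = sum((v - mean_v) ** 2 for v in observed_series)
    return cov / math.sqrt(var_f * var_v)


# Invented series at one grid point (not real data).
observed = [230.0, 235.0, 228.0, 240.0]
scaled = [2.0 * v + 1.0 for v in observed]   # same shape, different scale
flipped = [-v for v in observed]             # opposite variation

tcc_scaled = temporal_correlation(scaled, observed)
tcc_flipped = temporal_correlation(flipped, observed)
```

Because the score is invariant to a linear rescaling of the forecast, it measures how well the temporal variation is reproduced rather than the amplitude or bias; applying the function at every grid point yields a TCC map.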

SUBSEASONAL FORECAST SKILL OF MME AND STSDSF MODEL
For subseasonal forecast products, the development trend is more reliable than daily outputs [54,55]. In this study, five-day mean PCCs of both MME and STsDSF are calculated for 12 pentads. The comparison between MME PCC and STsDSF PCC shows that the trend of STsDSF is similar to that of MME. However, the variance of MME PCC is greater than that of STsDSF PCC. For example, in 2004 MME PCC ranges from 0.80 to -0.18, while STsDSF PCC ranges from 0.36 to -0.15 (Figure 2A). The reason is that MME is effective at simulating OLR at the start time, but its accuracy drops quickly over time. According to the five OLR subseasonal forecasting cases from 2004 to 2008 (Figure 2), the PCC of STsDSF was generally higher than that of MME when the lead time was beyond 6 pentads. For instance, the PCC of STsDSF is higher than that of MME during pentads 7-12 in 2005 and pentads 7-11 in 2006 (Figures 2B,C). Averaged over pentads 7-12, the PCC of STsDSF is 0.24 and 0.16 in 2005 and 2006, versus 0.10 and 0.05 for MME, respectively. In 2007, the PCC of STsDSF is higher than that of MME in pentads 9-11 (Figure 2D), while the forecasting skill of STsDSF is worse than that of MME in 2008 (Figure 2E). A possible reason is that the MME forecast in 2008 is more accurate than in the other cases, so the MME PCC drops slowly over time and leaves no room for improvement by STsDSF. The OLR forecast skill of STsDSF thus depends on the performance of the model simulation. If the model can reasonably capture the climatic patterns on a subseasonal scale, the improvement offered by STsDSF will be reduced.
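The pentad averaging used above can be sketched as follows. This is a minimal Python sketch, assuming that a 60-day series of daily scores is split into consecutive, non-overlapping five-day blocks; the function name and the sample series are hypothetical.

```python
def pentad_means(daily_values, days_per_pentad=5):
    """Average a daily series over consecutive, non-overlapping pentads.

    Days left over after the last complete pentad are dropped.
    """
    n_pentads = len(daily_values) // days_per_pentad
    means = []
    for p in range(n_pentads):
        block = daily_values[p * days_per_pentad:(p + 1) * days_per_pentad]
        means.append(sum(block) / days_per_pentad)
    return means


# Invented 60-day series standing in for daily PCC values (not real data).
daily_scores = [float(day) for day in range(60)]
pentad_scores = pentad_means(daily_scores)
```

With 60 daily values this yields the 12 pentads referred to in the text, with pentad 7 covering days 31 to 35 of the forecast.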
Unlike in pentads 1-6, the STsDSF model shows some advantages in pentads 7-12. The five-year mean PCC of the STsDSF model is compared with that of MME for OLR in pentads 7-12 in Figure 3. The PCC of the STsDSF model ranges from -0.03 to 0.20, and the PCC of MME ranges from 0.10 to 0.18. Except for pentad 12, the STsDSF model is more skillful than MME in pentads 7-11. The five-year average therefore agrees with the conclusion above: only when the skill of MME drops to a certain level does the improvement by STsDSF show some advantages.
The standard deviation of STsDSF PCC is less than that of MME PCC (Figure 3), which indicates that the STsDSF model is more stable. The reason is that the principle of the STsDSF model is to distinguish the predictable information of the model based on the climate state and to replace the random information with climate analogues. For the subseasonal OLR forecast over the tropical Pacific region after 30 days, statistical methods still have the potential to surpass the dynamic model. To further investigate the STsDSF model performance on the subseasonal time scale pentad by pentad, the next section presents a case study from 2005 that analyzes the characteristics of OLR forecast error over the Pacific region.
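The multi-year mean and spread of the pentad PCC can be sketched with the Python standard library. This is only an illustration: the function name and the PCC values are invented, and the use of the population standard deviation is an assumption, since the text does not say which estimator underlies Figure 3.

```python
from statistics import mean, pstdev


def pentad_summary(pcc_by_year):
    """Return per-pentad mean and standard deviation across forecast cases.

    pcc_by_year maps a year to its list of pentad PCC values.
    Assumption: population standard deviation (pstdev) across years.
    """
    years = sorted(pcc_by_year)
    n_pentads = len(pcc_by_year[years[0]])
    means, spreads = [], []
    for p in range(n_pentads):
        values = [pcc_by_year[year][p] for year in years]
        means.append(mean(values))
        spreads.append(pstdev(values))
    return means, spreads


# Invented PCC values for two cases and two pentads (not from Figure 3).
toy_pcc = {2004: [0.1, 0.2], 2005: [0.3, 0.4]}
pcc_means, pcc_spreads = pentad_summary(toy_pcc)
```

A smaller spread for one method at a given pentad indicates that its skill varies less from case to case, which is the sense in which the text calls the STsDSF model more stable.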

SUBSEASONAL FORECAST OF STSDSF MODEL AND MME IN 2005
In this section, the forecast case covering January-February 2005 is selected as an example for specific analysis. TCC is a common way to investigate the spatial distribution of predictability. The TCC of the STsDSF model and MME is shown in Figure 4. Red areas indicate high forecasting skill, and blue areas indicate low forecasting skill. As shown in Figure 4B, MME forecasts well in most tropical regions, with TCC exceeding 0.993. Only in two meridionally oriented bands (near 180° and 120°W) is the TCC lower than 0.989, and the distribution patterns of the two regions are very similar. The corresponding TCC of the STsDSF model in Figure 4A shows that the overall distribution pattern is the same as that of MME, and the values are close. However, the blue areas are more complete and more continuous in the TCC of the STsDSF model, whereas they are more dispersed in the TCC of MME. Comparing Figure 4A with Figure 4B, the overall forecast skill pattern does not change greatly in the subseasonal period, and the distribution characteristics of MME are retained in the STsDSF model.
The variation of MME forecast error for OLR in January-February 2005 is shown pentad by pentad (Figure 5). MME captures the movement of OLR very well in pentad 1 (Figure 5A). During pentads 2-5, the forecast error increases to 60 W/m², and the scattered anomalies are mainly distributed in the southwest and northeast of the tropical Pacific (Figures 5A-E). During pentads 8-12, the forecast error increases to 80 W/m², and the error regions become complete and continuous. Overall, the MME forecast error is mainly negative, and the skill is poor over the southwest and northeast of the tropical Pacific. MME can capture the patterns of OLR over the tropical Pacific at the beginning, and the error grows gradually over time, which is consistent with the PCC changes in the above analysis (Figure 2).
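Error growth of this kind can be summarized numerically. The Python sketch below computes the pentad-mean forecast error at each grid point and condenses it into one root-mean-square value per pentad. The RMSE summary is an assumption made here for illustration; the paper itself presents error maps, and the function name and data are hypothetical.

```python
import math


def pentad_rmse(forecast, observed, days_per_pentad=5):
    """Root-mean-square of the pentad-mean error over all grid points.

    forecast and observed are lists of daily fields, each field a flat
    list of grid values (W/m^2). Incomplete trailing pentads are dropped.
    """
    n_pentads = len(forecast) // days_per_pentad
    n_grid = len(forecast[0])
    result = []
    for p in range(n_pentads):
        days = range(p * days_per_pentad, (p + 1) * days_per_pentad)
        # Pentad-mean error (forecast minus observation) at each grid point.
        errors = [
            sum(forecast[d][g] - observed[d][g] for d in days) / days_per_pentad
            for g in range(n_grid)
        ]
        result.append(math.sqrt(sum(e * e for e in errors) / n_grid))
    return result


# Invented two-point fields for ten days (not real data).
toy_observed = [[200.0, 210.0] for _ in range(10)]
toy_forecast = ([[200.0, 210.0] for _ in range(5)]
                + [[203.0, 206.0] for _ in range(5)])
rmse_by_pentad = pentad_rmse(toy_forecast, toy_observed)
```

Plotting such a series for two forecast systems would show which error grows more slowly with lead time, complementing the spatial detail of the error maps.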
Unlike the error distribution of MME for OLR subseasonal forecasts, the forecast error of the STsDSF model shows larger negative anomalies in pentads 1-4 (Figures 6A-D). The error is smaller near the equator and larger at higher latitudes over the Pacific. In addition, the pattern and intensity of the STsDSF model error change little during pentads 5-12 (Figures 6E-L). Based on the diagnosis of the OLR subseasonal forecast skill of the STsDSF model and MME in January-February 2005, although the STsDSF model error is larger than the MME error at the beginning, it grows slowly and shows some advantages in pentads 8-12 (Figures 6H-L).

DISCUSSION AND SUMMARY
The predictability of the atmospheric stable component mainly comes from three sources: outer boundary condition forcing, the continuous component of low-frequency waves, and the atmospheric periodic component. Furthermore, these atmospheric components are statistically regular [56,57]. In this study, a new method called the STsDSF model is proposed to improve the subseasonal dynamic model forecast of OLR over the tropical Pacific region in January-February from 2004 to 2008. We compare the subseasonal forecast skill of the STsDSF model with that of the MME of 11 advanced models and analyze the growth characteristics of the subseasonal forecast error.
The conclusions of this study reveal that the performance of the STsDSF model is largely determined by the forecasting skill of the dynamical model. The STsDSF model only provides additional improvement when the dynamical model output is unreliable. According to the five OLR subseasonal forecasting cases from 2004 to 2008, the PCC of STsDSF is generally higher than that of MME when the lead time is beyond 6 pentads. These results demonstrate that the STsDSF model can be used as an important reference for subseasonal forecasting after 30 days. The five-year average results agree well with the above discussion.
Moreover, the overall OLR TCC pattern of the STsDSF model is similar to that of MME, and the TCC values are close over the tropical Pacific. The change characteristics of MME are retained in the STsDSF model in time as well as in space. In addition, a pentad-by-pentad comparison of the subseasonal forecast error shows that the error of the STsDSF model grows more slowly than that of MME over the 12 pentads.
The STsDSF model only uses the initial condition and historical observation data to improve the subseasonal dynamical model and could feasibly be incorporated into existing operational real-time forecast products. However, further studies are still needed. For example, we focus only on the subseasonal forecast of tropical Pacific OLR in January-February 2004-2008. Further investigation of the STsDSF model for different meteorological elements, periods, and regions needs more modeling studies. In this study, the similarity criterion for selecting historical data is Niño 3.4, which is considered an important external boundary forcing signal. However, Niño 3.4 is a Pacific index and cannot stand for all SST forcings [58,59]. Therefore, a more comprehensive index that assigns different weights to different ocean forcings could be developed. In addition, when the El Niño-Southern Oscillation (ENSO) cycle on the interannual timescale is considered, the relationship between the subseasonal forecast skill of the STsDSF model and the ENSO cycle is intriguing and could be further discussed in the future.
The hindcast skill of ISVHE is better than that of operational models, so it needs to be further verified whether the improvement effect of the STsDSF model differs for dynamical model forecasts with different forecasting skills. Moreover, because of the limited time span of the dynamical model data, we only selected cases from five consecutive years for comparative analysis. Therefore, more simulation experiments are needed in further studies.

DATA AVAILABILITY STATEMENT
The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.