Initialization shock in the ocean circulation reduces skill in decadal predictions of the North Atlantic subpolar gyre

Due to large northward heat transport, the Atlantic meridional overturning circulation (AMOC) strongly affects the climate of various regions. Its internal variability has been shown to be predictable decades ahead within climate models, providing the hope that synchronizing ocean circulation with observations can improve decadal predictions, notably of the North Atlantic subpolar gyre (SPG). Climate predictions require a starting point, which is a reconstruction of the past climate. This is usually performed with data assimilation methods that blend available observations and climate model states together. There is no unique method to derive the initial conditions. Moreover, this can be performed using full-field observations or their anomalies superimposed on the model's climatology to avoid strong drifts in predictions. How critical ocean circulation drifts are for prediction skill has not been assessed yet. We analyze this possible connection using the dataset of decadal prediction systems from the World Meteorological Organization Lead Centre for Annual-to-Decadal Climate Prediction. We find a variety of initial AMOC errors within the predictions related to dynamically imbalanced ocean states leading to strongly displaced or multiple maxima in the overturning structures. This likely results in a blend of what is known as model drift and initial shock. We identify that the AMOC initialization influences the quality of the SPG predictions. When predictions show a large initial error in their AMOC, they usually have low skill for predicting internal variability of the SPG for a time horizon of 6-10 years. Full-field initialized predictions with low AMOC drift show better SPG skill than those with a large AMOC drift. Nevertheless, while the anomaly-initialized predictions do not experience large drifts, they show low SPG skill when skill also present in historical runs is removed using a residual correlation metric. Thus, reducing initial shock and model biases for the ocean circulation in prediction systems might help to improve their prediction for the SPG beyond 5 years. Climate predictions could also benefit from a quality-check procedure for assimilation/initialization because currently the research groups only reveal the problems in initialization once the set of predictions has been completed, which is an expensive effort.

Introduction
The Atlantic meridional overturning circulation (AMOC) transports a large amount of heat to the North Atlantic, affecting climate in the Euro-Atlantic sector. Total poleward heat transport constitutes approximately 0.4-0.6 PW (zonally accumulated heat transport over the full water column starting from the Greenland coast toward Scotland; Böning et al., 1996; Zhao et al., 2018). Previous studies have shown that the AMOC and its slow oscillation exhibit strong predictability when perturbing its initial conditions (Collins et al., 2006), and as such, it is believed to be a source of predictability in decadal climate predictions (DCPs). For instance, the warming of the North Atlantic subpolar gyre (SPG) in the late 1990s has been attributed to the increased northward heat transport due to a strong AMOC (Robson et al., 2012; Williams et al., 2014; Yeager and Danabasoglu, 2014), and the subsequent cooling in the 2000s has been associated with the AMOC slowdown (Hermanson et al., 2014). Severe winter conditions in 2010 and 2011 over northwestern Europe were associated with the interannual variability of the AMOC (Bryden et al., 2014). Understanding predictability of the AMOC under climate-change conditions is of great interest for narrowing down the uncertainty in climate-change projections (Swingedouw et al., 2022). Earlier studies of potential predictability suggest that the AMOC is predictable up to two decades ahead or even longer under certain conditions (Griffies and Bryan, 1997; Collins and Sinha, 2003; Pohlmann et al., 2004). More recent DCP studies, which compare predictions with the reconstructions of the recent past climate from the data assimilation products, show a range of predictability from 4 to 10 years depending on the model (Matei et al., 2012; Persechino et al., 2013; Swingedouw et al., 2013; Polkova et al., 2014; Mignot et al., 2016; Yang et al., 2021). Due to the lack of sufficiently long basin-wide and deep-ocean observational data, the actual prediction skill for the AMOC is difficult to assess. The best assessment that we can currently perform is against the data assimilation products, e.g., ocean reanalyses, but they agree poorly with each other (Karspeck et al., 2015; Jackson et al., 2019).
DCPs aim to provide the most accurate predictions of the near-term climate. In addition to the external forcing, they utilize the knowledge of the observed climate state. The components of the Earth System Models that are usually synchronized with the recently observed climate state are the ocean and the atmosphere (Meehl et al., 2014; Boer et al., 2016). Most studies are interested in predicting user-relevant climate indices for their further application in climate services (Marotzke et al., 2016; Smith et al., 2019; Solaraju-Murali et al., 2022). Several studies that have assessed the prediction skill for the AMOC from the DCPs reported some issues in the initialization of the AMOC (Smith et al., 2013; Kröger et al., 2017; Bilbao et al., 2021). For instance, Kröger et al. (2017) showed that errors generated during the full-field nudging procedure applied to the MPI-ESM led to artificial ocean heat transport (OHT) in the interior of the ocean. This, in turn, had an impact on the AMOC and led to large errors in surface temperature predictions in the North Atlantic region. Bilbao et al. (2021) showed that in the full-field initialized EC-Earth3 predictions, the SPG has poor skill due to an initialization shock which led to the collapse of the Labrador Sea convection and a rapid decline in the AMOC. So far, it has not been studied whether AMOC initialization errors occur in other prediction systems and whether they have an effect on the predictability of other climate variables, as shown by Kröger et al. (2017). Moreover, no study has yet leveraged the multi-model ensemble to draw general conclusions about the impact of initialization shock, as our study aims to do. In terms of terminology, it is worth mentioning that, theoretically, there is a distinction between initialization shock, which results from an imperfect initialization method, and model drift, which occurs due to the existence of biases and is independent of initialization. Practically, these two types of errors are difficult to separate from each other (Mulholland et al., 2015), and sometimes the terms are used interchangeably. As we analyze the DCPs, we are interested in both types of errors, but in particular in those related to initialization, which can hopefully be fixed with more skillful initialization procedures.
Here, we examine DCPs contributing to the annually issued World Meteorological Organization's (WMO's) Global Annual to Decadal Climate Update (Hermanson et al., 2022; www.wmolc-adcp.org). To study the effects of initialization shocks, both predictions and their initial conditions (produced with data-constrained assimilation runs) are needed. The assimilation runs are often considered as an intermediate step to produce initial conditions and usually do not receive as much attention in publications as the ocean or atmosphere reanalyses do. The quality of the assimilation runs is thus usually indirectly confirmed by the prediction skill of the DCPs. However, carrying out a full set of retrospective and future DCPs is an expensive effort (one set of hindcasts accounts for about 6,000 model years), and thus, it would be useful to identify, at earlier stages, possible initialization issues that might reduce the quality of the predictions. Specifically, we analyze AMOC initial shocks resulting from the initialization and data assimilation procedures. The North Atlantic is the region where initialization is reported to bring most of the improvement in DCPs (Boer et al., 2013; Meehl et al., 2014). In particular, the subpolar gyre exhibits longer skill than other regions of the ocean in terms of surface temperature and ocean heat content (Polkova et al., 2019). The SPG is an important region for deep water formation (Rhein et al., 2011). Thus, we aim to identify possible links between the AMOC initialization aspects and the prediction skill at the ocean interface in the region of the North Atlantic subpolar gyre.

Materials and methods
The multi-model ensemble forecasts are a part of the operational decadal predictions hosted by the WMO and coordinated by the Met Office (Hermanson et al., 2022). The decadal predictions are produced by the WMO-designated Global Producing Centres and other contributing centers. The decadal predictions include DCPP-A experiments (Decadal Climate Prediction Project Component A; Boer et al., 2016) of the CMIP6 (Coupled Model Intercomparison Project Phase 6). From this data set, we analyze near-surface air temperature and the AMOC stream function. In addition to the decadal predictions, the CMIP6 historical (uninitialized) simulations for near-surface air temperature from the same prediction systems are used to distinguish the impact of external forcing from that of initialization.
The DCPs were produced with the following prediction systems: MPI-ESM-LR, MIROC6, GFDL SPEAR, CanCM4, CAFE, FGOALS-f3-L, MRI-ESM2, NorCPM1, BSC EC-Earth3, CMCC-CM2-SR5, SMHI+DMI EC-Earth3, and HadGEM3-GC31-MM (hereafter DePreSys4). Except for CanCM4, which contributed to CMIP5, all prediction systems contribute to CMIP6. The DCPs are ensembles of 10-year simulations, except for those from MRI-ESM2, which are 5-year simulations. Retrospective predictions are initialized every year. The starting month differs across the prediction systems (October, November, or January). We analyze the prediction skill over the common initialization period for the retrospective DCPs (hindcast period), which is 1961-2018. Further details on the DCPs are presented in Table 1 and under the System Configuration Information for Global Producing Centers at www.wmolc-adcp.org.
We analyze the effect of AMOC initialization shock on the prediction skill of the North Atlantic subpolar gyre (SPG). The SPG index is calculated as the average of the near-surface temperature anomalies (with respect to the climatology over the period 1991-2020) over the region 45-67°N and 60-0°W (Robson et al., 2018). The choice of the climatology period follows the WMO recommendations (https://community.wmo.int/en/wmoclimatological-normals). ERA5 (Hersbach et al., 2020) and RAPID (Bryden et al., 2014) are used as the reference (observational) datasets. We apply the anomaly correlation coefficient, root mean square error, and residual correlation (Smith et al., 2019) as the verification metrics. For residual correlation, residual predicted and observed time-series are generated. For this, the uninitialized ensemble mean is removed via linear regression from the predicted and observed time-series, respectively (see Smith et al., 2019 for details). The residuals represent the variability that cannot be captured by the uninitialized simulations, and their correlation represents the impact of initialization. To evaluate residual correlation for the SPG index, we use the historical simulations that are available for the following models: MPI-ESM-LR (10 members), MIROC6 (33 members), GFDL SPEAR (1 member), CanCM4 (9 members), MRI-ESM2 (10 members), BSC EC-Earth3 (14 members), CMCC-CM2-SR5 (1 member), DePreSys4 (4 members), NorCPM1 (30 members), and FGOALS-f3-L (3 members). Two models, BSC EC-Earth3 and SMHI+DMI EC-Earth3, share the same historical simulations. For all of these models, in the diagnostics of residual correlation, we removed from the DCPs the external forcing signals estimated from the historical simulations of the same model. For CAFE, since we do not have the corresponding historical simulations, we used the 115-member multi-model ensemble mean of the historical runs from the available models.
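As a minimal sketch of the SPG index calculation described above, the regional average can be written as follows. The cos(latitude) area weighting, the -180 to 180 longitude convention, and the function name `spg_index` are assumptions made for illustration; the text only states that the anomalies are averaged over 45-67°N and 60-0°W.

```python
import math


def spg_index(anom, lats, lons, lat_bounds=(45.0, 67.0), lon_bounds=(-60.0, 0.0)):
    """Area-weighted mean of a 2D anomaly field over the SPG box.

    anom[i][j] is the anomaly at latitude lats[i] and longitude lons[j].
    Longitudes are assumed to run from -180 to 180 degrees east, so
    60-0 degrees W becomes -60..0 (a convention chosen for this sketch).
    """
    total = 0.0
    weight_sum = 0.0
    for i, lat in enumerate(lats):
        if not lat_bounds[0] <= lat <= lat_bounds[1]:
            continue
        weight = math.cos(math.radians(lat))  # assumed cos(latitude) area weight
        for j, lon in enumerate(lons):
            if lon_bounds[0] <= lon <= lon_bounds[1]:
                total += weight * anom[i][j]
                weight_sum += weight
    if weight_sum == 0.0:
        raise ValueError("no grid points fall inside the SPG box")
    return total / weight_sum


# Toy grid: anomaly of 1.0 K inside the box and 5.0 K outside it.
lats = [30.0, 50.0, 60.0, 80.0]
lons = [-90.0, -45.0, -15.0, 30.0]
anom = [
    [1.0 if (45.0 <= la <= 67.0 and -60.0 <= lo <= 0.0) else 5.0 for lo in lons]
    for la in lats
]
index = spg_index(anom, lats, lons)
print(round(index, 6))
```

With real model output, the same selection and weighting would typically be applied per time step with an array library; the loop form is used here only to keep the example dependency-free.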

Prediction skill for the SPG
We analyze predictions of the SPG index from the 12 DCP experiments (Figure 1). Overall, most DCPs predict the observed warming in the 1990s and cooling in the 2010s. Some models struggle with predicting the warming anomalies in the most recent observed decade (e.g., GFDL SPEAR and NorCPM1). In addition, in the early initialization period of the 1960s-1980s, half of the prediction systems show lower SPG temperature anomalies than the ERA5 atmospheric reanalysis suggests (i.e., GFDL SPEAR, MRI-ESM2, BSC EC-Earth3, CMCC-CM2-SR5, SMHI+DMI EC-Earth3, and DePreSys4). This might be related to the rather sparse observational records of temperature and salinity profiles that are used for initialization in that period (de Boisséson et al., 2018). Interestingly, GFDL SPEAR, which only implements ocean surface initialization, also shows lower SPG temperature anomalies than ERA5. Another reason could be that the DCPs overestimate the warming trend. Since the climatological period for bias-correction is 1991-2020, DCPs with too large a trend would appear colder in the earlier period.
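The lead-time dependent bias correction mentioned above can be illustrated with a short sketch. It assumes that the bias at each lead year is the mean forecast-minus-observation difference over verification years inside the 1991-2020 reference period, and that a forecast started in a given year verifies in start year plus lead; the function names and this simplified year mapping are hypothetical and not taken from the WMO processing chain.

```python
def leadtime_bias(hindcasts, obs, ref_years=(1991, 2020)):
    """Mean forecast-minus-observation difference for each lead year.

    hindcasts: dict {start_year: [value at lead year 1, lead year 2, ...]}
    obs:       dict {year: observed value}
    A forecast started in `start_year` is assumed to verify in
    start_year + lead. Real systems start in October, November, or
    January, so this mapping is a simplification for the sketch.
    """
    n_leads = max(len(values) for values in hindcasts.values())
    bias = []
    for lead in range(1, n_leads + 1):
        diffs = []
        for start, values in hindcasts.items():
            year = start + lead
            in_reference = ref_years[0] <= year <= ref_years[1]
            if lead <= len(values) and in_reference and year in obs:
                diffs.append(values[lead - 1] - obs[year])
        bias.append(sum(diffs) / len(diffs) if diffs else float("nan"))
    return bias


def remove_bias(hindcasts, bias):
    """Subtract the lead-time dependent bias from every hindcast."""
    return {
        start: [value - bias[k] for k, value in enumerate(values)]
        for start, values in hindcasts.items()
    }


# Toy case: observations are zero, every hindcast drifts by 0.5 per lead year.
obs = {year: 0.0 for year in range(1991, 2021)}
hindcasts = {start: [0.5, 1.0, 1.5] for start in range(1990, 2018)}
bias = leadtime_bias(hindcasts, obs)
corrected = remove_bias(hindcasts, bias)
print(bias)
```

Because the bias is estimated over 1991-2020 only, a system whose trend is too strong will, after correction, still sit below the observations in earlier decades, which is the effect discussed for the 1960s-1980s.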
The prediction skill for the SPG index is shown in Figure 2. The models MPI-ESM-LR and CanCM4, as well as FGOALS-f3-L, show reduced SPG skill in terms of correlation, especially for the second pentad (Figure 2A). A few other models such as BSC EC-Earth3 and SMHI+DMI EC-Earth3 show enhanced root mean square errors (Figure 2B), possibly due to errors that have been associated with a non-stationary expression of the forecast drift (Bilbao et al., 2021). Overall, correlation and root mean square error for other models are consistent with each other and support long-lasting predictability for the SPG. The prediction skill from the multi-model DCP experiments, some of which are also used in ...
The prediction skill for the SPG might be dominated by the externally forced signal (Borchert et al., 2021). Therefore, to assess the impact of initialization, we analyze the residual correlation (Figure 2C). The analysis of the residual correlation reveals a larger spread among DCPs from different models than that of the total correlation (Figure 2A); more DCP experiments from various models fall below the multi-model average of DCPs (Figure 2C). Overall, the residual correlation tends to be not statistically significant for the DCPs that show low skill either in terms of the root mean square error (Figure 2B) or anomaly correlation (Figure 2A).
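The residual correlation used here can be sketched as follows, assuming plain least-squares regression on the uninitialized ensemble mean and a Pearson correlation of the two residual series. The helper names (`regress_out`, `pearson`, `residual_correlation`) and the toy series are illustrative and are not taken from Smith et al. (2019).

```python
import math


def _mean(values):
    return sum(values) / len(values)


def regress_out(y, x):
    """Residual of y after removing its least-squares linear fit on x."""
    mx, my = _mean(x), _mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    return [(yi - my) - slope * (xi - mx) for xi, yi in zip(x, y)]


def pearson(a, b):
    """Pearson correlation coefficient of two equally long series."""
    ma, mb = _mean(a), _mean(b)
    cov = sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b))
    var_a = sum((ai - ma) ** 2 for ai in a)
    var_b = sum((bi - mb) ** 2 for bi in b)
    return cov / math.sqrt(var_a * var_b)


def residual_correlation(forecast, obs, uninit):
    """Correlation of forecast and observations after the uninitialized
    ensemble mean has been regressed out of both series."""
    return pearson(regress_out(forecast, uninit), regress_out(obs, uninit))


# Toy series: a forced trend plus an "internal" oscillation.
forced = [0.1 * t for t in range(20)]
internal = [math.sin(t) for t in range(20)]
obs = [f + s for f, s in zip(forced, internal)]
# The forecast overestimates the trend but captures half of the internal signal.
forecast = [2.0 * f + 0.5 * s for f, s in zip(forced, internal)]
r_res = residual_correlation(forecast, obs, forced)
print(round(r_res, 6))
```

In this toy case the residual correlation is high because the forecast carries the internal signal; a forecast that only reproduced the forced trend would have a high total correlation but a residual correlation near zero.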

AMOC after initialization
To analyze the prediction skill of the SPG index linked with the AMOC initialization, the initial conditions from the assimilation runs are needed. Due to the absence of these runs in the WMO dataset, we use the respective first month of each DCP as a proxy for the initial state and further examine decadal predictions from the first few months to a few years after the initialization. The target here is to look for abrupt changes indicative of the potential shock from initial conditions when the system is released from observational constraints. The overturning cells in different models in the first month after initialization exhibit large differences (Figure 3). Due to different initialization months (Table 1), the AMOC cells from different models for the first lead month cannot be directly compared with each other. The analysis of the time-series of the AMOC leadtime-dependent climatology at 25°N and 40°N shows that the AMOC experiences a drift of 1 to 12 Sv depending on the model and latitude (Figure 4 and Supplementary Figures 1, 2). The AMOC drift is approximately linear initially and, for some DCPs, it saturates after several years (e.g., MPI-ESM-LR, CMCC-CM2-SR5, and NorCPM1; Figure 4 and Supplementary Figures 1, 2). The AMOC drift is also not stationary, i.e., in some models, it is larger in the earlier initialization period (e.g., BSC EC-Earth3 and CAFE; Supplementary Figures 1-4) and in other models in the more recent initialization period (e.g., MPI-ESM-LR and GFDL SPEAR; Supplementary Figures 1-4).
Comparing the initial AMOC cells of each DCP with those at later lead years suggests a distorted cell structure in some of the DCPs at the initialization step (e.g., MPI-ESM-LR and CanCM4; Supplementary Figures 5-7). During integration, the upper AMOC cell in DCPs "recovers", presumably to the model's preferred state; however, depending on the severity of the initialization shock, this process might take different times in various DCPs. To visualize the AMOC drift for the whole Atlantic basin, we fit a linear regression to the leadtime-dependent climatology of AMOC (Figure 5). The climatology is calculated over the period 1991-2020, which is used to diagnose the bias for the bias-correction procedure in DCPs. If the AMOC drift is associated with the ocean heat transport toward or from the SPG, it could affect the variability of the SPG, especially in the later lead years (2nd pentad), as previous studies showed that changes in the AMOC in northern high latitudes lead temperature anomalies in the subpolar gyre region by several years (Zhang and Zhang, 2015; Borchert et al., 2018). The AMOC was shown to affect the skill of the SPG in one of the early development stages of the decadal prediction system based on MPI-ESM-LR (Marotzke et al., 2016; Kröger et al., 2017, with the ocean initialization based on full-field nudging). Similarly, Robson et al. (2012) showed for the earlier version of DePreSys that ocean heat transport changes are responsible for the predictability of the SPG in the 1990s and that successful AMOC initialization was essential for skilful temperature predictions. Thus, in the following, we analyze the AMOC drift in more detail and contrast it against the SPG skill in the multi-model ensemble.
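The drift diagnostic described above, a linear regression fitted to the leadtime-dependent climatology, can be sketched for a single grid cell as follows. The sketch assumes that the climatology is the mean over start dates at each lead year and that the slope is taken against lead year in units of Sv/year; the function names and the toy numbers are hypothetical.

```python
def leadtime_climatology(hindcasts):
    """Mean over start dates at each lead year.

    hindcasts: list of equally long series, one per start date, each
    holding the AMOC strength (Sv) at lead years 1, 2, ...
    """
    n_starts = len(hindcasts)
    n_leads = len(hindcasts[0])
    return [sum(h[k] for h in hindcasts) / n_starts for k in range(n_leads)]


def drift_slope(climatology):
    """Least-squares slope (Sv per lead year) through the climatology."""
    n = len(climatology)
    leads = list(range(1, n + 1))
    mean_lead = sum(leads) / n
    mean_clim = sum(climatology) / n
    sxx = sum((lead - mean_lead) ** 2 for lead in leads)
    sxy = sum(
        (lead - mean_lead) * (value - mean_clim)
        for lead, value in zip(leads, climatology)
    )
    return sxy / sxx


# Toy hindcasts: three start dates, 10 lead years, a drift of 0.8 Sv/year.
hindcasts = [
    [15.0 + 0.8 * lead + offset for lead in range(1, 11)]
    for offset in (-1.0, 0.0, 1.0)
]
climatology = leadtime_climatology(hindcasts)
slope = drift_slope(climatology)
print(round(slope, 6))
```

Applying `drift_slope` to every latitude-depth cell of the overturning stream function would give a drift map of the kind shown for the whole Atlantic basin. A drift that saturates after a few years is only partly captured by a single linear slope.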

FIGURE 1
The time-series for the North Atlantic SPG index from ERA5 (black, all panels), the ensemble mean of the historical simulations (in red, lower panel), and the WMO initialized decadal predictions for different starting dates (in color, corresponding panel). The ensemble mean of the multi-model ensemble of all DCPs is shown in multi-color in the lower panel. The SPG index is calculated as the average of the near-surface temperature anomalies over the region 45-67°N and 60-0°W. The lead-time dependent bias (calculated with respect to 1991-2020) is removed from the hindcasts' time-series of the SPG index.
Several models show large trends in the overturning cells, pointing to the regions where models attempt to fix the AMOC structure after the initialization shock (Figure 5). For example, in CanCM4, there is more overturning in the northern hemisphere and less in the southern one. In MPI-ESM-LR, there is more overturning in 0-20°N, where there is a split of the upper overturning cell at the initialization step (Figure 3). These trends originate from initialization, as the models in non-initialized experiments do not show such features (e.g., for CanCM4 and MPI-ESM-LR; Yang and Saenko, 2012; Brune and Baehr, 2020). The DCPs with the largest drift are full-field initialized (Figure 5). Further analysis of the errors in AMOC from different DCPs is presented in Supplementary Figures 1-9. For example, comparing the AMOC cells among different models in the first month after initialization shows multiple maxima in several models, e.g., for MPI-ESM-LR and DePreSys4. However, without observing the AMOC cells from historical simulations for all models, it is difficult to assess how unusual the AMOC mean state is just after initialization and whether the AMOC in later lead years represents the AMOC of the model attractor or would continue to drift further (beyond 10 lead years).
Overall, several models stand out in terms of both reduced prediction skill for the SPG in the later lead years (Figure 2) and overturning cell features (such as multiple maxima of the mean AMOC cell, AMOC drift, or AMOC root mean square error w.r.t. RAPID): MPI-ESM-LR, CanCM4, FGOALS-f3-L, BSC EC-Earth3, and SMHI+DMI EC-Earth3. A possible cause of the drop in prediction skill for SPG surface temperatures in the full-field initialized DCPs could be the drift in the meridional ocean heat transport (OHT) induced by the assimilation procedure, as reported by Kröger et al. (2017). Anomaly-initialized predictions can also experience drifts, although somewhat smaller, when the anomalies from which predictions are initialized are not compatible with the simulated variability range. In fact, from Figures 4, 5 and Supplementary Figures 1, 2, it is evident that the anomaly-initialized DCPs also experience AMOC drift (MIROC6, FGOALS-f3-L, and NorCPM1). Following the hypothesis of OHT being driven by the AMOC (Zhang and Zhang, 2015), Borchert et al. (2018) showed that in the North Atlantic region, the AMOC and the OHT are highly correlated and that OHT changes at 50°N lead changes in the North Atlantic sea surface temperature by up to 9 years. To investigate the AMOC fingerprint, they identified the range of latitudes for the propagation of AMOC anomalies stretching between 40°N and 50°N.
Relationship between the AMOC drift and the SPG skill
Thus, as shown in Figure 6A, we contrast the SPG residual correlation for the second pentad of the DCPs (lead years 6-10) and the AMOC drift associated with initialization in the range of latitudes 40-50°N and depth 0-5,000 m. The mean AMOC drift in the larger latitudinal band 20-60°N vs. SPG residual skill is shown in Supplementary Figure 10. We also contrast the SPG skill and the AMOC root mean square error estimated with respect to the RAPID data (Figure 6B). When contrasting the AMOC drift and the SPG residual skill, three models fall into the region of low SPG residual skill and high AMOC drift: CanCM4, BSC EC-Earth3, and MPI-ESM-LR. All three models use full-field initialization in the ocean. The other four models with low SPG residual skill show small drift in the region: MIROC6, SMHI+DMI EC-Earth3, FGOALS-f3-L, and NorCPM1. For the latter, the SPG residual skill is statistically significant. They are anomaly initialized in the ocean. The rest of the models are full-field initialized and show small drift at 40-50°N as well as high and significant residual correlation. This suggests that the strong AMOC drift that originates from initialization can affect the evolution of the SPG in the later lead years. However, this does not seem to be the only reason, since some models lack predictability due to internal variability even in the absence of a large drift. For the anomaly-initialized models, the reason might be a poor representation of the observed anomalies, with the large biases being an indicator of poor initialization. Due to the absence of sufficiently long observational AMOC data in the range of latitudes 40-50°N, we cannot estimate the relation between the AMOC skill in higher latitudes and the SPG skill. Instead, we analyzed the AMOC root mean square error at 25°N, as shown in Figure 6B. The root mean square error does not only focus on the initial shock but is a general measure of prediction errors. Similar to Figure 6A, it also points at the models CanCM4 and MPI-ESM-LR with large-scale ocean circulation errors which echo in the SPG skill. Figure 6B also highlights NorCPM1, which has a very large bias for the AMOC.
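To make the regional drift measure concrete, the following sketch aggregates a drift map over the 40-50°N, 0-5,000 m box as the sum of the squared drift values in each grid cell, following the wording of the Figure 6 caption. Whether a square root or a normalization by the number of cells is applied is not stated in the text, so none is used here; the function name `drift_norm` and the toy grid are assumptions.

```python
def drift_norm(slope, lats, depths, lat_range=(40.0, 50.0), depth_range=(0.0, 5000.0)):
    """Sum of squared drift values over a latitude-depth box.

    slope[k][j] is the AMOC drift (Sv/year) at depth depths[k] (m) and
    latitude lats[j] (degrees north).
    """
    total = 0.0
    for k, depth in enumerate(depths):
        if not depth_range[0] <= depth <= depth_range[1]:
            continue
        for j, lat in enumerate(lats):
            if lat_range[0] <= lat <= lat_range[1]:
                total += slope[k][j] ** 2
    return total


# Toy drift map on a 3-depth by 4-latitude grid.
lats = [30.0, 42.0, 48.0, 55.0]
depths = [100.0, 1000.0, 6000.0]
slope = [
    [1.0, 0.5, -0.5, 2.0],
    [1.0, 0.2, 0.1, 2.0],
    [9.0, 9.0, 9.0, 9.0],  # below 5,000 m, excluded from the box
]
norm = drift_norm(slope, lats, depths)
print(round(norm, 6))
```

Squaring makes the measure insensitive to the sign of the drift, so a model that spins up and one that spins down by the same amount in the box receive the same value.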
FIGURE 3
The time-mean AMOC cells at the first lead month after initialization (LM1). Notably, the initialization month is different among the prediction systems: October, November, or January (see Table 1).

For the full-field initialized DCPs that fall into the category of low AMOC drift (CAFE, CMCC-CM2-SR5, DePreSys4, and GFDL SPEAR), we recalculated the prediction skill based on the ensemble mean constructed from these models. The subselected multi-model mean beats the prediction skill of the full multi-model ensemble mean, especially in the last pentad of the decadal prediction years (shown in red dash in Figure 2).
Previous studies suggest the importance of SPG-AMOC coupling in the models (Sun et al., 2021). In this respect, Yang et al. (2021) found that GFDL SPEAR's predicted SPG temperatures at three different lead times (including lead years 6-10) are highly correlated with the corresponding initial values of AMOC, with correlations larger than 0.75. The initialization that updates the ocean surface and the atmosphere states with observations without breaking up this coupling could explain the high skill for the SPG in this model. We analyzed the correlation between AMOC at 40°N and 1,000 m depth for lead year 1 and SPG temperatures for lead years 6-10 from the WMO ensemble of DCPs. Of the four DCP systems with low AMOC drift and high SPG residual skill, CMCC-CM2-SR5, GFDL SPEAR, and CAFE indeed show statistically significant correlation between the time-series of the initial AMOC at 40°N and the SPG index at later lead years (Supplementary Figure 11). This does not necessarily mean that other models, when not initialized, do not have this relationship. However, lacking or out

Summary and discussion
While the research centers assimilate ocean and atmosphere observations from the same data, they use different models and different assimilation/initialization methods. Ocean circulation reacts sensitively to these different choices so that the initial ocean state varies across different DCPs. In our study, we dealt with the question: why do the decadal prediction systems start from such different ocean circulations while the prediction skill at the ocean surface is similar and hardly improves on the skill of the historical simulations (Borchert et al., 2021)? Our results show that this may appear so at first glance because the skill at the air-sea interface is largely dominated by the externally forced response. The picture starts changing when we analyze the skill due to internal variability: DCPs that appear to have smaller errors in the initial Atlantic meridional overturning circulation have a better total skill for the subpolar gyre than the multi-model average of DCPs. Below, we summarize several key findings and recommendations for future studies:
• Multi-model and single-model predictions (e.g., CAFE and CMCC-CM2-SR5) of the SPG index are skilful on decadal time scales. Evaluations of residual variability obtained from linear regression against historical simulations suggest that the prediction skill of most of the DCPs is largely explained by the forced response, while tentative initialization of internal variability provides additional skill in the later lead years only in a few prediction systems (Figure 2). We identified systematic behavior of the initialized predictions in the 1960s-1980s, where most of the DCPs underestimated or overestimated temperature in the SPG region. Excluding the 1960s-1980s period from the analysis improved the prediction skill of some models (Supplementary Figure 12). This result calls for improving the reconstruction runs that provide the initial conditions for DCPs over the 1960s-1980s or at least keeping in mind the possible poor performance of the DCPs in this verification period.
The study leaves several questions open, and we would like to encourage the scientific community to contribute to the investigations, since multi-model analysis of initial shock is quite scarce, while this might constitute a crucial issue for decadal prediction systems as highlighted here. For instance, the effect of the drift is expected to be removed by the lead-time dependent bias correction. However, we observe that the lead-time dependent bias correction does not seem to completely debias the AMOC predictions (Supplementary Figures 4, 5), which might also explain their low skill for the SPG index. On the other hand, if initialization/assimilation could indeed introduce artificial flows in the dynamically sensitive regions as suggested by Kröger et al. (2017), they might also affect the evolution of the observed anomaly contained in the initial conditions, modifying potential internal variability and creating a main signal that is a drift. It thus appears necessary to further analyze the exact mechanism by which the AMOC drift and other errors could affect the SPG skill in DCPs.
The mechanism proposed by Borchert et al. (2018), in which strong OHT leads to high SPG skill, could be considered to better understand the relation between strong AMOC and OHT drifts and the SPG skill. For instance, models with a weak AMOC mean state might be prone to having low prediction skill. Moreover, too low or too high AMOC variability could also have a negative impact on the prediction skill because the response to AMOC variability will not be appropriate. Further analysis of the assimilation runs and ocean heat content budget, as in the study by Kröger et al. (2017), might be useful to trace the origins of the initialization shock of the ocean circulation. We thus encourage the research groups that will contribute to future DCP and CMIP experiments to also plan to provide the initial conditions (assimilation runs) along with the DCPs, to better assess the impact of initialization shock. In addition, we encourage more institutes to contribute to the WMO decadal prediction project in order to increase the sampling for multi-model analysis, which will also improve the robustness of the results from this international initiative. The analysis of AMOC initial shock in density space is recommended for future studies to gain more insight into the problem from the point of view of the water mass transformation across isopycnals. The AMOC indices in depth and density space are equivalent as long as the isopycnals are relatively flat across the basin, which is the case in the tropical and subtropical regions but not in the subpolar region (50-60°N; Xu et al., 2014). In our current analysis, we relate the skill of the SPG to the heat transport between 40 and 50°N, where the relation between the two AMOC indices is still rather high.
The AMOC is one of the diagnostics for the assimilation runs. In the attempt to answer the question of how good the AMOC should be in the assimilation runs, it appears reasonable to aim at representing a realistic vertical structure of the AMOC. It is necessary to further evaluate the role of the strength and variability of the AMOC in obtaining the right impact on the SPG, as well as the role of density dominance by temperature or salinity (Menary and Hermanson, 2018). Our study shows that DCPs with low SPG skill also have AMOC errors (strong and non-stationary drifts). In this study, we could not contrast the AMOC residual skill against the SPG residual skill, mostly because the WMO dataset lacked the respective historical simulations for AMOC. The AMOC root mean square error w.r.t. RAPID points at the DCPs with both drifts and biases, suggesting value in continuing the RAPID as well as OSNAP (Overturning in the Subpolar North Atlantic Program) observing periods to guide and improve the tuning of the underlying climate models and the production of more realistic assimilation runs to provide initial conditions. For the 3D AMOC view, it remains difficult to judge what a good or realistic structure of the AMOC should be in each model. In this respect, the ocean reanalyses remain our best estimate so far.

FIGURE 4
AMOC drift at a fixed latitude and depth, calculated as the absolute difference between the AMOC climatology at each lead year and the AMOC climatology at the 1st lead year.

FIGURE 5
AMOC drift (Sv/year) calculated in terms of the slope of the linear regression fitted to the lead years of the leadtime-dependent climatology of AMOC; for MRI-ESM2, the fit is performed over its shorter range of lead years. The climatology is calculated over the period 1991-2020. The panels have a different range of the colorbar.

FIGURE 6
SPG residual skill for lead years 6-10 vs. AMOC drift (A) and vs. AMOC rmse (Sv) w.r.t. RAPID data at the first lead year (B). The significant SPG residual skill (in black circles) is estimated with the t-test (Smith et al.). The AMOC drift is diagnosed as the slope of the linear regression fitted to the lead years of the leadtime-dependent AMOC climatology. Climatology is estimated for the period 1991-2020. Then, the L2-norm of the AMOC drift is calculated for the region 40-50°N and 0-5,000 m depth as the sum of the squared AMOC drift values in each grid cell. Orange markers signify anomaly-initialized DCPs, blue markers full-field initialized DCPs.
TABLE 1 WMO decadal predictions and their specifications.

• The analysis of the lead-time dependent climatology of multiple DCP systems indicates individual initialization errors in the AMOC, e.g., multiple maxima in MPI-ESM-LR, strong AMOC drifts in MPI-ESM-LR, CanCM4, and BSC EC-Earth3, and large AMOC biases in NorCPM1. DCPs that initially have strong AMOC drift at 40-50°N also generally have low residual skill in the SPG for the 2nd pentad of the prediction period. The analysis indicates that the mechanism affecting the skill reduction is not local and instantaneous but rather remotely driven and lagged, presumably via changes in the ocean heat transport.