Simulations of ocean deoxygenation in the historical era: insights from forced and coupled models

Ocean deoxygenation due to anthropogenic warming represents a major threat to marine ecosystems and ﬁ sheries. Challenges remain in simulating the modern observed changes in the dissolved oxygen (O 2 ). Here, we present an analysis of upper ocean (0-700m) deoxygenation in recent decades from a suite of the Coupled Model Intercomparison Project phase 6 (CMIP6) ocean biogeochemical simulations. The physics and biogeochemical simulations include both ocean-only (the Ocean Model Intercomparison Project Phase 1 and 2, OMIP1 and OMIP2) and coupled Earth system (CMIP6 Historical) con ﬁ gurations. We examine simulated changes in the O 2 inventory and ocean heat content (OHC) over the past 5 decades across models. The models simulate spatially divergent evolution

Ocean deoxygenation due to anthropogenic warming represents a major threat to marine ecosystems and fisheries.Challenges remain in simulating the modern observed changes in the dissolved oxygen (O 2 ).Here, we present an analysis of upper ocean (0-700m) deoxygenation in recent decades from a suite of the Coupled Model Intercomparison Project phase 6 (CMIP6) ocean biogeochemical simulations.The physics and biogeochemical simulations include both oceanonly (the Ocean Model Intercomparison Project Phase 1 and 2, OMIP1 and OMIP2) and coupled Earth system (CMIP6 Historical) configurations.We examine simulated changes in the O 2 inventory and ocean heat content (OHC) over the past 5 decades across models.The models simulate spatially divergent evolution

Introduction
Dissolved oxygen (O 2 ) sustains aerobic life in the ocean's interior and plays a critical role in the ocean carbon and nitrogen cycles (e.g.Codispoti, 1995).Changes in O 2 levels have a major influence on ecosystem habitats and biogeochemistry throughout the water column (Morel and Price, 2003;Vaquer-Sunyer and Duarte, 2008;Keeling et al., 2010;Deutsch et al., 2015;Deutsch et al., 2022;Moreé et al., 2023).One such change of particular concern is ocean "deoxygenation", the long-term decline of O 2 content due to anthropogenic climate change, which, along with ocean warming and acidification, represents a major threat to marine biodiversity and fisheries in the 21 st Century (Gruber, 2011;Mora et al., 2013).Analysis from the 6 th Coupled Model Intercomparison Project (CMIP6) (Eyring et al., 2016) suggests that global warming will lead to about 3−4% decline in the global oceanic O 2 content by the end of the 21 st century (Kwiatkowski et al., 2020), and synthesis of observations indicate that about 1−2% of the global O 2 inventory has already been lost since the mid 20 th century (Ito et al., 2017;Schmidtko et al., 2017;Ito, 2022).While global open-ocean deoxygenation is largely driven by global warming, coastal oceans also experience deoxygenation caused by increased loading of nutrients (Breitburg et al., 2018) and the expansion of oxygen minimum zones (OMZs) (Breitburg et al., 2018;Levin, 2018).
The decrease in the oceanic O 2 content is driven by two key processes.First, ocean warming leads to a decrease in gas solubility, reducing the ability of surface waters to absorb O 2 (Keeling and Garcia, 2002).Second, warming further stratifies the upper ocean, reducing ventilation rates and weakening the supply of O 2 from the surface into the interior ocean (Bopp et al., 2002;Rhein et al., 2017;Tjiputra et al., 2018).Changes in ocean productivity and remineralization may also contribute to or compensate for these O 2 changes, but their overall effects in current models are relatively small compared to the physical drivers (Bopp et al., 2017).While Earth System Models (ESMs) largely project a reduction in the global oceanic O 2 content in response to warming, major challenges and discrepancies remain.The mechanisms and individual contribution of processes regulating these O 2 changes, for instance, differ across regions and models, leading to major uncertainties in the attribution and detection of observed O 2 trends and future projections (Bopp et al., 2013;Oschlies et al., 2018;Takano et al., 2018;Kwiatkowski et al., 2020;Buchanan and Tagliabue, 2021;Busecke et al., 2022;Tjiputra et al., 2023).
Recent studies suggest that ESMs may underestimate the observed interannual to decadal variability and long-term changes in the O 2 content and air-sea O 2 fluxes (Long et al., 2016;Eddebbar et al., 2017;Schmidtko et al., 2017;Oschlies et al., 2018).Models also struggle to reproduce the observed O 2 loss in the tropical basins (Stramma et al., 2012;Oschlies et al., 2017), likely due to pronounced biases in this region arising from poorly simulated or unresolved ocean processes (e.g.eddies and zonal equatorial jets) partly because of limitation in model resolutions (e.g. Figure 16 in Berthet et al., 2019), and a poorly constrained role for internal climate variability (Cabréet al., 2015;Berthet et al., 2019;Busecke et al., 2019;Ito et al., 2019;Laffoley and Baxter, 2019;Eddebbar et al., 2021).The observational estimates of ocean deoxygenation are also highly uncertain (Ito et al., 2017;Schmidtko et al., 2017;Ito, 2022), particularly in the tropical Oxygen Minimum Zones (OMZs) and the Southern Ocean where major spatial and temporal gaps exist.CMIP6 models showcase a more realistic ocean biogeochemistry than previous model generations due to improved representation of ocean physical and biogeochemical processes (Sefeŕian et al., 2020;Canadell et al., 2021), but various challenges continue to hinder our assessment and understanding of ocean deoxygenation in both models and observations.The recent IPCC Sixth Assessment Report concluded with medium confidence that the subsurface O 2 content is projected to decrease in the next century (IPCC, 2019;Canadell et al., 2021;Cooley et al., 2022).
The key question in the present study is whether ESMs simulate the observed ocean deoxygenation and O 2 variability during the historical era (defined here as changes in O 2 concentration or inventory in the past 5 decades) (Ito et al., 2017;Schmidtko et al., 2017;Ito, 2022).
In this study, we conduct a comprehensive assessment of simulations from CMIP6 models (Eyring et al., 2016) focusing on the global upper ocean (0-700m) deoxygenation in the past 5 decades (1960s-2010s) and how they compare to a recent observational dataset (Ito, 2022).Using model outputs from the Ocean Model Intercomparison Project 1 and 2 (OMIP1 and OMIP2) (Griffies et al., 2016;Orr et al., 2017) and CMIP6 Historical simulations, we examine how differences in model structure and atmospheric forcing (i.e.forced ocean-only vs coupled configurations) lead to differences in the simulated oceanic O 2 content response to the recent warming and compare these simulated trends to the most recently available gridded observational dataset (Ito, 2022).Various estimates of changes in ocean O 2 levels exist, varying mainly in their observational dataset construction methods (such as interpolation and gap filling procedures) (Helm et al., 2011;Ito et al., 2017;Schmidtko et al., 2017;Ito, 2022;Ito et al., 2023), though the availability of these products to the public via a community O 2 data platform and their comparison is still underway (e.g.Greǵoire et al., 2021).We thus focus on the publicly available dataset from Ito, 2022 (and some complementary analysis based on the previous gridded O 2 dataset from 2017; Ito et al., 2017) and include a discussion on observational uncertainty in our comparison with the simulations.Our overall aim is to provide a comprehensive overview of model simulations of historical upper ocean deoxygenation and identify remaining challenges for the state-of-the-art models as guidelines for future coordinated efforts for ocean biogeochemistry simulations.
The structure of the paper is as follows.We first describe the CMIP6 models, configurations, observational dataset and data analysis methods used herein (Section 2).Section 3 outlines the OMIP simulations' equilibrium states of sea water temperature and upper ocean O 2 along with key differences in climatological mean states of upper ocean O 2 between OMIP, CMIP6 Historical simulations and observations.In Section 4, we compare the observed and simulated changes in upper ocean O 2 over the past 5 decades, focusing on the upper 700m, where most of the O 2 changes and warming are expected in recent decades.We further evaluate the spatial characteristics of this O 2 change across the models and configurations and explore the drivers of the global O 2 loss.Finally, we conclude with a discussion on the future road map for coordinated modeling efforts to simulate the ocean O 2 cycle and outstanding uncertainties and discrepancies between models and observations.

Ocean biogeochemistry simulations
We use model outputs from two sets of simulations from the forced ocean-only model simulations and a set of fully coupled Earth System Model (ESM) simulations from the CMIP6 Historical configuration (Sefeŕian et al., 2020).More specifically, the forced ocean-only model simulations are based on the Ocean Model Intercomparison Project 1 (OMIP1) and Project 2 (OMIP2) protocols (Griffies et al., 2016;Orr et al., 2017;Tsujino et al., 2020) with two additional simulations based on the ECMWF ERA-20C and the NCEP-NCAR Reanalysis 1 atmospheric dataset (Kalnay et al., 1996;Poli et al., 2016).Forced oceanonly model simulations include ocean physics and biogeochemistry, with the atmosphere represented by noninteractive (and observation-derived) reanalysis fields of properties with bulk formulae to calculate ocean heat, moisture, and air-sea gas exchange fluxes, and momentum transfer.In contrast, fully coupled ESM simulations include the same representation of the ocean, but the atmosphere is a dynamical model that responds interactively to the ocean state driving it.In total, we analyze 17 models based on different configurations.We selected models that include an ocean biogeochemistry component and provide O 2 as model output.We use model outputs of sea water temperature (thetao) and salinity (so) (the names inside parentheses are the CMIP6 variable names) for calculating solubility driven changes in O 2 vs circulation and biology induced changes in O 2 (see section 2.3 for details).
A summary of the model and simulation information are shown in Table 1.The OMIP1 and OMIP2 simulations are based on the CORE-II protocol (Griffies et al., 2016;Tsujino et al., 2020), and use the CORE-II atmospheric forcing from 1948-2009(Large and Yeager, 2009) and the JRA55-do atmospheric forcing from 1958-2018 (Tsujino et al., 2018), respectively.The JRA55-do atmospheric forcing covers a more recent period and increase in spatio-temporal resolutions compared to the predecessor (Tsujino et al., 2018;Tsujino et al., 2020).The atmospheric states of the original JRA-55 have been adjusted to match reference states based on observations or the ensemble means of atmospheric reanalysis products, which makes the atmospheric forcing dataset more realistic compared to the CORE-II forcing (Tsujino et al., 2018).These datasets are used to force the ocean biogeochemistry models and have been cyclically applied for the simulations.For the OMIP1 simulations, the length of the simulation is 310 years, conducted by repeating 5 cycles of the CORE-II forcing (Large and Yeager, 2009).For the OMIP2 simulations, the length of the simulation is 366 years, completed by repeating 6 cycles of JRA55-do forcing (Tsujino et al., 2018).The initial conditions are based on the observational dataset from the World Ocean Atlas 2013 and GLODAPv2 (Olsen et al., 2016;Garcia et al., 2019).For further detailed information of the forced ocean-only model simulations and experimental designs, see the OMIP protocol references (Griffies et al., 2016;Orr et al., 2017;Tsujino et al., 2020).Our focus is mainly on the past 5 decades from 1960s to 2010s.This corresponds to the last cycle from the OMIP1 and OMIP2 simulations.We also analyzed two additional forced ocean-only model simulations to complement the analysis of OMIP simulations.One is based on the ECMWF ERA-20C atmospheric reanalysis forcing 1900-2009 for the MPI-ESM1.2model, the ocean model MPI-OM with an ocean biogeochemistry model HAMOCC6 (Ilyina et al., 2013;Paulsen et al., 2017;Mauritsen et al., 2019), and the other is based on the NCEP-NCAR Reanalysis 1 atmospheric forcing  for the NEMOv3.6-PlankTOM12. 1 (Le Queŕeé t al., 2016;Buitenhuis et al., 2019;Wright et al., 2021) from the University of East Anglia.The MPI-ESM1.2 model simulation is similar to the experimental design of the ocean carbon cycle simulation in the Global Carbon Budget project (Friedlingstein et al., 2019).The model was spun-up for millennia using the ECMWF ERA-20C atmospheric forcing from the early period (1900)(1901)(1902)(1903)(1904)(1905)(1906)(1907)(1908)(1909)(1910)(1911)(1912)(1913)(1914)(1915) to achieve steady ocean states.An ocean simulation was then completed for the twentieth century using the ECMWF ERA-20C atmospheric forcing up to 2009.The NEMOv3.6-PlankTOM12.1 model was integrated from 1750-1947 (198 years) based on the NCEP-NCAR Reanalysis 1 climatological (a 1948Reanalysis 1 climatological (a -1977 daily climatology) atmospheric forcing followed by 61 years of hindcast simulation using the interannual atmospheric forcing from 1948-2009.Further details of these two additional simulations are provided in the Supplementary Material.
While the analysis periods differ slightly among the models and configurations, these differences were not found to significantly impact our results.We compare these simulations with the most recent and publicly available observational O 2 dataset (Ito, 2022).Before focusing on the analysis of model outputs from the recent decades, we briefly examine the models' equilibrium states of upper ocean temperature and O 2 from OMIP1 and OMIP2 using the model output from the full length of the simulation cycles.
The CMIP6 Historical simulations (Eyring et al., 2016) are based on atmosphere-ocean-land coupled models listed in Table 1.The historical simulations span from 1850-2014 following the preindustrial control simulations for each model.Duration of preindustrial control simulations diverge among the models (e.g.Sefeŕian et al., 2016;Sefeŕian et al., 2020), but most of the CMIP6 models with ocean biogeochemistry have been spun-up longer than the standard OMIP protocol duration (except for the GFDL-CM4, which is a relatively highresolution ocean model).In the CMIP6 Historical simulations, the models are driven by the historical forcing (i.e.boundary conditions) of greenhouse gas concentrations, stratospheric and tropospheric aerosols, and land-use (Eyring et al., 2016).A comprehensive evaluation of ocean biogeochemical historical simulations based on CMIP6 models is documented in (Sefeŕian et al., 2020).We analyze the model output for the past 5 decades from 1965-2014 to overlap with the observational dataset period.

Observational dataset
The observational product used in this study is the global threedimensional, time varying O 2 dataset based on the quality-control bottle O 2 data from the World Ocean Database 2018 (WOD2018) from 1965-2015 (Ito, 2022) (hereafter ITO2022 dataset).This product is a 1-degree gridded product based on the optimal interpolation method to fill in spatial gaps.In this dataset, the number of O 2 samples is denser in 1970s and 80s, with more samples in the North Pacific and North Atlantic Oceans.The global coverage of samples is more even after 1990s, but overall, the samples remain relatively sparse, particularly after the year 2000 (see Figure 1 from Ito, 2022).Despite the interpolation, smoothing and gap filling procedures, the signals could be skewed towards the data dense regions in certain periods.Note that this product differs from the previous product (Ito et al., 2017), which includes Conductivity Temperature Depth (CTD) profiler measurements.
The O 2 data product from ITO2022 dataset is presented as an anomaly relative to the long-term  mean (Ito, 2022), where O 2 concentrations are shown as yearly anomalies calculated using a 5-year moving averaging window.Note that a 5-year moving averaging window reduces the variability on timescales shorter than 5 years (Ito, 2022).The estimate of global ocean deoxygenation from this latest observational product is smaller than the previous works (Ito et al., 2017;Schmidtko et al., 2017) and thus provides a lower bound estimate of global ocean deoxygenation (Ito, 2022).
We compare the upper ocean heat content (OHC) based on five dataset products to the simulated OHC from OMIP1, OMIP2, and CMIP6 configurations.These dataset products include the ECMWF Ocean Reanalysis System 4 (ORAS4) (following the previous observational O 2 and OHC study from Ito et al., 2017), EN4.2.1 from UK Met Office, Cheng dataset from the Institute of Atmospheric Physics (IAP), Chinese Academy of Sciences (http://www.ocean.iap.ac.cn/),Ishii dataset from Japan Meteorological Agency, and Levitus data from NOAA NCEI (Gouretski and Reseghetti, 2010;Levitus et al., 2012;Balmaseda et al., 2013;Ishii et al., 2017;Cheng et al., 2022).We additionally use the sea water temperature and salinity data from the ECMWF ORAS4, EN4.2.1, Cheng, and Ishii datasets to calculate oxygen saturation (O 2,sat and Apparent Oxygen Utilization (AOU) (in sections 2.3 and 3.4).For climatological comparison of upper ocean O 2 (in section 3.1), we use the observations-based climatology from the World Ocean Atlas 2018 (Garcia et al., 2019).

Data analysis
To further examine the drivers of O 2 changes in the simulations and observations, we decomposed the O 2 changes into changes in O 2 saturation, i.e. the "thermodynamic" component, and changes in Apparent Oxygen Utilization (AOU), the component driven by changes in ocean circulation and biological activities (e.g.Emerson and Hamme, 2022).We first calculate O 2 saturation based on the Garcia and Gordon formula (Garcia and Gordon, 1992) and subtract the in-situ O 2 concentration to obtain AOU as follows: which can be re-arranged to, This decomposition is commonly used (e.g.Bopp et al., 2017) to highlight the thermal vs. non-thermal, circulation and biology driven O 2 changes in the ocean.In this study, we present -1 × AOU in figures to reflect more clearly its contribution to changes in O 2 (i.e. an increase in AOU is shown as a decrease in -1 × AOU, which translates to a decrease in O 2 ) (section 3.4).
All the model outputs are re-gridded to a 1-by-1 degree horizontal grid using distance weighted average remapping method (climate data operators (CDO); remapdis) (Schulzweida, 2022).The vertical interpolation is based on linear level interpolation (climate data operators (CDO); intlevel) (Schulzweida, 2022), interpolated from the model native levels to 102 vertical levels following the observational product, though our analysis focuses primary on the upper depth range (0-700m) of the ocean (i.e. the top 41 vertical levels).The spatial resolution of the regridded model output is the same as the ITO2022 dataset used in this study (Section 2.2).The main diagnostics of ocean deoxygenation and associated ocean warming for the past 5 decades are based on the anomaly of O 2 and sea water temperature relative to the long-term mean (i.e. the past 5 decades).The calculations of the anomaly of O 2 from the model output basically follows the ITO2022 dataset.We focus on upper ocean O 2 and heat content changes as this is where the majority of ocean physical and biogeochemical changes due to atmospheric forcing on inter-annual to decadal timescales are expected.Also the upper ocean environmental changes are highly relevant for marine ecosystems (e.g.Longhurst and Harrison, 1989).We calculate globally integrated O 2 inventory anomaly and OHC time series for comparing models and observations evaluations.OHC is in anomaly by definition (e.g.Levitus et al., 2012).
Ocean biogeochemistry simulations require a long term spin-up to reach equilibrium and can exhibit a long-term drift in their solution (Sefeŕian et al., 2016).The standard OMIP protocols request 310-366 years for a model spin-up (Griffies et al., 2016), which is not sufficient for the deep ocean to achieve equilibrium state (e.g.Yool et al., 2020).The estimation of deep North Pacific Ocean ventilation age based on radiocarbon and inverse models is 1200-1500 years (Gebbie and Huybers, 2012;Khatiwala et al., 2012) or even longer timescales from some modeling studies (Wunsch and Heimbach, 2008).Unlike in the coupled model simulations, the OMIP1 and OMIP2 simulations have no control simulations to compare, making it important to assess the potential drift in OMIP simulations to ensure that it does not obscure the multi-decadal ocean deoxygenation signals in which we are interested.To check the impact of model's spin-up in the upper ocean, the model's equilibrium states of the sea water temperature and upper ocean O 2 are presented in the first part of the results following the ocean model intercomparison study by Tsujino et al., 2020.To assess the models' drift from forced ocean-only model simulations, we explore upper ocean global mean (covering 89.5°S -65°N in this study following the spatial coverage of the ITO2022 dataset) properties as a metric to evaluate the models' equilibrium states for the variables on which we are focusing.Specifically we examine the global volume-weighted mean sea water temperature and O 2 from 0-700m from the OMIP1 and OMIP2 simulations (Figure 1).The time series of the global volume-weighted mean O 2 and sea water temperature from the upper ocean for the full cycles show individual model transitions to their respective quasiequilibrium states for both properties.We show that for the final 50-60 years of the simulations, the analysis of ocean deoxygenation during this period is not significantly impacted by model drift in the upper ocean (see Figure 1).
We used linear trend analysis to quantify multidecadal changes in upper ocean O 2 inventory and OHC.Linear trend analysis is based on annual model output and observational dataset.The statistical significance test uses a modified Mann-Kendall test, which takes into account autocorrelations within the dataset (Yue and Wang, 2004;Hussain and Mahmud, 2019;Martıńez-Moreno et al., 2021).

Model solutions: equilibrium states of forced ocean simulations and climatological mean states
For the evaluation of models' equilibrium states, we use the term "quasi-equilibrium states" to denote the upper ocean (0-700m) variables close to steady-states.For the OMIP1 simulations, most of the models reach quasi-equilibrium states in the fourth cycle (Figure 1), with approximately half of the models showing a positive drift anomaly in O 2 compared to the initial conditions based on the World Ocean Atlas 2013, though we expect the ocean to loose O 2 as the ocean warms.The positive drift anomaly in O 2 is likely driven by changes in ocean circulation and biological processes, offsetting the solubility-driven changes (see Supplementary Figure 1).Many of the models drift towards warm ocean states relative to the initial conditions except for the CNRM-ESM2-1, GFDL-CM4, and IPSL-CM6A-LR, from the OMIP1 and EC-Earth3 from the OMIP2 simulations.Three of these models (i.e.CNRM-ESM2-1, EC-Earth3, and IPSL-CM6A-LR) are based on the same ocean general circulation model, the Nucleus for European Modelling of the Ocean (NEMO) but the CNRM-ESM2-1 employs a different sea-ice model.
Despite differences in the atmospheric forcing between the configurations, the OMIP2 simulations exhibit a similar transition phase to the OMIP1 simulations, with most of the models reaching quasi-equilibrium states in the fourth cycle of the simulations.The EC-Earth3, for instance, runs under both OMIP1 and OMIP2 protocols, and its global mean O 2 equilibrates similarly in both simulations.The cyclically forced ocean-only model simulations also have issues involving the initial shock due to a jump in atmospheric forcing from the end to the beginning of the forcing period (Tsujino et al., 2020).Therefore, we omitted the first 10 years of the last cycle of the OMIP simulations from the analysis in Section 3.2.
Overall, the OMIP simulations shown here reach quasiequilibrium states for both upper ocean O 2 and sea water temperature, and thus model drift does not substantially obscure the long-term trend associated with multi-decadal ocean deoxygenation due to global warming.We note, however, the timescales of the upper ocean quantities reaching quasiequilibrium states are also model-dependent (e.g.Danabasoglu et al., 2014).Therefore, in some models, model drift could still impact detection of long-term changes (Tjiputra et al., 2023).Hereafter, we focus our analysis on the historical period after 1950s, which is represented by the last cycle from OMIP1 (1958( -2009( ) and OMIP2 (1968( -2018) ) simulations.
Next we compare the modern climatological mean states of upper ocean O 2 inventory from the OMIP1, OMIP2, and CMIP6 Historical simulations to the World Ocean Atlas 2018 (Garcia et al., 2019).Here we evaluate the upper ocean O 2 inventory, shown as maps of vertically integrated O 2 for each grid points from 0-700m (i.e refer to herein as column O 2 inventory) for the historical period.The modern climatological mean states are defined as the long-term mean of upper ocean column O 2 inventory 1958-2007 for the OMIP1 multi-model mean, 1969-2018 for the OMIP2 multimodel mean, and 1965-2014 for the CMIP6 Historical multimodel mean (Figure 2).The annual climatology from the World Ocean Atlas 2018 is based on measurements from the past 5 decades.We present here the multi-model ensemble mean from the OMIP1, OMIP2, and CMIP6 Historical simulations to summarize the characteristics of modern climatological column O 2 inventory from each configuration.For brevity, climatology of column O 2 inventory from the individual models are presented in the Supplementary Material.
Overall the observed global patterns of modern climatological mean state of upper ocean column O 2 inventory are relatively well reproduced by the simulations, though biases emerge in key regions.The upper section (0-700m) of the oxygen minimum zone (OMZ) of the tropical Pacific Ocean, for instance, extend less westward than the observations and are simulated as a single spatially merged structure rather than two asymmetric northern and southern OMZs as shown in the World Ocean Atlas 2018 (Figure 2).Similarly, the OMZ in the tropical Atlantic Ocean is too intense in models, while the simulated Indian Ocean OMZ is weak (i.e. more O 2 ) compared to the observations.These biases were previously identified in CMIP5 model simulations (e.g.Cabréet al., 2015) and in CMIP6 model simulations in the Pacific OMZs (e.g.Busecke et al., 2022), and thus are shown to persist across CMIP effort.As Cabréet al., 2015 pointed out from analyzing CMIP5 models, the OMZ biases could stem from a deficient ventilation in the upper ocean, combined with a biologically driven downward flux of particulate organic carbon that is too large, resulting in a remineralization bias in these regions.
An additional mismatch between models and observations is the O 2 excess in the subpolar gyre region of the North Pacific Ocean.The World Ocean Atlas 2018 shows a deficit of O 2 in this region, but the models tend to have too much O 2 in this region for OMIP2 and CMIP6 Historical simulations.This is also a region where the model representations of climatological O 2 diverge substantially likely due to biases associated with poorly simulated ventilation processes (see Supplementary Material, Figures S3-S5).The globally integrated climatological upper ocean O 2 inventory (Figure 2) is slightly overestimated in the simulations (+0.42-1.35Pmol) compared to the observational estimate of 43.02 Pmol.This overestimation stems from excess O 2 in the Southern Ocean and the western tropical Pacific Ocean in models compared to the observational estimate that overcompensate for the lower O 2 inventory of the eastern tropical Pacific Ocean and the tropical Atlantic Ocean (see Supplementary Material for further information).
The difference between OMIP1 and OMIP2 simulations shows how differences in the atmospheric forcing impact the climatology (i.e.quasi-equilibrium states) in O 2 (Figure 2H).The climatology of O 2 inventory from OMIP2 simulations show a higher O 2 in the North Pacific and the Southern Ocean indicating this updated atmospheric forcing results in more upper ocean O 2 inventory in these regions.Atmospheric wind impacts ventilation and mixing, and is expected to modulate O 2 and other ocean biogeochemical tracer variability and distributions.Previous work has also shown that biases in the atmospheric wind forcing in OMIP simulations project onto ocean circulations and water mass formation [e.g. in the tropical Pacific Ocean (Tseng et al., 2016)], with cascading effects on ocean biogeochemistry.

Global historical ocean deoxygenation and heat uptake: globally integrated changes
We examine the long-term trends and variability in O 2 inventory and OHC over the upper ocean global from the OMIP1, OMIP2, and CMIP6 Historical simulations (Figure 3).A prominent feature for the OMIP1 models -based on the CORE-II atmospheric forcing (Large and Yeager, 2009) -is an initial decrease in the upper ocean O 2 inventory through the mid 1980's followed by an increase in the remaining 25 years of the simulation (Figure 3A).This increase is in contrast to the long-term decrease reported in recent observational studies (Helm et al., 2011;Ito et al., 2017;Schmidtko et al., 2017;Ito, 2022).The shift from a decadal decrease to an increase in the upper ocean O 2 inventory during the mid 1980's coincides with the El Chichoń and Pinatubo eruptions which have been proposed to drive   (Eddebbar et al., 2019).The volcanic effects were previously assessed in the large ensemble simulations (e.g.Eddebbar et al., 2019;Fay et al., 2023) and emerge similarly here in the CMIP6 Historical multi-model mean (see the Pinatubo-related pause in ocean deoxygenation and heat loss in CMIP6 Historical in Figures 3A, B), but this signal is challenging to detect in OMIP simulations and observations.The discrepancy in OMIP simulations could be due to the confounding effect of atmospheric driven climate variability (i.e.such as a common wind forcing in OMIP simulations modulating the ocean circulation) or OMIP simulations are less sensitive to volcanic effects.Detection of volcanic effect in observations is particularly challenging given the large gaps and uncertainties in the dataset.The complementary MPI-ESM1.2simulation -based on the ECMWF ERA-20C atmospheric forcing (Poli et al., 2016) -also shows a slight increase in O 2 inventory.While the OMIP1 simulations show a net uptake in heat as expected from anthropogenic forcing, the overall ocean heat uptake is weak, with many models showing little to no heat uptake in the first decades of the simulation (see Supplementary Material, Figures S6 and S7), potentially explaining the weak negative O 2 trends.All OMIP1 simulations and complementary simulations however show an increase in heat uptake in the recent decade (1995)(1996)(1997)(1998)(1999)(2000)(2001)(2002)(2003)(2004)(2005)(2006)(2007)(2008), which is initially unexpected given the coincident increase in O 2 during this period.
In contrast, the OMIP2 simulations, which are based on the JRA55-do atmospheric forcing (Tsujino et al., 2018), show a decrease in the upper ocean global O 2 inventory since the beginning of the time series (i.e. at year 1968), though the rate of O 2 loss is much weaker compared to observations (Figure 3A).The decadal shift in the O 2 inventory in the OMIP2 simulations from a long-term decrease to a brief increase and subsequent flattening is centered around the end of the 1990's and independent from the timing of the volcanic eruptions of El Chichon´and Pinatubo.The ocean overall takes up more heat in the OMIP2 than the OMIP1 simulations, though heat uptake is still weaker compared to the observations (Figure 3B).In the OMIP2 simulations, the ocean continues to take up heat after 2000 ("hiatus decade") while the oceanic O 2 inventory remains relatively unchanged (Figures 3A, B), suggesting a potential decoupling of the changes in heat and O 2 contents during these recent decades.We also note a much stronger agreement across the OMIP2 models for both O 2 inventory and OHC changes than in the OMIP1 models (see Supplementary Material, Figures S6, S7).More than half of the models (8 out of total 13 models) from OMIP1 simulations show positive trend in the O 2 inventory (and all the models show a negative OHC trend, which is counter-intuitive), whereas all the models from OMIP2 simulations show a negative trend in O 2 inventory.We note, however, that there are fewer simulations of OMIP2 compared to OMIP1 simulations.Compared to the OMIP simulations, the CMIP6 Historical simulations show a substantial decrease in the O 2 inventory over the past 50 years (Figure 3A), in line with the most recent observational estimate of Ito (2022) likely due to warming of the ocean and reduction in ventilation supply.The signature of the major volcanic eruptions are also pronounced in these simulations (Figures 3A, B, particularly for Agung (1963)(1964)(1965) and Pinatubo (1991Pinatubo ( -1994)), in line with the expected cooling of the upper ocean (Figure 3B) and intrusion of O 2 following eruptions which acts to temporarily slow down the advance of ocean deoxygenation (Eddebbar et al., 2019;Fay et al., 2023).The O 2 loss shown by the CMIP6 Historical simulations occurs alongside significant heat uptake which is also in line with the observed estimate of changes in the upper OHC (Figure 3B).
Simulated O 2 shows weaker variability than observed at depth (Long et al., 2016), an issue that likely persists in the OMIP and CMIP6 simulations (Figures 3C, E, G).About half of the models show lower temporal SD in global O 2 inventory compared to the latest ITO2022 dataset (magnitude ∼ 0.75 × 10 14 mol) and much smaller than the temporal SD from the previous dataset from Ito et al., 2017 (Figures 3C, E, G).While the CMIP6 Historical simulations show a relatively strong 50-year downward trend (Table 2) in the global O 2 inventory, major biases associated with physical and biogeochemical processes (e.g.zonal equatorial currents and eddies) in hinders their ability to reproduce observed variability of O 2 (e.g.Busecke et al., 2019).The temporal SD of global OHC time series from OMIP1 and OMIP2 simulations are smaller compared to the CMIP6 Historical simulations.Compared to the recent observational OHC time series, the simulated temporal SD might underestimate the observed fluctuations compared to the temporal SD from the Cheng and Levitus datasets.The ECMWF ORAS4 and EN4.2.1 show higher SD compared to the simulations and other observational datasets.
The 50-year linear trend in upper ocean global O 2 inventory and OHC for models and observations are summarized in Table 2.
The magnitude of the total decrease of global O 2 inventory over the historical simulation period is about 4−5×10 14 [mol] for the CMIP6 Historical simulations (Figure 3A and Supplementary Material), which is much stronger than the OMIP simulations.The latest estimates based on observational dataset shows a similar range, though discrepancies between observational products arising from gaps in observational coverage and differences in interpolation methods continue to generate large uncertainties in the rate of global ocean deoxygenation (Ito, 2022).A detailed observationsmodel comparison requires sub-sampling, gridding, and interpolation of model outputs in similar fashion to the observational estimates and will be addressed in a separate study (Ito et al., 2023).
Because ocean deoxygenation is driven by warming, we expect global O 2 inventory loss during periods of increased OHC, which is reflected in the observations-based products (Ito et al., 2017).The changes in OHC in the OMIP1, OMIP2 and CMIP6 Historical simulations do show heat gain in the upper ocean, although heat uptake in the OMIP simulations is substantially weaker compared to the CMIP6 Historical simulations (Figure 3B).This is likely due to adjustment of the ocean models to specific atmospheric forcing (Large and Yeager, 2009) and model spin-up procedures (e.g.simulations from Berthet et al., 2023 also showed an improvement representing observed upper 300m OHC using a modified OMIP2 protocol).To summarize, despite their shortcomings in representing the trends of ocean O 2 loss and Linear trend calculation is based on the past 50 years of time series (1958-2007for OMIP1 multi-model mean, 1969-2018for OMIP2 multi-model mean, and from 1965-2014for CMIP6 Historical multi-model mean, and 1965-2015 for observations).Linear trend values are based on Theil-Sen slope estimator and values with asterisk denotes statistically significant trends based on a modified Mann-Kendall test (p < 0.05).The ± ranges indicate the multimodel standard deviation for each configuration (OMIP1, OMIP2, CMIP6 Historical) and the standard deviation among the five observational dataset for OHC (± 0.34).For the observational global O 2 inventory, we present results from two different datasets with three different estimates based on Ito et al., 2017;Ito, 2022.Volume normalized (denoted as VN) value is the grid-volume normalized value presented in Ito et al., 2017.The observational global O 2 inventory trend values in square brackets denote linear trend estimates with uncertainties (the 2.5 th and 97.5 th percentiles of the slope distribution based on a bootstrapping method).We used scikit-learn (https://scikit-learn.org/stable/)for these uncertainty calculations.Takano et al. 10.3389/fmars.2023.1139917Frontiers in Marine Science frontiersin.orgheat uptake, the OMIP2 simulations show a general improvement in representing global ocean deoxygenation compared to the OMIP1 simulations, a feature that is likely due to improved representation of the atmospheric forcing such as wind (Taboada et al., 2019) and precipitation fields used to spin-up and run the ocean models.

Global historical ocean deoxygenation and heat uptake: spatial and vertical patterns
The linear trend patterns from the OMIP and CMIP6 Historical simulations show substantial differences among the configurations both in magnitudes and even sign of O 2 changes (Figure 4).The linear trend of column O 2 inventory in OMIP1 simulation is generally large and shows both positive and negative trends in different regions of the world ocean (Figure 4A).The positive column O 2 trend patterns in OMIP1 are pronounced in the subtropical North Pacific Ocean and in the high latitude Southern Ocean.Another key OMIP1 feature is that warming and cooling in the Pacific Ocean leads to a gain and loss of O 2 , respectively, was initially counter intuitive and is likely due to ocean circulation and biological processes rather than solubility driven changes (see section 3.4 for further discussion).The OMIP2 and CMIP6 Historical simulations overall show weaker magnitudes in linear trend than OMIP1 for both column O 2 inventory and OHC.The CMIP6 Historical simulations show an overall loss of O 2 that is well distributed globally, with intense regions of ocean deoxygenation focused in the North Atlantic, North Pacific and Indian Ocean that are tightly associated with ocean warming in these regions.The warming of the Southern Ocean, however, shows relatively little changes in the column O 2 inventory in CMIP6 Historical simulations.
The OMIP simulations are driven by atmospheric reanalysisbased forcing which includes both an atmospheric internal climate variability component and an externally generated forced response (e.g.greenhouse gases, volcanic and anthropogenic aerosols), and thus the ocean signature (shown by the OMIP means) of the atmospheric driven internal climate modes and anthropogenic drivers is common across model simulations.The CMIP6 Historical simulations, however, generate their own internal climate variability in the coupled climate system from each model ensemble member (e.g.Deser et al., 2020), and thus the CMIP6 multi-model mean largely highlights the ocean response to external forcing (i.e.global warming), as the internal climate variability component is presumably smoothed over in the multi-model mean.As a result, a more homogeneous pattern emerges in the CMIP6 Historical simulations that mainly reflects the anthropogenic responses (i.e.ocean deoxygenation and warming) while the OMIP models contain signatures associated with the specified atmospheric forcing ( Figures 4E, F).
The observational patterns show a less uniform distribution than the CMIP6 Historical simulation trends, likely due to the combined effects of internal climate variability and external forcing, though the limited observational coverage and gap filling methods may induce some of the patchy signals of upper ocean column O 2 inventory shown in Figure 4G.In the observational product, the North Pacific O 2 loss is pronounced, which is missing in OMIP simulations and only weakly simulated in CMIP6 Historical simulations.The tropical and subtropical regions show areas of positive O 2 trends, indicating that these regions may respond differently to anthropogenic forcing than the high latitudes, or that internal climate variability might play a more dominant role in these regions.An interesting feature that exists across simulations and observations is cooling (i.e."warming hole") in parts of the subpolar North Atlantic Ocean (e.g.Rahmstorf et al., 2015;Keil et al., 2020), that is associated with an increase in O 2 .This cooling and O 2 increase is occurring along with broad ocean warming and deoxygenation in the North Atlantic, and likely reflects an ocean physical response to anthropogenic forcing (Rahmstorf et al., 2015;Keil et al., 2020).
The upper ocean vertical structures (i.e.Hovmo¨ller diagram of depth vs time) of global area-weighted mean O 2 and sea water temperature highlight how deep the ocean deoxygenation and warming signals penetrate into the ocean (Figures 5, 6).In all but the OMIP1 simulations, the ocean is losing O 2 and gaining heat at depth.The OMIP1 deoxygenation occurs mainly in the upper 200m, and increase in O 2 below 200m after 1995, which is consistent with their simulated increase in upper ocean global O 2 inventory in recent decades (Figure 3A).The OMIP2 simulations show a general decrease in O 2 throughout the upper ocean, though most of the O 2 loss occurs in the upper 200m.In contrast, the CMIP6 Historical simulation show a substantial decrease in global O 2 throughout the upper 700m associated with the anthropogenic warming effects, though the highest rates of O 2 loss are found between 200-600m depth, in line with reduced ventilation of the thermocline in a warming world (Keeling and Garcia, 2002).There are also contrasts in the vertical penetration of anthropogenic heat and O 2 loss, whereby the pronounced O 2 changes penetrate down to 700m but the largest heat uptake signal is in the upper 200m (Figures 5, 6), suggesting the large role of stratification and reduced ventilation in setting the O 2 changes at depth in the CMIP6 Historical simulations.This is evident from the changes in AOU (see section 3.4 and Supplementary Material) and likely the combined physical-biological effect supports O 2 loss penetration further down into the ocean compared to OHC.
The observational vertical patterns show a relatively similar ocean deoxygenation response to the CMIP6 Historical simulations (Figure 5D), with O 2 signals penetrating down to 700m, suggesting that the observations may be reflecting changes associated largely with anthropogenic warming.Positive anomalies in O 2 are found at depth following eruptions, as expected from the effects of volcanic eruptions on the O 2 content (Eddebbar et al., 2019;Fay et al., 2023), though the observational product does not show a major anomaly associated with the Pinatubo eruption (Figure 5D), potentially due to the large anthropogenic forcing and internally generated variability effects during the later decades.The patchiness in O 2 anomalies found in observations and OMIP simulations are likely due to atmospheric or oceanic internal climate variability.Similarly to the globally integrated trend analysis discussed above, the warming and O 2 changes shown at depth in both OMIP simulations are weaker compared to the CMIP6 Historical simulations and observations, particularly for the OMIP1 simulations which show cooling around 200m depth.The observational changes in global OHC for upper ocean overall shows transition from cooling to warming in the upper few hundred meters, although Cheng's dataset shows intense cooling from 1960s to 1970s compared to the other dataset (Figure 6).

Drivers of historical ocean deoxygenation: thermodynamics vs. circulation and biology
Global O 2 saturation decreases in OMIP2 and CMIP6 Historical simulations, but the magnitudes differ among the simulations (Figure 7 and Supplementary Material).The most intense O 2  saturation loss is found in the CMIP6 Historical simulations, consistent with the intense global ocean heat uptake in the upper ocean (Figure 3B).The OMIP1 simulations show weak or even no change in saturation driven O 2 changes.The complementary simulations (MPI-ESM1-2-LR and NEMOv3.6-PlankTOM12.1)show a much stronger decrease in O 2 saturation likely due to different model spin-up procedures in these simulations (see Supplementary Material).The dominant driver of global ocean deoxygenation is AOU, also a major source of uncertainty in simulating historical deoxygenation.The CMIP6 Historical simulations and observations show an increase in AOU (i.e.decrease in O 2 ), which reinforces the thermally driven O 2 loss (Figure 7 and Supplementary Material).The OMIP1 simulations show decrease in AOU, which is a major factor for the oxygenation in the last decades of these simulations.
The spatial patterns of changes in upper ocean vertical mean O 2 saturation and AOU are shown through a map of linear trends over the past 50 years in Figure 8. Similar to the linear trend analysis of column O 2 inventory and OHC (Figure 4), the OMIP1 simulations show intense changes on multi-decadal timescales, and the magnitude of the trend is larger in AOU than O 2 saturation.The response in O 2 saturation in OMIP1 models is consistent with their weak OHC change in Figure 4, and thus AOU changes play a more important role in regions with deoxygenating (oxygenating) and cooling (warming) tendencies.Similarly, the O 2 saturation trends in the OMIP2 and CMIP6 Historical simulations and observations are weaker than the AOU trends, with AOU playing a key role in the tropical basins for the CMIP6 Historical simulations.The CMIP6 Historical simulations overall show a uniform decrease in O 2 saturation due to warming and increased AOU.This increase in AOU is likely due to reduced ventilation supply of O 2 from the surface into the interior ocean.The strong AOU increase in the North Pacific Ocean from observations may include contributions from the internal climate variability which is weak but is partly evident in the OMIP2 simulations.The North Atlantic warming hole signature is evident in the O 2 saturation changes, which is consistent with changes in OHC (Figure 4).
The upper ocean vertical structures (i.e.Hovmöller diagram of depth vs time) of O 2 saturation and AOU highlight the drivers of ocean deoxygenation in the surface and interior ocean (see Supplementary Material).The simulations and observation both exhibit thermal loss of O 2 in upper 100m, but how deep the thermal loss penetrates differs among the configurations.The CMIP6 Historical simulations and observation are relatively similar in vertical structures of thermal loss, which penetrates down to 700m but OMIP simulations thermal loss is restricted to the top 200m.In OMIP1 simulations, there is an increase in O 2 saturation, likely driven by cooling at 200m depth.
To summarize the relationship and relative contributions of these two components, we present global changes in O 2 saturation and AOU for models and observations in the scatter plots shown in The same as in Figure 5, except for sea water temperature anomaly.The Hovmöller diagram of sea water temperature anomaly include results from the models (A-C) and four different observational datasets (D-G).Ensemble data spread (SD) for observational datasets is included in (H).
the strong changes in the CMIP6 Historical simulations towards negative O 2 saturation as much as -1 mmol m −3 and increasing AOU, which results in stronger ocean deoxygenation trends in these simulations, consistent with the trends shown in Figures 3, 4.
The strong negative O 2 saturation and AOU changes are consistent with the intense simulated ocean heat uptake in CMIP6 Historical simulations, and observations (Figure 9, black lines and dots).

3.5
The O 2 to heat ratio: how tightly are global ocean deoxygenation and warming related?
The global O 2 inventory is known to be tightly and negatively correlated with global OHC in the upper ocean (Bopp et al., 2002;Keeling and Garcia, 2002;Ito et al., 2017).Here, we examine this global O 2 Heat relationship within the context of the historical era  in the OMIP1, OMIP2, and CMIP6 Historical simulations and observations.A key feature in the OMIP1 simulations is the absence of a strong relationship between the global O 2 inventory and OHC in the multi-model mean, with a broad spread across individual models (gray dots and the blue line in Figure 10A).The OMIP2 simulations show a more negative and tighter relationship between the two properties (gray dots and the red line in Figure 10B), with narrower spread between models than the previous OMIP1 simulations.This is likely due to differences in theatmospheric forcing driving a more intense and sustained ocean deoxygenation and heat uptake.The CMIP6 Historical simulations show a tight relationship between the O 2 and heat contents (gray dots and the magenta line in Figure 10C), with a similar slope to the OMIP2 simulations (see Figures 10B, C).Table 3 summarizes the regression coefficients between global O 2 inventory and OHC.The regression coefficients from OMIP2 and CMIP6 Historical simulations (-3.2 and -2.8 [nmol/J]) are close to the observational values (-2.3 to -2.6 [nmol/J]) while the OMIP1 simulations show a weak relationship (-0.4 [nmol/J]) (Table 3).This indicates that the OMIP2 simulations capture the observed global O 2 -Heat relationship.This tighter global O 2 -Heat relationship compared to that of the OMIP1 simulations are likely due to improvements in the atmospheric forcing.
Where do these differences in the simulated and observed global O 2 inventory and OHC relationships originate?To elucidate this, we show global correlation coefficient patterns of column OHC (i.e.twodimensional data) against the globally integrated upper ocean (0-700m) O 2 inventory time series (i.e.one-dimensional data) over the historical period (i.e. the past 5 decades) (Figure 11).This enables us to highlight in which region the ocean gains or loses heat when global O 2 inventory changes.For example, the positive correlation coefficient indicates that the increase in global O 2 inventory is associated with increase in column OHC in a specific region (grid points) (i.e.counter intuitive relationship).We note that the O 2 inventory and OHC data have not been de-trended in this analysis to focus more on the long-term changes and linear relationships.Key differences emerge in the correlation coefficient fields from the OMIP1 simulation, showing a positive relationship between O 2 inventory and OHC throughout the northern subpolar, subtropical, and tropical Pacific and North Atlantic basins, though with low agreement between models in the tropical Pacific Ocean.The positive relationship in these regions likely drives the weak

Discussion and conclusion
Through a comprehensive analysis of the OMIP and CMIP6 Historical simulations, we found that the simulations show substantial differences in the rate and spatial patterns of global O 2 inventory loss in the upper ocean over the historical era.These differences likely arise from differences in the ocean models background mean states set by the spin-up procedures as well as the atmospheric forcing product likely associated with adjustments to the surface wind and buoyancy forcing.

Potential impact of atmospheric forcing
Differences in the long-term trend of the surface wind stress fields between OMIP1 (CORE-II) and OMIP2 (JRA55-do) (in Figure 12) could lead to bias in spun-up states and subsequent decadal response in the upper ocean O 2 inventory.A spurious trend in zonal wind stress in the tropical Pacific Ocean (also discussed in Yeager et al., 2018;Ito et al., 2019 and see Figure 12A) likely leads to O 2 inventory and OHC biases in the Pacific Ocean (see Figure 4) through changes in the equatorial currents and upwelling in this  region.These biases overall impact on the global O 2 inventory and OHC in the OMIP1 simulations.Therefore, while the OMIP1 simulations struggle to simulate the observed O 2 loss, OMIP2 simulations show improvement to some extent in simulating global ocean deoxygenation likely due to reduced bias in zonal wind stress and increased resolution of the wind fields (Oschlies et al., 2018;Taboada et al., 2019).We do note that the complementary simulations based on the ECMWF ERA-20C and the NCEP-NCAR Reanalysis 1 atmospheric forcing did not show much of a difference in O 2 inventory response even with a different spin-up procedures.

Models' equilibrium states and spin-up
Differences between OMIP and CMIP6 Historical simulations include the models' equilibrium states.The CMIP6 Historical simulations are less likely to be impacted by model drift compared to the OMIP simulations due to the longer spin-up used in CMIP6 Historical simulations.In addition, the coupled CMIP6 Historical simulations could reach surface equilibrium faster compared to the OMIP simulations as any oceanic heat loss (gain) has an opposing effect on the atmospheric temperature whereas changes in sea surface temperature in the OMIP simulations do not feedback on the atmosphere.These spin-up procedures and feedback could have major impacts on regional changes.
The OMIP simulations are spun-up based on the recent historical forcing (i.e.atmospheric reanalysis for the past 50 years or so), which likely brings the background mean states to modern conditions.This is a major difference from the CMIP6 Historical simulations, which are based on the pre-industrial spun-up states of the ocean.This hypothesis has been tested in a recent study (Huguenin et al., 2022) for a specific modeling study of OHC, which showed that the background mean state of the ocean model is important for simulating historical ocean heat uptake.This approach should be tested with multi-model simulations.We hypothesize that a similar effect applies for simulating changes in the ocean O 2 content, as well as other biogeochemical tracers and processes (e.g.changes in the ocean carbon and nitrogen inventories).

Implications and future strategy
The CMIP6 Historical simulations show that the state-of-the art models have the potential to simulate historical deoxygenation in line with one of the most recent observational estimates.However, model spinup strategy and the dependency of the ocean circulation and biogeochemical responses on the atmospheric forcing could strongly bias the simulated O 2 changes.We thus propose for the next multi-modeling projects to coordinate forced ocean-only model simulations with a model spin-up strategy that employs a pre-industrial-like atmospheric forcing to achieve background mean states close to pre-industrial conditions.The potential experimental design could be based on an ocean model spin-up using the corrected atmospheric forcing used in the recent study by Huguenin et al., 2022.In this protocol, the corrected atmospheric forcing product uses air temperature and shortwave radiation adjusted to pre-industrial states from the recent IPCC report.This adjusted forcing successfully simulates global OHC during the historical era in the ACCESS-OM2 (one of the models analyzed in this study), which is currently weaker in the standard OMIP2 simulations (see Figure 3B and Huguenin et al., 2022).The similar approach should be tested using a suite of CMIP6 models, extending our simulations and analysis to other ocean biogeochemical tracers.
An alternative route may include using pre-industrial atmospheric forcing from the coupled simulations from each model participating in CMIP6 DECK simulations (Eyring et al., 2016).For either option, it is important to explore the ocean biogeochemical response to different background mean states using a suite of models to elucidate how ocean biogeochemistry responded in recent decades.Coordinated sensitivity experiments based on perturbed atmospheric forcing will enable us to further elucidate the mechanisms controlling the multi-decadal O 2 variability and trend (and factors leading to biases in the current simulations).Wind perturbation experiments, such as changing the magnitudes of zonal wind in the tropical Pacific Ocean, Southern Ocean, etc., built upon the previous sensitivity studies with specific models, for example from (Duteil et al., 2018) and (Ridder and England, 2014) will further elucidate how "common" atmospheric forcing biases leads to O 2 and ocean biogeochemistry biases in climatological mean states and long-term changes.Exploring O 2 variability driven by major climate modes, such as the Pacific Decadal Oscillation (PDO) and El Niño Southern Oscillation (ENSO) should also be explored on top of these sensitivity experiments (e.g.Deutsch et al., 2011;Ito and Deutsch, 2013;Deutsch et al., 2014;Eddebbar et al., 2017;Duteil et al., 2018;Ito et al., 2019;Poupon et al., 2023).
Ocean biogeochemistry models are known to have major issues in simulating the observed patterns and magnitude of global ocean deoxygenation in past decades (Stramma et al., 2012;Oschlies et al., 2017).Our analysis shows some improvements in simulating the observed global ocean deoxygenation.The observational datasets contains major uncertainties, with varying rates of O 2 loss across studies from the estimates of Ito et al. (2017) to more conservative estimates of Schmidtko et al. (2017) andIto (2022).Recent efforts to combine existing datasets and new observations from autonomous floats e.g.Addey (2022) along with more robust quantification of uncertainties associated with data coverage and interpolation, alongside improved comparisons to models will be critical for a comprehensive assessment.Despite their shortcoming, forced ocean model simulations remain highly useful tools, particularly in understanding the drivers of internal climate variability in the O 2 content and distributions (e.g.Ito et al., 2019;Buchanan and Tagliabue, 2021).To maximize their use, future coordinated model experiments should highlight the choice of spin-up strategy for ocean biogeochemistry to achieve pre-industrial like ocean states for O 2 as well as other ocean biogeochemical cycles.The cyclic forcing (as in OMIP simulations) and associated discontinuity in solubility and ocean ventilation change could impact the ocean carbon cycle, and this could be explored quantitatively in future work.More detailed regional analysis is also needed to understand the mechanisms regulating global O 2 levels, particularly how ocean circulation and biological processes induce region-specific changes.

1
FIGURE 1Time series of upper ocean global (0-700m) mean sea water temperature (A, E) and O 2 (C, G) from the OMIP1 and OMIP2 simulations from full cycles (310-366 years).OMIP-MM-MEAN is multi-model mean from both OMIP1 and OMIP2 simulations.Gray vertical dashed lines show the repeating forcing cycle, 5 cycles for OMIP1 and 6 cycles for OMIP2 simulations.The OMIP drift is defined as the difference between the initial condition and the historical climatology based on 50-year mean from the final cycle of the simulations to quantify the direction of model's drift relative to the initial condition (B, D, F, H).Black horizontal dashed lines show zero lines.
FIGURE 2 Climatological mean distributions of the upper ocean (0-700m) column O 2 inventory based on the past 5 decades of data.Column O 2 inventory is vertically integrated O 2 for each grid point.(A) OMIP1 multi-model mean, (B) OMIP2 multi-model mean, (C) CMIP6 Historical multi-model mean, and (D) from the World Ocean Atlas 2018.The number on upper left of each panel represents the globally integrated O 2 inventory.Lower (E-G) show difference between simulations and the World Ocean Atlas 2018.The (H) shows the difference between OMIP1 and OMIP2 multi-model mean [(B) minus (A)].
FIGURE 3Historical time series of upper ocean (0-700m) global O 2 inventory anomaly and ocean heat content (OHC) from OMIP1, OMIP2 and CMIP6 Historical multi-model mean and observational dataset for (A, B).Note that time series for OMIP2 simulations from 1958-1967 are omitted from the analysis because of the initial forcing shock from the repeat forcing applied to the models(Tsujino et al., 2020).The reference for the anomaly for each simulation is, 1958-2007 for the OMIP1, 1969-2018 for the OMIP2, and 1965-2014 for the CMIP6 Historical simulations, respectively.Gray triangles denote the timing of major volcanic eruptions and vertical cyan lines denote the timing of "hiatus".(C-H) show the summary of temporal SD (i.e.variability) of global O 2 inventory and OHC from the OMIP1, OMIP2 and CMIP6 Historical simulations.Temporal SD is based on SD from the past 50 years (the same as the reference periods) of detrended global O 2 inventory and OHC time series.Temporal SD from the observational datasets are shown in horizontal lines (black and cyan solid and dotted lines).

FIGURE 4
FIGURE 4 Patterns of linear trend of upper ocean (0-700m) column O2 inventory and OHC for the past 5 decades from models and observation.Model simulations are based on multi-model mean from each configuration.Linear trend calculation period for each simulation is, 1958-2007 for the OMIP1 (A, B), 1969-2018 for the OMIP2 (C, D), and 1965-2014 for the CMIP6 Historical simulations (E, F).The regions with black dots show locations of trend statistically significant above the 95% confidence level based on a modified Mann-Kendall test.The observational column O2 inventory and OHC are from WOD2018 (Ito, 2022) and Cheng's dataset (Cheng et al., 2022) (G, H).The OHC trend from other observational dataset are in Supplementary Material.

5
FIGURE 5 Hovmöller diagram of upper ocean (0-700m) O2 anomaly from a suite of model simulations and observational dataset.The values depict the global area-weighted average of O2 anomaly for each depth.The anomaly is defined as a deviation from the reference long-term climatology.The reference for the anomaly for each simulation is, 1958-2007 for the OMIP1 (A), 1969-2018 for the OMIP2 (B), 1965-2014 for the CMIP6 Historical simulations (C), and for the World Ocean Database 2018 (Ito et al., 2022) (D), respectively.Model simulations are based on multi-model mean.Gray triangles denote the timing of major volcanic eruptions and vertical gray lines denote the timing of "hiatus".

Figure 9 .
FIGURE 6 FIGURE 7Historical time series of upper ocean (0-700m) global mean O 2,sat and -1 × AOU (see section 2.3) from OMIP1, OMIP2 and CMIP6 Historical multimodel mean and observational dataset (A, B).Note that time series for OMIP2 simulations from 1958-1967 are omitted from the analysis because of the initial forcing shock from the repeat forcing applied to the models(Tsujino et al., 2020).The reference for the anomaly for each simulation is, 1958-2007 for the OMIP1, 1969-2018 for the OMIP2, and 1965-2014 for the CMIP6 historical simulations.Gray triangles denote the timing of major volcanic eruptions and vertical cyan lines denote the timing of "hiatus".

FIGURE 8
FIGURE 8 Linear trend of upper ocean (0-700m) vertical mean O2 saturation and -1 × AOU.Linear trend calculation period for each simulation is 1958-2007 for the OMIP1 (A, B), 1969-2018 for the OMIP2 (C, D), and 1965-2014 for the CMIP6 historical (E, F) simulations.The region with black dots show locations of trend statistically significant at the 95% confidence level based on a modified Mann-Kendall test.The observational O2 saturation and AOU are based on WOD2018 (Ito, 2022) and Cheng's dataset (Cheng et al., 2022) (G, H).The O2 saturation and AOU from other observational datasets are in Supplementary Material.
FIGURE 9 Relationship between upper ocean (0-700m) global O 2 saturation and -1 × AOU from a suite of model simulations and observational datasets.The O 2 saturation and minus AOU are in anomaly relative to long-term mean and scatter plots are based on the model output and observations from the past 5 decades [1958-2007 for OMIP1 multi-model mean (A), 1969-2018 for OMIP2 multi-model mean (B), and from 1965-2014 for CMIP6 Historical multi-model mean (C), and 1965-2015 for observations (A-C)].Colored dots represent individual models and solid colored lines are linear regression between O 2 saturation and minus AOU from the multi-model mean.The label +O 2 in the first quadrant shows where gain in O 2 occurs due to changes in O 2 saturation and minus AOU (and similar for -O 2 in the third quadrant).Black dots and triangles represent the O 2 saturation and -1 × AOU based on observations (World Ocean Database 2018 (gridded dataset from Ito, 2022) for O 2 and four observational datasets for sea water temperature and salinity).
FIGURE 10 Relationship between upper ocean (0-700m) global O 2 inventory and OHC from a suite of model simulations and observational dataset from the past 5 decades (1958-2007 for OMIP1 (A), 1969-2018 for OMIP2 (B), and from 1965-2014 for CMIP6 Historical (C), and 1965-2015 for observations (included in A-C along with model simulations)).Gray dots represent individual models and solid colored dots are multi-model mean.The lines show linear regression between the global O 2 inventory and OHC.

FIGURE 11
FIGURE 11Correlation coefficient patterns of column OHC (i.e.two-dimensional data) against the globally integrated upper ocean O 2 inventory time series (i.e.one-dimensional data) averaged over all models (A-C) and from an observational dataset (D).No de-trending applied to the data to focus on the trend signals.Black dots designate robustness, as defined by at least 80%model sign agreement in correlation coefficients.The correlation analysis is based on the period of1958-2007 for OMIP1, 1969-2018 for OMIP2, 1965-2014 for CMIP6 Historical, and 1965-2015 for the observations  [WOD2018 and Cheng's dataset (Cheng et al., 2022;Ito, 2022)].Correlation coefficients based on other observational column OHC dataset are in Supplementary Material.

FIGURE 12
FIGURE 12Linear trend of zonal and meridional wind stress fields from the CORE-II (A, C) and JRA-55 (B, D) atmospheric forcing used for the OMIP1 and OMIP2 simulations.Linear trend calculation period for each simulation is 1958-2007 for the CORE-II, and 1969-2018 for the JRA-55 wind stress fields respectively.The region with black dots show locations of trend statistically significant at the 95% confidence level based on a modified Mann-Kendall test.

TABLE 1 The
OMIP1, OMIP2, and CMIP6 Historical simulations used in this study, including the ocean, sea-ice, and biogeochemical model components, atmospheric forcing dataset, and simulation periods.

TABLE 2
Multi-decadal linear trend of upper ocean global O 2 inventory and OHC.

TABLE 3 A
list of global O 2 inventory and OHC relatio-2018 presented as linear regression coefficients from the past 5 decades(1958-2007 for  OMIP1 multi-model mean, 1969-2018 for OMIP2 multi-model mean, and  from 1965-2014 for CMIP6 Historical multi-model mean, and 1965-2015for observations) of time series from the models and observations.