Decadal variability of oxygen uptake, export, and storage in the Labrador Sea from observations and CMIP6 models

2 The uptake of dissolved oxygen from the atmosphere via air-sea gas exchange and its physical 3 transport away from the region of uptake are crucial for supplying oxygen to the deep ocean. 4 This process takes place in a few key regions that feature intense oxygen uptake, deep water 5 formation, and physical oxygen export. In this study we analyze one such region, the Labrador 6 Sea, utilizing the World Ocean Database (WOD) to construct a 65–year oxygen content time 7 series in the Labrador Sea Water (LSW) layer (0–2200 m). The data reveal decadal variability 8 associated with the strength of deep convection, with a maximum anomaly of 27 mol m − 2 in 9 1992. There is no long-term trend in the time series, suggesting that the mean oxygen uptake is 10 balanced by oxygen export out of the region. 11 We compared the time series with output from nine models of the Ocean Model Intercomparison 12 Project phase 1 in the Climate Model Intercomparison Project phase 6, (CMIP6-OMIP1), and 13 constructed a ”model score” to evaluate how well they match oxygen observations. Most CMIP6-14 OMIP1 models score around 50/100 points and the highest score is 57/100 for the ensemble 15 mean, suggesting that improvements are needed. All of the models underestimate the maximum 16 oxygen content anomaly in the 1990s. One possible cause for this is the representation of air-sea 17 gas exchange for oxygen, with all models underestimating the mean uptake by a factor of two or 18 more. Unrealistically deep convection and biased mean oxygen profiles may also contribute to 19 the mismatch. Refining the representation of these processes in climate models could be vital for 20 enhanced predictions of deoxygenation.


INTRODUCTION
The Labrador Sea is one of the few regions in the world where oxygen can be transported directly into the deep ocean, which acts as a supply for marine organisms throughout the global oceans.This supply of dissolved oxygen to the deep is the result of oxygen uptake through air-sea gas exchange (Atamanchuk et al., 2020), and its subsequent export out of the region (Koelling et al., 2022).Oxygen uptake occurs primarily during wintertime, when the process of deep convection continually mixes the water column to depths of more than 1000 m, bringing up low-oxygen water from below and maintaining an undersaturation of surface waters relative to the atmosphere (Wolf et al., 2018).The resulting air-sea gradient in oxygen, along with bubble injection driven by strong winds and wave breaking, leads to intense air-sea exchange of oxygen during these winter months (Koelling et al., 2017).As a result the Labrador Sea, and the subpolar North Atlantic as a whole, feature some of the strongest oxygen uptake rates anywhere in the world ocean (Huang et al., 2018), with annual values in the Labrador Sea being on the order of 20 mol m −2 y −1 (Atamanchuk et al., 2020).Thanks to this strong uptake and its link to deep water formation, the subpolar North Atlantic, along with the Southern Ocean, is the main region where oxygen can enter the deep ocean, making it one of the "lungs of the ocean" (Körtzinger et al., 2004;Portela et al., 2020).North Atlantic Deep Water (NADW), a water mass formed in part as a result of deep convection in the Labrador Sea, can be traced by its high oxygen concentrations in the formation region which decrease along southward spreading pathways (Fig. 1).The formation of NADW in the subpolar North Atlantic also leads to a stark difference in mean oxygen concentration between the Atlantic Ocean and the Pacific and Indian Oceans.
The Labrador Sea has been sampled more consistently than most open ocean regions over many decades, with regular measurements dating back to the late 1940s.Furthermore, the main convection region in the interior Labrador Sea is relatively small and homogeneous, such that individual measurements at only a few stations can give useful information about variability in the basin (van Aken et al., 2011;Yashayaev and Loder, 2016).Ocean Weather Ship Bravo was stationed in the center of the basin from the 1940s to 1970s, and following occasional cruises in the 1950s to 1980s, the AR7W line across the basin from Labrador to Greenland has been occupied almost every year since 1987 as part of the WOCE, CLIVAR and AZOMP initiatives (Kieke and Yashayaev, 2015).The comprehensive record of oceanographic variables collected by these projects has been used primarily in the context of the variability in physical properties, and their relation to deep convection and the Atlantic Meridional Overturning Circulation (AMOC) (Yashayaev, 2007;Yashayaev and Loder, 2016).Fewer studies have focused on the variability in oxygen concentrations, with notable exceptions including studies by Stendardo and Gruber (2012) and van Aken et al. (2011) that reported long-term oxygen trends and variability in the subpolar North Atlantic until 2010 from the historic shipboard data, and Rhein et al. (2017) who analyzed variability in oxygen concentrations since the early 1990s.Here, we utilize the data to construct a time series of oxygen content anomalies for the Labrador Sea Water (LSW) layer (0-2200 m) in the central Labrador Sea from 1950 to 2015.We use this time series to discuss how the decadal-scale variability of oxygen concentrations reported in previous studies relates to variability in the uptake of oxygen during deep convection and its subsequent export along NADW spreading pathways.
Changes in the strength of ocean ventilation have been proposed as one of the reasons for widespread global ocean deoxygenation observed in recent decades (Ito et al., 2017;Schmidtko et al., 2017).However, this is not reproduced accurately in climate models, which feature too little deoxygenation particularly in the deep ocean, which may be related to difficulties with representing the processes underlying ventilation (Oschlies et al., 2017(Oschlies et al., , 2018;;Buchanan and Tagliabue, 2021).To assess the ability of climate models to reproduce observed variability in oxygen ventilation in the Labrador Sea, we use the time series constructed from observational data to evaluate nine different models that participated in the Ocean Model Intercomparison Project phase 1 component of the Climate Model Intercomparison Project phase 6 (hereinafter CMIP6-OMIP1).For this purpose, we develop a "model score" based on the ability of CMIP6-OMIP1 models to reproduce the observed variability in the oxygen content time series and mean profile from the WOD data, as well as gas exchange estimates taken from other studies.This score serves as a simplified metric of the ability of climate models to represent the complex processes that underlie ocean ventilation.Our assessment highlights processes which need to be better resolved in climate models in order to more accurately portray oxygen cycling in the Labrador Sea.This analysis could inform improvements for future model generations, and may be applicable to other regions with comparable dynamics.
Finally, we combine our analysis of the long-term variability in oxygen content with insights from previous observational studies of air-sea gas exchange and oxygen export, and with the CMIP6-OMIP1 model data, in order to understand how air-sea gas exchange, oxygen storage, and oxygen export are related on interannual and decadal time scales.

Observational time series
To construct the oxygen inventory time series, we use quality controlled data from the World Ocean Database (WOD; Boyer et al. (2013)) in the central Labrador Sea, which includes oxygen measurements from as early as the 1950s.Some data from individual cruises are available as far back as 1927, but given the large temporal gaps these are not used in our study.We use all measurements within a 200 km radius of 56 • 49.3'N, 52 • 13.2'W from fixed platforms and shipboard hydrography, which are flagged as "good data" in the WOD (see Fig. 1 for location).The data were averaged into annual values with 50 m vertical bins, and cover the period from 1950 to 2015.We chose not to include more recent data from profiling floats such as BGC-Argo in order to avoid biases resulting from the shift of predominantly ship-based data typically collected during spring or summer to predominantly autonomous data collected year-round.Even with the shipboard data, there may be some biases resulting from changing sampling schemes and times between years, particularly in the early parts of the record, as well as year-to-year variability associated with instrumental errors during individual cruises (Yashayaev et al., 2020).However, the focus here is on the longer-term decadal signals rather than variability from one year to the next, and we assume these effects are small compared to the decadal variability in properties, consistent with previous studies (van Aken et al., 2011).To further test the representativeness of a single point and data collected once a year for measuring interannual variability in the whole region, we used CMIP6-OMIP1 model output to construct a synthetic time series derived from one grid point in the center of the convection region once a year and compared it to the full time series using all data within a 200 km radius (Fig. S1).The comparison suggests that taking measurements only at one location during summer each year does not introduce any significant bias.
To construct the time series of oxygen content anomalies, the time-mean oxygen profile was first subtracted from the annual WOD profiles, and the resulting anomalies were interpolated vertically between measurements, and extrapolated to the surface and seafloor, ommitting years where no measurements were available below 2000 m.The anomalies were then integrated over the 0-2200 m depth range to obtain a time series of annual oxygen content anomalies in this depth range which we assume to be the maximum depth typically occupied by Labrador Sea Water (Yashayaev and Loder, 2016).The time series used for further analysis and comparison with the CMIP6-OMIP1 data was constructed by smoothing the annual time series with a 3-year running mean filter to reduce some of the noise associated with the sampling issues discussed above, and then filling any remaining gaps in the data with linear interpolation.
To estimate the uncertainty of the time series, we first calculated the standard deviation σ in each one-year, 50 m depth bin.These annual values were then averaged to obtain a mean standard deviation profile for the entire record as an approximation of the true statistical standard deviation.The standard error was calculated from this mean profile and the number of observations in each bin as σ x = σ √ N . Bins with no measurements were treated as having N = 1, i.e. the standard error is equal to the standard deviation.This method was chosen to avoid underestimating the error in years with fewer observations, where the standard deviation of the samples will be lower than the true standard deviation of the statistical population.The resulting error estimate for the oxygen inventory ranges from 1.59 mol m −2 for recent years with more extensive coverage to 9.65 mol m −2 for years with limited or no measurements.

CMIP6-OMIP1 models
To evaluate the processes governing the oxygen budget in the Labrador Sea, we also used data from nine ocean models that were part of the Coupled Model Intercomparison Project phase 6 (CMIP6), accessed from https://esgf-node.llnl.gov/search/cmip6/.The models selected were those that participated in phase 1 of the ocean model intercomparison project (OMIP1) and included a biogeochemical component (Orr et al., 2017) (Large and Yeager, 2009) that is used in all OMIP1 runs (Orr et al., 2017).Based on the comparison in Fig. S1 we use data from a single grid point in the center of the convection region to be representative of the convective patch.

Oxygen budget calculation
As a biologically active element, the change in the inventory of oxygen for a given volume of the open ocean is controlled by both physical processes, such as air-sea gas exchange, lateral advection, and vertical export, and net community metabolism, which represents the net effect of biological production minus autotrophic and heterotrophic respiration (Martz et al., 2009;Koelling et al., 2017).
where ∆ z 0 O 2 is the change in oxygen content between the surface and depth z in a given time period, and the terms on the right hand side are the changes due to gas exchange, lateral exchanges, vertical transport, and net community metabolism, respectively.In the above definition the vertical and lateral terms, F vert and F lat , are positive for a flow of oxygen out of the volume considered.
One mechanism for vertical oxygen transport is the deepening of the mixed layer, which can have a significant effect in the upper ocean as water with different properties is entrained into the mixed layer (Martz et al., 2009).Vertical exchanges also occur as a result of deep wintertime convection, which can extend the mixed layer to depths exceeding 2000 m in the Labrador Sea, and through diffusive fluxes at the base of the integration depth.In this study, we integrate the oxygen content to the depth reached by the maximum historical convection, 2200 m (Yashayaev and Loder, 2016), so the effect of deep convection, as well as shallow summertime entrainment, is eliminated.Thus, the F vert term in our calculation represents only the exchange of oxygen that may occur with the underlying water masses by turbulent mixing, which should be small in the deep ocean.The F lat term represents lateral exchanges between the interior Labrador Sea and the surrounding boundary current (Wolf et al., 2018;Koelling et al., 2022), as well as its export along interior pathways (Talley and McCartney, 1982;Yashayaev and Loder, 2016).We refer to the sum of the F vert and F lat terms as oxygen export, as they are equivalent to the amount of oxygen that is exported by physical processes, leaving the volume of the Labrador Sea that can be affected by convection again in the following winters.However, it is more accurately the net export, as it includes both the removal of relatively high oxygen Labrador Sea Water (Koelling et al., 2022) and its replacement with lower oxygen water either from exchange with the boundary current or vertical mixing (Wolf et al., 2018).
Previous results studying parts of the oxygen budget in the Labrador Sea suggest that the biological component is relatively small compared to gas exchange and export in the mean balance (Koelling et al., 2017;Wolf et al., 2018).To estimate its variability on interannual to decadal timescales, we use data from the Carbon-based Productivity Model (Westberry et al., 2008), a satellite-based net primary production product, along with a global estimate of oxygen utilization rates in the deep ocean (Karstensen et al., 2008), to calculate annual Net Community Production (NCP, Fig. S2).Although this product only covers the period since 2002, and therefore does not resolve the historical period analyzed here, interannual variability is small (range: 0 mol m −2 to 1.16 mol m −2 of O 2 ), suggesting that the contribution of the biological component and its variability to decadal changes are negligible for the 0-2200 m integrated budget in this region of particularly strong physical forcing.
In our investigation of the oxygen content changes in the following sections, we therefore assume the year-to-year change in oxygen content, also referred to as oxygen storage throughout this study, to be driven only by air-sea gas exchange and export.In section 4, using CMIP6-OMIP1 model data, we rearrange eq. 1 to infer the oxygen export from model output of oxygen storage and gas exchange.

Model score
In order to evaluate the ability of different models to reproduce the observed variability in the Labrador Sea, we construct a model score designed to represent the processes that are relevant for deep ocean ventilation.The source code for calculating the score is provided at https://github.com/jkoell/labrador-seaoxygenSimilar analyses have been conducted in other studies for climate variables on global scales (Gleckler et al., 2008;Anav et al., 2013), as well as for more regional assessments of ocean biogeochemistry, similar to the present study (Rickard et al., 2016;Laurent et al., 2021).Unlike most of these previous works, our analysis focuses mostly on interannual changes instead of seasonal cycles.We also choose a score that is based solely on the comparison with observational data, rather than a relative model score or ranking as used in other studies (Gleckler et al., 2008;Anav et al., 2013).This method is chosen to make the score independent of the number or types of model used, making it easier to compare future models to the results from this study.For example, the score calculation could be applied to high-resolution regional models of the subpolar North Atlantic, or future generations of CMIP (e.g.Koenigk et al. (2021)), and a higher score than the models analyzed here would imply better performance.However, we note that a score calculated relative to the worst-performing model for each category, following Anav et al. (2013), would yield qualitatively similar results.
There are five factors considered, with scores for each ranging from 0 to 20 points, for a maximum total of 100 points which would indicate a perfect match with observations.These are summarized below, with details on score calculation and reasoning for each: • Standard deviation for the 2200m-bottom inventory anomaly: Used to evaluate if there is excess variability below the LSW layer.In observations, oxygen variability in this depth range is lower compared to changes above.However, some models have unrealistically large anomalies near the bottom, likely associated with excessively deep-reaching convection (see section 4.1).Scoring is based on the ratio of the standard deviations from observations and models, with 20 points for a ratio of 1.The standard deviation of the observational time series used for the calculation is σ obs = 4.23 mol m −2 .
• Average absolute difference of the time-mean profile with observations: Designed to measure the bias of the mean oxygen distributions compared to observations.This is split up into two depth ranges, 0-2200 m and 2200 m-bottom, with 0 to 10 points assigned for each.10 points: • Mean annual gas exchange: In the central Labrador Sea, annual net oxygen uptake is an important driver for variability in the oxygen inventory.We use observational estimates from the central Labrador Sea from two previous studies, for a total of five years: 22.1 mol m Mathematically, the scores as described above are given by Note that before the calculation of the final score S total , individual score components below 0 are set to 0, and scores above 20 to 20.
As noted in previous studies, the selection of metrics by which to score or rank models will always be somewhat arbitrary (Anav et al., 2013).We choose an equal weighting of all five metrics to avoid subjective biases in how important each one is, although there will still be some bias in our choice of which metrics to include at all.Gleckler et al. (2008) suggested that an overall performance metric for climate models is not very informative, because there are many offsetting errors in different components and different regions, and the performance of a given model will depend on the specific application.The approach we took here is to calculate a score that is specific to one mechanism -deep ocean ventilation with oxygen -in one key region -the Labrador Sea -rather than a more comprehensive global ranking.This should ensure that the score calculated here more meaningfully represents the fidelity of different models in simulating the overall effect of this particular process.

Variability of the oxygen inventory
The time evolution of the oxygen inventory anomaly in the Labrador Sea derived from all available historical shipboard data is shown in Fig. 2. The time series exhibits decadal variability, with maxima in the 1960s, 1970s, and 1990s, and minima in 1970, 1980, and 2010.These maxima have been reported in previous studies on Labrador Sea oxygen concentrations, and are closely linked to extended periods of particularly strong convection events (van Aken et al., 2011).A period of consistently increasing oxygen inventory anomalies occurred during the record deep convection of the early 1990s (Yashayaev, 2007), followed by a steady decrease in the late 1990s, and a minimum in 2010.There is an increase in oxygen content at the end of the record, which is associated with the re-emergence of deep convection starting in winter 2014/2015 (Yashayaev and Loder, 2016;Rhein et al., 2017) that led to enhanced oxygen uptake (Koelling et al., 2017;Atamanchuk et al., 2020).During the buildup to the maximum oxygen content anomaly from 1980 to 1992, there was an average increase, or oxygen storage, of 2.82 mol m −2 per year, and the subsequent decrease occurred at a rate of 2.04 mol m −2 per year from 1994-2006.

Frontiers
Although the inventory change could intuitively be assumed to be linked to gas exchange, the relationship between the two is not necessarily strictly proportional.An analysis of oxygen inventories from Argo floats showed that even during times of strong convection and larger than usual gas exchange, there is little year-to-year change in the oxygen inventory (Wolf et al., 2018).Indeed, the total increase in oxygen content over a 12-year period discussed above is only 1.5 times the uptake from the atmosphere for one single year of 22.1 mol m −2 observed by Atamanchuk et al. (2020).Estimating the effect of eddies such as Irminger Rings advecting lower oxygen water into the central Labrador Sea, Wolf et al. (2018) found that they could account for oxygen losses between 3.98 mol m −2 and 11.34 mol m −2 per year .Koelling et al. (2022) showed that about 50% of the oxygen taken up in 2017 is rapidly exported southward into the Deep Western Boundary Current, and hypothesized that additional oxygen export is associated with different pathways of LSW spreading, such as an interior flow towards the Irminger Sea that has been observed in transient tracer studies (Sy et al., 1997) and hydrographic data (Talley and McCartney, 1982;Yashayaev, 2007).Together with our time series, these results suggest that to first order, the annual net oxygen uptake from the atmosphere is largely balanced by the export of newly ventilated water out of the formation region, and its replacement with lower oxygen water from the surrounding regions, and the variability evident in Fig. 2 represents the cumulative effect of the small annual residual between these two processes.The continuous loss of oxygen through outflowing LSW and its replacement with lower-oxygen water may in fact be crucial for maintaining the Labrador Sea as a net oxygen sink year after year, as it ensures that the wintertime mixed layer is far from equilibrium with the atmosphere even during years of weaker convection (Wolf et al., 2018).

The Labrador Sea's oxygen reservoir
Given the dominant balance between gas exchange and export in the annual oxygen budget, we conclude that the oxygen content increase from 1980-1992, and decrease from 1994-2006, is a result of uptake from the atmosphere exceeding the losses from oxygen export during the former period, and export exceeding uptake during the latter.The first of these periods, referred to in the following as Period 1, encompasses the deepest reaching convection on record, which resulted in the coldest, freshest, and most oxygenated class of LSW ever observed (Yashayaev, 2007;van Aken et al., 2011).Following the peak of this convective activity in the early 1990s, the Labrador Sea experienced a period of weaker than usual convection during 1994-2006 coinciding with the decrease in oxygen content, which we will refer to as Period 2.
This time scale of the buildup and subsequent drainage of the oxygen reservoir may be related to the ocean's "memory" in the region, which is a feature that has been observed in the Labrador Sea Water's physical properties (Straneo et al., 2003;Yashayaev, 2007).Although there is a significant export of LSW every year (Yashayaev and Loder, 2016), some of the newly formed deep water remains in the formation region, so that the changes set by air-sea forcing during one convective season have an effect on the initial conditions during the following winter.The result is a preconditioning of the water column, making it easier for convection to penetrate deeper in a year following deep-reaching convection, and more difficult following shallower convection (Yashayaev and Loder, 2017).This process contributes to the LSW reservoir in the interior of the basin becoming progressively colder, fresher, and more oxygenated during prolonged periods of increasingly deep reaching convection (van Aken et al., 2011).At the same time, there is a continuous exchange between the interior convection region and the warmer, saltier boundary current, as well as exchanges in the inner subpolar gyre (Straneo, 2006;Yashayaev, 2007).This results in a positive advective heat flux (Yashayaev and Loder, 2009), which acts to increase stratification in the central Labrador Sea and erode the pool of homogenized deep water (Tagklis et al., 2020).This exchange, much of which happens along density surfaces, is also the main mechanism by which LSW from the interior is added to the outflowing boundary current (Georgiou et al., 2020;Zou et al., 2020).Interannual variability in the lateral exchange between the boundary current and interior Labrador Sea is related to the boundary-interior gradient of layer thickness, and the resulting advective heat gain is mostly controlled by the temperature of LSW (Straneo, 2006).
This variability in boundary-interior exchange could also affect the oxygen budget, as heat content and oxygen inventory between 0-2200 m in the central Labrador Sea are highly correlated (R = -0.83;not shown).This exchange may explain the leveling off and reversal of the inventory changes evident in Fig. 2: If the exchange of oxygen, like heat, is proportional to its reservoir in the interior, a continuous increase in oxygen content would lead to an increase in the oxygen lost through exchanges with the boundary current.
Eventually the export of oxygen would exceed gas exchange, with the uptake from the atmosphere no longer sufficient to replenish the oxygen that is exported out of the region, resulting in a decrease in oxygen content.This could occur either due to the continued intensification of the lateral oxygen exchange, or a decrease in oxygen uptake from the atmosphere.Since some LSW can stay in the formation region for several years (Straneo et al., 2003), the elevated oxygen export would persist for some time even after gas exchange weakened.The buildup of oxygen during Period 1, and following decrease during Period 2, would be consistent with this mechanism.During Period 1, several consecutive years of deep-reaching convection and intense oxygen uptake could explain the continued increase in oxygen content, which in turn would drive an increased boundary-interior oxygen exchange.In Period 2, the exchange would remain elevated even once the oxygen uptake weakened following peak convection, thus slowly draining the excess oxygen accumulated during Period 1.
However, from the oxygen content time series alone it is not possible to know for certain whether the mechanism proposed above is at play in the Labrador Sea.The variability could also be explained by changes in gas exchange with constant export, changes in export with constant gas exchange, or some other variability in the gas exchange-export balance that does not follow the pattern outlined above.In order to fully disentangle the relationship between decadal variability in oxygen uptake, gas exchange, and inventory change, sustained long-term observations of all three would be needed.While this may be possible in the future using the much broader coverage of in situ measurements in recent years (von Oppeln-Bronikowski et al., 2021;Atamanchuk et al., 2022;Sharp et al., 2022), at present a multi-decade observational record only exists for the oxygen inventory.In the meantime, data from ocean climate models may be a useful tool to better understand these processes, which is the focus of the following section.

OXYGEN BUDGET IN CMIP6-OMIP1
In this section, we use data from nine different CMIP6 models in order to analyze the balance between the different terms in the oxygen budget.We use output from all models that participated in the OMIP1 phase and provided fields of oxygen and air-sea flux of oxygen, as described in section 2. Section 4.1 focuses on evaluating how well the models reproduce the observational record, and uses the score defined in Section 2.4 to assess model performance.In section 4.2, we use the ensemble mean in order to diagnose differences in gas exchange, oxygen export, and oxygen storage between the two periods defined in section 3.2.

Model Evaluation
Figure 3A shows the central Labrador Sea's oxygen inventory anomaly between 0-2200 m taken from nine different CMIP6-OMIP1 models that include ocean biogeochemistry, as well as the mean of all models (see Section 2 for details).While there is some spread between the modeled oxygen inventory in any given year, they all exhibit variability similar to observations on decadal time scales.In particular, a maximum in Frontiers oxygen content in the mid-1990s is evident in all models, suggesting that the impact of the record-deep convection during this time (Yashayaev, 2007) is reproduced to some extent in CMIP6-OMIP1.However, the magnitude of the peak oxygen content anomaly is reduced by about half compared to the observational record.
A possible explanation for this discrepancy could be the representation of gas exchange in climate models.
The biogeochemical protocols for participants of OMIP (Orr et al., 2017) state that modelers should use the gas exchange parameterization of Wanninkhof (1992) with the updated coefficients of (Wanninkhof, 2014).This parameterization is constrained to reproduce global air-sea fluxes of carbon, but has been shown to substantially underestimate oxygen uptake in the Labrador Sea because it does not include a bubble-mediated term, which makes up a significant fraction of the total gas exchange for oxygen in this region (Atamanchuk et al., 2020;Koelling et al., 2017).Figure 3B shows the time series of the annual gas exchange from the CMIP6-OMIP1 models, with time-mean values provided in Table S1.Indeed, the models seem to significantly underestimate the gas exchange observed in the real ocean: The highest mean annual gas exchange from any of the models is 10.91 mol m −2 (NOAA GFDL), which is less than half of the uptake inferred by Atamanchuk et al. (2020) and Wolf et al. (2018), but comparable to an early estimate by Körtzinger et al. (2008) that used the Wanninkhof (1992) parameterization.The mean annual gas exchange for the ensemble mean is 7.22 mol m −2 , about a third of observation-based estimates.This strong discrepancy with the observed data highlights again the need for critical evaluation of gas exchange parameterizations for oxygen in climate models as called for in previous studies (Atamanchuk et al., 2020;Seltzer et al., 2023).
Another issue common in CMIP is that modelled convection depths in the Labrador Sea are often much deeper than in observations (Heuzé, 2017(Heuzé, , 2021)), which may be related to their inability to resolve submesoscale processes (Tagklis et al., 2020).As a result, Labrador Sea Water in these models penetrates deeper than observed in the real ocean, and the deep overflow water masses below LSW are much less pronounced than in observations.This is evident in Fig. 3c, as neither the low-oxygen Northeast Atlantic Deep Water (NEADW) layer from about 1500-3000 m depth, nor the near-bottom oxygen maximum associated with Denmark Strait Overflow Water (Yashayaev and Loder, 2017), are discernible in the model mean oxygen profile.The effect of unrealistically deep-reaching convection can also be seen in oxygen concentration anomalies (Fig. 4).In particular, the IPSL CM6a, MRI ESM2, and NOAA GFDL models have the maximum anomalies in oxygen concentrations near the bottom, rather than in the 1000-2000 m depth range typically occupied by LSW in the real ocean, and all three of these models have been shown to have much deeper convection compared to observations (Adcroft et al., 2019;Boucher et al., 2020;Jackson and Petit, 2022).This is another factor that can contribute to the underestimation of oxygen content anomalies in the LSW layer (Fig. 3a), as the variability in LSW oxygen in these models is spread over the whole water column, rather than just 0-2200 m.The lack of realistic overflow water masses could also be contributing to gas exchange being too weak.During some years, convection can penetrate into the salty, low-oxygen NEADW layer underlying LSW, incorporating it into the newly formed water mass (Yashayaev, 2007).This entrainment of old waters with lower oxygen is an important driver for maintaining undersaturation of surface waters throughout the convective season (Wolf et al., 2018).Thus, the lack of a clear oxygen minimum below LSW (Fig. 3c) could result in surface oxygen concentrations being too close to saturation, reducing the magnitude of the calculated oxygen uptake.
Overall, while there generally is multidecadal variability in oxygen content at similar time scales to observations, many CMIP6-OMIP1 models seem to struggle with representing the more detailed aspects of the Labrador Sea's oxygen dynamics.This includes weaker amplitude of the multidecadal signal (Fig.

Frontiers
3a), reduced mean oxygen uptake (Fig. 3b), biases in the mean oxygen profile (Fig. 3c), and excessively deep convection (Fig. 4).In order to assess each model's performance in these areas, we calculated a score based on five factors, as described in section 2.4, and the scores for each metric as well as the total score are given in Table 2.
The majority of the models score close to 50 out of 100 points, and none higher than 53/100.Based on the different score components (Table 2) the exact nature of the mismatch with observations is different for each model.For example, MRI ESM2 scores the highest on the correlation with the LSW time series, but also has a low amplitude for the LSW maximum, too much variability below the LSW layer, and the weakest mean gas exchange.Conversely, the NCAR CESM2 model does well for the amplitude of the maximum and the mean gas exchange, but the correlation with the observational time series is weaker.
Using the ensemble mean of all models results in a better score than any individual model, consistent with the expectation that the mean of an ensemble of models generally is closer to the "true" signal than individual members (Collins et al., 2014).However, even the ensemble mean only achieves a score of 57, with shortcomings evident particularly in the amplitude of the oxygen inventory maximum and the mean gas exchange.
While the exact scores are somewhat arbitrary, these results highlight the need for improvements in modelling the oxygen cycle in this key region for deep ocean ventilation.Given that models are our only tool for projections of future ocean deoxygenation, we echo Oschlies et al. ( 2018)'s sentiment that discrepancies between models and observations need to be resolved in order to understand how oxygen levels may change in the future.We hope that the model score created here can contribute to that cause, providing an easy and consistent way to evaluate how key metrics related to deep ocean ventilation in the Labrador Sea are represented in ocean models.

Balance between oxygen storage, gas exchange, and export
Despite the CMIP6-OMIP1 model's shortcomings discussed above, they can be useful for testing the links between gas exchange, oxygen content change, and export in the Labrador Sea.In particular, the increase in the oxygen inventory associated with the 1987-1994 class of LSW in Period 1, and subsequent decrease in Period 2, is present in all models (Fig. 3a), albeit with reduced magnitude.Below, we assess how the different terms of the oxygen budget in the Labrador Sea vary during this record deep convection period in the CMIP6-OMIP1 models.
To test how the multidecadal oxygen content variability is controlled by gas exchange and oxygen export, we calculate annual oxygen storage rates and gas exchange using the ensemble mean of the nine CMIP6-OMIP1 models, and infer the export as the residual by rearranging equation 1. Fig. 5A shows the time series of the annual mean gas exchange and oxygen storage from the model outputs.Oxygen storage is the year-to-year change in the oxygen inventory, i.e. the first derivative of the time series shown in Fig. 3A.
The long-term mean is 7.22 mol m −2 y −1 for the gas exchange and 0 mol m −2 y −1 for the storage term.
Thus, the mean of the export, calculated as the residual, is also 7.22 mol m −2 y −1 : On long time scales, the net annual oxygen uptake is balanced by a net export of oxygen.Interannual variability in gas exchange is positively correlated with variability in storage (R = 0.66 for the multi-model mean, Fig. 5B) as well as export (R = 0.64, Fig. 5C), while there is relatively little correlation between export and storage (R = -0.11,not shown).This suggests that both storage and export are driven mostly by variability in oxygen uptake, with both the local oxygen inventory and the amount of oxygen that is exported increasing during years with more intense air-sea gas exchange, and decreasing when less oxygen is taken up.There is also a relationship between the oxygen content anomaly (Fig. 3A) and oxygen export, with a correlation of R = 0.73.This Frontiers would be consistent with a lateral exchange of oxygen that is proportional to the boundary-interior gradient, as proposed in section 3.2.
On decadal time scales, variability in oxygen storage can be an important factor in controlling the linkage between uptake and export.Period 1 and 2 are shaded in Fig. 5A in green and red, respectively, and mean values of the three terms over each period are shown in table 3, with the values for individual models summarized in table S2.By definition, the storage term is generally positive in the first period (1.07 mol m −2 y −1 ) and negative in the latter (-1.22 mol m −2 y −1 ), although there is also interannual and inter-model variability around this mean.The period of positive storage changes is accompanied by the largest gas exchange over the model record (9.60 mol m −2 y −1 ), while oxygen uptake during the second period is 6.89 mol m −2 y −1 , slightly less than the long-term average.The difference in gas exchange exceeds the difference in oxygen storage because the oxygen export is also stronger in Period 1 than Period 2, changing from 8.52 mol m −2 y −1 to 8.11 mol m −2 y −1 .The modelled data suggest that the period of record strong convection in the 1980s and early 1990s featured the record strongest gas exchange and oxygen storage (Fig. 5A), as well as the record oxygen export (not shown).
Although both oxygen uptake and oxygen export are decreased in Period 2 compared to their record values in Period 1, the change in the latter is much smaller: A 2.71 mol m −2 y −1 change in gas exchange only corresponds to a change in oxygen export by 0.40 mol m −2 y −1 , or 14% of the decreased uptake.
The remainder is balanced by a decrease in oxygen storage of 2.29 mol m −2 y −1 , 85% of oxygen uptake change, as the oxygen reservoir switches from accumulating excess oxygen to exporting more than is added.This suggests that although the dominant balance in the long-term mean is between gas exchange and export, oxygen storage in the interior Labrador Sea can be important for modulating their variability on decadal time scales, acting as a buffer between increased uptake and increased export.The switch between the two periods from positive to negative oxygen storage, and stronger to weaker gas exchange, are robust across all models (table S2).The change in export is negative in most models, but there is a wider range of behaviors, likely reflecting differences in model physics such as ocean circulation.
Our analysis of the CMIP6-OMIP1 data thus suggests that although much of the oxygen taken up from the atmosphere is exported out of the formation region, there is also an accumulation, or storage, of oxygen in the interior Labrador Sea during periods of high oxygen uptake, consistent with a multi-year residence time of LSW (Straneo et al., 2003).This oxygen storage acts as a buffer, partially decoupling the export of oxygen out of the region from the uptake via gas exchange, dampening the impact of variations in oxygen uptake on export, and ensuring a more steady deep water oxygen supply to the rest of the ocean.

Ventilation and deoxygenation in CMIP6-OMIP1
Our analysis shows that contemporary CMIP6-OMIP1 models struggle to accurately portray the magnitude of oxygen content changes, the mean vertical profile of oxygen, and mean air-sea gas exchange in the Labrador Sea (section 4.1, table 2), suggesting a need for improvement of modeled oxygen cycles in this region.
This is particularly true given the importance of the Labrador Sea and the subpolar North Atlantic as a whole for ventilating the global ocean.A major shortcoming appears to be the representation of air-sea gas exchange, with CMIP6-OMIP1 models estimating the annual net oxygen uptake to be only about a third of observed values.This is in line with previous studies in the region that had shown the gas exchange parameterization used in CMIP6-OMIP1 models to severely underestimate oxygen uptake during deep convection in winter (Koelling et al., 2017;Atamanchuk et al., 2020;Seltzer et al., 2023), and could be responsible for the reduced magnitude of oxygen content changes compared to observations.
Another contributing factor may be the difficulties that some CMIP6-OMIP1 models have with modelling convection to realistic depths (Fig. 4) and representing overflow water masses (Fig. 3C), which can lead to biases in oxygen distributions throughout the water column, including surface oxygen being too close to saturation during deep convection.
Since gas exchange is balanced primarily by oxygen export out of the region this weaker oxygen uptake would also result in too little supply of oxygen from the deep water formation area to the rest of the global ocean in CMIP6-OMIP1 models.Given that the bubble-mediated gas flux is important for air-sea gas exchange in the other major deep ocean ventilation region, the Southern Ocean (Bushinsky et al., 2017), it may be expected that oxygen uptake, storage, and export there are likewise underestimated.Consequently, the mean strength of deep ocean ventilation in both subpolar regions as currently represented in global climate models, as well its variability, may be too small.This would be consistent with a recent finding that climate models underestimate ventilation changes (Buchanan and Tagliabue, 2021), and could partially explain discrepancies in ocean deoxygenation between models and observations (Oschlies et al., 2017(Oschlies et al., , 2018)).Furthermore, if the oxygen supply to the deep ocean is underestimated in global climate models, then its consumption by marine organisms would have to also be too weak in order to avoid spurious trends in global oxygen concentrations in the interior ocean.This might have important implications not only for future ocean deoxygenation, but also for model-based studies of the biological carbon pump (BCP), as too little oxygen consumption would correspond to too little carbon remineralization in the deep ocean.An improved representation of deep ocean ventilation with oxygen therefore seems imperative for predicting the evolution of both deoxygenation and the BCP in a changing ocean.

Gas exchange and oxygen export
Despite shortcomings, one aspect that the CMIP6-OMIP1 models do reproduce fairly consistently in the Labrador Sea is the interannual and decadal variability of the oxygen content.Although the magnitude is too weak, the correlation of the time series with the observational dataset is 0.73 for the ensemble mean (table S1), and all models show positive oxygen storage from 1980 to 1992, and negative storage from 1994 to 2006 (table S2).Given that the oxygen content change is related to variability in air-sea gas exchange (Section 4.2), this may suggest that although the magnitude of the gas exchange is too low in the models, its temporal variability (Fig. 3B) more accurately reproduces that of the real ocean.
The mean balance in the oxygen budget in both observations and models is largely between gas exchange and export: There is almost no long-term change in storage over the 60-year period that was analyzed (sections 3.1, 4.2), meaning that the average amount of oxygen exported is equivalent to the average amount taken up from the atmosphere.However, most of the decadal variability in gas exchange seen in the CMIP6-OMIP1 models is imprinted onto the oxygen content, rather than the export.Between the strong convection period 1980-1992 and the weak convection period 1994-2006, the mean annual gas exchange decreases by 2.71 mol m −2 (28% change from Period 1 mean) in CMIP6-OMIP1 models, 85% of which is balanced by the change in the annual storage rate (2.29 mol m −2 ), while only 14% of the decreased gas exchange is reflected in the oxygen export (0.40 mol m −2 , 5% of Period 1 mean).Thus, while the interannual to decadal variability in oxygen uptake does affect the export, its magnitude is dampened by the accumulation and subsequent drainage of oxygen in the central Labrador Sea.Körtzinger et al. (2004), using one of the first deployments of biogoeochemical Argo floats, referred to the wintertime oxygen uptake in the Labrador Sea as the ocean "taking a deep breath".Our results show that the multi-year residence time of LSW in the interior of the basin leads to the ocean "holding its breath", storing some of the excess oxygen taken up during strong convection years, and exporting it in subsequent years.
This relative insensitivity of the export to atmospheric forcing reported here differs from estimates based on the change in layer thickness of LSW, which suggest almost a factor of 3 difference in the volume of LSW export between strong and weak convection years (Yashayaev and Loder, 2016).The difference may stem from the fact that their analysis highlights the export of LSW over a 1-year period following the previous convection season, while the oxygen-based estimate discussed here includes the export of older LSW as well.During particularly strong convection periods, stronger gas exchange does lead to increased export, but also increases the oxygen content in the formation region since some of the newly formed LSW stays in the region from one year to the next (Straneo et al., 2003).This "storage" affects the export in subsequent years, such that some of the LSW and oxygen export from a strong convection period continues even after convection has weakened and the oxygen uptake has decreased.
On the other hand, if convection in the Labrador Sea were to continuously weaken or stop altogether as a result of climate change, as has been suggested could occur (e.g.Oltmanns et al. (2018)), the oxygen uptake, and therefore also the export, would likely decrease dramatically.The shallower mixed layer would reach equilibrium quickly, limiting the amount of oxygen that can be taken up, while the absence of convection would also impede any supply of this atmospheric oxygen to the deep ocean.Due to the time lag between oxygen uptake and export introduced by the Labrador Sea's oxygen reservoir, there would still be a net flow of oxygen out of the basin for some time, but its rate would continuously decrease as the oxygen stored in the interior Labrador Sea slowly drains out along LSW export pathways.Eventually, the deep oxygen reservoir would disappear, and the Labrador Sea would no longer act as a source of oxygen for the deep ocean, with potentially grave impacts for marine ecosystems as the ocean loses its breath.
One key question that remains is how uptake and export of oxygen in the Labrador Sea are related to the strength of the Atlantic Meridional Oveturning Circulation (AMOC).The strength of the AMOC varies on decadal time scales (Moat et al., 2020;Jackson et al., 2022;Le Bras et al., 2023), and future projections with CMIP models suggest that it will weaken over the next century (Bakker et al., 2016;Weijer et al., 2020).In the Labrador Sea, this would mean a decreased strength of both the outflowing boundary current, which is responsible for some of the oxygen export (Koelling et al., 2022), and the flow of lower-oxygen water into the basin.In an analysis of decadal changes in oxygen concentrations in the neighboring Irminger Sea, Feucher et al. (2022) found that changes in ocean circulation play an important role in controlling oxygen variability.In contrast, our analysis of CMIP6-OMIP1 output suggests that the oxygen export does not play a major role for the oxygen content in the interior Labrador Sea, with most of the variability in both oxygen content and export set by changes in the uptake from the atmosphere.
However, while this is true for the multi-model mean, individual models do have a larger correlation between oxygen storage and export, suggesting that this finding is dependent on model physics.A more detailed analysis of the differences in circulation between the models is beyond the scope of the current study, but could provide further insight into whether there is a linkage between the AMOC and oxygen transports.Our understanding of these processes will also benefit from the addition of oxygen sensors to the OSNAP mooring array, which will enable calculation of cross-basin oxygen transports associated with the overturning circulation (Atamanchuk et al., 2022).
In order to be able to monitor and predict the future variability in ventilation in this basin and the subpolar North Atlantic as a whole, the long-term context provided by the present study will need to be combined with more detailed analyses of the present-day oxygen uptake and export on basin-wide scales.In recent years, there have been regular oxygen measurements from gliders in the Labrador Sea (von Oppeln-Bronikowski et al., 2021), the use of moored oxygen sensors in the SPNA has increased dramatically (Atamanchuk et al., 2022), and with the proliferation of BGC-Argo floats throughout the ocean, full global 4-D fields of oxygen based on observational data are now becoming available (Sharp et al., 2022).This wealth of data concentrated in a small but key area of the ocean could be exploited in the future to study ocean ventilation by constraining the uptake, storage, and export of oxygen in much more detail than was possible in the present study, and guide improvement in how these processes are implemented in climate models.Given our current lack of knowledge about the mechanisms driving deoxygenation and changes in deep ocean ventilation (Oschlies et al., 2018), improving our understanding of these topics, and their representation in models, will be imperative for reliably predicting how they may change in the future.

FIGURE CAPTIONS
Figure 1.Global mean dissolved oxygen concentration between 500-1500 m depth from the GOBAI-O 2 gridded oxygen product (Sharp et al., 2022).Inset shows detail in the Labrador Sea, with green symbols representing measurements used to construct the time series in Section 2.

•
0-2200 m inventory anomaly with observational time series: This metric measures the extent to which the model time series reproduces the phase of the decadal variability evident in the observational data.Scoring is calculated as S corr = R 2 0.81 × 20 for R > 0, and S corr = 0 for R ≤ 0 Maximum 0-2200 m inventory anomaly: Another measure of how well the observational time series is reproduced, focused on the amplitude of the 1990s LSW layer oxygen content anomaly.To allow for slight variations in timing, the amplitude is calculated as the maximum between 1990 and 1995, minus the minimum from 2004 to 2009.Scoring is based on the ratio of the amplitudes from observations and models, with 20 points for a ratio of 1.The observational maximum used for the calculation is M ax obs = 38.51mol m −2 .

Figure 2 .
Figure 2. Time series of oxygen inventory anomaly from WOD data.Thin gray lines and symbols show annual values.The thick black line shows the time series smoothed with a 3-year running mean, and is dashed where gaps were filled.Blue shading represents the estimated error.

Figure 3 .
Figure 3.Comparison of CMIP6-OMIP1 model data from the central Labrador Sea to observations.In all panels, the thick black line shows the mean of all models, and in panels (A) and (C) the grey line shows the observational data from WOD. (A) Oxygen content anomaly (B) Annual mean air-sea gas exchange of oxygen (C) Mean profiles of oxygen concentration

Figure 5 .
Figure 5. (A) Time series of annual gas exchange (red) and oxygen storage (blue) from the CMIP6-OMIP1 models used in this study.Thin lines show individual models, and thick lines show the ensemble mean, smoothed with a 5-year running mean filter.Mean values for the two periods are given in Table 3. (B) Relationship between annual gas exchange and oxygen storage (C) Relationship between annual gas exchange and export.In panels (B) and (C), individual models are shown as colored symbols, and black circles show the multi-model mean.

Table 1 .
TABLES Summary of models used in this study.All model runs used are OMIP1.

Table 2 .
Scores for each of the five components used in constructing the model score, and total score for each model.The factors are correlation of the LSW layer (0-2200 m) oxygen content with observations, the amplitude of the 1990s maximum in LSW oxygen content, standard deviation of the oxygen content below the LSW layer (2200 m-bottom), mean absolute difference between the modeled and observed oxygen profile for the LSW layer and 2200m bottom layer, and mean gas exchange.Values for the ensemble mean obtained by averaging together the output fields from all models are given at the bottom.The values used for calculating each component of the score are given in table S1 in the supplementary material.