High-Resolution Reanalysis of the Mediterranean Sea Biogeochemistry (1999–2019)

Ocean reanalyses integrate models and observations to provide a continuous and consistent reconstruction of the past physical and biogeochemical ocean states and variability. We present a reanalysis of the Mediterranean Sea biogeochemistry at a 1/24° resolution developed within the Copernicus Marine Environment Monitoring Service (CMEMS) framework. The reanalysis is based on the Biogeochemical Flux Model (BFM) coupled with a variational data assimilation scheme (3DVarBio) and forced by the Nucleus for European Modeling of the Ocean (NEMO)–OceanVar physical reanalysis and European Centre for medium-range weather forecasts (ECMWF) reanalysis ERA5 atmospheric fields. Covering the 1999–2019 period with daily means of 12 published and validated biogeochemical state variables, the reanalysis assimilates surface chlorophyll data and integrates EMODnet data as initial conditions, in addition to considering World Ocean Atlas data at the Atlantic boundary, CO2 atmospheric observations, and yearly estimates of riverine nutrient inputs. With the use of multiple observation sources (remote, in situ, and BGC-Argo), the quality of the biogeochemical reanalysis is qualitatively and quantitatively assessed at three validation levels including the evaluation of 12 state variables and fluxes and several process-oriented metrics. The results indicate an overall good reanalysis skill in simulating basin-wide values and variability in the biogeochemical variables. The uncertainty in reproducing observations at the mesoscale and weekly temporal scale is satisfactory for chlorophyll, nutrient, oxygen, and carbonate system variables in the epipelagic layers, whereas the uncertainty increases for a few variables (i.e., oxygen and ammonium) in the mesopelagic layers. The vertical dynamics of phytoplankton and nitrate are positively evaluated with specific metrics using BGC-Argo data. As a consequence of the continuous increases in temperature and salinity documented in the Mediterranean Sea over the last 20 years and atmospheric CO2 invasion, we observe basin-wide biogeochemical signals indicating surface deoxygenation, increases in alkalinity, and dissolved inorganic carbon concentrations, and decreases in pH at the surface. The new, high-resolution reanalysis, open and freely available from the Copernicus Marine Service, allows users from different communities to investigate the spatial and temporal variability in 12 biogeochemical variables and fluxes at different scales (from the mesoscale to the basin-wide scale and from daily to multiyear scales) and the interaction between physical and biogeochemical processes shaping Mediterranean marine ecosystem functioning.

Ocean reanalyses integrate models and observations to provide a continuous and consistent reconstruction of the past physical and biogeochemical ocean states and variability. We present a reanalysis of the Mediterranean Sea biogeochemistry at a 1/24 • resolution developed within the Copernicus Marine Environment Monitoring Service (CMEMS) framework. The reanalysis is based on the Biogeochemical Flux Model (BFM) coupled with a variational data assimilation scheme (3DVarBio) and forced by the Nucleus for European Modeling of the Ocean (NEMO)-OceanVar physical reanalysis and European Centre for medium-range weather forecasts (ECMWF) reanalysis ERA5 atmospheric fields. Covering the 1999-2019 period with daily means of 12 published and validated biogeochemical state variables, the reanalysis assimilates surface chlorophyll data and integrates EMODnet data as initial conditions, in addition to considering World Ocean Atlas data at the Atlantic boundary, CO 2 atmospheric observations, and yearly estimates of riverine nutrient inputs. With the use of multiple observation sources (remote, in situ, and BGC-Argo), the quality of the biogeochemical reanalysis is qualitatively and quantitatively assessed at three validation levels including the evaluation of 12 state variables and fluxes and several processoriented metrics. The results indicate an overall good reanalysis skill in simulating basin-wide values and variability in the biogeochemical variables. The uncertainty in reproducing observations at the mesoscale and weekly temporal scale is satisfactory for chlorophyll, nutrient, oxygen, and carbonate system variables in the epipelagic layers, whereas the uncertainty increases for a few variables (i.e., oxygen and ammonium) in the mesopelagic layers. The vertical dynamics of phytoplankton and nitrate are positively evaluated with specific metrics using BGC-Argo data. As a consequence of the continuous increases in temperature and salinity documented in the Mediterranean Sea over the last 20 years and atmospheric CO 2 invasion, we observe basin-wide biogeochemical signals indicating surface deoxygenation, increases in alkalinity, and dissolved inorganic carbon concentrations, and decreases in pH at the surface. The new, high-resolution reanalysis, open and freely available from the Copernicus Marine Service, allows users from different communities to investigate the spatial and temporal variability in 12 biogeochemical variables and fluxes at different scales (from

INTRODUCTION
Optimal integration of physical-biogeochemical observations and models is becoming increasingly urgent to support both scientific and broader environmental communities. This urgency has emerged from the necessity of quantifying the current characteristics and trends of physical and biogeochemical states at global or regional scales (Fennel et al., 2019), including estimates of the carbon cycle, acidification, primary production, and oxygen and nutrient availability. Given their basin-wide, regular, and seamless spatial and temporal coverage, multidecadal reanalyses can constitute a robust basis to compute meaningful and specific environmental indicators that describe the states and trends of key biogeochemical features (e.g., essential climate and ecological variables; Bojinski et al., 2014;Miloslavich et al., 2018), fluxes, and processes that cannot be completely and extensively observed (e.g., the flux of CO 2 , primary production, and carbon sequestration). These indicators may support environmental state assessment and monitoring of climate change, and its impact on marine ecosystems, following national or international directives [e.g., European Marine Strategy Framework Directive (MSFD), UNEP/MAP]. From this perspective, the EU Copernicus Marine Environment Monitoring Service (CMEMS) has developed ocean monitoring indicators (OMIs), defined as trends and variability indexes of key marine variables computed for the global ocean and European seas (von Schuckmann et al., 2016(von Schuckmann et al., , 2018(von Schuckmann et al., , 2020. At the global scale, OMIs for the ocean biogeochemistry presently account for pH, chlorophyll, CO 2 , nitrate, and oxygen minimum zones, whereas in most regional seas, only the surface chlorophyll OMI extracted from reprocessed satellite ocean color data is available. As a semienclosed sea with a complex orography, the Mediterranean Sea exhibits high variability in physical (Pinardi and Masetti, 2000;Pinardi et al., 2015), biogeochemical, and biological characteristics (Siokou-Frangou et al., 2010). Additionally, intense anthropogenic pressure (Hulme et al., 1999;Micheli et al., 2013) and effects of climate change (Giorgi, 2006) make the Mediterranean Sea a challenging research area, as multiple signals at different scales interact. While basin-wide physical tendencies have been presented (Escudier et al., 2021 and references therein), biogeochemical tendencies are absent and mostly related to single variables (e.g., chlorophyll). Few longterm basin-scale signals of ecosystem changes have been reported for the Mediterranean Sea considering ocean color and in situ data. For example, using surface chlorophyll concentrations retrieved from satellite data, Colella et al. (2016) analyzed the 1998-2009 period and reported significant positive trends in most of the western Mediterranean and Rhodes gyre areas and negative trends in the northern Adriatic Sea. A slightly contrasting result was reported by Salgado-Hernanz et al. (2019) and by the CMEMS ocean monitoring product (Sathyendranath et al., 2018) that showed a positive (negative) trend in the western (eastern) Mediterranean subbasin based on an extended period of the reprocessed satellite dataset. Interannual variability in the dissolved oxygen concentration and a link between ventilation change signals in eastern and western Mediterranean deep waters and two major subbasin climatic shifts, i.e., the Eastern and Western Mediterranean Transients (EMT between 1995and 2001, and WMT between 2004, were recently highlighted (Li and Tanhua, 2020;Mavropoulou et al., 2020). Focusing on limited areas and periods (given the length of in situ time series), changes in the deep oxygen concentration have been reported for the Levantine basin (Sisma-Ventura et al., 2021), Gulf of Lions (Coppola et al., 2018), southern Adriatic Sea (Lipizer et al., 2014), and southern Aegean Sea (Velaoras et al., 2019). Additionally, acidification tendency signals have been observed in the western (Gibraltar Strait; Flecha et al., 2015) and eastern Mediterranean Sea (Cretan basin; Wimart-Rousseau et al., 2021), whereas changes in carbonate system variables [e.g., an increase in alkalinity and dissolved inorganic carbon (DIC) concentrations] have been indirectly estimated in relation to climate change and anthropogenic pressure (Schneider et al., 2010;Touratier and Goyet, 2011;Álvarez et al., 2014;Wimart-Rousseau et al., 2021).
Reanalyses can fill observational gaps, providing a threedimensional, basin-wide, and seamless dataset to investigate temporal and spatial variability on a variety of scales. However, producing a physical-biogeochemical reanalysis is not a trivial task (Park et al., 2018), given the uncertain representativeness of the coupling between physical and biogeochemical processes (particularly vertical transport and consequent effects on the nutricline shape), the limited amount of biogeochemical data (both for assimilation and validation), and the multivariate nature of the biogeochemical state, which includes the complex links between observed (sparse) and simulated (many, in principle) variables. Thus, assessment of the quality of reanalysis products is of paramount importance considering their potential multiple uses (Hernandez et al., 2018). Model validation can be accomplished at different levels (Hipsey et al., 2020), providing uncertainty estimation for modeled variables at different spatial and temporal scales (Stow et al., 2009) and quantifying the modeling capability in terms of the reproduction of ecosystem processes (e.g., Salon et al., 2019;Mignot et al., 2021).
At the Mediterranean basin scale, while long-term simulations with a good performance in reproducing certain specific aspects are available (Lazzari et al., 2012;Macias et al., 2014;Guyennon et al., 2015;Richon et al., 2018;Di Biagio et al., 2019), biogeochemical reanalyses are quite rare. Through the assimilation of ocean color data in a POLCOMS-ERSEM reanalysis at a 1/10 • resolution, Ciavatta et al. (2019) derived a specific eco-regionalization based on phytoplankton functional types. Within the CMEMS framework, a reanalysis at a 1/16 • horizontal resolution covering the period from 1999 to 2014 and assimilating ocean color chlorophyll data was made available in 2016 (Teruzzi et al., 2016), and annually extended until 2018 (Teruzzi et al., 2019a). This low-resolution reanalysis has been adopted to investigate the surface nutrient trend (von Schuckmann et al., 2018), the positive trend of the CO 2 air-sea flux (von Schuckmann et al., 2018), and the multidecadal variation in primary production and its relationship with mixing events in the North-Western Mediterranean Sea .
In the present study, we describe a novel version of the CMEMS biogeochemical reanalysis product covering the 1999-2019 period, with a spatial resolution as high as 1/24 • horizontally and 125 vertical levels, based on the most advanced version of the Mediterranean Sea Biogeochemical Flux Model system (MedBFM) operationally employed in CMEMS for short-term prediction purposes . Additionally, the biogeochemical reanalysis includes the latest update of the assimilation scheme (Teruzzi et al., 2018) of ocean color data (Colella et al., 2021) and off-line coupling with the latest release of a physical reanalysis (Escudier et al., 2021). The reanalysis results are available as daily and monthly 3D fields of 12 biogeochemical variables and fluxes through the CMEMS service.
This study is organized as follows: section "Materials and Methods" describes the MedBFM model setup, the upstream data considered to perform the reanalysis, and the validation framework. Section "Reanalysis Validation" presents the main results in terms of multilevel validation. The temporal tendencies and variability emerging from the reanalysis data are provided in section "Reanalysis Variabilities and Trends." A discussion on the reanalysis quality and long-term tendency and conclusions are outlined in sections "Discussion" and "Conclusion, " respectively.

Modeling Framework
The reanalysis of the Mediterranean Sea biogeochemistry at a horizontal resolution of 1/24 • (approximately 4 km; the model domain is shown in Figure 1), covering the period 1999-2019, was produced with an upgraded version of the MedBFM model system, as detailed and assessed with regard to quality in Salon et al. (2019). In summary, MedBFM was built over the coupling between the OGS transport model (OGSTM), the BFM (Lazzari et al., 2010(Lazzari et al., , 2012Cossarini et al., 2015a;Vichi et al., 2020, and references therein), and the 3DVarBio variational scheme that assimilates surface chlorophyll concentrations (Teruzzi et al., 2014(Teruzzi et al., , 2018(Teruzzi et al., , 2019b. The BFM describes the biogeochemical cycles of carbon and macronutrients (nitrogen, phosphorus, and silicon) in terms of dynamic interactions among the dissolved inorganic, living organic, and non-living organic compartments. Recent MedBFM upgrades, including the assimilation of BGC-Argo float data and biooptical components, have been described in Cossarini et al. (2019) and Terzić et al. (2019Terzić et al. ( , 2021, respectively. MedBFM is coupled off-line with the CMEMS Mediterranean Sea physical reanalysis system (Escudier et al., 2021), which has provided the necessary fields at a daily frequency for 21-year integration (currents, temperature, salinity, diffusivity, wind, and solar radiation) to force tracer transport, dependency of biochemical kinetics on the temperature, and air-sea interactions.
The setup of the biogeochemical component, in addition to that of the physical component, is reported in Table 1. Among the listed elements, a relevant dependency is the atmospheric CO 2 partial pressure, which increased almost linearly from 370 ppm in 1999 to 410 ppm in 2019 (data obtained from the ENEA Lampedusa station; Artuso et al., 2009). The nutrient inputs are imbalanced between the two main basins (i.e., the eastern basin receives almost 70% of the loads) and with atmospheric inputs higher than the nutrient discharges originating from rivers. Indeed, the nitrogen inputs into the western and eastern basins reach 41.4 and 39.8 × 10 9 mol yr −1 , respectively stemming from air deposition, and 10.1 and 34.7 × 10 9 mol yr −1 stemming from rivers, respectively. The western and eastern basin phosphorus inputs are 0.53 and 0.67 × 10 9 mol yr −1 originating from the atmosphere and 0.16 and 0.69 × 10 9 mol yr −1 originating from rivers, respectively (Ribera d' Alcalà et al., 2003;van Apeldoorn and Bouwman, 2012).
Although the boundary conditions data [i.e., Atlantic boundaries retrieved from the World Ocean Atlas and GLobal Ocean Data Analysis Project (GLODAP), and nutrient inputs obtained from Perseus] are climatological, the atmospheric (ERA5 data) and ocean dynamics forcing data (Escudier et al., 2021) are daily data, and the assimilated biogeochemical data (surface chlorophyll; Teruzzi et al., 2018) are weekly data, thus driving the high-frequency dynamics of the 21-year long reanalysis model simulation.

Validation Framework
To assess the reanalysis quality, we extended the CMEMS validation metrics framework (Hernandez et al., 2009(Hernandez et al., , 2015(Hernandez et al., , 2018, which was inspired by the GODAE Ocean View initiative (now OceanPredict), by including a set of ecosystem processbased metrics . Drawing from the multilevel validation framework proposed by Hipsey et al. (2020), we proposed three levels of skill performance metrics applicable to 12 state variables and model fluxes depending on the availability of observations: • Level 1: Model outputs are compared with climatological references (observational datasets or literature reviews) to quantify the model capability to reproduce basin-wide spatial gradients and mean annual values and average vertical profiles on a subbasin scale (i.e., based on GODAE Class 1 metrics); • Level 2: Model outputs are compared with observations at the same time and locations (match-ups) to quantify the model capability to reproduce the observed  OGSTM-BFMv5 with non-linear free-surface and rescaled vertical coordinates (z*) Data assimilation OceanVar (Dobricic and Pinardi, 2008) 3DVarBio (Teruzzi et al., 2019b) Assimilated observations T/S Vertical profiles, SLA along track altimeter (Storto et al., 2016) Weekly maps of surface chlorophyll from satellite (Colella et al., 2021) SST nudging Gaussian around midnight (max = -110 W/m 2 /K) Kd (light absorption factor) Weekly from CMEMS OC data (Terzić et al., 2021) Initial spatial-temporal variability due to mesoscale and hightemporal dynamics (i.e., based on GODAE Class4 metrics; Hernandez et al., 2018); • Level 3: The model capability is evaluated to reproduce key biogeochemical emergent properties using specific process-based metrics and BGC-Argo data Feudale et al., 2021). The metrics for nutrient vertical dynamics include the nitracline depth computed as the depth at which the nitrate concentration reaches 2 mmol/m 3 (NITRCL). The metrics for the system productivity include the chlorophyll concentration and depth of the winter bloom layer (WBL), the maximum chlorophyll concentration, and depth of the deep chlorophyll maximum (DCM), and the depth of the subsurface oxygen maximum (OXYMAX).
Most Level 1 and 2 metrics are spatially evaluated for 16 Mediterranean sea subbasins (Figure 1), in the open sea region (defined as the area with a depth greater than 200 m), and for the full water column in selected layers, i.e., 0-10, 10-30 m (for certain metrics these layers are merged into a single layer, namely, 0-30 m), 30-60, 60-100, 100-150, 150-300, 300-600, and 600-1,000 m. In the coastal areas, the reanalysis accuracy can be assessed only for surface chlorophyll using the CMEMS satellite product and for other variables and a few subbasins according to in situ data availability. The 16 subbasins synthesize the heterogeneity in the Mediterranean Sea into homogeneous areas, considering previous bioregionalization analyses (D'Ortenzio and Ribera d' Alcalà, 2009;Ayata et al., 2018;Di Biagio et al., 2020;El Hourany et al., 2021;Novi et al., 2021).
The skill performance metrics include the bias, root mean square of the differences between the model output and observations (RMSD). Even if other more robust and nonparametric skill metrics and diagrams Stow et al., 2009) could be preferable for certain variables (e.g., variables not normally distributed), we preferred to adopt the same skill metrics (i.e., applying the CMEMS validation standard; Hernandez et al., 2018) for all variables and validation levels to enhance the comparability and communicability of the results.
The variables and their validation levels are listed in Table 2 along with a list of the reference data used. In particular, the EMODnet data collections (Buga et al., 2018) integrated with additional oceanographic cruises (references in Cossarini et al., 2015a;Lazzari et al., 2016), hereinafter referred to as EMODnet_int, include 12,257 observations for nitrate, 17,323 observations for phosphate, 104,910 observations for dissolved oxygen, and approximately 4,000 observations for the DIC and alkalinity covering the 1999-2016 period. The CMEMS ocean color product (Colella et al., 2021) comprises daily L3 chlorophyll maps from 1999 to 2019, whereas the surface ocean CO 2 Atlas Database (SOCAT) dataset (Bakker et al., 2016) includes 6,500 surface pCO 2 observations over the 1998-2018 period.
Regarding the ocean color dataset, since assimilation was performed once a week, the above skill performance computation was performed before the surface chlorophyll maps were assimilated. Hence, this dataset could be considered a quasiindependent dataset.
Net primary production (NPP) and flux of CO 2 across the air-sea interface validation were based on estimates published in Bosc et al. (2004); Colella (2006) 2018) regarding the latter process. In the "Results" section, we present the different validation levels for the selected variables and provide a synthesis of representative normalized skill indicators for all variables, layers, and subbasins. However, to refine uncertainty error estimates at the local scale (e.g., error maps, where available), complementary information is provided in the Quality Information Document contained in the CMEMS catalog .

Statistical Methods
The reanalysis results were analyzed to estimate different scales of temporal variability, including the overall average, range of the seasonal cycle (difference between the minimum and maximum monthly average values), overall temporal variability (standard deviation of the time series), interannual variability (standard deviation of the annual averages), and trend slope (C-1 method by removing fixed seasonal cycles; Vantrepotte and Mélin, 2009). The significance of the trend slope was assessed with the t-test at a p-level of 0.025.

Reanalysis Validation
The next three sections (Level 1 Validation to Level 3 Validation) illustrate the application of the three validation levels, thereby providing examples of the comparison between the observations and model results. Although these three sections illustrate the accuracy of the model in terms of RMSD and bias, thus revealing the different spatial and temporal comparison scales, section "Synthesis of the Reanalysis Validation" presents an overall synthesis of the accuracy of all variables and levels using normalized metrics, which allows us to assess the reanalysis quality against a common reference (i.e., the observed variability in each variable at the different levels).
Among the 12 variables of the published reanalysis dataset, chlorophyll, nitrate, and oxygen were the only variables covering all three validation levels, basically due to the large amount of data recently made available in BGC-Argo float profiles. BGC-Argo data were also considered to validate the phytoplankton biomass, but only at Level 2. Phosphate, ammonium, and carbonate system variables (pH, alkalinity, DIC, and pCO 2 ) satisfied Levels 1 and 2, basically considering data retrieved from the EMODnet_int and SOCAT repositories, Note that both "in situ climatology" and "EMODnet in situ" refer to EMODnet_int data set. Up to three quarters of pH and in situ pCO 2 observations were reconstructed using CO 2 sys software (Sharp et al., 2020) with available DIC, ALK and other regulatory information (i.e., temperature, salinity and concentration of phosphate and silicate).
whereas NPP and air-sea CO 2 flux were validated at Level 1 only ( Table 2).

Level 1 Validation
To illustrate the applicability of Level 1 validation, in this section we provide a quality assessment of surface chlorophyll, NPP, and alkalinity, whereas a global overview of the normalized skill indicators based on the mean annual spatial distribution is provided in section "Synthesis of the Reanalysis Validation" for 10 state variables (thus, excluding NPP and CO 2 flux).
In terms of surface chlorophyll, Figure 2 shows a good agreement between the climatological maps extracted from the reanalysis and satellite datasets. The model can correctly reproduce the basin-wide zonal gradient, spatial heterogeneity at the subbasin scale, and coastal-off shore patterns in the areas most affected by terrestrial inputs (Po, Rhone, Ebro, and the Nile), external inflow into straits (Gibraltar and Dardanelles), and shallow eutrophic areas (Gulf of Gabes). The subbasin mean values, annual cycles, and interannual variability are consistent with those provided by satellite observations (not shown; please refer to Teruzzi et al., 2021).
Net primary production is a measure of the net uptake of carbon by phytoplankton groups (the gross primary production minus fast-release processes, e.g., respiration). The lack of extensive datasets of NPP observations constrains its quality assessment to a qualitative evaluation of the reanalysis consistency against previous estimations published in the scientific literature (Supplementary Table 1). We observe that the average NPP estimates (i.e., ranging from 100 to 170 gC/m 2 /y across the different subbasins, as shown in Figure 3) agree with basin-wide estimations (the maps in Figure 8 in Lazzari et al., 2012 and Figure 13 in Bosc et al., 2004) and with published values for the reference subbasins (Supplementary Table 1). The reanalysis outcomes for the Ionian and Levantine subbasins appear slightly higher than previous estimates (Supplementary Table 1).
Given the sparse and uneven distribution in time and space of the reference in situ EMODnet_int data, the mean surface variables obtained in the reanalysis can be compared with climatological maps, reconstructed following the approach described in Cossarini et al. (2015a). In the case of alkalinity (Figure 4), the surface basin-wide gradient and subbasin-scale spatial variability are consistently reproduced in the reanalysis with a correlation of 0.95. Comparing the reanalysis to the EMODnet_int climatology at the basin scale, an overestimation occurs in the 0-150 m layers (between 20 and 30 µmol/kg), decreasing to less than 10 µmol/kg below 150 m, and an RMSD value of 40 µmol/kg is observed in the surface layer, decreasing to 20-30 µmol/kg below 300 m and maintaining a constant value of approximately 12 µmol/kg in the deeper layers (Supplementary Figure 1; the normalized skill indexes at all levels and subbasins are shown in Supplementary  Figure 2, respectively).

Level 2 Validation
To illustrate the applicability of Level 2 validation, we provide a quality assessment of selected variables (i.e., phosphate, nitrate, oxygen, and pH) in this section, whereas a summary of Level 2 validation of all variables is provided in section "Synthesis of the Reanalysis Validation." All these variables were validated by matching model results with collected EMODnet_int data and in terms of nitrate and oxygen, with BGC-Argo float vertical profiles, and the skill scores are comprehensively reported in terms of RMSD at the different layers and subbasins (Figures 5G, 6).
Good agreement is found both in terms of mean phosphate values and the spread (Figures 5A-C), and variability in the upper layer, whereas the reanalysis tends to underestimate phosphate over EMODnet_int in deep layers (below 500 m), especially in the western subbasins (Figures 5D-F). Quantitatively, the uncertainty, expressed as RMSD computed at the subbasin scale for the different vertical layers, varies, on an average, between 0.03 and 0.06 mmol/m 3 in the 0-300 m layer, increasing to 0.08 mmol/m 3 below 300 m, and the RMSD is greater in the western subbasins ( Figure 5G). Generally, Level 2 uncertainty is higher than that related to the mean annual value (i.e., level 1), which ranges from 0.02 to 0.04 mmol/m 3 in the 0-300 m layer and reaches 0.07 mmol/m 3 below 300 m.
The nitrate skill in regard to Level 2 validation ( Figure 6A) is similar to that of phosphate, namely, a very good performance in the upper layer that decreases with depth, with underestimation in the western subbasins and lower variability than the reference value. The average uncertainty, based on EMODnet_int, varies between 0.7 and 1.5 mmol/m 3 in the 0-300 m layer, increasing to 1.7 mmol/m 3 below 300 m. While the surface errors are similar among the subbasins, the western subbasins exhibit a higher uncertainty than do the eastern subbasins in the deep layers. Level 2 validation of nitrate was also conducted by matching the reanalysis results with BGC-Argo data, providing a similar picture but with slightly different values, namely the average RMSD value varies between 0.41 and 0.62 mmol/m 3 in the upper 60-m layer, increases to 0.73 mmol/m 3 in the 60-100 m layer, remains smaller than 2 mmol/m 3 in the 100-600 m layer, and increases to 2 mmol/m 3 in the 600-1000 m layer ( Figure 6D).
The two datasets are self-consistent and robust [the EMODnet_int quality was reported by Buga et al., 2018, and the BGC-Argo quality control (QC) protocol was provided by Bittig et al., 2019] and capture different levels of variability, namely a broader range of values encompassing sparse observations in time and space over two decades in terms of EMODnet_int and a series of profiles closely located in time and space capturing mesoscale and short-term dynamics, but covering only the last 5 years in terms of BGC-Argo. Thus, by separately employing these two validation datasets, we reveal a slightly lower reanalysis uncertainty when matched with the more localized (i.e., capturing mesoscale weekly dynamics within limited areas) BGC-Argo float data than that when matched with the broaderranging (in value terms) and more dispersed (in terms of its spatial and temporal coverage) EMODnet_int dataset.
The amount of EMODnet_int oxygen data is one order of magnitude larger than that of nutrient data, which supports a more reliable uncertainty estimation. While the surface layer is very well simulated (Figures 6B,E), the reanalysis tends to overestimate the oxygen observations, with a positive bias of approximately 20 mmol/m 3 below 300-400 m (Supplementary Figure 1). Small errors in oxygen at the surface (e.g., RMSD values between 5 and 13 mmol/m 3 ; Figure 6E) confirm that the high-temporal resolution temperature dynamics at the mesoscale (Escudier et al., 2021), which drive saturation and biological production, are generally suitably reproduced. In the deep layers, the reanalysis does not fully capture the oxygen dynamics due to the observed bias, and the uncertainty increases in the mesopelagic zone of the water column, specifically in the western subbasins, where it overcomes 30 mmol/m 3 . The oxygen from BGC-Argo floats indicates a reanalysis skill assessment similar to the EMODnet_int-based metrics (Figure 6B), with the RMSD values generally increasing with depth, particularly in the western subbasins.
Qualitatively, the reanalysis correctly reproduces the observed pH values and the spatial (both horizontal and vertical) variability (Supplementary Figure 1). Quantitatively, the uncertainty in reproducing high-temporal frequency observations, expressed as RMSD, remains relatively constant along the water column, varying between 0.01 and 0.04 pH units, with high RMSD values in the Alboran Sea and marginal seas (Adriatic and Aegean, as shown in Figure 6C). In addition to the model inaccuracy linked to specific and local conditions in marginal seas, the modelobservation mismatches are possibly related to the inaccuracy of the reconstruction method of in situ pH observations based on other carbonate system variables, which can reach as high as 0.005 pH units (Álvarez et al., 2014).

Level 3 Validation
The increasing availability of BGC-Argo data has allowed us to design fit-for-purpose metrics that quantify the model skill in reproducing specific emergent properties of biogeochemical dynamics and their seasonal variability (please refer to Salon et al., 2019;Feudale et al., 2021;Mignot et al., 2021).
Given the general quasi-stationary trajectory of BGC-Argo floats in the Mediterranean Sea (e.g., the red points in the upper panel of Figure 7), qualitative comparison of the temporal evolution of model-float vertical profiles of nitrate and oxygen (Figure 7, second and third rows; for chlorophyll, please refer to the example in Salon et al., 2019) provides an outlook capturing seasonal and monthly variations due to local physical (e.g., vertical mixing) and biological (e.g., phytoplankton bloom) dynamics.
In addition to the quantification of model-observation mismatches with classical metrics, we computed different quality assessment metrics to assess the model capability to reproduce emergent characteristics of the vertical structure, as defined in section "Validation Framework." Examples of emergent vertical characteristic metrics for nitrate (the nitracline depth; NITRCL) and oxygen (the depth of the oxygen maximum; OXYMAX) are reported in the lower panels of Figures 7A,B. With regard to the specific nitrate float (Figure 7A), we observed that the model Despite the surface bias, the model attains a good skill in reproducing the nitracline depth (NITRCL in Figure 7A) during the whole period, indicating how the interaction between vertical physical and biological dynamics, which determines the profile shape separating nutrient-rich deep waters from depleted surface waters, is suitably simulated by the model.
The epipelagic seasonal dynamics of the oxygen profile measured by floats ( Figure 7B) are characterized by a spring surface maximum, summer surface deoxygenation and penetration into the euphotic layer, and winter mixing. These dynamics are qualitatively well represented by the model, with a simulated oxygen maximum depth during the penetration period but with systematically overestimated values below the euphotic layer. The model performs very well in reproducing surface values, mostly driven by saturation (the red dots in the third row of Figure 7B), while integrated values are overestimated by approximately 10 mmol/m 3 (INTG 0_200m; the fourth row of Figure 7B). Interestingly, the divergent evolution of the 0-200 m integrated oxygen values after January 2014 suggests either a possible model flaw or a drift in BGC-Argo float data considering that quality assessment of BGC-Argo variables is continuously progressing (Bittig et al., 2019).
Once computed for all available BGC-Argo float data and comprehensively captured with RMSD in the selected subbasins, the metrics quantify the model skill in the reproduction of the seasonal dynamics of the emergent properties of biogeochemical processes. The reanalysis reproduces the depth of the DCM (i.e., with an uncertainty of approximately 15-20 m, Table 3), which varies between 60 m (western basins) and 120 m (eastern basins, Lavigne et al., 2015). Slightly higher uncertainty is computed for the modeled depth of the WBL, at approximately 30 m, while the intensity of chlorophyll peak values (CHLMAX and winter maximum values) is reproduced with an uncertainty lower than 0.2 mg/m 3 . The uncertainty increases to 0.3 and 0.5 mg/m 3 in the south-western Mediterranean (swm) and north-western Mediterranean (nwm), respectively, with the latter subbasin already addressed by Salon et al. (2019), which is basically related to the MedBFM limits in reproducing local dynamics under stratified conditions. Then, Table 3 reports that the RMSD value for the nitracline depth (NITRCL) varies between 30 and 40 m, whereas the accuracy of the depth of the oxygen maximum is on average 25 m, which is a reasonably good result considering the dispersion in these ecosystem features (refer to Figure 7 and Supplementary Figure 1) resulting from the interaction between vertical transport and mixing and biological processes.

Synthesis of the Reanalysis Validation
To provide an overall view of the reanalysis uncertainty at the different levels of validation, we derived two normalized indicators for the model variables from the metrics presented in the previous sections, including a cost function, defined as the ratio between the absolute bias and the standard deviation of the observations σ O (qualified as very good if lower than 1, good if ranging from 1 to 2, reasonable if ranging from 2 to 5, and poor if higher than 5; Moll, 2000), and RMSD normalized with σ O , which should ideally be lower than 1 when the model predictions of individual observations are better than the average observation values (i.e., conceptually the model efficiency index defined in Stow et al., 2009). Figure 8 shows a synthesis of validation Levels 1 and 2, mainly considering the surface layer of the Mediterranean Sea. The indicators extended to all subbasins and vertical layers are shown in Supplementary Figures 2-4. The cost function at Levels 1 and 2 is lower than 1 for all the variables in the upper layer (Figures 8A-C), thus demonstrating an overall very good quality of the reanalysis in reproducing values and variability linked to basin-wide dynamics. The cost function values remain well below 1 in all layers and deteriorate only in the two deepest layers (Supplementary Figure 2, top). Among the subbasins, the Alboran Sea (alb), southwestern Mediterranean Sea (swm1), northern Tyrrhenian Sea (tyr1), and eastern Levantine (lev4) exhibit at least four variables for which the cost function value exceeds 1 in certain layers (Supplementary Figure 3), e.g., the carbonate system variables in the far eastern subbasins and nutrients in the subbasins close to the Atlantic boundary, which are affected by possible systematic biases.
The model also achieves an overall good skill when quantified with the normalized RMSD at both Levels 1 and 2 in the surface layer, with the exception of ammonium at Levels 1 and 2 and the average values of chlorophyll, phytoplankton biomass, and oxygen in the euphotic layer at Level 2 computed based on BGC-Argo data (Figure 8C). Considering all layers, ammonium and oxygen below 100 m exhibit values higher than 1 in more than three layers (Supplementary Figure 2, bottom). Phosphate in the upper layers and DIC and alkalinity below 150 m attain very low values of the standard deviation of the observations driving normalized RMSD values that are higher than 1 in at least two layers for each variable.
Even if less robust because of the fewer number of available observations, the normalized RMSD computed for the subbasins and layers provide useful indications of possible model flaws (Supplementary Figure 4). Nutrients in the western subbasins in the deepest layers and carbonate system variables in the most eastern subbasins are the most frequent cases with normalized RMSD values above 1. As already mentioned, ammonium is by far the least accurately modeled variable, while oxygen below 300 m attains values higher than 1 in at least 12 subbasins (Supplementary Figure 4).
Among the subbasins, the northern Tyrrhenian Sea (tyr1), which exhibits a high variability and is a transitional area (as reported in several regionalization analyses, e.g., Ayata et al., 2018;Di Biagio et al., 2020), reveals a normalized RMSD value higher than 1 for at least four variables in several layers.

Mean Spatial and Temporal Variabilities
The reanalysis provides an overall view of the average spatial and temporal variabilities in the biogeochemical state of the Mediterranean Sea over the past two decades. The spatial variability at the basin scale is computed as the range of the average annual subbasin values, whereas the average temporal variability is computed as the mean range (i.e., throughout the 21-year simulated period) between the minimum and maximum subbasin monthly average values (Figure 9 and Supplementary Table 2). The selected subbasins reveal a typical zonal Mediterranean gradient, while the marginal seas (Adriatic and Aegean Seas), which are characterized by important anomalies , are not shown in Figure 9.
Certain variables (e.g., alkalinity and DIC) are characterized by a higher spatial than temporal variability. In terms of nutrients such as nitrate and phosphate, even when the spatial variability is notable, the variability in the seasonal cycle at the surface in several western subbasins can reach as high as the spatial variability. The spatial variability in nutrients remains a peculiar signature of the subsurface layers (i.e., below 100 m), where the seasonal cycle is narrow and the  Details of the metrics are in the text. Aggregated sub-basins are considered, and statistics computed when at least two profiles per month are available.
Mediterranean reveals two distinctive conditions separated by the Sicily channel (Figure 9). In marine ecosystems, ammonium is generally produced by remineralization processes of organic matter and rapidly assimilated by plankton or oxidized into nitrate. Reanalysis ammonium, which exhibits a relatively high level of uncertainty (Figure 8), tends to reach zero below the euphotic layer and indicates a seasonal variability that exceeds the spatial variability.
Chlorophyll simulation achieves a good skill (Figure 8) and captures important spatial and temporal signals, which have already been reported in the literature (Lavigne et al., 2015;Barbieux et al., 2019), the enhanced seasonal cycle in both the surface (i.e., winter/spring blooms, especially in the western regions) and subsurface (50-150 m, with DCM onset from April to October) layers and significant spatial variability both in terms of the zonal gradient and vertical displacement of the subsurface chlorophyll maximum (Figure 9). At the surface, the oxygen temporal variability reaches as high as approximately 50-60 mmol/m 3 following the seasonal saturation cycle, with important spatial differences among the subbasins attributed to the spatial gradient of the surface temperature (Escudier et al., 2021). In the subsurface layer, biological activity drives the variability associated with a maximum at a depth from approximately 50-100 m (i.e., just above the DCM), and important spatial differences are associated with the different levels of productivity in the various subbasins, with the western subbasins being more productive and shallower than are the eastern subbasins.

Trends During the 1999-2019 Period
Trends were evaluated considering the simple, fixed-seasonal cycle (as shown in Figure 9) C-1 method (Vantrepotte and Mélin, 2009) based on the monthly time series for the 16 subbasins.
Trends were computed for certain important layers (e.g., surface, subsurface, or vertically integrated) and variables to provide a general view of the different scales of variability in the euphotic layer. Figure 10 shows values of the trend slope considering an arbitrary range based on either the seasonal or interannual variability (Supplementary Table 2), and highlights when trends are inferred in addition to the statistical significance.
Chlorophyll generally attained very low trends, either positive or negative, mostly not significant. Indeed, considering the surface (CHL surf ), only the Adriatic Sea (adr1) and southern Ionian (ion2) exhibit positive and negative significant trends, respectively. However, when the trends were computed based on vertically integrated values (CHL 0−200 m ), a generally positive (negative) trend characterizes the western (eastern) subbasins. Consistent with the negative trend of chlorophyll in the eastern Mediterranean Sea, the annual vertically integrated primary production reveals very small negative values in the eastern subbasins and an absence of long-term signals in the western subbasins, except the north-western Mediterranean (nwm). The negative trend values seem mostly associated with a decrease in the summer productivity, whereas the productivity of the winter bloom (NPP winter ) increases in the western subbasins due to the presence of a larger number of intense winter blooms with enhanced vertical mixing in the second part of the reanalysis with respect to the initial period, as reported in Cossarini et al. (2020). However, the limited length of the reanalysis (i.e., 21 years) might influence the robustness of the above trend analysis when the interannual variability is high (i.e., the trend slope values are generally lower than 10% of the interannual variability, as indicated in Supplementary Table 2). While a negative trend in the surface layer nitrate occurs in most of the subbasins, in the case of phosphate the trend is practically null and not significant. Most of the eastern subbasins and several western subbasins indicate a very small but significant tendency for oxygen [(−0.05, −0.10) mmol/m 3 /y], which corresponds to a decrease in the oxygen content of approximately 1.0-2.0 mmol/m 3 during the simulated period.
Significant trends for all carbonate variables were obtained for the whole Mediterranean Sea. In particular, the two prognostic model variables (alkalinity and DIC) exhibit positive trends in the eastern subbasins that are higher than those in the western subbasins: at 1.5 and 0.7 µmol/kg/y, respectively, for the alkalinity and 2 and 0.6 µmol/kg/y, respectively, for DIC. While the DIC and alkalinity variations impose opposite effects on pH, which is a diagnostic variable of the carbonate system, the overall effect revealed by the reanalysis is a pH decrease at a rate of approximately 0.0006-0.0012 y −1 (higher in the eastern subbasins). Finally, a significant trend of the CO 2 air-sea flux is detected, with values ranging from 0.05 to 0.15 (mmol/m 2 /d)/y, with the highest values in the western subbasins.

Temporal Evolution and Trend of the Carbonate System Variables
One of the most relevant signals in the present reanalysis dataset is the impact of the increase in the atmospheric CO 2 partial pressure, which affects the carbon sink in the marine system. Despite the very high seasonal cycle, the Mediterranean Sea exhibits nearly neutral conditions, with a mean annual value of 0.35 (0.24 in open sea areas only) mmol/m 2 /d in the 2010-2019 period (Figure 11). However, we observe a possible shift in CO 2 behavior between a net source and a net sink over the last two decades, at least in the open sea areas, as a consequence of the increase in atmospheric CO 2 . It should be noted that the dynamics in coastal areas that are simulated as net sinks of atmospheric CO 2 are highly uncertain due to the uncertainty in terrestrial inputs and the impact of benthic and coastal biogeochemical dynamics (Borges et al., 2006;Cossarini et al., FIGURE 10 | Trend slope values for selected variables in the 16 sub-basins. Color scale range is adapted considering the range of variability or mean values of each variable (scale factor is provided together with the variable unit). The symbol * marks significant trend (t-test at p-level of 0.05). 2015b). Regarding the spatial variability, the model exhibits strong CO 2 uptake (ingassing) in the nwm subbasin, Adriatic, and Aegean Seas and CO 2 outgassing in the south-eastern Mediterranean ( Figure 11B). Hotspots of CO 2 atmospheric uptake occur in areas of dense water formation (Pinardi et al., 2015), fostering the carbon pump mechanism in marginal areas (Cossarini et al., 2017). Considering the marine carbon cycle functioning in the open sea areas, primary production is a key component that, in addition to constituting the base of oceanic food webs, contributes to atmospheric CO 2 sequestration through CO 2 fixation and subsequent organic carbon sinks (Siegel et al., 2014;Morrow et al., 2018). We provide the mean annual value of the integrated NPP, together with an estimation of the sink of particulate organic carbon (POC) and particulate inorganic carbon (PIC) at a depth of 500 m (Figure 3). According to the BFM formulation Vichi et al., 2020), NPP is determined as photosynthesized carbon minus respiration and fast-release carbon due to a lack of nutrients for biomass synthesis. While organic carbon is produced through mortality and grazing of all plankton groups, the dynamics of PIC (i.e., calcite) comprise calcite precipitation by coccolithophores (parameterized as the mortality of nano-flagellate functional types multiplied by a PIC:POC ratio factor of 1; Krumhardt et al., 2017) and the dissolution process (Morse and Berner, 1972).
The zonal trophic gradient of primary production reflects a sink of particulate carbon, where the carbon export of the western subbasins was double that of the eastern subbasins. The particulate organic and inorganic carbon exports are similar across the western subbasins, with values of approximately 0.1-0.3 g/m 2 /y, whereas the PIC sink seems to be the dominant carbon export process in the eastern subbasins, where coccolithophores are recognized to increase eastward in the Mediterranean basin and could be the relevant phytoplankton group (Siokou-Frangou et al., 2010;Oviedo et al., 2015). Carbon sequestration in the deep layers is approximately 10-20% of atmospheric CO 2 absorption. No significant temporal tendency was detected in the sinks of POC and PIC, suggesting that the increase in carbon accumulation in the deepest layers (not shown) was mostly related to the downward diffusion/transport of DIC from the surface rather than an alteration in the model internal biogeochemistry.

DISCUSSION
This study describes a 1999-2019 reanalysis of the Mediterranean Sea biogeochemistry produced within the CMEMS framework. Combined with related physical reanalysis (Escudier et al., 2021), biogeochemical 3D daily and monthly fields are open and freely available through the CMEMS web portal. 1 A robust and rigorous validation is a critical point to inspire user confidence toward model products (Hernandez et al., 2018;Hipsey et al., 2020). Through the three-level validation framework (section "Reanalysis Validation"), the Mediterranean biogeochemical reanalysis was demonstrated to constitute a valid multiscale product that can be applied to investigate variability at different temporal and spatial scales (refer to the examples in the Ocean State Report, von Schuckmann et al., 2016) and can serve as qualified boundary conditions for subregional high-resolution model systems (refer to the example in Bruschi et al., 2021). To our knowledge, this is the first time that different levels of validation have been achieved in a reanalysis of the Mediterranean Sea biogeochemistry, quantifying the uncertainty in as many as 12 variables: chlorophyll, phytoplankton biomass, NPP, phosphate, nitrate, oxygen, ammonium, pH, alkalinity, DIC, sea surface pCO 2 , and air-sea CO 2 flux.
As thoroughly outlined by Hipsey et al. (2020) and demonstrated in specific cases within the CMEMS regional forecasting system (as an example, Gutknecht et al., 2019;Salon et al., 2019), biogeochemical data availability is an important constraint for model validation since different variables exhibit varying uncertainty levels on the basis of the availability of reference data. Therefore, our proposed framework offers a degree of confirmation (Oreskes et al., 1994) concerning the different spatial-temporal variability scales determined by the availability of observations specific to various datasets.
One caveat regarding the robustness of the present analysis is that surface chlorophyll comparison at Levels 1 and 2 was performed with the same satellite observations previously assimilated, which is a common procedure in skill assessment of assimilative systems (Gregg et al., 2009). Nevertheless, as mentioned, the model is evaluated before the observations are assimilated. Hence, the metrics provide an assessment of the skill performance in regard to short-term temporal dynamics. Among the 10 model variables compared to the EMODnet dataset, three variables were further compared with the BGC-Argo dataset. These two validations were kept separate to ensure consistent accuracy in the calculations, given the relatively different spatial and temporal coverage levels and QC procedures between the EMODnet and BGC-Argo data. Furthermore, the BGC-Argo floats provide observations at high vertical and temporal resolutions, allowing us to compute time series of the vertical properties of oxygen, chlorophyll, and nitrate (in the three-level validation framework).
Generally, our validation results revealed the good model performance and certain flaws in the reanalysis. The impact of the accuracy of the climatological Atlantic boundary, the low spatial variability in the initial conditions (i.e., a single profile for each subbasin), and the possible imbalance between deep ventilation and mesopelagic mineralization processes could be the reason for the model errors in the western subbasins and the deep-layer values for nutrients and oxygen (i.e., nitrate and phosphate underestimation and oxygen overestimation in the mesopelagic layer, with a larger error in ammonia, as shown in Figures 6-8, and Supplementary Figures 2-4). Additionally, a possible excess evaporation effect could be the cause of the alkalinity overestimation in the eastern subbasins, causing a slight overestimation of the pH (Figure 6). However, it should be noted that the alkalinity trend is fairly well aligned with the salinity trend, as described in Escudier et al. (2021), highlighting that the slight alkalinity overestimation could be attributable to a possible error in the initial conditions (stemming from the 1999 to 2016 EMODnet_int climatology, which might already include part of the trends).
Focusing on the epipelagic layer, where the accuracy is higher than that in the mesopelagic layer, the reanalysis reproduces certain long-term tendencies that occur in the Mediterranean Sea as a response to changes in atmospheric forcing. The increase in heat content (Escudier et al., 2021) and the increase in atmospheric CO 2 (Artuso et al., 2009) are the main drivers of the detected significant changes in oxygen at the surface and carbonate system variables.
Indeed, the trend of oxygen is compatible with the decrease in oxygen solubility due to the estimated increase of 0.7 • C in the sea surface temperature over the last two decades (Escudier et al., 2021). However, it should be noted that the mesopelagic layer (i.e., 300-700 m) instead indicates a positive significant trend of approximately 0.5 mmol/m 3 /y (not reported), which we hypothesize is caused by a possible missing respiration term in the BFM model formulation, as also reported in Reale et al. (2021). Thus, the reanalysis yields an accumulation of 10 mmol/m 3 that affects its quality in the deeper layers (Figure 6 and Supplementary Figure 4).
The positive alkalinity trend could be linked to the longterm change in the water balance (refer to Skliris et al., 2018), which has also caused an increase in the 0-200 m layer salt content of approximately 0.15 PSU over the last 20 years (Escudier et al., 2021).
A constant and spatially homogeneous increase in the atmospheric pCO 2 value from 370 to 410 ppm during the 1999-2019 period caused a positive increase in the sink of atmospheric CO 2 . The simulated values are in line with those already published for the Mediterranean Sea (von Schuckmann et al., 2018) and reveal how the Mediterranean has become a net sink of atmospheric CO 2 over the last 20 years. Excess of evaporation over runoff and precipitation combined with the increase in atmospheric CO 2 absorption determined the trend of DIC in the upper layer, with values ranging from 0.6 to 2 µmol/kg/y. The use of climatological boundary conditions throughout the entire duration of the simulation might have partially hampered the positive tendency of the long-term evolution of alkalinity and DIC in the southern areas of the western Mediterranean Sea, mostly influenced by Atlantic inflow. Consequently, lower pH trends were simulated in the western subbasins, where conversely, the increase in the CO 2 sink is 50% larger, suggesting an increase in the net export of carbon through the Gibraltar Strait.
We also confirmed specific long-term tendencies already unveiled by satellite and model results for chlorophyll (Salgado-Hernanz et al., 2019) and winter primary production . The consistency in surface chlorophyll tendency should not be surprising since this is an assimilated variable. Even though there is no consensus on significant trends or regime shifts in ventilation (Li and Tanhua, 2020;Mavropoulou et al., 2020), the negative trend in nitrate appears to be linked to the impact of chlorophyll satellite assimilation on non-limiting nutrients (Teruzzi et al., 2014). Indeed, in late winter and early spring, the assimilation of satellite chlorophyll observations mainly acts to correct bloom overestimation, thereby nearly uniformly reducing the phytoplankton concentration in the euphotic layer. The negative trend of nitrate occurs due to the consequent reduction in phytoplankton nitrogen remineralization. Even if this phenomenon might be a spurious tendency, the reduction is much lower than the mean value (i.e., approximately 1% over the two simulated decades) and far below that of the variable error (refer to the previous section).
As a general remark, it should be noted that some of the values of these trends, even if significant in several cases (i.e., in specific subbasins), are very low and affected by the length of the time series of the reanalysis, which cannot be extended toward the past beyond the start of the European Space Agency (ESA)-Climate Change Initiative (CCI) satellite time series.
The reanalysis was produced with the primary focus of providing the best 3D estimate of the biogeochemical state of the Mediterranean Sea, integrating the most recent upstream data made available within CMEMS and other communities, including the assimilation of surface chlorophyll data, the atmospheric and hydrodynamic physical forcing, and the biogeochemical boundary conditions. Robust validation of the 12 variables constituting the reanalysis dataset offers users a measure of the accuracy at different spatial and temporal scales. Our approach highlighted large-scale changes, whereas unveiling specific impacts on the biogeochemical features of changes in water circulation and water mass movements (Malanotte-Rizzoli et al., 2014;Schneider et al., 2014) is left to subsequent papers. In particular, changes and accuracy of the circulation fields are examined in Escudier et al. (2021), and specific analyses can be addressed in the future given the availability of physical and biogeochemical reanalysis data in CMEMS.
Furthermore, our comprehensive validation approach highlighted possible model errors, such as nutrients and oxygen in the mesopelagic-bathypelagic layers, that should be considered before drawing conclusions concerning the analysis of intermediate and deep Mediterranean waters. Thus, important multidecadal signals, such as the presence of a deep dissolved oxygen variability linked to shifts with a multidecadal signal (Mavropoulou et al., 2020) or bathypelagic negative oxygen trends (Sisma-Ventura et al., 2021), when compared with the reanalysis, could help to improve the next generation of reanalysis products, shifting the focus of biogeochemical analysis from the epipelagic layer to the meso and bathypelagic layers.

CONCLUSION
A new and high-resolution biogeochemical reanalysis of the Mediterranean Sea is presented, and the resultant dataset of daily and monthly 3D fields of 12 variables and ecosystem fluxes is freely available from the CMEMS web portal. In summary: • The new reanalysis integrates several novel elements: ERA5 atmospheric forcing, coupling with the new 1/24 • physical reanalysis, updated versions of the BFM model and biogeochemical variational assimilation scheme, and several updated observational datasets for assimilation and validation purposes. • As a result of the developed three-level validation framework, robust skill assessment of the 12 considered variables offers CMEMS users a measure of accuracy at different spatial and temporal scales. The estimated accuracy of the reanalysis is high for all variables, except ammonium, and the validation is successfully extended to certain emerging properties describing ecosystem processes (e.g., DCM and the depth of the nitracline and subsurface oxygen maximum) using the BGC-Argo dataset. A certain degradation in the quality of oxygen and nutrient estimates is reported in the deep layers. • The reanalysis represents a consistent reconstruction of the Mediterranean Sea biogeochemical state and its multiscale variability over the last two decades, quantifying the impact of external and atmospheric forcing on basin-scale tendencies toward surface deoxygenation and acidification, in agreement with previous works.

DATA AVAILABILITY STATEMENT
Publicly available datasets were analyzed in this study. This data can be found here: https://resources.marine.copernicus.eu/? option=com_csw&view=details&product_id=MEDSEA_MULTI YEAR_BGC_006_008.

AUTHOR CONTRIBUTIONS
GCos, SS, AT, GB, and LF contributed to the design of the system. GCos and SS wrote the first draft of the manuscript. All authors contributed to the computation of the reanalysis and its assessment and contributed to manuscript revision, read, and approved the submitted version.