Process-Oriented Estimation of Chlorophyll-a Vertical Profile in the Mediterranean Sea Using MODIS and Oceanographic Float Products

Reconstructing chlorophyll-a (Chl-a) vertical profile is a promising approach for investigating the internal structure of marine ecosystem. Given that the process of profile classification in current process-oriented profile inversion methods are either too subjective or too complex, a novel Chl-a profile reconstruction method was proposed incorporating both a novel binary tree profile classification model and a profile inversion model in the Mediterranean Sea. The binary tree profile classification model was established based on a priori knowledge provided by clustering Chl-a profiles measured by BGC-Argo floats performed by the profile classification model (PCM), an advanced unsupervised machine learning clustering method. The profile inversion model contains the relationships between the shape-dependent parameters of the nonuniform Chl-a profile and the corresponding Chl-a surface concentration derived from satellite observations. According to quantitative evaluation, the proposed profile classification model reached an overall accuracy of 89%, and the mean absolute percent deviation (MAPD) of the proposed profile inversion model ranged from 12%–37% under different shape-dependent parameters. By generating monthly three dimensions Chl-a concentration from 2011 to 2018, the proposed process-oriented method exhibits great application potential in investigating the spatial and temporal characteristics of Chl-a profiles and even the water column total biomass throughout the Mediterranean Sea.


INTRODUCTION
The assessment and monitoring of the marine environmental status has received ever-growing attention in recent years due to the potentially critical impact of ongoing natural and human-induced changes on related ecosystem functioning and services (Puissant et al., 2021). As key players in ocean biodiversity, the alteration of phytoplankton provides information on some of the principal climate-driven effects on environmental forcing, and consequently, on marine ecosystem equilibrium, making monitoring and assessing their composition and distribution is of great importance for marine ecosystem and even global change studies (Falkowski, 2012;Sammartino et al., 2018;Kotta and Kitsiou, 2019).
A deeper understanding of the dynamics and evolution of marine phytoplankton requires three dimensions (3D) observations of the algal abundance at different temporal and spatial scales and much wider and regular coverage than currently achievable (Sammartino et al., 2020). However, the traditional measurement is mainly based on in situ sampling either through coastal monitoring programs or time-limited oceanographic cruises, or fixed platforms such as moored buoys, which can accurately describe local conditions along the water column, but are clearly inadequate to describe processes occurring within the wide range of temporal and spatial scales impacted by undergoing changes (von Schuckmann et al., 2018). Advanced instruments such as Biogeochemical Argo (BGC-Argo) float allow for autonomously observing while drifting with ocean currents according to pre-programmed procedures, which largely alleviate the shortcomings of low sampling density and costly of the traditional in situ measurements, but still far from continuous in space.
Based on the shift of seawater spectrum from blue to green wavelengths caused by the seawater substances controlling the color of seawater, ocean color remote sensing (OCRS) revolutionized marine phytoplankton assessment by estimating the concentration of Chlorophyll-a (Chl-a), one of the most commonly used bioindicator of phytoplankton abundance (Gordon et al., 1980). With the advantages of low-cost and fast observation at large scales in a space continuous manner, Chl-a product of OCRS have become one of the most important data in marine researches (McClain, 2009). However, the signals measured by OCRS sensors are result from the interaction between light and water constituents, decreasing exponentially with water depth (Morel, 1988). As a consequence, OCRS product can only reveal signals integrated within the surface detectable layer and not properties at greater depth (Siswanto et al., 2005;Uitz et al., 2006).
Given their respective strengths, there have been many attempts to combine these two kinds of observations to improve our knowledge of the interior structure of the ocean. Referring to Liu et al. (2021), these methods can be categorized into result-oriented and process-oriented according to the strategy based on. Result-oriented approach is to infer Chl-a concentration at different depths directly from other measurable ocean variables. Due to the complexity of the marine ecosystem, this kind of method usually requires tools as artificial neural network (Sammartino et al., 2020) and its variants (Puissant et al., 2021), which are capable of revealing non-linear relations. Although the performance of such methods has been validated on regional and even global scale, the huge demand for input variables and computational resources greatly limits their utilization potentiality (Erickson et al., 2019;Lee et al., 2015).
Process-oriented approach is to extrapolate vertical profile by inferring parameters that control the shape of the profile. This kind of methods generally include a profile parameterization process, which characterizes the shape of profile as a certain number of parameters by fitting profile to a certain mathematic equation . These parameters controlling the shape of profile are usually referred to as shape-dependent parameters. After profile parameterization, the relationships between surface variables and these shape-dependent parameters then can be established explicitly by empirical models (Dierssen, 2010). However, extensive practice has revealed that it is difficult for a single model to applicable to all types of vertical profiles. It is a common practice to divide profiles into several subcategories according to certain criteria (such as surface concentration or water column total concentration), and develop subcategory-specific empirical models (Morel and Berthon, 1989;Millań-Nuñez et al., 1997). Although this strategy can effectively improve the overall accuracy, profile classification is too subjective to find the theoretical basis. The advanced machine learning methods can also be used to relate the surface variable with shape-dependent parameters, but such method are not only too unintuitive, but also too complex (Silulwane et al., 2001;Charantonis et al., 2015;Sauzède et al., 2015). A simple and objective profile classification method for profile reconstruction is urgently needed.
This manuscript attempts to propose a novel process-oriented Chl-a profile inversion method by utilizing an unsupervised machine learning profile clustering method called profile classification model (PCM) in the pre-classification stage. The remainder of this paper is organized as follows: the study area and data are introduced in Section 2; the methods are described in Section 3; the accuracy of the proposed method is validated in Section 4; a discussion is presented in Section 5; and conclusions and perspectives are provided in Section 6.

Study Area
This study was conducted in the Mediterranean Sea, a semienclosed basin located in the transition zone between temperate and subtropical environments (from 30°N to 45°N and 0°E to 30°E ). The Mediterranean Sea is surrounded by continental Europe, Asia, and Africa and is connected to the Atlantic Ocean through the Strait of Gibraltar. The Mediterranean Sea has remained an area of heightened interest for global climate change research over the past few decades, partly because it plays a major role in responding to global warming (Pisano et al., 2020), and partly because it is considered an ideal natural laboratory where processes can be characterized on smaller scales than can be achieved in other oceans (Robinson et al., 2001).
In this study, the Mediterranean Sea was empirically divided into five subsea areas, namely, the northwest Mediterranean Sea (NW), southwest Mediterranean Sea (SW), Tyrrhenian Sea (TYR), Ionian Sea (ION) and Levantine Sea (LEV), following Barbieux et al. (2019). The spatial extents of the Mediterranean Sea and the subdivided regions are shown in Figure 1.

BGC-Argo Chl-a Profiles
More than 70 BGC-Argo floats deployed between 2012 and 2016 alone (Cossarini et al., 2019). The long-term deployment has made the Mediterranean BGC-Argo network one of the densest networks in the global ocean in terms of the number of profiles per unit surface area, providing unique data support for studying ecosystem characteristics and dynamics.
In this study, Mediterranean Chl-a vertical profiles were obtained from a global database of vertical profiles derived from Biogeochemical Argo float measurements publicly available at https://www.seanoe.org/data/00383/49388/ (Marie et al., 2017). This dataset includes 0-1000 m vertical profiles of bio-optical and biogeochemical variables acquired by autonomous profiling BGC-Argo floats around local noon between October 2012 and January 2016. It contains profiles of downward irradiance at 3 wavelengths (380, 412 and 490 nm), photosynthetically available radiation, Chl-a concentration, fluorescent dissolved organic matter, and particle light backscattering at 700 nm. All variables have been quality controlled following specifically-developed procedures, and data corruption by biofouling and any instrumental drift has also been verified. In the Mediterranean Sea, all Chl-a profiles were collected by 27 BGC-Argo profiling floats, and their trajectories o are shown in Figure 1, with the various floats highlighted in different colors.
As a result, a total of 1611 Chl-a vertical profiles can be matched with other data used. To match up in situ Chl-a vertical profiles with other data, a 1°×1°box centered on each float location was adopted. These profiles were collected within a wide geographical area of the Mediterranean Sea in almost equal numbers during each season, with a total of 409, 411, 415 and 376 records measured in spring, summer, autumn and winter, respectively. Therefore, these profiles could represent the spatial and temporal characteristics of the phytoplankton vertical distribution in this basin.

GLBa0.08 Analysis Data
The 5-year product (from 2012 to 2016) of the Hybrid Coordinate Ocean Model-Navy Coupled Ocean Data Assimilation (HYCOM +NCODA) global 1/12°analysis (GLBa0.08; https://www.hycom. org/dataserver/gofs-3pt0/analysis) product was used in this study. The HYCOM Consortium is a nearly real-time global ocean prediction system based on the Hybrid Coordinate Ocean Model (HYCOM) and Navy Coupled Ocean Data Assimilation (NCODA) system (Halliwell, 2004). The advantage of HYCOM-based analysis is the implementation of a substantially evolved hybrid vertical coordinate system, which remains isopycnic in the well-stratified open ocean and combines different types of coordinates, transiting into level coordinates in less stratified regions (surface mixed layer) and very shallow water and into terrain-following sigma coordinates in nearshore regions (Chassignet et al., 2007). This feature provides the HYCOM with the ability to optimally simulate coastal and open ocean circulations.
GLBa0.08 is a uniformly gridded (1/12°) global reanalysis dataset that converts native HYCOM [ab] data into NetCDF data on a native Mercator-curvilinear HYCOM horizontal grid, interpolated into 33 z-levels (Shu et al., 2014). This product provides 11 ocean essential variables, such as the surface water flux, salinity, surface salinity trend, surface temperature trend, and mixed layer depth (MLD) (Augusto Souza Tanajura et al., 2014). All these data are freely available at https://www.hycom. org/dataserver/gofs-3pt0/analysis. In this research, only MLD products pertaining to the study period were utilized.

Satellite Data
Daily a nd monthly Moderate Resolution Imaging Spectroradiometer (MODIS) level-2 Chl-a product from 2011 to 2018 was used in this study. This product is generated with an empirical relationship derived from in situ measurements of Chla and remote sensing reflectance in the blue-to -green region of the visible spectrum (Hu et al., 2012). This product is produced by the NASA Ocean Color program and is available from https:// oceancolor.gsfc.nasa.gov/cgi.
In addition to the Chl-a surface concentration, the euphotic layer depth (z eu ) has been verified as a main variable explaining the vertical variability in Chl-a along the water column (Vadakke-Chanat and Shanmugam, 2020). The euphotic layer depth is a common indicator of water turbidity, defined as the depth at which irradiance is attenuated to 1% of the initial value at the surface (Tett, 1989;Dera, 1992). Consequently, the corresponding level-3 product of the MODIS diffuse attenuation coefficient for the downwelling irradiance at 490 nm (hereafter referred to as Kd490), which can be used to estimate the depth of the euphotic layer, was also employed in this study. The MODIS Kd490 product is produced under the GlobColour project and is freely available at http://globcolour. info/. Kd490 was generated following the model proposed by Lee et al. (2005) . In this study, the method proposed by Lin et al. (2016) was used to estimate the euphotic layer depth. The specific steps are as follows: (1) The diffuse attenuation coefficient for the downwelling irradiance of the usable solar radiation (USR) (KdUSR) was estimated from Kd490. USR represents the spectrally integrated solar irradiance within the spectral window of 400-560 nm, as defined by Lee et al. (2014) .
(2) The depth of the euphotic layer was derived according to the equation proposed by Lin et al. (2016).
All these satellite data have undergone quality and flag assessments and are generated with a spatial resolution of 4 km. To match the BGC-Argo, satellite, and analysis data, the satellite and analysis data in a 1°×1°box centered on each float location were used to provide matchups.

Methodological Overview
First, all the BGC-Argo Chl-a profiles were clustered with the PCM, after which, a decision tree profile classification model was established based on a priori knowledge provided by the clustering results. Then, these profiles were parameterized with a modified Gaussian function. Finally, for each type of Chl-a profile, empirical relationships between the corresponding shape-dependent parameters derived from previous profile fitting and Chl-a surface concentration were established. A flowchart of these steps is drawn in Figure 2.

Remote Sensing Pixel-Scale Chl-a Profile Type Identification
The vertical distribution of phytoplankton exhibits different shapes under the influence of the interactions between various biological (e.g., particular species present and physiological state) and physical (e.g., currents and shear between water parcels) factors, making it difficult to obtain one set of general parameters that can be applied to extrapolate all profiles. Therefore, a common practice is to classify vertical profiles according to their characteristics and to develop an extrapolation model for each of the subcategory. To more simply and objectively identify the profile type of each pixel of the remote sensing Chl-a product, the shape characteristics of Chl-a vertical profiles in the Mediterranean Sea were first explored by clustering all BGC-Argo Chl-a profiles. According to the profile clustering result, a binary tree profile classification model based on several publicly available marine essential variables was then established.

Revealing the Shape Characteristics of Chl-a Profiles
Next, the Chl-a vertical distribution in the Mediterranean Sea was explored with the PCM.
The PCM is an unsupervised machine learning classification technique designed to reveal the vertical distribution of ocean temperatures (Maze et al., 2017a). This method relies on a Gaussian mixture model (GMM) to decompose the probability density function (PDF) of a collection of profiles into a weighted sum of multidimensional Gaussian PDFs, thus facilitating the identification of the representative patterns of a given dataset (Maze et al., 2017a;Maze et al., 2017b). After classification, profiles within a given category are more similar to each other than they are to profiles in other categories (Jain et al., 1999).
The PCM was originally developed to characterize coherent heat patterns in the North Atlantic Ocean. Since it is more impartial than subjective grouping of profiles into classes, the PCM has since been successfully applied to characterize the vertical distributions of a variety of oceanographic variables (Boehme and Rosso, 2021). However, the performance of the PCM in the classification of Chl-a vertical profiles remains unclear and needs to be further explored.
Through objective and trial-and-error approaches, the Chl-a profiles in the Mediterranean Sea were ultimately categorized into four types with the PCM, and the shape characteristics of these profile types are shown in Figure 3. Referring to the previous literature (Lavigne et al., 2015), these four types were denoted as mixed, exponential, quasi-Gaussian and Gaussian types according to their shape characteristics. As shown in   Figure 3, due to the presence of the MLD, the Chl-a concentration in all profile types remained almost constant at either a deep or a shallow depth. Specifically, the MLD was much deeper for the mixed type than for the other three profile types, resulting in almost constant mixed type profiles up to a certain depth. The Gaussian type profiles exhibited typical Gaussian distribution shapes, namely, curves with a concentration peak, and the Chl-a concentration decreased with increasing distance from the peak depth. Furthermore, the vertical profiles of the exponential and quasi-Gaussian types were roughly similar in shape, but these profiles differed greatly with regard to the range of specific concentrations, the presence of concentration peaks, and the depth at which the concentration declines.

Decision Tree Profile Classification Model
By analyzing the shape characteristics of these four types of Chl-a profiles and referring to the previous literature, the Chl-a surface concentration, euphotic layer depth and MLD were adopted in this study to establish a decision tree classification model to classify the type of each pixel of the remote sensing Chl-a product.
In general, Chl-a vertical distributions could be divided into two main types: uniform and nonuniform, corresponding to mixed and stratified water, respectively (Mignot et al., 2011). The differences between these two types of Chl-a vertical distributions in the Mediterranean Sea lies in whether the MLD exceeds the euphotic layer depth. For the uniform type profile, phytoplankton are nearly evenly distributed along the water column due to strong water mixing, resulting in a homogeneous Chl-a concentration from the surface to great depths. Hence, the ratio between the MLD and euphotic layer depth is a common indicator for the identification of deep mixed water (Uitz et al., 2006). Since the euphotic layer is the maximum depth of the light zone suitable for phytoplankton photosynthesis (Khanna et al., 2009) , when the euphotic layer is shallower than the mixed l a y e r , t h e p h y t o p l a n k t o n a r e a l m o s t u n i f o r m l y distributed vertically.
As shown in Figure 3, the Gaussian, quasi-Gaussian and exponential types are all nonuniform due to the obvious presence of a subsurface chlorophyll maximum (SCM). By analyzing the shape characteristics of these three nonuniform profiles, the surface concentrations on the exponential type profiles were found to be much higher than those on the other two types of profiles ( Figure 4). In addition, Gaussian profiles could be distinguished from quasi-Gaussian profiles based on the MLD.
As a result, a binary tree based on the surface concentration, MLD and euphotic layer depth was developed to infer the vertical profile type in each pixel of the OCRS Chl-a product. As shown in Figure 5, the following three steps were included: ① A ratio of the MLD to the euphotic layer depth higher than 1 was adopted to identify profiles of the mixed type. ② A Chl-a surface concentration higher than 0.799 was used to recognize profiles of the exponential type. ③ Whether the MLD exceeded 31.1 m was considered to distinguish between Gaussian and quasi-Gaussian profiles.
If the MLD exceeded this threshold, quasi-Gaussian profiles were determined; otherwise, Gaussian profiles were identified.

In Situ Chl-a Profile Parameterization
To parameterize the profiles, each profile was fitted with a certain mathematical model, allowing the profile shape to be represented by model coefficients, which is a common practice in characterizing the profile shape of essential marine and atmospheric variables (Beckmann and Hense, 2007;Gonzaĺez-Pola et al., 2007). Various functions have been used to parameterize marine variable vertical profiles, including Gaussian functions (Morel, 1988;Morel and Berthon, 1989) and their derivatives (Uitz et al., 2006), power functions (Liu et al., 2021) and exponential functions (Ardyna et al., 2013). Among these functions, the generalized (or shifted) Gaussian function is the most popular, and has been verified as a suitable function for fitting profiles containing only a single peak; this function is quite common in coastal, upwelling, open oceans and Arctic waters (Lewis et al., 1983;Siswanto et al., 2005). Here, to better characterize profiles with a shallow SCM layer, where the surface concentration is higher than the deepest value, the generalized Gaussian function was updated as a modified Gaussian function by replacing the constant with a linear function (Uitz et al., 2006). The general shapes of generalized and modified Gaussian functions are shown in Figure 6.
In this study, the modified Gaussian function was selected to parameterize the BGC-Argo Chl-a profiles. Since a mixed type Chl-a profile exhibits a constant concentration from the surface to great depths, these types of profiles were excluded from the subsequent parameterization to construct the inversion model. As a result, 1360 of the 1611 BGC-Argo Chl-a profiles were parameterized with the following: where chl surf denotes the Chl-a surface concentration measured by the BGC-Argo floats, trend indicates the trend of the background concentration, chl max is the maximum concentration in the Chl-a profile, z max denotes the maximum concentration depth, and width denotes the half-peak width, a fitting-related parameter indicating the thickness of the SCM that is defined as the width (measured in meters) at the halfheight of the SCM layer. This fitting-related parameter is defined as controlling the shape of the profile but can be obtained only by model fitting and cannot be measured directly. According to the fitting formula used herein, the fitting-related parameters specifically refer to width and trend. Here, chl max is regarded as the total magnitude of the Chl-a concentration, not that of the background-subtracted value.
To ensure the fitting accuracy, the Curve Fitting C program (O'Haver, 1997), a nonlinear iterative curve fitting method developed at the University of Maryland, was used to implement profile fitting. This method adjusts parameters in a systematic manner via a trial-and-error strategy until the equation yields a fitted curve that is close to the expected curve. Two commonly used fitting accuracy proxies, namely, the coefficient of determination (R 2 ) and root-mean-square error (RMSE), were selected to evaluate the fitting effect. Overall, over 70% of the profiles had an R 2 greater than 0.8, and the mean R 2 was 0.93. In regard to the specific types, the Gaussian and exponential types attained the best fitting accuracy, both with an average R 2 of 0.94, followed by the quasi- Gaussian profile type with an average R 2 of 0.93. To illustrate the fitting effect more intuitively, panels (a), (b) and (c) of Figure 7 show example profiles with the highest fitting accuracy for each of these three types and their corresponding estimations. As shown in this figure, strong correlations were obtained between the measured and estimated vertical profiles, with R 2 values of 0.9873, 09942 and 0.9914 and RMSE values of 3.176, 2.80 and 3.825, respectively.
The statistical results (Table 1) for the shape-dependent parameters obtained after profile parameterization quantitatively confirmed the shape characteristics of each profile type mentioned in the previous section. Specifically, in terms of the average value, the Gaussian type yielded the lowest maximum Chl-a concentration but the greatest maximum Chl-a depth, with values of 0.4 mg m -3 and 84.25 m, respectively. The exponential type was the opposite to the Gaussian type, namely, the highest maximum concentration and shallowest maximum concentration depth were obtained, at 2.03 mg m -3 and 11.47 m, respectively. Compared to those of these two types of profiles, the maximum Chl-a concentration and corresponding depth of the quasi-Gaussian type were moderate, at 0.49 mg m -3 and 38.21 m, respectively. In terms of the half-peak width, that of the quasi-Gaussian type was the largest (81.51 m), followed by the exponential (73.08 m) and Gaussian types (55.24 m). According to the half-peak width, phytoplankton were relatively more vertically concentrated along the Gaussian profile direction and more dispersed along the exponential and quasi-Gaussian profile directions, with half-peak widths of 55.24, 73.08 and 81.51 m, respectively. The trend variable is a model fittingrelated parameter, with no explicit biophysical meaning, although this parameter can, to a certain extent, reflect the difference between the Chl-a surface concentration and the Chl-a concentration at great depths.

Chl-a Vertical Profile Inversion Model
Reconstructing the Chl-a vertical profile from OCRS product is to establish the correlations between its surface variables and its shapedependent parameters. Matchups between the BGC-Argo profiles and OCRS products provide the possibility to implement this inversion. After classifying and parameterizing the profile, the shape-dependent parameters of each profile were obtained. Then, the relationships between these shape-dependent parameters derived from the parameterization between their corresponding surface concentrations derived from the OCRS Chl-a products were analyzed. To ensure the robustness and accuracy of the established relationships, 487 poorly fitted profiles with R 2 values lower than 0.8 were discarded. As a result, 1124 robustly parameterized vertical profiles were retained to establish the vertical profile inversion model.
To establish the inversion model, all matchups were first used to reveal the global correlations applicable to all profile types.  Only when robust global relationships cannot be obtained, the type-specific correlations were considered by using matchups corresponding to different type profiles. The SCM depth (z max ) has been demonstrated to be inversely proportional to the Chl-a concentration derived from satellite observations (chl sat ) over a wide range of trophic conditions (Morel & Berthon, 1989;Uitz et al., 2006;Mignot et al., 2011). This conclusion remains valid in the Mediterranean Sea. As shown in panel (a) of Figure 8, z max had a strong negative correlation with chl sat : Eq: 2 This negative relationship was observed from all matchups with an R 2 value of 0.79, thus indicating this global correlation suitable to estimate z max for all types of vertical profiles with chl sat derived from OCRS products.
Panel (b) of Figure 8 shows the correlation between the maximum Chl-a concentration (chl max ) and half-peak width of the fitting curve (width) for all profile types. Similar to z max and chl sat , a strong negative correlation between chl max and width was observed: chl max =37:55Âwidth −1:174 Eq: 3 This formulation yielded an R 2 of 0.81. The relationship between the phytoplankton biomass maximum and SCM layer thickness is well understood, and thus, width was concluded to be closely correlated with chl max in the study area. Notably, as the Chl-a maximum concentration rises, more phytoplankton accumulate within the SCM layer, nutrient consumption occurs more quickly, and the SCM layer becomes thinner (Barbieux et al., 2019).
In addition to these two relationships, another global correlation was observed between chl sat and the background trend of the Chl-a concentration (trend): trend=−0:0006Âchl sat +0:00001 Eq: 4 The linear relationship between trend and chl sat yielded an R 2 of 0.92, as shown in panel (e) of Figure 8. The potential reason for this negative relationship between these two parameters could be the positive correlation between the attenuation intensity of solar radiation and the phytoplankton biomass at the sea surface.
Having established the relationship between z max and chl sat and that between chl max and width, the Chl-a vertical profile can be reconstructed only if the relationships between z max or chl sat with both chl max and width can be established. However, except for the above parameters, no other reliable global correlations were discovered. Therefore, the possible correlations between these parameters and the combinations of the vertical profile types were explored via exploratory data analysis.
In regard to the quasi-Gaussian Chl-a vertical profiles, a strong positive linear correlation between chl max and chl sat was observed: chl max =chl sat Â1:03+0:09 Eq: 5 This linear relationship yielded an R 2 of 0.96. With this relationship, chl max could be derived from the sea surface concentration measured by satellite sensors. Then, z max can be derived from chl sat via Eq. 2, and width can be estimated from the derived chl max value based on Eq. 3. In this way, the quasi-Gaussian Chl-a vertical profile can be reconstructed.
In contrast to the quasi-Gaussian type, no direct relationships were found between chl sat and chl max or width in the Gaussian type profiles. Considering that the thickness of the subsurface layer is top-bottom limited by both light and nutrients, z eu and the MLD were introduced here to estimate width. The variation in width with respect to z eu and the MLD can be formulated as follows:  This correlation yielded an R 2 of 0.89. The establishment of this equation facilitates the estimation of all shape-dependent parameters of the Gaussian type Chl-a vertical profile. First, trend and z max can be estimated based on the OCRS Chl-a products with Eqs. 2 and 4, respectively. Then, width can be calculated according to Eq. 6 based on the products of z eu and MLD. Finally, chl max can be derived from the estimated width according to Eq. 3.
Unfortunately, failure to determine the correlations between chl sat and width and chl max makes it impossible to reconstruct the exponential type Chl-a vertical profiles. Fortunately, few profiles were of this type (accounting for only 1.21% of all profiles) in the Mediterranean Sea, and thus, ignoring exponential type profiles would not significantly impact the inversion of Chl-a vertical profiles across the study region.

Accuracy Assessment Metrics
To evaluate the classification models, the most popular confusion matrix was selected in this study. Based on the selected confusion matrix, the overall accuracy (OA), user accuracy (UA) and producer accuracy (PA) were calculated. When deriving the confusion matrix, the labels provided by the PCM were regarded as the truth, and the labels obtained with the established vertical profile classification model were regarded as predictions.
To evaluate the profile inversion accuracy, the RMSE and the mean absolute percent deviation (MAPD) were selected as the accuracy metrics. Their mathematical equations are as follows: where p m denotes the shape-dependent parameter measured by the Bio-Argo profiling floats, p e denotes the corresponding model estimation, and n indicates the number of vertical profiles used for evaluation.

Spatial and Temporal Patterns of the in Situ Chl-a Vertical Profiles
To reveal the spatial and temporal patterns of the Chl-a vertical profiles in the Mediterranean Sea, the PCM classification results for all the profiles measured by the BGC-Argo floats were counted according to both the subsea area and the season, as shown in type in the Mediterranean Sea, accounting for 78.48% of all measured profiles, followed by the mixed and quasi-Gaussian profile types, accounting for 15.6% and 4.72%, respectively. As mentioned above, the exponential type attained the lowest proportion, accounting for only 1.21% of all measured profiles. In terms of the spatial distribution, the Gaussian type was widely distributed in all five subsea areas, and the quasi-Gaussian type occurred in four subsea areas (all but LEV). The distribution of the mixed type was further reduced to the ION, LEV and NW subsea areas, whereas the exponential type was found only in the NW subsea area. In terms of the temporal distribution, the Gaussian type emerged mainly in summer, autumn and winter, while the exponential type occurred mostly in early summer. The mixed and quasi-Gaussian types were primarily observed in spring and summer but also in winter in some subsea areas.
The spatial and temporal characteristics of the Chl-a vertical profiles revealed by the PCM agree with previous findings in Mediterranean bioregions (Mayot et al., 2017;Palmieŕi et al., 2018). This agreement can be considered evidence that the classification results of the PCM can provide sufficiently accurate a priori knowledge for the construction of a vertical profile classification model.

Accuracy Assessment of the Proposed Profile Classification Model
The accuracy of the OCRS-based Chl-a vertical profile classification model was evaluated based on profiles measured by the four BGC-Argo floats reserved for validation with World Meteorological Organization (WMO) identifier numbers 6901511, 6901513, 6901776, and 6902828. As a result, 151 matchups were used to evaluate the performance of the trained binary tree profile classification model. Based on these matchups, the confusion matrix was calculated, as summarized in Table 2.
The confusion matrix reveals that the classification model achieved a satisfactory overall performance with an OA value of 89%. Specifically, the PA values of the Gaussian, quasi-Gaussian, exponential and mixed Chl-a vertical profile types were 92%, 100%, 58%, and 79%, respectively, and the corresponding UA values were 96%, 63%, 88%, and 65%. Regarding the dominant Gaussian Chl-a vertical profiles, the classification model achieved both very low omission errors and very low commission errors with PA and UA values of 92% and 96%, respectively. Although the PA values of the exponential type and UA values of the quasi-Gaussian and mixed types reached only 58%, 63% and 65%, respectively, this accuracy level is considered adequate considering their low proportion in the Mediterranean Sea. The OA value of the classification model reached 89%, which is sufficiently accurate to meet the needs of inverting Chl-a vertical profiles.

Accuracy Assessment of the Established Profile Inversion Model
The matchups used in the evaluation of the proposed classification model were also used to quantitatively assess the accuracy of the established vertical profile inversion model. The performance of the profile inversion model was assessed by measuring the deviations between the measured shape-dependent parameters and corresponding model estimations. Four parameters were used to characterize the Chl-a vertical profile shape in this study, but since width and trend cannot be directly measured, only chl max and z max were thus used for evaluation purposes to avoid possible uncertainties due to the profile parameterization.
Comparisons of chl max and z max estimated with the proposed profile inversion model with the corresponding reference in situ measurements are shown in panels (a) and (b), respectively, of Figure 10. Overall, a robust relationship was observed between the z max values estimated with the inversion model based on the OCRS products and those measured by the BGC-Argo floats. This correlation was associated with a high determination coefficient (R 2 = 0.89) and low RMSE (8.49)  deviation from the observed values. A satisfactory correlation was also observed between the estimated and measured chl max values. Although the R 2 value and slope of the regression were lower at 0.64 and 0.72, respectively, and although the MAPD value increased to 22%, the correlation remained highly satisfactory given the large range of the study area. The accuracy of the proposed Chl-a profile inversion model was further evaluated for each of the different types of profiles ( Table 3). In general, the inversion model achieved a higher accuracy for the Gaussian type than for the quasi-Gaussian type (R 2 : 0.62 versus 0.51; slope: 0.72 versus 0.33; RMSE: 0.13 versus 0.2; MAPD: 22% versus 37%).
To compensate for the lack of evaluation results for width and trend, the Chl-a concentrations at the different depths estimated with the vertical profile inversion model were further compared to those measured by the BGC-Argo floats ( Figure 11). Comparisons of the Chl-a concentrations at different depths can reveal not only the accuracy of the estimated width and trend values but also the overall accuracy of the profile inversion model. According to Figure 11, the predicted vertical profiles exhibited relatively high overall accuracy in all five subsea areas. The R 2 values were all higher than 0.5, and the RMSE values were all lower than 0.1. Among the specific subsea areas, the agreement between the actual measurements and predictions remained near the 1:1 line in the TYR, ION and LEV subsea areas. In contrast, in the SW and NW subsea areas, the regression lines were below the 1:1 line, indicating that the profile inversion model yielded slightly underestimated values, with R 2 values of 0.71 and 0.63, respectively.

Spatial and Temporal Characteristics of the Chl-a Profile Types in the Mediterranean Sea
To investigate the spatial characteristics of the Chl-a vertical profile types in the Mediterranean Sea, the monthly Chl-a profile types from 2011 to 2018 were generated based on the proposed Chl-a profile classification model, and the proportion of each type was determined. The results are shown in Figure 12, with the proportions of the mixed, exponential, quasi-Gaussian and Gaussian types being shown successively in panels (a) through (d). In general, the findings are consistent with the classification results of the PCM applied to the BGC-Argo Chl-a profiles, as introduced in Section 4.1. Specifically, the Gaussian type is the most common Chl-a profile type in the Mediterranean Sea, followed by the quasi-Gaussian, mixed, and exponential types. Moreover, the Gaussian type Chl-a profile is evenly distributed across the whole Mediterranean Sea, with a proportion higher than 50% in most of the areas, even exceeding 80% in the northern coastal area and near the Kerkennah Islands, a group of islands off the eastern coast of Tunisia. The quasi-Gaussian type attains a higher proportion in the southern half of the sea than in the northern half, with a maximum frequency of nearly 40%. The mixed type Chl-a profile is similar to the quasi-Gaussian type profile in terms of its maximum proportion, but its spatial distribution is the opposite, namely, the former profile type is concentrated mainly in the northern half of the sea.
The temporal characteristics of the Chl-a vertical profile types in the Mediterranean Sea were analyzed by adopting the monthly Chl-a profile classification results in 2013 (depicted in Figure 13) as an example. This figure shows that the Mediterranean Chl-a vertical profile types exhibit obvious seasonal characteristics.  December and March. The time window of the quasi-Gaussian type profiles is similar to that of the mixed type profiles, but differently appears to predominate in January and February.

Spatial and Temporal Characteristics of the Chl-a Profile Shape-Dependent Parameters in the Mediterranean Sea
Based on the established Chl-a vertical profile classification and inversion models, the true Chl-a vertical profiles in the Mediterranean Sea can be visualized in 3D as long as the corresponding MODIS Chl-a and Kd490 products and HYCOM MLD product can be obtained. Continuous summer   and autumn Chl-a concentrations in the Mediterranean were generated, for example, by using the June and September monthly MODIS Chl-a products and the monthly average MLD products. The Chl-a total concentration was further integrated according to the reconstructed vertical profiles.
To further verify the robustness of the proposed method, the characteristics of phytoplankton vertical distribution in the Mediterranean Sea revealed by the reconstruction results were compared to those reported in the literature. The per-pixel shapedependent parameters and derived total Chl-a are shown in Figure 14, showing that the different shape-dependent parameters exhibit distinct spatial and temporal patterns. For example, in terms of their seasonality, z max is deeper in summer than in autumn (mean depth of 88.54 ± 28.11 m versus 85.17 ± 24.94 m), while the width is larger in summer than in autumn (mean value of 64.36 ± 20.96 m versus 57.59 ± 18.67 m). However, chl max reveals the opposite pattern, with a higher mean value in September (0.38 ± 0.18 mg m -3 ) than in June (0.37 ± 0.15 mg m -3 ), which agrees with previous findings (Lavigne et al., 2015). Spatially, z max exhibits a longitudinal gradient in the Mediterranean Sea, with the value rising with increasing longitude up to a depth in excess of 120 m in the eastern Mediterranean Sea, except for some coastal areas. This conclusion agrees with previous findings reported by Lavigne et al. (2015) and Crise et al. (1999). A similar longitudinal gradient is also observed for width, which is larger in the eastern than in the western Mediterranean Sea. Furthermore, the existence of an east-west gradients of these shape-dependent parameters directly leads to a similar gradient of the Chl-a total concentration. As a result, the Chl-a total concentration is higher in the eastern Mediterranean Sea than in the western Mediterranean Sea in both summer and autumn.

CONCLUSIONS AND PERSPECTIVES
To observe the 3D spatial and temporal Chl-a concentration distribution in the Mediterranean Sea, a simple and robust process-oriented profile reconstruction method was proposed in this study. To estimate Chl-a vertical profiles based on corresponding OCRS products, the established vertical profile classification model was first used to identify the profile type in each of the pixels of the Chl-a product based on both Kd490 and MLD products, and the shape-dependent parameters of the vertical profiles were then estimated by using type-related correlations based on the proposed profile inversion model. A quantitative evaluation revealed that the proposed vertical profile classification model achieved an overall accuracy of 89%, and the proposed vertical profile inversion model attained an average absolute percent deviation value ranging from 12% to 37% for the different shape-dependent parameters.
The proposed profile estimation method was then used to generate monthly 3D Chl-a profiles in the Mediterranean Sea from 2011 to 2018. Based on the reconstructed Chl-a profiles, their spatial and temporal characteristics and those of the water column total biomass in the Mediterranean Sea were investigated. Considering the important role of that Mediterranean Sea plays in climate change research, the proposed method is expected to serve as a powerful tool in studying the status and changes in the Earth's environment.

DATA AVAILABILITY STATEMENT
The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.

FUNDING
This study is supported by the High Resolution Earth Observation Systems of National Science and Technology Major Projects (05-Y30B01-9001-19/20-2); Key Special Project for Introduced Talents Team of Southern Marine Science and E n g i n e e r i n g Gu a n g d o n g L a b o r a to r y ( G u a n g z h o u ) (GML2019ZD0602); the National Science Foundation of China (61991454); the National Key Research and Development Program of China (2016YFC1400901).