Spatial Distribution and Abundance of Mesopelagic Fish Biomass in the Mediterranean Sea

Mesopelagic fish, being in the middle of the trophic web, are important key species for the marine environment; yet limited knowledge exists about their biology and abundance. This is particularly true in the Mediterranean Sea where no regional assessment is currently undertaken regarding their biomass and/or distribution. This study evaluates spatial and temporal patterns of mesopelagic fish biomass in the 1994–2011 period. We do that for the whole Mediterranean Sea using two well-established statistical models, the Generalized Additive Model (GAM) and Random Forest (RF). Results indicate that the bathymetry played an important role in the estimation of mesopelagic fish biomass and in its temporal and spatial distribution. The average biomass over the whole time period reached 1.08 and 0.10 t/km2, depending on the model considered. The Western Mediterranean and Ionian Seas were the sub-regions with the highest biomass, while the Adriatic was the area with the lowest. Temporal trends showed different trajectories with steep decrease and a fluctuation, using respectively RF and GAM. This study constitutes the first attempt to estimate the biomass and the spatial temporal patterns of mesopelagic fish using environmental variables as predictors. Given the growing interest in mesopelagic fish, our study sets a baseline to further develop mesopelagic fish biomass assessments in the region. Our results stress the need to improve data collection and quality in the region while identifying appropriate tools to better understand and assess the processes behind mesopelagic fish dynamics in the basin.

Mesopelagic fish, being in the middle of the trophic web, are important key species for the marine environment; yet limited knowledge exists about their biology and abundance. This is particularly true in the Mediterranean Sea where no regional assessment is currently undertaken regarding their biomass and/or distribution. This study evaluates spatial and temporal patterns of mesopelagic fish biomass in the 1994-2011 period. We do that for the whole Mediterranean Sea using two well-established statistical models, the Generalized Additive Model (GAM) and Random Forest (RF). Results indicate that the bathymetry played an important role in the estimation of mesopelagic fish biomass and in its temporal and spatial distribution. The average biomass over the whole time period reached 1.08 and 0.10 t/km 2 , depending on the model considered. The Western Mediterranean and Ionian Seas were the subregions with the highest biomass, while the Adriatic was the area with the lowest. Temporal trends showed different trajectories with steep decrease and a fluctuation, using respectively RF and GAM. This study constitutes the first attempt to estimate the biomass and the spatial temporal patterns of mesopelagic fish using environmental variables as predictors. Given the growing interest in mesopelagic fish, our study sets a baseline to further develop mesopelagic fish biomass assessments in the region. Our results stress the need to improve data collection and quality in the region while identifying appropriate tools to better understand and assess the processes behind mesopelagic fish dynamics in the basin.

INTRODUCTION
Mesopelagic fish species are considered the marine vertebrates with the highest biomass in the world ocean (Mann, 1984). They live in the water column from 200 m to around 1000 m depth, which are limits defined by the light intensity (Kaiser et al., 2011). A characteristic of most mesopelagic fish is a diel vertical migration, which consists of moving toward the surface layers at night and back to the deeper layers during the day following the diurnal movement of their preys (Olivar et al., 2012). Though not commercially exploited (only a few species in few areas are caught for commercial use, e.g., Benthosema glaciale in Oman Sea; Valinassab et al., 2007), they are important key species in the marine food web as they influence the dynamics of their prey, notably zooplankton (Williams et al., 2001;Saunders et al., 2019), and their predators, e.g., highly commercial pelagic fish (Würtz, 2010), marine mammals (Giménez et al., 2017), sea birds (Barrett et al., 2002) and benthic species (Herring, 2002).
Research on mesopelagic fish ecology has intensified in recent years and focused on life-history, morphology, behavior, trophodynamics and biomass distribution (Salvanes and Kristoffersen, 2001;Loots et al., 2007;Bernal et al., 2015;Gorelli et al., 2016;Freer et al., 2019). However, gaps still remain about their biological importance and their overall abundance (Irigoien et al., 2014). The first preliminary global assessment of mesopelagic fish biomass was conducted by Gjøsaeter and Kawaguchi (1980), who estimated, using data from acoustic and trawl surveys, that around one billion ton of mesopelagic fish live in our oceans. Other studies assessing global mesopelagic fish biomass showed that these estimates might vary considerably between 1 and 14 billion tons depending on the methods used (e.g., acoustic trawl surveys or modeling tools; Wilson et al., 2009;Irigoien et al., 2014;Jennings and Collingridge, 2015). Particularly in relation to acoustic and net-sampling methods, both have been shown to provide an approximate measure of the mesopelagic fish biomass. The main issue of the use of acoustic methods is related to the gas-filled swimbladder (important morphological organ of the acoustic "backscatter" signal; Dornan et al., 2019) of mesopelagic fish. Some species, in fact, do not have one or some lose the gas component in the adulthood. In addition, other organisms at midwater depths, e.g., siphonophores, possess gas-filled organ, which might be subsumed to the biomass from fish, leading to an overall overestimation of the midwater fish biomass (Kaartvedt et al., 2012;Davison et al., 2015;Dornan et al., 2019). Last, the relationship between targetstrength (an acoustic parameter) and biomass can introduce a potential bias in the estimate of the biomass (Saunders et al., 2013). The net-based method, on the other hand, is likely to underestimate the mesopelagic biomass through escapement (e.g., large trawls and large meshes) or avoidance (e.g., small nets and fine meshes) phenomena (Davison et al., 2015;Escobar-Flores et al., 2020). Yet, this method is still the most used and informative (e.g., providing coordinates and exact number of specimens) and provides a ground-truth of size and species composition of the sampled fish (Cotter, 2009;Davison et al., 2015).
In the Mediterranean Sea, our study area, no regional estimates of mesopelagic fish biomass exist. The only preliminary assessment was conducted by Gjøsaeter and Kawaguchi (1980) for the Western and Eastern basins, excluding the Adriatic and Ionian seas, revealing that approximately 2.5 million tons of mesopelagic fish inhabit these sub-regions. The basin is classified as a "data-poor region, " because data (e.g., biomass and biological data of non-commercially important species and deep-sea organisms), are still very fragmented, or even lacking for certain areas (mainly for southern countries) and, in some cases, difficult to access (Piroddi et al., 2015(Piroddi et al., , 2017Demirel et al., 2020).
The Mediterranean Sea is also defined a sea "under siege" because the basin accumulates the impact of multiple stressors, such as fishing, climate change, invasive species, and pollution, on its ecosystem (Coll et al., 2010;Katsanevakis et al., 2014). Several regional studies, in fact, have reported severe declines of predatory species in the region (Ferretti et al., 2008;Piroddi et al., 2017Piroddi et al., , 2020 with important consequences to its biodiversity. Thus, assessing how many mesopelagic fish inhabit the basin and where they distribute is a step further in understanding the role of mesopelagic fish in the functioning and structuring of this fragile ecosystem. As mentioned before, several approaches have been used to estimate the biomass of mesopelagic fish. Here we utilized jointly two statistical tools to evaluate the biomass of mesopelagic fish in the Mediterranean Sea. Currently, the use of multiple models has been recognized to be fundamental for increasing the reliability of model predictions, decreasing their uncertainty and better supporting policy decisions (Littell et al., 2011;IPBES, 2016;Lotze et al., 2019). In this study, we used two species distribution models to obtain a first preliminary estimate of mesopelagic fish biomass for the whole Mediterranean Sea, and assess spatial distributions and temporal changes over time. We did that using two methods: (1) Generalized Additive Model (GAM), and (2) Random Forest (RF), both of which utilize physical, chemical and biological parameters to predict biomass/abundance of a species and the likelihood of that species to inhabit a particular environment (Knudby et al., 2010).
Our study sets a baseline to further develop mesopelagic fish biomass assessments in the region and aims at identifying appropriate approaches to better understand and assess the processes behind mesopelagic fish dynamics in the basin.

The Mediterranean Sea
With a total surface of 2.5 million km 2 , the Mediterranean Sea is a semi-enclosed sea that only have limited water exchange with adjacent seas (the Atlantic Ocean via Gibraltar's strait, the Black Sea via the Bosporus and the Dardanelles and the Red Sea with the Suez Canal; see Figure 1).
The basin is heterogeneous with shallow sub-regions like the Adriatic Sea (mean depth around 300 m, Figure 1) and deep ones like the Ionian and Levantine seas (depths between 2500 and 4000 m; Zavatarelli and Melor, 1995;Durrieu de Madron et al., 2011). Decreasing gradients of nutrient and productivity are observed from north to south and west to east, making the Mediterranean Sea one of the most oligotrophic areas of the world ocean (Danovaro et al., 2010). Opposite trends are observed for temperature and salinity with increasing values eastward. In the twilight layer (i.e., mesopelagic and deeper layers) of the basin, temperature and salinity values are higher than in any other ocean (Gorlov, 2009). In the mesopelagic layer, there are longitudinal gradients of temperature and salinity, which correspond to 13 • C and 38.4 PSU in the western basin and 15.5 • C and 39.1 PSU in the eastern basin (Zavatarelli and Melor, 1995). This enclosed sea, despite covering only 0.32% of the world ocean volume, is an area known for its endemic biodiversity hotspot, hosting 7-10% of the world's marine biodiversity (Bianchi and Morri, 2000;Coll et al., 2010).

Biomass of Mesopelagic Fish
We built a database of georeferenced mesopelagic fish biomass (t/km 2 ) using data from peer-reviewed articles and/or scientific reports (Figure 1 and Supplementary Table 1). The database was constructed following a framework to account for differences in recorded data (e.g., abundance vs. biomass data); the details are provided in the Supplementary Materials). We only considered the biomass of adults or juveniles, excluding thus mesopelagic fish larvae. Selected deep-sea species consisted of species defined by FishBase (Froese and Pauly, 2015) either as mesopelagic or in some cases, as bathypelagic fishes. Even if bathypelagic fish usually live in waters deeper than the mesopelagic layer, some species occur in both mesopelagic and bathypelagic layers (e.g., Arctozenus risso: 0 to 2200 m depth) or are sometimes described as meso-and bathypelagic species by FishBase. In our study, we excluded bathypelagic species estimates if those species were not described as migrators to the mesopelagic layer. Overall, we extracted 11,430 mesopelagic fish biomass estimates for the period 1994-2011 that were collected either during day or night or both (Supplementary Table 1). Thirteen mesopelagic families were collected with Myctophidae and Sternoptychidae representing 70% of the total biomass. We then tested the diurnal and seasonal effect on the biomass data and excluded this effect from model formulation because of their non-significance in explaining data variability (F(1,402) = 0.1346; p > 0.05, F(4,1952) = 2.16; p > 0.05, respectively).
Sampling stations were mainly distributed along the slope of the Mediterranean Sea (Figure 1) and 49.7% of them were in the Western sub-region. An overview description of this database is presented in Supplementary Table 1 of the Supplementary Information. To finally keep one biomass value per station and year, mesopelagic fish biomass of the different collected species was summed up, which reduced the database to 2,694 records.

Environmental Predictors
Environmental data were used to predict the distribution of mesopelagic fish biomass in space and time. We used yearly gridded environmental data from a three-dimensional hydrodynamic biogeochemical model (GETM-MedERGOM) covering the whole Mediterranean Sea at a spatial resolution of 0.083 • (Macias et al., 2013(Macias et al., , 2014a. Detailed description and validation of this model can be found in Macias et al. (2013). Environmental predictors were selected based on their potential to influence mesopelagic fish distribution: primary production (PPR) (Olivar et al., 2012); dissolved oxygen (DO) (Kinzer et al., 1993), seawater salinity (S), seawater temperature (T) (Themelis, 1997;Danovaro et al., 2004).
Environmental predictors were averaged for salinity, temperature, and DO and integrated for the PPR over three depth layers: 0-10 m depth layer (i.e., surface): 0-150 m depth layer (i.e., the euphotic zone also delimited by the Mixed Layer Depth) 0-1000 m depth layer (i.e., the entire water column where mesopelagic fish might occur).
This was done to understand the influence/importance of selected depth related environmental data in the estimation of the biomass and distribution of mesopelagic fish.
Also, to consider sub-regional differences in environmental and biological characteristics of the region, georeferenced biomasses were linked to the four main Mediterranean subregions (i.e., Western, Adriatic, Ionian, and Eastern) following the division as provided by the Marine Strategy Framework Directive (MSFD; 2008/56/EC).

Generalized Additive Model
Generalized additive model is a non-parametric regression method that uses a link function to establish a relationship between the response variable and a "smoothed" function of the explanatory variable(s) (Hastie and Tibshirani, 1986). The strength of GAM is its ability to deal with highly non-linear and non-monotonic relationships between the response and the set of explanatory variables. Here, we used GAM to estimate the mesopelagic fish biomass in function of selected environmental predictors following the equations: log(biomass) ∼ s(log(Bathymetry)) + s(log(DO Layer )) + s(log(Temperature Layer )) + s(log(Salinity Layer ) (1) if Layer = 0-10 m: log(biomass) ∼ s(log(Bathymetry)) + s(log(PPR Layer )) + s(log(Temperature Layer )) + s(log(Salinity Layer ) where s and poly stand for smooth functions, Layer the depthlayer considered and Sub-region, one of the four sub-regions dividing the Mediterranean Sea and ε the residuals. The residuals ε were assumed to follow a Gaussian distribution.
To deal with spatial autocorrelation, we assess the residuals correlation structure using the exponential correlation structure (ECS). In the ECS, the correlation σ between two observations is given by: where d is the distance between the two observations and r is the range (corresponding to the distance at which the residuals are no longer correlate). Longitude and latitude are projected in ETRS-LAEA (also called ETRS 89) for the calculation of the d and r of the residuals' correlation structure in metric units. The ECS is chosen between the Gaussian, spherical and linear correlation, based on the reduction of the Akaike's information criterion (AIC). The optimal combination of the predictor variables is performed by comparing nested models using the likelihood ratio test with maximum likelihood estimation (Zuur et al., 2007). To validate the model, we randomly split the data into training (2/3) and validation (1/3), we then compared the predictions of both models (training/validation) and once considered suitable, we used the entire data set to predict the biomass of mesopelagic fish. The best model was chosen based on (1) the lowest AIC and (2) significant covariates (Wood, 2001). Here, GAM was run using the "mgcv" package (version 1.8-7) in R.

Random Forest
Random forest is an ensemble of decision trees based on a bootstrap aggregation method (also called bagging method; see Breiman, 1996), which partitions the variable of interest, here, biomass, from decisions made on the predictor data. RF is a statistical model, which has been increasingly used in ecological modeling to predict fish biomass with the lowest errors (Knudby et al., 2010). As for GAM, RF can deal effectively with nonlinearity between predictors and the response variable (Breiman, 2001). To tune the algorithm, the model requires two parameters: the number of regression trees, ntree and the number of different predictors tested at each split, mtry. Prior to the growth of each tree, one-third of the dataset, called the Out-Of-Bag (OOB), is excluded from the bootstrap and used for cross-validation. Decision trees are built by splitting the other two-thirds of Frontiers in Marine Science | www.frontiersin.org the dataset according to four randomly selected predictors. The algorithm of RF looks for the optimal values (e.g., a value that reduces the variance) of these predictors before the splits.
Once the decision trees are built, the accuracy of the tree predictions is assessed by using the observed biomass from the OOB dataset. In our study, we decided to set mtry to 2 and ntree to 500 to optimize the time computation and avoid errors in predictions (Breiman, 2001). We then estimated the mesopelagic fish biomass by averaging the predictions obtained from the 500 decision trees. To be consistent with GAM, RF used the same set of environmental predictors. RF was implemented using the "randomForest" package (version 4.6-10) in R, which is based on an algorithm described by Breiman (2001).

Models Performance
Generalized additive model and RF performances were compared through data variability explained by the models (i.e., crossvalidated R 2 in RF and deviance in GAM) and the importance of environmental predictors (i.e., the reduction in deviance in GAM and the increase rate of the mean square error in RF). In GAM, the reduction in deviance is calculated as the difference between residual deviance of the final model and the residual deviance of a model where one or more predictors are dropped. In RF, the increase rate of the mean square error is obtained excluding one predictor from the model. In addition to these methods, we decided to run (100 times) both models with a 10-fold cross-validation technique. The rootmean-square-error (RMSE) obtained through the 10-fold crossvalidation was used to summarize the errors and compare the two statistical methods (the one yielding the lowest RMSE was the model with a better fit). RMSE was estimated as follow: with n, the total number of data, y i the i-th prediction and y i the i-th observation.

Density Estimates in Space and Time
Once calibrated and validated, the best RF and GAM models were selected to predict the mesopelagic fish biomass (t/km 2 ) in space and time for the period 1994-2011 using the spatially explicit and time-resolved environmental variables provided by the hydrodynamic-biogeochemical model. Taking into consideration that mesopelagic fish occur mainly in waters deeper than 200 m, we discarded all the predictions estimated over a topography shallower than 200 m. For both models and each depth layer, we calculated the arithmetic average of the estimated biomass for the entire Mediterranean Sea and for each subregion (Western, Adriatic, Ionian, and Eastern). Additionally, we estimated biomass trends and spatial distributions using a geometric average over the period 1994-2011. Finally, we tested the sensitivities of the models to changes in catchability, an input parameter that was used to estimate the mesopelagic biomass in the dataset (see Supplementary Materials for details). Both selected models were trained and run with catchability values ranging between 0.05 and 0.35, which were the values of catchabilities observed in the available scientific literature.

Model Performances
The most statistically significant results were obtained with RF. This model, in fact, explained between 27.0 and 28.4% the variance of the data depending on the depth layer considered ( Table 1); hypotheses on the residuals were analyzed and verified (Supplementary Figures 1-3 in Supplementary Information). As for GAM, the deviances ranged between 14.9 and 15.4% with smooth functions statistically significant (p-values < 0.05) and analysis on residuals verified (Supplementary Figures 4-6 in Supplementary Information). Bathymetry was the most important predictor for both models followed by salinity, temperature, and PPR for RF and by DO, PPR, and salinity for GAM ( Table 1).
Based on the RMSE from the 10-fold cross-validation analysis, the performances of RF were generally better than the ones from GAM, with an average of 0.57 and 0.84, respectively ( Table 1).
In both models, the first two layers (0-10 and 0-150 m) contributed the most to the total explained variance. In GAM, the environmental predictors from the depth layer 0-150 m showed the highest improvement while in RF this was observed for the depth layer 0-10 m. For these GAM and RF models, the marginal effects of the environmental predictors in common (i.e., bathymetry, temperature, and salinity) on the biomass showed similar patterns (Supplementary Figures 7, 8).

Sub-Regions
High differences were observed between the two methods even at sub-regional scale (Figure 2). Overall, in RF there was a clear  decreasing gradient of mesopelagic biomass from West to East while in GAM, such pattern was more homogenous. In GAM, the Ionian Sea was the sub-region with the highest densities with an average estimated at 1.77 t/km 2 and estimates were contained in the 95% confidence interval from 0.01 to 5.80 t/km 2 . In RF, the highest values were found in the Western sub-region and averaged at 0.22 t/km 2 . The 95% confidence interval for that subregion included values from 0.01 to 0.66 t/km 2 . For both GAM and RF models, the Adriatic sub-regions had low estimates of biomass (i.e., <0.06 t/km 2 ). Then, the Ionian and Eastern subregions had the highest discrepancy rates (i.e., biomasses from GAM were ∼130 times higher than those estimated by RF).

Spatial Distribution and Temporal Dynamics
Spatial differences between the two models were mainly detected looking at the biomass distributions over the bathymetry (Figure 3). In general, small biomass estimates (<0.03 t/km 2 ) were observed in all the sub-regions, particularly near the continental shelves and had low value variations (Supplementary Figure 10). On the contrary, the highest estimates (>3.5 t/km 2 , Figure 3A) in GAM were encountered over the deep areas of the Western Mediterranean Sea, the Ionian and Eastern sub-regions (Figure 3). In those areas where the average estimates were high and the data coverage was absent, we also observed high standard deviation (i.e., from 0.4 to 3.5 t/km 2 ) of the prediction (Supplementary Figure 10A). Similarly, but with values 10 times smaller, biomass estimates in RF were high (>0.4 t/km 2 , Figure 3B) in the deep areas of the Western Mediterranean Sea. Nonetheless, the standard deviation was relatively the highest (from 0.4 to 1.9 t/km 2 ) near Spanish islands and the Sicilia Strait in the Western Mediterranean Sea. Temporal dynamics estimated by both models for the whole Mediterranean Sea suggested an increase of total mesopelagic fish biomass from 1995, which then decreased softly until 2007 before to rose steeply again in GAM (Figure 4). Meanwhile, in RF, the total mesopelagic biomass declined in 1998 and remained stable throughout 2001 and 2011. Similar trends were observed at subregional scale (Supplementary Figure 11), with the exception of the Ionian sub-regions in RF where the decline was not pronounced as in the other sub-regions. The temporal dynamic pattern in GAM was mainly driven by the temporal variability observed in the Ionian sub-region (Supplementary Figure 11C), where the predicted biomass was the highest (see Figure 2) while it was the Western sub-region influencing the most the modeled values in RF (Supplementary Figures 10, 11A). Temporal trends showed that RF was able to capture the decline of mesopelagic fish as observed in the sampling data (Figure 4) for the end of 1990s while GAM showed only a fluctuation.

Data Quality and Model Performance
This study provides a first spatial and temporal estimate of mesopelagic fish biomass in the Mediterranean Sea. It covers the last two decades and is based on the best available data and best available modeling tools. Yet, we acknowledge that considerable gaps still exist, particularly with regards to data quality. For instance, environmental data were extracted from a hydrodynamic-biogeochemical model that, despite being tested and validated, still show levels of uncertainties. This particular model set-up has been shown to properly represent past (Macias et al., 2014a), present (Macias et al., 2014b), and future  conditions in this semi-enclosed basin, although it is still far from a perfect description of environmental characteristics. As a particle example of model limitations, it could be mentioned the underestimation of primary production values in certain particular areas such as along the Egyptian Coast, on the Tunisian continental shelf and in the shallow Adriatic waters, which are known to be highly productive FIGURE 3 | Spatial distribution of mesopelagic fish biomass averaged over the whole time period (1994-2012) using GAM (A) and RF (B). White areas correspond to the topography shallower than 200 m which was discarded from the analysis. (Bosc et al., 2004). Using time series from survey data (instead of modeled climatological data) could possibly result in a better model performance; however, data at regional scale with appropriate spatial and temporal coverage are inexistent.
The extraction of mesopelagic fish biomass estimates from scientific literature can itself be a considerable source of bias. Indeed, in some cases, mesopelagic biomass was obtained from studies where the scope was generally not aimed at targeting mesopelagic fish. Lack of regional data impedes a proper analysis of the dynamics of mesopelagic fish in the region. Besides, it also hampers the development of a comprehensive ecosystem analysis to evaluate the functional role of these fish in the food web. More efforts should be dedicated to improve data collection and quality in the region. Sampling data should be more homogeneously distributed in space and should have a better time coverage. Nonetheless, the biomass database that was constructed for this study followed a rigorous approach (Supplementary Materials) using, when missing, data (weights from length-weight relationships) coming from FishBase (a global biodiversity information system on finfishes) 1 . We use this system to gather consistent information among the different studies and species, and to reduce further noise coming from information collected in localized/individual studies. Yet, we acknowledge that FishBase has limitations, particularly associated to uncertainties on the ecology of many mesopelagic fish species; thus, further efforts should be made to reduce this knowledge gap. 1 www.fishbase.org Probably the most uncertain parameter of our biomass database is the catchability that was fixed at 0.16 (i.e., only 16% of mesopelagic fishes in the volume swept by the trawl gear(s) were assumed to be caught by the gear), to estimate the biomass present in the water column, across species and surveys. This catchability was estimated by averaging published catchabilities of different gears [i.e., 6 feet Isaac Kidd Midwater Trawl (Barkley, 1964(Barkley, , 1972; Engel 152 (May and Blaber, 1989)], which were also the main gears used to sample mesopelagic fish biomass in the Mediterranean Sea. Such value might have led to under-or overestimate of the mesopelagic fish biomass. Catchability is a matter of fish size, fish species, fishing depth and a lot of other parameters, which makes it difficult to measure and estimate. Unfortunately, at the time this research was undertaken, no more information was available to fill these gaps. The results obtained from the sensitivity analysis, which tested different catchabilities, showed that biomass values may vary greatly depending on the catchability used. This result reinforces the need to conduct further research on catchability to reduce the uncertainties associated to the mesopelagic fish biomass estimation.
Lastly, our sampling sites were located close to the coast or the continental shelf and mainly in the northern part of the Mediterranean Sea. Spatial autocorrelation was not detected in the residuals of the two models. However, due to the scarcity of biomass sampling data for open waters, the estimation of the mesopelagic fish biomass distribution could be biased. In fact, with a better distribution of the sampling, we could undoubtedly improve the patterns of some relationship between biomass and predictors (e.g., Supplementary Figures 7, 8) and for the spatial estimation of the biomass (e.g., Figure 3). Nevertheless, the estimation of biomass on the continental slope of the Mediterranean Sea is in line with existing knowledge. The increase of biomass with bathymetry was previously observed by Moranta et al. (2004) and Massutí and Reñones (2005), and high estimation above the continental slope is expected since those zones shelter a high biodiversity and food abundance (Danovaro et al., 2010).
With the run of two models, we were able to assess communalities and main differences in depicting the relationship of mesopelagic fish with its environment. Our results highlight a general different pattern among these tools, not only in the absolute biomass estimates but also in the temporal and spatial distribution. These two models are commonly used to assess species distributions in both marine and terrestrial realms (e.g., Moisen et al., 2006;Rooper et al., 2017;Kosicki, 2020), however RF models have been seen to have substantially more performance power and control over the model overfitting compared to GAM approaches (Elith et al., 2006;Rooper et al., 2017), as it is also shown in this study. This is likely because of the RF algorithm, which is an ensemble of models (regression trees), each built on a random selection of relatively few predictors (Breiman, 2001). Besides, contrary to the GAM approaches, the RF algorithm did not seem to overestimate the biomass at places where we originally lacked data.
In addition, the statistically significant differences in predictive performance between these two modeling tools, suggest the need for further comparative studies, including other techniques and model types, to get stronger and more reliable conclusions (Martínez-Minaya et al., 2018). It has been observed that these statistical tools, together with machine learning methods, have great potential in reducing prediction error while incorporating non-linear and interaction effects (Knudby et al., 2010). As pointed out by several studies (Fulton, 2010;Piroddi et al., 2017), there is "no one best model" but rather a suite of tools should be examined given the high uncertainty associated to the questions that such approaches aim to address.

Estimates of Biomass
The distribution of biomass over the whole Mediterranean Sea and the relationship with some environmental predictors had comparable patterns with the available literature. First, our predicted biomass with GAM for the whole region was of the same order of magnitude as estimated by Gjøsaeter and Kawaguchi (1980), which was ∼3 t/km 2 for the entire region. On the other hand, RF was more in agreement with Gjøsaeter and Kawaguchi (1980) in relation to the distribution of mesopelagic biomass which was higher in the western Mediterranean Sea compared to the eastern part. This is also in line with other studies in the Mediterranean Sea that suggest a decreasing gradient of species richness from west to east (Danovaro et al., 2010;Piroddi et al., 2017) due to an increased oligotrophy in the eastern regions. Since our study is the first of its kind in producing spatiotemporal maps and trends of mesopelagic fish biomass for the basin, unfortunately, at this stage, no further and clear conclusions can be drawn. A hypothesis could be associated to an increase in water temperature in the basin that together with low levels of primary production, typical of the oligotrophic nature of the Mediterranean Sea might have created unfavorable conditions for mesopelagic fish biomass to thrive in the eastern region.
In this study, bottom depth was the variable that explained the greatest portion of the biomass and spatial distribution of mesopelagic fish. Many studies have demonstrated the importance of this parameter on the distribution of reef and demersal fish (Knudby et al., 2010;Pham et al., 2015), and it is not surprising that even deeper fish like mesopelagic fish are as well highly influenced by this variable. Having environmental data from a 3D hydrodynamic biogeochemical model allowed us to explore the influence of different depth related environmental data on the dynamics of mesopelagic fish. Our results showed no significant variations among the three assessed depth layers, suggesting an important contribution of each of these compartments in the pattern of these fish. Yet, our study indicated that the euphotic zone was the area with the highest influence on the distribution and abundance of mesopelagic fish. These results might be biased by the survey data used in this assessment, which was mainly collected close to coastal/shelf areas. Further studies should better evaluate this aspect when/if more data (in open waters) become available. The use of spatially explicit environmental variables from hydrodynamic-biogeochemical models, extracted as in this case at different depths, highlighted the importance of considering such tools to better capture fish dynamics and their linkage with the environment (Piroddi et al., 2017(Piroddi et al., , 2020. Biomass could also be linked to other parameters we did not consider in this study, notably those of species-dependent character. We think particularly of an increase in natural mortality or a modification of certain biological features e.g., growth (smaller sizes) and reproduction (less survival of larvae; Herring, 2002) with environmental changes. We did not consider the rate of by-catch from the fishing activities, but it might be important when looking at mesopelagic fish biomass trend. Even if mesopelagic fish are not targeted by any commercial fisheries, they are discarded at a considerable scale in such fisheries (Gorelli et al., 2016).

CONCLUSION AND PERSPECTIVES
This study constitutes the first attempt to estimate the biomass and the spatial temporal patterns of mesopelagic fish using environmental variables as predictors. Compared to the work of our predecessors, Gjøsaeter and Kawaguchi (1980), this work provides a first preliminary estimate of the mesopelagic fish biomass for the whole Mediterranean Sea (not only the Western and Eastern sub-basins), and for more recent years (while the previous estimates were for the 1980s). Most importantly, this is the first study that assesses potential trends and spatialtemporal maps of this group of fish in the region. We agree that there is still considerable uncertainty associated to data availability and quality and model results, but our study aims to provide a baseline for assessing future mesopelagic fish biomass and distribution in the Mediterranean Sea. The two well-established statistical models used to estimate spatially and temporally mesopelagic fish biomass distribution in the Mediterranean Sea showed different results in relation to the distribution and the variation of the mesopelagic fish biomass in the region. Our analysis indicate that RF was the model that gave best biological results and acceptable performance. Finally, we believe that in order to improve our understanding in the processes behind mesopelagic fish dynamics in the Mediterranean Sea, more tools and analyses with integrated regional data should be tested.

DATA AVAILABILITY STATEMENT
The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.

AUTHOR CONTRIBUTIONS
MC-H gathered materials, built models, carried out postanalyses, wrote the manuscript. CP conceived the manuscript concept, supervised the first author, gathered materials, revised the manuscript. FQ built models, revised the manuscript. DM provided the material, revised the manuscript. VC conceived the manuscript concept, supervised the first author, gave access to materials, revised the manuscript. All authors contributed to the article and approved the submitted version.