Biogeography of Micronekton Assemblages in the Natural Park of the Coral Sea

Mesopelagic resources are central to the ecosystem but remain poorly studied mainly due to the lack of observations. This paper investigates the assemblages of micronekton organisms and their habitat in the Natural Park of the Coral Sea around New Caledonia (southwest Pacific) using data from 141 pelagic trawls. A total of 67,130 micronekton individuals (fish, crustaceans, and mollusks) were collected with 252 species identified among 152 genus and 76 families. In the analyses, we focused on 22 species; each were present in more than 33 trawls (i.e., in more than 25% of the total number of trawls) and studied their spatial distribution and vertical dynamic behavior. Community structure was investigated through region of common profile (RCP), an innovative statistical multivariate method allowing the study of both species assemblages and environmental conditions’ influence on species occurrence probability. Nine major assemblages were identified, mainly driven by time of the day and sampling depth. Environmental variables, such as mean oxygen concentration, mean temperature, and bathymetry, also influenced micronekton assemblages, inducing a north/south distribution pattern. Three major day-assemblages were identified, distributed over the whole EEZ but segregated by depth: one assemblage in waters shallower than 200 m and the other two in deeper waters, respectively, in the north and the south. The night-assemblages were mostly segregated by depth with two community changes at approximately 80 and 200 m and spatially with a north–south gradient. The predominant northern night assemblages were dominated by crustacean, whereas the southern assemblage mostly by cephalopods and fish species. Generally, the southwest part of the EEZ was the most diverse part. Statistical analyses allowed the prediction of the spatial distribution of each species, and its vertical migration behavior was determined. Based on results, three important areas were identified to be considered for special management measures as part of the Natural Park of the Coral Sea.


INTRODUCTION
Marine biodiversity is predicted to decrease in the coming decades on a global scale (Garcia Molinos et al., 2016;Worm and Lotze, 2016) with human activities, such as fishing, pollution, and habitat perturbation, identified as key drivers of this decrease (Lotze et al., 2006;Diaz et al., 2019). Climate change, another main threat to marine diversity, is causing large modifications in species spatial distribution (Cheung et al., 2009;Pecl et al., 2017;Hillebrand et al., 2018;Lotze et al., 2019) and, therefore, induces local changes in assemblage composition (Dornelas et al., 2014).
This global loss of biodiversity is threatening ecosystem functioning as well as the services it provides, which are essential for human well-being (Worm et al., 2006;Beaumont et al., 2007;Cardinale et al., 2012;Hannah et al., 2013). Marine protected areas (MPAs) and large-scale MPAs have been identified as key species-conservation tools (Ehler and Douvere, 2009;Christie et al., 2017;Davidson and Dulvy, 2017) as well as to mitigate the effects of climate change (Roberts et al., 2017). In the vast pelagic domain, MPAs have been offered as tool to protect migratory species (Davies et al., 2012). However, it has been emphasized that, at the global scale, the current level of protection of marine ecosystems spatially mismatched the important regions in terms of diversity; for instance, main diversity hot spots are mostly outside protected areas (Lindegren et al., 2018). This relatively inefficient global management could be explained by the knowledge gap in marine species ecology (especially environmental drivers of their spatial distribution) and species interactions (trophic or others) (Dornelas et al., 2014). Therefore, climate change impacts could be mitigated by improving the understanding of species optimal environmental conditions and by including this in global strategic frameworks for management and conservation (Pecl et al., 2017;Glover et al., 2018). However, this should not be limited to a small number of emblematic species, as is often the case, but should aim at encompassing the whole species assemblages (Fisher et al., 2011).
In pelagic ecosystems, mid-trophic-level organisms (MTLOs), also referred to as micronekton, include crustaceans, mollusks, gelatinous organisms, and fish with size ranging from 1 to 20 cm long (e.g., Bertrand et al., 2002;Young et al., 2015). Micronekton species play a central role in the pelagic ecosystem as food of predators, including commercially targeted species, such as tuna (Bertrand et al., 2002;Olson et al., 2014;Duffy et al., 2017), and emblematic marine species, such as seabirds, cetaceans, and sharks (Lambert et al., 2014;Miller et al., 2018). Hence, micronekton diversity directly impacts predators' diet (Duffy et al., 2017;Portner et al., 2017). Micronekton perform daily migrations between the surface layer (0-200 m), where they stay during the night, and the mesopelagic layer (200-1000 m), where they stay during the day to avoid visual predation. This vertical migration largely participates in the downward flux of nutrients and particulate organic matter via respiration and excretion processes Ariza et al., 2015;Drazen and Sutton, 2017). This diel vertical migration pattern (DVM) is observed in the world's oceans (Bianchi and Mislan, 2016;Klevjer et al., 2016) and is performed by a large majority of species (75% of organisms in the southwest Pacific according to Receveur et al., 2019). Despite their important role, the dynamic of micronektonic species in their environment is still poorly understood (St. John et al., 2016;Hidalgo and Browman, 2019). In particular, species vertical behavior (e.g., percentage of individuals performing DVM) is one knowledge gap to solve in order to include micronekton in biogeochemical models of the carbon cycle (Aumont et al., 2018). The large intrinsic micronekton diversity (Olivar et al., 2017) and the poor observation level complicate a complete understanding of the mesopelagic species ecology. Indeed, each micronektonic species has a specific habitat that is challenging to identify (Duffy et al., 2017).
The Natural Park of the Coral Sea was recently created, covering the entire oceanic waters of New Caledonia (Decree 2014-1063/GNC). To develop the management plan of the park, there is an important need for robust scientific information on the productivity and functioning of this pelagic ecosystem, including micronekton dynamics and role in food webs (Ceccarelli et al., 2013;Gardes et al., 2014). Additional information on micronekton assemblages and diversity will contribute to designing the Natural Park management plan, accounting for the conservation of the food stock of all top predators.
In this context, the large diversity of predators and micronekton makes the New Caledonian EEZ a good case study to understand the pelagic ecosystem functioning and to implement a management tool. The present study focuses especially on micronekton assemblage diversity, one of the most poorly understood links in pelagic ecosystems (St. John et al., 2016;Hidalgo and Browman, 2019). Using data from 141 mesopelagic hauls conducted in the New Caledonian EEZ in 2011-2016, we applied region of common profile (RCP) methodology to quantitatively characterize the micronektonic assemblages, to identify the environmental conditions influencing those assemblages, and to determine their spatial distribution. RCP also allowed the description of the FIGURE 1 | Trawl hauls' spatial position colored by the moment of the day (gray and dark crosses) in the New Caledonian Exclusive Economic Zone. The background blue colors represent the seabed depth (where lighter colors are shallower). Note that hauls from two cruises in the northern part partially overlap, and for visualization purposes haul positions have been shifted by 0.5 • to the north (A). The number of hauls per depth colored by the moment of the day (B).
vertical distribution of the most abundant species as well as their spatial distribution.

Sample Collection
We gathered data from five cruises (Nectalis 1 to 5, 1 'N1' to 'N5' hereafter; Supplementary Table S1) on board the R/V Alis in the New Caledonian EEZ, covering the area between 156 • E-175 • E and 14 • S-27 • S over the period 2011 to 2016 ( Figure 1A). The sampling strategy was constrained by the availability of the vessel, the cyclonic season, the duration of the cruise, the small size of the vessel (28 m), and the number of hours worked by the crew. The sampling strategy of the five cruises aimed primarily at covering spatially the whole EEZ of New Caledonia. N1 and N2 covered the northern region; N3, N4, covered the southwest; and N5 covered the southeast (Supplementary Figure S1). The second objective was also to consider seasonal variability with N1 covering the cold season and N2 covering the warm season in the northern area. It was originally planned to do the same in the southern part of the EEZ with N3 carried out during the warm season, and N4 was supposed to be conducted during the cold season. However, the availability of the vessel allowed us to carry out N4 at the transition period between the two seasons only. Moreover, N4 was interrupted after a few days because of an engine failure. N5 was meant to reproduce N4 the following year, but the vessel was available only during the warm season, and considering the weather conditions at the time, it was decided to explore the southeast instead of the southwest area. These successive constraints explained some lack of systematic sampling of space and time. Sampling stations were distributed evenly along the tracks and lasted 5-7 h each (Supplementary Figure S1). The 1 doi: 10.18142/243 sampling was conducted as much as possible to balance the day and night and the depth according to the ship's arrival time at the station and avoiding dawn and dusk when micronekton migrates. At each station one, two, or three trawl hauls were conducted within 4 h at different depths. Depth sampled was selected according to an echosounder signal targeting the layers where the acoustic signal was the strongest. Given all these limitations, the data sampling remains rather well balanced ( Figure 1B and Supplementary Table S2).
Acoustic data were recorded continuously during cruises using an EK60 echosounder (SIMRAD Kongsberg Maritime AS, Horten, Norway) connected to four split-beam transducers at 38, 70, 120, and 200 kHz. Micronekton organisms were sampled at each station with a mid-water trawl with a 10-mm mesh. Vertical and horizontal mouth openings of about 10 m were monitored with trawl opening sensors (Scanmar, Åsgårdstrand, Norway). Horizontal tows were conducted to target aggregations visually detected with the EK60 echosounder during both day and night periods ( Figure 1B). Once the net was stabilized at the echo depth, based on Scanmar depth-positioning information, it was towed for about 30 min with a vessel speed around 3-4 knots. One to three hauls were conducted at each station at depths varying between 10 and 590 m during the day and between 10 and 520 m during the night. Organisms collected were quickly sorted by broad taxon (e.g., Crustacea, Actinopterygii, Cephalopoda) on board the vessel and immediately frozen at −20 • C.
In the laboratory, specimens were defrosted and identified to the lowest possible taxon based on morphologic criteria and using identification keys. Identified specimens were counted and weighed. Gelatinous organisms were removed from analysis as accurate counts and weights cannot be recorded due to the fragility of those organisms and their high percentage of water. Indeed, they are easily damaged during trawling and freezing, limiting the accuracy of identification. Individuals from colonies got separated during trawling, making the counting of organisms imprecise, and thawing made them lose their water content such that weighting was non-representative of the fresh wet weight.
The volume filtered by the net was calculated as V = SD with S = hv and D = 2R tan −1 ( √ a/(1 − a)). The intermediate a was calculated as follows: where V is the volume filtered (m 3 ); S is the net mouth opening (m 2 ); D is the distance covered by the trawl (m); h and v are the net horizontal and vertical mouth openings (m); R = 6371 is the Earth's radius (km); and lat 1 , lat 2 , lon 1 , and lon 2 are the latitude and longitude of the start and end of the set at the chosen depth (radian). More details about N1 and N2 are given in Menkes et al. (2015).

Environmental Variables
Environmental variables were selected to explore the sampling conditions of micronekton assemblages ( Table 1). For each haul, environmental data were extracted at the date and position of the sample. Sun inclination was calculated as a function of spatial position and date (Michalsky, 1988;Blanc and Wald, 2012), and hauls were categorized into "Day" and "Night" periods (negative values corresponding to Night and positive values to Day). Bathymetry data was extracted from the ZoNéCo database at a 500 m spatial resolution (ZoNéCo, 2013), 2 and the slope of the bottom was calculated as the spatial derivate of bottom depth. The depth of the 20 • C isotherm (referred as "isotherm depth" hereafter) were extracted from the Armor3D data set (Guinehut et al., 2012) available at a weekly time scale. Chlorophyll-a surface concentration was extracted from GLOBCOLOUR (Saulquin et al., 2009) at a daily resolution. These two variables were extracted on a 1/4 • spatial grid.
Armor3D provided an ocean reanalysis of observed vertical profiles of ocean temperature (T) (Guinehut et al., 2012) and CARS vertical distribution of oxygen concentration (O 2 ) (Ridgway et al., 2002). These data are provided on a 1/4 • spatial resolution. As interannual data were not available for CARS, we used the climatology calculated between 2001 and 2016 (cruise period range) for the oxygen. We checked that Armor3D temperature interannual variations were weak enough for this period to use the climatology of mean temperature in our model to be consistent with the other vertical mean variable (i.e., the oxygen). Armord3D and CARS were used rather than field data, because CTD casts were only taken at a limited number of sampling stations. However, we systematically checked the matching between in situ CTDs and Armor3D temperature (Spearman correlation of 0.99, p-value < 1e −16 ) and between in situ CTDs and CARS oxygen (Spearman correlation of 0.74, p-value < 1e −16 ). We used the 0-600 m (the haul depth range) averages of climatologic temperature and oxygen concentration.
We ensured the absence of collinearity between variables by checking that Spearman correlations were below 0.5 (Dormann et al., 2013).

Methodological Background
Commonly used methods to identify species assemblages and homogenous region are mostly based on dissimilarity metrics between sites (Gotelli and Ellison, 2012), allowing the grouping of sites with similar species composition. The link between the groups and the environmental parameters is done in a second step, often by fitting a model between groups and environmental variables or by averaging variables by group (e.g., Williams and Koslow, 1997;Olivar et al., 2012;Ariza et al., 2016). However, such approaches do not allow statistically modeling the links between species composition and the environmental conditions . Methods based on dissimilarity metrics implicitly assume a mean-variance relationships (e.g., linear) that is not always true (Warton et al., 2012;Warton and Hui, 2017). To alleviate these limitations, the use of a modelbased approach allows exploration of species assemblages and the response of species abundance to environmental variables in only one step . Therefore, in the present study, assemblages and environment influence were tested using the RCP method, a model-based approach (Foster et al., 2013). The RCP method is based on multivariate generalized linear models (GLMs) and aims at simultaneously grouping trawls of similar species composition and describing assemblages' variability using environmental parameters .

Model Parameterization
To fit the model, only specimens identified at the species level were considered. The response variable was the number of individuals by species and by trawl haul. It was modeled using RCP methodology with a negative binomial sampling distribution and a log link function. The model was fitted with maximum penalized likelihood. The number of individuals by species and by trawl was explained as a function of the time of the day (categorical variable with two levels, day and night) and seven quantitative variables (trawl sampling depth, bathymetry, slope, chlorophyll-a concentration, isotherm depth, mean oxygen concentration, and mean temperature) ( Table 1). Therefore, the seasonal, interannual, and spatial variabilities of assemblages were included in the statistical model through the inherent time variations of the seven quantitative variables.
In the statistical model, the number of observed presences has to be large enough to predict species occurrence probability with robustness. Moreover, the response variable distribution was negative binomial, and therefore, the number of zero (i.e., of absence per species) could not be too high. For these two reasons, we only considered the most abundant species for the analyses. We choose to keep species present in more than 25% of hauls, corresponding to the presence in at least 33 hauls for each studied species. Consequently, conclusions do not bear on all micronektonic species of the region but on those most commonly found. The model predicts two outputs: (1) the probability of observing each RCP based on a set of environmental values and Variables with a temporal resolution "Clim" are climatological data (2000)(2001)(2002)(2003)(2004)(2005)(2006)(2007)(2008)(2009)(2010)(2011)(2012)(2013)(2014)(2015)(2016)(2017)(2018). Details on the variables and on the sources are provided in the main text.
(2) each RCP composition, i.e., probabilities of observing each species by RCP allowing the calculation of a diversity index.
(1) One RCP, also called an assemblage, is defined as a group of hauls where the probability of observing one species assemblage is approximately constant among trawl content and distinct from other groups of hauls (Foster et al., 2013). One RCP could be viewed as cluster of trawl content but in a probabilistic world. The method offers the advantage to do predictions of RCP's occurrence probability where hauls are absent.
(2) For one species, the probability to belong to one RCP is called species prevalence (p j i for the i th species in the j th RCP). An equivalent of the Shannon index (H j , for the j th RCP) was calculated for each RCP: with n the total number of species. The higher this index, the higher the diversity of one RCP.
Trawl hauls were different in terms of duration (from 16 to 56 min). Moreover, the total number of sampled individuals strongly varied among trawls (from 1 to 1544). To take into account these two factors, we added into the model the total number of individuals divided by the volume filtered as a sampling factor .
We used the Bayesian information criteria (BIC) (Raftery, 1995;Hui et al., 2015) and our personal knowledge of the region based on acoustics data analyses (Receveur et al., 2019(Receveur et al., , 2020 to choose the appropriate number of RCPs.

Model Predictions
To visually explore environmental drivers on the occurrence probability by RCP, partial dependence plots were produced. Those plots were created by fixing all environmental variables except one to their mean values and by predicting RCP probabilities for the remaining variable increasing from its minimal to its maximal value (Friedman, 2001).
RCP occurrence probabilities were spatially predicted for the whole New Caledonian EEZ using climatology of all four environmental variables (i.e., chlorophyll-a concentration, isotherm depth, mean oxygen concentration, and mean temperature) as well as moment of the day, trawl depth, bathymetry, and slope. We predicted the probability of each cell of the grid belonging to each RCP at a 1/4 • spatial resolution. We only predicted RCP probabilities within the sampled range of values of each environmental variable to avoid extrapolation. We then extracted the most probable RCP by grid cell. By averaging RCP probabilities by season (warm season from December to May and cold season from June to November), spatial distribution change due to seasonal variability was visually explored.
We also calculated the mean Shannon index for each spatial cell by averaging the RCP Shannon index weighted by RCP occurrence probability to visually explore spatial distribution of the diversity concerning only the 22 most abundant species.
To improve our understanding of species ecology, we calculated occurrence probability of each species by grid cell by multiplying RCP presence probability by species prevalence inside this RCP and summing those values across all RCPs. With this technique, we were able to predict occurrence probability of one species as a function of varying environmental conditions (Foster et al., 2013;Hill et al., 2017). We used this technique to predict species occurrence probability along depths for day and night periods. The diel vertical behaviors of species could then be assessed. To better identify difference between night and day vertical distributions, the difference between night and day occurrence probability (called night-day anomaly) was calculated at each depth. We defined three types of vertical behavior: -Resident with night-day anomaly smaller than 0.1. -Weakly migrant, assuming that only part of the population migrated, with night-day anomaly ranging from 0.1 to 0.2. -Migrant with night-day anomaly higher than 0.2.

Model Validation
As described in the previous paragraph, we calculated species occurrence probability based on a set of environmental variables. To validate these predicted probabilities, we did a fourfold crossvalidation procedure (Browne and Cudeck, 1989). The model was first fitted on a training data set (75% of randomly selected hauls) and then tested on a validation data set (the remaining 25% of hauls) four times. The area under the curve (AUC) was then calculated by species (a species was "present" with a predicted occurrence probability higher than 0.5). AUC indicates the probability that the model predicts a presence when a presence is observed (called a true positive). It gives a measure of the model robustness to predict the presence of the considered species: the higher the AUC, the better the model prediction.
Following Hill et al. (2017), we performed 500 bootstrap optimizations with random first values to avoid drawing broad conclusions based on a local likelihood maximum and to measure the method's accuracy. The species profile of each RCP was then averaged between the 500 simulations.
Statistical analyses were performed using R (R Core Team, 2018) version 3.5.0. RCP models were carried out using the library "RCPmod" (Foster et al., 2013.

Haul Data Set
After removing gelatinous organisms, the total number of specimens was 38,523, weighting about 50 kg and classified into 624 taxa (i.e., a mix of specimens identified at different taxonomic levels, such as family, genus, species). A total of 252 species belonging to 152 genus and 76 families were identified. Myctophidae (fish, 20,529 individuals), Sternoptychidae (fish, 3829), Gonostomatidae (fish, 765), Euphausiidae (crustacea, 6642), Oplophoridae (crustacea, 903), Phronimidae (crustacea, 630), and Pyroteuthidae (mollusk, 1471) were the most abundant families, representing more than 90% of the total number of individuals (Supplementary Table S3). The cumulative abundance of the 22 most abundant species kept in the present study corresponded to 64% of the total abundance sampled (22,473 individuals out of 35,523). The 22 species retained for the assemblage analyses belong to the families Myctophidae, Sternoptychidae, Gonostomatidae, Euphausiidae, Oplophoridae, Phrosinidae, Phronimidae, Cranchiidae, Pyroteuthidae, and Hydromylidae (Supplementary Figure S2). Therefore, the seven most abundant families from the total data set were represented in the 22 species in the retained data set.
The full data set encompassed 141 hauls in the 10-600 m depth range. Three hauls were removed because of technical problems during trawling as were three extra hauls because they did not contain specimens identified at the species level. We removed nine additional hauls for which none of the 22 most frequent species were sampled. Seven of these nine hauls were conducted in the 0-80 m vertical layer during the day and were almost empty (no more than 10 specimens per trawl). The two other hauls were conducted on unusual acoustic detections and targeted monospecific schools of a new species of the genus Polyipnus. This new species is currently under description. Three additional hauls were removed because environmental variables were not available at the sampling location. Therefore, the final data set analyzed was composed of the abundance of 22 species in 123 trawls.
As described in Section "Model parameterization, " the statistical model included only one factorial parameter: the moment of the day. The sampling of the two levels of factor day/night was balanced as each level was sampled more than 50 times (65 hauls during the day and 58 at night; Supplementary Table S2). The statistical model included sampling depth as a second variable inherent to haul. Sampled depth ranged between 10 and 600 m, done during both day and night even if daily hauls in the mesopelagic layer (i.e., deeper than 200 m) were more rare than during night (respectively, 10 and 33; Supplementary Table S2). Other variabilities (spatial and temporal, such as seasonal and interannual variabilities) were taken into account in the model by associating these environmental variables to each haul (bathymetry, slope, chlorophyll-a concentration, isotherm depth, 0-600 m mean oxygen, and 0-600 m mean temperature). Sampled values for each environmental variable were well distributed and covered the two seasons (Supplementary Figure S3).
The selection of the number of assemblages to describe was based on BIC values that increased quickly between three and seven assemblages (or RCPs) and then decreased up to 10 assemblages with similar BIC values for nine and 10 RCPs (Supplementary Figure S4). Based on that BIC information and on a previous study (Receveur et al., 2019), the three-assemblage solution was first analyzed, which kept the analysis simple, and then the nine-assemblage solution was explored, which provided more detailed and nuanced information.

Description of the Three Assemblages
When fitting the three RCPs, assemblages were almost perfectly discriminated by two factors: the vertical layer where the trawl occurred (epipelagic or mesopelagic) and the moment of the day (day or night) (Figure 2A). Indeed, RCP1 was composed mostly of hauls occurring during the day in the epipelagic layer (0-200 m), RCP2 gathered hauls in the epipelagic layer but occurring during the night, and RCP3 was composed of hauls occurring in the mesopelagic layer during both day and night. The day epipelagic community (i.e., RCP1) was dominated by Phrosina semilunata (crustacean) and Cranchia scabra and Liocranchia reinhardti (mollusks) with a relatively small number of species predicted with a prevalence higher than 0.2 (Shannon index of 2.5; Figure 2B). The night epipelagic community (i.e., RCP2) was composed of a larger number of species and was more diverse (Shannon index of 2.9). Most particularly, almost all fish species except the three Argyropelecus sp. had an occurrence probability higher than 0.3. The two Thysanopoda sp. also appeared in this assemblage although they were absent from the day epipelagic assemblage. Finally, the day and night mesopelagic community (i.e., RCP3) was characterized by the presence of three Argyropelecus sp. and prevalence higher than 0.2 for a majority of the species. This community had a similar level of diversity than the night epipelagic community (Shannon index of 2.9).
As the aim of the study was to explore nuanced assemblages and to determine the influence of parameters other than depth and day/night on the assemblages; the rest of the results focus on the nine-assemblage solution.

Description of the Nine Assemblages
The RCP model diagnostic plots indicate that the model was adequate to describe variation in data (i.e., independence of observed data, good residual distribution, and residuals' independence to fitted values) (Supplementary Figure S5). There was also no evidence of residual spatial autocorrelation by species (Supplementary Figure S6).
RCP1 of the nine-assemblage model was similar to the day epipelagic assemblage of the three-assemblage model with a small number of species and the dominance of Phrosina semilunata (crustacean) and Cranchia Scabra and Liocranchia reinhardti (mollusks) (Figure 3). RCP7 also demonstrated a low Shannon index (2.11) with few species dominating the assemblage and was mainly composed by the crustacean Phronima sedentaria and Phrosina semilunata as well as the fish Argyropelecus aculeatus.
Other assemblages were more diverse (RCP2, RCP3, RCP4, RCP5, RCP6, RCP8, and RCP9) with a Shannon index above 2.5 and an index above 2.8 for RCP3 and RCP5 (Figure 3). RCP2 was mostly dominated by Thysanopoda tricuspidata, and RCP3 was characterized by a large proportion of fish species. The high occurrence of the three Argyropelecus sp. characterized RCP4 and RCP5, and those two RCPs were correlated (Spearman correlation of 0.72; Table 2). RCP6 was similar to RCP3 (Spearman correlation of 0.66) but with a smaller proportion of all species. RCP5 was dominated by Ceratoscopelus warmingii. Finally, RCP8 and RCP9 were similar (Spearman correlation of 0.7) with Thysanopoda tricuspidata and Ceratoscopelus warmingii dominating those assemblages, and RCP9 had the largest proportion of mollusk species.

Environmental Variables Influencing Assemblages
Probabilities of RCPs' presence were especially driven by the depth and the moment of the day (Figures 2, 4). A succession of RCPs was observed along the depth (Figure 4A). During the day, mainly four RCPs were observed: RCP1 prevailed in the top 200 m, RCP4 and RCP7 between 250 and 400 m, and RCP5 dominated deeper than 400 m. Among the RCPs found during the day and sharing the same vertical layer (RCP4 and RCP7), RCP4 was mostly favored by bathymetry deeper than 4000 m, and RCP7 optimal values were mostly for high mean oxygen concentration (Figure 4B).
At night, layers deeper than 200 m were mainly also occupied by RCP5, and in 80-200 m, RCP8 and RCP3 dominated.
Low oxygen values and cold temperature were important for RCP3, and deep bathymetry for RCP8. RCP9, RCP2, and RCP6 were found in waters shallower than 200 m with a predominance of RCP9. RCP2 appeared only in waters with low oxygen concentration, RCP6 in warm mean temperature  waters (above 17.5 • C), and shallow isotherm 20 • C was the factor influencing RCP9.

Spatial and Temporal Distribution of Predicted Assemblages
As explained in Section "Model predictions, " no extrapolation was done; i.e., spatial maps of RCPs occurrence were predicted based on environmental variables within the observed range of each variable (Supplementary Figure S3). Based on night RCP depth distribution (Figure 4A), three vertical layers were selected to map out RCP probabilities on a 1/4 • spatial grid. Vertical limits of these layers were determined when dominant assemblages changed (i.e., RCP9 dominated from the surface down to 80 m, then RCP3 and RCP8 for depths between 80 and 200 m and RCP5 in the deeper mesopelagic layer). We averaged predictions over these three vertical layers and by season (warm season from December to May and cold season from June to November), and spatial predictions were used to visually examine the spatial and temporal extent of species assemblages.
During the day, RCP1 dominated almost the whole EEZ between 10 and 200 m except for the extreme north (RCP2 and RCP4) and the region around Chesterfield/Bellona reef for the whole water column (RCP6) (Figure 5). RCP5 (south) and RCP4 (north) occupied the entire EEZ deeper than 200 m except for the Chesterfield/Bellona area (RCP6). Small changes were observed across seasons: RCP spatial distributions were very similar for the two seasons with only a switch from RCP2 to RCP4 in the north between 80 and 200 m, the presence of the RCP7 in the extreme south, and a small southward extension of RCP4 during the warm season.
Night predictions were more partitioned and spatially variable; there were differences among the three vertical layers with a general north/south splitting. Over 10-80 m, RCP2 dominated the area north of 20 • S during the two seasons, and RCP9 (warm season) and RCP3 (cold season) dominated the southern area. Between 80 and 200 m, RCP3 prevailed during the cold season, whereas RCP8 prevailed during the warm season. The 200-600 m spatial distribution at night was very close to that of the day with a dominance of RCP5. RCP6 was the most probable assemblage at night and day around Chesterfield/Bellona reef.
By averaging RCP Shannon indices weighted by their corresponding RCP probabilities, spatial patterns of the day-night, warm-cold seasons, and 10-600 m, mean micronektonic diversity was assessed (Figure 6). The mean spatial distribution of Shannon indices across the EEZ showed larger values in the southwest (∼2.73) compared to the northern part (∼2.64) or the southeast part (∼2.63) of the EEZ, especially around Chesterfield/Bellona reef and south of Fairway-Lansdowne Banks. The channel between the FIGURE 6 | Proxy of Shannon index (mean H) predicted in the whole studied area on average for depths between 0 and 600 m and for day and night combined. Black triangles indicate the location of seamounts according to Allain et al. (2008).
Main Island and Loyalty Islands also had a high Shannon index (∼2.74).

Vertical and Spatial Distribution per Species
Based on the AUC criterion ( Table 3), predictions of distribution should be considered with caution for a number of species, such as Phronima sedentaria, Liocranchia reinhardti, and Hygophum hygomii with AUC smaller than 0.7. Hydromyles globulosus had the smallest AUC (0.37) of all species.
Recall that night and day vertical profiles were used to assess diel vertical behavior: A species was identified as migrant when its night-day anomaly was larger than 0.2, as weakly migrant when the anomaly was between 0.1 and 0.2, and as resident when the anomaly was smaller than 0.1 ( Table 4).
Among fish, the three marine hatchetfish Argyropelecus species were found to be resident deeper than 200 m (Figure 7). The lanternfish (family Myctophidae), Bolinichthys longipes, Ceratoscopelus warmingii, Diaphus perspicillatus, Hygophum hygomii, Hygophum reinhardtii, Lobianchia gemellarii, and Symbolophorus evermanni were identified as migrant. B. longipes, the two Hygophum sp., and S. evermanni were predicted to stay deeper than 600 m and the others deeper than 400 m during the day. During the night, the predicted occurrence was higher in the 100-300 m layer than for the 0-100 m vertical layer for B. longipes, C. warmingii, D. perspicillatus, and L. gemellarii. Finally, Gonostoma atlanticum was identified as weakly migrant from 400 to 600 m during the day to 100-200 m during the night.
For crustaceans, Janicella spinicauda, Oplophorus gracilirostris, Thysanopoda monacantha, and Thysanopoda tricuspidata were found to have a migrant behavior, migrating from 400 to 600 m in the day to 100-300 m at night for the three first species and migrating from 400 to 600 m in the day to 0-200 m at night for T. tricuspidata. Phronima sedentaria was identified as resident in the 200-600 m vertical layer. Phrosina semilunata was identified as resident in the 10-600 m layer.
For mollusks, a weakly migrant behavior was associated to Hydromyles globulosus from 400 to 600 m in the day to 0-200 m at night and a migrant behavior to Pterygioteuthis giardi and Pterygioteuthis microlampas from 400 to 600 m in the day to 0-100 m at night. On the contrary, Liocranchia reinhardti and Cranchia scabra were classified here as epipelagic resident (i.e., 0-200 m). Finally, the average spatial distribution over the 10-600 m depth layer and over day and night was investigated for each of the 22 species (Figure 8).
Regarding crustacean species, Janicella spinicauda was predicted to be more abundant in shallow waters, especially around Chesterfield reef. On the contrary, Oplophorus gracilirostris, Phronima sedentaria, and Thysanopoda monacantha were predicted to be absent around Chesterfield reef with the three latter species more abundant south of 20 • S. Phrosina semilunata was found in shallow waters close to the coast and, on average, was more abundant in the south than in the north of the EEZ. Finally, Thysanopoda tricuspidata was mainly predicted in the extreme north of the EEZ.
Regarding mollusks, Cranchia scabra and Hydromyles globulosus had higher probability of presence in the southern part of the EEZ with a homogenous repartition in the south. Pterygioteuthis giardi was also predicted to be more abundant in the south but with a less homogenous distribution and especially in deeper waters. Spatial patterns were weaker for Liocranchia reinhardti and Pterygioteuthis microlampas with larger abundance predicted around Chesterfield reef, and P. microlampas was slightly more abundant south of 20 • S.
Finally, among fishes, three marine hatchetfish, Argyropelecus aculeatus, A. hemigymnus, and A. sladeni, were predicted to have the same spatial patterns with a higher abundance in regions characterized by high bathymetry slope and lower abundance around Chesterfield. Bolinichthys longipes was mainly predicted as present in the western part of the EEZ. Ceratoscopelus warmingii, Diaphus perspicillatus, and Hygophum hygomii were predicted with slightly higher probability of presence in the south of the EEZ, especially south of Chesterfield/Bellona reef and of Fairway-Lansdowne Banks. The two species of Myctophidae, Hygophum reinhardtii, Symbolophorus evermanni, and the species of Gonostomatidae, Gonostoma atlanticum, had similar spatial distribution and were found relatively homogeneously in the entire New Caledonian EEZ but with a slightly higher probability in the northern part of the EEZ. Finally, the presence of Lobianchia gemellarii was predicted as present almost everywhere.
of the day and the sampling depth were the two parameters influencing the three-assemblages model. There were larger proportions of fish and crustaceans in day and night mesopelagic assemblages and a large proportion of mollusks in the night epipelagic assemblage. By splitting the data set into nine major assemblages, the moment of the day and the depth of sampling remained the main driving parameters; however, influence of other environmental variables was detected. Variables such as mean oxygen concentration, mean temperature, and bathymetry also influenced micronekton assemblages and induced a specific north/south pattern. Three major daytime assemblages were identified, covering the whole EEZ, and showed a depth layer structure. One assemblage dominated in waters shallower than 200 m, and the other two were found in deeper waters (down to 600 m) in the north and in the south. The four major night communities were more diverse with a patchier spatial distribution and two depth changes, at approximately 80 and 200 m. The predominant night northern assemblage (RCP2) in the top 80 m was dominated by mollusks and fish species, whereas the southern assemblage (RCP9) was dominated by crustacean and fish species. The 80-200 m nighttime assemblages were characterized by more cephalopods and fish in the north than in the south. Generally, assemblages' spatial distribution indicated that the southwest part of the EEZ was more diverse than other FIGURE 7 | Species occurrence probability predicted by depth according to day (light gray line) and night (black line). Red dotted lines show positive anomaly between night and day (i.e., night probability -day probability). Vertical red stripes highlight the [0.1-0.2] probability range used to determine the species migratory behavior (resident: night-day anomaly smaller than 0.1; weakly migrant: night-day anomaly between 0.1 and 0.2; and migrant: night-day anomaly higher than 0.2). Colored outline around species name indicates the migratory behavior. regions. Finally, the spatial distribution of 22 frequent species was described as well as their vertical behavior.

Statistical Framework
To our knowledge, previous studies on micronekton mostly used clustering methods based on various dissimilarity metrics (e.g., Drazen et al., 2011;Parker et al., 2015;Eduardo et al., 2018;Flynn et al., 2018). Here, an alternative was offered to simultaneously classify trawl data in homogenous groups and study their links with environmental drivers. We confirmed Hill et al.'s (2017) conclusions about RCP capacities and extended the application domain moving from demersal fish of the Kerguelen Plateau to pelagic micronektonic communities in a subtropical region. The method was especially useful for micronekton species that had a complex vertical habitat and behavior. Indeed, the approach allowed taking into account the DVM signal by including moment of the day and trawl depth as well as looking at the influence of oceanographic parameters on more subtle variations in assemblages. Moreover, the species occurrence probabilities were predicted for all 0-600 m depths, and therefore, an incomplete or unbalanced data set composed of a small observation number (e.g., missing sample depths) could be used to assess species vertical habitat. Hence, the model was no longer limited to the depth sampled as presented by Williams and Koslow (1997) or Nishikawa et al. (2001). A complete description of vertical habitat by species could be especially useful to integrate in biogeochemical models over depth for example (Aumont et al., 2018). The largest weakness of that statistical modeling approach was the removal of the less frequent species. Indeed, to have robust statistical relationships, the number of observed occurrences had to be significant (i.e., more than 33 presences in the present study). Consequently, all conclusions were only made on 22 species although more than 250 were identified in the region. For these reasons, conclusions on diversity spatial distribution, measured here with the Shannon index, concerned only a small number of species. A different analysis including all species would be an interesting next step to describe the total micronektonic species diversity but would require the design of another statistical methodology. There was no link between the total number of observations and the model ability to efficiently predict the probability of presence, quantified by AUC. For example, Phronima sedentaria had one of the worst AUCs, whereas it was the third most frequent species. It was assumed that species with a small AUC, such as Phronima sedentaria, had large environmental tolerance, at least for variables included in the RCP model. This large tolerance window complicated the fit of the relationships between environmental gradient and the probability of presence. Therefore, to improve predictions of spatial distribution for these species, sampling may need to be performed in more contrasted environmental conditions. The other hypothesis was that other environmental variables (e.g., the salinity) were missing in the model, and it could be more influential over those species.

Ecoregionalization of Prey Communities
It should be noted that the trawl used in this study did not allow sampling of micronekton communities in the top 10 m, resulting in the likely absence of an important community as previously shown in the southern ocean (Flores et al., 2014). Moreover, mid-water trawls are known to select some species due to net avoidance behavior by some organisms and to select some specific size ranges of organisms (Heino et al., 2011;Kaartvedt et al., 2012). This trawl selectivity could be tested by using different trawls to sample the same layers (De Robertis et al., 2017). Finally, as the trawl stays open during setting and hauling, layers sampled could be mixed with other layers during these two periods. In the absence of an open-close trawl, future specific hauls and sampling strategies are needed to answer this question. We checked, and in the literature, the vertical habitats of the 22 studied species were deeper than 0-10 m. Moreover, as the studied species in the present study were the most abundant, there were certainly mostly collected during the 30 min of hauling in comparison to the few minutes in the upper layers.
During the night, communities of the northern region were relatively stable across seasons. On the contrary, communities in the south changed in waters shallower than 200 m during the night across seasons. Those differences could be explained by environmental conditions as the assemblages predicted in the south of 20 • S were supported by high oxygen values and cold temperature, whereas northern assemblages were supported by low oxygen and warm temperature.
The north-south transition observed between the communities RCP2-RCP8 and RCP3-RCP6-RCP9 at night in the surface layer around a latitude 20 • S appeared to be mainly linked to a temperature meridional gradient. Spatial patterns of the temperature in the New Caledonian EEZ indeed shows a strong north-south gradient, well delimited around 20 • S (Menkes et al., 2015), which also reflects different oceanographic conditions more generally (the northern part is influenced by warm waters coming from the equator, and the southern part by cold and rich waters coming from the southeast). Therefore, 0-600 m mean temperature appeared to delimit large homogeneous hydrodynamic regions constraining micronekton habitats rather than a direct biological impact. The 0-600 m averaged oxygen concentration was another important variable influencing a community's probability of presence. The link between oxygen concentration and micronekton vertical distribution has been studied previously using acoustic estimates. For instance, in the South Pacific, the lower vertical expansion of epipelagic micronekton layers was found to be limited by the minimum oxygen depth (Bertrand et al., 2010). Bianchi et al. (2013) demonstrated on a global scale that the higher the oxygen concentration at depth, the deeper the micronekton deep layer was during the daytime. Results from the present study as well as from Suntsov and Domokos (2013) in the north Pacific show that oxygen also influenced the micronekton species composition with higher oxygen values linked to a richer community with potentially a larger proportion of mollusks.
In addition to oceanographic parameters, we observed that micronekton assemblages were influenced by bathymetry, which is very diverse in New Caledonia with ridges, seamounts, trenches, and sedimentary basins (Gardes et al., 2014). According to Escobar-Flores et al. (2018), shallower waters offer lowerquality habitat for mesopelagic organisms and may induce a densification of organisms in the upper layers of the water column. Spatial predictions of the Shannon index showed a higher diversity around seamounts in the north of the Loyalty Islands (i.e., around the Astrolabe and Petrie reefs) and north and south of Chesterfield/Bellona reef. The small overall range of Shannon values predicted (about 0.15) could be linked to the methodology used and because it was calculated only on 22 species. Climate change will have a profound impact on ecosystem structure (Lotze et al., 2019), but impacts on the micronekton component are largely unknown (St. John et al., 2016). Based on the present study, the predicted decrease of oxygen concentration (Stramma et al., 2010(Stramma et al., , 2008 could favor RCP2 (dominated by Thysanopoda tricuspidata) and RCP3 (large proportion of fish) in the future, whereas the predicted temperature increase (Bopp et al., 2013) could favor RCP6 (Chesterfield assemblage), RCP8 (dominated by Ceratoscopelus warmingii), and RCP9 (large mollusk proportion). Moreover, changes in oxygen minimal depth would change the vertical distribution of micronektonic communities, potentially making them more vulnerable to epipelagic predators (Stramma et al., 2010;Olson et al., 2014). As predictions of future marine oceanographic conditions are becoming more and more precise and realistic, one next step could be to forecast future communities by forcing the RCP model with climate change model outputs with both present and future simulations to measure change in communities.
The strong latitudinal shift in micronekton assemblages found in our study mirrored Flynn and Marshall (2013) and Flynn et al. (2018) in bioregionalization of the eastern Australia region. Communities described in the present study were in agreement with previous studies in the Coral Sea with the dominant fish species belonging to the families Gonostomatidae, Sternoptychidae, and Myctophidae (Grandperrin, 1975;Flynn and Paxton, 2012;Sutton et al., 2017). However, the 22 most abundant species identified were completely different from the 20 most abundant species described in the southern of Tasmania even if some species were found in both studies (A. aculeatus, A. hemigymnus, C. warmingii) (Williams and Koslow, 1997). Some abundant species in this study were also found in Hawaiian waters (18 • N-21 • N) (e.g., B. longipes, C. warmingii, D. perspicillatus, G. atlanticum, and J. spinicauda) (Reid et al., 1991;De Forest and Drazen, 2009;Drazen et al., 2011) or close to Papua New Guinea (0 • N-20 • N) (e.g., B. longipes, C. scabra, D. perspicillatus, H. reinhardtii, and S. evermanni) (Duffy et al., 2017). Given those similarities and differences, an analysis on the scale of the Pacific Ocean would be a large improvement for the understanding of prey distribution.

Micronektonic Species Behavior
Vertical behaviors and spatial distribution of the 22 most frequent species were investigated in the New Caledonian EEZ. Regarding the vertical behavior, previously described vertical behavior was confirmed for 10 species (the three marine hatchetfish Argyropelecus, Ceratoscopelus warmingii, Hygophum reinhardtii, Lobianchia gemellarii, the two Pterygioteuthis species, Symbolophorus evermanni, and Thysanopoda tricuspidata); new vertical behavior was described for three species (Hydromyles globulosus, Janicella Spinicauda, and Oplophorus gracilirostris), and vertical behavior was refined for seven species (Bolinichthys longipes, Diaphus perspicillatus, Gonostoma atlanticum, Hygophum hygomii, Liocranchia reinhardti, Phronima sedentaria, and Thysanopoda monacantha) ( Table 4). No clear vertical patterns were associated to C. scabra and P. semilunata. These results indicate that the RCP method appears to be a valuable technique to determine micronekton vertical behavior. Spatial predictions of the species distribution were very similar to the bathymetry pattern, highlighting the importance of this parameter in species distribution and also in assemblage composition.
Concerning C. scabra and P. semilunata, there was a larger proportion of these species in the epipelagic layer (0-200 m) during the day in comparison to the night. A sampling artifact could explain this result: The presence of a large mix of species during the night could be conducive to a less efficient sampling of those two species and their absence in the epipelagic night hauls. This sampling artifact appears more plausible based on literature results than an inversed migration (from surface at day to deep at night).
The present study participated in improving the understanding of how sensitive all micronekton species are to oceanographic variables (physical and biogeochemical), information needed to fully evaluate the impact that climate change could have on top predators' food (Choy et al., 2016). Eleven species (A. aculeatus, A. sladeni, C. scabra, H. globulosus, L. reinhardti, O. gracilirostris, P. sedentaria, P. semilunata, P. giardi, P. microlampas, and T. tricuspidata) have been observed in predators' stomach contents in the western and central Pacific Ocean database of the Pacific Community 3 (Sanchez et al., 2018) (e.g., Coryphaena hippurus, Kajikia audax, Katsuwonus pelamis, Lampris guttatus, Sphyraena barracuda, Thunnus albacares, Thunnus obesus, Xiphias gladius). Lanternfish species (or Myctophidae) are also a non-negligible part of predators' diets (Brodeur et al., 2005;Williams et al., 2014), and although Myctophidae are found very often in stomach contents (Elodie Vourey, personal communication), their quick digestion limited their identification at the species level and explained their absence in the previously cited species list. The 22 most frequent micronekton species found in this study were also observed in other predators' diet studies in the Pacific Ocean (Griffiths et al., 2009;Kuhnert et al., 2012;Olson et al., 2014;Portner et al., 2017), corroborating the importance of this group in the pelagic ecosystem and reinforcing the need for a better understanding of their dynamics (St. John et al., 2016). The predicted spatial distribution by species would help to improve the predator niche modeling in future studies.

Link to Acoustics
Two previous studies analyzed the acoustic data from the same Nectalis surveys to link the mean 20-120 m acoustic echo-intensity and the 0-600 m vertical acoustic profiles to environmental conditions (Receveur et al., 2019(Receveur et al., , 2020. They proposed a bioregionalization of the New Caledonia EEZ based on the micronekton vertical distribution and identified three homogenous regions during the day: (i) north of 21 • S with a weak deep scattering layer (DSL) and a weak shallow scattering layer (SSL); (ii) south of 21 • S and west of 165 • E (intense SSL at 80 m and very intense SSL at 30 m); and (iii) south of 21 • S and east of 165 • E (intense SSL and very intense DSL at 550 m). The spatial extent of assemblages found in the present study were similar to the three homogenous regions defined based on acoustic profiles. This result reinforced the useful contribution of acoustics data to characterize mesopelagic organisms' distribution. However, gelatinous organisms are known to be responsible for a significant part of the acoustic signal (Warren, 2001;Proud et al., 2018), and the development of methods capable of including these organisms in trawl analyses is crucial to more accurately compare acoustics and haul data. Therefore, a study focusing on the direct link between trawl content and multifrequency acoustic signal is needed to fully understand these complex links and to be able to quantify a total micronekton biomass for the region. Toward that end, it would be necessary to group species that respond similarly in terms of acoustics (e.g., "fluid-like, " "with gaz"). Yet, in order to deduce biomass estimation from acoustic data, knowledge of a community's species composition and of its target strengths is primordial. Given the high species diversity found in our study, going further into transforming acoustic signal into biomass appears very challenging in the region (Fielding, 2004;Godø et al., 2009).

Natural Park of the Coral Sea Management
The creation of the Natural Park of the Coral Sea is recent (2014), and the management plan is still under development. The first management measures focused primarily on isolated reefs and islands and not on the oceanic areas. 4 Recently, part of Chesterfield, Bellona, Entrecasteaux, Petrie, and Astrolabe (see Figure 1 for location of those features) have been classified as strict nature reserves (category I of "no human activities authorized" according to IUCN protected area classification or "1-No take-No go" according to the classification of Horta e Costa et al., 2016), including waters with a bathymetry shallower than 1000 m (Supplementary Figure S7). Based on results from this study, the Petrie and Astrolabe reefs seem to aggregate a high number of mesopelagic organisms' compared to the surrounding area. Hence, these areas appear to be good candidates to implement a higher degree of protection level on offshore areas as well. The southwestern part of the EEZ (south of 22 • S and west of 164 • E) could be another area of interest for management as the micronekton assemblage (RCP3) was composed of a large number of species identified as prey for top predator species important for the pelagic ecosystem.

CONCLUSION
By analyzing the abundance of the 22 most frequent species collected in 123 pelagic trawls and with the help of an innovative multivariate statistical approach, variability in micronektonic assemblages around New Caledonia in the Coral Sea was measured and linked to oceanographic drivers. The largest variability highlighted was function of depth due to DVM. Moreover, the method helped to identify areas that may 4 https://mer-de-corail.gouv.nc/en be regarded as conservation areas of interest. The used could also valuably contribute to assess the climate change impact on micronektonic assemblages and distribution, a crucial knowledge with consequences for the whole pelagic ecosystem.

DATA AVAILABILITY STATEMENT
The datasets generated for this study are available on request to the corresponding author.

ETHICS STATEMENT
Ethical review and approval was not required for the animal study because we had the necessary authorizations to take these samples. In addition, our sampling protocol followed that widely used in other trawling studies to collect on micronektonic specimens.

AUTHOR CONTRIBUTIONS
AR performed the statistical analysis. AR, VA, and EV analyzed the data and interpreted the results with the help of CM, AL-D, and FM. AR, VA, and CM wrote the manuscript with the help of EV, AL-D, and FM. All the authors contributed to and provided feedback on various drafts of the manuscript.

FUNDING
This work has been conducted with the financial assistance of the BEST2.0 program of the European Union. The content of this document is the sole responsibility of AR and can under no circumstance be regarded as reflecting the position of European Union. This work also benefited from the financial support of the French Biodiversity Agency (OFB, Office Français de la Biodiversité).

ACKNOWLEDGMENTS
We thank R/V ALIS officers and crews and science parties who participated to the cruises for which data are included in the present manuscript. We also thank SPC staff and external taxonomic experts who contributed to the identification of the micronekton species.