Comparing Deep-Sea Larval Dispersal Models: A Cautionary Tale for Ecology and Conservation

Larval dispersal data are increasingly sought after in ecology and marine conservation, the latter often requiring information under time-limited circumstances. Basic estimates of dispersal [based on average current speeds and planktonic larval duration (PLD)] are often used in these situations, usually acknowledging their oversimplified nature, but rarely with an understanding of how oversimplified those assumptions are. Larval dispersal models (LDMs) are becoming more accessible and may produce “better” dispersal predictions than estimates, but the uncertainty introduced by choosing one underlying hydrodynamic model over another is rarely discussed. This case study uses theoretical and simplified deep-sea LDMs to compare the passive predictions of dispersal as driven by two different hydrodynamic models (HYCOM and POLCOMS) and a range of informed basic estimates (based on average current speeds of 0.05, 0.1, and 0.2 m/s). The aim is to provide generalizable insight into the predictive variability introduced by (a) choosing a model over an estimate, and (b) choosing one hydrodynamic model over another. LDMs were found to be up to an order of magnitude more conservative in dispersal distance predictions than even the slowest tested estimate (0.05 m/s). The difference increased with PLD, which may result in a bigger disparity for deep-sea species predictions. Although the LDMs were more spatially targeted than the estimates, the two LDM predictions were also significantly different from each other. This means that choosing one hydrodynamic model over another could result in contrasting ecological interpretations or advice for marine conservation. These results show a greater potential for hydrodynamic model variability than previously appreciated by larval dispersal ecologists and strongly advocate groundtruthing predictions before use in management. Advice is offered for improved model selection and interpretation of predictions.

INTRODUCTION
Larval dispersal is an important ecological process. Many benthic animals rely upon this phase as their only means to colonize a new area, making the process pivotal in individual survival as well as in population dynamics and persistence.
Existing global efforts to establish networks of Marine Protected Areas (MPAs) are hampered without knowledge of larval dispersal. An effective self-sustaining network needs each MPA to supply larvae to both itself and another for protected populations to persist (Roberts et al., 2003), something that will only be achieved by chance without dispersal data on which to base informed decisions. It is therefore imperative that we gather information on larval dispersal as soon as possible.
The most basic way to fulfill this need is to estimate larval dispersal using a distance/speed/time calculation based on average current speeds and planktonic larval durations (PLDs). This technique, hereafter termed "an estimate," while highly simplistic, takes very little time, money, effort, or expertise to produce. Consequently, estimates have been used both in ecology (e.g., McClain and Hardy, 2010) and conservation (e.g., Roberts et al., 2010), although always with acknowledgment of their oversimplified nature and the need for more detailed study. However, it is hard to quantify just how oversimplified these estimates may be.
Among the more advanced methods that exist for identifying dispersal patterns, larval dispersal models (LDMs) are gaining popularity in ecology and conservation (e.g., Aleynik et al., 2018; Kenchington et al., 2019). An LDM is a simulation of dispersal driven by a numerical hydrodynamic model to produce maps predicting which populations may be linked. As a simulation, it does not require expensive and difficult-to-obtain biological samples beyond initial positional information, but it should integrate any other relevant biological and ecological data (e.g., larval behavior, mortality, or buoyancy) should they be available (see Metaxas and Saunders, 2009). Furthermore, the LDM can later be assessed and improved by groundtruthing with other sample-requiring methods (e.g., geochemical tracers and population genetics; Cowen et al., 2007). The ability to "provide an answer now" without requiring the time, money, and effort for additional sampling makes the LDM method particularly attractive for marine conservation's urgent needs, especially in the deep sea (Hilário et al., 2015).
However, it is well acknowledged that, despite the specialist skills needed to produce an LDM, their quality and accuracy may be highly variable. Poor bathymetry, temporal and spatial averaging, a lack of sub-mesoscale processes, and unknown or estimated biological parameters can all add to the error included within an LDM (Putman and He, 2013). The true extent of such (often unavoidable) predictive inaccuracies will always remain elusive until groundtruthing (e.g., population genetics; see Foster et al., 2012; Sunday et al., 2014) and validation can take place: essential steps in any modeling process.
Once groundtruthed, the worth of these models can be quantified, but there remains a question as to how useful un-groundtruthed LDM predictions are, and whether they should be used in preliminary management decisions. If the errors in such un-groundtruthed predictions are large, then perhaps the crude, but fast and less expertise-demanding, back-of-the-envelope estimates may be just as useful. Shanks (2009) did examine the difference between estimated and modeled predictions of dispersal distance while exploring the influence of PLD on dispersal. He found the estimate to be the least conservative prediction (an overestimate), with an LDM being up to an order of magnitude more conservative. However, the LDM also overestimated the predicted distance of dispersal when compared with those approximated from genetic data.
Shanks's study focused on shallow-water and coastal species, which are concentrated in areas of arguably more complex hydrodynamics and faster current speeds than the deep sea. There is therefore potential for a greater similarity between estimated and modeled dispersal predictions if a similar study were focused on deep water.
When assessing the stability of model predictions without newly sampled validation data, one approach often used in other ecological modeling disciplines is a model comparison (e.g., Elith and Graham, 2009; Piechaud et al., 2015). It stands to reason that if all different models are trying to represent reality, there should be some similarity in their predictions, provided that their assumptions are suited to the task at hand. Exploring the differences and similarities between models promotes a greater understanding of which variables control predictions and where previously unexplored sources of error may lie.
As an ecologist running an LDM, the selection of a hydrodynamic model to power simulations is the most difficult choice to make. The huge number of models available is testimony to the variation in how they are set up, with different spatial and temporal averaging, target areas, target processes, and numerical solutions. Furthermore, each model is often supplied as source code and customized by the user, so any one model name (e.g., HYCOM, POLCOMS, NEMO, MITgcm, ROMS) may represent a family of models where each individual iteration has been tailored to a different purpose. It is therefore easy to see why hydrodynamic models can appear as a black box to ecologists looking to utilize one as part of an LDM.
Despite the glut of options, model choice will be restricted first by study location and finding suitable parameterization (e.g., see advice from Werner et al., 2007; North et al., 2009; Fossette et al., 2012), but also by access (e.g., proprietary issues). Deep-sea studies, for example, due to the distance from shore and large spatial scales, are likely to be limited to global circulation models (GCMs), shelf models, and occasional custom-built models from local observations (which carry their own limitations, see Fossette et al., 2012).
At the end of the model selection process you may be faced with only a couple of imperfect models, each potentially suitable in different ways, that are hard to choose between. Allowing for parameterization differences, a comparison of the dispersal predictions obtained from two such hydrodynamic models must logically display some difference. The question is whether that difference is negligible, and therefore potentially cross-validating, or substantial, making groundtruthing absolutely necessary before either model prediction has value.
The need to source additional data to confirm or reject model predictive ability should be considered mandatory regardless of the results of model cross-validation, but if cross-validated models are found to broadly agree they would provide a first level of validation for each other and therefore allow meaningful research output before additional (in the deep sea, potentially considerable) groundtruthing costs are outlaid.
This study will therefore investigate: (1) The difference between estimated and modeled predictions of larval dispersal in the deep sea, to extend Shanks's (2009) study into deep water, and understand the value of an LDM over an estimate, and (2) The difference between the predictions of LDMs driven by two different hydrodynamic models, each selected as potentially suited to larval dispersal simulations in the study area. This will help us understand whether the hydrodynamic models are cross-validatory, or, more worryingly, contradictory, thereby reducing the trustworthiness of LDM results prior to groundtruthing.
Note that this study does not offer a formal validation or criticism of either of the hydrodynamic models tested, nor does it seek to recommend one over the other (even within larval dispersal modeling a different one may benefit one scenario over the other). Instead, it aims to highlight the differences and similarities between LDMs driven by two example hydrodynamic models, and so to offer insight into the importance of model choice and the value of modeled outputs.
The results of this study should be beneficial to both ecologists and marine managers in all marine settings. For them, we are not aiming to provide ecological answers for any specific species, but instead hope to provide interpretive guidance on the impact of: (a) Choosing to run an LDM instead of using estimates, and (b) Selecting one hydrodynamic model over another as the basis of any LDM.

Study Area
This study was conducted in the NE Atlantic in offshore deep water west of the United Kingdom and Ireland (Figure 1). The Rockall Trough is one of the best studied areas of deep sea in the world, providing historic datasets for at least a qualitative groundtruthing of predictions (Ellett et al., 1986; Holliday and Cunningham, 2013). Arguably this area may not be representative of all deep-sea regions as it has more rapidly changing bathymetry (and therefore bathymetrically induced hydrographic features) than, for example, an area of flat abyssal plain, although abyssal plains too are known to experience complex hydrodynamics (e.g., Gardner et al., 2017). This could, however, make for a fairer comparison to complex shallow water and coastal hydrodynamics and also promotes a greater similarity to estimate predictions, which represent a null model of maximal uncertainty and spreading of larvae. More importantly, this study area was chosen as the region where further species-specific larval dispersal work would be carried out. Therefore, this assessment of model suitability must be carried out in the same region to give results relevant to the subsequent applied studies. We recommend all dispersal modelers do similar regional sensitivity (e.g., Ross et al., 2016) and suitability tests prior to any species-specific work, in order to best interpret the results of their simulations.

Estimate Calculation
This deep-sea case study relates findings to a figure published in McClain and Hardy (2010). The figure, notably with a caption full of caveats, displays potential larval dispersal distances of deep-sea fauna based on two different potential deep-sea averaged current speeds derived from Havenhand et al. (2005). This study will use a range of three possible average current speeds (0.05, 0.1, and 0.2 m/s) as the estimates, after Ellett et al. (1986).
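The estimate described above is a simple distance = speed × time calculation. The following is a minimal sketch of that arithmetic, not the authors' own script: the function name is hypothetical, the three speeds are those used in this study, and the PLD values are the reference points (35, 69, and 270 days) used later in the analysis.

```python
SECONDS_PER_DAY = 86_400

def estimate_distance_km(speed_m_s, pld_days):
    """Straight-line distance (km) covered at a constant speed over a PLD.

    Assumes a constant current speed and direction for the whole larval
    duration, which is exactly the simplification the text cautions about.
    """
    return speed_m_s * pld_days * SECONDS_PER_DAY / 1000.0

speeds_m_s = (0.05, 0.1, 0.2)   # average current speeds used as estimates
plds_days = (35, 69, 270)       # reference PLDs used in the analysis

for speed in speeds_m_s:
    cells = ", ".join(
        f"{pld} d: {estimate_distance_km(speed, pld):,.0f} km"
        for pld in plds_days
    )
    print(f"{speed} m/s -> {cells}")
```

Under these assumptions the slowest estimate already exceeds 1,000 km at 270 days, which gives a sense of scale for the buffer zones mapped later.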

LDMs
Two LDMs were run in this study, each consisting of a single particle simulator paired with one of the two hydrodynamic models; additional details on all model algorithms and parameterizations are available in Supplementary Material S1.

Particle Simulator: Connectivity Modeling System (CMS)
The CMS was used as the particle simulator (hereafter "simulator") for both LDMs. There are many types of simulator available, but, without in-depth numerical modeling expertise, ecologists are likely to be limited to the use of offline simulators paired with the outputs from a hydrodynamic model (see the supplementary table of Hilário et al., 2015, for a list of offline simulators and their compatibilities). The CMS is one such offline simulator. It is both freely available and designed especially with LDM in mind (v1.1, https://github.com/beatrixparis/connectivity-modeling-system; Paris et al., 2013). This simulator has shown success in recent estimates of species connectivity (Wood et al., 2014; Ross et al., 2017; Baeza et al., 2019) as well as driving investigations of abyssal hydrodynamic transport (Van Sebille et al., 2013) among other studies. While it is easy to integrate biological data, this study uses the simulator in its simplest configuration, simulating passive dispersal for the cleanest comparison (and acting as a precursor to later, more complex biologically parameterized simulations). An hourly particle tracking timestep was used as decided by model sensitivity testing (Ross et al., 2016), and positional outputs were recorded daily.

Hydrodynamic Model 1: POLCOMS
POLCOMS is a shelf and coastal model used in United Kingdom and Irish waters. This version comes from Plymouth Marine Laboratory, United Kingdom. It was previously used by the United Kingdom Met Office in weather forecasting, a fact which might recommend it above other models in this area (Holt et al., 2001; Wakelin et al., 2009), and has been extensively validated over the United Kingdom surrounding waters (Holt et al., 2005). The 1/6° × 1/9° (c. 12 km²) resolution offers an eddy-resolving solution; however, it can only capture major eddies (c. 64 km in size based on needing six or more data points to adequately resolve an eddy; Lacroix et al., 2009), making this the coarser of the two models trialed. The model was run with 40 terrain-following depth layers (sigma-levels), although outputs were interpolated to a z-level format (a list of set depth levels) using Matlab (v.R2013a) in order to make them compatible with the CMS. POLCOMS has been used in several dispersal studies to date (e.g., Lee et al., 2013; Phelps et al., 2015).

Hydrodynamic Model 2: HYCOM
HYCOM is a freely available global hydrodynamic model developed by the US Navy (Chassignet et al., 2007). It is uniquely set up to use a hybrid of water mass following, terrain following, and depth specific vertical layers, changing with the underlying topography, which may make it well suited to deep-sea studies. The outputs, however, are in the z-level format required by the CMS (but may lose some of the hybrid grid details in the reformatting). The 1/12° resolution (c. 8 km × 4 km in the study area) allows smaller eddies (c. 48 km wide) to be captured than in POLCOMS, although this is still coarse relative to the multiple scales of oceanographic processes acting upon a larva. The global nature of HYCOM may be an upside for wide-ranging studies but is also a downside, as the validation of the model was performed on a global scale and it may therefore not validate so well on a local scale (Fossette et al., 2012). HYCOM has already been used in multiple dispersal studies (Christie et al., 2010; Mora et al., 2011; Vasile et al., 2018), including in the deep sea (Adams et al., 2011; Young et al., 2012; Ross et al., 2016, 2017).

Larval Releases
Theoretical "larvae" were released from three locations in the Rockall Trough in order to access different current regimes in the area: Rosemary Bank in the north, Anton Dohrn Seamount in the center, and Porcupine Bank in the south (see Supplementary Material S2 for exact positions). Releases were made from 16 positions per depth band from four depths (700, 1,000, 1,300, 1,500 m). Releases were made daily for 366 days from 4th January 2003 to 4th January 2004. All particles were tracked for 270 days in line with McClain and Hardy (2010), although daily positional outputs allow sub-setting of this PLD.
Simulations were run without additional subgrid-scale diffusivity parameters, which would have added a random kick to particles at a regular timestep to represent unresolved small-scale hydrodynamic processes. These parameters were excluded because the additional randomness would complicate interpretation of the difference in hydrodynamic model instruction, would require a different setting for each model, and would be subjectively chosen as a nest-wide parameter. This decision is in line with the study undertaken by Shanks et al. (2003) in comparison to Siegel et al. (2003). As a result of excluding extra subgrid-scale diffusivity, only one particle was released per day, as simultaneous releases would follow identical tracks. Note that, were this a species-specific study, we would advocate adding such subgrid-scale diffusivity parameters.
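The consequence of omitting the random kick can be illustrated with a toy example. The sketch below is not the CMS algorithm: it assumes a one-dimensional, spatially uniform current, a standard random-walk displacement of sqrt(2·K·Δt) times a unit normal deviate, and a hypothetical diffusivity value. Only the hourly timestep is taken from the text.

```python
import math
import random

def advect_1d(x0_m, u_m_s, n_steps, dt_s, k_m2_s=0.0, seed=None):
    """Toy 1-D particle track: advection plus an optional random kick.

    k_m2_s is a hypothetical horizontal diffusivity (m^2/s); zero means
    purely deterministic advection, as in the simulations described here.
    """
    rng = random.Random(seed)
    x = x0_m
    track = [x]
    for _ in range(n_steps):
        x += u_m_s * dt_s                       # deterministic advection
        if k_m2_s > 0.0:
            # random-walk step representing unresolved small-scale motion
            x += math.sqrt(2.0 * k_m2_s * dt_s) * rng.gauss(0.0, 1.0)
        track.append(x)
    return track

DT_S = 3600        # hourly tracking timestep, as in the text
N_STEPS = 24       # one day of tracking
U_M_S = 0.05       # illustrative uniform current speed

# Two simultaneous releases without diffusivity follow identical tracks.
plain_a = advect_1d(0.0, U_M_S, N_STEPS, DT_S)
plain_b = advect_1d(0.0, U_M_S, N_STEPS, DT_S)
print("Identical without kick:", plain_a == plain_b)

# With a (hypothetical) diffusivity the same two releases diverge.
kicked_a = advect_1d(0.0, U_M_S, N_STEPS, DT_S, k_m2_s=10.0, seed=1)
kicked_b = advect_1d(0.0, U_M_S, N_STEPS, DT_S, k_m2_s=10.0, seed=2)
print("Separation after one day (m):", abs(kicked_a[-1] - kicked_b[-1]))
```

In a real LDM the velocity varies in space and time, so the kick also changes which currents a particle subsequently samples; the toy example shows only why replicate releases are redundant when the kick is switched off.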
Neither of this study's hydrodynamic models supplies vertical velocity fields (w) due to their large spatial domains, so simulations in this case are effectively 2-dimensional.
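The release design described in this section can be tallied as a quick consistency check. This sketch assumes that one particle is released per position per day and that both end dates are inclusive; the variable names are illustrative only.

```python
from datetime import date

first_release = date(2003, 1, 4)
last_release = date(2004, 1, 4)

# Inclusive count of daily releases between the two dates.
n_release_days = (last_release - first_release).days + 1

locations = ("Rosemary Bank", "Anton Dohrn Seamount", "Porcupine Bank")
depths_m = (700, 1000, 1300, 1500)
positions_per_depth = 16

# Assumption: one particle per release position per day.
particles_per_model = (
    len(locations) * len(depths_m) * positions_per_depth * n_release_days
)

print("Release days:", n_release_days)
print("Particles per hydrodynamic model:", particles_per_model)
```

Under these assumptions the date range gives the 366 release days stated in the text, and each LDM tracks roughly seventy thousand particles.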

Analysis
In order to perform a comparison meaningful to ecologists and marine managers, both distance and spatial predictions were analyzed. Ecologists often examine dispersal kernels (a probability distribution of dispersal distances) and the potential distance of larval dispersal (terrestrial examples, Hovestadt et al., 2001; Baguette, 2003; Nathan, 2006; and marine examples, Siegel et al., 2003; Cowen et al., 2007; McClain and Hardy, 2010; Nickols et al., 2015), while marine managers may require more spatially explicit descriptions examining whether Location X is connected to Location Y (Treml and Halpin, 2012; Anadón et al., 2013; Puckett et al., 2014).

Distance Comparisons
Distance comparisons were illustrated by converting larval fates into effective dispersal kernels. CMS outputs consisting of daily positions of each simulated particle were converted into straight line distance (SLD) from source, per day, in Matlab (version R2013a) using the Haversine formula to account for earth curvature. A median SLD per day was then calculated for each model, as well as per depth per model, and associated quartiles. Note the median was selected, as opposed to the mean, as it is more robust in the presence of outlier values when sample size is large enough. The result was plotted against the average speed 0.1 m/s line in the same format as the McClain and Hardy (2010) figure for ease of comparison.
The difference between SLDs was compared using repeated measures ANOVAs across time and within model-type to compute generalized effect sizes (η²G) attributable to model differences (these are considered more valuable than p-values, which can be overinflated for comparisons of model results; see White et al., 2014). In order to maintain balanced sample sizes, median SLDs represented the LDMs, together with the results from the slowest estimate (0.05 m/s) as it represents the prediction that is closest to the model simulations. Two ANOVAs were performed: one including all three methods (HYCOM, POLCOMS, Estimate), and another for the LDMs alone (HYCOM, POLCOMS), allowing the difference in generalized effect size (η²G) to quantify the influence of the Estimate upon model effect in SLD predictions over time. Gamma GLMs (selected due to a continuous positive left-skewed distribution, and after extensive testing of model types and transformations) were used in tandem with the R function "drop1" to test whether the difference in 4th root transformed SLD per day was predominantly due to the effect of hydrodynamic model, rather than depth or location. These analyses were undertaken for the full 270 day time frame, with further reference points after 35 and 69 days tracking, which were discerned by Hilário et al. (2015) as the median and 75% quartile PLDs of all deep-sea and eurybathic species where PLD is currently known (n = 92 species).
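The SLD calculation can be sketched as follows. The original processing was done in Matlab; this Python version is an illustration only, with an assumed mean Earth radius, hypothetical function names, and made-up coordinates standing in for real particle tracks.

```python
import math
import statistics

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance (km) between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

def daily_sld_summary(source, tracks):
    """Lower quartile, median, and upper quartile of SLD for each day.

    source: (lat, lon) of the release site.
    tracks: one list of daily (lat, lon) positions per particle.
    """
    n_days = min(len(track) for track in tracks)
    summary = []
    for day in range(n_days):
        slds = [haversine_km(source[0], source[1], t[day][0], t[day][1])
                for t in tracks]
        q1, _, q3 = statistics.quantiles(slds, n=4, method="inclusive")
        summary.append((q1, statistics.median(slds), q3))
    return summary

# Hypothetical release site and three short particle tracks (lat, lon).
source = (57.5, -11.0)
tracks = [
    [(57.5, -11.0), (57.6, -10.9), (57.8, -10.7)],
    [(57.5, -11.0), (57.5, -10.8), (57.6, -10.5)],
    [(57.5, -11.0), (57.4, -11.1), (57.3, -11.3)],
]
for day, (q1, med, q3) in enumerate(daily_sld_summary(source, tracks)):
    print(f"day {day}: Q1={q1:.1f} km, median={med:.1f} km, Q3={q3:.1f} km")
```

Because SLD is measured from the source rather than along the track, a particle that loops back toward its release site records a small distance however far it has travelled.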

Spatial Comparisons
Larval fates were mapped to offer a means of qualitative and quantitative spatial comparison. While some qualitative assessment is made on the full scale of these predictions (Figure 1), the majority of these analyses were restricted to the POLCOMS domain for fairer comparison (see Figure 1; the prediction area with the most restrictive boundaries).
Maps were created in ArcGIS (version 10.3) using an Albers Equal Area Conic Projection with modified standard parallels (46°N, 61°N). Quantitative comparisons are based on rasters with a grid of constant 4 km² cell size (approximately half the HYCOM model resolution), applied across the POLCOMS domain. For each depth band, grid cells occupied by topography were removed, resulting in the 2D maximal possible area of occupancy.
The estimate spatial predictions were mapped as "sphere of influence" buffer zones with radius equal to the predicted dispersal distance. The major limitation of an estimate prediction is that it cannot easily be extrapolated into a probabilistic spatial prediction without a method to quantify the error caused by assuming a constant current direction. All areas within the buffer zone were therefore considered as a presence-only record. All estimate predictions extended beyond the POLCOMS domain (see Figure 1) and could therefore always be considered equivalent to the full 2D maximal possible area of occupancy when making comparisons within this domain.
The prediction from each LDM was mapped as a percentage track density per grid cell occupancy in order to provide a spatial "heat map" of dispersal. To achieve this, Matlab was used to convert CMS outputs into ArcGIS-compatible subset .csv files for conversion into line shapefiles and subsequently density rasters. Track density values were used for the quantitative LDM comparison, while the estimate vs. model comparison required a binary (presence only) comparison of occupied cells.
A cumulative cell by cell linear correlation coefficient computed in R offers a single correlation value as representative of the comparison between each prediction (a raster correlation). This was performed per release location, per depth, and summarized as an average correlation between the LDMs and any of the three estimates within the POLCOMS domain.
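A cell-by-cell raster correlation of this kind can be sketched as a Pearson correlation over co-located cells. The original analysis was run in R and its exact computation is not specified here, so the following is an assumption about the general approach rather than a reproduction: function names are hypothetical, `None` marks cells removed for topography, and the small rasters are made up.

```python
import math

def raster_correlation(raster_a, raster_b):
    """Pearson correlation across co-located, unmasked cells of two rasters.

    Rasters are equal-shaped lists of rows; None marks a masked cell.
    Returns NaN if either raster is constant (zero variance), because
    Pearson's r is undefined in that case.
    """
    pairs = [(a, b)
             for row_a, row_b in zip(raster_a, raster_b)
             for a, b in zip(row_a, row_b)
             if a is not None and b is not None]
    n = len(pairs)
    mean_a = sum(a for a, _ in pairs) / n
    mean_b = sum(b for _, b in pairs) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in pairs)
    var_a = sum((a - mean_a) ** 2 for a, _ in pairs)
    var_b = sum((b - mean_b) ** 2 for _, b in pairs)
    if var_a == 0 or var_b == 0:
        return float("nan")
    return cov / math.sqrt(var_a * var_b)

def to_presence(raster):
    """Convert a track-density raster to a binary presence-only raster."""
    return [[None if v is None else (1 if v > 0 else 0) for v in row]
            for row in raster]

# Hypothetical percentage track densities for two LDMs on a 3 x 3 grid.
ldm_one = [[0.0, 2.0, 8.0], [None, 5.0, 1.0], [0.0, 0.0, 3.0]]
ldm_two = [[1.0, 0.0, 4.0], [None, 6.0, 0.0], [0.0, 2.0, 0.0]]

print("Track density correlation:", raster_correlation(ldm_one, ldm_two))
print("Presence-only correlation:",
      raster_correlation(to_presence(ldm_one), to_presence(ldm_two)))
```

The density version rewards agreement in where the "highways" lie, whereas the presence-only version responds only to overlap in extent, which is why the two can give different impressions of the same pair of maps.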
Additional qualitative assessments offer real world interpretations of potential connectivity between sites. These are relevant to how usefully similar predictions may be to marine management and conservation.

LDMs vs. Estimate
Both in terms of distance and spatial dispersal patterns, the estimate predictions were the least conservative and specific, with both modeled predictions being considerably more retentive and spatially targeted (Figures 1, 2).

Distance
Plots of median dispersal distance over time show a clear difference between the two LDM predictions and the range of estimates (Figure 2A). Tests of model type influence on SLD showed a 10-20% increase in variance explained, even when only the slowest estimate predictions (0.05 m/s) were included (η²G, Table 1). The estimates offered the least conservative dispersal distances, being 1-7x the median LDM predicted dispersal distance from day one, scaling to 2-15x at 35 days, 2-19x at 69 days, and ending at 5-35x at the full 270 days tracking.
However, there was reasonable convergence between the 0.05 m/s estimate and the HYCOM 700 simulation up until day 21 (Figure 2B), showing that estimates may still be useful for estimating dispersal distances for species with shorter PLDs.

Spatial -Correlation
The HYCOM LDM spatial predictions (as compared only within the POLCOMS domain) were the most similar to any estimate, although the similarity was still less than 0.5 (0.44 across all depths and locations; Table 2). The correlation between the estimates and POLCOMS LDM was very weak averaging 0.17 across all depths and locations. Across all depths and locations the correlation between estimated and modeled spatial extents was max. 0.67 (HYCOM LDM, Porcupine Bank, 700 m simulations), and min. 0.06 (POLCOMS LDM, Rosemary Bank, 1,500 m simulations).

Spatial -Qualitative
Qualitative spatial comparisons between the estimates and LDM predictions further emphasize the differences that could be encountered. Coarsely the 0.05 m/s estimate shows the greatest similarity to the LDM predictions (Figure 1), possibly being good enough if considering area of influence on the scale of the North Atlantic, but having an inadequate level of specificity to determine management measures within the study region.
However, if the faster estimates were to be chosen, then their spheres of influence, even considered on the scale of the North Atlantic, would be wildly different to LDM predictions (Figure 1, inset). The 0.1 m/s estimate would suggest a sphere of influence extending as far as North Africa, Svalbard, and Western Greenland, while the 0.2 m/s may suggest connections across the majority of the North Atlantic (Figure 1).
Focusing on the 0.05 m/s estimate (the estimate with the greatest similarity to the LDMs), the intra-study-region comparison between the estimate and each of the two LDMs separately, shows that POLCOMS LDM is the most different to the estimate. Even though the POLCOMS domain is the most restricted, there is a large area within that domain that remains untouched by POLCOMS dispersal pathways. For example, none of the POLCOMS releases connect to much of the Hatton Rockall Basin, the north side of Hatton Bank, or south beyond the Whittard Canyon in the Bay of Biscay (Figure 1).
Maps per model, location, and depth within the POLCOMS domain (Figures 3-5) allow visualization of more detailed comparisons. For example, Porcupine Bank simulations (Figure 5) suggest no connection to Rosemary Bank at 1,300 and 1,500 m in either model, while even the 0.05 m/s estimate would comfortably make that distance. This would make a difference to a marine manager who might want to know whether known fauna at 1,500 m depth on Porcupine Bank can reach a protected area at Rosemary Bank: an estimate would say "yes," and an LDM would say "no." Within the POLCOMS domain, the low correlation between any estimate and the POLCOMS LDM in particular can be exemplified by Figure 3. Here the estimate might expect connections between Rosemary Bank and anywhere in the Rockall Trough and Bay of Biscay to the south, while the POLCOMS LDM suggests that larvae may not even reach neighboring Rockall Bank in the west at any depth. Therefore, a marine manager asking whether there would be a dispersal connection between Rosemary Bank and Anton Dohrn Seamount would be told "yes" from an estimate, and "no" from a POLCOMS LDM.

HYCOM LDM vs. POLCOMS LDM
Generally, the two LDMs tested in this study give notably different predictions of dispersal, displaying differences in distance, spread, and, in some cases, direction of travel. Figure 2A shows the lower median dispersal distances and much larger interquartile range of the POLCOMS predictions when compared to HYCOM (ANOVA η²G = 0.25-0.34, i.e., 25-34% of variance is explained by model type between days 35 and 270, Table 1). GLMs confirmed that the effect of the model was greater than both depth and location for all PLDs tested (days 35, 69, and 270; see Supplementary Material S3, and bear in mind the advice of White et al., 2014). Plots of median dispersal distance per depth (Figure 2B) demonstrate that the shallowest (most dispersive) simulations in the POLCOMS model on average travel less far than the deepest (least dispersive) simulations in the HYCOM model.

Spatial -Correlation
The correlation between the track density maps of each LDM was generally low (Table 2, [...] correlations which are also shown in Table 2 and are still low: max 0.52 (Rosemary Bank 700 m), min 0.18 (Rosemary Bank 1,500 m), av. 0.36.

Spatial -Qualitative
Generally, Rosemary Bank simulations were the most dissimilar (Figure 3). For example, while the 1,500 m Rosemary Bank simulations in HYCOM suggest connection southwards to most of the eastern flank of Rockall Bank, POLCOMS predicts a relatively small dispersal range suggesting there may be no connection to Rockall Bank at all.
Of most concern is when the two models disagree in the direction of dispersal. In Rosemary Bank 1,300 m simulations, HYCOM shows the "highways" of high track density extending west down the eastern flank of Rockall Bank, while POLCOMS extends down the east of the Rockall Trough following the continental slope. Indeed, in 1,000 m Rosemary Bank simulations, HYCOM larvae travel north, while POLCOMS larvae travel south.
By contrast, the results from Anton Dohrn Seamount (Figure 4) are more similar, with all "highways" generally extending north-east toward Rosemary Bank in both HYCOM and POLCOMS simulations. Yet if a marine manager were to ask whether larvae from Anton Dohrn reach the Darwin Mounds to the north-east, HYCOM would say "yes" and POLCOMS would say "no." Simulations from Porcupine Bank (Figure 5) might indicate a broad agreement that larvae will eventually reach the southern Rockall Bank, but the less direct "highways" in the POLCOMS model might reduce chances of larvae getting that far.

DISCUSSION
This study explored the value of larval dispersal predictions from LDMs by considering two questions.

Will LDMs Give a Notably Different Result to an (Informed) Estimate?
Our results agree with Shanks (2009) and suggest that yes, there can be a large difference in the predicted distance, area, and specificity of estimated and modeled dispersal patterns. There could therefore be a distinct advantage in going to the effort of modeling predictions, provided that the models are shown to approximate realistic distances better than the estimate. However, were the study to focus on a larger area (e.g., the North Atlantic, as mapped in Figure 1, inset), the predictions given by a conservative estimate (here 0.05 m/s) and applied to sub-regions could reasonably be used to show local dispersal ranges.
Binary correlations are between presence-only grids, while the track density correlations between LDMs are sensitive to the full spatial spread as well as the locations of "dispersal highways." Minimum values are highlighted in underlined italics, and maximum values in bold.
FIGURE 3 | Maps per depth band of predicted larval dispersal as simulated from Rosemary Bank. Simulations from HYCOM and POLCOMS models are displayed as track densities delineating between occasional and persistent pathways of dispersal. All estimate prediction areas fill the POLCOMS domain but differ between maps due to the 2D nature of simulations excluding areas of raised topography. Spatial correlations were conducted comparing the extent of modeled and estimated predictions and the extent and density information of each modeled prediction (Table 2). [All maps were created in ArcGIS 10.3 (http://www.esri.com) with GEBCO 30 arc-second topography, available from www.gebco.net, and projected using Albers Equal Area Conic with modified standard parallels and meridian (sp1 = 46°N, sp2 = 61°N, m = 13°W)].
One important note concerns the range of current speed values we used for the estimates (based on Ellett et al., 1986): the value with the greatest congruence to model simulations was the value least likely to be chosen to represent our study's depth range. This study was focussed on depths between 700 and 1,500 m, so could reasonably have excluded the 0.05 m/s value, which was recorded as the average for depths >1,750 m. The 0.1-0.2 m/s values that were associated with our studied depths showed vast overpredictions relative to the models, highlighting an issue of either gross overprediction from the estimates or underprediction in the models. Only biological (genetic/geochemical) groundtruthing can provide "true" values to compare to, and these must be applied to species-specific models, not theoretical generalized models such as those undertaken here.
This study may also suggest that for deep-sea species the differences in predicted dispersal distance between models and estimates may be even more pronounced than in shallow water. As the divergence in predicted distance between estimates and models increased exponentially with tracking time (up to a maximum 34-fold difference, Figure 2), species with longer PLDs, such as those in the deep sea (Hilário et al., 2015), may show even greater disparity between estimated and modeled predictions.
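To make the scale of the basic estimates concrete, the following sketch computes straight-line dispersal distances for the three current speeds and three PLDs used in this study. It assumes the simplest possible form of the estimate (distance = average speed × PLD, with no topography or change of direction); the function name `estimate_distance_km` is illustrative and not taken from the study's methods.

```python
SECONDS_PER_DAY = 86_400


def estimate_distance_km(speed_m_per_s, pld_days):
    """Straight-line dispersal distance (km) for a constant current speed.

    Assumption: distance = speed x duration, i.e. a larva drifting in a
    straight line at the average current speed for the whole PLD.
    """
    return speed_m_per_s * pld_days * SECONDS_PER_DAY / 1000.0


speeds_m_per_s = (0.05, 0.1, 0.2)   # average current speeds tested as estimates
pld_days = (35, 69, 270)            # planktonic larval durations tested

estimates = {
    (speed, pld): estimate_distance_km(speed, pld)
    for speed in speeds_m_per_s
    for pld in pld_days
}

for (speed, pld), distance in sorted(estimates.items()):
    print(f"{speed:.2f} m/s over {pld:3d} days -> {distance:7.1f} km")
```

Because the estimate grows linearly with PLD while modeled tracks meander and are retained by topography, the gap between the two can be expected to widen for long-lived larvae.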
However, there may have been better congruity between estimated and modeled predictions if simulations were undertaken in the open ocean. If any of the estimates were similar to the modeled predictions this would suggest that the model simulates currents with fairly straight trajectories and constant speeds: something more likely to occur on a relatively featureless abyssal plain at 5,000 m (but see Gardner et al., 2017). The complex topography of the Rockall Trough induces a lot of mesoscale activity (Holliday et al., 2000) which likely promotes greater local retention, and therefore differences in modeled predictions. Alternatively, estimates could be made to better approximate what the models include; either by following the topography to account for the distance added by including depth, as is the case in 3D models, or by at least accounting for topographic barriers at the simulated depth, which would be a closer approximation to this study's 2D simulations.
FIGURE 4 | Maps per depth band of predicted larval dispersal as simulated from Anton Dohrn Seamount. Simulations from HYCOM and POLCOMS models are displayed as track densities delineating between occasional and persistent pathways of dispersal. All estimate prediction areas fill the POLCOMS domain but differ between maps due to the 2D nature of simulations excluding areas of raised topography. Spatial correlations were conducted comparing the extent of modeled and estimated predictions and the extent and density information of each modeled prediction (Table 2). [All maps were created in ArcGIS 10.3 (http://www.esri.com) with GEBCO 30 arc-second topography, available from www.gebco.net, and projected using Albers Equal Area Conic with modified standard parallels and meridian (sp1 = 46°N, sp2 = 61°N, m = 13°W)].
It is also important to mention that all the biological and ecological complexities (e.g., larval buoyancy, behavior, swimming speeds, growth, phototaxis, feeding methods, mortality, habitat selection, etc.) that can be simulated within an LDM, and that may have a large impact on larval dispersal patterns (Metaxas and Saunders, 2009), cannot necessarily be accounted for by using an estimate. These were excluded from the LDMs in this study, as they would only complicate the picture when trying to understand the impact of hydrodynamic model choice, but we absolutely advocate their inclusion in applied species-specific modeling studies and have done so in our own (e.g., Ross et al., 2017). On a coarse scale, inclusion of these characters in our LDMs may actually have increased the congruence with the slowest estimate, allowing larvae to bypass topographic barriers, potentially extending tracks north-east toward Norway (see Ross et al., 2017). However, if focussed within our study region, again the specificity offered by an LDM may be of more value (again provided that those predictions are correct).
Are Two LDMs, Driven by Different Purpose-Selected Hydrodynamic Models, Cross-Validating, and Therefore of Some Value Prior to Targeted Groundtruthing?
Broadly, while some local comparisons may be cross-validating (e.g., in Anton Dohrn simulations), in this study the different hydrodynamic models also gave some contradictory predictions. Indeed, the variability in the predictions suggests that the potential for error within LDMs may be larger than previously recognized in ecology and conservation, a variability that cannot be apparent when modeling with only one hydrodynamic model. This result emphasizes that in all areas where the models disagree, there can be no trusted consensus until targeted groundtruthing takes place, and that un-groundtruthed LDM outputs must not act as a basis for decision-makers before they have either been thoroughly assessed, or a groundtruthed consensus can be reached.
Broadly, our result agrees with a comment from Bode et al. (2018) which showed that a re-run of a study by Hock et al. (2017) with an equivalent set up but using a different hydrodynamic model might have recommended different reefs for protection. They caution that biophysical models may be too immature to provide advisory results, and suggest that ensemble models may be a way to reach a conservative consensus in the meantime.
FIGURE 5 | Maps per depth band of predicted larval dispersal as simulated from Porcupine Bank. Simulations from HYCOM and POLCOMS models are displayed as track densities delineating between occasional and persistent pathways of dispersal. All estimate prediction areas fill the POLCOMS domain but differ between maps due to the 2D nature of simulations excluding areas of raised topography. Spatial correlations were conducted comparing the extent of modeled and estimated predictions and the extent and density information of each modeled prediction (Table 2). [All maps were created in ArcGIS 10.3 (http://www.esri.com) with GEBCO 30 arc-second topography, available from www.gebco.net, and projected using Albers Equal Area Conic with modified standard parallels and meridian (sp1 = 46°N, sp2 = 61°N, m = 13°W)].

Model Differences
While we are not going to provide any criticism or endorsement for either model (as there are study-specific reasons for choosing one over another), we can offer some limited analysis and advice to aid model selection and interpretation in the future. Remember, however, that different applications may warrant different choices, even if your study is in the same region and depth range as ours.
In this case, there are three hydrodynamic model parameters that are worth highlighting and which may account for the differences in LDM predictions.
First, there are differences in the scales of validation between the models, but also in the relevance of these validations for dispersal modeling purposes. Although both models are published and validated (Holt et al., 2001, 2005; Chassignet et al., 2007), HYCOM was assessed on a global scale and therefore may be less locally reliable. However, neither model was validated for the purpose of larval dispersal modeling, which may place greater weight on, for example, current directions and strengths than on heat exchange and mixed-layer behavior. This makes it hard to use a model's published validation status to judge whether the model is fit for purpose, and suggests that study-specific validation is vital, starting with a comparison to observational oceanography in the area (Vasile et al., 2018). In this study, for example, the southward trajectories of POLCOMS larvae down the eastern side of the Rockall Trough from Rosemary Bank at 700-1,300 m (Figure 3) are contrary to the observations of northward transport down to 1,000 m, and below that southward transport down the western side of the Trough (Holliday and Cunningham, 2013; Holliday et al., 2015). Current speeds simulated in each model also differ, with velocities in HYCOM being twice those in POLCOMS, although both fall within the range of observed current speeds recorded in the shelf edge current (10-21 cm s⁻¹) (White and Bowyer, 1997). Note that both models will suffer from many other errors including (but not limited to) currents that are too fast due to the exclusion of tides (Müller et al., 2010), coarse bathymetry that may exclude hydrographically influential features (Sandwell et al., 2014), and no representation of possible benthic storms which may divert dispersal pathways (Harris, 2014).
Only targeted groundtruthing can quantify the error margins and clarify whether one model is more representative than the other for this purpose, and indeed they may each prove to have areas of accuracy at different depths or locations (Vasile et al., 2018).
Second, spatial and temporal resolution have been shown to make a great difference in whether a model represents realistic trajectories or not. Putman and He (2013) advocate using the highest resolution model you can find, summarizing that model choice must aim to preserve physical processes on the scale of tens of kilometers and days (respectively). Although both models may comply with this broad advice, HYCOM is still more highly resolved than POLCOMS. In this study area, while the major eddies may be over 100 km in diameter (Sherwin et al., 2015), there are still some influential semi-permanent features of 50-60 km in diameter (Booth, 1988; Ullgren and White, 2010). Given a rule-of-thumb that six data points are required to make an eddy (Lacroix et al., 2009), POLCOMS may omit these smaller eddies (min. eddy size of ∼64 km in diameter), while HYCOM may be capable of capturing them (min. ∼48 km in diameter). This difference in horizontal resolution may account for some of the difference in trajectory direction and tortuosity between the models. Consequently, we recommend that minimum resolution choice could be based on the size of permanent eddies in the study region (if that information is known).
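The resolution check described above can be sketched as a simple comparison between feature size and each model's minimum resolvable eddy diameter. The minimum diameters (∼64 and ∼48 km) and feature sizes come from the text; the back-calculated grid spacing assumes that the six-point rule means diameter = 6 × grid spacing, which is an illustrative reading rather than a stated formula, and the function names are hypothetical.

```python
POINTS_PER_EDDY = 6  # rule-of-thumb number of data points needed across an eddy

# Minimum resolvable eddy diameters (km) as quoted in the text.
MIN_EDDY_DIAMETER_KM = {"POLCOMS": 64.0, "HYCOM": 48.0}


def can_resolve(model, feature_diameter_km):
    """True if the feature is at least as large as the model's minimum eddy size."""
    return feature_diameter_km >= MIN_EDDY_DIAMETER_KM[model]


def implied_grid_spacing_km(model, points=POINTS_PER_EDDY):
    """Grid spacing implied by the minimum eddy size.

    Assumption: minimum diameter = points x grid spacing.
    """
    return MIN_EDDY_DIAMETER_KM[model] / points


# Feature sizes in the study area: semi-permanent features (50-60 km)
# and major eddies (over 100 km; 100 km used here as a lower bound).
features_km = (50.0, 60.0, 100.0)

for model in MIN_EDDY_DIAMETER_KM:
    spacing = implied_grid_spacing_km(model)
    resolved = [d for d in features_km if can_resolve(model, d)]
    print(f"{model}: implied spacing ~{spacing:.1f} km, resolves {resolved} km features")
```

The same check could be repeated for any candidate hydrodynamic model once the size of the permanent eddies in a study region is known.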
Third, and possibly least obviously to ecologists, differing algorithms for error handling may be responsible for the more diffuse trajectories in the HYCOM model. The horizontal pressure gradient error stems from the issue of numerically interpreting flow around steep discretized (pixelated) topography and can result in perpetuated errors throughout the water column. This issue is handled in both models but using different approaches (see Supplementary Material S1 for model approaches). A representation of this can be seen in Supplementary Material S4, where plots of current ellipses per model, per depth, can help highlight these differences: the POLCOMS model shows less variable current direction and speed (smaller ellipses) and a tight shelf edge current, suggesting a stricter handling of these errors, and resulting in a less diffuse spread of particle trajectories. Accounting for this difference during model choice is more problematic for anyone who is not a numerical oceanographer, but highlighting its effect here may offer a means of recognition and inform interpretation of model predictions.

Groundtruthing
Groundtruthing should be regarded by all modelers as essential, and were the models found to be similar it would not have supplanted this necessity but could have lent some credence to modeled outputs before groundtruthing data became available.
As it stands, however, the tested models could not be used interchangeably without consequence to ecological or marine management conclusions [e.g., whether Rosemary Bank was connected to south-east Rockall Bank at 1,300 m (Figure 3)]. Hence the next step must be to identify whether one model is more accurate than the other.
This study was entirely theoretical, for the purposes of hydrodynamic model comparison, and was therefore not tailored to any specific species (with biological characters as mentioned before). The only groundtruthing relevant to this study is therefore to use oceanographic observations. We have offered some provisional qualitative comparisons in section "Model Differences" but this could be taken further if truly assessing the hydrodynamic models, using chemical tracer data (e.g., Lavelle et al., 2010) and Argo floats (e.g., Speer and Thurnherr, 2012). However, if this were a species-specific study then we would also have the option to undertake biological groundtruthing using sampled genetics or geochemical signatures [something that is rarely available but offers a next step for such studies (Levin, 2006)]. The majority of success stories to date have compared LDMs and seascape genetics (Foster et al., 2012; Sunday et al., 2014; Baeza et al., 2019).
Once groundtruthed, an LDM could be incredibly useful across disciplines and purposes, allowing subsequent simulations in the same region to be run and trusted for multiple species provided that similar oceanographic features are important to larval fates. New species predictions would then be able to rule out hydrodynamic sources of error leaving biological components as the main areas requiring groundtruthing in the future. It is therefore advisable to build the first species LDMs in a region upon the species with the greatest amount of data available for both model parameterization and groundtruthing. Once completed and tested, LDMs for other species can be created, safe in the knowledge that error due to hydrodynamic model choice is now quantified and controlled for.

Advice for Marine Managers and Ecologists
In this case, the variation in direction of dispersal makes it unwise to rely upon these predictions until they have been groundtruthed. However, were the differences only in speed and spread, these predictions may have been more useful, allowing interpretation relative to appropriate precautionary principles for the issue being considered. For example, MPA network design may wish to accommodate the most conservative predictions of dispersal to ensure that the larvae from a protected population can reach the next protected area (in this instance that would be POLCOMS predictions). Meanwhile if you were estimating invasive species spread, you may wish to default to a less conservative estimate (here HYCOM).
The variability between models also advocates the interpretation of LDM results as probabilistic (i.e., possible) rather than deterministic (i.e., true). Practically this may be translated as looking at the high density "highways of dispersal" which had some localized consensus between models, so these could be interpreted as the more likely pathways of dispersal, with lower density predictions being thought of as uncertain.
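One way to put this probabilistic reading into practice is sketched below: cells where both models predict high track density are flagged as more likely pathways, and cells predicted by only one model, or at low density, are flagged as uncertain. The relative threshold `high=0.5`, the labels, and the toy grids are all hypothetical choices for illustration and are not part of this study's analysis.

```python
def classify_consensus(grid_a, grid_b, high=0.5):
    """Label each cell by agreement between two track-density grids.

    Densities are scaled by each grid's own maximum so the models are
    compared on relative, not absolute, track density.
    "likely"    : both models at or above the `high` relative density
    "uncertain" : at least one model predicts some tracks
    "none"      : neither model predicts any tracks
    """
    max_a = max(max(row) for row in grid_a) or 1  # avoid dividing by zero
    max_b = max(max(row) for row in grid_b) or 1
    labels = []
    for row_a, row_b in zip(grid_a, grid_b):
        label_row = []
        for a, b in zip(row_a, row_b):
            if a / max_a >= high and b / max_b >= high:
                label_row.append("likely")
            elif a > 0 or b > 0:
                label_row.append("uncertain")
            else:
                label_row.append("none")
        labels.append(label_row)
    return labels


# Hypothetical track-density grids from two hydrodynamic models.
hycom_like = [[0, 2, 10], [0, 8, 3]]
polcoms_like = [[0, 0, 9], [1, 6, 0]]

consensus = classify_consensus(hycom_like, polcoms_like)
for row in consensus:
    print(row)
```

Scaling each grid by its own maximum is a design choice that stops the more dispersive model from dominating the comparison; an absolute threshold would be equally defensible if track counts were comparable between models.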
In summary, LDMs will have a place in marine ecology and conservation and offer a great improvement on informed estimates of dispersal potential. However, the hydrodynamic models they are based on can be very variable in their predictions, so should always be assumed to need some level of study-specific groundtruthing before predictions are relied upon for management decisions and ecological theories. Utilizing local oceanographic observations and model comparisons can offer some basic means of quantifying the uncertainty in model predictions to improve trust, but future comparison to population genetics, geochemical isotope tracers, or study-targeted groundtruthing data must still be considered essential.

DATA AVAILABILITY STATEMENT
The datasets generated for this study are available on request to the corresponding author.

AUTHOR CONTRIBUTIONS
KH and AN-S acquired the funding. RR, AN-S, and KH conceived the study. RT supplied the model data. RR analyzed the data. AN-S and KH provided advice and supervision. RR, AN-S, RT, and KH wrote the manuscript and interpreted the analysis.