Good Practices for Species Distribution Modeling of Deep-Sea Corals and Sponges for Resource Management: Data Collection, Analysis, Validation, and Communication

Resource managers in the United States and worldwide are tasked with identifying and mitigating trade-offs between human activities in the deep sea (e.g., fishing, energy development, and mining) and their impacts on habitat-forming invertebrates, including deep-sea corals and sponges (DSCS). Related management decisions require information about where DSCS occur and in what densities. Species distribution modeling (SDM) provides a cost-effective means of identifying potential DSCS habitat over large areas to inform these management decisions and data collection. Here we describe good practices for DSCS SDM, especially in the context of data collection and management applications. Managers typically need information regarding DSCS encounter probabilities, densities, and sizes, defined at sub-regional to basin-wide scales and validated using subsequent, targeted data collections. To realistically achieve these goals, analysts should integrate available data sources in SDMs, including fine-scale visual sampling and broad-scale resource surveys (e.g., fisheries trawl surveys); include environmental predictor variables representing multiple spatial scales; model residual spatial autocorrelation; and quantify prediction uncertainty. When possible, models fitted to presence-absence and density data are preferred over models fitted only to presence data, which are difficult to validate and can confound estimated probability of occurrence or density with sampling effort. Ensembles of models can provide robust predictions, while multi-species models leverage information across taxa and facilitate community inference. To facilitate the use of models by managers, predictions should be expressed in units that are widely understood and validated at an appropriate spatial scale using a sampling design that provides strong statistical inference.
We present three case studies for the Pacific Ocean that illustrate good practices with respect to data collection, modeling, and validation; these case studies demonstrate it is possible to implement our good practices in real-world settings.


INTRODUCTION
Deep-sea corals and sponges (DSCS) are among the longest-living sessile marine organisms and are important biogenic components of habitat in many marine ecosystems (Roberts et al., 2009; Buhl-Mortensen et al., 2010; Hogg et al., 2010; Rossi et al., 2017). They are a diverse group of habitat-forming invertebrates spanning two phyla and exhibiting numerous growth habits, with some species reaching meters in height and others forming large reefs (Roberts et al., 2009; Maldonado et al., 2017). Like their tropical counterparts, DSCS create hotspots of diversity by providing structure and refuge for many invertebrates and fishes, especially when forming dense aggregations (Stone, 2006, 2014; Buhl-Mortensen et al., 2010; Baillon et al., 2012). DSCS are suspension feeders, making them important contributors to carbon and nutrient cycling in the deep ocean (Cathalot et al., 2015; Maldonado et al., 2017), but they are fragile and slow-growing, making them vulnerable to human impacts such as fishing, deep-sea mining, and offshore oil and gas development.
Due to their vulnerability and the many ecosystem functions they provide, DSCS have received increasing attention from conservationists and resource managers worldwide. Internationally, they have been identified as key indicator taxa for vulnerable marine ecosystems [FAO (Food and Agriculture Organization of the United Nations), 2009] and ecologically or biologically significant marine areas [Convention on Biological Diversity (CBD), 2010]. In the United States (U.S.), several laws provide the authority to regulate offshore activities (e.g., fishing, energy development, and leasing) that might damage environmentally sensitive seafloor habitats. For example, in 1984 the U.S. established the Oculina Banks Habitat Area of Particular Concern off Florida, the first area specifically designed to protect deepwater coral reefs from fishing impacts. Since 2005, deep-sea coral conservation efforts have accelerated across the U.S. Exclusive Economic Zone (EEZ), with a focus on protecting seafloor habitats from impacts of fishing gear (Figure 1; Hourigan et al., 2017). These area-based gear restrictions included large precautionary closures in deep water designed to freeze the footprint of bottom trawling (Figure 1), the type of fishing usually considered the most substantial threat to DSCS (Rooper et al., 2017b). This approach has prevented the expansion of the most damaging fishing into deeper waters (Hourigan, 2014), but it also means that further conservation gains will require more targeted information and management within the footprint of existing fisheries. The National Oceanic and Atmospheric Administration's (NOAA) Deep Sea Coral Research and Technology Program (DSCRTP; https://deepseacoraldata.noaa.gov/), in partnership with other NOAA offices, federal agencies, and academic and stakeholder groups, funds and coordinates research and supports the development of management measures to protect DSCS.
Species distribution modeling (SDM) is a technique for quantifying species-environment relationships and applying those relationships to predict and map the abundance or habitat of species of interest (e.g., Guisan et al., 2017). SDMs use biological and environmental data as input and produce distribution maps at spatial scales relevant to management. The majority of existing DSCS SDMs are "presence-only" models, often applications of maximum entropy (MaxEnt) and ecological niche factor analysis (ENFA) (Figure 2; Vierod et al., 2014; Guinotte et al., 2017). These models make use of the most commonly available type of biological data for DSCS: locations of observed occurrences of individual taxa. The maps produced from presence-only models indicate where suitable habitat is predicted to be more or less likely. These maps have been used to guide targeted field surveys in underexplored areas (Georgian et al., 2014), to inform management decisions made by regional fisheries management organizations (Vierod et al., 2014; Georgian et al., 2019) and U.S. regional fishery management councils (Kinlan et al., 2020), to examine potential environmental covariates of DSCS habitat, and to provide information to the U.S. Bureau of Ocean Energy Management for its environmental compliance and leasing decisions (Bauer et al., 2016). However, presence-only models have limitations (e.g., lack of data on sampling effort) that reduce their usefulness for effective management of DSCS. Alternative SDM techniques exist that can overcome these limitations and have been applied to DSCS (Figure 2), but these techniques generally require additional types of biological data (e.g., absence or density data).
To better manage human activities impacting DSCS, managers and analysts would benefit from an improved understanding of current data limitations, preferred SDM approaches, and future data needs. In this paper, we describe good practices for the data and methods needed to inform SDMs of DSCS distributions for management purposes. We also present three case studies from the Pacific Ocean, including two from U.S. waters, that highlight some of the challenges associated with these data and methods. Our good practices are not meant to be a "one size fits all" solution. Individual scientific and management needs, data availability, intended application, funding availability, and field logistical constraints may require individualized approaches to data collection and modeling. Our interest and objective is to provide "good practices" to help guide data collection and analysis by the DSCS research and management community.
FIGURE 1 | Restrictions on bottom trawling, usually considered the most substantial threat to deep-sea coral and sponge habitat, in the U.S. exclusive economic zone (EEZ) as of January 2020. Protections are illustrated by year of implementation and grouped as taking effect before 1996 (the year essential fish habitat was introduced; red polygons), or between 1996-2000 (dark blue), 2001-2005 (yellow), 2006-2010 (green), 2011-2015 (light blue), or 2016-2019 (purple). The Pacific Region map legend includes 2020 because the Pacific Fishery Management Council approved trawl restrictions in 2018 that were enacted on 1 January 2020. Note the different geographic scales used for each region. The New England and Gulf of Mexico Fishery Management Councils approved additional protections in 2018 that were yet to be enacted by the time of publication, so are not shown here.
Note that some studies employed more than one model type, so the total number of models is greater than the number of studies.

GOOD PRACTICES
Our good practices are first presented for biological and environmental data included in DSCS SDMs followed by analytical methods and then management applications. Case studies that illustrate our good practices (Table 1) are presented in the next section.

Biological Data
Because the logistical and technical complexities and expense of operating at deep depths are limiting factors for collecting observations of deep-sea biota, DSCS SDMs built at global and regional scales typically include biological data from a range of sources (Vierod et al., 2014; Anderson et al., 2016a; Rooper et al., 2019). Many of the earlier DSCS SDMs relied on presence data compiled from historical records of DSCS observations, including from existing databases, museum collections, and cruise reports (Davies and Guinotte, 2011; Vierod et al., 2014). More modern biological data available to DSCS SDMs have been obtained at finer scales from georeferenced videos or photos collected during seafloor surveys (e.g., manned submersibles, autonomous underwater vehicles, and remotely operated vehicles) or from fisheries trawl surveys (Vierod et al., 2014). Many of these data have been reported only as DSCS presences, sometimes because of concerns over the ability to confirm the complete absence of a species in an observation (Vierod et al., 2014). However, because of the limitations of SDMs built with presence-only data (described in Section Analysis), we recommend that DSCS biological data from fine-scale surveys be recorded to quantify presence-absence, abundance, biomass, or density (abundance or biomass per unit area) (Case Studies 1-3) with a measure of effort for each sampling unit (e.g., area surveyed). From a practical standpoint, resource managers are most interested in identifying areas of potential high density or diversity of vulnerable DSCS rather than simply presence. Both density and size of DSCS are likely ecologically important factors for habitat use by other species (e.g., Case Study 1 and also Du Preez and Tunnicliffe, 2011).
When the only data available are presence, analysts should attempt to recover or infer the distribution of sampling effort and the locations of absences if feasible by re-analysis of existing data, recognizing that this will increase the time required for analysis. For example, absence "observations" along a transect can sometimes be recovered from existing data recordings, or can be inferred from locations where other species were recorded as present during the same surveys (e.g., Case Study 3 and also Isaac et al., 2014). Future sampling programs should record biological data at the highest taxonomic resolution possible. Models developed using data with low taxonomic resolution may mix species with very different life-histories and environmental requirements, resulting in overly broad predicted distributions and potentially increased model uncertainty. Models of functional groups or otherwise reduced taxonomic resolution may be sufficient to address some management applications, but in some cases models may be needed for specific taxa like species of concern (e.g., endangered species). We recognize that the identification of DSCS species, especially from images alone, can be difficult and their taxonomy is an active area of research, so these issues can be challenging to the development of models with high taxonomic resolution.
Most DSCS distribution modeling currently relies on existing data collected for other purposes. Given the increasing use of SDMs for management, future biological data collection should be designed to inform improved models including increased attention to statistically-robust survey design (Williams and Brown, 2019), along with measures of abundance, density, size, and survey effort. A centralized DSCS data repository, akin to the DSCRTP data portal (https://deepseacoraldata.noaa.gov/), hosting DSCS data collected and analyzed with these suggested data standards would facilitate SDMs to inform management.

Environmental Data
We focus on SDMs that integrate DSCS biological data with environmental data as covariates. Informative covariates often include measures of depth, seafloor terrain (e.g., slope, aspect, curvature, bathymetric position, ruggedness), and substrate properties (e.g., sediment composition), usually derived from remotely sensed acoustic bathymetry and backscatter data or geological samples (Wilson et al., 2007; Brown et al., 2011; Vierod et al., 2014; Rooper et al., 2016; Guinotte et al., 2017). Oceanographic properties (e.g., water temperature, chemistry, and current speed and direction) derived from field samples, remotely sensed data, and ocean models can also be informative covariates (Davies and Guinotte, 2011).
It is important to consider whether environmental covariates are available at spatial scales and resolutions that are relevant to management needs and to the ecological patterns and processes of interest (Davies and Guinotte, 2011). Mismatches in spatial scale and resolution between environmental covariates and management needs risk a loss of information necessary for effective management. Mismatches in spatial scale and resolution between environmental covariates and important ecological processes risk the failure to detect important relationships and can compromise the accuracy of predicted species distributions. The relevant ecological spatial scale may vary by DSCS species or functional group, so we suggest analysts consider environmental covariates at multiple spatial scales, but first use a hypothesis-driven approach to decide which scales may be most appropriate to include.
Spatial resolution and accuracy are important considerations in determining which bathymetry data to use in DSCS SDMs. Earlier DSCS SDMs built at global and basin scales typically included depth and seafloor terrain covariates derived from coarse bathymetry data from satellite altimetry or from compilations of hydrographic data. However, at coarse resolution these covariates may not resolve fine-scale seafloor features that provide habitat for DSCS. We suggest that DSCS SDMs include depth and seafloor terrain covariates derived from multibeam acoustic bathymetry data (that are collected at International Hydrographic Organization standards when possible), followed by regional (e.g., Zimmermann et al., 2013) and basin-scale [e.g., GEBCO (General Bathymetric Chart of the Oceans) Compilation Group, 2019] bathymetry compilations with rigorous data assembly methods, followed by bathymetric models and opportunistic measurements (e.g., Olex software, http://www.olex.no/; Jakobsson et al., 2012). We suggest researchers evaluate whether the spatial resolution of the bathymetry data and consequently derived seafloor terrain metrics will capture habitat processes that are ecologically important to DSCS. Certain terrain metrics, such as ruggedness measures, should not be derived from compilations of bathymetry data sets of varied quality and resolution.
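As an illustration of deriving a terrain covariate at more than one spatial scale, the sketch below computes a simple bathymetric position index (each cell's elevation minus the mean elevation of its neighbors) on a small gridded surface using two window sizes. The grid values, the square-window definition, and the function name are assumptions made for this example only; they are not taken from any of the cited studies, and real applications would normally rely on established GIS tools.

```python
def bathymetric_position_index(elevation, radius):
    """Return a grid of cell elevation minus the mean elevation of the
    surrounding square window (cells within `radius`, centre excluded).
    Windows are truncated at the grid edge."""
    n_rows, n_cols = len(elevation), len(elevation[0])
    bpi = [[0.0] * n_cols for _ in range(n_rows)]
    for r in range(n_rows):
        for c in range(n_cols):
            neighbours = [
                elevation[rr][cc]
                for rr in range(max(0, r - radius), min(n_rows, r + radius + 1))
                for cc in range(max(0, c - radius), min(n_cols, c + radius + 1))
                if (rr, cc) != (r, c)
            ]
            bpi[r][c] = elevation[r][c] - sum(neighbours) / len(neighbours)
    return bpi


# Hypothetical 5 x 5 elevation grid (m; negative = below sea level)
# with a single 20 m mound in the centre cell.
elevation = [
    [-500, -500, -500, -500, -500],
    [-500, -500, -500, -500, -500],
    [-500, -500, -480, -500, -500],
    [-500, -500, -500, -500, -500],
    [-500, -500, -500, -500, -500],
]

fine_bpi = bathymetric_position_index(elevation, radius=1)   # 3 x 3 window
broad_bpi = bathymetric_position_index(elevation, radius=2)  # 5 x 5 window
```

In this toy grid the corner cells have a fine-scale index of zero but a negative broad-scale index, because only the larger window reaches the mound. The same feature can therefore register differently depending on the analysis scale, which is why candidate scales are worth choosing from ecological hypotheses.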
High-resolution multibeam acoustic seafloor scattering strength (backscatter) can be applied directly as a covariate or to derive measures of angular response and seafloor properties such as hardness, ruggedness, and substrate composition (e.g., Brown et al., 2011). Backscatter data have been applied to distinguish (Weber et al., 2013) and model (Pirtle et al., 2015) the spatial extent of trawlable and untrawlable seafloor types where DSCS occur and in DSCS SDMs (Dolan et al., 2008; Buhl-Mortensen et al., 2012; Rowden et al., 2017). Backscatter is a relative measure, and compilations of backscatter from multiple surveys should generally only be used in SDMs when the acoustic surveys have collected backscatter using frequencies that do not differ by more than one octave (Hughes Clarke, 2015), and when the surveys have been relatively calibrated across years, platforms, and sensors (Lurton and Lamarche, 2015).
Oceanographic data include biological, chemical, and physical oceanographic properties, such as surface chlorophyll concentration, oxygen and aragonite saturation, temperature, salinity, current speed and direction, and turbidity. Oceanographic data collected during surveys or modeled based on survey data [e.g., Regional Ocean Modeling Systems (ROMS); Hermann et al., 2016; Fiechter et al., 2018] are generally of much coarser resolution than seafloor mapping data (km vs. m) and sometimes require interpolation to the spatial extent of management areas (e.g., Brown et al., 2011). These oceanographic covariates have been useful to model and predict DSCS distribution and habitat (Georgian et al., 2014, 2019; Rooper et al., 2014, 2017a; Bargain et al., 2018).
We suggest that analysts fit multiple SDMs with different environmental covariates at multiple scales to identify the best performing model (e.g., Wilson et al., 2007; Pirtle et al., 2019; Weijerman et al., 2019; Dove et al., 2020), and only retain the important environmental covariates at the appropriate scales in SDMs (Case Studies 1-3). From an ecological modeling perspective, it is also prudent to consider only those covariates that have realistic direct or indirect linkages to biological and ecological processes that would be expected to influence DSCS distribution (Case Studies 1-3). It is important that researchers factor in the potentially extensive time required for literature review, synthesis, and expert involvement for covariate development during the planning phase of a modeling project.

Analysis

Presence-Only Models
Presence-only models, like MaxEnt and ENFA, have been the most commonly employed SDM techniques for DSCS (Figure 2; Vierod et al., 2014; Guinotte et al., 2017). These methods have proven useful for some applications such as guiding surveys for exploration (Georgian et al., 2014), contributing to the development of conservation areas (Kinlan et al., 2020), and providing a foundation for subsequent higher-resolution modeling with alternative methods (Case Study 3). Indeed, presence-only models are the only option when only presence data are available. However, presence-only models have several disadvantages that are not conducive to effective management of DSCS. The major limitation of presence-only models is that sampling effort and resource density are confounded when the former is not appropriately represented in the model (Peel et al., 2019). Presence-only models can mistakenly identify well-sampled areas with many presence observations as areas with greater densities in contrast with less-sampled areas with fewer presence observations, even if densities are similar between areas. That being said, presence-only models can accommodate and correct for independent information about the distribution of sampling effort when it is available (e.g., MaxEnt; Elith et al., 2011), and several approaches have been used to attempt to account for sampling effort when direct information is not available (Merow et al., 2016; El-Gabbas and Dormann, 2018). Another limiting issue with presence-only models is that they do not appropriately account for the effect of sampling effort on estimates of model uncertainty (Renner et al., 2015). Finally, it is not clear how to generalize presence-only models. Predictions from these models are usually in relative terms and the predicted quantities (e.g., relative habitat suitability) can be difficult to interpret and validate. As a result, inference across species and models is challenging, inhibiting the use of presence-only models for understanding community structure and DSCS-fish associations.

Preferred Modeling Frameworks
Given the limitations and challenges associated with presence-only models, we recommend using alternative modeling frameworks when possible (Case Studies 1-3). When data types other than presence are available (e.g., presence-absence, count, biomass), analysts should preferentially employ these data types to develop DSCS SDMs that produce predictions that are straightforward to interpret and validate (e.g., probability of occurrence for presence-absence data). Use of absence, count, or biomass data will naturally account for the distribution of sampling effort in the model, by including data from areas where taxa were detected and where they were not. It is also important to account for the amount of sampling effort (e.g., area viewed, trawl swept area) represented by each data replicate in a model, either by expressing the modeled response data (e.g., biomass) per unit of effort, or, in the case of presence-absence and count models, by including an effort "offset" in the model. Modeling approaches such as generalized linear models (GLMs), generalized additive models (GAMs), boosted regression trees (BRTs), and random forest models (RFs) accommodate these data types and model features.
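To make the role of an effort offset concrete, the following minimal sketch fits a Poisson count model with a log link in which the log of the area surveyed enters with a fixed coefficient of one: log(expected count) = log(area) + b0 + b1 * covariate. The transect data, the single hard-substrate covariate, and the hand-written Newton-Raphson routine are all assumptions made for illustration; in practice an established GLM or GAM implementation would be used.

```python
import math


def fit_poisson_with_offset(counts, covariate, effort, n_iter=50, tol=1e-10):
    """Fit log(E[count]) = log(effort) + b0 + b1 * covariate by
    Newton-Raphson on the Poisson log-likelihood. Returns (b0, b1)."""
    # Start from the overall density (total count / total effort).
    b0, b1 = math.log(sum(counts) / sum(effort)), 0.0
    for _ in range(n_iter):
        mu = [a * math.exp(b0 + b1 * x) for a, x in zip(effort, covariate)]
        # Gradient of the log-likelihood.
        g0 = sum(y - m for y, m in zip(counts, mu))
        g1 = sum((y - m) * x for y, m, x in zip(counts, mu, covariate))
        # Elements of the (symmetric) information matrix.
        h00 = sum(mu)
        h01 = sum(m * x for m, x in zip(mu, covariate))
        h11 = sum(m * x * x for m, x in zip(mu, covariate))
        det = h00 * h11 - h01 * h01
        step0 = (h11 * g0 - h01 * g1) / det
        step1 = (h00 * g1 - h01 * g0) / det
        b0, b1 = b0 + step0, b1 + step1
        if abs(step0) + abs(step1) < tol:
            break
    return b0, b1


# Hypothetical transects: colony counts, hard-substrate indicator (0/1),
# and area surveyed (m^2). Transects differ in effort.
counts = [10, 20, 20, 60]
hard_substrate = [0, 0, 1, 1]
area_surveyed = [10.0, 20.0, 10.0, 30.0]

b0, b1 = fit_poisson_with_offset(counts, hard_substrate, area_surveyed)
density_soft = math.exp(b0)       # colonies per m^2 on soft substrate
density_hard = math.exp(b0 + b1)  # colonies per m^2 on hard substrate
```

Because effort enters as an offset, the coefficients describe density per unit area rather than raw counts. In this toy data set, the second transect yields twice the count of the first only because twice the area was surveyed, and the fitted densities reflect that.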

Integration of Multiple Datasets and Types
In some cases it may be appropriate to combine multiple biological datasets. A common issue with DSCS sampling data is that the spatial footprint of an individual sampling program often does not cover the entire geographic area of interest to management or the geographic range of the species. A solution to this issue is to fit SDMs to data collected by multiple sampling programs (Case Study 1). Differences in detectability among sampling programs can be accounted for through inclusion of a "catchability" covariate in the SDMs (Grüss et al., 2018). If different data types (e.g., presence-absence, count, and biomass) are available, recent developments allow for the implementation of "data-integrated" SDMs (Fletcher et al., 2019; Grüss and Thorson, 2019; Miller et al., 2019). It is also possible to fit data-integrated SDMs to a combination of presence-only data and other types of data. However, in this specific case, the data-integrated SDMs must be expanded to account for sampling intensity and the covariates influencing sampling intensity for the presence-only data.

DSCS-Fish Associations
Managers are often interested not only in DSCS spatial distributions, but also in community structure and DSCS-fish associations. A simple way to explore these associations is to examine correlations between DSCS presence, abundance, density, or total cover and fish presence, abundance, or density (Auster, 2005; Malecha et al., 2005; Stone, 2006; Tissot et al., 2006; Kenchington et al., 2013; Conrath et al., 2019). Alternatively, the presence, abundance, or density of fish can be modeled as a function of DSCS-related and abiotic covariates (Laman et al., 2015; Sigler et al., 2015). A more community-oriented approach is to identify distinct habitat clusters through analysis of DSCS sampling data and then ascertain which fish species are associated with the different habitat clusters (Woolley et al., 2013; D'Onghia et al., 2016). Joint SDMs, which consider fish and DSCS species simultaneously and correlations among these species (e.g., Ovaskainen et al., 2017; Thorson and Barnett, 2017), can be employed to reveal community structure and which fish groups associate with which DSCS groups. With any of these approaches, it is important to distinguish the correlations between DSCS and fish per se from any apparent correlations that arise simply because both are correlated with other covariates in similar ways.
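One simple way to see how a shared covariate can create an apparent DSCS-fish correlation is to compare the raw correlation with a partial correlation that controls for that covariate. The sketch below is an illustration only: partial correlation is a stand-in chosen for brevity, not a method attributed to the studies cited above, and the depth, coral, and fish values are invented so that both taxa depend on depth but not on each other.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_correlation(x, y, z):
    """Correlation between x and y after removing the linear effect of z."""
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Hypothetical data for eight stations: both densities increase along a
# depth gradient, with station-level noise that is unrelated between taxa.
depth_index = [1, 2, 3, 4, 5, 6, 7, 8]
coral_density = [2, 1, 2, 5, 6, 5, 6, 9]
fish_density = [2, 1, 4, 3, 4, 7, 6, 9]

raw_r = pearson(coral_density, fish_density)
partial_r = partial_correlation(coral_density, fish_density, depth_index)
```

Here the raw correlation is strong (0.84), yet the partial correlation given depth is essentially zero, which indicates that the apparent association arises from the shared depth gradient rather than from a direct coral-fish relationship.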

Model Ensembles
No SDM is perfect and it is, therefore, good practice for analysts to consider multiple models. Ensemble modeling techniques facilitate the integration of results across multiple models (Case Studies 1 and 3). When working with a model ensemble, it is important to weight the predictions made by the different models of the ensemble using an objective weighting scheme. For example, Rooper et al. (2017a) employed a model ensemble including GLMs, GAMs, BRTs, and RFs to predict the spatial distribution of DSCS in the Gulf of Alaska, and utilized model errors to produce model ensemble predictions. Another example is that of Georgian et al. (2019), who used a model ensemble including MaxEnt models, BRTs and RFs to determine the spatial distribution of vulnerable marine ecosystem indicator taxa in the South Pacific Ocean; the authors constructed a weighted average of the predictions of the different models in the ensemble on the basis of the area under the receiver operating characteristic curve (AUC) and the coefficients of variation of the model predictions. Ensemble models can have better performance and produce less uncertain predictions than individual models (Rooper et al., 2017a; Lo Iacono et al., 2018; Georgian et al., 2019).
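A weighted ensemble average can be sketched in a few lines. The example below assumes one possible objective scheme, weights proportional to the inverse squared prediction error (RMSE) of each model on held-out data; this particular rule, the model names, and all numbers are illustrative assumptions and are not claimed to reproduce the schemes used in the studies cited above.

```python
def ensemble_predict(predictions, rmse):
    """Combine per-model predictions into a weighted average.

    predictions: dict mapping model name -> list of predictions per cell
    rmse: dict mapping model name -> error on held-out data
    Weights are proportional to 1 / rmse^2 and sum to one.
    """
    inverse_error = {name: 1.0 / rmse[name] ** 2 for name in predictions}
    total = sum(inverse_error.values())
    weights = {name: value / total for name, value in inverse_error.items()}
    n_cells = len(next(iter(predictions.values())))
    combined = [
        sum(weights[name] * predictions[name][i] for name in predictions)
        for i in range(n_cells)
    ]
    return combined, weights


# Hypothetical predicted probabilities of presence for two grid cells.
predictions = {"glm": [0.2, 0.6], "gam": [0.4, 0.8]}
held_out_rmse = {"glm": 0.2, "gam": 0.1}

combined, weights = ensemble_predict(predictions, held_out_rmse)
```

With these inputs the model with half the error receives four times the weight (0.8 versus 0.2), so the ensemble prediction sits closer to that model's output.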

Spatial Autocorrelation
An important consideration for any SDM is the assessment of and accounting for spatial autocorrelation in the residual errors (Case Study 3). It is typically the case that the predictor variables included in an SDM only explain some of the spatial variation in a species' distribution. Remaining unexplained variation can lead to spatial autocorrelation in residual errors, which affects statistical inference (Legendre, 1993). There are numerous approaches for addressing spatial autocorrelation in SDMs, although DSCS SDMs have rarely employed these techniques (but see Georgian et al., 2019). An SDM approach that has become more common recently is to estimate spatial and spatio-temporal variation in the quantities of interest (e.g., probability of presence) to explicitly account for the component of the species' distribution that is not explained by the predictor variables (e.g., Shelton et al., 2014; Thorson and Barnett, 2017). At a minimum, spatial autocorrelation in the residual errors of any DSCS SDM should be assessed and statistical inferences adjusted accordingly.
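One widely used diagnostic for residual spatial autocorrelation is Moran's I. The sketch below is a minimal illustration, assuming binary neighbor weights (1 for pairs of samples within a chosen distance, 0 otherwise) and invented residuals along a transect; the choice of statistic, the distance threshold, and the data are assumptions of this example rather than recommendations from the cited studies.

```python
import math


def morans_i(values, coords, max_dist):
    """Moran's I with binary weights: pairs of distinct points no farther
    apart than `max_dist` are neighbours (weight 1), all others weight 0."""
    n = len(values)
    mean = sum(values) / n
    deviations = [v - mean for v in values]
    cross_products = 0.0
    weight_sum = 0.0
    for i in range(n):
        for j in range(n):
            if i != j and math.dist(coords[i], coords[j]) <= max_dist:
                cross_products += deviations[i] * deviations[j]
                weight_sum += 1.0
    sum_squares = sum(d * d for d in deviations)
    return (n / weight_sum) * (cross_products / sum_squares)


# Hypothetical model residuals at six stations spaced 1 km apart.
coords = [(0, 0), (1, 0), (2, 0), (3, 0), (4, 0), (5, 0)]
clustered_residuals = [1, 1, 1, -1, -1, -1]
alternating_residuals = [1, -1, 1, -1, 1, -1]

i_clustered = morans_i(clustered_residuals, coords, max_dist=1.0)
i_alternating = morans_i(alternating_residuals, coords, max_dist=1.0)
```

With no autocorrelation, the expected value of Moran's I is -1/(n - 1), or -0.2 for six stations. The clustered residuals give 0.6, indicating positive autocorrelation that the covariates have not explained, whereas the alternating residuals give -1.0.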

Uncertainty and Validation
Regardless of the modeling approach taken, predictions from DSCS SDMs should be presented with associated estimates of uncertainty (Case Studies 1 and 3) and be validated (Case Studies 1-3). A common approach to estimating uncertainty in model predictions is non-parametric bootstrapping (Georgian et al., 2019). Bootstrapping provides a set of replicate model predictions from which various statistics can be calculated to characterize their statistical distribution (e.g., mean, standard deviation). The coefficient of variation (CV) can be particularly useful for comparing relative uncertainty in predictions among models (Rooper et al., 2017a; Georgian et al., 2019). Mapping spatial uncertainty is informative for managers; for example, uncertainty information can be used to prioritize areas that would benefit most from additional data collection (i.e., areas with high uncertainty). Model predictions should also be validated, ideally by using new independent data collected in a statistically robust manner for model validation purposes (Williams and Brown, 2019). Simulation can be a useful tool for determining optimal sampling designs (Hirzel and Guisan, 2002). When new surveys are not practical or are cost prohibitive, data subsetting can be used to derive "training" and "test" data, whereby the model is fit to the former and then the fitted model is used to predict the latter, allowing an assessment of how well the model performs with respect to "new" data (Rooper et al., 2014, 2017a). Cross-validation is a common form of data subsetting that is often combined with a model selection process to optimize a model's predictive ability (Kuhn and Johnson, 2013). Using spatial units as cross-validation folds can be a useful technique for developing a model that predicts well to new areas (Valavi et al., 2019).
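The bootstrap procedure described above can be sketched as follows. To keep the example short, the "model" is simply a density estimate (total count divided by total area surveyed), standing in for a fitted SDM; the transect data, the number of replicates, and the function name are hypothetical.

```python
import random
import statistics


def bootstrap_density(counts, areas, n_boot=1000, seed=1):
    """Non-parametric bootstrap of a density estimate.

    Transects are resampled with replacement; each replicate density is
    total count / total area of the resampled transects. Returns the
    replicate mean, standard deviation, and coefficient of variation.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(counts)
    replicates = []
    for _ in range(n_boot):
        sample = [rng.randrange(n) for _ in range(n)]
        total_count = sum(counts[i] for i in sample)
        total_area = sum(areas[i] for i in sample)
        replicates.append(total_count / total_area)
    mean = statistics.mean(replicates)
    sd = statistics.stdev(replicates)
    return mean, sd, sd / mean


# Hypothetical colony counts and area surveyed (m^2) on eight transects.
counts = [0, 3, 5, 0, 12, 1, 0, 7]
areas = [100, 120, 90, 110, 100, 95, 105, 100]

boot_mean, boot_sd, boot_cv = bootstrap_density(counts, areas)
point_estimate = sum(counts) / sum(areas)
```

For a spatial model, the same resampling would be repeated with the model refitted to each bootstrap sample, and the CV would be computed cell by cell to produce an uncertainty map.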

Model Performance
Many metrics exist for assessing model fit and predictive performance, each with strengths and limitations. AUC is commonly employed for presence-only and presence-absence models, although it is important to be aware of limitations with this metric, especially when comparing the performance of models across species with different distributions and abundances (Lobo et al., 2008; Jimenez-Valverde, 2012). A variety of threshold-dependent metrics exist, but the choice of threshold, choice of metric, and species prevalence can affect apparent performance and must be considered carefully (Liu et al., 2005; Allouche et al., 2006). Other metrics appropriate for models fitted to binary data include point biserial correlation between predictions and observations, calibration plots, and adjusted-R², among others (Rooper et al., 2018). In general, assessing the fit of models to binary data is challenging; there are more standard options for assessing fit to count and continuous data such as correlations between predictions and observations, root mean square error, residual deviations, etc. Likelihood-derived measures like percent deviance explained and the Akaike Information Criterion (AIC) can be calculated for likelihood-based models of presence-absence, count, and continuous data. AIC is an example measure that provides an indication of model fit balanced against model complexity, thereby potentially better representing a model's predictive ability. In general, performance metrics calculated with respect to training data will indicate how well a model captures variation in existing data, while performance metrics calculated with respect to test data will indicate a model's predictive ability (Section Uncertainty and Validation). If one's primary objective is prediction to unsampled areas, performance should be evaluated in terms of test data.
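Two of the metrics named above are straightforward to compute directly. The sketch below calculates AUC for presence-absence predictions using its rank interpretation (the probability that a randomly chosen presence receives a higher prediction than a randomly chosen absence, with ties counted as one half) and root mean square error for density predictions. The test observations and predictions are invented for illustration.

```python
import math


def auc(observed, predicted):
    """Area under the ROC curve from pairwise presence/absence comparisons."""
    presences = [p for o, p in zip(observed, predicted) if o == 1]
    absences = [p for o, p in zip(observed, predicted) if o == 0]
    score = sum(
        1.0 if p > a else 0.5 if p == a else 0.0
        for p in presences
        for a in absences
    )
    return score / (len(presences) * len(absences))


def rmse(observed, predicted):
    """Root mean square error between observations and predictions."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical test data: presence-absence with predicted probabilities,
# and observed versus predicted densities.
test_presence = [1, 1, 1, 0, 0, 0]
test_probability = [0.9, 0.8, 0.4, 0.5, 0.3, 0.1]
test_density = [0.0, 2.0, 4.0]
predicted_density = [1.0, 2.0, 2.0]

auc_value = auc(test_presence, test_probability)    # 8 of 9 pairs ranked correctly
rmse_value = rmse(test_density, predicted_density)  # sqrt(5 / 3)
```

Computing both metrics on held-out test data, rather than on the data used to fit the model, is what makes them indicators of predictive ability.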

Management
Arguably, the most important element in science designed to support resource management is how the interface between science and management is designed. This interface is best conceived as an iterative process, where managers define goals, scientists provide scientific recommendations, managers refine goals and request more input, scientists validate and extend previous results, and so on (Case Study 1). Integrated ecosystem assessments (sensu Levin et al., 2009) are an example formalization of such an iterative process.
Science that supports management of DSCS poses several unique challenges. In particular, SDMs representing DSCS are usually developed based on opportunistic data at large spatial scales that are not designed specifically to sample DSCS (e.g., Sigler et al., 2015), or alternatively based on data that were designed to sample DSCS but only at small spatial scales. To develop confidence in results among stakeholders and managers, SDMs that integrate these two data types must be carefully validated to (1) determine whether the geographic distribution is accurately represented, and (2) identify critical knowledge gaps regarding habitat. This validation should be conducted across the entire spatial extent being considered using a probabilistic sampling design that provides strong statistical inference and which includes areas with both high and low predicted DSCS density to corroborate inference for both high and low-quality habitats, using sampling techniques such as underwater cameras that can positively confirm the presence or absence and density of DSCS (e.g., Rooper et al., 2016).
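A validation design that covers both high and low predicted density can be drawn by stratified random sampling of model grid cells. The following sketch is one simple possibility, assuming three strata defined by hypothetical probability breakpoints and an equal number of stations per stratum; the cell identifiers, breakpoints, and sample sizes are illustrative only, and an operational design would also weigh logistics and spatial balance.

```python
import random


def stratified_validation_sample(cell_predictions, breaks, n_per_stratum, seed=0):
    """Randomly select validation cells within strata of predicted probability.

    cell_predictions: dict mapping cell id -> predicted probability
    breaks: ascending breakpoints separating the strata
    Returns a dict mapping stratum index (0 = lowest) -> sorted cell ids.
    """
    rng = random.Random(seed)  # fixed seed so the draw can be reproduced
    strata = {i: [] for i in range(len(breaks) + 1)}
    for cell_id, probability in cell_predictions.items():
        stratum = sum(probability >= b for b in breaks)
        strata[stratum].append(cell_id)
    return {
        stratum: sorted(rng.sample(cells, min(n_per_stratum, len(cells))))
        for stratum, cells in strata.items()
    }


# Hypothetical predictions for 20 grid cells, spanning 0.00 to 0.95.
cell_predictions = {f"cell{i:02d}": i / 20 for i in range(20)}

design = stratified_validation_sample(
    cell_predictions, breaks=[0.33, 0.66], n_per_stratum=3
)
```

Because every cell within a stratum has a known selection probability, observations from such a design support design-based inference about model accuracy in both low- and high-probability habitat.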
To further promote stakeholder and manager understanding, confidence, and usage of DSCS SDMs, we suggest expressing model predictions and their associated uncertainties in units that are easily tested and widely understood (Case Studies 1-3). For example, SDM results should be expressed as encounter probabilities for a given sampling program, expected biomass, size, etc. rather than as "relative habitat suitability," which is typically produced by presence-only models. These ecologically interpretable metrics also lend themselves more easily to thresholding if required for management. These metrics can be computed using models that are simultaneously fitted to different types of data, e.g., biomass, counts, and presence-absence samples (Grüss and Thorson, 2019), although this data-integrated approach has not typically been used when modeling DSCS.
Productive communication between scientists, managers, and stakeholders also is essential for stakeholder and manager understanding of DSCS SDMs. Stakeholders often view protection of DSCS as a zero-sum game in balancing sustainable fishing practices and habitat protection. Non-governmental organizations (NGOs) want more habitat protection while the fishing community is concerned that the protections will go beyond those necessary for sustainable fishing. Some controversy is inevitable when management affects allowed behavior for different stakeholders (MacLean et al., 2017). Early and frequent communication by scientists with NGOs and fishing associations helps everyone understand the SDM progress and results and helps to reduce controversy in the management decision.
Finally, DSCS SDMs can play an important role in choosing among alternative management actions. For example, management strategy evaluation (Bunnefeld et al., 2011) can be conducted to evaluate how different spatial fishery closures are likely to perform in terms of protecting DSCS given the uncertainty associated with existing SDMs.
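As a minimal illustration of one performance metric that such an evaluation might track, the sketch below compares two hypothetical closures by the fraction of predicted DSCS abundance they enclose across a set of uncertainty draws from an SDM. It is not a full management strategy evaluation, which would also simulate fishery and population dynamics. The cell identifiers, closure names, and density values are invented for illustration.

```python
def protected_fraction(density_draws, closure_cells):
    """Fraction of total predicted abundance inside a closure, one value per draw.

    density_draws: list of dicts mapping cell id -> predicted density,
                   one dict per uncertainty draw (e.g., bootstrap replicate).
    closure_cells: set of cell ids that the candidate closure would protect.
    """
    fractions = []
    for draw in density_draws:
        total = sum(draw.values())
        inside = sum(d for cell, d in draw.items() if cell in closure_cells)
        fractions.append(inside / total if total > 0 else 0.0)
    return fractions


def summarise(fractions):
    """Mean and worst-case protected fraction across draws."""
    ordered = sorted(fractions)
    return {"mean": sum(ordered) / len(ordered), "worst": ordered[0]}


# Hypothetical example: three grid cells, two uncertainty draws, two closures.
density_draws = [
    {"a": 4.0, "b": 1.0, "c": 5.0},
    {"a": 2.0, "b": 2.0, "c": 6.0},
]
closures = {"north": {"a", "b"}, "south": {"c"}}

for name, cells in closures.items():
    print(name, summarise(protected_fraction(density_draws, cells)))
```

Reporting the worst case alongside the mean is one way to show managers how robust a closure is to SDM uncertainty; other summaries (e.g., quantiles) could be substituted.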

CASE STUDIES
We now review three case studies that illustrate some of the good practices that we have discussed previously. These case studies demonstrate that our good practices can be accomplished and inform management objectives despite real-world constraints regarding funding, data availability, and analytical capacities.

Eastern Bering Sea Coral Canyons
The canyons that incise the eastern Bering Sea outer shelf and slope are among the largest in the world. The eastern Bering Sea is also an important source of wild-caught seafood, accounting for about 40% of the volume by weight of U.S. catches. In 2011, a group of NGOs asked the North Pacific Fishery Management Council (NPFMC) to consider protecting some eastern Bering Sea canyon habitat from the effects of fishing. In response, in 2012 the NPFMC requested an analysis of existing data on canyon habitat, fish associations, and fishing activity (Figure 3). DSCS presence-absence data collected during region-wide bottom trawl surveys were analyzed using GAMs to estimate probability of presence of DSCS for the eastern Bering Sea outer shelf and slope (Sigler et al., 2015; Figure 3); these results were reported to the NPFMC in 2013. In their discussion, the NPFMC recognized that management measures might be necessary for coral conservation and management and that the distribution modeling had highlighted areas that might be of concern; however, the NPFMC called for a validation study to test model predictions before making a decision.
A validation study was conducted in 2014 using an underwater stereo camera system (Williams et al., 2010). The stereo camera system is highly desirable for region-wide surveys because it can generate large sample sizes (8-10 deployments per day at depths of up to 800 m are routinely achievable) potentially across a large region, where each deployment can sample ∼1,100 m² during a 15-min transect. A stratified random survey was conducted where nine strata were identified as five canyons, inter-canyon areas, and the outer continental shelf. Sample locations (n = 210 stations) were randomly chosen, but weighted by the probability of coral presence from the bottom trawl survey model, so that on average, locations with higher probability of coral presence were more likely to be chosen. In addition, 10 locations per stratum (n = 90 stations total) were randomly assigned to ensure that unlikely coral habitats also were sampled. During the subsequent fieldwork, a total of 250 locations were sampled (sample locations in the two canyons north of Zhemchug Canyon were largely unsampled due to weather and time constraints). The model validation was conducted by comparing the predictions of the bottom trawl survey model with the observations from the field survey. The new stereo-camera presence data were also used to update the model predictions and to predict the distribution of density and height of corals and sponges from the underwater stereo camera survey (Rooper et al., 2016; Figure 3); these results were reported to the NPFMC in 2015.
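The station-selection logic described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure actually used in the survey: the grid of candidate cells, the stratum labels, the `p_coral` values, and the function name `draw_stations` are all hypothetical, and candidate cells are assumed to have strictly positive predicted probabilities.

```python
import random


def draw_stations(cells, n_weighted, n_per_stratum, seed=0):
    """Select stations in two steps, without replacement.

    cells: list of dicts with keys 'id', 'stratum', and 'p_coral'
           (predicted probability of coral presence, assumed > 0).
    Step 1 draws n_weighted cells with probability proportional to p_coral.
    Step 2 adds n_per_stratum uniformly random cells from each stratum so
    that habitat predicted to be unlikely for coral is also sampled.
    """
    rng = random.Random(seed)
    remaining = list(cells)
    chosen = []

    # Step 1: sequential weighted draws, removing each chosen cell.
    for _ in range(min(n_weighted, len(remaining))):
        weights = [c["p_coral"] for c in remaining]
        idx = rng.choices(range(len(remaining)), weights=weights, k=1)[0]
        chosen.append(remaining.pop(idx))

    # Step 2: unweighted draws within each stratum from the cells left over.
    for stratum in sorted({c["stratum"] for c in cells}):
        pool = [c for c in remaining if c["stratum"] == stratum]
        extra = rng.sample(pool, min(n_per_stratum, len(pool)))
        extra_ids = {c["id"] for c in extra}
        chosen.extend(extra)
        remaining = [c for c in remaining if c["id"] not in extra_ids]

    return chosen


# Hypothetical candidate grid: 9 strata with 100 cells each.
_rng = random.Random(42)
cells = [
    {"id": s * 100 + i, "stratum": s, "p_coral": _rng.uniform(0.01, 0.9)}
    for s in range(9)
    for i in range(100)
]
stations = draw_stations(cells, n_weighted=210, n_per_stratum=10, seed=1)
print(len(stations))
```

Combining a probability-weighted draw with a fixed unweighted allocation per stratum concentrates effort where corals are expected while still providing observations with which to test predicted absences.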
The NPFMC decided against closures of the eastern Bering Sea outer shelf and slope. They concluded that the scientific evidence did not suggest a risk to deep-sea corals, citing the low occurrence and density of deep-sea corals (other than pennatulaceans), the lack of hard substrate to support these corals, and their relatively small size, interpreted as indicating lower vulnerability in these areas to fishery impacts (MacLean et al., 2017). Their conclusion was based on the coral density and height models and the stereo camera survey. The key lessons in this case study in terms of coral modeling were: (1) a model was generated using the best available data (presence and absence from the bottom trawl survey); (2) this model was presented to managers and evaluated by scientists; (3) managers requested more information and validation of the model before using it to make decisions that would affect fisheries; (4) additional data on occurrence, density, and height to validate and improve the model were collected with a sample size and robust statistical design that allowed for inference throughout most of the spatial domain of the model; and (5) these new findings were presented to managers for further evaluation. Throughout the process there were multiple points where updates were provided to managers, stakeholders and scientists so there would be no surprises when each report was formally presented to managers and stakeholders. In this case, a key point was that managers waited for completion of a region-wide field validation study before reaching a decision and were able to reach that decision with more confidence that the deep-sea coral model based on trawl survey data was largely correct (predicting the presence or absence of corals in the camera observations correctly in 72% of cases; Rooper et al., 2016).
Figure 3 (caption beginning lost in extraction): [...] Sigler et al., 2015, left) and predicted density of coral (adapted from Rooper et al., 2016, right). Predicted presence (left) was reported in June 2013 and was based on presence-absence data collected during ecosystem-scale bottom trawl surveys. Predicted density (right) was reported in October 2015 and was based on density data collected during an ecosystem-scale underwater stereo camera survey (the validation study) that was used to update the predicted presence model. Black areas indicate land, and black lines represent boundaries between inner, middle, and outer shelf and slope areas (Sigler et al., 2015).
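A validation statistic of the kind quoted for this case study (the percentage of camera observations whose presence or absence was correctly predicted) can be computed as sketched below. The 0.5 probability threshold, the function name, and the example values are assumptions for illustration; the threshold actually used by Rooper et al. (2016) is not stated here.

```python
def classification_summary(pred_prob, observed, threshold=0.5):
    """Compare thresholded model predictions with field observations.

    pred_prob: predicted probabilities of presence at validation stations.
    observed:  1 if the taxon was seen at the station, 0 otherwise.
    threshold: probability at or above which presence is predicted
               (0.5 is an illustrative assumption).
    """
    tp = fp = tn = fn = 0
    for p, obs in zip(pred_prob, observed):
        predicted_present = p >= threshold
        if predicted_present and obs:
            tp += 1
        elif predicted_present and not obs:
            fp += 1
        elif not predicted_present and obs:
            fn += 1
        else:
            tn += 1
    n = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / n,
        # Proportion of observed presences that were predicted.
        "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
        # Proportion of observed absences that were predicted.
        "specificity": tn / (tn + fp) if (tn + fp) else float("nan"),
    }


# Hypothetical validation stations.
pred = [0.9, 0.8, 0.2, 0.1, 0.6, 0.3, 0.7, 0.4]
obs = [1, 1, 0, 0, 0, 1, 1, 0]
summary = classification_summary(pred, obs)
print(summary)
```

Reporting sensitivity and specificity alongside overall accuracy shows whether errors are concentrated in predicted presences or predicted absences, which matters when absences are far more common than presences.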

Southern California Bight
The Southern California Bight (SCB) has a variety of seafloor habitats and oceanographic conditions that promote a rich diversity of demersal organisms, including a wide variety of fishes (Love et al., 2009) and dense stands of vulnerable DSCS (Tissot et al., 2006; Yoklavich et al., 2011). The SCB is also bordered by one of the most populated areas along the Pacific coast of North America, and the waters of the SCB have been intensively fished both commercially and recreationally to depths over 300 m for at least 40 years (Yoklavich et al., 2011; Salgado et al., 2018). Fishes in the SCB are dominated by rockfish (genus Sebastes; Love et al., 2009), and the two overfished species in the SCB are both within this genus: cowcod (S. levis) and yelloweye (S. ruberrimus). Stakeholder input typically includes a tension between competing goals (maintaining fishing and rebuilding overfished species). Since 2002, management agencies (Pacific Fishery Management Council, NOAA Fisheries, Channel Islands National Marine Sanctuary, and the State of California) have worked together to establish a network of marine reserves with multiple aims, including habitat protection. To conserve and manage the habitats these fish rely on, including DSCS, it is critical to develop SDMs that can provide guidance on the best locations for restrictions on human activities.
In support of these goals, researchers at NOAA's Southwest Fisheries Science Center collected a long time-series (2001-2011) of submersible video from which they built a database of spatially explicit fish and DSCS data (Tissot et al., 2006; Love et al., 2009; Yoklavich et al., 2011) that can be used to build SDMs. All video data recorded during these surveys were processed (i.e., no sub-sampling occurred) and georeferenced. All fish and invertebrate species were identified to the lowest possible taxonomic level and the length and height of invertebrates were measured to the nearest 5 cm using calibrated lasers. Furthermore, the submersible track locations from all survey dives were recorded, providing data on both DSCS presence and absence. These data were used to quantify the area surveyed and, thus, estimate the density of DSCS. Using these data, Huff et al. (2013) built a GAM to estimate the density of the Christmas tree coral (Antipathes dendrochristos) as a function of multiple environmental covariates (depth, slope, profile, surface productivity, and oceanographic conditions near the seafloor). This GAM was then used to develop an SDM for Christmas tree coral in the SCB (Figure 4).
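The step from georeferenced tracks to density can be illustrated with a short calculation. This sketch assumes that the area surveyed is approximated as track length multiplied by a fixed viewing-strip width; the strip width, the function name, and the dive values are hypothetical and are not taken from the SCB surveys.

```python
def density_per_100m2(count, track_length_m, strip_width_m):
    """Colonies per 100 m^2 for one transect segment.

    count:          number of colonies observed along the segment.
    track_length_m: length of the submersible track for the segment (m).
    strip_width_m:  width of the viewed strip (m); an assumed constant here.
    """
    area_m2 = track_length_m * strip_width_m
    if area_m2 <= 0:
        raise ValueError("surveyed area must be positive")
    return 100.0 * count / area_m2


# Hypothetical dive segments: (colony count, track length in m).
segments = [(12, 400.0), (0, 250.0), (3, 600.0)]
densities = [density_per_100m2(n, length, strip_width_m=2.0) for n, length in segments]
print(densities)
```

Because segments with zero colonies still contribute a known surveyed area, they enter the analysis as true absences (density of zero) rather than being discarded, which is what allows density models to be fitted instead of presence-only models.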
These SDMs are extremely valuable to fisheries managers mandated to conserve and manage marine fisheries and their associated habitats, and recent evidence from the SCB has provided further support that multiple species of DSCS are important habitat for demersal fish. Structure-forming DSCS can play an important role in deep-sea benthic communities (D'Onghia, 2019), providing increased prey density (Quattrini et al., 2012) and nursery habitat (Stone, 2006; Baillon et al., 2012). A recent study (Henderson et al., in revision) in the SCB used various spatially explicit analytical methods to identify 8 DSCS taxa that increased the probability of presence for commercially important Sebastes species (S. rufus and S. paucispinis) as well as young-of-year Sebastes, even after accounting for depth and seafloor relief. These results support the classification of these DSCS as essential fish habitat.
The key lessons from this case study are that: (1) it is important to analyze, and geo-reference, all collected data so researchers are not restricted to using presence-only methods; (2) georeferencing both the fish and DSCS data can provide sufficient data to investigate whether DSCS taxa serve as essential fish habitat within the survey area; and (3) quantifying the area surveyed allows estimation of the density of any DSCS taxa of interest, which provides more management-relevant information than presence-absence estimates. Based on our good practices, we recommend that researchers implement a field-sampling program to validate results from these DSCS SDMs.

Louisville Seamount Chain
The Louisville Seamount Chain comprises over 80 seamounts spanning a distance of more than 4,000 km across the South Pacific Ocean. These seamounts are historical and current targets for commercial bottom-trawling fisheries, dominated by the catch of orange roughy (Hoplostethus atlanticus) by New Zealand and Australian flagged vessels. These fisheries pose a considerable threat to Vulnerable Marine Ecosystems (VMEs), ecosystems that are particularly susceptible to anthropogenic disturbance as determined by the fragility, functional significance, rarity, and life history traits of their components [FAO (Food and Agriculture Organization of the United Nations), 2009]. The seamount chain lies outside of national jurisdiction, and fisheries management is conducted by the South Pacific Regional Fisheries Management Organization (SPRFMO), an intergovernmental agency mandated by the United Nations to protect VMEs while also ensuring the future of sustainable fisheries [UNGA (United Nations General Assembly), 2006]. SPRFMO has implemented two primary interim measures to protect VMEs. First, a move-on rule requires vessels to cease fishing within 5 nautical miles of an encountered VME, usually triggered by the bycatch of a VME indicator taxon. Indicator taxa are those that are vulnerable to fishing gear, functionally significant (e.g., habitat creators), rare or endemic, or low productivity species (e.g., slow growth rate, low fecundity) (Parker et al., 2009). Second, SPRFMO enacted a series of large (20 arc-min) closures in areas with a historically low fishing impact. However, the underlying effectiveness and real-world implementation of these interim measures have been questioned (Penney and Guinotte, 2013), and SPRFMO is actively pursuing long-term replacement measures with a focus on more effective spatial closures.
Rowden et al. (2017) used a variety of habitat suitability modeling techniques to map the distribution of VMEs across the Louisville Seamount Chain in order to support the improved spatial management of the region's fisheries (Figure 5). The authors built on a series of broad-scale regional models that generally highlighted the seamount chain as being highly suitable for a variety of VME taxa (Anderson et al., 2016a,b; Georgian et al., 2019). Three VME indicator taxa were modeled at a high spatial resolution (25 m) on seven seamounts: the stony coral Solenosmilia variabilis, sea stars (Brisingida), and crinoids (Crinoidea). Biological data were collected during a cruise designed to ground-truth the regional models using a random-stratified survey. Environmental data (backscatter and a large suite of multi-scale terrain metrics) were derived from high-resolution multibeam surveys conducted during the cruise. Both presence-absence and abundance ensemble models were constructed using the performance-weighted average of BRT, GAM, and RF models. Model performance was assessed using a 70-30 cross-validation approach and by bootstrapping the input presence-absence and abundance data to estimate model uncertainty (CV) (sensu Anderson et al., 2016b). To reduce the effect of spatial autocorrelation on model performance, the residual autocovariate was calculated from the residuals of preliminary models (Crase et al., 2012).
Figure 5 (caption beginning lost in extraction): [...] (Rowden et al., 2017). Left: Regional habitat suitability models (1 km, adapted from Georgian et al., 2019) indicated high suitability for the stony coral Solenosmilia variabilis throughout the Louisville Seamount Chain. Right: High resolution models (25 m, adapted from Rowden et al., 2017) were constructed for six individual seamounts (Forde Seamount shown), and Vulnerable Marine Ecosystems (VMEs) were identified based on a threshold applied to S. variabilis abundance models.
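The two numerical summaries mentioned above, a performance-weighted ensemble average and a bootstrap-based coefficient of variation (CV), can be sketched as follows. This is a simplified illustration rather than the workflow of Rowden et al. (2017): model fitting and refitting on bootstrap resamples are omitted, and the per-model predictions, the performance scores used as weights, and the function names are hypothetical.

```python
import statistics


def weighted_ensemble(predictions, scores):
    """Performance-weighted average of per-cell predictions.

    predictions: dict mapping model name -> list of predictions per grid cell.
    scores:      dict mapping model name -> performance score used as weight
                 (e.g., a cross-validated skill metric; assumed positive).
    """
    total = sum(scores[m] for m in predictions)
    n_cells = len(next(iter(predictions.values())))
    return [
        sum(scores[m] * predictions[m][i] for m in predictions) / total
        for i in range(n_cells)
    ]


def cv_per_cell(boot_predictions):
    """Coefficient of variation (sd / mean) per cell across bootstrap replicates.

    boot_predictions: list of replicates, each a list of per-cell predictions
                      from a model refitted to one bootstrap resample.
    """
    cvs = []
    for cell_values in zip(*boot_predictions):
        mean = statistics.mean(cell_values)
        sd = statistics.stdev(cell_values)  # sample standard deviation
        cvs.append(sd / mean if mean != 0 else float("nan"))
    return cvs


# Hypothetical predictions for two grid cells from three model types.
predictions = {"BRT": [0.2, 0.8], "GAM": [0.4, 0.6], "RF": [0.3, 0.9]}
scores = {"BRT": 0.8, "GAM": 0.7, "RF": 0.9}
ensemble = weighted_ensemble(predictions, scores)

# Hypothetical ensemble predictions from two bootstrap replicates.
boot = [[1.0, 2.0], [3.0, 2.0]]
uncertainty = cv_per_cell(boot)
print(ensemble, uncertainty)
```

Mapping the CV alongside the ensemble prediction gives a spatially explicit view of where the models agree and where predictions rest on weaker support.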
Ultimately, Rowden et al. (2017) demonstrated that the modeled seamounts constituted less suitable habitat than previously predicted by broader scale regional models. However, small sections of each seamount (<0.1% of the modeled area) contained highly suitable habitat for one or more VME indicator taxa. These results suggest that an optimal management solution that prioritizes VME protection while simultaneously allowing fishery access to less suitable areas may exist for the Louisville Seamount Chain. The modeling approach used provides several useful lessons for future modeling studies, particularly when resource management is the eventual goal: (1) it is imperative to revise and improve models when new or better data become available; (2) high resolution models continue to be important tools that capture spatial patterns that may differ from broad scale models and have the potential to significantly alter spatial management plans (e.g., Dolan et al., 2008); (3) true presence-absence and abundance models should be used when the data allow, given their generally improved performance and easier ecological interpretation (see Yackulic et al., 2013); (4) sampling bias and spatial autocorrelation, although frequently ignored in modeling studies (see Dormann, 2007), should be accounted for using either statistical approaches (e.g., Georgian et al., 2019) or an appropriate survey structure (e.g., Rowden et al., 2017); (5) ensemble models are powerful tools that avoid overreliance on the underlying structure and assumptions of different modeling techniques (Robert et al., 2016); and (6) calculating a spatial measure of model uncertainty allows decision makers to prioritize management measures based on both the suitability of a given area as well as the confidence of the model at that site (Anderson et al., 2016a,b).

CONCLUSION
The geographic scale of potential human impacts on DSCS combined with the localized nature of available information about the spatial distributions of these species necessitates the use of SDMs by managers looking to assess and mitigate these impacts. Here we have described some good practices for developing such models and making use of them in management decision-making (Table 1). In particular, we recommend that SDMs move beyond presence-only models to include measurements of DSCS presence-absence, abundance, and height; these predictions are easier to validate and/or test in subsequent targeted field-sampling programs and are more easily understood by managers and stakeholders. Model predictions should be presented with associated measures of uncertainty, where this uncertainty is used to weight predictions from different models within an ensemble of SDMs. We strongly encourage consideration of scale in DSCS SDMs used in management decision-making, including spatial (e.g., large marine ecosystem or management region, or more localized marine protected area or ecologically sensitive location) and temporal (e.g., short-term response to specific impacts or long-term) scale (Lecours et al., 2015). We recognize the logistical challenges and expense of collecting data on DSCS and appreciate that it may be impractical to implement all of these good practices in every situation. Indeed, the case studies presented here sometimes required significant financial resources, and they did not necessarily incorporate every one of our good practices. Nevertheless, we hope that these good practices provide useful guidance for future DSCS data collections and SDMs, especially models that will be used to inform management of human activities in the deep sea.