Predicting Optimal Sites for Ecosystem Restoration Using Stacked-Species Distribution Modeling

Habitat restoration is an important tool for managing degraded ecosystems, yet the success of restoration projects depends in part on adequately identifying preferred sites for restoration. Species distribution modeling using a machine learning approach provides novel tools for mapping areas of interest for restoration projects. Here we use stacked-species distribution models (s-SDMs) to identify candidate locations for installation of manmade reefs, a useful management tool for restoring structural habitat complexity and the associated biota in marine ecosystems. We created species distribution models for 21 species of commercial, recreational, ecological, or conservation importance within the Southern California Bight based on observations from long-term reef surveys combined with high resolution (200 × 200 m) geospatial environmental data layers. We then combined the individual species models to create a stacked-species habitat suitability map, identifying over 800 km² of potential area for reef restoration within the Bight. When considering only the 21 focal species, s-SDM scores were positively associated with observed bootstrap species richness on natural reefs (linear model: slope = 0.27, 95% CI = 0.17 - 0.36, w = 1), and this result was supported by two independent test datasets. The predicted richness from this linear model was associated with observed species richness when considering only the focal species on manmade reefs (linear model: slope = 0.52, 95% CI = 0.13 - 0.92, w = 1) and also when considering 204 other non-focal species on both natural and manmade reefs in southern California (slope = 3.65, 95% CI = 2.93 - 4.37, w = 1).
Finally, our results demonstrate that the existing manmade reefs included in our study are, on average, located in regions with habitat suitability that is not only lower than that of natural reefs (t-value = -5.4; p < 0.05), but also only slightly, although significantly, better than random (p < 0.05), demonstrating a need for more biologically informed placement of manmade reefs. The stacked-species distribution model provides insight for marine restoration projects in southern California specifically, but more generally this method can be widely applied to other types of habitat restoration, both marine and terrestrial.


INTRODUCTION
Increasing habitat loss and degradation worldwide threaten many of the world's species (Foley et al., 2005), resulting in population declines (Bender et al., 1998), loss of genetic diversity (Sih et al., 2000), and even species extinctions (Barnosky et al., 2011). Habitat restoration is an important tool for managing degraded ecosystems (Polak and Saltz, 2011) in an effort to restore habitat and prevent species loss (Pavlik, 1996). Yet, habitat restoration initiatives are not always successful (Fischer and Lindenmayer, 2000; Godefroid et al., 2011). One key factor that influences the success of habitat restoration projects is the quality of sites chosen for management (Bottin et al., 2007). For example, manmade habitat structures can fail when placed in areas with non-ideal environmental conditions (e.g., Frissell and Nawa, 1992). To prevent such failures, it is crucial that we develop and test methods for identifying candidate sites for habitat restoration.
Species distribution modeling (a.k.a. ecological or environmental niche modeling) has been proposed as a tool for identifying sites for habitat restoration (Pearce and Lindenmayer, 1998) and is increasingly being utilized for this purpose (Rodríguez et al., 2007). Using observation data in conjunction with spatially gridded environmental data, species distribution modeling identifies environmental predictors of species occurrence, creating a model that is then projected across the landscape to identify other areas of suitable habitat (Elith and Leathwick, 2009; Elith et al., 2011). Species Distribution Models (SDMs) have been used to predict optimal sites for restoration for a wide variety of species, including plants (Yang et al., 2013) and animals (Pearce and Lindenmayer, 1998; Wilson et al., 2011). Yet, while many of these efforts focus on single keystone or focal species (e.g., Pearce and Lindenmayer, 1998; Wilson et al., 2011; Yang et al., 2013), for some degraded habitats, restoration is needed for entire communities (Palmer et al., 1997). Stacked-SDMs, where SDMs are first created for individual species and then combined, provide an opportunity to identify suitable habitat across multiple species. Stacked-SDMs have been used for studying spatial patterns of environmental suitability across a range of taxa (Dubuis et al., 2011; Guisan and Rahbek, 2011; Hof et al., 2012; Calabrese et al., 2014; Hof and Svahlin, 2016; da Mata et al., 2017).
Here we implement stacked-SDMs (s-SDMs) to predict optimal locations for the placement of manmade reefs to restore habitat for shallow rocky reef-associated marine fish, invertebrate, and algal communities. To conduct this research, we take advantage of long-term reef survey datasets (Caselle et al., 2015; Pondella et al., 2015a; Zahn et al., 2016) to build SDMs for the entire extent of rocky reefs within the Southern California Bight (SCB). The rocky reefs of the SCB are a habitat of particular interest because this region is on par with some of the most highly productive ecosystems in the world (Hubbs, 1960; Horn and Allen, 1978; Pondella et al., 2005; Horn et al., 2006). Here, cool waters of the California Current from the north meet with warm waters from the south to create a set of unique environmental conditions that support a wide variety of marine species (Horn and Allen, 1978; Bograd and Lynn, 2003; Pondella et al., 2005; Horn et al., 2006; Hamilton et al., 2010). Naturally occurring hard substrates make up the base of rocky reef habitats from which wide ranging giant kelp forests (Macrocystis pyrifera) grow, providing extensive habitat for many marine fish, invertebrate, and algal species (Graham, 2004; Stephens et al., 2006). Yet at the same time, this productive ecosystem is located next to one of the world's largest megacities, Los Angeles (Nicholls, 1995). As a result, there is intense anthropogenic pressure exerted on this critical ecosystem, including overfishing (Love, 2006; Zellmer et al., 2018), habitat modification due to landslides (Kayen et al., 2002) or development (Ambrose, 1994), and pollution (Schaffner et al., 2015).
Manmade (artificial) reefs have long been used as a successful option for restoration of marine ecosystems (Bohnsack and Sutherland, 1985; Bohnsack et al., 1994). Many hard substrates can serve as manmade reefs, from purposefully designed quarry rock structures to breakwalls, pier pilings, and even sunken shipwrecks (Morris et al., 2018). When standardized, comparisons with natural reefs suggest that manmade reefs can sustain similar levels of species richness and abundance (Carr and Hixon, 1997; Pondella et al., 2002, 2006). Further, some of the best manmade reefs, for example tall quarry rock reefs with high rugosity and steel oil platform structures with extensive spatial coverage, even show evidence of sustaining higher productivity than natural reefs (Johnson et al., 1994; Claisse et al., 2014; Granneman and Steele, 2014; Pondella et al., 2015b). By using environmental data and SDMs to select preferred sites for placement of such manmade reefs, it may be possible to further optimize restoration efforts.
Yet, creating species distribution models for manmade reef restoration poses some unique challenges. Species distribution modeling has been used to study a number of marine ecosystems (Brodie et al., 2018) and for the conservation of marine species (Robinson et al., 2017), including for habitat restoration. For example, Adams et al. (2016) created SDMs for eelgrass restoration. However, rocky reef ecosystems differ from systems like eelgrass communities in that they require specific habitat structures that are largely independent of environmental conditions: rocky infrastructure can be, and is, built in many different places (e.g., manmade reefs, breakwalls, jetties; Morris et al., 2018), and the structures themselves are not constrained by environmental conditions. Further, such projects are time consuming and costly, requiring a significant amount of planning, collaboration, and management. Thus, it is necessary to establish an approach for identifying preferred candidate sites for rocky reef infrastructure by modeling the environmental constraints of the species that inhabit these reefs.
To investigate the utility of SDMs for optimizing the placement of manmade reefs, we created individual-SDMs for 21 species and combined them to create stacked-SDMs to identify hotspots for habitat suitability across multiple species. We validate this approach for identifying candidate sites for habitat restoration by assessing whether the s-SDM values for reefs are positively associated with observed richness of the 21 focal species from reef surveys on already established manmade reefs as well as for an independent dataset of non-focal species that includes 204 fish, invertebrate, and algal species on all surveyed reefs. If SDMs provide an accurate tool for identifying candidate sites, then we would expect multi-species habitat suitability from the s-SDM to increase with observed richness of established manmade reefs.

MATERIALS AND METHODS

Species Distribution Modeling
To determine optimal habitat for each of the focal species, we utilized species distribution modeling using a machine learning approach in the program MaxEnt (Phillips et al., 2006a). This approach allows us to develop a model of habitat suitability for each species based on the environment in places where each species has been observed (Elith and Leathwick, 2009). We can then project that model over all other locations to identify additional suitable habitat. To construct SDMs, we first collated observation data for each of the focal species and downloaded and created spatial environmental data layers.

Observation Data
We initially chose 39 fish, invertebrate, and algal species for this analysis that are targeted commercial or recreational species (CDFG, 2001) or of particular ecological concern for the SCB (Supplementary Table S1), including representative fish (e.g., Rockfish, Sebastes sp.), invertebrate (e.g., Red sea urchin, Mesocentrotus franciscanus), and algal species (e.g., Giant kelp, Macrocystis pyrifera). Spatial locality information was collected for each of the focal species from long-term monitoring surveys from the Vantuna Research Group (VRG; Pondella et al., 2015a; Zahn et al., 2016), Channel Islands National Park Kelp Forest Monitoring Program (KFM; Kushner et al., 2013), and the Partnership for the Interdisciplinary Studies of Coastal Oceans (PISCO; Hamilton et al., 2010; Caselle et al., 2015; Pondella et al., 2015a). These observations were made from transect surveys on rocky reefs across the entire SCB at 296 sites during 35 years from 1982-2017 (Kushner et al., 2013; Caselle et al., 2015; Pondella et al., 2015a; Zahn et al., 2016). In short, divers conducted subtidal surveys up to 30 m deep with a depth-stratified random sampling design at each site in which randomly located transects were sampled using four methods: (1) fish density and size distribution were recorded along 30 m belt transects on the reef, in the midwater, and in the top section of the water column if kelp canopy was present; (2) density of large (>2.5 cm) motile invertebrates and macroalgae was recorded along 30 m "Swath" transects; (3) percent cover of sessile invertebrates, turf algae, and habitat characteristics was estimated using uniform point contact along 30 m transects on the reef; and (4) size frequency data were collected for commercially and ecologically important invertebrates (Kushner et al., 2013; Caselle et al., 2015; Pondella et al., 2015a; Zahn et al., 2016). We used only presence and absence data from these surveys.
We divided the dataset into localities from natural rocky reefs (578 sites) and manmade reefs (38 sites). The natural reef data were split into training and test data (described below), whereas the manmade reef data were used only for validating the models. To limit spatial bias, we used spatial thinning to remove points within 1 km of one another. Spatial thinning was completed using the "spThin" R package (Aiello-Lammens et al., 2015). Only species with at least 30 unique observed localities on natural reefs at least 1 km apart were included in subsequent analyses. The SDM method used in this study, MaxEnt (described below), is less sensitive to small sample sizes (10-30) than other distribution modeling methods available, although caution should still be taken in interpreting models with the smallest sample sizes (Wisz et al., 2008). Preliminary analyses of species with fewer than 30 unique observations resulted in SDMs with low support. Of the initial 39 focal species, 21 had at least 30 unique observed locations at least 1 km apart (Table 1). These 21 species included 16 fish, six invertebrates, and one algal species.
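The thinning step can be illustrated with a minimal Python sketch. This is not the spThin algorithm itself, and the study used R; it is a simple greedy filter, written under the assumption that coordinates are already projected in metres. The function name `thin_points` and the example coordinates are hypothetical.

```python
import numpy as np

def thin_points(xy_m, min_dist_m=1000.0, seed=0):
    """Greedy spatial thinning (illustrative, not spThin).

    Visits points in random order and keeps a point only if it lies at
    least min_dist_m from every point already kept. Coordinates are
    assumed to be projected, in metres (e.g., UTM).
    """
    xy = np.asarray(xy_m, dtype=float)
    rng = np.random.default_rng(seed)
    kept = []
    for i in rng.permutation(len(xy)):
        far_enough = all(
            np.hypot(*(xy[i] - xy[j])) >= min_dist_m for j in kept
        )
        if far_enough:
            kept.append(int(i))
    return sorted(kept)

# Hypothetical survey localities (metres); two pairs are closer than 1 km.
sites = [(0, 0), (400, 300), (1500, 0), (1500, 900), (5000, 5000)]
kept = thin_points(sites)
```

A species would then be retained for modeling only if the thinned set held at least 30 localities, as described above.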

Environmental Data
Spatially gridded environmental data were collected for the entire SCB. We used six environmental variables at a resolution of 200 × 200 m: aspect, bathymetry, mean annual Chlorophyll-A (ChlA), distance to the 200 m isobath (a proxy for upwelling potential), slope, and mean annual sea surface temperature (SST). For bathymetry, we used a seafloor bathymetry digital elevation model (DEM), which is a product of the California Department of Fish and Wildlife Bathymetry Project. This coastwide 200 by 200 m DEM was clipped to the extent of the Southern California Bight. Seafloor aspect and slope were derived from the bathymetry DEM using the Aspect and Slope tools in ArcMap 10.3. We collated MODIS-derived sea surface temperature (SST; degrees Celsius) and Chlorophyll-A (ChlA; mg·m⁻³) data from the University of California San Diego, Scripps Institution of Oceanography Photobiology Group. The raw data consist of 15 day averages throughout the California Current Large Marine Ecosystem. We took the mean of each year from 2002-2017 and then took the grand mean of all years. Both SST and ChlA were downsampled in R using the bilinear method. All data layers were projected to the WGS 1984 UTM Zone 11N coordinate system to limit distortion. We masked each of the environmental layers using the 45 m isobath contour to restrict all analyses to only cells with average depths shallower than or equal to 45 m, since all reef survey observation data are limited to this region. We tested for correlations among the environmental variables at each of the unique locations from the thinned dataset using the Pearson correlation coefficient. None of the 15 pairwise comparisons of the six environmental variables were highly correlated at observed focal species localities, with |r| ≤ 0.5 for all.
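The correlation screen described above amounts to computing Pearson's r for every pair of the six variables (6 choose 2 = 15 pairs) and flagging any pair above the cutoff. The Python sketch below illustrates this with synthetic placeholder values rather than the study's data; the variable names and the array `env` are hypothetical.

```python
import numpy as np
from itertools import combinations

# Column order is an assumption made for this illustration.
names = ["aspect", "bathymetry", "chla", "dist_200m", "slope", "sst"]

# Synthetic stand-in for environmental values extracted at thinned localities.
rng = np.random.default_rng(1)
env = rng.normal(size=(200, len(names)))

# Pearson correlation matrix; columns are variables.
r = np.corrcoef(env, rowvar=False)

# All unordered pairs of variables.
pairs = list(combinations(range(len(names)), 2))

# Pairs whose absolute correlation exceeds the 0.5 cutoff used in the text.
flagged = [(names[i], names[j], r[i, j]) for i, j in pairs if abs(r[i, j]) > 0.5]
```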
Reef presence across the SCB was identified based on a composite of hard-bottom substrate and historical kelp canopy cover (Williams et al., unpublished; Zellmer et al., 2018). We created a second stack of environmental data layers with all variables masked by reef presence. This masked raster stack was used to build individual-SDMs on currently established reefs, and the full raster stack was then used to project the individual-SDMs across the remaining area in order to identify candidate sites for restoration in the SCB.
TABLE 1 | All candidate models tested allowed for all feature classes to be used (LQHPT) but varied in the regularization multiplier (RM). AUC is the Area Under the Curve for the full dataset. Mean AUC is averaged across each of the iterations for only the training data. Diff AUC is the mean difference in the AUC values between the training data and the test data. w is the Akaike weight. K is the number of parameters included in the MaxEnt model. n is the total number of unique observations of each species on natural reef sites prior to splitting into training and test datasets.

Individual-SDMs
Individual SDMs were developed for each species using MaxEnt v. 3.4.1, a presence-only machine learning approach to modeling species distributions (Phillips et al., 2006b), called through the R programming language (R Core Team, 2015). MaxEnt includes two options, feature classes and a regularization multiplier, to customize models and control overparameterization. Feature classes are transformations of the environmental variables that enable modeling of complex relationships and include linear, product, hinge, threshold, and quadratic (Elith et al., 2010), whereas the regularization multiplier adds a penalty for overparameterization (Elith et al., 2010; Shcheglovitova and Anderson, 2013). By default, MaxEnt allows all feature classes to be selected in training the model and uses a regularization multiplier of one, as determined by optimization from empirical studies across a variety of species (Phillips and Dudík, 2008). However, these parameters need to be optimized for each species to prevent overly simplified or overly complex models (Radosavljevic and Anderson, 2014; Morales et al., 2017). Thus, we utilized a model selection approach to compare models based on the corrected Akaike information criterion (AICc) approach for SDMs developed by Warren and Seifert (2011) and implemented in the "ENMeval" R package (Muscarella et al., 2014). For each species, we tested a set of 12 candidate models, each with a different regularization multiplier (1-12, increasing by one), and allowed all feature classes in each model. We used the "block" method for model evaluation to account for spatial autocorrelation. This approach divides the data into four spatial blocks. The model is then run four times, with three blocks set as training data and one block set as test data for each iteration, and evaluation metrics are then averaged across the iterations (Muscarella et al., 2014).
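To make the four-block evaluation scheme concrete, the Python sketch below splits localities into four spatial blocks of near-equal size (first by the y coordinate, then by x within each half) and builds the four train/test splits. It is an illustration of the idea only, not ENMeval's implementation; `block_partition` and the random coordinates are hypothetical.

```python
import numpy as np

def block_partition(xy):
    """Assign each locality to one of four spatial blocks (0-3).

    Points are split into two halves by their y coordinate, and each
    half is split again by its x coordinate, so block sizes are as
    equal as possible.
    """
    xy = np.asarray(xy, dtype=float)
    n = len(xy)
    block = np.empty(n, dtype=int)
    by_y = np.argsort(xy[:, 1], kind="stable")
    halves = (by_y[: (n + 1) // 2], by_y[(n + 1) // 2:])
    for h, idx in enumerate(halves):
        by_x = idx[np.argsort(xy[idx, 0], kind="stable")]
        m = (len(by_x) + 1) // 2
        block[by_x[:m]] = 2 * h
        block[by_x[m:]] = 2 * h + 1
    return block

# Hypothetical localities on a 100 x 100 grid.
rng = np.random.default_rng(0)
xy = rng.uniform(0, 100, size=(40, 2))
block = block_partition(xy)

# Four iterations: three blocks train the model, the fourth tests it.
folds = [(np.where(block != b)[0], np.where(block == b)[0]) for b in range(4)]
```

Holding out a whole spatial block, rather than random points, keeps test localities geographically separated from training localities, which is why the approach helps with spatial autocorrelation.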
Models were evaluated first by comparing the mean Receiver Operating Characteristic Area Under the Curve (AUC) for the training data to that for the test data. The AUC summarizes the trade-off between the true positive rate and the false positive rate across varying thresholds for classifying habitat suitability. AUC values close to one indicate good fit of the model to the data, whereas an AUC value of 0.5 indicates the model is no better than random. Comparing the AUC values for the training and test data allows us to validate how well the models fit an independent dataset; thus, smaller differences between the training and test data AUCs indicate better transferability. In addition, we calculated AICc scores to compare the 12 candidate models for each species, allowing us to evaluate the fit of each model while accounting for the number of parameters in each model. The model with the lowest AICc score was considered the best of the candidate models (Burnham and Anderson, 2002) and was used for subsequent analyses. To identify suitable habitat for reef restoration, the best-fit model was then projected across the entire study area using the complementary log-log (cloglog) link function, which is more appropriate for estimating probability of presence than the previous MaxEnt default, a logistic transformation (Phillips et al., 2017).
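For reference, the small-sample corrected criterion is AICc = 2K - 2 ln(L) + 2K(K + 1)/(n - K - 1), and the Akaike weight of model i is exp(-Δi/2) divided by the sum of exp(-Δj/2) over all candidates, where Δ is the difference from the lowest AICc. The Python sketch below applies these standard formulas to made-up log-likelihoods; the numbers are hypothetical and are not values from this study.

```python
import numpy as np

def aicc(log_lik, k, n):
    """Small-sample corrected AIC for log-likelihood log_lik,
    k parameters, and n observations (requires n > k + 1)."""
    return 2 * k - 2 * log_lik + (2 * k * (k + 1)) / (n - k - 1)

def akaike_weights(scores):
    """Return (delta, weights) for a list of AICc scores."""
    delta = np.asarray(scores, dtype=float) - np.min(scores)
    rel = np.exp(-0.5 * delta)
    return delta, rel / rel.sum()

# Hypothetical candidates: more regularization gives fewer parameters (K)
# at the cost of a slightly lower log-likelihood.
log_liks = [-410.0, -412.0, -415.0]
ks = [12, 8, 5]
n = 60

scores = [aicc(ll, k, n) for ll, k in zip(log_liks, ks)]
delta, w = akaike_weights(scores)
best = int(np.argmin(scores))   # index of the model with delta = 0
```

In this toy example the simplest model wins because the penalty on K outweighs its small loss in likelihood, which mirrors the logic of preferring regularization multipliers above the default when they are supported.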

Stacked-SDMs
To create a model for predicting preferred locations for manmade reefs that optimizes suitability across most of our focal species, we constructed a stacked-species distribution model (s-SDM) by combining the individual-SDMs. To do this, we simply added together the individual-SDMs, as derived by Calabrese et al. (2014). We selected this approach rather than combining thresholded binary habitat suitability classifications, since combining thresholded predictions has been shown to result in biased s-SDMs (Calabrese et al., 2014).
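Stacking by summation is a cell-wise sum over the per-species suitability rasters, so with 21 species the s-SDM score of a cell ranges from 0 to 21. A minimal Python sketch with synthetic rasters follows; the array names and grid size are hypothetical, and the masked cell simply stands in for a location outside the study mask.

```python
import numpy as np

rng = np.random.default_rng(0)
n_species, rows, cols = 21, 4, 5

# Synthetic per-species suitability on a 0-1 scale (cloglog-like output).
suitability = rng.uniform(0, 1, size=(n_species, rows, cols))

# A masked cell (e.g., deeper than 45 m) is NaN in every layer.
suitability[:, 0, 0] = np.nan

# s-SDM score: sum of continuous suitabilities per cell.
# NaN propagates, so masked cells stay masked in the stack.
s_sdm = suitability.sum(axis=0)
```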

S-SDM and Species Richness
To assess the quality of the SDMs for identifying high quality habitat for restoration, we evaluated the extent to which the s-SDM is associated with species richness. We calculated observed species richness at each reef site and tested for a positive linear relationship with the s-SDM score for each reef site using linear regression. We performed this analysis first at all of the natural reef sites for the 21 focal species and then used the linear regression to predict species richness at already established manmade reef sites (test dataset 1) and at all reef sites for 204 other non-focal fish, invertebrate, and algal species (test dataset 2).
Species richness was calculated using only the VRG reef survey data (Pondella et al., 2015a; Zahn et al., 2016) to ensure standardized sampling. Because sites were surveyed an uneven number of times, we estimated species richness with the bootstrap estimator in the R "vegan" package. We calculated bootstrap species richness for all fish and swath (algae and invertebrate) surveys separately, then added the estimates together for each site. We used linear regression to statistically evaluate the relationship between observed bootstrap species richness and s-SDM scores. We included two covariates in the model to account for variation in quality of individual reef sites: depth zone and standard deviation (SD) of reef relief. Depth zone describes the different depths at which a reef was surveyed: inner (∼5 m), middle (∼10 m), outer (∼15 m), and deep (∼25 m). Reef relief was measured at 31 points along each survey transect, and a higher standard deviation of these relief measurements indicates greater fine scale habitat heterogeneity. The linear regression was evaluated using AICc by comparing to a null model with SD relief and depth zone alone. The linear model was then used to calculate predicted species richness at manmade reef sites. Observed versus predicted bootstrap richness at manmade reef sites was evaluated using linear regression and AICc by comparing to a null model. In addition, we tested whether predicted values from the linear model were correlated with observed bootstrap richness for the 204 other non-focal fish, algae, and invertebrate species surveyed at all reef sites. This additional independent dataset allows us to test not only the validity of the model but also whether the focal species list is sufficient to predict restoration sites for the reef-associated communities or whether the model is applicable only to the species it includes.
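The bootstrap richness estimator adds to the observed species count a correction for species likely missed: S_boot = S_obs + Σ (1 - p_i)^N, where p_i is the proportion of the N surveys in which species i was recorded. The Python sketch below implements that standard formula for a single site with a toy presence/absence matrix; it is an illustration and does not call vegan, and the function name and data are hypothetical.

```python
import numpy as np

def bootstrap_richness(presence):
    """Bootstrap species richness for one site.

    presence: surveys x species matrix of 0/1 records.
    Returns S_obs + sum over observed species of (1 - p_i) ** N.
    """
    presence = np.asarray(presence, dtype=bool)
    n_surveys = presence.shape[0]
    freq = presence.mean(axis=0)      # p_i for each species
    observed = freq > 0               # species seen at least once
    s_obs = int(observed.sum())
    correction = float(((1.0 - freq[observed]) ** n_surveys).sum())
    return s_obs + correction

# Toy site: three surveys, four species (the last one never observed).
surveys = [
    [1, 1, 0, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 0],
]
richness = bootstrap_richness(surveys)
```

Here the three observed species have frequencies 1, 2/3, and 1/3, so the correction is 0 + 1/27 + 8/27 and the estimate is 3 + 1/3. Following the text, fish and swath surveys would each be passed through the estimator separately and the two estimates summed per site.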

Identifying Candidate Restoration Sites
To identify candidate sites for reef restoration, we used the reef data layer to isolate regions with high predicted habitat suitability across multiple species but no existing reefs. High predicted habitat suitability was defined as an s-SDM score at or above the value at which the linear model predicts species richness equal to half the number of focal species. We calculated the proportion of cells with an s-SDM score above this threshold for the entire study region as well as for only cells outside of existing reef areas. Cells outside of existing reef areas with s-SDM scores above the threshold are considered candidate regions for installation of manmade reefs, whereas cells within existing reef areas with s-SDM scores above the threshold are considered candidate regions for restoration or rehabilitation of existing reef habitat.
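The threshold rule can be sketched by inverting a simple linear fit: if richness = intercept + slope × score, the score at which predicted richness equals half of the 21 focal species is (10.5 - intercept) / slope. In the Python sketch below the slope of 0.27 is the value reported for natural reefs, while the intercept, the omission of the depth zone and SD relief covariates, and the small rasters are all assumptions made purely for illustration.

```python
import numpy as np

def suitability_threshold(intercept, slope, n_focal):
    """s-SDM score at which a simple linear model predicts richness
    equal to half the number of focal species."""
    return (n_focal / 2.0 - intercept) / slope

slope = 0.27       # reported slope for natural reefs
intercept = 7.0    # hypothetical; covariates are ignored in this sketch
threshold = suitability_threshold(intercept, slope, n_focal=21)

# Hypothetical s-SDM raster (NaN = outside study mask) and reef layer.
s_sdm = np.array([[5.0, 13.5, 14.0],
                  [12.0, np.nan, 16.0]])
reef = np.array([[False, True, False],
                 [False, False, False]])

valid = ~np.isnan(s_sdm)
high = np.where(valid, s_sdm, -np.inf) >= threshold

candidate_new = high & ~reef       # candidate cells for new manmade reefs
candidate_restore = high & reef    # candidate cells on existing reef

prop_all = high.sum() / valid.sum()
prop_outside_reef = candidate_new.sum() / (valid & ~reef).sum()
```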

Existing Manmade Reef Quality Assessment
We further assessed the predicted habitat suitability of already established manmade reef sites in the SCB to determine the current quality of restored habitats. We extracted s-SDM scores for all manmade reefs in our study region (n = 21) and for all natural reef survey sites (n = 250) and calculated the mean for both. We compared mean s-SDM scores for manmade and natural reef sites with a t-test. We then conducted a permutation analysis by randomly sampling 21 sites across the study region, calculating the mean s-SDM value at those sites, and repeating this procedure 1000 times. We then compared the mean s-SDM value of the manmade sites, as well as the mean s-SDM value of the natural sites, to this null distribution to quantify significance. Because habitat restoration may be limited to areas where reefs do not currently exist, random selection of sites from the full study region may be biased toward more suitable habitat; we therefore recalculated the null distribution from only areas in the SCB where there is no existing reef habitat and reran the analyses.
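The permutation analysis can be sketched in Python as follows. The one-sided p-value convention (with a +1 correction) and the synthetic pool of cell scores are assumptions, since the text does not specify how significance was computed from the null distribution; the observed mean of 8.7 is the reported manmade reef mean, and `permutation_p` is a hypothetical helper name.

```python
import numpy as np

def permutation_p(observed_mean, pool, n_sites, n_iter=1000, seed=0):
    """Null distribution of mean s-SDM score for n_sites random cells.

    Returns the null means and a one-sided p-value for the observed
    mean being greater than expected at random (assumed convention).
    """
    rng = np.random.default_rng(seed)
    null = np.array([
        rng.choice(pool, size=n_sites, replace=False).mean()
        for _ in range(n_iter)
    ])
    p_greater = (np.sum(null >= observed_mean) + 1) / (n_iter + 1)
    return null, p_greater

# Synthetic pool of s-SDM scores for all cells in a study region.
rng = np.random.default_rng(42)
pool = rng.uniform(0, 21, size=5000)

null, p = permutation_p(observed_mean=8.7, pool=pool, n_sites=21)
```

Restricting `pool` to cells without existing reef habitat gives the second null distribution described above.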

RESULTS

Individual-SDMs
For each species, we selected the best fit model (ΔAICc = 0) from the 12 candidate models with varying values for the regularization multiplier (Supplementary Figures S1-S21). Based on mean AUC values for the test data, all models predicted test observations well (mean AUC range: 0.69-0.84; Table 1) and were not overfit (difference between training and test AUC range: 0.04-0.11; Table 1). The optimal regularization multiplier selected for each species was higher than the default value in MaxEnt (1) and ranged from 3-7 (Table 1). Of the six environmental predictors, Slope (52%), Distance to 200 m (22.6%), and Bathymetry (8.7%) on average contributed the most to each of the individual-SDMs (Figure 1). This pattern was consistent among fish and algae species, although for invertebrate species, Distance to 200 m contributed more on average than Slope (Figure 1).

Stacked-SDM
The s-SDM showed high variation in predicted multi-species habitat suitability across the Southern California Bight coastline for the 21 focal species (Figure 2). Approximately 38.7% (1132 km²) of the studied region included habitat that is predicted to be suitable for many of the focal species. In general, average habitat suitability was lower inside bays and higher along points and around islands. The model identified multiple regions with high average predicted habitat suitability that do not already contain reef habitat (Figure 2). After removing cells with existing reef habitat, there remained approximately 33% (804 km²) of the remaining study region that included habitat that is predicted to be suitable for many of the focal species.
For the focal species on natural reefs, the model including s-SDM scores and the two covariates, SD relief and depth zone, better predicted observed bootstrap species richness than the null model with only the two covariates (w = 1; Table 2 and Figure 3). Observed bootstrap species richness increased with increasing s-SDM scores (R² = 0.22, slope = 0.27, 95% CI = 0.17-0.36). When this model was used to predict species richness for the established manmade reefs, there was high support for a relationship between observed and predicted species richness (slope = 0.52, 95% CI = 0.13-0.92, w = 1; Table 2 and Figure 4A). Similarly, predicted species richness values from this linear model were positively associated with observed bootstrap species richness when considering 204 other non-focal fish, invertebrate, and algal species that were surveyed on both natural and manmade reefs (slope = 3.65, 95% CI = 2.93-4.37, w = 1; Table 2 and Figure 4B).
TABLE 2 | The s-SDM model used was based on 21 focal species. The linear model was calculated first for only the natural reef sites for just the 21 focal species (Natural Reef Focal), and the linear model was then used to calculate predicted species richness values for already established manmade reef sites (Manmade Reef Focal). Predicted and observed bootstrap species richness were compared with linear regression. The association between predicted and observed bootstrap species richness was also evaluated for 204 additional non-focal fish, algae, and invertebrate species (All Reef Non-Focal). AICc is reported relative to the null model tested for each dataset.

Quality of Existing Manmade Reefs
To assess the habitat suitability of previously established manmade reefs, we extracted the s-SDM score for each survey site and compared s-SDM scores between natural and manmade reefs and relative to a random distribution. Manmade reefs (mean = 8.7) had significantly lower s-SDM scores on average than natural reefs (mean = 14.2; t-value = -5.4, P-value = 1.8e-05; Figure 5A). Manmade reefs had an average s-SDM score that was slightly, although significantly, greater than that of randomly selected sites, both when the entire study region was considered (Figure 5B) and when only areas with no existing reef were considered (P-value < 0.01; Figure 5C).
FIGURE 3 | Linear relationship between predicted s-SDM score and observed bootstrap species richness for the 21 focal species. The linear regression was completed for the 21 focal species on all natural reef sites (circles) at four depth zones: inner (∼5 m), middle (∼10 m), outer (∼15 m) and deep (∼25 m). Established manmade reef sites (triangles) are shown for reference. Colors indicate standard deviation (SD) of relief at each reef site, from low (blue) to high (red) fine scale habitat heterogeneity. 95% confidence intervals of the regression lines are shown in gray.

DISCUSSION
Improving the success of habitat restoration projects is a necessity as ecosystems worldwide continue to face increasing anthropogenic pressures and habitat loss. This need is especially great in marine ecosystems, due to increasing coastal urbanization (Dafforn et al., 2015; Morris et al., 2018). Species distribution modeling can be used as an important tool in identifying the best places where habitat restoration is likely to be successful by identifying suitable habitat (Pearce and Lindenmayer, 1998). Here we apply this method at the ecosystem level by calculating individual species distribution models for 21 focal species from shallow rocky reefs and stacking these SDMs to identify areas with suitable habitat for a majority of the species. Our results illustrate a number of potential areas within the Southern California Bight (an area with immense human pressure due to proximity to major metropolitan areas) where habitat is predicted to be suitable for the majority of our focal species, including many areas that do not already contain natural or manmade reefs (Figure 2 and Supplementary Figures S1-S21). This approach allows us to identify sites for habitat restoration using organism-based habitat considerations rather than simply landscape- (or seascape-) based considerations, which is crucial when restoring habitat for multiple species (Miller and Hobbs, 2007).
FIGURE 4 | Linear relationship between predicted and observed bootstrap species richness for two independent datasets based on the best fit model for natural reefs. Predicted versus observed species richness for (A) the 21 focal species on established manmade reefs and (B) the 204 non-focal fish, algae, and invertebrate species on all reef sites, including natural and manmade, at four depth zones. Colors indicate mean standard deviation (SD) of relief at each reef site, from low (blue) to high (red) relief. 95% confidence intervals of the regression lines are shown in gray.
Moreover, when the individual-SDMs were combined together as the s-SDM, there was a positive linear relationship between s-SDM scores and observed bootstrap species richness on the natural reefs when considering only the 21 focal species (Table 2 and Figure 3), and this relationship was validated by two independent datasets. First, the predicted richness values from this linear model were associated with observed bootstrap species richness at manmade reefs when considering only the 21 focal species (Table 2 and Figure 4A). Second, the predicted richness values from this linear model were also correlated with increases in observed species richness when considering all other 204 fish, invertebrate, and algae species surveyed on Southern California shallow rocky reefs (Table 2 and Figure 4B). Thus, by identifying crucial focal species and combining distribution models for each of these species, it is possible to identify areas that may support greater species richness. For restoration projects in which species diversity or richness is a primary goal (Wortley et al., 2013), this method may provide an opportunity for managers to successfully select more-ideal locations for restoration.
While species richness generally increases with increasing s-SDM scores, there is a high degree of variability, particularly for sites with the highest s-SDM scores. This pattern suggests that factors beyond environmental suitability influence how well a site performs. While our approach identifies environmental suitability and potential locations for habitat restoration, suitability does not guarantee success on its own (Higgs, 1997). Additional factors that need to be considered when selecting sites include habitat design (Baine, 2001), species relationships (e.g., Jude and Deboe, 1996), cultural needs (Higgs, 1997), and public participation and socioeconomic factors (Wortley et al., 2013). With species distribution models, species relationships are especially important to consider, as SDMs do not inherently account for ecological relationships such as competition and predation (Freeman and Mason, 2015). Thus, some variation could be explained by which species are present at particular reefs.
Alternatively, or in addition, the variation in species richness among high-quality sites may reflect habitat degradation. For instance, highly suitable sites may be overfished or exposed to pollution (Schaffner et al., 2015). In fact, the natural sites with lower than expected species richness despite high s-SDM scores include some of the more degraded reef sites in our study (high s-SDM score, low richness; Figures 3, 4B). Since many of the best predicted s-SDM scores are on or near existing reefs, our results suggest that there may be immense opportunity for restoring natural reefs as opposed to simply building manmade structures in areas where rocky reefs did not previously exist. In other words, it is important to consider the difference between habitat "restoration" or "rehabilitation" and habitat "conversion" (Erftemeijer and Lewis, 1999). Restoring previously existing reefs may not only be more cost effective but, as our results suggest, may also be more likely to succeed based on habitat suitability.
Further, the variability observed could also be explained by the physical structure and design of reefs. Previous research has shown that reef structure is an important component of restoration success (Baine, 2001; Pondella et al., 2006). Consistent with this previous research, our analyses suggest that even when environmental conditions are suitable, reef structure may influence species presence, as predicted species richness was more accurate for purposefully designed manmade reef structures than for unintended manmade reef structures. Thus, once candidate sites are selected based on habitat suitability, restoration should be done in conjunction with expert opinion as to the specific design of manmade reefs.
Regardless of the specific causes of the variability, our model provides an estimate of areas that are predicted to be suitable for multiple species, suggesting that at least some of the focal species could persist in these locations. Conservation managers should treat the locations identified by this model as a set of candidate locations from which they can then select sites after considering these other factors. Thus, this approach adds a tool to help managers consider the holistic success of habitat restoration. However, while the s-SDM identifies sites where there is high habitat suitability across a majority of the focal species, for some species, such as rare or endangered species (e.g., abalone, Haliotis sp.), more directed conservation measures may be necessary. For such species, the individual SDMs can be used to help identify diverse sites for ecosystem restoration.
Interestingly, the manmade reefs included in this study that are already established in the SCB are, on average, in regions that are not only less suitable than natural reefs but also only slightly more suitable than sites selected at random (Figure 5). For example, some manmade reefs are placed in gently sloping, sandy-bottom regions. If these reefs had been placed in areas with higher predicted habitat suitability, more species might have been observed. While we do still observe some species at these locations (Pondella et al., 2015a; Zahn et al., 2016), the lack of habitat suitability suggests that these manmade reefs may be hosting sink populations (van Horne, 1983; Smallwood, 2001). Based on these results, there is strong evidence that habitat restoration may have the most potential when completed at sites with degraded reefs (e.g., inundated by landslides) as opposed to constructing reefs far from existing reef structures. Not only are these more distant sites potentially less suitable, but they also have lower connectivity with existing, productive reef habitat. With clear predictions for habitat suitability across multiple species, managers can be better prepared to advocate for selection of appropriate sites.
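A comparison of mean suitability between groups of sites, such as manmade versus natural reefs, can be sketched with a two-sample t-statistic. The sketch below assumes Welch's unequal-variance form, which is an assumption because this section does not state which t-test variant was used; the function name and the suitability values are made up for illustration.

```python
from math import sqrt
from statistics import mean, variance
from typing import Sequence


def welch_t(a: Sequence[float], b: Sequence[float]) -> float:
    """Welch's t-statistic for the difference in means of two samples."""
    standard_error = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / standard_error


# Hypothetical habitat-suitability scores at each group of sites.
manmade = [0.30, 0.35, 0.40, 0.45]
natural = [0.60, 0.65, 0.70, 0.75]

# A negative value means manmade sites are, on average, less suitable.
t_manmade_vs_natural = welch_t(manmade, natural)
```

The same function could be applied to manmade sites versus a sample of randomly selected sites; converting the statistic to a p-value would additionally require the t distribution, which is omitted here.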

Future Directions
To ensure that habitat restoration is successful in these locations, future studies should focus on continued monitoring and follow-up research. While habitat restoration has become an essential tool in conservation biology, long-term assessments of restoration success remain limited (Godefroid et al., 2011; Wortley et al., 2013). Further, future research should consider species-specific differences in how individual species contribute to community biodiversity and restoration success. Finally, species distribution modeling also offers an opportunity to assess how habitat suitability might vary under future global environmental change (Peterson et al., 2002). As global environmental changes continue to occur, it is crucial to consider how those changes influence the goals of habitat restoration (Higgs et al., 2014). Future research should assess how habitat restoration priorities shift with potential changes in habitat suitability across multiple species under a range of possible climate scenarios.

AUTHOR CONTRIBUTIONS
AZ envisioned the project, ran the computational analyses, and wrote the manuscript. JC envisioned the project, contributed to field work, data analysis, and writing. CW and SS contributed to the data analysis, field work, and writing. DP envisioned the project, managed and contributed to the field work, and contributed to data analysis and writing.

FUNDING
The work was supported by the NOAA Saltonstall-Kennedy Grant #NA15NMF4270320.