A Synthesis of Deep Benthic Faunal Impacts and Resilience Following the Deepwater Horizon Oil Spill

The Deepwater Horizon (DWH) oil spill significantly impacted the northern Gulf of Mexico (nGoM) deep benthos (>125 m water depth) at different spatial scales and across all community size and taxa groups including microbes, foraminifera, meiofauna, macrofauna, megafauna, corals, and demersal fishes. The resilience across these communities was heterogeneous, with some requiring years if not decades to fully recover. To synthesize ecosystem impacts and recovery following DWH, the Gulf of Mexico Research Initiative (GOMRI) Core 3 synthesis group subdivided the nGoM into four ecotypes: coastal, continental shelf, open-ocean, and deep benthic. Here we present a synopsis of the deep benthic ecotype status and discuss progress made on five tasks: (1) summarizing pre- and post-oil spill trends in abundance, species composition, and dynamics; (2) identifying missing data/analyses and proposing a strategy to fill in these gaps; (3) constructing a conceptual model of important species interactions and impacting factors; (4) evaluating resiliency and recovery potential of different species; and (5) providing recommendations for future long-term benthic ecosystem research programs. To address these tasks, we assessed time series to detect measures of population trends. Moreover, a benthic conceptual model for the GoM deep benthos was developed and a vulnerability-resilience analysis was performed to enable holistic interpretation of the interrelationships among ecotypes, resources, and stressors. The DWH oil spill underscores the overall need for a system-level benthic management decision support tool based on long-term measurement of ecological quality status (EQS). Production of such a decision support tool requires temporal baselines and time-series data collections. This approach provides EQS for multiple stressors affecting the GoM beyond oil spills. In many cases, the lessons learned from DWH, the gaps identified, and the recommended approaches for future long-term hypothesis-driven research can be utilized to better assess impacts of any ecosystem perturbation of industrial impact, including marine mineral extraction.


INTRODUCTION
As the global demand for hydrocarbon-based energy sources has increased and nearshore marine resources have been depleted, the oil and gas industry has gradually progressed offshore into deeper waters (Cordes et al., 2016;Murawski et al., 2020). Deepwater petroleum extraction is a global industry and in some cases already outpacing shallow water production (Cordes et al., 2016;Murawski et al., 2020). Significant research has been undertaken on the prevention, mitigation, environmental impact, response, and restoration from oil spills since the Deepwater Horizon (DWH) oil spill in 2010 . Considering the global industrial and environmental implications, it is important to address the lessons learned from previous oil spills and their bearing on future development, both on environmental governance (e.g., baseline establishment prior to resource exploitation) as well as prevention and mitigation of future spills. These lessons and best practices are also applicable to other industrial exploration and production in the deep ocean such as marine mineral extraction.
To synthesize 10 years of research following the DWH oil spill, the Gulf of Mexico Research Initiative (GOMRI) organized groups around several core research themes. The goal of the GOMRI Core 3 synthesis group was to assess the northern Gulf of Mexico (nGoM) ecological impact of and resilience to DWH. For these purposes, the nGoM was divided into four "ecotypes": (1) coastal, (2) continental shelf, (3) open ocean, and (4) deep benthic (Murawski et al., in review A; Murawski et al., in review B; Patterson et al., in review; Sutton et al., in review; this study). The Core 3 synthesis deep benthic group then assembled the record of species and community change in the nGoM deep benthic ecotype before, during and following the DWH oil spill. This synthesis is critical to establish baseline conditions, provide new understanding on how offshore ecosystems respond to oil spills, and provide estimates on recovery from such disturbances. This approach has applicability anywhere in the world where hydrocarbon exploration and production occurs.
The Gulf of Mexico is a semi-enclosed marginal sea with an areal extent of 1.6 million square kilometers bordered by the United States, Mexico and Cuba (Holmes, 1976). In this review, the deep benthic ecotype in the nGoM is defined by water depths greater than 125 m, which is consistent with industrial deep water (125-1500 m) and ultra-deep water (>1500 m) zones (Locker and Hine, 2020) and represents greater than 40% of the Gulf of Mexico seafloor (Gore, 1992). However, in accordance with ongoing restoration efforts (Open Ocean Trustee Implementation Group [OOTIG], 2019), we have also included some mesophotic coral sites, which are shallower than 125 m (Etnoyer et al., 2016). The portion of the nGoM deeper than 125 m includes the lower extent of the continental shelf, slope, abyssal plain and interacts with multiple water masses including the Caribbean Subtropical Underwater (<250 m), Tropical Atlantic Central Water (250-700 m), Antarctic Intermediate Water (600-1000 m), and North Atlantic Deepwater (>1,000 m; Schroeder et al., 1974;Vidal et al., 1994;Rivas et al., 2005). There are multiple, distinct habitats within the deep benthic ecotype including shelf and slope softbottom habitats, methane seeps, and live hard-bottom habitats including coral reefs and gardens (Ward and Tunnell, 2017). The Mississippi and Atchafalaya river systems are important controls for sediment and nutrient delivery to the nGoM system and the Mississippi Canyon and DeSoto Canyon are important bathymetric features governing the distribution of benthic fauna due to their physical structure and focusing of settling detrital food material (Ward and Tunnell, 2017).
This effort focuses on seven benthic size and taxa groups: (1) microbiota, (2) benthic foraminifera, (3) meiofauna (0.044-0.3 mm), (4) macrofauna (0.3-30 mm), (5) megafauna (>30 mm), (6) deep corals, and (7) demersal fishes. Many benthic fauna serve as important bioindicators of ecological health and habitat suitability. There is also legal authority to include them in the Natural Resource Damage Assessment (NRDA) in the case of an oil spill within the 200 nautical mile limit of the United States exclusive economic zone (Oil Pollution Act [OPA], 1990). Benthic fauna serve vital functional roles in marine ecosystems. In the context of oil spills, benthic fauna play particularly important roles in oil bioremediation, carbon storage and sequestration, and provision of forage for commercially and recreationally valuable epipelagic fishes and cetaceans (Danovaro et al., 2008;Levin and Dayton, 2009;Ramirez-Llodra et al., 2010Jobstvogt et al., 2014;Thurber et al., 2014;Fisher et al., 2016).
The following review provides a synopsis of the DWH oil spill scenario, the benthic impacts (i.e., changes from baseline), benthic resilience trajectories (i.e., recovery), the processes and interactions among groups, a vulnerability and resilience analysis for the representative benthic groups and provides recommendations for future research. The impacts and recovery are identified from existing time-series data (when available) from pre-DWH, during-DWH, and post-DWH timeframes versus longer-term trends in the nGoM due to other stressors. All types of impacts observed, the length of time for each group to either recover to pre-DWH status or reach a steady state ("new normal" status), and the references associated with the summary metrics (Holling, 1973;Gunderson, 2000;Walker et al., 2004) are summarized ( Table 1).

OIL SPILL SCENARIO
The explosion of the DWH at the Macondo wellhead site created an oil spill lasting 87 days during the summer of 2010 releasing a cumulative 4.9 million barrels (one barrel = 42 gallons = 159 liters) of oil with 3.2 million remaining in the environment following mitigation efforts (burning, booming, and dispersant;McNutt et al., 2012;US District Court, 2015) (Figure 1). There were two primary pathways for benthic oil exposure. The first was the formation of subsurface hydrocarbon intrusions, which formed when emulsified oil (or oil droplets) exiting the broken riser pipe reached neutral buoyancy before reaching the surface. The primary intrusion depth was 1000-1300 m water depth, which impinged directly on benthic habitats along the bathymetric   slope of the nGoM (Joye et al., 2011;Kessler et al., 2011;Paris et al., 2012;Romero et al., 2015;Perlin et al., 2020). The second mechanism, now termed Marine Oil Snow Sedimentation and Flocculent Accumulation (MOSSFA) is the enhanced flocculation and sinking of particles containing petrogenic, pyrogenic, lithogenic, and biological (organic and inorganic, marine, and terrestrial) sources (Passow et al., 2012;Ziervogel et al., 2012;Passow, 2014;Brooks et al., 2015;Romero et al., 2015Romero et al., , 2017Daly et al., 2016Daly et al., , 2020Schwing et al., 2017aSchwing et al., , 2020aQuigg et al., 2020). MOSSFA resulted in a four-fold increase in bulk sedimentation Larson et al., 2018), intensification of reducing conditions for up to 3 years following the oil spill (Hastings et al., 2016), and a two-three-fold increase in polycyclic aromatic hydrocarbon (PAH) concentrations . Depending on the tracer used (hopane, PAHs, and radioisotopes, etc.), estimates vary widely on the seafloor coverage of MOSSFA (1,030-35,425 km 2 ) and the proportion of the total oil budget that was deposited (3.7-14.4%; Valentine et al., 2014;Chanton et al., 2015;Romero et al., 2015Romero et al., , 2017Passow and Ziervogel, 2016;Stout and German, 2017;Schwing et al., 2017a). The diversity indices of the groups included in this review (with the exception of corals) reach a maximum at approximately 1,500 m water depth, which is coincident with the depth of the Macondo wellhead site (∼1,520 m; Fisher et al., 2016). Considering the extent and concentration of MOSSFA-related oil deposited on the seafloor, it is logical then to evaluate the impacts and responses of benthic communities.

Microbiota
Deepwater Horizon was the first major oil spill for which genomics was applied over large spatial and temporal scales Kostka et al., 2020). Using these approaches, microbial communities were determined to be predominantly (90%) oil degrading species in areas exposed to hydrocarbons (Kleindienst et al., 2016). There was also a succession of microbial blooms with species adapted to degrade specific types of petroleum compounds in the water column and in surface sediments (Kleindienst et al., 2016;Yang et al., 2016a,b;Kostka et al., 2020). Alphaproteobacteria (Roseobacter) were predominant in surface sediments collected in September 2010 (Yang et al., 2016b). Yang et al. (2016b) initially suggested the presence of sulfate reducing families (Deltaproteobacteria) in October 2010 was indicative of MOSSFA stimulating microbial metabolism in sediments. Cycloclasticus, a hydrocarbon-degrading genus found in surface oil slicks and subsurface intrusions, was also dominant in surficial sediments in October-November 2010 (Yang et al., 2016b) contrasting with pre-DWH observations (March 2010, Yang et al., 2016a). Mason et al. (2014) found enriched Gammaproteobacterium and Colwellia species at the most heavily oil-impacted benthic sites, which were similar to those found in the subsurface intrusion, potentially fueled by nitrogen availability, and hydrocarbon induced mortalities on more sensitive species. Overholt et al. (2019) utilized sediment collections from 2012 to 2015 to characterize microbial communities for 29 sites (>700 samples) throughout the nGoM. By comparing these records to samples collected prior to, and just following DWH impacts (Mason et al., 2014), it was evident that sedimentary microbial communities impacted by DWH returned to near baseline conditions (e.g., at the class level of organization) within 2 years ( Table 1). Overholt et al. (2019) also developed a predictive microbial model leveraging geospatial and environmental variables to aid future oil spill response and mitigation efforts.
During surveys of the wellhead area in 2010 and 2014, extensive areas impacted by MOSSFA were observed in submersible ALVIN Yang et al., 2016a). The relative abundance of Deltaproteobacteria (Yang et al., 2016b) was lower in these samples and the relative abundance of Alphaproteobacteria and Planctomycetes was higher relative to abundances observed in the November 2010 samples. Sediment microbial community composition was extremely variable from site to site and over time, likely reflecting the heterogeneous nature of MOSSFA deposition (Westrich et al., 2020). Likewise, sediment metabolic rates, especially sulfate reduction, in MOSSFA layers were low compared to areas of natural seepage and to nearby controls lacking MOSSFA deposition (Fields and Joye, 2014). The dominant process in MOSSFA layers was denitrification and sulfate reduction rates were often below detection (Fields and Joye, 2014). It appears that microbial activity in MOSSFA layers is not elevated, as originally anticipated based on the large influx of organic carbon from the event (Westrich et al., 2020). The microbial community associated with MOSSFA layers is clearly distinct from deeper sediments from the same sites and from background and control sediments (Yang et al., 2016;Westrich et al., 2020). The perturbation in the community structure appeared to persist for 2-3 years after relaxing back to a new "steady state." Rates of benthic metabolism across the nGoM exhibit extensive and dramatic heterogeneity  due to the complex nature of organic matter inputs from oil and gas seepage and from terrestrial inputs, which diminish with distance from shore, and marine inputs, which are low relative to seepage inputs locally . Along the shelf and upper slope, delivery of terrestrial organic matter, as well as inorganic nutrients, leads to extremely high rates of sediment metabolism -up to 55 mmol DIC and 4.4 mmol NH 4 + released m −2 d −1 while up to 20 mmol of O 2 m −2 d −1 are consumed . Rates of sulfate reduction along the shelf are high (Canfield, 1989), but rates of sulfate reduction at oil and gas seeps are extreme: the highest volumetric rates of sedimentary sulfate reduction (max = 14 µmol cm −3 d −1 ) in the marine environment were documented at a Gulf oil seep (Arvidson et al, 2004). While the highest integrated rates of sulfate reduction in naturally oiled sediments are high (up to 700 mmol m −2 d −1 ; Joye et al., 2010), rates of sulfate reduction in the MOSSFA layer of sediments in the vicinity of the DWH wellhead were extremely low (<1 mmol m −2 d −1 ; Fields and Joye, 2014;Westrich et al., 2020). Orcutt et al. (2017) assessed microbial metabolic rates in situ at a Gulf cold seep, GC600 showing that addition of weathered oil stimulated sulfate reduction rates to levels rivaling those documented in oily cold seep sediments . The results of Orcutt et al. (2017) suggest that oil alone does not inhibit activity, rather the rapid deposition or some aspect of the sedimenting material instead suppressed sulfate reduction activity (Westrich et al., 2020).

Benthic Foraminifera
Benthic foraminifera are sensitive and diverse bioindicators of aquatic petroleum exposure (Morvan et al., 2004;Mojtahid et al., 2006;Denoyelle et al., 2010;Brunner et al., 2013;Lei et al., 2015). Following DWH, there was an 80-93% decrease in density and a 30-40% decrease in species diversity of benthic foraminifera at oil-impacted sites Schwing et al., 2017b). The assemblages were predominantly high-organic material flux and low oxygen tolerant species (e.g., Bulimina aculeata, Globocassidulina subglobosa) consistent with the conditions associated with a MOSSFA event (Schwing et al., 2017b). Following the initial decrease in 2010-2011, benthic foraminifera density and diversity increased from 2011 to 2014 and reached a steady state 5 years after the DWH oil spill (Schwing et al., 2018a). However, for many sites, the assemblages remain (as of 2018) significantly different than those prior to DWH (Schwing and Machain-Castillo, 2020;Schwing et al., 2020b). At the Macondo wellhead site and to the east near DeSoto Canyon, the predominant taxa (Uvigerina spp. and Bolivina spp.) were tolerant of high organic carbon deposition; these taxa were different than those documented prior to DWH (Schwing and Machain-Castillo, 2020;Schwing et al., 2020b). The benthic foraminiferal tests (shells) carbon isotopic composition was depleted in δ 13 C, relative to background, for up to 2 years following DWH (Schwing et al., 2018b;Schwing and Machain-Castillo, 2020). This isotopic signal suggests carbon uptake from oil, possibly caused by increased organic carbon flux due to MOSSFA (Schwing et al., 2018b). The depleted carbon signal in benthic foraminiferal tests (2010-2012) was preserved below surface sedimentary layers and likely represents the most robust tracer for long-term preservation of the MOSSFA signal in the sedimentary record (Schwing et al., 2018b;Schwing and Machain-Castillo, 2020).
Total foraminifera counts and short-lived radioisotope dates were utilized to construct a long-term (decadal) time series (Supplementary Figure 1 and Supplementary Table 2) of foraminifera richness, Shannon diversity and evenness Larson et al., 2018;Schwing et al., 2018a). With the exception of one site (SW01), the benthic foraminifera richness, Shannon diversity, and evenness decreased slightly in 2010 and returned to pre-DWH values within 3 to 5 years, which is consistent with the findings of Schwing et al. (2018a). Despite the return to pre-DWH diversity indices, as revealed in PERMANOVA tests (methods in Supplementary Material), the post-DWH (2011-2015) benthic foraminifera assemblages were significantly different than the pre-DWH (1977-2009) assemblages (Supplementary Figure 2). At site SW01, which is located approximately 90 km southwest from the DWH wellhead, there was a continuous downward trend in all diversity indices beginning in 2010, which had not returned to pre-DWH values as of 2015.
For meiofauna, data from the Northern Gulf of Mexico Continental Slope Study (NGOMCSS, Pequengnat et al., 1990), the Deep Gulf of Mexico Benthos Program (DGoMB; Baguley et al., 2006;Rowe and Kennicutt, 2009), and post-DWH, the NRDA (Montagna et al., 2013;Reuscher et al., 2017) were combined to create a long-term (decadal) time series (Supplementary Figure 3 and Supplementary Table 3). Shannon diversity has decreased and the nematode:copepod ratio has increased continuously in the nGoM since the 1980's, which were consistent with a long-term decreasing ecological quality status (EQS) unrelated to DWH. However, there was also a noticeable additional decrease in evenness and increase in abundance in the post-DWH collections, which was consistent with an increase in opportunistic taxa related to DWH-related stressors.

Macrofauna
Macrofauna range from 300 µm to 30 mm in size and are dominated by Polychaeta (worms), Crustacea (shrimp), and Mollusca (clams, snails; Montagna and Girard, 2020). Moderate to severe impacts to macrofaunal density (impacted: 7,000 n/m 2 , reference: 8,600 n/m 2 ) and diversity (N1, impacted: 11, reference: 17) were assessed following the DWH throughout an area of about 148 km 2 and 24 km 2 , respectively, surrounding the wellhead (Montagna et al., 2013(Montagna et al., , 2017a. Macrofaunal diversity fell below background values in the impacted area (up to 29 km away from the wellhead) and crustaceans were observed to be the most sensitive to oil residue exposure (Washburn et al., 2016). Abundance increased (impacted: 11,800 n/m 2 , reference: 8,900 n/m 2 ) within a 1 km radius of the wellhead in 2011, due primarily to opportunistic polychaetes (Family Dorvilleidae; Washburn et al., 2017). Throughout the remainder of the impacted area, macrofaunal richness, and diversity remained 22.8% (impacted: 20 taxa/sample, reference: 26 taxa/sample) and 35.9% (N1, impacted: 11, reference: 18) lower, respectively, than reference areas in 2011 (Montagna et al., 2017a). As of 2014, much like meiofaunal taxa richness, macrofaunal taxa richness remained lower in the impacted (25 taxa/sample) areas versus the reference (30 taxa/sample) areas, suggesting that a full recovery had not occurred as of 4 years after the DWH . Overall, macrofauna and meiofauna richness and abundance indicate a more disturbed environment in a broad area surrounding the DWH site in the nGoM from pre-to post-DWH (Schwing et al., 2020b). Also, taking into account the average sediment accumulation rates, oil residue degradation rates, and metabolic rates it may take between 50 and100 years to fully bury and/or degrade DWH-contaminated sediment below macrofaunal bioturbation depths, thus allowing a full recovery of benthic species diversity and abundance (Montagna et al., 2017a).
For macrofauna, data from the NGOMCSS (Gallaway 1988), DGoMB (Haedrich et al., 2008;Rowe and Kennicutt, 2009), and post-DWH (Montagna et al., 2013;Washburn et al., 2017) were used to create a long-term (decadal) time series across the Gulf of Mexico deep softbottom habitats (Supplementary Figure 4). With the exception of site MT1, there was a gradual increase in macrofauna abundance and gradual decrease in evenness over the entire record . These changes were similar to the meiofauna records, which were consistent with a long-term decreasing EQS unrelated to DWH. There was also a decrease in Shannon diversity beginning in 2010 and continuing through the latest collections (2014), which was also consistent with ongoing impact from DWH.

Megafauna
There were very limited surveys that included benthic megafauna in deep waters of the nGoM following DWH. Valentine and Benfield (2013) performed remotely operated vehicle (ROV) megafauna (>30 mm) surveys in August and September 2010 at 2,000 m (north, west, south, and east) and 500 m (north) of the wellhead. They were challenged to quantify impact and response due to the limited baseline measurements of megafauna (e.g., Chaceon quinquedens, Nematocarcinus, Venefica procera, Dicrolene spp., Synaphobranchus spp., Halosauridae, Bathypterois quadrifillis, Cerianthid anemone) and associated physicochemical parameters in the area prior to DWH. Lowest species richness and abundances were measured at the sites located at 500 m north and 2,000 m south of the wellhead, which is consistent with hydrocarbon concentrations that were determined to be sufficiently high to cause mortality and/or emigration (Valentine and Benfield, 2013). Valentine and Benfield (2013) also documented widespread pyrosome and salp carcasses, indicating that planktonic assemblages up to 2,000 m away from the wellhead were impacted. Mcclain et al. (2019) performed similar ROV surveys in June 2017 at the DWH wreckage site, wellhead site, the 500 m north and 2,000 m south sites from Valentine and Benfield (2013), and four other control sites throughout the nGoM. There was still considerable degradation at the DWH wreckage, wellhead, 500 m north and 2,000 m south sites 7 years after DWH (Mcclain et al., 2019). The impacts observed included lower species diversity, higher homogeneity among assemblages, and abnormal population densities (Mcclain et al., 2019). Employing multivariate analysis techniques PERMANOVA, PERMDISP, and Canonical Analysis of Principle Coordinates (methods described in Supplementary Material) we found 4.5-times lower dispersion between sites in 2017 than in 2010, consistent with the hypothesis and information that community homogenization (similarity in composition and abundance between sites) is a consequence of DWH pollution as described by Mcclain et al. (2019) and consistent with reduced resilience from 2010 to 2017 (Supplementary Figure 5 and Supplementary Table 4). Arthropod abundance (e.g., red shrimp Nematocarcinus, white caridean Glyphocrangon shrimp, and C. quinquedens) at the DWH wreckage site was more than 7-times higher than the background sites (Mcclain et al., 2019). Mcclain et al. (2019) explain that this abnormally high abundance may potentially be caused by the attraction of crustaceans to hydrocarbons mimicking natural chemical cues (Kittredge, 1973).

Corals
An initial survey of deep-sea corals, funded by the Bureau of Ocean Energy Management (BOEM) and the National Oceanic and Atmospheric Administration (NOAA) Office of Ocean Exploration and Research (OER), was performed in October 2010, 3 months after the DWH wellhead was capped. This survey was focused on healthy coral colonies throughout the northern Gulf, but also found a previously unknown coral community near Mississippi Canyon (MC294), which was visibly impacted by flocculent material containing both oil and dispersant (White et al., 2012(White et al., , 2014DeLeo et al., 2015). Visible impacts (e.g., mucous strands, loose tissue, and bare skeleton) from oiled flocculent material were assessed by examination of digital imagery for Paramuricea biscaya, Paragorgia regalis, and Swiftia pallida (White et al., 2012;DeLeo et al., 2018;Montagna and Girard, 2020). ROV surveys performed in 2011 as part of the NRDA (Fisher et al., 2014a,b) identified two additional coral sites (MC297, MC344) near (6-20 km) the DWH wellhead from 1,560 to 1,850 m water depth as also being visibly impacted by oiled flocculent material. The observed impacts were primarily associated with P. biscaya, including patchy tissue death attributed to microdroplets of oil and dispersant or marine oil snow, and hydrozoan colonization (Fisher et al., 2014b).
Following the initial discovery and documentation of impacted corals at deeper sites, additional work focused on mesophotic (<100 m depth) coral banks between Louisiana and Florida. Multiple species of octocorals exposed to elevated hydrocarbon concentrations presented visible signs of colony injury, including tissue and branch loss (Silva et al., 2015). Impacts at the Alabama Alps site, Roughtongue Reef, and Yellowtail Reef of up to 50% were documented in the coral colonies and the rate of injury was approximately 10 times that of the same sites compared to pre-spill conditions (Etnoyer et al., 2016). In a follow-up survey in 2014, the majority of injured coral colonies marked in 2011 continued to decline in health, with little obvious signs of recovery (Etnoyer et al., 2016).
In 2012, most corals at the deeper sites (MC294, MC297, and MC344) were still impacted, but the level of impact had decreased from the 2010 and 2011 surveys and the recovery rate was determined to be dependent on the level of initial impact (Hsing et al., 2013;Montagna and Girard, 2020). Visible impacts, including branch loss, remained higher at these sites than reference sites through 2017 Montagna and Girard, 2020). According to model results based on branch loss/growth from 2010 to 2017, most impacted corals may take up to 30 years to recover to a state where all remaining branches appear healthy and are projected to reach a pre-DWH status (with some unhealthy branches) within 10 years . However, overall branch loss within that time period accounts for a 10% reduction in biomass at the impacted sites . Due to branch loss and the extremely slow growth rates of some corals (e.g., P. biscaya = 0.14-1.2 cm/year/colony), the colonies at site MC294 are expected to achieve their original size (pre-DWH) in over 50 years on average FIGURE 2 | A conceptual model of the benthic ecotype including the major groups, the interaction amongst the groups the interaction between groups and environmental controls, food sources and/or stressors [e.g., hydrocarbon (HC) toxicity], and processes within the benthic ecotype (panels A-D) and among the other ecotypes (panels E-H) for four time periods: (1) pre-DWH, (2) during DWH, (3) post-DWH 1-2 years, and (4) post-DWH 3-10 years. The pre-DWH model is modified from Rowe and Kennicutt (2009). Processes not discussed in the text are: "flux by fin" (Nelson et al., 2012), high molecular weight (HMX) hydrocarbons from petroleum seeps .
with some individual colonies requiring more than 100 years to recover (Girard et al., 2019;Montagna and Girard, 2020).
A model of Paramuricea population growth estimated recruitment to occur at approximately 10-20 individuals per year per site, but recruitment was also estimated to be highly variable and patchy (Doughty et al., 2014). Doughty et al. (2014) also estimated 40-50% mortality in the youngest size classes in the model and mortality declining to less than 1% in colonies over 20-30 cm. Assuming that larvae are capable of colonizing and surviving to maturity at existing sites, estimates of growth that range between 0.03 and 0.2 cm year −1 (Prouty et al., 2016) suggest that it would take 100 to over 600 years for a colony to reach 20 cm, corresponding to the smallest size classes measured at the impact sites. Therefore, recovery of sites that require complete replacement of entire colonies would require centuries to complete.

Fishes
Deepwater Horizon impacts on the majority of benthicdependent and demersal fish species are described in Sutton et al., (In Prep) and Patterson et al., (In Prep). Benthic-dependent fish species are diverse; the majority having meroplanktonic larval stages (Limouzy- Paris et al., 1994;Powell et al., 2017). Cusk eels (Family Ophidiidae), a major and representative component of the benthic fish community of the nGoM are bottom dwelling, living in the surface sediments with a pelagic larval stage. Their larval stages are cod-like and are ranked quite high (#14 over a range of 1-86.5) in both frequency of occurrence and abundance in plankton samples from the upper 100 m of the water column, which makes them an important benthic resource and direct coupling from benthic to the pelagic system (Limouzy- Paris et al., 1994;Mann et al., 1997). They spend the majority of their time near the seafloor or burrowed within the sediment (Mann et al., 1997). While Ophidiidae were included in some megafauna surveys (e.g., Mcclain et al., 2019), their status prior to DWH for the majority of the nGoM and any subsequent impact was understudied. It is likely that Ophidiidae habitat in the deep benthos was impacted by the deposition of MOSSFA material, considering its spatial coverage in the deep benthos of the nGoM.

CONCEPTUAL MODEL OF KEY ECOLOGICAL GROUPS IN RELATION TO THE DEEPWATER HORIZON OIL SPILL
The DWH had both broad spatial scale and long-term consequences for many groups in the benthic ecotype. As recommended by Peterson et al. (2003) following the Exxon Valdez oil spill, it is important to attempt to connect the various responses of these size and taxa groups within the benthic ecotype and among other ecotypes (open ocean, continental shelf, and coastal) to prepare for assessing and managing future perturbations. One way to do this is via conceptual models, or summary process graphics, which account for the major ecological groups, their interactions (predation, scavenging, and bactivory, etc.), the interaction between groups and environmental controls, food sources, and/or stressors (particulate/dissolved organic matter, deadfalls, and hydrocarbon seeps, etc.) via deposit feeding, lysis and dissimilation, assimilation, and chemotrophy. The model presented here (Figure 2) is presented over four time periods: (1) pre-DWH, (2) DWH, (3) post DWH 1-2 years, and (4) post-DWH 3-10 years. These time periods were chosen to be consistent with other syntheses describing the transport and fate of oil residues (Overton et al., in prep). The pre-DWH model of associations within the deep benthic ecotype is based on GoM interaction diagrams from Rowe (2017) and Rowe and Kennicutt (2009), which employ comparable state variables in each group including density, biodiversity and species composition. The conceptual model presented herein accounts for fluxes (arrow thickness), increased abundance/activity/respiration (emboldened reservoirs), lethal and sub-lethal effects (red and yellow coloration, respectively), steady-state (green outline), and changes in community structure (blue outline). This conceptual model is reflective of changes observed in the literature cited above in most cases. The exceptions, due to gaps in observation, are the assumed decrease in predation and diminished respiration between the benthos and the water column.

VULNERABILITY-RESILIENCE ANALYSES
To prepare for trade-off analyses in response to future oil spills, we performed a vulnerability-resilience (V-R) analysis for deep benthic ecotype synthesis group. The V-R analysis presented here (Figures 3A,B) was based on productivity and susceptibility analysis (Arrizabalaga et al., 2011;Murawski et al., in review A) and adapted for vulnerability and resilience assessment (Lange et al., 2010) to best characterize the deep benthic system impact and response following DWH. The V-R analysis utilizes predetermined attributes to assess the vulnerability of a group or taxa to a specific insult and its resilience. A matrix was developed based on these rankings along two axes: (1) a vulnerability axis based on the overlap of oiled sedimentation with the distribution of a specific taxa and the sensitivity of that taxa to oil toxicity and (2) a resilience axis based on the productivity, connectivity with other taxa, and level of decline of a specific taxa. Based on these attributes, each taxa for which observations were available was placed in the matrix based on a low, medium or high ranking in both the vulnerability and resilience categories (Figure 3 and Supplementary Table 1). The V-R analysis ultimately assesses potential vulnerability of deep benthic resources to oil spills by comparing relative risk among populations and has the potential to alert managers to taxa or populations that are more likely to be sensitive to oil spills in general. This approach is particularly useful as a hypothesis-testing tool for the deep benthic ecotype because it is generally data-poor and there are highly variable levels of data richness among the various groups. The placement of each population on the matrix can be supported, tested and uncertainty assessed by existing literature (or lack thereof). This approach is also useful to identify gaps in existing knowledge and has the potential to support the design of response efforts and NRDA surveys in different settings. The analysis ultimately evaluates the relative risk to ecosystem components for an oil spill with characteristics similar to DWH in the nGoM.
To streamline the results of the V-R analysis, a summary figure of the distribution of each group on the V-R matrix is also presented ( Figure 3B). Overall, there is a transition from the bottom left of the matrix (low vulnerability, high resilience) to the top right of the matrix (high vulnerability, low resilience) with increasing organism size (e.g., microbes to megafauna). This transition is likely a result of the slower growth rates, slower community turnover, and longer lifespan of the fauna. The coral group is widely distributed with representative taxa ranging from low to high vulnerability and resilience due to the disparity in abundance, range size, and longevity of the different species; information that was primarily obtained after the DWH. The demersal fish group is represented only by the Family Ophidiidae (cusk eels) family (43 species), which have been classified as medium vulnerability (overlap of habitat with MOSSFA) and medium resilience, based on their size and relatively short lifespan of 2-3 years (Retzer, 1991). This analysis has also shed light on many gaps in the assessment of the deep benthos. First and foremost, there were only two surveys of deep benthic fishes and megafauna following DWH with 7 years between the surveys (Valentine and Benfield, 2013;Mcclain et al., 2019) and no direct comparative studies prior to DWH. The temporal resolution and lack of baseline severely limited the ability to assess resilience and vulnerability for these two groups. The deep benthic ecotype, as a whole, is relatively data-poor and warrants better spatial and temporal coverage to establish long-term trends in multiple, chronic biotic controls upon which episodic stressors may be superimposed. Parker and Wiens (2005) argue that long-term hypothesisdriven research is the only way in which to responsibly assess vulnerability and resilience of a given system. They argue that ecological assumptions about the system in question affect the assessment of recovery following an incident. This argument is supported by the presentation of four cases: (1) steady state equilibrium, which is a basic model in which the system has not been unduly perturbed, (2) spatial equilibrium, where a system's recovery potential is based on a return of EQS with respect to a reference area, (3) dynamic equilibrium, where the reference and impacted areas have a different EQS but the same temporal dynamics, and (4) highly variable, where it is possible to mistakenly assess recovery or continued impact due to the lack of knowledge about the natural variability of the system (Parker and Wiens, 2005). With this in mind, it is important to identify existing long-term programs, develop robust decision-making tools that can be utilized by natural resource managers, and identify important areas of the GoM upon which to focus these efforts and provide recommendations on the design of such efforts.

Future Research Recommendations
There have been a few previous efforts to summarize the implications of existing long-term research programs for their utility in oil spill effects assessment. The BOEM organized the Gulf of Mexico Workshop for International Research, reporting on ongoing, long-term research efforts (McKinney et al., 2019). The Ocean Conservancy also produced a report that includes ongoing long-term research projects in the GoM that include deep benthic collections and surveys (Love et al., 2015).
Similar descriptions of relevant research exist elsewhere. For example, the European Water Framework Directive (European Union Water Framework Directive [EUWFD], 2000) established a legal framework in which water bodies are defined by their EQS and must meet certain requirements, or the governing body must identify the pollution source and take action to remediate (Borja et al., 2004a,b). The EQS standards include physicochemical, hydromorphological, and biological categories. One successful outcome of this directive was the development of an EQS measurement tool, the AZTI marine biotic index (AMBI; Borja and Muxika, 2005). AMBI utilizes benthic invertebrates, typically macrofauna and/or foraminifera (Alve et al., 2016;Jorissen et al., 2018) to produce an easily measured, simple to communicate, and cost effective decision support tool (Rees et al., 2006). The development of a GoM-wide, foraminiferabased AMBI is already underway (O'Malley et al., 2020) and with input from resource managers, would provide an easily operationalized tool to interpret benthic habitat suitability and ecological health status.
Considering the areas affected by DWH and the ongoing oil and gas exploration and production in the GoM, there are several areas that are important to include in future long-term research efforts. The first is the Perdido field in the west-central GoM along the United States-Mexico border where current petroleum exploration is occurring as deep as 3,000 m water depth and where there are few baseline measurements of the status of the benthic fauna. The importance of including the Perdido field in future research efforts is primarily to coordinate, in advance, the multinational response effort that would be necessary in the case of a spill in that area. In the eastern GoM, there is currently a moratorium on energy development, but a likely area of development in the future is the Norphlet play, in the East-central GoM near the Florida escarpment (Reid, 2014). The importance of this area for inclusion in future research efforts is primarily to establish baselines prior to any development of energy resources in this area. Following the DWH, the DeSoto Canyon and Mississippi Canyon were identified as biologically important areas that also acted as focusing mechanisms for MOSSFA (Rowe and Kennicutt, 2009;Schwing et al., 2017a). It is important to continue ongoing research efforts in these areas to disentangle the effects of ongoing petroleum exploration and DWH from other chronic stressors, particularly to document the pace of long-term recovery. Finally, the Campeche Canyon in the southern GoM, which is located just north of Laguna Terminos and the petroleum production exclusion zone in the Cantarell field, has also been identified as a biologically sensitive area (Machain-Castillo et al., 2019) and is analogous to the canyons in the northern GoM with respect to MOSSFA focusing during past (Ixtoc 1) and future spills.
Several recommendations for future long-term research programs will build from the success of the approaches employed during and following, and the gaps identified since DWH. Future long-term research programs must address communities that are important to management such as economically important taxa and/or keystone and indicator taxa. It is also important to focus on oil spill preparedness in the context of other stressors and perturbations (Parker and Wiens, 2005). Recognizing that long-term, hypothesis-driven research programs are not simply monitoring efforts, but allows testing models of system resiliency (Table 1 and Figure 2), has important implications for future damage assessments in the case of an oil spill. Long-term research programs can account for and report trade-offs for petroleum extraction locations and oil spill response. In fact, the only way to quantify trade-offs is by accumulating an adequate information base from long-term research efforts. Future longterm research efforts must also take into account habitat degradation and recovery rates as primary metrics (Danovaro et al., 2020). Finally, long-term research efforts can help tailor improved pollution abatement and resource protection strategies for threats resulting from ongoing discharges, temporal restrictions necessary to protect primary reproductive periods, and adequate spatial management practices (Cordes et al., 2016). These recommendations also align with overarching goals of the United Nations Ocean Decade of Ocean Science for Sustainable Development (2021Development ( -2030 which are to (1) identify, quantify and remove pollutants to provide a clean ocean, (2) measure multiple impacts to provide a resilient ocean, and (3) increase capacity to understand current and future ocean conditions to provide a predicted ocean (Ryabinin et al., 2019).

Synthesis Statements
The following statements regarding the record of species and community change in the north-central and northeastern Gulf of Mexico (nGoM) deep benthic ecotype before, during and following the DWH oil spill in 2010 along with recommendations on how to prepare for future perturbations arise from this effort: • Avoidance of toxic petroleum residues was not possible for non-motile benthic organisms during an oil spill of DWH magnitude. • The response of deep benthic communities was a spatially and temporally variable and some communities have not recovered as of 10 years after the DWH spill. • Overall, larger organisms require longer to recover, which is likely due to increased lifespan and slower community growth rates (Table 1 and Figure 3). • Although some communities have reached steady state since DWH, not all of these returned to the pre-DWH state. • Considering sediment accumulation rates, hydrocarbon residue degradation rates and bioturbation depths, it may take 50-100 years to fully bury DWH contaminants and achieve a full recovery. • Traditional effect levels based on shallow-water studies do not explain the relationship in the deep sea between contaminant levels and documented impacts. Experimental culturing studies under appropriate concentrations and pressure are needed to improve organic contaminant targets for monitoring. • More comprehensive environmental assessment (funding for baseline measurement) prior to exploration/exploitation is necessary.
• Long-term, hypothesis driven studies are needed to extricate DWH and future oil spill impacts and recovery trajectories from other stressors. Longer-term funding periods should be commensurate with benthic faunal growth rates, turnover and predicted length of recovery time.

FUNDING
Funding for this project was provided by the Gulf of Mexico Research Initiative (GOMRI) through its research centers including: The Center for Integrated Modeling and Analysis of Gulf Ecosystems (C-IMAGE); and Ecosystem Impacts of Oil and Gas Inputs to the Gulf (ECOGIG). of the other ecotype groups for their input and refinement of the core 3 synthesis effort and design of the vulnerabilityresilience analysis.