Validation of Oil Trajectory and Fate Modeling of the Deepwater Horizon Oil Spill

Trajectory and fate modeling of the oil released during the Deepwater Horizon blowout was performed for April to September of 2010 using a variety of input data sets, including combinations of seven hydrodynamic and four wind models, to determine the inputs leading to the best agreement with observations and to evaluate their reliability for quantifying exposure of marine resources to floating and subsurface oil. Remote sensing (satellite imagery) data were used to estimate the amount and distribution of floating oil over time for comparison with the model’s predictions. The model-predicted locations and amounts of shoreline oiling were compared to documentation of stranded oil by shoreline assessment teams. Surface floating oil trajectory and distribution was largely wind driven. However, trajectories varied with the hydrodynamic model used as input, and was closest to observations when using specific implementations of the HYbrid Coordinate Ocean Model modeled currents that accounted for both offshore and nearshore currents. Shoreline oiling distributions reflected the paths of the surface oil trajectories and were more accurate when westward flows near the Mississippi Delta were simulated. The modeled movements and amounts of oil floating over time were in good agreement with estimates from interpretation of remote sensing data, indicating initial oil droplet distributions and oil transport and fate processes produced oil distribution results reliable for evaluating environmental exposures in the water column and from floating oil at water surface. The model-estimated daily average water surface area affected by floating oil >1.0 g/m2 was 6,720 km2, within the range of uncertainty for the 11,200 km2 estimate based on remote sensing. Modeled shoreline oiling extended over 2,600 km from the Apalachicola Bay area of Florida to Terrebonne Bay area of Louisiana, comparing well to the estimated 2,100 km oiled based on incomplete shoreline surveys.

Trajectory and fate modeling of the oil released during the Deepwater Horizon blowout was performed for April to September of 2010 using a variety of input data sets, including combinations of seven hydrodynamic and four wind models, to determine the inputs leading to the best agreement with observations and to evaluate their reliability for quantifying exposure of marine resources to floating and subsurface oil. Remote sensing (satellite imagery) data were used to estimate the amount and distribution of floating oil over time for comparison with the model's predictions. The model-predicted locations and amounts of shoreline oiling were compared to documentation of stranded oil by shoreline assessment teams. Surface floating oil trajectory and distribution was largely wind driven. However, trajectories varied with the hydrodynamic model used as input, and was closest to observations when using specific implementations of the HYbrid Coordinate Ocean Model modeled currents that accounted for both offshore and nearshore currents. Shoreline oiling distributions reflected the paths of the surface oil trajectories and were more accurate when westward flows near the Mississippi Delta were simulated. The modeled movements and amounts of oil floating over time were in good agreement with estimates from interpretation of remote sensing data, indicating initial oil droplet distributions and oil transport and fate processes produced oil distribution results reliable for evaluating environmental exposures in the water column and from floating oil at water surface. The model-estimated daily average water surface area affected by floating oil >1.0 g/m 2 was 6,720 km 2 , within the range of uncertainty for the 11,200 km 2 estimate based on remote sensing. Modeled shoreline oiling extended over 2,600 km from the Apalachicola Bay area of Florida to Terrebonne Bay area of Louisiana, comparing well to the estimated 2,100 km oiled based on incomplete shoreline surveys.

INTRODUCTION
Given the catastrophic nature of the Deepwater Horizon (DWH) oil spill of April-July 2010, considerable effort has been applied by many to understand the oil movements and fate, exposure of environmental resources, and potential impacts on biota and ecosystems. Under United States law, the Natural Resource Damage Assessment (NRDA) was undertaken by the Deepwater Horizon Natural Resource Damage Assessment Trustee Council (2016) to evaluate exposures to oil and quantify injuries to biota and the Gulf of Mexico (GOM) ecosystem resulting from the spill, supported by hundreds of studies and thousands of scientists and engineers. As logistics constrained obtaining enough field data to completely characterize the contamination in space and time during and after the period of oil and gas release, as part of the NRDA effort in support of the trustees, we pursued a numerical modeling effort to quantify oil movements, concentrations, exposures of water column biota and biological effects. As a first step, oil trajectory and fate modeling was performed using the established and verified model SIMAP (Spill Impact Model Application Package ;French McCay, 2003French McCay et al., 2015a, 2018b. The results of the oil trajectory and fate modeling were then used to evaluate exposure and biological effects of fish and invertebrates in the GOM (French McCay et al., 2015a,b,c,d,e). Subsequent to the NRDA settlement in 2016, the oil trajectory and fate modeling was refined as part of model validation studies (French McCay et al., 2018a,c) and pursuant to on-going efforts to evaluate the environmental effects of the spill.
The extent of oil contamination from the DWH was visibly evident during 2010 across the northern GOM in remote sensing imagery (Garcia-Pineda et al., 2013a,b;MacDonald et al., 2015;Svejkovsky et al., 2016) and based on field observations (see review in Deepwater Horizon Natural Resource Damage Assessment Trustee Council, 2016). Oil released from the broken riser both dispersed at depth and rose through nearly a mile of water column before reaching the surface. There are substantial uncertainties in the spill trajectory and spatialtemporal distributions of oil arising from the differences and uncertainties in wind and ocean current model data used to force the oil spill model, as is evident when comparing published oil spill model trajectories (Adcroft et al., 2010;MacFadyen et al., 2011;Liu et al., 2011;Mariano et al., 2011;North et al., 2011North et al., , 2015Dietrich et al., 2012;Paris et al., 2012;Le Heìnaff et al., 2012;Kourafalou and Androulidakis, 2013;Jolliff et al., 2014;Boufadel et al., 2014;Goni et al., 2015;Testa et al., 2016;Özgökmen et al., 2016;Weisberg et al., 2017). Hence, we examined implications of using various wind and current data sets, as well as assumptions related to wind drift of surface oil, to determine the inputs yielding the most accurate trajectory for evaluating the oil fate and environmental exposures resulting from the spill. The trajectories were compared to observational data such as floating oil distributions based on remote sensing, shoreline oil surveys, chemical analyses of field samples, and sensor data. It is important to consider the influence of uncertainties in the input data driving transport, as well as other model inputs, on exposure concentrations relevant to evaluating biological effects.
Thus, the objective of this article was to evaluate the accuracy of DWH model trajectories and oil distributions in the water column and at the surface based on various input data sets and implications to quantification of oil fate and exposure concentrations. The details of the modeled mass balance, exposure concentrations and biological effects are to be described in other publications (in preparation).

Approach
Model simulations were performed using four available meteorological and seven hydrodynamic model products covering the northeastern GOM. In addition, measured currents were used to evaluate transport along the continental slope. Model trajectories were also performed with winds alone to evaluate the influence of the currents. Wind drift and horizontal dispersion coefficients were altered to evaluate sensitivity to those assumptions. The model trajectories and concentration distributions were compared to observations of surfacing oil, remote sensing data-based observations, shoreline oiling distributions, fluorescence and other sensor data indicating the path of the deep plume, and chemistry sample results. The focus of this article is on the influence of physical forcing data (currents and winds) on distributions of surface and shoreline oil. Subsurface oil and oil component concentrations were compared to field measurement data in French McCay et al. (2018a,c), demonstrating sensitivity to the current data used as input.

Model Description
Spill Impact Model Application Package quantifies oil movements and concentrations of pseudo-components representing groups of petroleum compounds of like properties in droplet and dissolved phases in the water column, in floating oil slicks, emulsions and residuals, and as mass stranded on shorelines, settling to sediments, volatilized to the atmosphere, degraded in/on the water, shorelines and sediments, and as removed by response activities (i.e., mechanical removal and burning). Processes modeled included spreading (gravitational and by shearing), evaporation of volatile and semi-volatile oil pseudo-components from surface oil, current transport on the surface and in the water column, randomized dispersion from small-scale motions (mixing), emulsification, entrainment of oil as droplets into the water (natural and facilitated by dispersant application), dissolution of soluble and semi-soluble pseudocomponents, volatilization of dissolved compounds from the surface wave-mixed layer, adherence of oil droplets to suspended particulate matter (SPM), adsorption of semi-soluble compounds to SPM, sedimentation, stranding on shorelines, and degradation (using pseudo-component-specific first-order biodegradation and photo-oxidation rates). Sublots of the discharged oil are represented by Lagrangian Elements (LEs, called "spillets"), each characterized by location, state (floating, subsurface droplet, on sediment, and ashore), mass of the various pseudo-components, water content of the oil, thickness, diameter (i.e., a floating spill is treated as a flat cylinder with increasing area as oil spreads locally), density, viscosity, and associated SPM mass. A separate set of LEs is used to track mass, spatial distribution of the diffusing mass, and movements of the dissolved components.
The SIMAP model's algorithms and assumptions are fully described in French McCay et al. (2018b). The surface oil entrainment algorithm is described in Li et al. (2017a,b). Brief descriptions of the SIMAP oil trajectory and fate model, along with validation studies for two major oil spills, the North Cape and the Exxon Valdez oil spills, are available in French McCay (2003. Thus, only the most pertinent algorithms to this reported effort are described here. Transport is modeled as the sum of advective velocities by currents input to the model, wind-driven drift of floating oil, vertical movement of subsurface particles according to (droplet and SPM) buoyancy (using modified Stokes Law; White, 2005), and randomized turbulent diffusive velocities in two (floating oil) or three (subsurface oil) dimensions French McCay et al., 2018b). The wind-driven drift due to waves (i.e., Stokes drift) and Ekman transport at the surface was either modeled, based on the results of Stokes drift and Ekman transport modeling by Youssef (1993) and Youssef and Spaulding (1993), which indicates about 3.5% of wind speed 20 degrees to the right of downwind for a fully developed sea offshore, or assumed a constant (range 1-4%) of wind speed and either downwind or at a specified angle. Environmental data such as temperature, salinity, SPM concentrations, water depth, and habitat characteristics were input as spatial data sets, varying temporally as appropriate (e.g., temperature and salinity).
Another key input influencing oil trajectory is the droplet size distribution (DSD) of the oil released at depth. The oil and gas from the DWH were released as part of a momentumdominated jet, which became a buoyant plume (Camilli et al., 2010;Socolofsky et al., 2011Socolofsky et al., , 2015aJohansen et al., 2013;Spaulding et al., 2015Spaulding et al., , 2017Gros et al., 2017). As seawater entrained into the buoyant plume and gas dissolved or escaped, the plume became neutrally buoyant with the surrounding seawater at about 200-400 m above the release point, as was evident in observational data (Camilli et al., 2010;Diercks et al., 2010;Valentine et al., 2010;Reddy et al., 2012;Spier et al., 2013;Payne and Driskell, 2016Driskell and Payne, 2018). Thus, oil droplets were initialized in the SIMAP model at this "trap height" in droplet sizes estimated by Spaulding et al. (2015;using the Li et al., 2017a algorithm) based on daily estimates of the oil volume released, the percentage of gas and dispersant (from subsea injection) in the oil and gas plume, the depth of the release, the orifice configurations, and the environmental conditions, all of which are important controlling variables to the DSD (National Academies of Sciences Engineering and Medicine, 2020).
Model developments reflected by the DWH analysis herein include the updated oil entrainment algorithm (Li et al., 2017a,b), increased numbers of pseudo-components tracked (from 7 to 19), refinements in biodegradation and photolysis rates, the ability to utilize 4-dimentionally varying environmental data sets provided in a variety of grid types, and increased resolution via the numbers of spillets used and the concentration outputs. The Li et al. (2017a,b) model allowed the implications of both surface and subsea dispersant applications on oil droplet size to be addressed. The later four developments were needed to address the vast extent of the GOM affected, including resolving concentrations over ∼2,000 m of water column. Instantaneous, surface spills previously addressed (e.g., French McCay, 2003 had not required as much resolution, given that most of the oil was on/in surface waters and many of the components evaporated immediately.

Model Inputs Winds
Four wind reanalysis products covering 2010 in the northeastern GOM obtained from National Oceanic and Atmospheric Administration (NOAA) and United States Navy government websites were used as model inputs.
• The North American Mesoscale Forecast System (NAM) data set provided by NOAA National Center for Environmental Prediction (NCEP) 1 was at approximately 12 km resolution and 1-h time steps. • The North American Regional Reanalysis (NARR) data set provided by NOAA NCEP 2 was available at 3-h time steps and approximately 0.3 • (32 km) resolution. • The Climate Forecast System Reanalysis (CFSR) data set provided by NOAA NCEP was a reanalysis of 2010 at a horizontal resolution of 0.5 •3 . • The Navy Operational Global Atmospheric Prediction System (NOGAPS) data set provided by the United States Navy had horizontal resolution of 0.5 • , with a time step of 6 h 4 .

Currents
Current measurements. Acoustic Doppler Current Profiler (ADCP) data meeting quality criteria (French McCay et al., 2018a,c) were available for April-September 2010 from 17 locations along the continental slope of the Northeastern GOM (Figure 1) (Mulcahy, 2010). During the period from April-July 2010, the temporally averaged current velocity at station 42,916 was 2.2 cm/s (0.04 kt) to the northeast and 3.9 cm/s (0.08 kt) to the southwest at 64 and 1,087 m below the water surface, respectively. The maximum current was 51 cm/s in the surface layer. The ADCP data meeting quality criteria were used in SIMAP model simulations for comparison with simulations run with hydrodynamic model results. The velocity at the location of each spillet used for transport calculations at each time step was interpolated from the data taken at the 17 ADCP stations using an inverse distance-weighted scheme employing all sensors. The ADCP coverage extended along the continental slope in the area of concern, but there were no data on the shelf. In addition, the interpolation provided a smoothed current field, and did not resolve smaller scale features and shear less than the scale of the distance between ADCP moorings (order of 30-100 km). Also, ADCPs do not provide estimates of surface currents. Thus, simulations using ADCPs were realistic only for oil transport below the 40 m mixed layer in areas where ADCP data were available, i.e., over the continental slope.
Hydrodynamic models. Seven data sets of currents predicted by hydrodynamic models were evaluated. The wind data from the model used to force the hydrodynamics, along with the current data, was used in the oil spill modeling.   (Hyun and He, 2010;Xue et al., 2013). The horizontal resolution is ∼5 km with 36 terrain-following vertical layers. SABGOM was forced with NARR winds. Current predictions were provided every 3 h. • IAS ROMS: The Intra-Americas Sea Regional Ocean Modeling System (IAS ROMS) was developed from SABGOM. It was applied with a grid resolution of ∼6 km in the horizontal and 30 levels in the vertical. An IAS ROMS simulation (version "4C") for 2010, that included a 2-km nested grid within the larger IAS ROMS domain, was run as part of the trustees' NRDA program and provided by Chao et al. (2014) in April 2014 (model described in Chao et al., 2009). This simulation forced with NAM winds provided hourly data.

Geographical data
A rectilinear grid was used to designate the location of the shoreline, the water depths (bathymetry), and the habitat or shore types. NOAA Office of Response and Restoration, Environmental Sensitivity Index data 9 were used to define shoreline habitat types, reclassifying Environmental Sensitivity Index codes to a simpler habitat classification, i.e., rocky, cobble, sand, mud, wetland, and artificial (man-made) shore types. Bathymetric data were obtained from the General Bathymetric Chart of the Oceans Digital Atlas (General Bathymetric Chart of the Oceans, 2009) one arc-minute gridded data set, which is based on qualitycontrolled ship depth soundings interpolated using satellitederived gravity data as a guide. Oil loading onto shorelines was assumed to occur up to a maximum holding capacity that was related to shore type (and so shoreline wave climate, slope, width, and grain size) and the beaching oil's viscosity. Shore holding capacities as a function of oil viscosity were developed for each shore type based on observations from the Amoco Cadiz spill in France and the Exxon Valdez spill in Alaska, as described in French et al. (1996) andFrench McCay et al. (2018c).

Environmental conditions
In order to utilize the same temperature and salinity distributions for all model runs, and given the uncertainties in the subsea distributions and currents predicted by the hydrodynamic models, monthly mean water temperature and salinity data from the World Ocean Atlas (Boyer et al., 2004) were input on a three-dimensional grid. The floating oil model trajectory was not sensitive to the difference in resolution between the atlas data and the hydrodynamic model data, since most of the oil surfaced within hours of release. A synoptic map of SPM concentrations was defined for use in the oil spill modeling by combining results from field and modeling studies with satellite imagery depicting SPM plumes (French McCay et al., 2018a,c).
For the base cases, the horizontal diffusion (randomized turbulent diffusion) coefficient was assumed 100 m 2 /s for floating oil, 2 m 2 /s in surface waters (above 40 m), and 0.1 m 2 /s in waters below 40 m. The vertical diffusion coefficient was assumed 10 cm 2 /s in surface and 1-0.1 cm 2 /s in deep waters. The coefficients were also varied in sensitivity analyses up to one order of magnitude larger and smaller. These values are based on empirical data reviewed for the deep-water GOM (French McCay et al., 2015a;based on Okubo and Ozmidov, 1970;Okubo, 1971, Csanady, 1973Jirka, 2005, Ledwell et al., 2016).

Oil properties, composition, and degradation
The spilled oil was a light GOM crude oil. The liquid "dead" oil (i.e., oil without the C1 to C4 natural gas hydrocarbons) densities at 5, 15, and 30 • C were 0.8560, 0.8483, and 0.8372 g/cm 3 , respectively; dynamic viscosity was 10.93 cp at 5 • C, 7.145 cp at 15 • C, and 4.503 cp at 30 • C; and IFT was 19.63 mN/m (Stout, 2015b). Based on its asphaltene (0.27%) and resin (10.1%) content (in un-weathered dead oil, Stout, 2015b) and behavior of similar light crude oils (Fingas and Fieldhouse, 2012), the fresh oil would form an unstable emulsion. After weathering concentrated asphaltenes and resins sufficiently, the oil was observed and assumed to form a mesostable water-in-oil emulsion (mousse) up to a maximum water content of 64% water (Belore et al., 2011).
Concentrations of volatile-insoluble and soluble to semisoluble hydrocarbons and related compounds were calculated for 18 pseudo-components used to characterize the dead oil [including nine soluble/semi-soluble and volatile/semi-volatile components defined by octanol-water partition coefficient (K ow ) range, eight insoluble and volatile/semi-volatile aliphatic components defined by boiling ranges, and one residual oil component ;French McCay et al., 2015a, 2018b and input to the SIMAP oil fates model, along with fractions of the oil volatilized in boiling cut temperature ranges (Stout, 2015a;. Physical-chemical properties of each pseudo-component were developed by French McCay et al. (2015a).
First-order biodegradation rates in the water column used as model input were based on reviews by French McCay et al.

Amounts and droplet sizes of released oil
The DWH spill location (88.367 • W, 28.740 • N in Mississippi Canyon Block 252, MC252; Figure 1) was ∼80 km southeast of the mouth of the Mississippi River in ∼1,500 m of water. The amount of oil (C5+) released to the environment (totaling 4.1 million bbl, ∼554 thousand metric tons, MT; i.e., not including the amount recovered at the release site) was specified from April 22, 2010 at 10:30AM CDT (local time) for 2015 hours (i.e., 84 days until July 15, 2010 at 2:30PM CDT) in 15-min time-step increments using daily estimates based on information provided by the Flow Rate Technical Group (McNutt et al., 2011, as summarized by Lehr et al. (2010). The oil mass was initialized in the SIMAP model at depths and in DSDs calculated via the nearfield and droplet size models, reflecting subsea dispersant application activities (Li et al., 2017a;Spaulding et al., 2017).
From April 28 to June 3, oil and gas flowed from both the broken end of the fallen riser and holes in a kink in the riser pipe just above the blowout preventor. After June 3 (when the fallen riser pipe section was sawed off and Top Hat #4 with a recovery pipe was placed over the riser), oil was released only from the riser pipe, around the top hat immediately above the blowout preventor. The median droplet diameters of the riser flows (∼2-3 mm) were significantly larger than those from the kink release (∼300-500 µm), due to the much higher exit velocity from the kink relative to the larger diameter riser release. For most model simulations, the best estimate of the DSD was used, which was a bimodal distribution based on partial dispersant treatment of the oil in the blowout plume (details in Spaulding et al., 2015Spaulding et al., , 2017. Sensitivity to the bounding range DSDs resulting from assumed high (100%) and low (50%) effectiveness of the subsea dispersant injections (Spaulding et al., 2015(Spaulding et al., , 2017 was examined for simulations using HYCOM-FSU.

Response activities
Modeled response activities at the water surface included removal by in situ burning and dispersant application to floating oil from the air and vessels. Spatially explicit quantitative measurements of oil volume mechanically removed were not available; therefore, mechanical cleanup was not included in the model simulations.
Polygons were input to the model specifying amounts of oil burned or treated by dispersant, and where and when the response activities occurred. Details are provided in French McCay et al. (2018c).
Quantitative estimates of oil volume removed by in situ burning were obtained from the Response After-Action Report by Mabile and Allen (2010; summarized by Lehr et al., 2010;Allen et al., 2011). Time ranges for burns each day were composited into a daily burn time window. The mass rate of removal on a given date was calculated as the total burn volume times typical floating oil density in the areas of the burns, 0.97 g/cm 3 , divided by the time range of burning that day. The model removed oil mass at this daily rate within each time and spatial window up to the maximum daily mass prescribed by the model input.
Surface dispersant application data were obtained from response records (NOAA, 2013). The volume of oil treated per dispersant volume applied (i.e., the dispersant-to-oil ratio) was based on assumptions in Lehr et al. (2010), who assumed that the minimum, median, and maximum ratios were 5, 10, and 20 by weight of oil, not including the water in mousse. The median value (1:10) was used as the base case. The fraction of oil dispersed into the water column (i.e., efficiency) was calculated using the entrainment algorithm (Li et al., 2017b) based on the assumed dispersant-to-oil ratio and resulting interfacial tension (based on data from Venkataraman et al., 2013).

Analysis of Results
To evaluate model agreement with observations, modeled floating oil distributions were compared to interpretations of remote sensing imagery, modeled shoreline oiling was compared to field survey data, and modeled subsea concentrations were compared to chemistry sample data and sensor-based indicators (e.g., fluorescence peaks).

Floating Oil
Remote sensing (satellite) imagery data were used to evaluate the distributions of surface oil over time. Synthetic Aperture Radar images were analyzed for presence and thickness category (thick or thin) of floating oil by Deepwater Horizon Natural Resource Damage Assessment Trustee Council (2016; based on Garcia-Pineda et al., 2009, 2010, 2013aGraettinger et al., 2015). MacDonald et al. (2015) used a neural network analysis of SAR images to quantify the magnitude (averaging 70 µm for thick and 1 µm for thin) and distribution of surface oil in the GOM from persistent, natural seeps and from the DWH discharge. SAR resolutions ranged from 25 to 100 m per pixel, with a few images at 6 m per pixel ERMA, 2016).
MODIS Visible data were available for 18 days depicting surface oil, at a pixel resolution of 250 or 500 m, depending on the spectral band. The images were classified into three oil thickness classes: thin oil class (primarily silver sheen and rainbow, using NOAA, 2016), a thick oil class (transitional dark to dark color), and a moderately thick oil class (metallic sheen) that falls between the other two classes ERMA, 2016).
MODIS Thermal IR data at a pixel resolution of 1,000 m were available for 25 days during the spill period. The images were classified into a thin oil class [primarily silver sheen and rainbow, using NOAA (2016) nomenclature], a thick oil class (transitional dark to dark color), and a moderately thick oil class that falls between the other two classes ERMA, 2016).
Useful Landsat TM data were available over a spatially limited area on 8 days when DWH oil was on the surface of the northern GOM. Landsat TM satellite data have a pixel resolution of about 30 m. Ocean Imaging estimated the areal coverage per pixel of three oil thickness classes: a very thick class comprising heavy emulsions, a moderately thick class of dark/opaque oil, and a thin oil category that is thicker than sheen but thinner than dark/opaque oil. The Landsat TM oil thickness analyses did not classify oil sheens ERMA, 2016).
For comparisons with model results, floating oil distributions from 84 dates and times were used, these being times where the image was judged sufficiently synoptic of the area of the floating oil. These included 34 SAR, 18 MODIS Visible, 25 MODIS Thermal IR, and 7 Landsat TM images (Supplementary Table 1). The three thickness classes of the MODIS Visible, MODIS Thermal IR, and Landsat TM images were assumed 1, 10, and 50 µm thick on average, based on ranges estimated by Graettinger et al. (2015). Graettinger et al. (2015) and MacDonald et al. (2015) aggregated the pixelated data as gridded data in a 5 × 5 km 2 geographic grid and developed statistical models to interpolate between observations in space and time. However, since the interpolations may have missed weather-related and other events between actual observations, comparisons of the SIMAP model results to imagery results were made using the pixelated data based on the observational data, without use of the interpolations. These non-interpolated data were gridded in the same 5 km by 5 km grid used by the Deepwater Horizon Natural Resource Damage Assessment Trustee Council (2016; Graettinger et al., 2015;ERMA, 2016).
For both the remote sensing data and the model, at each observation time for the remote sensing, the percentage of floating oil present in each grid cell was calculated in three ways.
• The relative area of oil in each grid cell indicated the distribution of oil cover. For remote sensing data, the area within the cell covered by oil was divided by the total area of oil in all cells estimated from the imagery. For the model, the area covered by spillets falling in the cell was divided by the total area covered by all floating spillets at that time step (accounting for overlapping spillet areas, avoiding double-counting). • The estimated volume of oil in each grid cell was calculated for remote sensing data from the area covered by each thickness category. For the model, the total mass in spillets falling in the cell was used as an index of volume. • The relative volume of oil in each grid cell accounts for the relative distribution of oil, irrespective of the actual amounts, which are subject to the assigned oil thickness. For remote sensing data, the volume of oil in each cell was divided by the total volume of oil in all cells estimated from the imagery. For the model, the total mass in spillets falling in the cell was divided by the total mass of all floating spillets in all cells at that time step. The mass was not corrected for oil density to convert to volume. Thus, all spillets were assumed to be of equal density for this index.
The Root Mean Square Error (RMSE) was calculated for each date/time there was an observation available. The RMSE is a frequently used measure of the differences between values predicted by a model and the values observed, and thus is a measure of accuracy. The individual differences (residuals) are aggregated in the RMSE as a single measure of predictive power (Fitzpatrick, 2009). For each pair of grids, the RMSE was calculated by summing the squares of the differences between modeled and the observed over all cells of the grid, where n = total number of cells in each grid: n RMSE values were averaged over all dates, in order to judge relative fit comparing dates and/or among simulations, the minimum RMSE indicating the best fit.

Shoreline Oil Distributions
Available data for shore oiling consisted of maps of where oil was first observed on various shoreline segments and assessments by the Shoreline Cleanup and Assessment (SCAT) program during Response. A binary discriminator test (Fitzpatrick, 2009) was used to evaluate the timing of oil coming ashore in the model, as compared with observations made by the SCAT program. The presence or absence of oil according to SCAT observations and the model predictions were gridded using the 5 km by 5 km Albers grid (ERMA, 2016) employed by the Deepwater Horizon Natural Resource Damage Assessment Trustee Council (2016) in their evaluations of oil exposure after the DWH spill. A cell was considered to have oil presence if any shore segment within the cell was observed oiled by the SCAT teams. SCAT data were downloaded from ERMA in July of 2014 as shape files 10 . The observational data were binned into 10-day intervals, from April 22 to September 30, 2010. In the analysis, only those SCAT segments where oil was observed to arrive before September 30 were considered as oiled. Segments checked during the 10-day interval, but where no oil was observed to arrive, were considered as "no oil." Segments where oil was observed to arrive after September 30, but earlier observations showed it did not arrive there before September 30, were coded as "no oil." Note that the shorelines were not searched synoptically, and areas were not visited for days or weeks; thus, the time oil was first observed could have been a considerable time after the actual initial oiling. Also, the Deepwater Horizon Natural Resource Damage Assessment Trustee Council (2016) used additional observations and SAR to identify where oil came ashore, to develop more comprehensive maps of the locations where oil came ashore during and after (including after September 30) the spill. However, we relied solely on the SCAT survey data, as the SAR-based data were localized analyses at specific instances.

Subsea Oil
Modeled subsea concentrations were compared to chemistry sample data and sensor-based indicators (e.g., fluorescence peaks) both qualitatively, to evaluate oil locations, and quantitatively, to evaluate concentrations. We used chemical measurements of samples taken May 11 (the first date available) -July 15, 2010 and processed by the trustees' quality-controlled NRDA program (Deepwater Horizon Natural Resource Damage Assessment Trustee Council, 2016; described in Horn et al., 2015;Payne and Driskell, 2015bDriskell and Payne, 2018). The NRDA sample data set is available in ERMA (2016). Details of the model comparisons to subsea chemistry data are in French McCay et al. (2015a, 2018a. We summarize those findings below as part of the present analysis of the influence of physical forcing on the model trajectories and oil distributions.

RESULTS AND DISCUSSION
The model trajectory selected as the base case for waters above 200 m was that forced with the hydrodynamic and wind input combination (i.e., HYCOM-FSU currents and NARR winds) yielding the best agreement with floating oil distributions and shoreline oiling over time. In deep water, ADCP data yielded the best agreement with oil contaminant distributions >200 m and defining the deep plume. Videos of the floating oil movements color-coded by "age" (time since release) and of the oil droplet trajectory below 900 m are in Supplementary Material.

Floating Oil
In prior modeling examining mass balance and fate of oil released from hypothetical oil and gas blowouts in nearby locations of the GOM (French McCay et al., 2018d, 2019, the initial DSD was the most influential input controlling the amount of oil surfacing. For a spill at 1,400 m, with a trap height at 1,100 m (i.e., similar to the DWH spill), a DSD with a median droplet size <700 µm reduced the amount of surfacing oil dramatically, whereas the amount of oil surfacing was similar for all DSDs with larger median droplet sizes (French McCay et al., 2019). As the breadth of (range of droplet sizes in) the DSD has been found to be narrow (Li et al., 2017a), most of the mass in the DSD is in droplet sizes close to the median droplet diameter.
In all model simulations, oil droplets >0.7 mm diameter, weathered by dissolution such that their density when they reached the surface approached 940 kg/m 3 (in agreement with measurements of "fresh" floating oil by , rose to the surface in <17 h. Droplets 1 and 3 mm in diameter surfaced in 11 and 3 h, respectively. ADCP-measured currents at the wellhead averaged <5 cm/s between 40 and 1,400 m. Assuming a mean current of 5 cm/s during their rise, 0. 7-, 1-, and 3-mm droplets would travel ∼2, 3 and 0.5 km horizontally, respectively. Svejkovsky and Hess (2012); Svejkovsky et al. (2016) and Driskell (2015d, 2018) observed fresh oil surfacing <4 km from the wellhead on various dates in May and June 2010. Ryerson et al. (2012) observed fresh (as evidenced by measured volatiles in the air above) oil surfacing at 1.0 ± 0.5 km from the wellhead June 8-10, 2010, which based on ADCP measurements at that time implied a 10-h surfacing time. Ryerson et al. (2012) noted that visual observations from response vessels suggested a ∼3-h lag time between deliberate intervention at the well and the onset of changes in the freshly surfaced oil. These observations imply that much of the surfaced oil was comprised of droplets ∼1-3 mm in diameter. Droplets smaller than 0.7 mm rose over a longer period and so were carried progressively farther from the release point. Droplets less than ∼100 µm did not rise appreciably and formed the deep-water plume along with dissolved hydrocarbons. Thus, the DSD could affect the surfacing locations of the oil, and the floating oil distribution, depending on the relative amounts of mass in these size ranges. Floating oil would be more spread out from the well if most of the mass was in droplet sizes <700 µm than it would be if most of the mass were in droplets >700 µm. Figure 2 shows the cumulative days of oil presence on the water surface based on SAR analysis (data from ERMA, 2016, produced by Graettinger et al., 2015). Figure 3 and Supplementary Figures 1-10 summarize the oiled footprints (north of 27 • N and east of 92 • W, the domain used for comparisons to focus on most of the oil and limit the number of null cells) for the model trajectories forced by various winds and currents as cumulative days of oil presence on the water surface in each grid cell over the simulation. The average RSME values over all dates of comparison for these runs, assuming the floating oil horizontal dispersion coefficient was 100 m 2 /s and using the modeled wind drift, are in Table 1. For the base case using HYCOM-FSU and NARR winds, and all simulations using other hydrodynamics, the DSD was the best estimate, as described in Spaulding et al. (2015Spaulding et al. ( , 2017. High and low cases in Table 1 represent the potential range of DSDs (Spaulding et al., 2015(Spaulding et al., , 2017. Considering the RSME means for all three metrics, the base case using HYCOM-FSU with NARR winds produced a trajectory that best fit the remote sensing data. IAS ROMS showed second best agreement to the remote sensing in terms of the relative spatial distribution of floating oil (relative area and relative volume). However, the HYCOM-NRL Reanalysis with CFSR winds showed the same degree of agreement with the remote-sensing data as the HYCOM_FSU/NARR simulation in terms of oil volume distribution ( Table 1). SABGOM moved more oil northeast toward Florida in June than did the other hydrodynamics (Supplementary Figure 3). The HYCOM-NRL Real-time hydrodynamics transported more oil southeast toward southern Florida (Supplementary Figure 2) than was observed in the remote sensing (Figure 2). The RMSE values show poorer agreement of the model with the remote sensing products when no currents and only winds are used, than for when any of the hydrodynamic models are used ( Table 1), indicating that the hydrodynamic model currents improved the trajectories over wind drift alone. Use of ADCP data for surface transport (extrapolated from below 40 m) resulted in better agreement to the observational data than without currents, suggesting that transport of droplets rising from depth, and thus the DSD, influenced the floating oil distribution. While the mapped comparison (Supplementary Figure 7) showed the floating oil distribution reasonably agreed with the observations, there was no ADCP data on the shelf or nearshore, such that along-shore transport was not captured and there was not enough eastward or westward spread of the oil. Thus, the simulations using hydrodynamic modeled currents produced better agreement with the observed floating oil distributions than simulations using ADCP data for currents. However, as will be discussed in section "Subsea Oil, " modeled oil distributions below 40 m were in better agreement with ADCP-based observational data than simulations using any of the hydrodynamic models.
Across the range of potential subsea dispersant effectiveness assumptions (high and low cases in Table 1), the DSDs influenced the volume distribution of the floating oil to a similar degree as varying the hydrodynamic model input. However, the relative area and relative volume RMSE values did not change much with change in DSD within the potential range (Spaulding et al., 2015(Spaulding et al., , 2017, whereas those metrics were sensitive to the hydrodynamic input. The best-estimate DSD used for the base case with HYCOM-FSU yielded the lowest RMSEs overall (Table 1). Thus, the floating oil spatial and volume distributions were sensitive to the hydrodynamics used. The relative spatial distribution of the floating oil was not sensitive to the plausible DSD inputs because the substantial mass in large droplets (>0.7 mm) surfaced close to the well and resulted in similar trajectories, regardless of the amount in those droplet sizes. If the DSD were skewed to much smaller sizes than those examined, the relative spatial distribution would be affected, and floating oil would be more broadly distributed since smaller droplets would be carried farther from the well before surfacing. However, such DSDs are not realistic for the DWH spill where the conditions and dispersant volumes applied at depth were not conducive to creating smaller droplet sizes (Adams et al., 2013;Zhao et al., 2014Zhao et al., , 2015Socolofsky et al., 2015b;Spaulding et al., 2015Spaulding et al., , 2017Nissanka and Yapa, 2016;Testa et al., 2016;Gros et al., 2017;Li et al., 2017a;Daae et al., 2018; National Academies of Sciences Engineering and Medicine, 2020). The extremely low estimates of the median (<100 µm) and maximum (<300 µm) droplet diameters (for both untreated and treated oil) of Paris et al. (2012) and Aman et al. (2015) were not representative of the DWH conditions (see Adams et al., 2013;Testa et al., 2016), nor are they consistent with the field evidence.
As SIMAP is a Lagrangian model, the diffusion coefficients moved the spillet centers randomly each time step at a scale determined by the coefficient. Those displacement distances were small relative to the ∼5-km concentration grid used to compare the model to the remote sensing data. Thus, the RMSE values changed slightly (<10%) with differing floating oil horizontal dispersion coefficients from 5 to 200 m 2 /s, indicating this assumption had little influence on the results at the scale of a 5 km grid. The hydrodynamic models did not resolve the surface oil drift in the upper wave-mixed layer resulting from wave motions and Ekman flow. Simulations using the Youssef and Spaulding (1993) model of these processes, as well as varying percentages-of-wind-speed drift rates (2-4%) and angles (0-20 • ) to the right of downwind, showed that the best fit was consistently that using the Youssef andSpaulding (1993, 1994) model, as opposed to using a constant wind drift percentage and angle for all dates, although the differences between the results were small on most days (results not shown). Moreover, there was much more variation between runs with different currents and winds used for forcing than the differences due to variation in wind drift model or the horizontal dispersion coefficient. Thus, variations of wind drift and horizontal dispersion coefficient assumptions were not examined further. Use of 800,000 spillets instead of 100,000 spillets (as used for the simulations presented) to represent the floating oil slightly improved the model agreement with the observed for the HYCOM-FSU base case, as the additional spillets filled in some of the areas where remote sensing data indicated oil was present. However, model run time was increased considerably in the tests using more spillets.
Snapshots of surface oil distributions over time, predicted by the model using HYCOM-FSU currents and based on remote sensing data, are compared in the Supplementary Material.   Figures 8-10) are similar, and do not transport of the floating oil enough to the east or west. Otherwise, the no-current simulations result in realistic floating oil patterns centered just north of the wellhead. The simulation using ADCP currents and NARR winds generates results similar to no currents and NARR winds (Supplementary Figure 7), because the ADCP currents are relatively weak in the offshore area and do not cover the shelf or nearshore. These results indicate the importance of the wind drift in transporting the floating oil, but that the currents used can change the patterns dramatically.
The modeled number of days of oil cover for the base case using HYCOM-FSU (Figure 3) is of the same range as the SARbased estimates (Figure 2) in the area of the wellhead and off the Mississippi River Delta. However, the duration of oil cover in the For all model simulations except those noted as high or low, the best estimate of the droplet size distribution was used. High and low represent the range of potential droplet size distributions by varying subsea dispersant effectiveness from high to low (Relative area and relative volume are unitless).
area near the coast of Mississippi and Alabama is higher in the model than the SAR data indicate. Also, floating oil trapped near shore in the model remained longer than observed. Mechanical removal, on water or on shorelines, was not included in the simulations, and this could potentially account for at least some of the observed reduction in the nearshore floating oil.
The RMSE values for the base case using HYCOM-FSU currents (Figure 4) show the best agreement in terms of relative area or relative volume distribution in early June, whereas the model estimated volume of oil was most similar to the remote-sensing based estimates in April-early May and in July. The results for other current-wind combinations showed similar temporal patterns. The relatively high RMSE values for spatial coverage in April were because the modeled distribution remained more localized around the wellhead than the remote sensing indicated. The relatively high RMSE value for relative volume on May 17 was for the MODIS visual image of that date, when the "Tiger Tail" feature (i.e., the extension to the southeast as oil sheen was drawn into a cyclonic eddy) was seen in the imagery (Walker et al., 2011;Olascoaga and Haller, 2012) but not indicated by the model (i.e., the hydrodynamic model did not locate the eddy in the same location at that time).
The modeled maximum amount of oil in each grid cell at any time in the simulation is shown in Figure 5. Using gridding with a cell size of 25 km 2 , MacDonald et al. (2015) estimated the footprint of aggregated floating oil and oil emulsions, where oil coverage exceeded ∼1 g/m 2 at some time during the spill, which extended over 149,000 km 2 . By comparison, the base case model prediction, using the same resolution and threshold, was 194,000 km 2 . The model predictions included some low concentrations of floating oil in areas far from the well where oil was not detected by the remote sensing. These small patches of highly weathered oil residuals (>20 days old) in the outskirts (see the video of the floating oil movements color-coded by age in Supplementary Material) were likely undetected in the remote sensing analyses because of the resolution of those sensors. The modeled time-averaged mean oil cover (Figure 6) is similar in pattern and of the same magnitudes as the estimates made by MacDonald et al. (2015) based on SAR analysis. Their estimated daily average footprint was 11,200 km 2 (SD = 8,430 km 2 ). The model estimate (base case) of the daily average surface area affected by floating oil >1.0 g/m 2 was 6,720 km 2 (SD = 4,960 km 2 ), not significantly different from the remote sensing daily estimate.
Note that the gridded summaries of floating oil distributions, both for the model and for the remote sensing data, provided average amounts of oil mass over the cell area. They should not be interpreted as an actual oil thickness, as the oil is patchy and of varying thicknesses within the cells. The remote sensing data are typically expressed as volumes per cell for this reason. Furthermore, the total area of the cells where oil is present is larger than the actual oil coverage at any given time. The modeled areas covered by oil, the sum of the areas covered by spillets (representing patches) at a single time step, were an order of magnitude lower than those estimated from the gridding. The mean swept area from April 24 to August 3, 2010 (101 days) was 1,960 km 2 /day for the model base case.
The modeled amount of floating oil for the base case simulation is compared to remote-sensing based estimates in Figure 7. The modeled floating oil volumes, and to some degree the remote-sensing-based estimates, are inversely related to wind speed, as wind events entrain oil into the water and oil accumulates on the water surface during calm periods. While there is considerable variability in the remote-sensing based estimates, in large part due to uncertainty in the oil thickness estimates

Shoreline Oil Distributions
Approximately 2,100 km of beaches and coastal wetlands were exposed to MC252 oil in 2010, according to the Deepwater Horizon Natural Resource Damage Assessment Trustee Council (2016; Nixon et al., 2016). The oil was documented by shoreline assessment teams as stranding on 1,773 km of shoreline. Deepwater Horizon Natural Resource Damage Assessment Trustee Council (2016) mapped maximum observed oiling, categorized as not surveyed, no oil seen, or various degrees of oiling. Many areas were not surveyed, including much of Mobile Bay and considerable areas of wetlands in Louisiana. Thus, the 2,100 km estimate likely underestimates the actual length of shoreline affected by oil.
Modeled shoreline oiling results are summarized in Table 2. For simulations run without currents, the length of shoreline oiled is smaller and focused on the area between the Mississippi River Delta and Alabama. The total length of shore oiled estimated by the Deepwater Horizon Natural Resource Damage Assessment Trustee Council (2016; Nixon et al., 2016) was 2,113 km. The categories of degree of oiling used by the Trustees cannot be translated to oil loading amounts (g/m 2 ). The total lengths of shoreline oiling predicted by the model using most of the hydrodynamic model currents (except NGOM) are 2,000-2,700 km oiled, with the base case predicting 2,568 km oiled. These results are in good agreement with the observations, considering some areas were not surveyed. Figure 8 and Supplementary Figures 11-16 summarize the comparisons of the modeled shoreline oiling with SCAT-based observations, showing the variability resulting from different current and wind inputs. These maps color code where oil came ashore in the model but where the shoreline had not been surveyed ("no observed coverage, modeled oil"), as well as where both modeled and observed indicate oil ("match"), where both observed and the model indicate no oil ("no observed oil"), where there are false negatives (observed only), and where there are false positives (modeled only).
The modeled shoreline oiling for the base case compares well with the observations (Figure 8), the model showing oiling from the Apalachicola Bay area of Florida to Terrebonne Bay area of Louisiana. Note that the model predicted shore oiling inside Mobile Bay in areas where it was not observed. However, as most of Mobile Bay's shoreline areas were not surveyed, oiling of those areas is unknown. No oil was reported in some of the small bays along the Florida panhandle. However, booming may have prevented oil from entering those inlets, whereas booming was not included in these model simulations.
Simulations using HYCOM-NRL Reanalysis currents with CFSR winds (Supplementary Figure 11), HYCOM-NRL Realtime currents with NARR winds (Supplementary Figure 12),  Figure 14), predict similar oiling patterns to the base case using HYCOM-FSU currents and NARR winds (Figure 8). SABGOM spreads oil to shorelines too far to the east and into western Louisiana where no oiling was observed (Supplementary  Figure 13). NGOM currents with NARR winds and the IAS ROMS simulations carry too much oil to western Louisiana and Texas but otherwise show good agreement with the SCAT observations (Supplementary Figures 15, 16). Simulations made with no currents, forced with winds only, and those forced with ADCP currents and NARR winds, do not bring as much oil ashore west of the Mississippi River Delta as was observed. Thus, coastal currents prevailing toward the west apparently transported the oil to those areas. Also, currents brought the oil east to Florida, as winds alone do not account for that shoreline oiling.

and NCOM Real-time with NARR winds (Supplementary
The shoreline oil distributions were more accurate when westward flows near the Mississippi Delta were simulated in the forcing hydrodynamic model. Novelli et al. (2020) identified westward flows as due to easterly winds. Kourafalou and Androulidakis (2013); Androulidakis and Kourafalou (2013), and Androulidakis et al. (2015Androulidakis et al. ( , 2018 examined and stressed the importance of the Mississippi outflow in controlling nearshore oil movements and shoreline oiling distributions. The HYCOM models seemed to have captured these dynamics reasonably well in simulating conditions in the summer of 2010, whereas the ROMS implementations examined did not reflect those dynamics. It is possible that Mobile Bay was protected by freshwater outflows not captured in any of the hydrodynamic models. However, oiling data in Mobile Bay were insufficient to determine how much entered the bay. Weisberg et al. (2017) concluded that the general circulation from the hydrodynamics modeling could account for transporting the Deepwater Horizon oil to near shore, but that the waves, via Stokes drift, were responsible for the actual beaching of the oil. In our analysis, we found the wind drift (i.e., including Ekman flow and Stokes drift) was the primary driver for bringing oil to shore. Le Heìnaff et al. (2012) and Boufadel et al. (2014) also concluded that wind drift was the most important factor bringing oil ashore. Using several hydrodynamic models as input, Boufadel et al. (2014) FIGURE 6 | Time-averaged mean floating oil concentration (g/m 2 ) in each cell for the period when oil was observed by SAR, 24 April to 3 August 2010, based on the base case model simulation.
concluded that the wind drift rate was 1-4% of wind speed. Studies by Novelli et al. (2020) found that along-shore easterly winds followed by southerly wave-induced surface drift brought drifters, and so the DWH oil, ashore. Conversely, westerly winds moved the surface water and drifters away from shore. However, to the degree that the circulation modeling includes the wind-forced circulation, the relative importance of what is considered due to hydrodynamic inputs versus "wind drift" would vary. Dietrich et al. (2012) performed a detailed analysis comparing modeled oil drift using hydrodynamic and wave models as compared to satellite imagery, finding the best agreement using their implementations of Simulating WAves Nearshore (SWAN) and ADvanced CIRCulation (ADCIRC) models as the sole forcing. The inclusion of wind drift on top of the hydrodynamic and wave models transported the oil too fast. Thus, careful consideration is needed to determine the degree to which wind drift should be applied in the oil spill model, depending on the forcing and resolution included in the hydrodynamics and wave models. Figure 9 maps the modeled shoreline oiling for the base case using HYCOM-FSU, along with modeled concentrations of sedimented oil by August 31, 2010 (Sedimentation is discussed below). Maps of the results for the base case simulation for each of the 16 10-day intervals from April 22 to September 30 are available in French McCay et al. (2018c). No oil came ashore in the first 10-day period, April 22-May 1, in either the model or SCAT observations. The oil's arrival to shore was in good agreement with the SCAT observations for most areas and observation periods. The exception was that the (base case) model did not bring as much oil ashore west of the Mississippi River Delta during early to mid-June as was observed. Oil did come ashore in that simulation later in June and in July.

Subsea Oil
Comparisons of modeled subsea oil component concentrations to field chemistry samples are described in French McCay et al. (2018a). Here we summarize those findings and evaluate the modeled movements of the oil below 200 m using various current data as input.  Camilli et al. (2010).
While the oil-affected volume has been described as an inverted cone over the well with large droplets rising in the center and progressively smaller ones further from the well (e.g., Ryerson et al., 2012;Spier et al., 2013), due to ADCPdocumented current shear and varying rise rates for different diameter droplets, "plumes" of rising oil droplets followed different trajectories during their ascent toward the surface, a behavior captured in our model simulations (French McCay et al., 2018a,c). In addition, while rising, the intermediatesized droplets lost some of their relative buoyancy due to weathering (dissolution and biodegradation), as well as potentially combining with SPM in the water column. The ambient current higher in the water column is increasingly stronger than in deep water (Hyun and He, 2010), such that the intermediate sized droplets would have separated to form "multiple plumes" of slowly rising droplets in the upper layers mimicking the deep water plume. Fluorescence anomalies (peaks) and water column hydrocarbon chemistry data (Camilli et al., 2010;Valentine et al., 2010;Spier et al., 2013;as well as NRDA data, Horn et al., 2015and Payne and Driskell, 2015a,c, 2018 showed relatively high concentrations in finite "clouds" of particulate-and dissolved-phase oil at various depths above the intrusion at ∼1,100-1,200 m. The SIMAP model simulated these behaviors and distribution. Sensitivity analyses by North et al. (2011North et al. ( , 2015 and Paris et al. (2012) showed similar behavior: droplets with diameters of <50 µm formed distinct subsurface plumes that were transported horizontally and remained in the subsurface for >1 month; while droplets with diameters ≥90 µm rose to the surface, more rapidly at larger diameters. Fluorescence peaks (relative high values) and dissolved oxygen "sags" (relatively low values) in vertical profiles associated with elevated hydrocarbon concentrations in water samples were consistently observed at depths ∼1,000-1,300 m mainly southwest of the wellhead (Joint Analysis Group, 2010;French McCay et al., 2015a, 2018aHorn et al., 2015). In the deep plume, the ADCP-measured speeds averaged 3.9 cm/s to the southwest. Measured current speeds were consistently <10 cm/s at all depths and for most of the period of the oil release. ADCP data in ∼30-60 m vertical bins throughout the upper ∼1,000 m of the water column showed currents in adjacent depths differing by as much as 120 • in direction during May-June 2010 (French McCay et al., 2015a, 2018a. In contrast, in the area of the DWH wellhead, all of the hydrodynamic models examined (three HYCOMs: HYCOM_FSU, HYCOM-NRL Reanalysis, HYCOM-NRL Realtime; two ROMS: SABGOM, IAS ROMS; and two POMs: NCOM Real-time, NGOM) calculated much higher speeds for the currents in the deep plume and water column below 200 m than is indicated by the ADCPs. Furthermore, the movements were less consistently toward the southwest than indicated by the ADCP and field observation data. This was evident in trajectories of oil droplets below 200 m. The video of the oil droplet trajectory below 900 m using ADCP data in the Supplementary Material summarizes the movements of oil in the deep plume. Snapshots in the deep plume and at intermediate depths from trajectories using the hydrodynamic models are available in French McCay et al. (2018a,c).
The hydrodynamic models produced current fields below 200 m that at times agreed with the ADCP data, and in other times diverged in direction and speed. The HYCOM-FSU model produced a trajectory of small droplets in deep water (French McCay et al., 2018a,c) that was most similar to that using the ADCP data (video of the oil droplet trajectory below 900 m using ADCP data in Supplementary Material). The ROMs models tended to transport the small droplets along the bathymetry toward the southwest in narrow smooth flows much faster than indicated by the ADCP data. The IAS ROMS simulation (described in French McCay et al., 2015a was very similar to the SABGOM simulation (in French McCay et al., 2018a,c). The HYCOM-NRL Real-time and HYCOM-NRL Reanalysis both predicted the deep plume moved primarily northeastward from April through July of 2010, such that the modeled deep plume did not extend southwestward in July as observed (French McCay et al., 2018c). The POM models (NGOM is shown in French McCay et al., 2018a) predicted movements at times to the northeast and other times to the southwest, but the timing of movements in these directions did not agree with FIGURE 9 | Cumulative amount of oil coming ashore and settling to sediments (in g/m 2 ) by August 31, 2010 for the base case simulation using HYCOM-FSU currents and NARR winds.
the ADCP and observational data. The NRL NCOM model was superseded by NRL's HYCOM, which produced more realistic (slower) currents at depth, and so the NCOM simulations were not considered further.
Given the more narrow and accurate surfacing locations (see section "Floating Oil"), the simulation using ADCPs was the most realistic for depths below 200 m. While the interpolation of the ADCP data was not a hydrodynamic model, which conserved mass and momentum, the ADCP current data does indicate the actual flow field and which of the hydrodynamic models most closely simulated it (i.e., the HYCOM-FSU model).
In the SIMAP model simulations using the ADCP data and using HYCOM-FSU, the concentrations of droplets and dissolved constituents were highest close to the source (i.e., ∼1,200 or ∼1,300 m). Because the smaller oil droplets were spread out by spatially and time-varying currents as they rose through the water column, the modeled concentrations decreased considerably higher in the water column. Oil concentrations in the deep water were low and in a narrow cylinder stretching toward the surface in April, when the release was not treated with dispersants at the release point and the oil was mostly in the form of large droplets >0.7 mm in diameter. During May when the kink holes appeared and subsea dispersant began to be applied, such that small droplets were formed in addition to droplets >0.7 mm in diameter (Spaulding et al., 2015(Spaulding et al., , 2017, the modeled subsurface concentrations were much higher, and the contamination was dispersed over a wider area. As shown by the sample data (Horn et al., 2015;Payne and Driskell, 2018), the deep plume of small droplets and dissolved components persisted from May to July. The model results show more extensive plumes in deep water in June-July when more effective subsea dispersant applications were used than prior to June 3. See French McCay et al. (2015a, 2018a for further detail and maps depicting the concentration distributions in space and time.

Oil Sedimentation
Oil from the DWH spill was identified in the sediments in the offshore area surrounding and down-stream of the well site (Joye et al., 2011;Montagna et al., 2013;Valentine et al., 2014;Romero et al., 2015;. Valentine et al. (2014) noted that the pattern of contamination indicates deep-ocean intrusion layers as the source, consistent with deposition of a "bathtub ring" formed from an oil-rich layer of water impinging laterally upon the continental slope (at a depth of ∼900-1,300 m) and a higher-flux "fallout plume" where oil-SPM aggregates sank to underlying sediment (at a depth of ∼1,300-1,700 m). Figure 9 maps the modeled sedimentation by August 31, 2010 for the base case using HYCOM-FSU for surface waters and ADCP data in deep water. The modeled sedimentation in deep water due to fallout by interactions with SPM and the bathtub ring impingement was in consistent locations to those mapped by Valentine et al. (2014), , and Romero et al. (2015). The model also predicted considerable sedimentation in the nearshore waters of Louisiana.

Summary of Findings
The modeled spatial distribution of surface oil was validated by comparison to remote sensing data. As the model ran continuously from April 22 through September of 2010, disagreement would be expected for older oil. However, freshly surfacing oil reset the modeled origin of the slicks on a continuous basis. The results indicate the importance of the wind drift in transporting the floating oil northward toward shore (consistent with Dietrich et al., 2012;Le Heìnaff et al., 2012;Boufadel et al., 2014;Weisberg et al., 2017), but that the currents used can change the patterns considerably. The model-predicted amount of floating oil agreed with remotesensing based estimates, confirming the modeled DSDs and fate processes were realistic.
The modeled shoreline oiling for the base case (∼2,600 km oiled) compares well with the observations (∼2,100 km oiled; Nixon et al., 2016), with the model showing oiling from the Apalachicola Bay to Terrebonne Bay. The model predicted oiling on shore in areas that were not surveyed, so oiling of those areas based on observations is unknown. Other model simulations demonstrated the variability resulting from different current and wind inputs. Shoreline oiling events were discontinuous and occurred when winds (via wind drift) carried oil onshore.
Simulations of subsurface oil movements using current data from seven hydrodynamic models resulted in very different trajectories, and to varying degrees the hydrodynamic models generally over-estimated the speed of the currents in the area of the wellhead. Published DWH oil trajectory simulations (MacFadyen et al., 2011;Mariano et al., 2011;North et al., 2011North et al., , 2015Dietrich et al., 2012;Le Heìnaff et al., 2012;Paris et al., 2012;Boufadel et al., 2014;Lindo-Atichati et al., 2014;Testa et al., 2016;Weisberg et al., 2017)  compared their predicted surface oil movements to field data, finding wind drift to be important to transport, along with the hydrodynamic model produced currents. However, they did not analyze subsea movements of the oil in comparison to observations. In our modeling studies, the base case simulation using currents generated by the HYCOM-FSU hydrodynamics model (Chassignet and Srinivasan, 2015) consistently resulted in both a subsea trajectory and concentration fields most similar to that predicted by the ADCP measurement-based current field below 200 m. The flow was generally toward the southwest, as the ADCPs indicated occurred (French McCay et al., 2018a). The HYCOM used by Paris et al. (2012), which was apparently the same HYCOM-NRL Real-time product we used, also showed similar movements of small droplets in the deep plume.
Because most of oil forming the surface slicks was from large droplets that surfaced near well, it was not very displaced by deep currents. Thus, use of the various hydrodynamic models for transport below 200 m did not greatly influence the surfacing locations to the point where it would be perceptible in the overall distribution of floating oil (i.e., as compared to synoptic remotesensing based maps of oil distributions). Thus, the modeling problem can be treated in two domains: waters below 200 m in the offshore and surface waters above 200 m both offshore and over the shelf to nearshore.

Implications for Modeling Environmental Exposures
This modeling effort is unique and ground-breaking in several ways. The objective is to quantify by modeling oil transport, fate, exposure concentrations, and (as part of follow-on studies) biological effects (French McCay et al., 2015a,b,c,d,e being the first syntheses of these efforts). Thus, the modeling predicted oil transport, fate and exposures continuously over the entire spill period, including in deep water and surface waters. Transport modeling evaluated nine potential hydrodynamic inputs (seven models, measurement data and an assumption of no currents). The complexities of the oil composition and weathering were tracked using pseudo-components, such that exposure concentrations could be characterized by concentrations of each pseudo-component in the mixtures. Most fate processes were quantified as well, as opposed to evaluating implications of certain assumed inputs, such as droplet size or biodegradation rates. However, some fate processes could not be quantified for lack of sufficient information, such as related to marine oil snow formation and its flux to the sea floor. The exposure modeling (French McCay et al., 2015b,c) involved distribution, behavioral analysis and simulation of movements of many biological groups and life stages, recording their exposure histories to oil components, the results of which are being analyzed. We have also made a concerted effort to validate the model simulations at each stage of the overall effort. Such a comprehensive modeling and validation exercise has not been attempted previously.
The importance of data assimilation is well-established for calibrating physical models with observational information, and the tested and literature-reported hydrodynamic models do include assimilation of such information as sea surface height and temperature. However, data assimilation techniques have not been applied in oil spill models. Rather, many of the literature-reported oil trajectory model exercises (such as some of those noted above) ran short-term trajectories from known oil positions, as opposed to long simulations where displacement errors would compound. MacFadyen et al. (2011) described their approach for modeling during the response period where trajectories were reinitialized from identified remote sensingbased observations, and different hydrodynamic models were used over time, in order to obtain the more accurate trajectories.
While such a nudging-based initialization approach was considered and tested, whereby oil positions were updated with remote sensing observations, as was done for the response modeling (MacFadyen et al., 2011), this led to discontinuities in the concentrations of oil on the surface and in the water column beneath it. Thus, the exposures to biota were not reflective of reality and under-estimated when oil was repositioned. Further, weathering history of each oil spillet was needed to evaluate the composition of the exposure concentrations. Remote sensing analyses did identify thick and thin oil but did not provide enough information to track which parcel of oil moved where, such that weathered versus fresh oil could be tracked and repositioned appropriately. Thus, a more sophisticated "nudging" approach would be needed to improve the oil trajectory over what would be predicted from the hydrodynamic modeled currents and meteorological models wind products.
However, the predicted surface oil distributions were reasonable as compared to the remote sensing and shoreline oiling data. This suggests that the data assimilation techniques employed by the hydrodynamic and meteorological models were sufficient to produce reliable oil trajectories. On-going improvements in hydrodynamic modeling and data assimilation techniques should continue to improve oil spill model predictions of oil movements in the surface layer.
On the whole, the predictions of subsurface oil movement below 200 m were not nearly as reliable as for the floating oil. A major limitation is the availability of measurement data to assimilate into the hydrodynamic models. ADCP, temperature and salinity measurements throughout the water column in deep water over the full domain of interest could allow data assimilation to tune the hydrodynamics to better reflect the field conditions. Data assimilative techniques have been included in all seven of the hydrodynamic models evaluated, but likely these will need more development to handle the deep-water three-dimensional flow fields. In addition, analysis techniques are needed for evaluating model performance in simulating deep-water flows, analogous to those used for surface water dynamics (e.g., Halliwell et al., 2014).
With respect to quantifying exposures and biological effects, the results will depend on the areas and volumes affected at exposure concentrations that potentially could cause adverse effects French McCay, 2009). If biota have similar densities in areas affected by the model, as in areas oiled in the field, then the modeling results will be reliable, even if the specific path of the floating oil or subsurface plume is not precisely located. For wildlife, the modeled swept area by surface oil determines numbers of animals affected. For water column biota, the effects are related to the volume exceeding concentrations of concern. Biological densities are more uniform (on average) in offshore pelagic waters than they would be near the coast. Thus, accuracy of the trajectory and exposure concentrations are more important nearshore than in open waters offshore. The modeled movements and amounts of oil floating over time were found to be in good agreement with estimates from interpretation of remote sensing data, indicating initial oil droplet distributions and oil transport and fate processes produced oil distribution results reliable for evaluating environmental exposures in the water column and from floating oil at water surface.
Both organisms and oil components are highly patchy and variable over time. Present models cannot deterministically predict these fine scale dynamics. Further, biological distribution data are not available to precisely quantify fine scale distributions of all species and life stages. However, statistically describing the oil exposures, toxicological effects, and biological distributions should lead to reliable results. This approach also allows quantification of the uncertainties related to potential adverse effects. Model analyses have demonstrated the importance of the currents, winds and initial DSD to exposure concentrations, and have provided best estimates and uncertainty bounds for assessments. Analyses of potential exposures and biological effects based on these validated oil fate simulations using HYCOM-FSU and ADCP data are being developed, using methods and data similar to those described in French McCay et al. (2015b,c,d), such that the impacts of the DWH oil spill can be better understood.

DATA AVAILABILITY STATEMENT
Publicly available datasets were analyzed in this study. This data can be found here: The remote sensing, Shoreline Cleanup and Assessment, sensor and chemistry datasets analyzed for this study can be found in the ERMA Repository:

AUTHOR CONTRIBUTIONS
DF, MS, and DM designed the study. DF, MS, DC, DM, MH, and JF analyzed the data. DF wrote the manuscript. All authors listed have made a substantial, direct and intellectual contribution to the work, and approved it for publication.