Comparing the Performance of a Remotely Operated Vehicle, a Drop Camera, and a Trawl in Capturing Deep-Sea Epifaunal Abundance and Diversity

Deep-sea ecosystems provide services such as food, minerals, and nutrient recycling, yet baseline data on their structure is often lacking. Our limited knowledge of vulnerable deep-sea ecosystems presents a challenge for effective monitoring and mitigation of increasing anthropogenic threats, including destructive fishing and climate change. Using data from two stations differing in total epifaunal abundance and taxonomic composition, we compared the use of imagery collected by two non-invasive tools [remotely operated vehicle (ROV) and drop camera] and data collected with a trawl system, commonly used to quantify epibenthic megafauna in the deep sea. Imagery and trawl data captured different epifaunal patterns, the former being more efficient for capturing epifauna, particularly Pennatulacean recruits. The image-based methods also caused less disturbance, had higher position accuracy, and allow for analyses of spatial structure and species associations; fine-scale distributions could not be elucidated with a trawl. Abundance was greater for some taxa, and diversity accumulated faster with increasing sample size for the drop camera than the ROV at one station. However, there are trade-offs between these tools, including continuous and discrete sampling for the ROV and drop camera, respectively, which can affect follow-up analyses. Our results can be used to inform monitoring frameworks on the use of appropriate sampling tools. We recommend further research into tool sampling biases and biometric relationships to help integrate datasets collected with different tools.

Epifaunal communities in the deep sea have been sampled in part with tools that are lowered to the seafloor and collect physical samples, such as trawls, corers, and epibenthic sleds (Jamieson et al., 2013). A widely used trawl system includes a net towed along the seafloor, adapted from coastal commercial fishing techniques. Trawls are regularly used for commercial fishing (Hall-Spencer et al., 2002), as well as fish stock assessment (Clark, 1979). Often, trawl data are used to determine patterns in distribution and biomass of megafaunal communities (Kenchington et al., 2011, 2014, 2016a; Moritz et al., 2013; Gullage et al., 2017; Murillo et al., 2020).
Relatively less invasive tools such as remotely operated vehicles (ROVs), autonomous underwater vehicles (AUVs), drop cameras, towed cameras, camera sleds, rovers, and baited cameras have been used to collect imagery data. ROVs generally hover above or make minimal contact with the seafloor and can maintain a relatively constant speed and height above the bottom while sampling. ROVs are non-destructive, allow for habitat associations and behavior analyses, are suitable in complex/high relief habitats, and have high maneuverability (Sward et al., 2019). Recent direct comparisons suggested that ROVs captured a higher abundance of sea pens than trawls (Chimienti et al., 2018b, 2019). Other imaging systems are also less destructive than research trawls. Unlike ROVs, AUVs are not tethered to a ship, and can run imagery transects close to the seafloor, possibly producing less noise and discontinuous lighting than ROVs (Morris et al., 2014). Drop cameras are towed behind a ship and hop along the seafloor, capturing still images only when stationary and in contact with the seafloor. Towed cameras are tethered and towed by a ship, often maintaining a target height above the seafloor without making contact (Drazen et al., 2019). Camera sleds and rovers generally make continuous contact with the seafloor, with sleds being towed by a ship and rovers being autonomous. Baited cameras are deployed as free-falling systems, which rest at a fixed location on the seafloor with an attached bait that attracts fauna (Logan et al., 2017; Drazen et al., 2019). These baited camera systems may capture imagery continuously or at discrete time intervals.
While there is a wealth of literature on individual descriptions and qualitative comparisons of performance for these various deep-sea sampling tools (e.g., Jamieson et al., 2013; Flannery and Przeslawski, 2015; Durden et al., 2016), direct empirical comparisons between the tools and their sampling biases are limited. A few studies have quantitatively compared the megaepifaunal data collected by different tools in a single study area (e.g., abundance, diversity, biomass, size). For example, several studies have compared fish abundance and composition, as well as abundance of select megafauna (e.g., crustaceans, echinoderms, and molluscs), between ROVs or drop cameras and trawls of different sizes (Adams et al., 1995; Nybakken et al., 1998; Kenchington et al., 2011; Ayma et al., 2016; Pacunski et al., 2016; Chimienti et al., 2018b). Other studies have compared abundance and diversity of both megafauna and fish among human-occupied submersibles, camera sleds (analog), and otter trawls (Uzmann et al., 1977) and among AUV, towed cameras, and trawls (Morris et al., 2014). A few studies have focused on comparing diversity, abundance, and size across imaging systems such as AUVs, towed camera systems, or baited cameras (Logan et al., 2017; Schoening et al., 2020).
In this study, we had the opportunity to directly compare the performance of three commonly used tools in quantifying megaepifauna, a ROV (ROPOS), a drop camera (Campod), and a research trawl (Campelen 1800 shrimp trawl) by sampling at the same two locations on the Northwest Atlantic Ocean, in the Laurentian Channel Marine Protected Area (MPA). This opportunity allowed us to: (1) compare the composition of the same benthic assemblages (abundance of the most common morphotaxa and diversity) as quantified using the ROV and the drop camera; and (2) examine differences in image quality and sampling bias, such as catchability, spatial extent, and position accuracy, between the ROV, drop camera, and trawl. We were able to compare tool performance in two locations which differed in megafaunal density, community composition, and environmental characteristics, using a replicated sampling design. Monitoring and research logistics related to operation and maintenance costs, frequency of use and technical specifics are outside of the scope of this study, as those factors change rapidly with evolving technology, are highly variable between tools in the same category, as well as dependent on the research objectives, finances and the specific ecosystem.
Detected differences in species abundance or diversity among tools could imply varying catchabilities. Although many studies have used some of these tools, to our knowledge no empirical (quantitative and qualitative) direct comparison across all three tools used to characterize the same assemblage has been made. These types of comparisons can provide insight into the selection of the most appropriate tool(s) for capturing a targeted species or different ecological attributes of interest, thus ensuring high data quality and supporting appropriate data interpretation. Our study can both guide the collection of relevant baseline data and enhance monitoring efforts of deep-sea ecosystems. However, our study also underscores the need for more detailed evaluation of catchability, encompassing other tools, species, ecosystems, and ecological attributes of interest.
FIGURE 1 | Map of the Laurentian Channel MPA (boundary provided by DFO), in Atlantic Canada off the southwest coast of Newfoundland, indicating the locations of the two sampling stations (LC2 and LC5). Shown are eight 400-m ROPOS transects, three 400-m Campod transects, and the trawl set at station LC5 (inset i) and LC2 (inset ii). World Ocean Base layer from Esri (2020), using coordinate system GCS_WGS_1984. Bathymetry data layer (in meters) is from Lacharité et al. (2020). This MPA is located within a deep submarine valley in the Northwest Atlantic and was designated to protect corals (predominately sea pens), several fish and shark species, as well as leatherback turtles (DFO, 2019).

Study Site
Our sampling areas were in the Laurentian Channel MPA, located in a deep submarine valley off the southwest coast of Newfoundland, Canada (Figure 1), which is ∼11,580 km² (DFO, 2019) and ∼115-490 m deep (Lacharité et al., 2020). We sampled two stations, LC2 and LC5, to capture a range of taxonomic diversity and abundance. A map of biophysical seafloor features classified station LC2 as part of a benthoscape characterized by intermediate depth (200-400 m) with low relief (0.5-1°), very abundant pockmarks (>5 km⁻²), sparse ice scours (<1 km⁻²), and mixed sediment with some gravel (Lacharité et al., 2020). Station LC5 was classified as deep (>400 m) with low relief, sparse pockmarks (<1 km⁻²), abundant ice scours (>2 km⁻²), and sandy mud with gravel traces. The environmental conditions (bathymetry, pockmarks, ice scours, and slope) were similar within stations, as supported by Lacharité et al. (2020) and video observations of the sampled areas. Thus, we assumed that our results could be attributed mostly to how the tools captured the morphotaxa, rather than to spatial patterns in environmental variables.

Imagery
We used two different tools to collect and compare imagery from the two stations. In 2017, we performed eight 400-m parallel transects with the ROV Remotely Operated Platform for Ocean Sciences (ROPOS; https://www.ropos.com/). Sampling was based on a systematic cluster design with alternating spatial lags, recommended for capturing spatial patterns in the absence of prior knowledge (Fortin et al., 1989). Transects were spaced at spatial lags of 10 m and groups of two were spaced at 200 m (Figure 2). This design allowed us to combine a large spatial extent with high spatial resolution. We used continuous video collected with a downward-facing Insite Pacific Zeus-Plus HD camera (1,920 × 1,080 pixels) to capture epibenthic fauna and quantify sampling area. The ROV includes 3 × 400 W HMI and 3 × 350 W LED primary light sources, as well as 8 × 150 W LED lights used to fill in shadows near the vehicle and for additional sampling, e.g., using the manipulator arms and attached equipment. Video was stored as a series of MPEG files for easier processing, resulting in 70 video segments at station LC2 and 68 at LC5. Metadata included a real-time comment log, voiceover comments on videos, closed-caption encoding (geo-referencing; using Digital Rapids StreamZHD recorders, Canadian Scientific Submersible Facility, 1995-2020), date, time, latitude, longitude, depth, heading, pitch, roll, forward velocity, starboard velocity, downward velocity, altitude above seafloor, as well as temperature and salinity from a CTD. Specimens, water samples, and sediment samples were collected opportunistically.
In 2018, we collected additional imagery with the drop camera Campod [operated by the Department of Fisheries and Oceans (DFO)-Canada, Bedford Institute of Oceanography]. The sampling design was modified from the one used with ROPOS to three 1-km parallel transects at ∼200 m spacing because the passive drifting of Campod makes maneuvering difficult. Still images (JPEG) were captured every 10 s, timed to manual hops of the camera along the seafloor, using a downward-facing NIKON D810 camera (7,360 × 4,912 pixels). This drop camera system had two Quantum Qflash Model T5D-R light sources, which were operated at full power at 150 W. A total of 2,886 images at station LC2 and 2,202 images at LC5 were captured over the longer transects; however, we only processed images corresponding to a 400-m segment of each transect to make the transect arrays comparable between sampling tools (Figure 2). We used a real-time comment log, and other metadata (date, time, latitude, longitude, depth, and altitude) were provided by the Campod technical crew after post-processing of the Navnet, CTD, altimeter, and USBL systems.
FIGURE 3 | Examples of imagery collected by the ROV ROPOS (first row) and the drop camera Campod (second row), deemed suitable (left) or unsuitable (right) for analysis (e.g., poor illumination, blurry edges, sediment plumes). If less than 50% of an image was unsuitable and the scaling lasers were visible (middle), then those images were cropped before enumerating fauna and calculating the area analyzed.
*Some data also collected on 2017-09-13. **13 images from CON47 were missing measurements of depth and altitude; means and SD are calculated for n = 18 images.
Frontiers in Marine Science | www.frontiersin.org

Trawl
Biomass of all caught coral species (sea pens, gorgonians, soft corals, cup corals) was estimated from DFO (Newfoundland and Labrador Region) trawl surveys performed using a Campelen 1800 Shrimp Cosmos Trawl in the 3Ps NAFO region (which included the Laurentian Channel MPA) in 2010 (McCallum and Walsh, 1996). We used data from two tows, 0.9 nautical miles in length (17-18 min at three knots), providing catch weights (kg) for five unique coral records (Duva florida, Funiculina quadrangularis, Halipteris finmarchica, Pennatula cf. aculeata, Sea pen sp.), as well as tow metadata (e.g., date, set, NAFO region, distance, duration, damage, depth, temperature, start/end latitude/longitude, gear type). The principle of stationarity, that the same ecological processes are assumed to be occurring throughout a given area, can be rendered invalid at increasing distances between points of interest (Dale and Fortin, 2014). To avoid non-stationarity, we limited comparisons to data within a 2-km buffer around each starting point of the ROPOS transects. This encompassed the entire ROPOS and Campod tracks, but only one full trawl track at each station was within the selected 2-km station radius.
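The 2-km buffer rule described above can be illustrated with a great-circle distance check. The following is a minimal sketch rather than the procedure actually used in the study (which is not specified beyond the buffer radius); the function names, the spherical-Earth radius, and the example coordinates are all assumptions introduced for illustration.

```python
import math

EARTH_RADIUS_M = 6371000.0  # assumed mean Earth radius (spherical approximation)

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))

def within_buffer(point, centre, radius_m=2000.0):
    """True if `point` (lat, lon) lies within `radius_m` of `centre` (lat, lon)."""
    return haversine_m(point[0], point[1], centre[0], centre[1]) <= radius_m

# Hypothetical coordinates, not the real station positions.
transect_start = (47.00, -57.00)
print(within_buffer((47.01, -57.00), transect_start))  # about 1.1 km away
print(within_buffer((47.03, -57.00), transect_start))  # about 3.3 km away
```

A trawl track would count as "within the buffer" only if every recorded position along it passes this check, which is one possible reading of "one full trawl track" above.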
Area swept by each trawl set was calculated as tow distance multiplied by wing spread adjusted for the ship used (area = distance × wing spread). Assuming a depth of ∼400 m at the Laurentian Channel, and using the median value for wing spread of 16.5 m on CCGS A. Needler (Walsh et al., 2009), we estimated the area per trawl as 27,502 m². We estimated biomass in g m⁻² by dividing biomass (kg tow⁻¹) by the estimated area swept by the trawl, after weight conversion to grams [biomass (g m⁻²) = biomass (kg) × 1,000 × area⁻¹]. As with biomass (kg), the estimated biomass per unit area (g m⁻²) assumes corals were evenly distributed across the trawled area.
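As a worked check of the swept-area arithmetic, the sketch below reproduces the area estimate from the tow length (0.9 nautical miles) and wing spread (16.5 m) given in the text. The function names and the example catch weight are hypothetical; only the tow length, wing spread, and unit conversions come from the text or from standard definitions.

```python
NAUTICAL_MILE_M = 1852.0  # metres per international nautical mile

def swept_area_m2(tow_distance_nmi, wing_spread_m):
    """Area swept by one tow: tow distance (converted to metres) times wing spread."""
    return tow_distance_nmi * NAUTICAL_MILE_M * wing_spread_m

def biomass_density_g_m2(catch_kg, area_m2):
    """Catch weight converted to grams and divided by the swept area."""
    return catch_kg * 1000.0 / area_m2

area = swept_area_m2(0.9, 16.5)
print(round(area))  # 27502, matching the area per trawl reported in the text

example_catch_kg = 1.0  # hypothetical catch weight, for illustration only
print(biomass_density_g_m2(example_catch_kg, area))
```

Because the density is a simple ratio, any error in the assumed wing spread propagates proportionally into the biomass estimate.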

Imagery Analysis
Using the Video and Track Replay feature in the software application Ocean Floor Observation Protocol (OFOP 3.3.8c; Huetten and Greinert, 2008; Scientific Abyss Mapping Services, 2009), position data from ROPOS were synced to video with timestamps, and images were extracted at a target distance interval of 1.5 m. We confirmed the start/end of each transect in ArcGIS (Version 10.5; Esri, 2016) and excluded off-transect or overlapping images. We aimed to analyze every 4th image. Images were included in the analysis if the total area was less than 6 m², as estimated by the scaling lasers spaced 10 cm apart, and if image clarity permitted an unobstructed view of the seafloor (see Figure 3 for example imagery). Obscured sections were cropped out to permit taxonomic identification (i.e., removing suspended particles, pelagic animals near the camera, sediment plumes, or sections of low light). Images were deemed unsuitable if they required >50% cropping. Overall, this protocol resulted in ∼6-m spacing between images; when images were deemed unsuitable, the next sequentially suitable image at a distance of 1.5-6 m was analyzed instead. We filtered out 50% of the analyzed ROPOS images to reduce sample size, making it comparable to Campod (which generated fewer images), which resulted in a final spacing of ∼12 m between images for ROPOS. Due to the passive drifting of Campod, we used a target time interval of 10 s rather than a target distance between images, which, assuming a speed of one knot (0.514 m s⁻¹), corresponds to ∼5-m spacing. However, this was likely an overestimate of distance because the camera drifted more slowly than expected. To maintain a standard image analysis protocol (using ImageJ software; Abràmoff et al., 2004) consistent with that used for the ROPOS analysis, we analyzed every 4th image for a 400-m section of each transect (spacing ∼40 s, ∼20 m) using the same protocols as for ROPOS.
All megafauna >2 cm in the largest dimension were enumerated and identified to morphospecies using a reference guide based on the World Register of Marine Species (WoRMS). If morphospecies were colonial or too numerous to count, e.g., Holothuroidea and encrusting sponges, they were recorded as percent cover instead of counts. We used the point method to estimate percent cover (209 points for ROPOS and 204 for Campod, due to differences in image resolution), including only the points that fell onto the cropped area of the image.
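The nominal image spacings quoted above follow from simple multiplication, sketched below. The helper names are hypothetical, and the calculation assumes a constant drift speed of one knot, which the text itself notes is probably an overestimate for Campod.

```python
KNOT_M_PER_S = 0.514  # metres per second per knot, as used in the text

def time_based_spacing_m(interval_s, speed_knots, every_nth=1):
    """Nominal distance between analyzed images for a drifting camera."""
    return interval_s * speed_knots * KNOT_M_PER_S * every_nth

def distance_based_spacing_m(extract_interval_m, every_nth=1, keep_fraction=1.0):
    """Nominal spacing when frames are extracted at a fixed distance interval,
    every n-th frame is analyzed, and a fraction of analyzed images is kept."""
    return extract_interval_m * every_nth / keep_fraction

print(time_based_spacing_m(10, 1))               # roughly 5 m between Campod stills
print(time_based_spacing_m(10, 1, every_nth=4))  # roughly 20 m after taking every 4th
print(distance_based_spacing_m(1.5, every_nth=4))                     # 6 m for ROPOS
print(distance_based_spacing_m(1.5, every_nth=4, keep_fraction=0.5))  # 12 m after filtering
```

These are target values only; replacement of unsuitable images and variable drift speed mean realized spacing will differ along a transect.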
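The point method reduces to a ratio of grid points that land on the taxon to grid points that land on the usable (cropped) part of the image. The sketch below illustrates that ratio; the function name and the example counts are hypothetical, and only the grid size of 209 points for ROPOS is taken from the text.

```python
def percent_cover(points_on_taxon, points_on_cropped_area):
    """Percent cover from point counts, using only points on the usable image area."""
    if points_on_cropped_area <= 0:
        raise ValueError("no grid points fall on the usable image area")
    if not 0 <= points_on_taxon <= points_on_cropped_area:
        raise ValueError("points on taxon must lie between 0 and the usable point count")
    return 100.0 * points_on_taxon / points_on_cropped_area

# Hypothetical ROPOS image: 209-point grid, 180 points remain after cropping,
# and 9 of those fall on an encrusting sponge.
print(percent_cover(9, 180))  # 5.0
```

Using the cropped point count as the denominator keeps estimates comparable between images that needed different amounts of cropping.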

Data Analysis
We aimed to avoid spatial overlap of sampling among tools, thus minimizing potential confounding effects. However, some overlap may have occurred between the trawl track and Campod (CON46) at LC5, and ROPOS (2A) at LC2 (Figure 1). Because these tools have different levels of position accuracy, it was difficult to interpret distances between their tracks, in particular for the trawl, because latitudes and longitudes reflect the vessel position rather than that of positional equipment mounted directly on the trawl.
For each transect of each tool, we evaluated image quality based on a number of criteria: total images captured, number of suitable images selected for analysis, total area covered by images and total area of images deemed unsuitable for analysis. For analysis, we selected the most abundant taxonomic groups (see Figure 4), determined as those recorded on at least 11 of 22 total transects in the study. The less abundant taxa were either aggregated to form groups of higher abundance (e.g., Actiniaria (O.) spp.) or excluded (i.e., too few counts) from the analyses.
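The occurrence threshold described above (recorded on at least 11 of 22 transects) can be expressed as a simple frequency filter. The sketch below is one possible implementation; the data layout, function name, and toy records are assumptions and do not reflect the study's actual data files.

```python
def frequent_taxa(counts_by_transect, min_transects):
    """Return taxa recorded (count > 0) on at least `min_transects` transects.

    counts_by_transect maps a transect label to a {taxon: count} dictionary.
    """
    occurrences = {}
    for counts in counts_by_transect.values():
        for taxon, n in counts.items():
            if n > 0:
                occurrences[taxon] = occurrences.get(taxon, 0) + 1
    return sorted(t for t, k in occurrences.items() if k >= min_transects)

# Toy data with four transects and a threshold of two, for illustration only.
toy = {
    "T1": {"taxon_a": 12, "taxon_b": 0, "taxon_c": 1},
    "T2": {"taxon_a": 3, "taxon_b": 2},
    "T3": {"taxon_a": 7},
    "T4": {"taxon_b": 5},
}
print(frequent_taxa(toy, min_transects=2))  # ['taxon_a', 'taxon_b']
```

For the design in the text, the call would use `min_transects=11` over all 22 transects.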
To make the sampling design used by ROPOS comparable to that of Campod (which included three 400-m transects) for statistical analyses, we assembled the eight individual ROPOS transects (A-H) into four groups of three 400-m transects, ∼200 m apart (ROPOS_ACE, ROPOS_BDF, ROPOS_CEG, ROPOS_DFH). We compared the abundance of the most abundant taxa among sampling designs (5 levels: Campod, ROPOS_ACE, ROPOS_BDF, ROPOS_CEG, and ROPOS_DFH) using one-way type 2 ANOVAs (Underwood, 1997). We detected a lack of normality using Shapiro-Wilk tests and normal quantile plots, and heteroscedasticity using Levene's tests and residual plots. Although we explored several data transformations (e.g., natural log and square root versions), none improved heteroscedasticity and normality; therefore we used the untransformed data (abundance of individuals or colonies m⁻²) in the ANOVAs. Post hoc comparisons for significant pairwise differences in treatment means were performed with Tukey's HSD tests (Abdi and Williams, 2010). We used all eight 400-m ROPOS transects (ROPOS_all) for some analyses. Including all aggregated taxa, we calculated morphospecies accumulation curves using the random method for each sampling design at each station, with 999 permutations. We used non-metric multi-dimensional scaling (NMDS) to explore similarities in the composition of the assemblages among sampling designs within stations and between stations. Significant patterns were explored using permutational multivariate analysis of variance (PERMANOVA) on the Bray-Curtis dissimilarity matrix with 999 permutations using the "adonis" function. All statistical analyses were done with R version 3.6.1; packages Tidyverse, Reshape, Vegan, Car, and Agricolae.
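Two of the quantities named above can be illustrated compactly. The sketch below is a plain-Python illustration, not the R/vegan code used in the study: it computes a Bray-Curtis dissimilarity between two abundance vectors and a permutation-based accumulation curve in which images are added in random order and the mean number of morphospecies is recorded at each step. Function names, the seed, and the toy data are assumptions.

```python
import random

def bray_curtis(x, y):
    """Bray-Curtis dissimilarity between two equal-length abundance vectors."""
    num = sum(abs(a - b) for a, b in zip(x, y))
    den = sum(a + b for a, b in zip(x, y))
    return num / den if den else 0.0

def accumulation_curve(samples, permutations=999, seed=0):
    """Mean cumulative number of taxa after 1..n samples over random orderings.

    samples is a list of sets, each holding the taxa seen in one image.
    """
    rng = random.Random(seed)
    n = len(samples)
    totals = [0.0] * n
    order = list(range(n))
    for _ in range(permutations):
        rng.shuffle(order)
        seen = set()
        for k, idx in enumerate(order):
            seen |= samples[idx]     # add taxa from the next randomly chosen image
            totals[k] += len(seen)   # cumulative richness at this sample size
    return [t / permutations for t in totals]

# Toy data: three images and three taxa, for illustration only.
images = [{"a"}, {"a", "b"}, {"c"}]
print(accumulation_curve(images))
print(bray_curtis([1, 2], [3, 2]))  # 0.25
```

A curve that is still rising at the largest sample size, as reported for most designs here, indicates that additional images would be expected to add further morphospecies.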

RESULTS
Overall, image quality was high for both tools and most transects, except for ROPOS at LC5. For ROPOS, more images per transect were unsuitable at LC5 than at LC2 (Table 1). A higher proportion of images collected by Campod than by ROPOS were unsuitable, and only half the images were selected for analysis. However, it was often possible to replace unsuitable images, except for ROPOS at LC5 where fewer images were analyzed. Less area needed to be cropped out of images by ROPOS at LC2 than by ROPOS at LC5 and by Campod at both sites (10-40% of the area was cropped out of images with unsuitable sections). Mean altitude (and standard deviation) above the seafloor was consistent across all transects, at ∼1-2 m, and thus did not affect image quality (Table 2).
There were some differences between stations in the detected species composition and some taxa were only found at one station. [...] LC2 than at LC5, for ROPOS (all eight transects; LC2 0.000 ± 0% and LC5 0.005 ± 0.003%) but varied for Campod (all three transects; LC2 0.273 ± 0.3% and LC5 0.060 ± 0.003%). The abundance of taxa varied between tools and stations. Abundance was much greater at LC2 than at LC5, but was dominated by a few species, particularly Pennatula sp. 2 (Figure 5). Campod captured significantly higher abundance of Actiniaria (O.) sp. 23 than all ROPOS designs and of Pennatula sp. 2 than two of four ROPOS designs, but only at station LC5 (Table 5).
Morphospecies accumulation curves were similar for Campod and ROPOS at LC2 (Figure 6), but the curve was steeper for Campod than all ROPOS designs at LC5. A plateau of the morphospecies accumulation curves was not reached at either station, except when all ROPOS transects were combined at LC5. Significant differences in species composition existed between the two stations (Figure 7A and Table 6). Based on the NMDS plots and PERMANOVAs, the imagery tools produced similar assemblages at LC2 (overlapping 95% confidence interval for tools) but not at LC5 (Figures 7B,C and Table 6).

DISCUSSION
We examined differences in the estimation of abundance and diversity, as well as sampling performance and biases among the ROV, drop camera and trawl when applicable. Image quality was impacted by several factors including resolution (based on camera and file type), speed and mode of movement, elevation above seafloor, sufficiently uniform lighting and obstruction by plumes of resuspended sediment and animals in the water column. Extracted frame grabs from video of a moving camera (ROV) had reduced quality compared to stills (drop camera), likely related to motion blur (i.e., caused by a moving camera or target) and compression of the video file. It was usually possible to replace unsuitable images with neighboring ones, except for the ROPOS transects at LC5, where significant fish activity created sediment plumes that obstructed many images resulting in a slight sampling bias of fewer images ( Table 1). This fish behavior may have been a response to the motion, sound, or constant bright lighting of the ROV in an otherwise low light environment. However, the drop camera produced a yet higher proportion of unsuitable images, as the tool itself caused some sediment disturbance.
Taxon-specific abundance (particularly for Pennatula and Actiniaria) was higher and accumulation curves steeper when using the drop camera than the ROV, but only at one station. However, the curves did not reach an asymptote with either tool, suggesting more than 90 images were required to fully capture diversity. These taxon-specific differences suggest the tools had different catchabilities for different morphospecies, possibly because poor image quality compromised the ability to distinguish smaller sized individuals (e.g., Pennatula sp. 2 recruits) or taxa with similar coloration to the sediment (Table 8).
Another study comparing imagery from towed cameras and an AUV similarly concluded that higher resolution imagery both leads to detection of higher faunal densities and accounts for smaller fauna (Schoening et al., 2020). Minimizing the altitude of the camera, to obtain higher resolution imagery, should also result in higher taxonomic resolution (Schoening et al., 2020). A towed camera produced imagery of lower resolution when sampling from an altitude of ∼3 m than ∼1 m, resulting in reduced taxonomic identification (Jones et al., 2009).
[Figure caption, partially recovered] ... Table 1 for number of images per transect. Different scales on y-axis; * denotes taxon that was only found at one of these stations.
Sampling adjustments that enhance image quality are needed to optimize data analysis, such as constraining the altitude off the seafloor and speed of the camera movement. In addition, certain video imagery file types during data collection may improve resolution and allow reliable detection of recruits and smaller taxa. In general, differences in catchability between tools may make some tools better suited than others for capturing morphospecies with different magnitudes/patterns of abundance (i.e., very abundant vs. rare morphospecies), affecting tool performance. Catchability and sampling biases of all sampling tools need to be compared quantitatively for different species and ecosystems by sampling the same locations and ecological attributes, preferably at the same time. To our knowledge, ours is the first study that compares empirically catchability from a drop camera to that of a ROV and a trawl (with ∼16.5 m wing spread) in the peer-reviewed literature.
Trawls are used to assess fish stocks (Trenkel et al., 2004; McIntyre et al., 2015), and sometimes invertebrates, such as octopus, decapods, sea pens, sponges, holothurians, and some gorgonians (Junceella sp. and alcyonaria) (Adams et al., 1995; Wassenberg et al., 2002; Pitcher et al., 2007; Ayma et al., 2016; Pacunski et al., 2016; Chimienti et al., 2018b, 2019; Zhulay et al., 2019; Dinn et al., 2020). Otter trawls have a flexible mouth that is better suited for capturing mobile fauna but is less effective for epibenthic fauna (Jamieson et al., 2013). In this study, we showed different patterns in relative abundance for sea pens (Order Pennatulacea) sampled by ROV from those based on biomass data from the trawls. Trawls may have lower capture efficiency for some invertebrates than the drop camera and ROV. Kenchington et al. (2011) suggested ∼5.2% sea pen catch efficiency for the Campelen Trawl compared to Campod. Similarly, sea pen density of P. rubra was higher based on ROV data than trawl data (Chimienti et al., 2018b).
FIGURE 6 | Morphospecies accumulation curves at (A) station LC2 and (B) station LC5 in the Laurentian Channel MPA, based on abundance per photo across each of the six sampling designs: Campod, ROPOS_ACE, ROPOS_BDF, ROPOS_CEG, ROPOS_DFH (three 400-m transects), and ROPOS_all (eight 400-m transects). Random method with 999 permutations; shaded confidence intervals are one SD.
Trawls are likely more appropriate for assessing some mobile fauna, yet in this study drop camera and ROV appeared to capture sessile fauna and recruits more effectively. Past studies have also found that trawls tend to undersample abundance and diversity compared to imagery returning higher abundance estimates for many but not all species (e.g., Uzmann et al., 1977;Nybakken et al., 1998;Morris et al., 2014). The higher abundance recorded from trawls for some species, such as squid, herring, mackerel, and butterfish, were likely the result of a photonegative response to the lighting on the submersible or camera sled (Uzmann et al., 1977). Logan et al. (2017) recorded overall higher fish abundances and diversity with a baited camera than a towed camera, yet this varied with habitat and functional group, where towed camera recorded higher abundances of species with cryptic or territorial behavior.
Comparisons of data obtained by imagery tools and trawls are challenging, as the tools appear to have different catchability limitations and capture different epifaunal patterns. In our study, the high catch weights combined with low numerical abundance captured by the trawl may have been the result of larger sea pen species at one of the stations (LC5), and/or smaller individuals being missed because of net size. Trawls have reduced catchability for small species such as Kophobelemnon spp. (Kenchington et al., 2011). Kenchington et al. (2011) also reported varying mean weights for the sea pen species in the Laurentian Channel, and our imagery suggested the composition of sea pens may vary by station. Additionally, some sea pens have a withdrawal response (Langton et al., 1990; Ambroso et al., 2013; Chimienti et al., 2018b), which may result in an underestimate of abundance and biomass.
FIGURE 7 | Non-metric multi-dimensional scaling (NMDS) plot of the assemblages in the Laurentian Channel MPA (mean abundance m⁻² per transect for n = 23 aggregated taxa) using Bray-Curtis dissimilarity with 95% CI for station and tool groups, at (A) station LC2 and LC5; (B) station LC2 for each sampling design and (C) station LC5.
*Denotes significant p-value using α = 0.05. Using 23 taxa in total, some of which were found only at one station. Based on mean abundance data by transect.

Qualitative Comparison of Tools
Overall, imagery tools appeared to perform better than trawls for most ecological attributes, with fewer sampling biases and less disturbance within a smaller footprint (Table 8). All three tools can be used to identify morphospecies, commonly used for image analyses, which can be verified subsequently with physical samples collected by a ROV or trawl but not a drop camera. However, taxonomic identification using imagery is constrained and efforts on global standardization are underway (Howell et al., 2019). Sizing taxa can allow for examination of population dynamics, such as recruitment events (Bak and Meesters, 1998; Chimienti et al., 2018b), although this is not currently common practice for most of the trawl samples in our region. Imagery may be used to estimate size only of non-erect taxa lying on the same plane as the lasers, but in our study, the scaling lasers in both drop camera and ROV imagery were deemed unsuitable to size sea pens.

TABLE 8 | [Partially recovered; where all three cells of a row survive, they are listed in the order ROV, drop camera, trawl.]
[Row label not recovered.] Imagery: Sizing possible for objects >2 cm on same plane as scaling lasers; or, in absence of appropriate scale, relative sizing (adult vs. juveniles) is possible. It is also easier to see objects <2 cm in some imagery. Caution: erect fauna requires a more appropriate scale. Trawl: Specimens can be measured (minimum size depends on largest mesh size and catch efficiency). Note: not common practice.
Bias/quality control
4. Catchability and sampling bias. ROV: Mobile animals may be attracted/repelled by continuous presence and lighting of tool; repeat counts possible if individuals reenter the transect at multiple points (i.e., follow the camera). Drop camera: Disturbance of sediment could cause aversion/retraction, possibly preempting capture; repeat counts possible if individuals reenter the transect at multiple points (i.e., follow the camera). Trawl: Low efficiency for sea pens² (possible retraction may preempt capture), and varying catchability for various fish species³. Catchability/spread of the trawl may be affected by obstruction, improper rigging, net damage, depth, amount of warp, stability of the vessel, currents, and bottom type¹.
5. Image quality. Imagery: Image quality depends on camera resolution, file compression/type chosen, speed of movement, altitude off the seafloor, sufficient lighting, and degraded quality of imagery during extraction of frame grabs.
[Row label not recovered.] Trawl: Not recommended¹: quality control adjustments during trawl could affect area swept; mostly done before or after tow (i.e., redo tow).
8. Position accuracy. ROV: Advanced position accuracy (i.e., using three different systems: USBL, gyrocompass, DVL)⁴. Real-time positioning adjustments possible (i.e., 0.2% of position depth, ∼±1 m when depth is 500 m). Drop camera: Good position accuracy (i.e., tens of meters, using USBL but post-processing calculations rely on ship GPS). Passively drifting system, real-time positioning adjustments not possible. Trawl: Limited positioning (i.e., system is ship based with accuracy of tens of meters). May use calculations to estimate trawl position relative to ship or assume same position. Protocol does not allow for real-time adjustments.
Other
9. Disturbance of the seafloor. ROV: Minimal disturbance, slight resuspension of sediment localized to width of ROV (i.e., ∼2 m²). Drop camera: Some disturbance of sediment during bottom contact; hopping of camera on seafloor. Trawl: High disturbance of the seafloor; sustained bottom contact.
[Row label not recovered.] Trawl: Limited [i.e., CTD, SCANMAR/SEATRAWL data on trawl geometry and performance, log sheets (vessel, vessel position, set number, depth, as well as the start/end/speed of tow), and sometimes Roxann is used to collect data on substrate. Specimens usually greater than mesh size].
11. Processing time (excludes quality control). Imagery: High; depending on image resolution, complexity, and observer experience (i.e., ∼7.5 min per image). Trawl: Low to medium (i.e., minutes to hours depending on catch). Processing includes removing specimens from net, on-board sorting, identification, weighing. Note: further processing onshore, not included.
Tool-specific operational effects likely impacted overall catchability. Lighting, noise, and physical disturbance of sediment may have led to the attraction or aversion of some morphospecies. Fish behavioral reactions to ROVs and trawls have been recorded in previous studies (Adams et al., 1995; Trenkel et al., 2004; McIntyre et al., 2015; Ayma et al., 2016). Other factors, including obstruction, improper rigging, net damage, depth, amount of warp, stability of the vessel, currents, and bottom type, are known to affect trawl sampling (Walsh et al., 2009). The ROV had the highest real-time quality control, which was otherwise limited for the drop camera and trawl systems due to either the passive nature of the tow or operational protocols (Table 8). Positioning accuracy was highest for the ROV at ∼±1 m and was estimated at tens of meters for the drop camera and trawl, although trawl position accuracy has additional limitations because positioning equipment is mounted on the vessel rather than on the trawl. Furthermore, wire-out for trawls is often three times the depth to the seafloor, and the trawl can be 1,000s of m below or behind the vessel (Jamieson et al., 2013).
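As a rough illustration of the positioning figures above, the following Python sketch computes a depth-proportional position error (0.2% of depth, as given in Table 8) and the horizontal distance of a trawl behind the vessel when wire-out is three times the depth. The function names and the straight, taut warp geometry are illustrative assumptions and not part of the survey protocol; because a real warp sags, the straight-line layback should be read as an upper bound.

```python
import math


def depth_proportional_error_m(depth_m: float, fraction_of_depth: float = 0.002) -> float:
    """Position error that scales with depth (0.2% of depth, per Table 8)."""
    return depth_m * fraction_of_depth


def trawl_layback_m(depth_m: float, warp_to_depth_ratio: float = 3.0) -> float:
    """Horizontal distance of the trawl behind the vessel.

    Assumes the warp is a straight, taut line (a simplification), so the
    layback is the horizontal leg of a right triangle whose hypotenuse is
    the wire-out and whose vertical leg is the depth.
    """
    warp_m = warp_to_depth_ratio * depth_m
    return math.sqrt(warp_m**2 - depth_m**2)


depth = 500.0  # example depth in meters, matching the Table 8 example
rov_error = depth_proportional_error_m(depth)
layback = trawl_layback_m(depth)
print(f"ROV position error at {depth:.0f} m: about +/-{rov_error:.1f} m")
print(f"Trawl layback at {depth:.0f} m with 3x wire-out: about {layback:.0f} m")
```

Under these assumptions, a trawl fished at 500 m sits more than a kilometer behind the vessel, which is why a ship-based position alone is a coarse proxy for where the net actually sampled.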
The ROV sampled at a rate of approximately 0.75 km h⁻¹, slightly slower than the drop camera (0.8 km h⁻¹); the trawl had the fastest sampling rate at ∼5.6 km h⁻¹ (Table 8).
Processing time was estimated to be higher for imagery than for trawl catches, at ∼6.5 min per image for ROPOS and ∼7.5 min per image for the drop camera, the difference likely due to the drop camera's higher image resolution. Trawl processing time on ship is highly variable, ranging from minutes to hours depending on catch size.
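To make the trade-off between sampling time and processing time concrete, the sketch below converts the rates reported above into time on the seafloor per kilometer and annotation hours for a batch of images. The 1 km distance and the batch of 100 images are hypothetical inputs chosen for illustration, and the function names are ours; only the speeds and per-image times come from the text.

```python
# Sampling speeds (km per hour) and annotation times (minutes per image)
# as reported in the text; everything else below is illustrative.
SPEED_KM_H = {"ROV": 0.75, "drop camera": 0.8, "trawl": 5.6}
MIN_PER_IMAGE = {"ROV": 6.5, "drop camera": 7.5}


def minutes_on_bottom(tool: str, distance_km: float) -> float:
    """Time needed to cover a given distance at the tool's sampling speed."""
    return 60.0 * distance_km / SPEED_KM_H[tool]


def annotation_hours(tool: str, n_images: int) -> float:
    """Observer time needed to annotate a batch of images."""
    return n_images * MIN_PER_IMAGE[tool] / 60.0


for tool in SPEED_KM_H:
    print(f"{tool}: {minutes_on_bottom(tool, 1.0):.0f} min per km sampled")

for tool in MIN_PER_IMAGE:
    # 100 images is a hypothetical batch size.
    print(f"{tool}: {annotation_hours(tool, 100):.1f} h to annotate 100 images")
```

The calculation ignores deployment, recovery, and onshore trawl processing, so it compares only the components quantified in this study.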
Ultimately, the data collected using different tools may be used in different types of analyses. For example, the more flexible datasets with detailed metadata, such as georeferenced faunal records collected continuously along video transects, allow for analyses of spatial structure and species associations (Table 8). Further, there are trade-offs between data resolution (complete transect with ROV and snapshots with drop camera) and image quality, as well as between sampling/processing time and data quality, which could affect analyses. Overall, real-time control of sampling, less disturbance, and higher position accuracy than trawls are desirable features of ROVs.

Future Research and Recommendations
More baseline data are needed to understand the structure and function of deep-sea communities and develop strategies for monitoring and conservation (Danovaro et al., 2017a; Aguzzi et al., 2019). However, for these data to be meaningful, appropriate and quantitative tools should be used. Different tools clearly differ in their efficiency in capturing different species, often rendering results incomparable across tools.
Further research is needed into the utility of available sampling tools for different types of analyses. ROVs collect data that may be used to define taxa-specific habitat relationships at more spatially discrete scales, as well as community structure and biogeographic affinities (Zhulay et al., 2019). Key research foci should include size relationships, other biometric relationships, the integration of datasets between tools, ground truthing, and catchability studies. The development of biometric relationships (e.g., inferring biomass from imagery) from trawl catches can help with integration of data from the different tools, allowing the use of historical datasets from trawls as we move toward less destructive monitoring (Chimienti et al., 2019). Research that directly and empirically compares tools should be prioritized, as more than one tool is required to ground truth data and understand catchability. For example, Pacunski et al. (2016) suggested using both ROV and trawl to assess fish stocks and developing statistical methods to combine data from the two tools. In addition, low detectability of animals due to either their visibility (i.e., cryptic) or observer perception is ignored in most studies (Katsanevakis et al., 2012). Research into methods for sizing erect taxa is also needed, potentially through 3D photo mosaicking (Kwasnitschka et al., 2013; Bennecke et al., 2016; Gerdes et al., 2019). Lastly, imagery appears to have greater catchability for sessile fauna compared to trawl, yet is more time-intensive to process. Thus, research into automation of image processing will also greatly benefit future deep-sea research (e.g., Lacharité et al., 2015).
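One way such a biometric relationship could be built is sketched below in Python: a power-law length-weight model, W = a·L^b, is fitted by least squares on log-transformed measurements of trawl-caught specimens and then applied to lengths measured in imagery. The power-law form, the function names, and all numbers are assumptions made for illustration; the values are synthetic (generated from known parameters) and are not data from this study. The approach also presupposes that lengths can be measured reliably in imagery, which was not the case for erect taxa here.

```python
import math


def fit_allometry(lengths_cm, weights_g):
    """Least-squares fit of log(W) = log(a) + b * log(L); returns (a, b)."""
    xs = [math.log(length) for length in lengths_cm]
    ys = [math.log(weight) for weight in weights_g]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = math.exp(mean_y - b * mean_x)
    return a, b


def predict_weight_g(length_cm, a, b):
    """Predicted weight for one individual of a given length."""
    return a * length_cm**b


# Synthetic "trawl" specimens generated from known parameters (illustrative only).
TRUE_A, TRUE_B = 0.01, 2.5
trawl_lengths_cm = [5.0, 10.0, 20.0, 40.0]
trawl_weights_g = [TRUE_A * length**TRUE_B for length in trawl_lengths_cm]

a, b = fit_allometry(trawl_lengths_cm, trawl_weights_g)

# Hypothetical lengths measured from imagery; summed to an estimated biomass.
image_lengths_cm = [8.0, 15.0, 30.0]
estimated_biomass_g = sum(predict_weight_g(length, a, b) for length in image_lengths_cm)
print(f"a = {a:.4f}, b = {b:.2f}, estimated biomass = {estimated_biomass_g:.1f} g")
```

With real specimens the fit would carry residual error, so predicted biomass should be reported with an uncertainty estimate rather than as a single value.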

Conclusion
Overall, imagery tools appeared to better capture epibenthic fauna than a trawl and provided more informative datasets that can allow for various follow-up analyses, such as on spatial structure and species associations. We found evidence that drop cameras may be better than ROVs at capturing both abundance [Actiniaria (O.) sp. 23 and Pennatula sp. 2] and diversity (morphospecies accumulation curves at LC5) of some taxonomic groups, possibly due to their higher image resolution and catchability for some species. However, more research is needed to understand the catchability of all these tools, and allow for better interpretation and integration of datasets, to ensure effective sampling in deep-sea environments. Catchability studies are essential to address whether we are effectively and quantitatively capturing our target species or ecological attributes, to ensure high data quality and accurate representativeness.
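For readers unfamiliar with morphospecies accumulation curves, the following Python sketch shows one common way to compute a sample-based curve: cumulative richness is averaged over random orderings of the samples (here, images). This is a generic illustration and not necessarily the procedure used in this study; the function name, the number of permutations, and the toy image contents are assumptions.

```python
import random


def accumulation_curve(samples, n_permutations=100, seed=0):
    """Mean cumulative morphospecies richness versus number of samples.

    `samples` is a list of sets, one set of morphospecies labels per image.
    The sample order is shuffled `n_permutations` times and the cumulative
    richness at each position is averaged across shuffles.
    """
    rng = random.Random(seed)
    totals = [0.0] * len(samples)
    for _ in range(n_permutations):
        order = list(samples)
        rng.shuffle(order)
        seen = set()
        for i, sample in enumerate(order):
            seen |= sample
            totals[i] += len(seen)
    return [total / n_permutations for total in totals]


# Toy images with hypothetical morphospecies labels (not data from this study).
toy_images = [{"A"}, {"A", "B"}, {"C"}, set(), {"B", "D"}]
curve = accumulation_curve(toy_images)
print([round(value, 2) for value in curve])
```

A curve that rises more steeply for one tool indicates that diversity accumulates faster per sample with that tool, which is the comparison made between the drop camera and ROV at LC5.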

DATA AVAILABILITY STATEMENT
The datasets related to ROV and drop camera presented in this article are not readily available because they are part of ongoing thesis work by SD. The trawl data analyzed in this study were obtained from Fisheries and Oceans Canada (DFO). Requests to access the trawl data should be directed to Fisheries and Oceans Canada (DFO) at info@dfo-mpo.gc.ca. Requests to access the imagery datasets should be directed to sarah.de.mendonca@dal.ca.

AUTHOR CONTRIBUTIONS
SD conceptualized the study. SD and AM designed the ROV and drop camera sampling. SD led ROV and drop camera imagery collection, with assistance from those in acknowledgments. SD and those in acknowledgments conducted the analysis of imagery and data. Both authors contributed to the writing of the manuscript.

FUNDING
Funding was provided by an NSERC grant to AM. SD was funded by scholarships from the Natural Sciences and Engineering Research Council of Canada, the Nova Scotia Graduate Scholarship program, and the Faculty of Graduate Studies, Dalhousie University. This research is sponsored by the NSERC Canadian Healthy Oceans Network and its Partners: Department of Fisheries and Oceans Canada and INREST (representing the Port of Sept-Îles and City of Sept-Îles). This research is part of an ongoing graduate thesis.