Towards remote surveillance of marine pests: A comparison between remote operated vehicles and diver surveys

Early detection of marine invasive species is key for mitigating and managing their impacts on marine ecosystems and industries. Human divers are considered the gold standard tool for detecting marine invasive species, especially when dive teams are familiar with the local biodiversity. However, diver operations can be expensive and dangerous, and are not always practical. Remote operated vehicles (ROVs) can potentially overcome these limitations, but it is unclear how sensitive they are compared to trained divers for detecting pests. We assessed the sensitivity and efficiency of ROVs and divers for detecting marine non-indigenous species (NIS), including the potential for automated detection algorithms to reduce post-processing costs of ROV methods. We show that ROVs can detect assemblages of invasive species comparable to those detected by divers, but with lower detection rates (0.2 NIS min⁻¹) than divers (0.5 NIS min⁻¹), and that ROVs covered less seafloor than divers per unit time. While small invertebrates (e.g., skeleton shrimp Caprella mutica) were more easily detected by divers, the invasive goby Acentrogobius pflaumii was only detected by the ROV. We show that implementation of computer vision algorithms can provide accurate identification of larger biofouling organisms and reduce overall survey costs, yet the relative costs of ROV surveys remain almost twice that of diver surveys. We expect that as ROV technologies improve and investment in autonomous and semi-autonomous underwater vehicles increases, many of the current inefficiencies of ROVs will be mitigated, yet practitioners should be aware of limitations in taxonomic resolution and the strengths of specialist diver teams.


Introduction
Control or eradication of invasive species is often dependent upon detecting populations when they are small and restricted in distribution (Myers et al., 2000;Bax et al., 2001;Inglis et al., 2006). To enable early detection, surveillance programmes for marine pests often focus on the highest risk pathways, such as ports, harbours, and marinas (Wotton and Hewitt, 2004;Lehtiniemi et al., 2015;Tamburini et al., 2021;Hatami et al., 2022). One of the most efficient and effective methods for visual detection and confirmation of marine Non-Indigenous Species (NIS) is the use of divers (Peters et al., 2019), yet various health, safety, and physiological limitations can greatly limit where and when divers can be safely deployed. Harbour environments can be some of the busiest, most polluted, and most dangerous environments for deploying human divers (Barsky, 2006). Alternative methods are required to underpin post-border surveillance in increasingly busy and highly regulated marine spaces. Remote operated and autonomous underwater vehicles (AUVs) (Zereik et al., 2018) could complement or replace diver operations, yet their biases and limitations for marine pest detection are not understood.
Underwater remote operated vehicles (ROVs) have been available for marine surveys for over three decades (Capocci et al., 2017). The utility of ROVs is widely recognised in a range of scenarios, including marine science, exploration, and construction (Zereik et al., 2018). Adaptation of unmanned aerial vehicle (UAV) technology to ROVs has led to a large increase in the range of affordable models that are small, mobile, and relatively easy to operate, thereby increasing their availability to consumers and researchers (Buscher et al., 2020). ROVs are now routinely used for biodiversity surveys (Lam et al., 2006;Pacunski et al., 2008;Andaloro et al., 2013;Boavida et al., 2016), including the use of incidental imagery captured during non-scientific operations (Macreadie et al., 2018;McLean et al., 2019). The collection and retention of imagery by remote vehicles provides additional opportunity to be scrutinised by multiple trained taxonomists or automatically processed by trained algorithms (Woodall et al., 2018).
There are currently no widely accepted standards or protocols for applying ROVs to post-border surveillance for unwanted marine pest species, yet there are many examples of their use (Sammarco et al., 2010;Wells, 2011;Arthur et al., 2015;Peters et al., 2019). In fact, the use of "free-flying" ROVs (Davidson et al., 2006a;Davidson et al., 2006b;Floerl and Coutts, 2011) and hull crawling robots (Caccia et al., 2010;Eich et al., 2014) for vessel inspections is widely accepted. However, there were limitations to their effectiveness, particularly their ability to identify and obtain specimens of suspect organisms. Significant advances have been made since these studies, and removal capabilities are now available on many ROV platforms (Mazzeo et al., 2022). Moreover, AUVs are increasingly capable of the close-range imaging surveys necessary for pest detection (Bonin-Font et al., 2016;Gutnik et al., 2022). An assessment of the utility of ROVs for replacing or complementing biosecurity surveys is therefore needed, especially as the benefits of autonomous vehicles are increasingly realised.
Despite the potential of ROVs, current evidence suggests surveys undertaken with them are not as efficient as surveys using SCUBA (Self-Contained Underwater Breathing Apparatus) divers, particularly compared to diver collections (Peters et al., 2019). Peters et al. (2019) detected larger numbers of NIS on vessel hulls at reduced costs when combined samples were collected by divers compared to visual identification by ROV. Identifying small organisms (e.g., < 10 mm) from video feeds will be dependent on proximity of the camera to surfaces of interest, turbidity of the water, and camera resolution. However, even the most advanced systems are unable to fulfil the requirements to classify many groups of organisms (Horton et al., 2021). While current remote operated imaging systems are unlikely to match specimen-based taxonomy, ROVs provide a useful tool for identifying a large range of macro-organisms (Beisiegel et al., 2017) and fish communities (Raoult et al., 2020), and can complement current methods (e.g., divers) in dangerous scenarios (e.g., high current, dangerous marine animals, in busy ports, at depths beyond 20 m).
Critical assessments of new technologies are necessary to ensure their sensitivity as a diagnostic tool (MacAulay et al., 2022). There is, however, a need to overcome surveillance bottlenecks associated with physiological limitations of human divers, workforce limitations, and declining taxonomic expertise (Cook and Coutts, 2017). Combined, these limitations highlight the potential value of remote operated camera systems and trained Artificial Intelligence (AI) detectors as a mechanism to augment human efforts, both in the field and in the laboratory (Mohanty et al., 2016;Jothiswaran et al., 2020). Despite the benefits of ROVs (and increasingly AUVs) for surveillance of marine pests, for this potential to be realised these tools must meet the three pillars for gold standard diagnostics: high sensitivity; low cost; and speed (MacAulay et al., 2022). The application of real-time computer vision algorithms at broad taxonomic levels could greatly improve the speed of ROV surveillance by reducing the need for postprocessing (Wäldchen and Mäder, 2018), but whether these tools can meet the three pillars of gold standard diagnostics remains relatively unknown.
To ensure ROVs are applied appropriately to post-border marine surveillance, they must have similar detection sensitivity as divers and achieve comparable coverage for similar costs. Here, we assess the relative sensitivity and detection rates of visual surveys by divers and an ROV in post-border surveillance of macroorganisms in Aotearoa New Zealand. We note that this differs from camera surveys by divers, which like ROV surveys, require post-processing. We also assess the relative coverage and the relative costs of each method across gradients of turbidity, including potential cost savings associated with automated computer vision algorithms for processing ROV video.

Methods
To test the sensitivity and efficiency of ROVs in a marine surveillance context we deployed an ROV alongside SCUBA divers at three locations in Aotearoa New Zealand subject to targeted surveillance for NIS: Kaipara Harbour (North Island/Te Ika-a-Maui), Nelson Harbour (South Island/Te Waiponamu), and Lyttelton Harbour/Whakaraupō (South Island/Te Waiponamu) (Figure 1). The surveillance in Kaipara Harbour, Nelson Harbour, and Lyttelton Harbour/Whakaraupō is based on a biannual survey programme (National Marine High Risk Site Surveillance (NMHRSS) programme). The established number of NIS at each location, as determined by a mix of dedicated baseline surveys (i.e., near full census of marine biodiversity), biannual surveillance (e.g., NMHRSS), sporadic surveillance, and expert verified citizen science records (Seaward et al., 2015), is reported for each site (Figure 1). Due to the wide range of methods, in some cases deployed biannually for > 20 years, comparing the number of NIS detected in this study to total NIS detected at each location is not necessarily appropriate.
The existence of a strong turbidity/visibility gradient in Kaipara Harbour was used to assess the implications of water visibility on survey efficiency ( Figures 1A-C). We tested three broad hypotheses: 1) increasing turbidity will greatly influence survey cost per unit area for both ROV and diving methods; 2) the sensitivity (i.e., the ability to detect NIS accurately), detection rate (i.e., the number of NIS detected per unit time), and the taxonomic composition of species observations will vary between methods; and 3) survey costs per unit area will be greater for ROV methods but development of automated detection algorithms will improve relative cost differences.

Survey methods
The ROV used for this study (Boxfish ™ ROV, Boxfish Research Ltd) is relatively small (length 60 cm, height 30 cm, and width 40 cm), and is equipped with high-powered LED lights, a Sony ™ RX100 (V) camera capable of recording in 4K (30 fps) and four parallel scaling lasers. The ROV is relatively lightweight (<30 kg) and was deployed from small vessels by a team of 2-3 people depending on operating conditions. For example, in conditions that allowed the vessel to be anchored or berthed at pontoons/wharfs, a two-person team was sufficient for safe operations, but during unsecured operations a three-person team was required to safely operate the vessel, the ROV, and manage the ROV's tether.
The SCUBA diving personnel used for this study have experience in marine biosecurity surveillance across ports and harbours throughout New Zealand (Woods et al., 2018). These specialist scientific diving teams were regularly trained on the identification of high-risk marine pests, were familiar with many of the NIS present in New Zealand and maintain a high level of knowledge of native species present throughout New Zealand. Divers used in this study had a minimum of five years' experience. Furthermore, more experienced divers (> 10 years' experience) were always paired with less experienced divers. Although the same divers were not used for every dive in every region (on account of dive profile management and regional diver availability), the experienced diver was always denoted "Diver 1" while the less experienced diver was denoted "Diver 2". Like ROV surveys, dive teams used artificial lighting (e.g., high powered LED torches) to illuminate their surroundings and aid in the identification of organisms.
Divers had a primary objective of detecting nine target marine pest species, five that have yet to be detected in New Zealand (the Northern Pacific seastar Asterias amurensis, the European shore crab Carcinus maenas, the green seaweed Caulerpa taxifolia, the Chinese mitten crab Eriocheir sinensis, and the Asian clam Potamocorbula amurensis), and four established pest species (the Asian date mussel Arcuatula senhousia, the droplet tunicate Eudistoma elongatum, the Mediterranean fanworm Sabella spallanzanii, and the clubbed tunicate Styela clava). However, as a secondary objective, divers are tasked with detecting non-target NIS known to be in New Zealand waters (e.g., the Asian paddle crab Charybdis (Charybdis) japonica, the colonial ascidian Didemnum vexillum, and the kelp Undaria pinnatifida) and any suspect organisms thought to be new to New Zealand (Woods et al., 2018). A selection of these species, and other NIS commonly found in New Zealand, are shown in Figure 2. When comparing the species detected by each method, we separately analysed the detection profiles for total NIS observed (i.e., primary and secondary objectives) and the target subset of organisms (i.e., the primary objective of nine target species and the three non-target species).
A key aspect of the survey protocol that these programmes have implemented for 20 years was the focus on detecting new incursions of NIS, not enumerating densities or abundance of established NIS (Woods et al., 2018). This strategy was implemented to ensure that human observers did not become overwhelmed counting abundant NIS but maintain mental capacity to observe and detect primary or secondary target species. Therefore, we assessed the efficacy and efficiency of divers and ROVs at detecting the presence or absence of NIS and do not report densities or abundances. However, we briefly report on whether NIS were identified by single specimens or abundant populations.

Visibility dependent survey coverage
Kaipara Harbour ( Figure 1) presented a strong turbidity gradient, from the upper arms of the harbour (turbid) to the harbour entrance (clear). We used this gradient to compare the swath of benthos sampled by divers and the ROV at high visibility (> 3 m secchi), moderate visibility (1.5 m secchi) and low visibility (0.8 m secchi) ( Figures 1A-C). Two sites were investigated at each turbidity range. At each site all NIS were noted, and the time and distance covered recorded. The width of seafloor within clear view of the ROV and SCUBA divers was estimated using parallel lasers to calculate the average frame width (ROV), and half the secchi disk measurement taken at the surface (divers). While it is not appropriate to assume that the benthos could be accurately sampled by divers at the full length of the secchi measurement, we considered that half the secchi depth was a conservative swath width of human divers. Wharf piles were surveyed at locations of high and low turbidity (no wharf piles were found in regions of moderate turbidity), and the times taken to cover these features were also recorded. All NIS detected were enumerated, and any suspected NIS or unknown native species were sampled by SCUBA divers. Samples were preserved according to the taxon to which they belonged (as identified by trained parataxonomists) and sent to specialist taxonomists for formal identification.
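The swath-width rules described above can be illustrated with a minimal Python sketch. It assumes the diver swath is half the secchi depth (as stated) and that the ROV frame width is scaled from the pixel distance between two scaling lasers of known physical spacing. The laser spacing and pixel values below are hypothetical, since the text does not report them, and the function names are illustrative only.

```python
def diver_swath_m(secchi_m):
    """Conservative diver swath width: half the secchi depth (as in the text)."""
    if secchi_m <= 0:
        raise ValueError("secchi depth must be positive")
    return secchi_m / 2.0


def rov_frame_width_m(frame_px, laser_gap_px, laser_gap_m):
    """Frame width scaled from the laser dots.

    laser_gap_m is the physical laser spacing (hypothetical here; the
    source does not state it), laser_gap_px its apparent size in the frame.
    """
    if laser_gap_px <= 0:
        raise ValueError("laser gap in pixels must be positive")
    return frame_px * laser_gap_m / laser_gap_px


def area_covered_m2(swath_m, distance_m):
    """Area surveyed along a transect of the given length."""
    return swath_m * distance_m


# Illustrative values only: 0.8 m secchi, 50 m transect.
low_vis_area = area_covered_m2(diver_swath_m(0.8), 50.0)
```

Under these assumptions, a single diver at 0.8 m secchi would cover a 0.4 m swath, so a 50 m transect corresponds to roughly 20 m² of benthos.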
The average point-to-point distance covered per unit time, by the ROV and divers, was estimated from the difference between the start and end Global Positioning System (GPS) coordinates and the time it took to cover the distance. The GPS coordinates were collected with a Lowrance ™ HDS-16 chart-plotter with horizontal accuracy < 5 m.
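As an illustration of the point-to-point calculation, the following Python sketch derives distance from start and end coordinates and divides by elapsed time. The haversine formula and the mean Earth radius are assumptions of this sketch; the text does not state which distance formula was used.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius; an assumption of this sketch


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two points in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def mean_speed_m_per_min(start, end, minutes):
    """Average point-to-point distance covered per minute.

    start and end are (latitude, longitude) tuples in decimal degrees.
    """
    if minutes <= 0:
        raise ValueError("elapsed time must be positive")
    return haversine_m(*start, *end) / minutes
```

Because this is a straight-line estimate between two fixes, it will understate the path actually travelled wherever the ROV or divers deviated from a direct line, and the stated horizontal accuracy of < 5 m at each fix is proportionally more important on short transects.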
Sensitivity, efficiency, and taxonomic composition of each method
ROV and SCUBA surveys were completed at all three locations (Figure 1). The 14 sites (Kaipara, four sites; Nelson, six sites; and Whakaraupo/Lyttelton, four sites) surveyed included a range of natural and artificial habitats including rocky riprap/reef, soft sediments, man-made pontoons, and wharf piles. Sites were predetermined and were assessed by ROV first to avoid missing specimens collected by divers. SCUBA surveys were completed no more than two weeks after ROV surveys. Sampling was completed during daylight hours at each location. Sampling in Kaipara Harbour was completed during the week 20-24 May 2019; between 18-26 June 2019 in Nelson; and between 1-15 July 2019 in Whakaraupo/Lyttelton Harbour.
At each site, a defined area was sampled in line with the NMHRSS programme (Seaward et al., 2015), either 50 m benthic transect, 50 m pontoon structures, or ten wharf piles were sampled, and the time taken for each method recorded. Divers searched the prescribed area (e.g., linear distance travelled or prescribed number of wharf piles) in "buddy" pairs, swimming the same route which included observations of both overlapping and discrete habitats. Furthermore, all NIS observed by divers or ROV were recorded. The sensitivity of each method (i.e., the proportion of NIS, sampled by each diver and the ROV, compared to the total NIS sampled by all methods) and the detection rate (i.e., number of NIS sampled per minute) were calculated for each site.
Sensitivity (Equation 1) was calculated as the total number of NIS (number of species, not number of individuals) found at that site by each method ($M_s$) compared to the total number of NIS observed by all methods (i.e., both divers and ROV) at that site ($S$).

Equation 1:
$$\text{Sensitivity} = \frac{M_s}{S}$$

Detection rates (Equation 2) were calculated as the total number of NIS detected by each method ($M_s$) divided by the time taken to complete the survey ($T_{\text{min}}$). Detection rate was expressed as NIS detected per minute.

Equation 2:
$$\text{Detection rate} = \frac{M_s}{T_{\text{min}}}$$

Observations of NIS were recorded in situ by divers. However, ROV videos were processed following completion of field campaigns by trained parataxonomists with no prior knowledge of the observations made by the divers. In cases where NIS were suspected, but were unable to be confirmed via specimens, confirmation by diver collected specimens was used to confirm presence. We note that without performing a full biodiversity census at each site we were unable to assess the inability of both methods to detect NIS present within a site.
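The two metrics in Equations 1 and 2 can be sketched in Python as follows. Species are treated as sets so that counts are of species rather than individuals, as the text specifies. The species lists and survey time in the example are invented for illustration and are not results from this study.

```python
def sensitivity(method_species, pooled_species):
    """Equation 1: species found by one method (M_s) over species found by all methods (S)."""
    pooled = set(pooled_species)
    if not pooled:
        raise ValueError("no NIS were detected by any method at this site")
    return len(set(method_species) & pooled) / len(pooled)


def detection_rate(method_species, minutes):
    """Equation 2: species found by one method (M_s) per minute of survey (T_min)."""
    if minutes <= 0:
        raise ValueError("survey time must be positive")
    return len(set(method_species)) / minutes


# Hypothetical single-site example (not data from the study).
diver1 = {"Sabella spallanzanii", "Styela clava", "Undaria pinnatifida"}
rov = {"Sabella spallanzanii", "Acentrogobius pflaumii"}
pooled = diver1 | rov  # S: all NIS seen by any method at the site

rov_sensitivity = sensitivity(rov, pooled)
rov_rate = detection_rate(rov, 10.0)
```

Because S is the pooled list from the methods themselves rather than a full census, sensitivity computed this way is relative: a value of 1 means a method found everything any method found, not everything present.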
Species assemblages at each location and for each method (including separate detections from each diver) were used to identify differences in taxonomic profiles detected by each method. The influence of substrata, method, and region on NIS composition was examined. We analysed the total richness of NIS detected by both methods as well as the subset of primary target species to ensure that additional capability to review ROV video did not bias the detection of additional species compared to divers.

Relative survey costs and AI augmented post-processing
We assessed the relative efficiencies and efficacy of diver and ROV methods, including the costs of manual footage review compared to artificial intelligence (AI) augmented processing of ROV video. This was done by summing the time required to cover 1 hectare (ha) of seafloor and multiplying this by the number of people required for operations. Four people were required to complete diving operations, while three people were required for ROV operations. ROV surveys required post-processing of video to analyse the presence of NIS and this time was added to the total time costs of ROV surveys. Video processing time was added to the survey cost at the same ratio as the footage collected (i.e., one hour of benthic video = one hour post-processing). We note that equipment (e.g., ROV, dive gear, vessels) and consumables (e.g., fuel) were not factored into calculations of costs. We focus on the personnel costs of completing surveys after investing in the appropriate equipment.
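A minimal Python sketch of this person-hours calculation is given below. It assumes that post-processing is done by a single reviewer and is expressed as a ratio of review time to footage time (1.0 for full manual review, as stated in the text). The area rate in the example and the idea of a reduced ratio for AI-assisted review are hypothetical illustrations, not values from the study.

```python
HECTARE_M2 = 10_000.0


def field_hours_per_hectare(area_rate_m2_per_min):
    """Hours of field time needed to cover one hectare at a given area rate."""
    if area_rate_m2_per_min <= 0:
        raise ValueError("area rate must be positive")
    return HECTARE_M2 / area_rate_m2_per_min / 60.0


def person_hours_per_hectare(area_rate_m2_per_min, crew, review_ratio=0.0):
    """Field time multiplied by crew size, plus video review time.

    review_ratio is hours of review per hour of footage; a single
    reviewer is assumed (an assumption of this sketch).
    """
    field = field_hours_per_hectare(area_rate_m2_per_min)
    return field * crew + field * review_ratio


# Hypothetical rate of 100 m^2 per minute for both methods.
dive_cost = person_hours_per_hectare(100.0, crew=4)
rov_manual_cost = person_hours_per_hectare(100.0, crew=3, review_ratio=1.0)
rov_assisted_cost = person_hours_per_hectare(100.0, crew=3, review_ratio=0.25)
```

With equal area rates, the three-person ROV crew plus one hour of review per hour of footage costs the same as a four-person dive team, so under these assumptions any cost difference between methods comes from the area covered per unit time and from how much review time is avoided.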
To demonstrate the applicability of AI detectors we trained and tested a computer vision algorithm against expert marine biosecurity specialists. Here, we examined the potential for computer vision algorithms to recognise and discriminate a key unwanted, yet widely present invasive species of tubeworm, Sabella spallanzanii, including testing against a similar native tubeworm Pseudobranchiomma grandis (both Family Sabellidae; Figure 3). S. spallanzanii has been shown to be damaging to ecosystem services and marine industries (Soliman and Inglis, 2018;Atalah et al., 2019;Douglas et al., 2020;Tait et al., 2020). Halting regional spread of species such as S. spallanzanii is a key goal of regional and central government agencies across New Zealand and Australia (Cunningham et al., 2019;McDonald et al., 2020).
We explored the potential for automatic NIS detection in survey videos (collected with the Boxfish ™ ROV), using AI to identify a NIS already present in New Zealand. We trained a YOLOv3 AI model to detect the non-indigenous tubeworm, S. spallanzanii and the indigenous tubeworm P. grandis. This classifier was chosen because of its suitability for object detection, and in particular its utility for real-time detection.
YOLOv3 (You Only Look Once version 3), is a real-time object detection algorithm that identifies specific targeted objects in videos (as a post-process). During training, YOLOv3 iteratively learns the features required to accurately identify its target and, after each iteration, discards any information that does not improve the accuracy. To train the YOLOv3 model we used a small dataset of approximately 100 individual images of non-indigenous tubeworm S. spallanzanii. For testing, we used independent ROV video that had not been used for training.
From our selected ROV survey video, we fed a set of 100 frames (set A) into the trained detector. When the detector predicted the presence of the target (S. spallanzanii) anywhere within the frame, it drew a coloured box around (annotated) each instance of the detected target and saved the set of 100 annotated frames (set B) for later analysis. Set A was given to one of two experts to manually count and record the number of individual S. spallanzanii that were visible in each frame. Set B was given to the second expert to count and record any AI-annotated detections in each frame. We then carried out a comparison between the instances of S. spallanzanii manually counted in set A and those automatically detected in set B. While the trained detector algorithm included both indigenous and non-indigenous tubeworms, only the invasive S. spallanzanii was present in the test video.

Data analysis
Variation in sensitivity and detection rates between methods (including between divers) were analysed using one-way ANOVA, including post-hoc Tukey tests. Diagnostic plots were used to check for outliers and homoscedasticity before Tukey tests were performed. Statistical tests were done in R Studio (RStudio Team, 2020).
Composition of NIS was analysed using principal coordinate analysis (PCA) and permutational ANOVA (PERMANOVA) using the R 'vegan' package (Oksanen, 2007). We chose multivariate analysis to retain species-specific information and identify subtle biases in the detection profiles of each method. To test for and visualise these responses we overlaid the contribution of individual species, using redundancy analyses, on the total species profile of each replicate (i.e., a single site by one method). The Jaccard index was used for calculating dissimilarity indices for the presence-absence data. The combined effects of method (Diver 1, Diver 2, or ROV), survey substrata (rocky reef/riprap, soft sediment, or pontoons), and location (Kaipara Harbour, Nelson Harbour, and Lyttelton Harbour/Whakaraupo) on the composition of NIS detections were analysed using PCA and PERMANOVA. PCA plots included vectors representing the contribution of NIS to detection profiles and frames surrounding treatment groups.
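To make the Jaccard dissimilarity concrete for presence-absence data, here is a small plain-Python sketch. The analysis itself was run with the R 'vegan' package; this block only illustrates the index on hypothetical detection profiles and is not the authors' code.

```python
def jaccard_dissimilarity(profile_a, profile_b):
    """1 - |A intersect B| / |A union B| for two sets of detected species."""
    a, b = set(profile_a), set(profile_b)
    union = a | b
    if not union:
        return 0.0  # two empty profiles are treated as identical (a convention)
    return 1.0 - len(a & b) / len(union)


def dissimilarity_matrix(profiles):
    """Pairwise Jaccard dissimilarities for a dict of {replicate: species set}."""
    names = sorted(profiles)
    return {
        (i, j): jaccard_dissimilarity(profiles[i], profiles[j])
        for i in names
        for j in names
    }


# Hypothetical replicates (one site surveyed by each method).
profiles = {
    "Diver 1": {"Sabella spallanzanii", "Styela clava", "Caprella mutica"},
    "Diver 2": {"Sabella spallanzanii", "Styela clava"},
    "ROV": {"Sabella spallanzanii", "Acentrogobius pflaumii"},
}
matrix = dissimilarity_matrix(profiles)
```

A matrix of this kind is the input to both the ordination and the permutational test: identical detection profiles give 0 and profiles with no shared species give 1.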
Agreement between computer vision detection algorithms and expert observations was analysed with linear regressions. To test the deviation of the regression slope from one (i.e., perfect agreement), the difference between computer vision detections and experts' detections was regressed against expert detections. Significant deviation of this regression slope from zero was used to assess the over- or under-detection of S. spallanzanii by the computer detector.
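The regression of the detector-minus-expert difference on the expert count can be sketched with ordinary least squares in plain Python. The per-frame counts below are invented for illustration, and the function name is hypothetical. A slope of zero corresponds to perfect agreement, and a negative slope indicates growing under-detection as the number of specimens per frame increases.

```python
import math


def ols_slope_test(x, y):
    """Least-squares slope, intercept and t-statistic for H0: slope = 0."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need at least three paired observations")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x values must not all be equal")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    rss = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    std_err = math.sqrt(rss / (n - 2) / sxx)
    t_stat = slope / std_err if std_err > 0 else float("nan")
    return slope, intercept, t_stat


# Hypothetical per-frame counts (not data from the study).
expert_counts = [1, 2, 3, 4, 5]
detector_counts = [1, 2, 2, 3, 3]
differences = [d - e for d, e in zip(detector_counts, expert_counts)]

slope, intercept, t_stat = ols_slope_test(expert_counts, differences)
```

The t-statistic would be compared against a t distribution with n - 2 degrees of freedom to obtain a p-value; that step is omitted here to keep the sketch free of external dependencies.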

Visibility dependent survey coverage
Under decreasing water clarity, the area of seafloor or surface area of structures surveyed per unit time by divers or ROV decreased dramatically (Table 1). This was determined by the area of seafloor/structure visible under a gradient of turbidity, where decreasing water clarity affected the proximity of the ROV to the benthos/structure and therefore the area of benthos clearly visible per linear metre of benthos/structure covered or the number of piles covered per unit time. Since it was necessary for SCUBA divers to dive as buddy pairs for safety reasons, the area of benthos/ structure sampled was doubled, but the area surveyed by a single diver and the ROV was comparable.

Sensitivity, detection rate and taxonomic composition of each method
Combined, visual surveys with ROVs and divers detected 35% of all known NIS present in Kaipara Harbour, 14% of all NIS known from Nelson Harbour, and 15% of all known NIS from Whakaraupo/Lyttelton Harbour. ROV alone detected 28% of all known NIS present in Kaipara Harbour, 8% of all NIS known from Nelson Harbour, and 10% of all known NIS from Whakaraupo/Lyttelton Harbour. The sensitivity of detection was similar for each method, with no statistical difference in the proportion of the total NIS detected by each method (Figure 4A). However, detection rates (i.e., the number of unique NIS sampled per unit time) of NIS were significantly greater for divers compared to the ROV (Figure 4B; Table 2). Tukey post-hoc tests revealed no significant differences in detection rates between Diver 1 and Diver 2, or between Diver 2 and ROV, but Diver 1 had higher detection rates than ROV (Table 2).
The composition of NIS assemblages as detected by each method and at each site were presented in two-dimensional space as Principal Coordinate Analyses (PCA). To visualise the overlap in detection profiles we present duplicate plots highlighting the differences between each method ( Figures 5A, B), various substrates ( Figures 5C, D), and regions ( Figures 5E, F). Principal coordinate plots showed high overlap in the detection profiles between all methods for the full suite of NIS detected ( Figure 5A) and the subset of targeted high-risk species (Figure 5B), although ROV detections captured only a subset of the overall NIS profile observed by divers ( Figure 5A). Unlike survey methods, NIS profiles were distinct across substrate types ( Figures 5C, D), and across regions ( Figures 5E, F). PERMANOVA analysis showed that the species profiles detected by each method were not significantly different, but the species profiles differed significantly between substrata (Table 3).
Species poorly detected by the ROV were relatively small organisms, including the skeleton shrimp Caprella mutica and the hydroid Ectopleura crocea (Table S1). Furthermore, the ROV did not identify any Asian paddle crabs (Charybdis (Charybdis) japonica) in Kaipara Harbour, although these were observed by divers at the same location in sampling 24 hours apart (Table S1). However, the invasive goby Acentrogobius pflaumii was only detected by the ROV and not by divers at the same location one week later (Table S1). It is worth noting that while mobile organisms such as the crab Charybdis (Charybdis) japonica may have simply moved over 24 hours, the tendency of Acentrogobius pflaumii to hide in burrows means it may simply go unnoticed if disturbed.
Area or number of piles also separated by water visibility at the time of inspection, low visibility (0.8 m secchi), moderate visibility (1.5 m secchi), and high visibility (3 m secchi).

Relative survey costs and AI augmented post-processing
The relative costs of diving and ROV assessments in terms of the number of people hours required to cover one hectare (ha) of seafloor showed that ROV methods were 1.8 times more expensive than divers and 2.4 times more expensive than divers if manual imagery review was required (Table 4). Both ROV and diver searches in low visibility environments (0.8 m secchi) were almost four times more expensive than searches in clear water (> 3 m secchi). The level of visibility had little impact on the relative expenses of ROV and diver-based methods (Table 4).
Computer vision algorithms did not misidentify non-indigenous tubeworms as indigenous, or vice versa. The independent test ROV survey video had high numbers of the non-indigenous tubeworms at various distances from the camera, providing a relatively challenging task for automated algorithms and experienced observers. Computer vision algorithms showed high agreement with expert-based video observations for the enumeration of the non-indigenous S. spallanzanii (Figure 6). There was, however, a trend of false negatives by the automated detector. Linear regression of the differential between detection by experts and the detector and the total number of specimens observed by experts showed a significant negative trend (t = -9.7, p < 0.001), indicating that increasing numbers of specimens led to increasing ratios of false negatives by the detector. Higher agreement at lower densities shows that automated detectors provide fewer instances of false positives when presented with low numbers of Sabella spallanzanii. Overall, there were only three instances where computer vision detections exceeded expert detections by a single specimen.

Discussion
FIGURE 4
Comparison of SCUBA divers and ROV for NIS detection. Graph (A) shows the proportion of NIS detected at each site (detection efficacy) and graph (B) shows the detection rate per minute for each method (detection efficiency). Results of Tukey test comparisons shown by letters (a, b), with separate letters indicating statistically significant differences.

Comparisons between scientific SCUBA divers and ROV revealed that divers were more efficient than ROVs in detecting NIS in certain surveillance situations, although there was no [...] (Peters et al., 2019). However, we show that video based ROV surveys compare well to visual diver surveys for the detection of NIS across a range of natural and artificial habitats. Direct comparisons between diver-operated video and remote-operated video for benthic species richness metrics have revealed strong agreement (Biovida et al., 2015). Additionally, assessment of the comparability of ROVs and human snorkelers for identifying fish communities showed that ROVs detected greater abundance and diversity of fish (Raoult et al., 2020). Here we report the detection of an invasive fish Acentrogobius pflaumii by a small ROV, while divers were unable to detect this fish due to its ability to conceal itself in burrows. It is likely that at this low visibility site, A. pflaumii are sensitive to approaching divers and the noise of air exhalation and retreat into burrows before divers can detect them, whereas the quieter ROV can approach close enough without disturbing them.
Video- or image-based analysis will always fall short of the taxonomic resolution outcomes from physical samples (Peters et al., 2019). However, collections of specimens are not always possible, and in many circumstances good quality imagery can provide high-level taxonomic information (Marshall and Evenhuis, 2015). Furthermore, machine learning algorithms are increasingly viable options for detecting organisms (Gaston and O'Neill, 2004;Wäldchen and Mäder, 2018) and we show that implementation of automated detection algorithms can substantially reduce costs associated with manual footage review in a marine biosecurity surveillance context. Care must be taken in the application of imagery alone for identification of NIS (Krell and Marshall, 2017), but the ubiquity of imagery-based techniques in marine systems necessitates that we optimise these products for a range of applications. Here we show that the use of automated detection algorithms in post-processing of video imagery can reduce the total costs of ROV surveys, but overall remote operated surveys were unable to match the efficiencies of dive teams. While it is no surprise that reduced water visibility dramatically increased the cost of surveillance per unit area, the relative costs of diving and ROV surveys were the same.
Despite the current inefficiencies of ROVs, further advancements in these technologies have the potential to approach the efficiencies of divers. Small ROVs are increasingly equipped with high-resolution cameras (e.g., 4K video) and lightweight manipulators (for sample collection). These platforms can also carry acoustic imaging systems (Kim and Yu, 2016), stereo-camera systems (Negahdaripour and Firoozfam, 2006), and sampling systems (Mazzeo et al., 2022) that can provide further confirmation of NIS or improve operations under challenging conditions. Furthermore, with the application of underwater positioning systems and real-time (or near real-time) detection algorithms, development of increasingly autonomous surveillance or analysis can pave the way for automated detection and geolocation of NIS at efficiencies far greater than any current method (Williams et al., 2016). This will also reduce capacity limitations and enable a greater variety of agencies and personnel to perform surveillance activities.
While we stress the power of experienced and trained biosecurity SCUBA divers for marine post-border surveillance, we acknowledge the physical, physiological and workforce limitations of these methods and provide support for the application of ROVs under specific scenarios too dangerous or challenging for divers. Such conditions include regions where dangerous marine animals are present, conditions of high tidal flushing, depths beyond c. 20 m, environments with especially high or erratic vessel traffic, and conditions where pollutants have the potential to affect human divers. We expect that remote operated and autonomous systems will soon become the standard for marine surveillance, and establishing their limitations against gold standard methods will help identify current efficiency bottlenecks. Improvements in the swath-width of remote imaging systems provide a tangible path forward for immediate gains in the efficiency of these systems.

Conclusion
As autonomous technologies increasingly become available for close-range visual imaging, we must ensure that these technologies can be integrated seamlessly into existing monitoring or surveillance programmes. Part of this integration undoubtedly involves a transition from human annotation to computer-vision recognition algorithms, which will enable data-processing/analysis to keep pace with the exponential increase in data collection (Beyan and Browman, 2020). To ensure continuity of existing programmes, biases and limitations of existing or emerging technologies must be identified and quantified. We identify some limitations of camera-based ROV imaging compared to human divers and we present critical parameters required for autonomous or remote operated systems in turbid coastal environments.
Currently, ROVs are unable to achieve the same efficiency as dive teams, but such systems may represent a stopgap while close-range visual imaging AUVs gain improved capabilities for navigating complex coastal environments (Gutnik et al., 2022). AUVs will eventually eclipse both methods in cost effectiveness and efficiency and greatly improve surveillance and monitoring of benthic marine ecosystems. We show that camera-based robotic surveys and integration of computer-vision algorithms can complement human-based surveillance programmes and could integrate additional emerging surveillance technologies (e.g., eDNA; Bowers et al., 2021) to provide platforms with both imaging and sampling capabilities (Yamahara et al., 2019).

Data availability statement
The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.

Author contributions
LT and GI conceived the research. LT, LR, KS, LO, CW, and HL contributed to field and lab data collection, preparation, and analysis. JB contributed to the development of computer vision algorithms and analysis. LT, JB, HL, CW, KS, and GI contributed to manuscript preparation and review. All authors contributed to the article and approved the submitted version.

Funding
This research was supported by the New Zealand Government's Strategic Science Investment Fund (SSIF; COBS2102, COBS2202, CEBS2302) to the National Institute of Water and Atmospheric Research (NIWA) and the Ministry for Primary Industries Marine High Risk Site Surveillance programme (SOW18048). Additionally, field logistics in Kaipara Harbour were supported by Auckland Council and Northland Regional Council.

Publisher's note
All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

Supplementary material
The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fmars.2023.1102506/full#supplementary-material

Supplementary Table 1. Full list of non-indigenous species detected by Diver 1, Diver 2, and ROV across three harbours. Species detected at least once by each method are indicated by a tick, whereas species not detected by that method at any sites are indicated by a cross.