Ferries and Environmental DNA: Underway Sampling From Commercial Vessels Provides New Opportunities for Systematic Genetic Surveys of Marine Biodiversity

Marine environmental DNA (eDNA) is an important tool for biodiversity research and monitoring but challenges remain in scaling surveys over large spatial areas, and increasing the frequency of sampling in remote locations at reasonable cost. Here we demonstrate the feasibility of sampling from commercial vessels (Mediterranean ferries) while underway, as a strategy to facilitate replicable, systematic marine eDNA surveys in locations that would normally be challenging and expensive for researchers to access. Sixteen eDNA samples were collected from four fixed sampling stations, and in response to four cetacean sightings, across three cruises undertaken along the 300 km ferry route between Livorno (Tuscany) and Golfo Aranci (Sardinia) in the Ligurian/Tyrrhenian Seas, June-July 2018. Using 12SrDNA and 16SrDNA metabarcoding markers, we recovered diverse marine vertebrate Molecular Operational Taxonomic Units (MOTUs) from teleost fish, elasmobranchs, and cetaceans. We detected sample heterogeneity consistent with previously known variation in species occurrences, including putative species spawning peaks associated with specific sea surface temperature ranges, and increased night time abundance of bathypelagic species known to undertake diel migrations through the water column. We suggest commercial vessel based marine eDNA sampling using the global shipping network has potential to facilitate broad-scale biodiversity monitoring in the world’s oceans.


INTRODUCTION
Environmental DNA (eDNA) is an important tool to support biodiversity research and monitoring but challenges remain on how to scale eDNA surveys for assessments over large spatial areas, and to make it feasible to increase the frequency of sampling in remote locations at reasonable cost (Pawlowski et al., 2018). Such upscaling is important for generating high resolution biodiversity surveys needed for conservation planning, or impact assessments of human activities (Wetzel et al., 2018;Bani et al., 2020). Sampling design is often subject to logistical and financial constraints, which may mean data collection is opportunistic or has limited spatial and temporal resolution, introducing uncertainty into the interpretation of species presence/absence records (Menegotto and Rangel, 2018). However, understanding the environmental drivers of species distributions and interactions, requires systematic sampling over large spatial scales (Carstensen, 2014;Hale et al., 2018). eDNA has been suggested as a tool to improve the spatiotemporal resolution of biodiversity surveys and can offer the advantage of detecting communities of species from a single sample through the use of universal primer sets targeting taxa of interest (e.g., MiFish, Miya et al., 2015) and high throughput sequencing (HTS) (Stat et al., 2017). To date, studies in the marine environment have typically focused on detecting community differences at small spatial scales in coastal environments or comparing between regions using point-based samples (Port et al., 2016;Bakker et al., 2017;O'Donell et al., 2017;Yamamoto et al., 2017). Developing strategies to facilitate sampling in offshore pelagic environments could enhance the contribution of eDNA to large scale marine biodiversity surveys. This is particularly important for marine megafauna, where large distributions and dispersal capabilities of species mean that such taxa may have partial, low-resolution coverage from conventional techniques, while being a high priority for conservation planning due to their vulnerability and exposure to anthropogenic pressures (Hooker et al., 2011). In the case of marine mammals, eDNA studies have typically focused on single species assays (Foote et al., 2012;Baker et al., 2018;Hunter et al., 2018;Székely et al., 2021), or they have been found as serendipitous detections in metabarcoding surveys when using fish-specific primers . Developing metabarcoding approaches to reliably detect marine mammal eDNA in assays targeting marine vertebrates communities would increase efficiency and the scaling up of ecosystem level surveys and monitoring (Foote et al., 2012;Baker et al., 2018;Székely et al., 2021).
This study presents the first report, including novel sampling protocols, on using commercial marine ferry routes as a platform to systematically collect eDNA samples. Ferry routes allow sampling over the large spatial scales required for marine conservation planning and offer a cost-effective way for researchers to access remote offshore areas on a regular basis (Arcangeli et al., 2013;Matear et al., 2019). Since ferries typically follow set routes on routine schedules, they can be used for repeatable transect sampling over time (Arcangeli et al., 2013). Ferry-based sampling also allows the collection of samples at any time of the day (vessel schedule allowing), opening the opportunity to assess diel variation in marine community composition in environments that may not be accessible to smaller research vessels at night due to operating constraints. As a test case, we selected a route in the central Mediterranean basin between Livorno (Tuscany, Italy, Southern Ligurian Sea) and Golfo Aranci (Sardinia, Italy, Northern Tyrrhenian Sea) which crosses through the Pelagos Sanctuary for Mediterranean marine mammals (Notarbartolo-di-Sciara et al., 2008). We profile samples using two novel "universal" marine vertebrate barcode markers employed here for the first time with marine environmental samples (Valsecchi et al., 2020), with the aim of investigating the potential of eDNA collected from commercial vessels to detect sample composition variation related to small scale spatial and temporal processes, such as diel variation and environmental factors such as bathymetry and sea surface temperature (Berry et al., 2019;Djurhuus et al., 2020). While we focus here on ferries as a test case, in principle the sampling approach can be deployed from any commercial vessel, opening up the possibility of sampling for marine eDNA across the global shipping network.

Sample Collection
Water samples were collected from the Mega Express III, a 212 m long vessel of the Corsica/Sardinia Ferries fleet, while underway along the route from Livorno (N 43 • 33 3 , E 10 • 18 0 ) to Golfo Aranci (N 40 • 59 24 , E 9 • 37 12 ) and on the return leg (hereafter LiGA route, see Figure 1). Samples were collected at four fixed sampling stations (FSS), selected to be evenly spaced along the 300 km route, and to cover sites of ecological interest with different environmental characteristics (bathymetric profile, distance from shore etc.). FSS 1 and 2 were in shallower coastal waters up to approximately 80 m deep, with FSS 2 being close to the port of Golfo Aranci and major sea-enclosure based fish farms. FSS 3 and 4 were in deep offshore waters adjacent to the continental shelf (approximately 550-850 m) in the Pelagos marine mammal sanctuary (Figure 1, Box 1, and Supplementary  Table 1). Additional samples were taken opportunistically when cetaceans were sighted by a marine mammal observer team conducting visual surveys (see below). The latter, by necessity, were collected only during daylight hours, while fixed stations were sampled whenever the ferry crossed their location. Sampling was carried out during June and July 2018, on three return cruises, 15 days apart (18th-19th June, 2nd-3rd July, 16th-17th July 2018; Supplementary Table 1). A sample naming convention was adopted as "LiGAx.y" -where "x" is the cruise number and "y" -sampling station: 1, 2, 3, 4 for fixed stations; S1, S2 etc., for sequential opportunistic sighting samples within cruises.
A total of 16 samples were collected: 12 from the FSSs and four during marine mammal sighting events (Figure 1 and Supplementary Table 1). Half (n = 8) were collected during daylight and the remainder at night. Three fixed sampling stations (1, 3, and 4) were sampled with different light regimes over the three ferry trips according to the vessel schedule (FSS1: 2 day, 1 night; FSS3 and FSS4: 1 day, 2 night). All three FSS2 samples were collected at night.
Supporting data for environmental variables, such as sea surface temperature (SST), chlorophyll concentration (CPH), salinity and bathymetry, covering sampling locations and dates (Supplementary Table 1) were obtained from the EU Copernicus Marine Service (2020a,b,c). Distance from shore (DFS) of sampling points was calculated in ArcGIS (ESRI, 2011).
To confirm that sample sites were environmentally differentiated, a principal components analysis was conducted using DFS (km), depth (m), SST ( • C), CC (mg/m 3 ), and salinity (PSU) in R (R Core Team, 2013) for each sample location. Each variable contributed to differentiation of sample FIGURE 1 | Bathymetric map of the study area showing the ferry route between Livorno and Golfo Aranci, and the sampling sites for this study (colored squares are fixed stations -LIGA1, 2, 3, 4-and circles correspond to opportunistic sampling associated with cetacean sightings). Inset map shows location of sampled route relative to the whole Mediterranean Sea. See Supplementary Table 1 for full details of sampling sites. environmental profiles, with the first two principle components accounting for 85.5% of the variation between profiles, and suggested that FSS 1 and 2 samples, and the samples from the remaining sites respectively, group according to their environmental characteristics (see Supplementary Table 2

Cetacean Observations by the FLT Mediterranean Monitoring Network
A marine mammal observer team from Istituto Superiore per la Protezione e la Ricerca Ambientale (ISPRA), part of the Fixed Line Transect (FLT) Mediterranean monitoring network 1 , traveled on each cruise. During daylight hours the ISPRA team recorded sightings of all cetacean species, and communicated encounters to the eDNA team in the vessel's engine room, who immediately initiated water sample collection. The monitoring was conducted by professional observers positioned on both sides of the vessel bridge using a standard protocol based on distance sampling (ISPRA, 2015). A dedicated GPS was used to automatically record the survey track, marking the beginning/end points and the locations of sightings. Visual monitoring was only carried out when the wind strength was less than 3 on the Beaufort scale.

Water Sampling Procedure
Water samples were collected from the ferry engine room, via a derivation pipe intercepting marine cooling water upstream of the engine. The water intake was located at 4.5 m below the sea surface. Before each sample was collected, the derivation pipe was opened allowing seawater to flow for 3-5 min to rinse the pipe with "local" water, thus removing residual water from the previous sample. A total of 13 liters of seawater was collected from each sampling site and was decanted directly from the derivation pipe into sterile foil laminated plastic "Bag-in-the-Box" (BiB) containers (supplied by KRCA 2 ; see Supplementary Figure 2). The BiB containers were subsequently used for sample storage and transportation.
The time to fill each BiB varied according to sea conditions and experience. On some occasions, the water coming out of the tap contained air which increased the time to fill the bag. BOX 1 | Standard Operational Procedure (SOP) for Commercial Vessel Transect eDNA Sampling.
• (1) VOLUME OF SAMPLED WATER. It would be good practice to filter large volumes of marine waters (up to membrane saturation), in order to retain as much eDNA as possible. Such a volume is however variable, depending on filter characteristics and on water density (e.g., day-time samples saturate the filters quickly, being rich in phytoplankton). According to our experience, from his study and in the analysis carried out in controlled environment (Valsecchi et al. 2020), we suggest the processing of 4-5 litres of marine water per filter.
• (2) FILTER POROSITY. We did not find any significant difference between the three tested NC (nitrocellulose) filter types with porosity 0.22, 0.45, 0.8 µm. However, we suggest to exclude the 0.22 µm pore-size membrane, as filtration is very slow, and saturation is reached after 2-3 litres, without providing a better quantity/quality eDNA. Between the two remaining filter types, we recommend the use of 0.45 µm pore-size membranes, in order to retain the smallest biological particles, consistentl with findings by Li et al. (2018).
• (3) NUMBER OF REPLICATES sample replicates are necessary for both a) increase the total amount of eDNA retrieved from each single sampling station (useful for future analyses) and b) to reduce the false positive and negative rate inbuilt in the metabarcoding technique (Ficetola et al., 2015). Thus, a minimum of three replicas per station is advisable (meaning a total of 12-13 litres collected from each sampling station).
• (4) SAMPLE CONTAINER. The Bag in the Box Sampling System (BiBSS) presents many advantages for the collection/preservation/storage of marine water samples for eDNA surveys (see Box 2).
• (5) SAMPLING STATION DESIGN. The selection of the geographic positions to locate fixed sampling stations (FSS) invariable over cruises, should aim to: (1) sample spots of biological interest based on previous observational/literature data; (2) prioritize points on bathymetric maps indicating habitat changes (e.g., edge of continental shelf); (3) select roughly equidistant sampling sites (about 35-45 nautical miles apart) along the designated shipping lanes, to cover the whole route and (contingent on vessel schedule) allow the FSSs to be sampled during both day and night time. For the same reason, in order to sample adjacent points at different time of the day, it is recommendable to number the stations following the sampling chronological order, meaning that on the map they will not appear in a consecutive order. For example, if six fix sampling stations are selected, and three will be sampled on the outward journey and three on the return journey, the order along the route, on the map, will be: PortA-FSS6-FSS 1-FSS 5-FSS 2-FSS 4-FSS 3-PortB, with the three underscored sampling stations surveyed in the return journey.
• (6) TIME BEFORE FILTRATION. Preferably the water contained in the BiB bags should be filtered immediately after collection to maximise eDNA retention, and to simplify sample transportation, by avoiding transfer of bulky water samples. However, if this is not possible, sample storage times of 1-2 weeks between collection and filtering is well tolerated, provided that the BiBs are kept at 4 degrees and away from exposure to the sun during transport. It is important to note that water samples should never be frozen to avoid breaking of cellular components that would result in the release of extracellular DNA which is more easily lost in filtration.

Filtration
Most samples were filtered immediately after collection, unless multiple samplings for marine mammal sightings occurred in close succession, resulting in a slowdown of the filtration processing. To minimize this problem, we operated two portable vacuum pumps (Fisherbrand TM FB70155, Fisher Scientific) in parallel. In some instances, the water samples were filtered after returning to the laboratory in Milan. The time elapsed from sample collection to filtration ranged between 0 up to a maximum of 29 days, with a mean of 5 days. Nitrocellulose filters with 0.8, 0.45, and 0.22 µm pore sizes were used on aliquots of the same 13-litre sample to identify the best filtering protocol to use in future ferry eDNA sampling campaigns. For each membrane type, the volume of seawater passed through the filter was adjusted according to the onset of the filter's pore saturation. Thus, 13 liter samples were subdivided into aliquots of 5, 5, and 3 liters filtered with 0.8, 0.45, and 0.22 µm pore size nitrocellulose filters respectively, using a BioSart R 100 filtration system (Sartorius). Immediately after filtration, the membranes were folded on themselves (filteredparticle sides facing each other), wrapped in aluminum foil, labeled, and stored at −20 • C until DNA extraction.
Twelve of the sixteen samples were filtered in three aliquots with all three different membrane pore sizes. Four samples (LiGA1.4, LiGA3.1, LiGA3.4, and LiGA3.S1), were filtered only with the 0.8 µm and the 0.45 µm membranes. Filtration duration varied among samples and filter type. The processing of five liters of marine water through the 0.8 µm filtering membranes took from 1 h 11 to 8 h 28 (average 4 h 52 ); the same volume of water processed through the 0.45 µm filters took from 1 h 21 to 10 h 54 (average 6 h 39 ), while the filtration of three liters of marine water through the 0.22 µm took from 3 h 40 to 14 h 14 (average 8 h 34 ).

DNA Extraction, Amplification, and Sequencing
DNA was extracted from filters between 0 and 31 days after filtration, with a mean of 14 days. Extractions were done using a DNeasy PowerSoil Kit R (Qiagen), following the manufacturer's protocol.
The target taxa for this study were all marine vertebrates, including cetaceans, so the DNA samples were amplified using MarVer1 (12S) and MarVer3 (16S) metabarcoding primers (Valsecchi et al., 2020). These primers were explicitly designed to increase efficiency for amplification of cetacean DNA whilst retaining the ability to amplify amplicons from teleosts, elasmobranchs, chelonians, and birds (Valsecchi et al., 2020). For each locus, seven independent PCR negative controls were included, which used sterile ultrapure water (Sigma Ltd.) instead of an eDNA sample. The library for locus MarVer1 was sequenced in a 150 bp paired-end lane, and locus MarVer3 in a 250 bp paired-end lane, using an Illumina MiSeq sequencer at the University of Leeds Genomics Facility, St James's Hospital, as described in Valsecchi et al. (2020). The raw short read sequence data for this study is deposited the NCBI BioProject repository, accession number (PRJNA751540).

Initial Quality Filtering and Annotation of Amplicon Sequences
The bioinformatics workflow was as described by Valsecchi et al. (2020). First, paired reads were screened for the presence of the expected primer and index sequence combinations to exclude off-target amplicons. Reads were then combined to generate insert sequences, and potential PCR duplicates and chimeric sequences were removed. The insert data was then aggregated to create a counts matrix containing the occurrence of each unique sequence in each sample. The taxonomic origin of each insert was determined by Blasting their sequence against a local instance of the GenBank NT database (Nucleotide 3 ). The level of homology of the insert to the hit sequence was noted, as was the species name of the hit sequence. The taxonomic hierarchy for each unique insert was generated by searching a local instance of the ITIS database (ITIS 4 ) with the annotated GenBank species name. The count matrix and taxonomic hierarchy for all annotated unique sequences were then aggregated into values for equivalent Molecular Operational Taxonomic Units (MOTUs), by combining all inserts with a set homology (≥98%) to the GenBank hit at a specified taxonomic level (i.e., "Order, " "Family, " "Genus, " or "Species"). The workflow description is available at http://www.dna-leeds.co.uk/eDNA/, while code and executable files used for the analysis are hosted at https://github.com/ msjimc/eDNA).

Relation Between Read Counts and Filter Type/Processing and Sampling Regimes
Pearson's product moment correlation was used to evaluate read count recovery in relation to time elapsed from sampling to end of filtration, and time elapsed from filtration to extraction. Differences in read counts obtained between the three membrane pore-sizes, and for daylight vs. night time sampling were assessed using Kruskal-Wallis tests. Subsequently, data from libraries with different filter sizes for the same sample were merged for further analyses.

Evaluation of Potential Contamination and Ambiguous MOTU Assignments
Following an approach similar to that suggested by McKnight et al. (2019), MOTUs attributable to potential contamination were identified by comparing the ratio of mean read counts for each MOTU in marine samples, vs. the mean read counts of the same MOTU in the seven PCR negative controls. MOTUs with a ratio of less than 20:1 (Sample:Control) were excluded, except for 10 cases where the ratio was greater than 5:1, and the MOTU was not related to any species previously handled in the analyzing laboratory, and/or the ratio for the alternate marker was greater than 20. Excluded MOTUs were primarily amplicons attributable to taxa present in environmental samples from Genoa aquarium which were analyzed in the same sequencing runs (Valsecchi et al., 2020). In addition, MOTUs from humans, cow, pig, dog, chicken, turkey, and other taxa which were related to control DNAs from Valsecchi et al. (2020) were removed. For MarVer1 and MarVer3, 8.6 and 3.6% of reads respectively were excluded under these criteria (see Table 1 and Supplementary Tables 4, 5).
Amplicon sequences for MOTUs annotated as species not recorded for the Mediterranean [based on the Fishbase catalog 5 ] were evaluated individually against related sequences in Genbank according to the scheme shown in Supplementary Figure 3. Such MOTUs were then either reclassified as derived from a Mediterranean taxon, or highlighted as a potential newly described record or introduced non-indigenous species (NIS; see Supplementary Table 3).

Analyses of Community Composition and Sample Diversity
Summaries and visualizations of read counts for different taxonomic levels were generated using the R package "Phyloseq" (McMurdie and Holmes, 2013). Habitat type and trophic levels for fish species were downloaded from www.fishbase.se. Associations between read counts of specific MOTUs, or other summary statistics such as measures of Alpha diversity for samples, and environmental variables were assessed through Pearson's product-moment correlation.

Read Counts in Relation to Sample Categories
After quality filtering, a total of 606,544 and 1,021,207 reads for MarVer1 and MarVer3 respectively, were assigned to taxa with Genbank references ( Table 1). No significant correlation was found between the number of reads recovered from each sample and the time span from sampling to filtration for MarVer1 (t = −0.0622, df = 42, p = 0.951, cor = −0.0096), but a weak negative correlation was detected for MarVer3 (t = −2.3787, df = 40, p = 0.0222, cor = −0.3520). Most samples (95.5%) were filtered within 8 days from collection and, within that time window, sample read counts were evenly distributed (Supplementary Figure 4A). A negative correlation was observed for read counts and time from filtration to extraction for MarVer1 (t = −2.8145, df = 41, p = 0.0075, cor = −0.4024), and a positive association for MarVer3 (t = 2.5137, df = 39, p = 0.0162, cor = 0.3734). However, read counts were relatively evenly distributed overall, and the significant correlations may be driven by outlying values (Supplementary Figure 4B). No difference in mean read counts among filter pore sizes was detected for either locus (Supplementary Figure 4C; MarVer1, Kruskal-Wallis

Summary of Taxonomic Assignments
More than 90% of reads were attributed to bony fish species, with the second most represented group being cetaceans, followed by elasmobranchs and birds (  Table 3). These represent cases either where there is no reference sequence available for the corresponding species, or where variation among haplotypes is less than the 2% divergence threshold used to amalgamate MOTUs. These included several tuna MOTUs (particularly Thunnus albacares which was detected at high abundance by both markers), whereas only two tuna species (Thunnus alalunga, Thunnus thynnus) are considered Mediterranean residents. A MOTU assigned to the non-resident Trachurus japonicus, was also detected at high abundance by both markers. The Atlantic (Trachurus trachurus) and Mediterranean (Trachurus mediterraneus) horse mackerels are normally found in the Mediterranean, but for the latter there is no reference sequence deposited, and the Atlantic species Trachurus trachurus differs only by 1 and 4 bp from Trachurus japonicus sequence for MarVer1 and MarVer3 respectively. Similar cases occurred for cetaceans-amplicons annotated as the non-resident Tursiops aduncus were detected by MarVer3, which were taken as originating from T. truncatus (from which it differs by 1 variable site), and reads for Stenella frontalis and S. longirostris were recovered by MarVer3 (2 variable sites) and MarVer1 (1 variable site) respectively, but we assume are derived from S. coeruleoalba. For other MOTUs, where a non-resident could be unambiguously associated with a single resident congeneric, we consider them as originating from that resident species, otherwise, we consider them unresolved at the genus level. Ultimately four and three MOTUs remained unrelated to known Mediterranean taxa for MarVer1 and MarVer3 respectively. However, these instances were all low abundance amplicons accounting for only 110 (0.0181%) and 38 (0.0037%) reads for MarVer1 and MarVer3 respectively. None of these MOTUs scored more than 10 reads in a single sample. For MarVer1 the species were: Cololabis saira (6 reads), Dentex tumifrons (56 reads), Engraulis japonicus (47 reads) and Gasterochisma melampus (one read); and for MarVer3: Cyclothone atraria (31 reads), Dentex canariensis (6 reads) and Lutjanus fulvus (1 read). Given the low abundance of these MOTUs, further interpretation would be speculative as to whether they represent artifacts, possible detections of new alien species or novel resident taxa. However, if future studies corroborate the incidence of these taxa, then our data should be reviewed as early detection of the species.

Fish Community Composition
The two markers showed high consistency in the overall relative abundances of detected teleost taxa, with eight genera in common among the 10 most abundant for each marker ( Table 2). Sample profiles were dominated by anchovy (Engraulis spp.) and sardine (Sardina spp.) MOTU reads, accounting for 32.8% and 33.8%, and 32.2% and 24.1% of teleost reads for MarVer1 and MarVer3 respectively. Overall the 10 most abundant genera accounted for 98.8% and 94.5% of reads for MarVer1 and MarVer3. Intermediate trophic level predatory fish (trophic level 3) were the most commonly detected species type (greater than 75% total reads; Figures 2-4), but top-level predators (trophic level 4 in Figure 3) including Garfish (Belone spp.; detected by MarVer3 only), Tuna (Thunnus spp.), and Bullet/Frigate tuna (Auxis spp.) were also among the top 10 most abundant taxa. Amongst the less abundant MOTUs, detections of eDNA from other large predatory fish included swordfish (Xiphias gladius), and ocean sunfish (Mola mola), which were found by both markers in multiple samples and from different cruises. With respect to habitat type (as defined in www.fishbase.se), MOTUs from pelagic-neritic species were most abundant (greater than 60% reads), followed by pelagic-oceanic, reef associated and bathypelagic (Figure 3).
Species compositions showed considerable spatiotemporal variation across samples. For example, Belone MOTUs were detected at high abundance in fixed station samples LiGA1.2-1.4 by MarVer3, but were rare at the same locations for cruises 2 and 3, sampled 15 and 30 days later respectively (Figures 2, 4). Mediterranean rainbow wrasse (Coris julis; Family Labridae) was detected at high abundance by both markers at fixed station 4 for cruise 2 and 3 (samples 2.4 and 3.4), but had lower abundance in other samples. However, the relative patterns of abundance for the most common taxa were consistent between the two loci (Figures 2, 4).
The relative abundances of anchovy (Engraulis spp.) and sardine/sardinella (Sardina/Sardinella spp.) MOTUs showed a reciprocal occurrence pattern across different samples for both markers, such that high abundances of Engraulis MOTUs corresponded with low abundances of sardine/sardinella and vice versa (Figures 2, 4). Sardina showed low abundance (less than 10%) in samples for Cruise 1 (mid-June), but predominated in eight of 12 samples from Cruises 2 and 3 (July). These patterns showed a significant association with sea surface temperatures across a range of 22.7-26 • C, with Engraulis showing a negative relationship, and Sardina a positive correlation (Table 3 and Supplementary Figure 6).
Both markers also detected eDNA from several lantern fish species (Myctophidae family) including Ceratoscopelus maderensis, Lampanyctus crocodilus, Hygophum hygomii, Hygophum benoiti, and Myctophum punctatum. These are bathypelagic species which undertake a vertical diel migration into the epipelagic zone at night. Read abundances for these and other bathypelagic species showed increases in several of the nocturnal sample collections, notably in cruise 2, which coincided with a near full moon (88% full, Figure 2 and

Detection of Cetofauna
Traces of cetacean eDNA were detected for eight of the 16 sample sites with MarVer1 and from all sites (fixed and sightings) with MarVer3 ( Figure 5 and Supplementary  Tables 4, 5). The eDNA signals were weak, with 97% and 87% of species-sample combinations for MarVer1 and MarVer3 respectively yielding fewer than 100 reads. Both primer sets detected amplicons attributable to bottlenose dolphins (Tursiops truncatus), striped dolphins (Stenella coeruleoalba), and the fin whale (Balaenoptera physalus). MarVer1 also detected sperm whale (Physeter macrocephalus) eDNA in the nocturnal sample LiGA2.4 (03/07/2018, h03:09). All four species had at least one detection with a minimum of 50 read counts, for one or both markers. Reads for striped dolphin were the most abundant for both markers, with a maximum per sample abundance of 120 for MarVer1 and 1,098 for MarVer3. Cetacean eDNA was detected in all four visual sighting samples for MarVer3 and at LiGA2.S1 and LiGA2.S2 for MarVer1. In these two samples, both markers recovered more than 100 reads and matched the species detected by visual observations. Mean read counts were higher for visual sighting samples than fixed stations for both loci, and for

Intra and Inter-Sample Diversity
Alpha diversity measures for MOTUs (Shannon and Inverse Simpson; Table 4) were highest for both markers in samples from Cruise 2, and fixed station 2 (proximate to the port of Golfo Aranci and fish farms) had the highest diversity on each Cruise. Mean diversities for night time samples were higher than those collected in daylight. Mean diversity measures varied significantly across cruises for the Inverse Simpson measure (MarVer1: Kruskal-Wallis χ 2 = 7.3419, df = 2, p = 0.02545; MarVer3: Kruskal-Wallis χ 2 = 7.989, df = 2, p = 0.01842), but differences in means for other comparisons were not significant, nor were correlations between sample diversities and read count, sea surface temperature, chlorophyll concentration, and collection track length.
A hierarchical cluster analysis of Bray-Curtis distances among samples (Supplementary Figure 7) identified all the sighting samples clustering together for MarVer3, and other paired clusters of night-time, or within-cruise samples were observed. However, the study was not designed to directly evaluate environmental or ecological variables as drivers of community composition, so such patterns should be treated cautiously. Nonetheless, this provides some evidence for heterogeneity and structure in species composition among samples.

DISCUSSION
In this study, we evaluated the feasibility of using large commercial vessels, such as ocean-going ferries, as a platform  for eDNA sampling to support marine biodiversity studies. Metabarcoding results from mitochondrial markers targeting vertebrates recovered diverse community profiles matching known Mediterranean biodiversity, including teleost fish, elasmobranchs, birds and cetaceans. We also detected inter-sample variation consistent with previously identified spatiotemporal ecological patterns. This suggests that eDNA sampling from commercial vessels is capable of yielding high quality data relevant for biodiversity surveys and spatial ecological research.

Advantages of Sampling From Commercial Vessels
Scaling marine eDNA studies to support surveys over large spatial scales, or increasing the temporal frequency of sampling efforts is limited by logistical and financial constraints arising from the high costs and access to survey vessels for use offshore (Bani et al., 2020;Sigsgaard et al., 2020). Using commercial vessels as a platform potentially provides the following advantages: (1) Access to remote offshore locations: commercial vessels make regular regional or transoceanic voyages, traveling 100 s or 1,000 s of kilometers offshore. Open sea areas are often of high biological relevance but are difficult to access with the smaller vessels commonly available to researchers, or which require large dedicated oceanography/fisheries research vessels with high operating costs (Abdulla et al., 2009); (2) Repeatability: commercial vessels, and ferries in particular, follow specific routes with high traffic frequency, thus allowing repeated sequential sampling along the same tracks. In principle this allows for transect based study designs, and facilitates temporal comparisons ranging from days to seasons and years for specific locations; (3) Easy diurnal sampling: commercial vessel scheduling allows for routine operation at night, which may not be feasible for smaller port-based research vessels that have operating restrictions during hours of darkness; (4) Synchronicity: different routes can be surveyed concurrently, allowing co-ordinated simultaneous sampling over ocean basins; (5) Linear sampling: sampling from vessels allows collection of eDNA over tracks of 3-4 km (vs. single point sampling), which may be advantageous for some applications, by increasing the amount and diversity of eDNA recovered; (6) "Emission (N), night sample; S1, S2, cetacean sighting samples.
free" surveys: sampling takes place as an addition to existing journeys, rather than as specifically commissioned research voyages, therefore no extra emissions are attributable to the sampling procedure; (7) Cost effectiveness: commercial vessel platforms can drastically cut the expense of eDNA sampling, since they remove the need to operate dedicated research vessels, and operators may be willing to host researchers and equipment at no or nominal cost; (8) Increased public awareness around conservation issues: participation in eDNA surveys offers opportunities for scientific outreach activities on marine conservation with ferry (vessel) companies and, in turn, the wider public. The potential for survey vessel disturbance to influence species detectability of some marine megafauna via eDNA could be a point of concern, but evaluating the importance of this effect is beyond the scope of the current study. eDNA signatures may persist in sea water for variable time, and can be influenced by water movements, meaning that the eDNA collected could be deposited prior to the vessel passage, and therefore the samples' species eDNA composition may not be directly influenced by disturbance (Collins et al., 2018). In future, integrating data from visual megafauna surveys such as ACCOBAMS 6 , with larger scale eDNA surveys could help explore this question in more depth. 6 https://accobams.org/asi-data-presentation/

Methodological Considerations
For convenient acquisition of eDNA samples from large vessels, we advocate using engine cooling-water taken from the external environment. Most large vessels will have water intakes which can be intercepted by researchers. This can provide samples "on demand, " while the vessel is underway, without the need for deployment of additional external equipment. We suggest a Standard Operational Procedure (SOP; see Box 1) employing aluminized plastic storage bags [Bag-in-Box Sampling System (BiBSS); Box 2], for convenient water sampling and storage. Allowing the intake water to flow for 5-8 min before the actual collection ensures that collections will reflect the eDNA profile at the intended point of sampling. Samples are derived from water collected along the track for the duration of the collection window, which may extend for several kilometers at typical vessel cruising speeds. The collection window length is determined by the vessel speed, the flow rate of water drawn from the cooling system, and the speed at which this fills the BiBSS. Adjusting the flow rate may allow for samples to be collected over shorter or longer track lengths, depending on study requirements. In the current study, the water intake was situated 4.5 m below the surface. Vessels with larger drafts will have correspondingly deeper intakes, but the intake position limits sampling to surface waters, and therefore is less flexible than sampling at different depths during a dedicated research cruise. The importance of this constraint will depend on the study research focus, and behavior of taxa of interest. However, in the current study we detected eDNA from species from multiple depth zones. The extent to which this represents mixing of water across depth strata, and the mechanisms which influence the distribution of eDNA remain to be explored.

Marker Choice
The taxonomic focus of this study was on marine vertebrates, which, as intermediate and top level predators, are important indicators of the status of marine ecosystems (Hazen et al., 2019). We employed two universal marine vertebrate primer sets, recently developed by our team (Valsecchi et al., 2020). This work reports their first use with environmental samples collected at sea, and corroborates their previous successful validation with marine aquarium samples (Valsecchi et al., 2020). Both MarVer1 [12S rDNA, targeting a similar region to the MiFish primers (Miya et al., 2015)] and MarVer3 (16S rDNA) simultaneously recovered fish and cetacean amplicons, but the taxonomic coverage of the two loci was not fully overlapping. MarVer3 detected almost twice as many teleost species as MarVer1, but not all species found with MarVer1 were also detected by MarVer3 (Tables 1, 2  and Figures 2, 4). However, there was good concordance between the two markers for the most abundant MOTUs.
Both primers successfully detected cetacean species (Figure 5), but only striped dolphins were consistently detected concurrently. MarVer1 identified sperm whale presence in sample LiGA2.4, while fin whales were identified by both loci, but in non-overlapping samples. This suggests that while both markers have the capability to amplify all four cetacean species, as with other studies, stochasticity still influences taxon BOX 2 | Advantages of the Bag-in-Box Sampling System (BiBSS) in marine eDNA surveys.
• (1) CONVENIENCE. BiBs are easy to be carried when empty, as opposed to bulky and expensive sterile solid bottles (12 vs. IBiB); • (2) SAMPLE HOMOGENEITY. BiBs promote sample homogeneity: 12-litres sample stored in 12 different bottles would inevitably imply sampling-error biases due, for instance, to the time elapsed between the filling of the first and the last bottles. Such sampling error could be even more conspicuous in linear sampling, as carried out from a moving platform; BiB containers are made of several layers of metallised film, allowing the preservation of sample in light-free environment until filtration. This drastically reduces both eDNA degradation due to UV light (Strickler et al., 2015) and the growth of phytoplankton communities after sampling and before filtration; • (4) SPILL-FREE DISPENSER. The BiB tap spring dispensers allow pouring into the filtering cylinder (100 ml capacity in our case) small volumes of water sample at the time, without spilling; • (5) OIL-FREE SAMPLES. BiBs prevent oil traces -more and more frequently found in the seas worldwideto contaminate the filtering membrane. Petroleum compounds are lighter than seawater and therefore float at the surface. In contrast to sampling bottles, where the superficial layer of water is the first to be poured out and filtered, BiBs allow to filter the lower layers of the water sample first, as the dispenser is positioned at the bottom of the bag (while eventual oil traces would migrate to the top of the bag); • (6) RECYCLABLE. After use, BiB sacks can be disposed as recyclable plastic waste. detection, reinforcing the importance of replication with a combination of primer sets (Shelton et al., 2016;Sawaya et al., 2019;Djurhuus et al., 2020).
The success of MOTU annotation in all metabarcoding studies is dependent on the accuracy and coverage of reference databases for the chosen loci (Hestetun et al., 2020). In this study, most gaps in taxonomic overlap between the two loci are likely to be due to lack of reference sequences. The previous in silico assessment of MarVer1 and MarVer3 (Valsecchi et al., 2020), indicated that the present coverage of teleost fish for the MarVer1 region is less extensive than for MarVer3, corresponding with the smaller number of MOTUs assigned for MarVer1. However, our annotation pipeline would assign a MOTU to Genus or Family level, if no match within the homology threshold was found at the species level, providing relevant reference sequences were available. Garfish (Belone belone) was found in abundance by MarVer3 in the first cruise (e.g., representing more than 50% of total reads in sample LiGA1.3; Figure 4), but was absent among MarVer1 MOTUs. There are no deposited sequences for the genus Belone for the MarVer1 region, but references for three other genera in the same family (Belonidae) are available. However, these were all Pacific or IndoPacific species (Ablennes hians, Tylosurus acus melanotus, Strongylura anastomella) and none of them matched with any of the amplicons produced in our study, suggesting that if Belone belone amplicons were actually produced using MarVer1, they would have probably remained unannotated due to the lack of reference sequences for the nominal and related taxa. The rapid uptake of eDNA surveys means that reference database coverage can be expected to improve quickly, but some coverage variation dependent on the popularity of different barcode targets will persist (Porter and Hajibabaei, 2018). Taxonomic biases also exist, with many poorly studied groups continuing to be underrepresented in reference databases (Porter and Hajibabaei, 2018;Wangensteen et al., 2018). Improving the latter point will require support from traditional taxonomic approaches (e.g., morphological descriptions) to link sequence information with species descriptions (Cognato et al., 2020).
Polymorphism levels within barcode target regions can also influence the accuracy of annotation (Alberdi et al., 2018). Where variation among species falls below the threshold for MOTU aggregation (98% homology for this study, equating to 4-5 variable sites), automated annotation may not be able to resolve some taxa. In this study, both MarVer1 and MarVer3 had MOTUs assigned to species not considered to be resident in the Mediterranean (e.g., tuna species other than Atlantic and Albacore) due to this issue, which required additional manual curation to resolve.
Elasmobranchs may be underrepresented in our data. Amplicons from rays accounted for 0.04 and 0.033% of total reads for MarVer1 and MarVer3 respectively, and there were no detections of reads from shark species. These results are consistent with other recent studies which suggest cartilaginous fish species are hard to detect as their eDNA is rare enough to be overwhelmed by teleost eDNA when using universal primers (Stoeckle et al., 2020). The role of eDNA concentration in elasmobranch detection has also been highlighted in aquarium tests where both MarVer1 and MarVer3 detected the three nektonic cartilaginous fish species swimming in the shark tank, while failing to detect the three benthic elasmobranch species (the southern and round stingrays and the longcomb sawfish) (Valsecchi et al., 2020). Elasmobranch detection could be enhanced by the use of specific primers for cartilaginous species (Kim et al., 2021).

Spatiotemporal Ecological Signatures in eDNA Profiles
We used four fixed sampling stations (each sampled three times), selected for different environmental profiles, and four ad hoc samples collected in response to cetacean sightings.
The main objective was to test if sampling from ferries could detect sample composition heterogeneity potentially related to environmental variation. Quantifying how environmental parameters influence community compositions in different niches more generally, would require a larger spatiotemporal sample set, and while our sites could be differentiated by their environmental profiles, patterns of site clustering relative to measures of beta diversity were inconsistent. Similarly, no significant associations between sample alpha diversity and environmental variables were detected, but this may reflect lack of statistical power. However, with the current sample sizes, we detected other patterns of inter-sample variation consistent with spatiotemporal ecological patterns known from conventional survey approaches.
A well-recognized behavior of bathypelagic species is the diel migrations made as they rise from depth to feed in surface waters at night, particularly in association with lunar cycles (Shima and Swearer, 2019). We detected increased abundance of reads from bathypelagic taxa, notably lantern fish (family Myctophidae) in night samples from Cruise 2, which was close to the full moon (Supplementary Table 2). This behavior would also predict increased species diversity in nocturnal samples as bathypelagic species enter the surface layer, mixing with resident taxa. We observed increased alpha diversity for night time samples with both loci, although the difference was not statistically significant (possibly due to low power arising from the limited sample size). Night time samples also had significantly higher read abundance for both markers, potentially reflecting increased biomass and eDNA load in water collected at night. This suggests that circadian vertical movements of organisms and/or water masses can bring molecular signals from lower layers to surface waters, and that in some cases sampling of surface waters alone may allow eDNA to characterize community composition representative of different habitats (Suter et al., 2020;West et al., 2020). Thus, it is advisable to collect both day and night-time samples for offshore surveys. This also highlights the capability of eDNA to detect real time changes in biodiversity profiles, reflecting aspects of species behavior, like diel cycles, which change on short time scales.
The majority of reads for both loci were derived from teleost fish species and overall species composition were consistent with taxa expected for the Mediterranean. Noticeable heterogeneity was observed among sites, and temporal samples at the same location, despite the study covering a relatively small geographic area (approximately 300 km transect length), and close sampling time intervals (3 collections spaced 15 days apart in June-July). Consistent with a previous study in the Bay of Biscay (Fraija-Fernández et al., 2020), reads from anchovy (Engraulis sp.) and sardine (Sardina sp.) taxa were the most abundant MOTUs. These species also showed a striking temporal pattern in relative abundance with the former being predominant in the first cruise (18th-19th June 2018) and the latter dominating in the two July cruises (2nd-3rd and 16th-17th July 2018). Mediterranean anchovy populations are known to preferentially spawn with SST below 25 • C matching the significant negative correlation observed for Engraulis read count abundance and SST in this study (Palomera et al., 2007). Reads from the Mediterranean rainbow wrasse (Coris julis), also showed a strong spatiotemporal pattern, being detected at high abundance in night-time samples from station 4 on cruises 2 and 3 (Figure 4). Coris julis is typically described as a benthic species so a high abundance in pelagic samples is potentially paradoxical; however, the species' eggs are pelagic and peak spawning occurs in June coincident with our survey period (Quignard and Pras, 1986;Alonso-Fernández et al., 2014). These observations suggest that some of the strongest fish signals we detected could be related to spawning events rather than to schooling fish (Sigsgaard et al., 2017;Sevellec et al., 2020). It is unclear whether this could also be the case also for the Belone belone peak detected in the June cruise, as it is epipelagic and its spawning time is also compatible with the period of our sampling campaign (Tsikliras et al., 2010).
Lastly, FSS 2 returned the highest sample alpha diversity on each cruise (although the differences were not significant). Intriguingly this site is close to a major sea-enclosure aquaculture development, farming sea bass (Dicentrarchus labrax) and seabream (Sparus aurata). It has previously been suggested that nutrient input into the water arising from the farms could influence ecosystem structure, increasing local fish diversity and attracting top level predators such as bottlenose dolphins (Días López et al., 2008;Días López, 2012)-a suggestion consistent with our observation of elevated eDNA alpha diversity at the site. However, we did not see increased abundance of amplicons attributable the farmed species or cetaceans relative to other sites.

Cetofauna Detection
To date, marine mammal sequences have been detected only occasionally within eDNA samples screened for fish , or with vertebrate specific primers (Port et al., 2016) or a combination of both (Sigsgaard et al., 2020). Other eDNA studies targeting cetaceans using species-specific primers suggest that cetacean eDNA is hard to detect, even in close spatial or temporal proximity to the source animal. For example, the molecular detection of harbor porpoise (Phocoena phocoena) eDNA was observed to diminish at distances greater than 10 m from a netted sea pen hosting four harbor porpoises, and in the open sea, harbor porpoises were detected in only one of the 24 samples, collected in eight sites known to be attended via a static acoustic survey (Foote et al., 2012). Killer whale (Orcinus orca) eDNA was detected in water samples collected from the wake of killer whale pods up to 2 h after the encounter (Baker et al., 2018). However, more than half of the sampled encounters produced no or only weak detections, indicating that detection probabilities were low for samples not collected in close temporal association to the animals (Baker et al., 2018). Similarly, bowhead whale (Balaena mysticetus) eDNA was found to diminish substantially in water samples collected as little as 10 min after the whale presence (Székely et al., 2021). The reported challenges in detecting cetacean eDNA suggest that the results from our pilot study using novel primers optimized to for cetaceans are encouraging (Valsecchi et al., 2020), given that a) they allow for amplification of all vertebrate taxa, meaning low-prevalence eDNA could be overwhelmed by high abundance targets (Rojahn et al., 2021); and b) metabarcoding is less sensitive to detecting rare signals than species specific qPCR assays (Harper et al., 2018;Qu and Stewart, 2019).
Our survey detected four common cetacean species present in the Mediterranean at multiple sites across all three cruises, associated with both visual sighting events and fixed stations. While the limited sample sizes for sampling stations and cruises do not permit formal statistical comparison, trends in the patterns of read abundances were consistent with some reported aspects of cetacean ecology. Firstly, mean read abundances were greater in sighting samples for the two dolphin species, suggesting a potential influence on detection rates of spatiotemporal proximity to animals, fitting those reported in earlier studies (e.g., Székely et al., 2021). Across sites, mean cetacean read abundances were greater for fixed stations 3 and 4 compared to stations 1 and 2. Stations 3 and 4 have bathymetry greater than 200 m and lie close to the continental shelf edge, as did the sighting locations. Across cruises, Cruise 2 had the highest cetacean read counts. Cruise 2 coincided with a near full moon and returned the highest MOTU alpha diversity for both markers. Finally, the highest read abundances for fin and sperm whale were from night-time samples from fixed stations on Cruise 2, and mean abundances for the dolphin species were higher at night when sighting samples were excluded.
Cetacean distributions are closely aligned with prey distributions. Presence in deeper waters close to continental shelves, are a frequently observed pattern, as nutrient rich upwellings can drive concentration of prey species (Shaff and Baird, 2021). Similarly, diurnal variation in activity and use of different depth zones, sometimes interacting with the lunar phase, has also been reported (Shaff and Baird, 2021). Increased use of surface waters (<20 m) by fin whales (Balaenoptera physalus) in the Southern California Bight at night has been observed (Keen et al., 2019). It is noteworthy that lantern fish form a significant component of Mediterranean striped dolphin diet (Dede et al., 2016), with the Madeira lantern fish (Ceratoscopelus maderensis) being the second most abundant prey recorded (after Diaphus spp., also belonging to Myctophidae family). Ceratoscopelus maderensis, Hygophum hygomii and Hygophum benoiti lantern fish reads were extremely high on the night of July 2nd 2018, coincident with when the largest numbers of cetacean reads were recorded. These potential links between eDNA and environmental covariates are tantalizing and deserve deeper exploration in future work. These results also suggest that in some cases, taking into account target species ecology, such as increased night-time activity, or responses to lunar phase, may help cetacean eDNA researchers enhance detection rates.

Conclusion and Future Prospects
This study demonstrates the feasibility of sampling from commercial vessels (Mediterranean ferries) while underway as a strategy to support replicable, systematic marine eDNA surveys in locations that would normally be challenging and expensive for researchers to access. The ISPRA FLT cetacean monitoring network currently covers 12 fixed routes around the Mediterranean Sea during all seasons. Therefore, the methodological approach could be extended to other seasons and areas of the Mediterranean basin. eDNA samples are in principle straight forward to collect, meaning that non-specialist professionals, or amateur researchers have potential to contribute to such studies via "citizen science" initiatives, opening the possibility to expand biodiversity surveys across large sea areas and to create the first international marine Mediterranean eDNA bank. This could establish an archive of material for future studies, supporting current work on biodiversity monitoring, and research targeted at specific species of commercial and conservation interest [e.g., Mediterranean monk seal, Monachus monachus (Valsecchi et al., 2021)], and invasive species monitoring. This novel approach for the study of the Mediterranean seascape, will bridge existing biodiversity gaps (e.g., low coverage of pelagic marine communities) and contribute to wildlife management and marine conservation planning. For example, most newly described endemic, cryptogenic or alien species detected in the Mediterranean have been from coastal waters (Chartosia et al., 2018;Katsanevakis et al., 2020). This notion could imply that the number of invasive species in the Mediterranean is currently underestimated and future research needs to focus on offshore waters.
The approach presented in this pilot study could also be extended to any large vessel and to any sea area. While our study focused on Mediterranean ferries, other commercial vessel types such as container ships, tankers, and passenger liners could all offer similar sampling opportunities on any shipping route globally. The high volume and global distribution of commercial vessel traffic mean that such platforms could make an important contribution to the future of ocean monitoring though eDNA, especially with automation of sampling and analysis workflows.

DATA AVAILABILITY STATEMENT
The data presented in the study are deposited in the NCBI BioProject repository, accession number: PRJNA751540.

ETHICS STATEMENT
Ethical review and approval was not required for this study since it relies solely on noninvasively collected marine water samples.

AUTHOR CONTRIBUTIONS
EV conceived the project idea, designed sampling strategy, and developed the Standard Operational Procedure. AA coordinates the "Fixed Line Transect Network" (using ferries as platform of observation for monitoring mega and macro marine fauna and main threats) and made possible intermediation with the ferry company. RL conducted sample collection, eDNA extraction, and molecular biology experiments. EB assisted in data analysis and environmental parameters' retrieving. IC was responsible for the NGS facility where samples were run and dealt with the biostatistics pipeline to analyze the HTS data. PG provided logistic support and contributed in funding acquisition. SG led the lab where eDNA samples were processed and NGS libraries were prepared, and performed Phyloseq analysis of read abundance and sample composition. EV and SG wrote the original draft. EB and AA contributed to manuscript review and editing. All authors have read and agreed to the published version of the manuscript.

FUNDING
This work was partially supported by a crowd funding campaign "MeD for Med" (2020-CONT-0312 -BICOCCA UNIVERSITA' DEL CROWDFUNDING -MED FOR MED), and other nongrant funds held by SG and EV. RL's participation, including travel to the University of Leeds to carry out laboratory work, was made possible by the ERASMUS C program (2017-1-IT02-KA103-035644). EB was supported by a University of Leeds funded Ph.D. scholarship.

ACKNOWLEDGMENTS
We would like to thank Ummey Hany from the Leeds Institute of Molecular Medicine sequencing facility for support on preparation of NGS libraries. We would also like to thank Corsica and Sardinia Ferries, in the person of Cristina Pizzutti, for welcoming our proposal of attempting eDNA sampling from their fleet, with a special thanks to the whole crew of the MegaExpress 3 for its availability and support.