Exploring the Use of Environmental DNA (eDNA) to Detect Animal Taxa in the Mesopelagic Zone

Animal biodiversity in the ocean’s vast mesopelagic zone is relatively poorly studied due to technological and logistical challenges. Environmental DNA (eDNA) analyses show great promise for efficiently characterizing biodiversity and could provide new insight into the presence of mesopelagic species, including those that are missed by traditional net sampling. Here, we explore the utility of eDNA for identifying animal taxa. We describe the results from an August 2018 cruise in Slope Water off the northeast United States. Samples for eDNA analysis were collected using Niskin bottles during five CTD casts. Sampling depths along each cast were selected based on the presence of biomass as indicated by the shipboard Simrad EK60 echosounder. Metabarcoding of the 18S V9 gene region was used to assess taxonomic diversity. eDNA metabarcoding results were compared with those from net-collected (MOCNESS) plankton samples. We found that the MOCNESS sampling recovered more animal taxa, but the number of taxa detected per liter of water sampled was significantly higher in the eDNA samples. eDNA was especially useful for detecting delicate gelatinous animals which are undersampled by nets. We also detected eDNA changes in community composition with depth, but not with sample collection time (day vs. night). We provide recommendations for applying eDNA-based methods in the mesopelagic including the need for studies enabling interpretation of eDNA signals and improvement of barcode reference databases.


INTRODUCTION
The ocean's mesopelagic zone is poorly explored, in large part due to the vastness of the habitat and the technological and logistical challenges in accessing it. Recently, it has been discovered that mesopelagic biomass is significantly greater than previously thought St. John et al., 2016); however, there is significant uncertainty about the composition of mesopelagic biomass. Deep pelagic waters, including the mesopelagic, likely contain numerous undescribed species (Robison, 2004(Robison, , 2009). Traditional sampling nets may miss important taxa such as delicate gelatinous species that fall apart when collected, or fish that avoid capture altogether. A recent genetic study suggested that species diversity is significantly underestimated (Sommer et al., 2017), and genetic analysis has shown that many nominal species may consist of multiple cryptic species, sometimes even belonging to different families (Lindsay et al., 2017).
Mesopelagic species are under immediate risk for exploitation as it is a potential source of fish meal and nutraceuticals (St. John et al., 2016;Hidalgo and Browman, 2019), but our lack of knowledge about their distribution, life history, and ecology impedes sustainable management (Webb et al., 2010;St. John et al., 2016;Glover et al., 2018;Hidalgo and Browman, 2019). The consequences of overharvesting are potentially severe and global in nature. Mesopelagic fish and invertebrates play a key role in the biological carbon pump, which transfers carbon from surface production to the deep sea where it is sequestered (Giering et al., 2014). Anthropogenic changes to mesopelagic biodiversity and biomass could alter this process. Thus, it is urgent to improve the understanding of the composition, distribution, and abundance of mesopelagic fauna before irreparable anthropogenically induced changes occur (St. John et al., 2016;Glover et al., 2018;Hidalgo and Browman, 2019).
New tools are emerging that will facilitate scientific study of this region. One of these is the analysis of environmental DNA (eDNA), which is DNA that is present in the environment that, for metazoans, is no longer associated with the organisms from which it originated (reviewed in Ruppert et al., 2019). In eDNA analyses, water is collected and filtered, the eDNA is extracted from the filter, and one or more DNA markers are analyzed, typically through qPCR, or metabarcoding. There are many potential benefits of eDNA animal biodiversity assessments relative to traditional assessments, including cost effectiveness (Sigsgaard et al., 2015;Thomsen and Willerslev, 2015;Evans et al., 2017) and improved detection of species that are rare or are difficult to sample by traditional means (Dejean et al., 2012;Piaggio et al., 2014;Kelly et al., 2017;Closek et al., 2019;Stat et al., 2019).
The use of eDNA analyses is still in its infancy and there is no standard protocol that is universally applicable to any given system Ruppert et al., 2019). eDNA sampling at meaningful spatial and temporal scales in mesopelagic waters, as opposed to coastal and freshwater systems where most eDNA work has been done, may be especially challenging due to the enormity of the environment and the difficulty in accessing it. In marine coastal regions, samples are taken using Niskin bottles, often mounted on a CTD rosette, and sample volumes are typically around one liter (Andruszkiewicz et al., 2017b;Djurhuus et al., 2017). For the mesopelagic, an adaptive sampling approach that targets sampling locations based on remotely sensed information from shipboard acoustics could be beneficial for targeting concentrations of mesopelagic organisms such as in deep scattering layers (Kinzer, 1969;Orlowski, 1990;Hazen and Johnston, 2010;D'Elia et al., 2016). Multi-frequency and broadband acoustic approaches would detect the greatest variety of organisms over a range of depths (Lavery et al., 2007;Davison et al., 2015;Bassett et al., 2020), as acoustic scattering by organisms is frequency-dependent and frequencies vary in the depth that they penetrate.
The goal of this work was to explore the collection and utility of eDNA for surveying and characterizing metazoan biodiversity in the mesopelagic zone, in order to better understand the composition of this region's biomass. We used an adaptive sampling approach based on acoustic backscatter (18,38,120,and 200 kHz frequencies) to target layers of organisms in daytime and nighttime depth-stratified sampling of eDNA (via Niskin bottles) and zooplankton (via MOCNESS nets). We conducted DNA metabarcoding on all samples using the 18S V9 barcode gene (Amaral-Zettler et al., 2009) to enable detection of a broad range of taxa above the species-level (i.e., genus or family), shedding light on the efficiency of both approaches for detecting animal taxa.

Environmental DNA Sample Collection and Shipboard Processing
Seawater samples were collected during a cruise on the NOAA Ship Henry B. Bigelow between the 2,000 and 3,000 m isobaths off the shelf break south of the island of Martha's Vineyard in Massachusetts, United States (Figure 1). Samples were collected during casts of a Seabird 911 plus CTD mounted on a rosette frame with eight 5 l Niskin bottles. Niskin bottles were triggered to sample based on the layers identified in real time with the shipboard Simrad EK60 echosounder whose split-beam transducers (11 • beam width for the 18 kHz and 7 • beam width for the 38, 120, and 200 kHz) were mounted on a retractable keel on the vessel.
Three daytime and two nighttime CTD casts were completed (avoiding crepuscular periods when animals may be vertically migrating), yielding a total of 40 eDNA samples ( Table 1) and consisting of 8 samples per cast. Niskin bottles were not cleaned between casts as the bottles were deployed open and thus exposed to the entire water column during the downcast, until they were triggered to close at the target depth on the upcast. In the first four casts, one sample was taken at each of 8 depths. On the final cast (Cast 10), two replicate samples were taken at each of 4 depths. Diel vertical migration was observed as the vertical shift in the position of the layers on the echosounder, and our water sampling for each cast coincided with these layers (Figure 2 and Supplementary Figures S1-S4). In all of the CTD casts, the water column was stratified with clines in density, temperature, and salinity between approximately 25 and 50 m below the surface (Figure 3 and Supplementary Figures S5, S6). Dissolved oxygen peaked in this region, before declining at the depth around 50-70 m. Oxygen minimums were found around approximately 250 m below surface in all casts (Supplementary Figures S5, S6).
Water from the Niskin bottles was filtered in a temperaturecontrolled cold room (maintained at 13-15 • C) on the ship and began immediately upon retrieval of the CTD-rosette. The work area in the cold room was wiped down daily with bleach, followed by several rinsing wipes with Milli-Q water. Nitrile gloves were worn throughout the filtering protocol and were changed frequently. Three samples were processed at a time. Bottles were removed from the rosette, brought into the processing room, and secured on custom-built stands. The remaining bottles were wrapped in ice packs until they could be processed.
Three peristaltic pumps were used to pump water from the Niskin bottles through sterile encapsulated, single-use 0.2 µm PES Sterivex filters. The tubing was first cleaned by pumping with a minimum of 200 ml of Milli-Q water and then flushed with a minimum of 200 ml of sample water. The Sterivex filters were attached after cleaning. The remaining Niskin bottle water was filtered through the Sterivex and the volume of the flowthrough was collected and measured (Table 1). Once filtration was complete, the Sterivex filter was removed, the ends of the filter were sealed with parafilm, the filter was placed in a sterile 50 ml Falcon tube, and the tube was stored at −80 • C. A control sample was filtered along with each cast. The control samples consisted of approximately 4.8 l of Milli-Q water dispensed into pre-sterilized, single-use Whirlpak sample bags and pumped through Sterivex filters.

Zooplankton Sample Collection and Shipboard Processing
A 1 m 2 MOCNESS (Multiple opening and closing net and environmental sensing system; Wiebe et al., 1985) zooplankton tow was taken for comparison with the eDNA data from the CTD casts. The MOCNESS was equipped with nine nets (333 µm mesh), SeaBird temperature and conductivity probes, a pressure sensor, and a flowmeter. Eight of the MOCNESS nets sampled at discrete depth intervals ranging from the base of the mesopelagic zone (1,000 m) to the surface, and an additional net sampled the integrated water column (0-1,000 m). The MOCNESS tow was taken in the vicinity of the CTD casts (Figure 1). Temperature and salinity data were consistent with those from the CTD casts, indicating that MOCNESS and CTD sampled the same water mass (Figure 3). The nine nets sampled the following depth intervals: 0-1,000, 1,000-800, 800-600, 600-400, 400-200, 200-100, 100-50, 50-24, and 24-0 m, and tow sample volumes ranged from approximately 149 to 5,208 m 3 ( Table 2). Upon retrieval, each net was washed and the cod end buckets were removed. The zooplankton from each net were split into four equal parts using a Folsom plankton splitter (McEwen et al., 1954). One of these splits was preserved in 95% ethanol for metabarcoding analysis. The ethanol preservative was replaced with fresh 95% ethanol approximately 24 h after collection.

Environmental DNA Extraction and Sequencing
Upon return to the laboratory, genomic DNA and controls were extracted using DNeasy Power Water Sterivex Kits (Qiagen). DNA extractions were performed in a dedicated, preamplification DNA work area. The work area was cleaned before use with 10% bleach followed by several rinses with Milli-Q water. Next Generation library preparation and sequencing were conducted at the Center for Genome Innovation at the University of Connecticut (Storrs, CT). Each extract was normalized to 5 ng/µl and amplified in duplicate using 2× KAPA HiFi HotStart ReadyMix (Kapa Biosciences, Wilmington, MA) following the 16S Metagenomic Sequencing Library Preparation guide (Illumina, San Diego, CA). The PCR was set up in a PCR hood and the surface area was first cleaned with 70% bleach and additionally decontaminated using UV light, and only sterile pipette tips with embedded aerosol filters were used for pipetting. 2.5 µl of sample was added to each PCR. Per sample and While the Niskin bottles were nominally 5 l bottles, the actual volume of seawater contained was greater than 5 l. In a small number of cases, less than 5 l was filtered due to apparent filter clogging.
filtration control two PCR replicates were run and pooled after the first PCR. The primers used were the 1380F and 1510R V9 primers (Amaral-Zettler et al., 2009) with Illumina adapters. The primer sequences with their adapters (in bold) were: 1380F 5 -TCGTCGGCAGCGTCAGATGTGTATAAGAGACAGCCCTGC CHTTTGTACACAC-3 and 1510R 5 -GTCTCGTGGGCTCGG AGATGTGTATAAGAGACAGCCTTCYGCAGGTTCACCTAC-3 . The PCR conditions for the first PCR were: 95 • C for 3 min; 25 cycles of: 95 • C for 30 s, 55 • C for 30 s, 72 • C for 30 s; 72 • C for 5 min. Amplification success was assessed on the pooled PCRs using an Agilent 4200 TapeStation electrophoresis system using the High Sensitivity DNA D1000 assay (Agilent Technologies, Santa Clara, CA) and the products were purified using Agencourt AMPure XP Beads (Beckman Coulter, Brea, CA). Index barcode sequences were incorporated into the purified amplicons by a second PCR using NexteraXT Index Kit v2 Sets C and D (Illumina, San Diego, CA) and 2× KAPA HiFi HotStart ReadyMix (Kapa Biosciences, Wilmington, MA). The second PCR protocol was: 95 • C for 3 min; 8 cycles of: 95 • C for 30 s, 55 • C for 30 s, 72 • C for 30 s; and 1 cycle of 72 • C for 5 min. The indexed PCR products were purified using AMPure XP Beads (Agencourt) and assessed for quality and adapter removal using an Agilent 4200 TapeStation electrophoresis system (High Sensitivity DNA D1000 assay). Libraries were quantified using a Qubit 3.0 fluorometer (Life Technologies, Carlsbad, CA), normalized, pooled, and denatured according to Illumina MiSeq sample preparation for sequencing. 20-30% PhiX (Illumina, San Diego, CA) was added to our amplicons before running on the Illumina MiSeq using the 500 cycle v2 reagent kit. Our targeted sequencing depth was 200,000 reads per sample. The negative control samples were also sequenced. For pooling the negative controls (which did not contain detectable amounts of DNA), we added the maximum volume that was used for the samples.

Zooplankton DNA Extraction and Sequencing
Zooplankton samples were sorted into small (333-1,000 µm) and large (>1,000 µm) size fractions before extraction in order to maximize the detection of low-biomass organisms and the wet weight of each was recorded (Supplementary Table S1). The samples were poured through stacked 150 and 1,000 µm sieves and the samples were rinsed with Milli-Q water, which also washed away any remaining ethanol. When present, exceptionally large animals or fragments were picked out of the 1,000 µm sieve and preserved for later individual (Sanger) sequencing. Sieve contents were rinsed into pre-weighed, sterile 50 ml Falcon tubes using as little water as possible and concentrated by spinning 1 min at 1,000 rpm. Residual water overlying the zooplankton was pipetted away. Zooplankton samples were homogenized using a Benchmark D1000 homogenizer with a 10 mm sawtooth probe until the Falcon tube contents were visibly smooth and well-mixed. The homogenized tubes were centrifuged for 2 min at 5,000g and excess water overlying the pellets was decanted off. All tubes were weighed. The homogenizer was cleaned between samples by running it for 20 s in a 10% bleach solution followed by 3 additional runs in separate tubes of Milli Q water.
Two hundred mg of zooplankton biomass was transferred from each tube to a sterile 15 ml Falcon tube and DNA was extracted using DNeasy Blood and Tissue DNA extraction kits, using a slight modification of the manufacturer's protocol to accommodate the relatively large tissue biomass in the samples. Specifically, 1,800 µl of Buffer ATL and 200 µl of proteinase K were added to each tube, and tubes were incubated for 3 h at 56 • C and vortexed periodically. After incubation, 200 µl of lysate was transferred to a new, sterile 1.5 ml tube. At that point, the Qiagen manufacturer's protocol was followed exactly, with a final elution into 200 µl Buffer AE. Aliquots of the extracted DNA were sent to the Center for Genomic Innovation at the University of Connecticut for library preparation and sequencing, following the same protocol as for the eDNA samples.
Genomic DNA of the large individuals that were set aside was extracted using the DNeasy Blood and Tissue Kit (Qiagen) following the manufacturer's protocol. The 18S V9 gene was amplified using the V9 primers (Amaral-Zettler et al., 2009) under the following conditions: 95 • C for 3 min; 35 cycles of: 95 • C for 30 s, 55 • C for 30 s, 72 • C for 30 s; and 1 cycle of 72 • C for 5 min. PCR products were amplified on a 1.2% agarose gel and visualized with GelRed. No additional attempts were made to amplify samples that failed the first time, in order to be consistent with the metabarcoding protocol. Amplicons were purified with QiaQuick PCR Purification kits (Qiagen) according to the manufacturer's protocol, quantified using a Nanodrop, and sequenced in both directions at Eurofins Genomics 1 .

Bioinformatics
The same bioinformatics workflow was applied to both the eDNA and the MOCNESS zooplankton samples, although each group was processed separately as they were sequenced on separate runs and thus differed in their sequencing error profiles. Demultiplexed paired-end sequence reads were processed using Quantitative Insights Into Microbial Ecology 2 (QIIME2) version 2019.1 (Sigsgaard et al., 2015;Thomsen and Willerslev, 2015;Bolyen et al., 2018). Forward primer sequences were trimmed, and forward and reverse reads were truncated based on quality plots of the data after base pair 120. Reverse primers were removed (if present) using the Cutadapt (Martin, 2011) Qiime2 plugin (v. 2020.1). DADA2 (Callahan et al., 2016) implemented through QIIME2 performed sequence quality control including error correction and chimera removal, and paired the reads to generate unique (i.e., 100% similarity) amplicon sequence variants (ASVs). The control samples contained very few reads (e.g., 0.3% of the metazoanonly dataset). For each ASV in the dataset that was present in both the samples and controls, the maximum number of reads found in any control was subtracted from every sample. ASVs that had a frequency of less than 10 (summed across all 40 samples) were deleted from the dataset.
Taxonomy was assigned using a naïve Bayesian classifier  that was trained on the Silva v. 132 99% database (Quast et al., 2013) using 18S sequences only for the 18S V9 gene region. The resulting dataset was filtered to generate a metazoan-only dataset that provided a high-level taxonomic classification for a broad range of animal species (Level 5 on the Silva database classification). We further resolved the most abundant categories based on Level 6 of the Silva database. Taxa comparisons between eDNA and MOCNESS were based on Level 7 (the deepest level) of the Silva database.
We also searched all ASVs against the GenBank nr database. Blast search parameters were set to exclude uncultured, unfiltered, and unclassified entries. We reported top hits when the percent identity between our ASV and the Genbank entry was 97% or greater. This threshold has been used in other studies (Pearman et al., 2014;Casas et al., 2017) and is effective in distinguishing genera and families in copepods (Wu et al., 2015). Note that even a perfect (100%) match does not necessarily indicate a species-level identification (Wu et al., 2015;Blanco-Bercial, 2020) and that, while better populated than the Silva database, representative V9 sequences for many mesopelagic animal species are not available on Genbank.
Rarefaction curves were generated on the eDNA and MOCNESS all-data and metazoan-only datasets to evaluate whether the obtained sequencing depths were sufficient to recover all taxa. Using datasets where sequencing depth in all samples was truncated to the lowest observed sequencing depth, we generated Jaccard (presence-absence) and Bray-Curtis indices in QIIME2, and used these to create non-metric multidimensional scaling (nMDS) plots with vegan 2.3_5 in the statistical package R (Oksanen et al., 2016). Data points were visualized relative to time of sampling (day or night) and sampling depth categories (0-100, 100-200, 200-500, 500-800 m). Functional regressions of the data points against each nMDS dimension were performed in Matlab to assess the significance of observed patterns (Ricker, 1973).

Environmental DNA Sequence Data Summary
A total of 13,561,867 sequences were obtained from 40 samples and 5 negative controls, with an average of 301,375 ± 22,404 (SE) reads per sample. The number of reads in the negative controls ranged from 550 to 6,899. A sixth negative control (from the PCRs) had no reads. In the DADA2 analysis, a total of 12,181,342 sequences (approximately 90%) remained after the filtering, denoising, merging, and chimera removal steps, with an average of 270,696 ± 20,331 (SE) sequences per sample (Supplementary Table S2). These sequences were classified into 10,543 unique amplicon sequence variants (ASVs). After removing rare ASVs (total frequency across samples < 10), 8,417 ASVs comprising 12,170,626 sequences remained. Of these, 351 ASVs comprising 1,421,650 sequences were metazoans. There were 65 ASVs that were present in both the negative controls and the samples (Supplementary Table S3). For each of these, we took the maximum number of reads found in any of the controls and subtracted that number from the corresponding ASV read count from every sample. Four additional ASVs that were discovered to be either misclassified by Silva as metazoans (based on our Blast results, below) or were unlikely mesopelagic inhabitants were also removed from our analyses.
The percentage of ASVs that were classified as metazoan taxa varied considerably between samples from different depths. The proportion of metazoan sequences in the results generally declined with depth beginning at 30 m (approximately coinciding with the pycnocline; Supplementary Figure S6), and in all cases except one, metazoans comprised 21% or less of the total number of obtained reads in samples collected below 200 m (Figure 4). Non-metazoan taxa (Supplementary Table S4) were not analyzed further.
A broad range of metazoan taxa were recovered in both daytime and nighttime casts (Figure 5). In Cast 10, where duplicate samples were taken at each depth, we observed considerable differences in the presence and abundance of taxa in the reads we obtained ( Figure 5C). Overall, crustaceans (Copepoda, Eumalacostraca, Myodocopa) and medusozoans (Scyphozoa, Hydroidolina, Trachylina) were particularly wellrepresented. Crustacean ASVs comprised 53.7 ± 28.9% of the reads in mesopelagic samples (200-800 m) and 73.0 ± 24.2% of the reads in epipelagic samples (0-200 m). Conversely, medusozoan ASVs were generally a more prominent component of the reads (35.2 ± 24.8%) in mesopelagic depths than in epipelagic depths (16.6 ± 25.4%). The Silva database Level 6 analysis revealed that the major components of crustacean reads were classified as calanoid and cyclopoid copepods (Figure 6), and the major components of the obtained medusozoan reads were siphonopohores (especially at mesopelagic depths) and anthoathecate medusae (Figure 7). We also found that 150     Table S5). Consistent with the Silva classification, taxa with Genbank matches included many calanoid and cyclopoid copepods, siphonophores, and anthoathecate medusae, as well as other taxa such as trachymedusae, scyphomedusae, ctenophores, and other less commonly recovered taxa.
Rarefaction plots of all ASVs and the filtered metazoanonly dataset showed an initial increase in the number of ASVs with sequence subsampling depth and then leveled off, indicating that the sequencing depth was sufficient to capture ASV diversity in our samples (Supplementary Figure S7). Using the metazoan datasets truncated at the lowest sequence depth, nMDS plots revealed clustering was related to depth but not time of sampling (Figure 8). The structuring was more evident with the Jaccard indices (stress = 0.128) than with the Bray-Curtis indices (stress = 0.221). However when the data were log-transformed, the stress index for the Bray-Curtis analysis improved (stress = 0.154). Note also, unlike Jaccard similarity which is based on presence/absence of taxa, Bray-Curtis indices take into account read abundances, which in metabarcode data, are influenced by PCR biases  and eDNA dynamics (Allan et al., 2020). A regression of the Jaccard data points against the nMDS axes revealed a significant relationship with dimension 1 (R 2 = 0.728, p < 0.001) but less so with dimension 2 (R 2 = 0.108, p = 0.0.38). Samples corresponding to the shallowest depth bin (0-100 m) appeared to be driving this relationship (Figure 8), so to assess the relationship with depth for the deeper samples, we repeated the regression analysis without the 0-100 m samples. We found significant correlations with both dimension 1 (R 2 = 0.34, p < 0.001) and dimension 2 (R 2 = 0.755, p = 0.001). There was no significant relationship with sampling time (day vs. night) for either dimension (dimension 1: R 2 = 0.025, p = 0.333; dimension 2: R 2 = 0.003, p = 0.744). The results were similar for the Bray-Curtis analysis based on the log-transformed data.

MOCNESS Zooplankton Sequence Data Summary and Comparison With eDNA
Altogether, the 9 MOCNESS nets sampled a combined volume of 12,193.7 m 3 (=12,193,700 l). From this, a total of 4,129,604 sequences were obtained from the large and small size fraction samples combined, with an average of 229,422 ± 8,025 (SE) reads per sample. A total of 3,629,395 sequences (approximately 87%) remained after the filtering, denoising, merging, and chimera removal steps, with an average of 201,633 ± 7,352 (SE) sequences per sample (Supplementary Table S6). These sequences were classified into 866 unique amplicon sequence variants (ASVs). After sequences with a frequency less than 10 were removed, 622 ASVs (3,628,313 sequences) were retained, 470 of which were metazoans that, like the eDNA results, comprised a diverse array of animal taxa (Figure 9 and Supplementary Table S7). However, the number of ASV's obtained from the zooplankton dataset (455) was approximately two-thirds greater than that obtained from the eDNA dataset (275). Overall, medusozoan taxa were less abundant in the MOCNESS tow than in the eDNA samples (3.11 ± 3.35% in the large and 7.86 ± 10.34% in the small size fractions), but crustaceans were generally highly abundant (91.10 ± 4.06% in the large and 88.32 ± 11.93% in the small size fractions). Also in contrast to the eDNA results, the major components of the crustacea were calanoid copepods (but few cyclopoid copepods) and eumalacostracans (Figure 10). The medusozoan groups were comprised of primarily siphonophores, but other groups including narcomedusae, trachymedusae, and others were also detected (Figure 11). The Blast results included many 97% or greater matches with numerous calanoid copepods, and eumalacostracans including euphausiids, amphipods, and decapods. Other matches included a greater variety of non-crustacean taxa as compared to the eDNA results, and these included siphonophores and other medusazoan taxa as well as chaetognaths, polychaetes, fishes, salps, and others (Supplementary Table S7). As with the eDNA dataset, rarefaction curves showed that sequencing depth was sufficient to capture the ASV diversity in both the full data set and the metazoan-only dataset (Supplementary Figure S8).
Additionally, a total of 10 large individuals and organism fragments were removed from the MOCNESS samples before homogenizing. Six of these produced PCR bands, and 5 were sequenced successfully (Supplementary Table S8). Three of  these were fish, one was a scyphozoan, and one was likely a eumalacostracan. All were identical to MOCNESS ASVs. The specimens that failed to be sequenced included 3 fish, one shrimp, and one unidentified gelatinous animal.
The MOCNESS sampling recovered a greater number of both metazoan ASVs and metazoan taxa than did sampling with the Niskin bottles (eDNA samples). Overall, we found 455 metazoan ASVs in the MOCNESS samples (metabarcoding ASVs from small and large fractions combined plus the one unique largeindividual sequence), and 275 in the eDNA samples (all casts and samples combined). We compared the number of highlevel animal taxa (as classified to Level 7 in the Silva database) and found that the MOCNESS sampling (all nets and fractions combined, including large individuals) detected 23 more taxa than the eDNA sampling. Thirty-nine taxa were detected in both approaches, 17 were found in eDNA only, and 40 were found in the MOCNESS only (Supplementary Table S9). Despite the greater number of metazoan taxa sampled by the MOCNESS, sampling efficiency, defined here as number of metazoan ASVs detected per liter of water sampled, was far greater for eDNA (275 ASVs/206 l = 1.335 ASVs/l; volume filtered from Table 1) than for the MOCNESS (455 ASVs/12,193,700 l = 0.00003731 ASVs/l).

Biodiversity Detection
Our results suggest that eDNA analysis is an informative and highly efficient approach for invertebrate biodiversity detection in the mesopelagic that complements net sampling. Our sequence results were based on the 18S V9 marker, which detects a comprehensive and phylogenetically diverse range of animals (Wu et al., 2015;Bucklin et al., 2019;Blanco-Bercial, 2020). The 18S V9 marker is well-suited for broadly focused eDNA surveys due to its relatively short length, as eDNA is likely degraded and thus comprised of relatively small fragments (Jo et al., 2017).
The animal component of our eDNA results declined with depth, and this result is consistent with Stefanoudis et al. (2019), who, in a survey extending to 800 m depth based on specimen identifications from net tows, showed that zooplankton abundance decreased substantially below 200 m. In our study, the observed decline was approximately coincident with the pycnocline (which was located in the 0-200 m depth interval in Stefanoudis et al. (2019)). While the pycnocline can be a barrier for animal dispersal (Suzuki et al., 2018), Closek et al. (2019) found no significant difference in fish taxa identified from eDNA above and below the pycnocline off of the central California coast. It will be interesting for future studies to examine whether or not the pycnocline presents a barrier to eDNA dispersal and how that impacts interpretation of eDNA signals.
The eDNA sequences originated primarily from calanoid and cyclopoid copepods, siphonophores, and other medusozoans. In contrast, the crustacean component of the MOCNESS sequences comprised primarily calanoid copepods (and relatively few cyclopoids) and eumalacostracans (e.g., euphausiids and amphipods). The medusozoan component of the MOCNESS sequences also included siphonophores (but with fewer ASVs), and the relative proportions of other medusuzoans were different than in the eDNA dataset.

Can Environmental DNA Can Add Substantially to Plankton Metabarcoding Analyses?
While the number and types of metazoan taxa detected in the genetic material collected by MOCNESS sampling were greater than those detected by the eDNA sampling, we detected 17 taxa via eDNA that were not detected by the MOCNESS. The volume of water sampled by Niskin bottles for eDNA analysis, however, was a tiny fraction of that sampled by the MOCNESS. Thus, in terms of the number of metazoan ASVs and taxa detected per volume of water sampled, eDNA from Niskin bottles outperformed MOCNESS sampling, although each approach detected taxa that the other did not. Given the close proximity of the CTD and MOCNESS sampling locations and times, and the temperature and salinity data that showed that the sampling locations belonged to the same water mass, we assume that the pool of biodiversity was similar for each sampling event and that the observed differences are due to sampling biases associated with each method. Our findings that many of the same taxa were recovered in both sampling approaches and that other taxa that were recovered in only one or the other approach, are also consistent with other studies that compare eDNA with other survey methods (Thomsen et al., 2016;Kelly et al., 2017;Sigsgaard et al., 2017;Closek et al., 2019;Liu et al., 2019).
The relative frequencies of medusozoan ASVs in our eDNA reads compared to our MOCNESS reads could indicate a relatively high eDNA shedding rate or a relatively high biomassor both. While there are few studies on eDNA shedding rates, the available data show that they are variable between taxa, between individuals within a species, and between different stages of an organism's life cycle (Sansom and Sassoubre, 2017). Most studies on eDNA shedding focus on coastal and freshwater fish (Takahara et al., 2012;Klymus et al., 2015;Sassoubre et al., 2016;Nevers et al., 2018). However, Minamoto et al. (2017) found that medusae shed eDNA at a significantly greater rate than fish (by 1-3 orders of magnitude). Allan et al. (2020) also found that medusae may shed eDNA at higher rates than other animal types. Animals with hard external surfaces (e.g., bivalves and crustaceans) may have lower shedding rates (Sansom and Sassoubre, 2017;Allan et al., 2020). Interestingly our eDNA data lacked eumalacostracan ASVs, which were common in the MOCNESS data. Eumalacostracans include euphausiids, which are a well-known component of mesopelagic biomass and so their paucity in the eDNA is notable. Additional research on shedding rates in mesopelagic organisms is critical for interpreting eDNA signals.
The biomass of medusozoans and other gelatinous animals in the mesopelagic is likely substantially underestimated due to difficulties in sampling them (Madin and Harbison, 1978;Larson et al., 1991;Robison, 2004Robison, , 2009). These animals are destroyed by traditional plankton net sampling, and many new forms were discovered only when human-occupied underwater vehicles (HOVs) and remotely operated vehicles (ROVs) came into use (Madin and Harbison, 1978;Larson et al., 1991;Robison, 2004Robison, , 2009). Our eDNA and MOCNESS results showing a greater presence of gelatinous animals in eDNA are consistent with the hypothesis that gelatinous animal biomass is under-sampled using net sampling and also demonstrate that analysis of eDNA provides another, complementary approach to detecting these fragile animals.

Biomass and eDNA Sampling Strategy
Our sampling strategy focused on acoustic scattering layers identified by the 18 and 38 kHz (deep scattering layers) and the 120 and 200 kHz (near-surface scattering layers) Simrad EK60 echosounders. These frequencies are typically used to detect biomass in fisheries surveys (Jech and Sullivan, 2014;Proud et al., 2019). We chose this approach as these layers likely had relatively high concentrations of organisms, although the types of taxa detected by acoustic backscatter is frequency-dependent (Lavery et al., 2007). The lower frequencies that we used (18 and 38 kHz) were more informative for our sampling as they penetrate to deeper depths. These frequencies also tend to detect gas-bearing organisms such as siphonophores and fish with swim bladders (Lavery et al., 2007) although deep scattering layers may contain a wide range of animal taxa (Stefanoudis et al., 2019). While the relationship between eDNA concentration and abundance or biomass is often not direct (Andruszkiewicz et al., 2017b), biomass indicated by acoustic backscatter has been correlated with eDNA signals. Yamamoto et al. (2016) found that in general, fish eDNA signals were correlated with biomass as determined from a 120 kHz echosounder in Maizuru Bay, a shallow semienclosed water body in the Sea of Japan. They noted, though, that confounding factors, such as the presence of exogenous eDNA sources, could disrupt this correlation.
The fact that we sampled in the deep-scattering layers where acoustic backscatter from ship-borne echosounders presumably indicates the presence of fish, yet did not detect fish eDNA, suggests more effort is needed for understanding how eDNA relates to biomass in the mesopelagic region. While 18S V9 is known to pick up a broad range of animals including fish (Blanco-Bercial, 2020), other taxa are potentially preferentially amplified (Sawaya et al., 2019). Fish were also only a very small component of the MOCNESS DNA results; however the failure to sequence three of our fish specimens from the MOCNESS sampling also suggests that some species may be missed with this marker. Analysis of eDNA samples with a marker such as 12S that targets vertebrates may be needed to uncover fish biodiversity (Andruszkiewicz et al., 2017b;Kelly et al., 2017). Additionally, the frequency-dependent scattering by organisms can bias our interpretation of the types of organisms present using acoustic echograms. This is especially true using ship-borne systems where the frequency content of the acoustic data decreases with range from the transducer-i.e., "higher" frequencies attenuate with depth so only "lower" frequency data are available for interpretation. Notably, our mesopelagic sampling was based on the two lower frequencies (18 and 38 kHz) and while we did not detect fish, we did observe a significant eDNA signal from siphonophores. Furthermore, our Blast search found that several siphonophore ASVs matched families in the suborder Physonectae, which contain gas-filled pneumatophores (Dunn et al., 2005), suggesting that they could be the source of those acoustic signals. eDNA studies that focus solely on 12S may miss these potentially important taxa.
In the near-surface scattering layers, the combination of all frequencies (18, 38, 120, and 200 kHz) provided additional information on a greater variety of animals and presumably a less biased interpretation (Davison et al., 2015;Bassett et al., 2020). Vertically migrating mesopelagic animals, including fish, crustacea, and gelatinous animals, comprise these near-surface scattering layers at night, and surface data collected at night may provide useful information on these mesopelagic species. Focusing on surface sampling during the night could thus be an efficient strategy; however, deep water sampling is necessary to identify non-migrating mesopelagic species.
Because our sampling depths were chosen based on real time echosounder data, sampling depths differed between casts and were not consistent with the MOCNESS intervals. To look for patterns associated with depth, we therefore combined our data from all casts into four 100-300 m depth bins and found that composition of our eDNA reads differed significantly among these depth bins. Thus, eDNA appears to be a suitable tool for examining biodiversity patterns at that vertical scale. Finer scale resolution of potential vertical patterns, such as those associated with acoustically detected layers, was limited by the number of Niskin bottles (and thus samples) available to us (8) given our goal to explore variation in the water column, but could be a direction for future research. Other studies have found that eDNA can resolve community differences between marine habitats, despite the biological and physical interconnectedness of those habitats (Jeunen et al., 2019;Laroche et al., 2020;West et al., 2020).

Interpreting Mesopelagic eDNA Signals
Many biological and physical factors could have influenced our observed eDNA signals, including eDNA shedding rates, decay rates, and transport (Sansom and Sassoubre, 2017). As described above, eDNA shedding estimates are derived primarily from fish, but there is some evidence that medusae may shed eDNA at a substantially higher rate Allan et al., 2020). This could potentially result in a disproportionately greater observed read abundance for medusozoan taxa. Shedding rates are also influenced by temperature (Lacoursière-Roussel et al., 2016), and constraining shedding rates of vertically migrating mesopelagic animals that experience a range of temperatures on a daily basis may be especially complex. eDNA decay estimates, which are also based primarily on fish studies, suggest that eDNA can be detected about 1-7 days after shedding (Thomsen et al., 2012;Sansom and Sassoubre, 2017). However, like shedding, the decay rate depends on a variety of environmental factors such as temperature. Cowart et al. (2018) reported a slower decay rate (with a half-life of 37.2 h) potentially allowing detection after 25 days from fish in subzero Antarctic waters. As physical and biogeochemical conditions in the mesopelagic zone are dramatically different from the surface layer of the ocean, eDNA decay rate derived for surface conditions might not be applicable to the mesopelagic zone. Interestingly, one seemingly relevant factor that varies with depth is sunlight, which does not appear to impact eDNA decay (Andruszkiewicz et al., 2017a).
Our eDNA samples may include signals from animals both in our sampling location as well as eDNA transported to that location from elsewhere. Water movement can transport eDNA downstream of its source. Thus, eDNA signals from flowing environments may actually indicate the presence of species in an upstream location (Wacker et al., 2019). Additionally, water mixing can cause the signal of eDNA released over a small spatial scale to quickly become diluted and undetectable (Pilliod et al., 2014;Stoeckle et al., 2017). Most studies on eDNA dispersal to date are from river and stream environments, where the flow is primarily unidirectional. Based on modeling in a coastal system (Monterey Bay, California), Andruszkiewicz et al. (2019) found that eDNA can be transported several 10 s of km horizontally by regional currents over the course of a few days. Open ocean environments are much more complex and subject to the influences of different flows, including mesoscale, sub-mesoscale, tidal, upwelling, and downwelling currents. In the Slope Water off the Northeast U.S. coast where this study occurred, flow can be impacted by the Gulf Stream, warm-core rings, subduction and upwelling along the ring periphery, slope current, tides, and internal tides (Joyce, 1984;Flagg et al., 2006;Kelly et al., 2016;Zhang and Partida, 2018). These flows could redistribute eDNA in the three-dimensional space.
Despite the potential for eDNA dispersal by ocean currents, some studies in coastal systems suggest that the eDNA signal can remain localized. In their survey of Maizuru Bay, Japan, Minamoto et al. (2017) found spatial and temporal correlation between eDNA concentrations of jellyfish from surface and subsurface samples and simultaneously performed visual surfacewater observations. Kelly et al. (2018) found that tidal flow had minimal effect on characterizing local communities in tidally dynamic nearshore habitats. Similarly, Jeunen et al. (2019) found unique localized eDNA signatures from four coastal distinct, yet interconnected habitats that were within a few km of each other. Weak spatial dispersal of eDNA signal could result from either long residence time of the water in the eDNA source region or rapid decay of the eDNA materials. The relative importance of physical transport vs. biological degradation in eDNA signal dispersal is a question that calls for further study.
In the mesopelagic ecosystem, there is an additional complicating process not found in other aquatic habitats that may influence our interpretation of our eDNA signals. Diel vertical migration (DVM), or the movement of many mesopelagic animals to surface waters at night to feed and back down to deeper depths during the day, has been referred to as the largest migration on the planet (Hays, 2003;Sutton, 2013). Animals can travel vertical distances of hundreds of meters in each direction on a daily basis. However, we did not find a significant pattern associated with the time of sampling (day vs. night) in any depth bin as might be expected from vertically migrating animals. The lack of day/night signal suggests that the eDNA shedding rate might be very low or the time scale of diel vertical migration might be too short relative to eDNA transport and decay processes, both of which may limit our ability to detect vertical migration (Allan et al., 2020). Furthermore, the animals themselves may expedite the downward transport of eDNA originating from surface organisms through their feeding activities. For example, a migrating animal may feed at the surface at night, and release traces of eDNA from its surface prey through defecation at depth. Thus, while it is tempting to try to infer DVM-related (and other) changes in taxa with depth using eDNA, it is first necessary to understand the spatiotemporal scales of the relevant biological and physical processes in the surface through mesopelagic depths.
Lastly, we note that on our final cast where duplicate samples were taken at four depths, we observed considerable variation in read abundance between those duplicates. Other studies also show variation between replicates (Andruszkiewicz et al., 2017b). We do not currently understand the scale of eDNA distributions, which may be different for different species, and emphasize that the volume of our eDNA samples (5 l) is minimal relative to the scale of the habitat. More work should be done to optimize the scope and scale of eDNA sampling and replication to improve the accuracy of eDNA biodiversity estimates. Additionally, technological advances that allow for multiple, large-volume sample acquisition (Govindarajan et al., 2015;McQuillan and Robidart, 2017) may be especially useful for mesopelagic eDNA studies.

Taxonomy, Reference Libraries, and Marker Resolution
In our results, most of our ASVs did not have exact matches on GenBank. Possible explanations for lack of exact matches are bioinformatics error and lack of available reference sequences. Metabarcoding studies typically uncover vastly more putative taxa than traditional morphological studies (Lindeque et al., 2013;Kelly et al., 2017;Sommer et al., 2017). While to some extent, this greater diversity may be real, it is also possible that the observed diversity is artifactual (Caron and Hu, 2019;Santoferrara, 2019). We feel this is unlikely in our case, as we employed an approach that assesses and corrects likely sequencing errors to produce "amplicon sequencing variants" (ASVs), which is thought to be more accurate than traditional clustering or "operational taxonomic unit" (OTU) approaches (Callahan et al., 2017;Macheriotou et al., 2019), especially for characterizing community biodiversity (Pauvert et al., 2019). We furthermore conservatively removed rare ASVs from our data analysis.
Identifying taxa in metabarcoding studies relies on accurate, well-populated reference libraries (Cristescu, 2014;Bucklin et al., 2016;Elbrecht et al., 2016;Porter and Hajibabaei, 2018). DNA sequences are matched against a reference database for taxon assignment; however, existing databases, including GenBank, are notoriously incomplete or contain errors (Leray and Knowlton, 2016;Lindsay et al., 2017;Santoferrara, 2019). Given the relative inaccessibility of deep oceanic waters, reference libraries may be particularly depauperate for animals in the mesopelagic. Our Genbank results showed matches for a variety of crustaceans and cnidarians; however we also noted an absence of V9 reference sequences on Genbank for many species of mesopelagic fishes (Govindarajan, pers. obs.). For some taxa, there is a mismatch between traditional barcode efforts that focus on the COI and 16S genes (Cristescu, 2014;Lindsay et al., 2015;Elbrecht et al., 2016;Leray and Knowlton, 2016), and metabarcoding analyses which often instead utilize 18S markers to capture a broad range of higher level taxa (de Vargas et al., 2015;Kelly et al., 2017;Djurhuus et al., 2018;Bucklin et al., 2019;Sawaya et al., 2019;Blanco-Bercial, 2020). Additionally, nominal species on Genbank may be misidentified, either due to mistakes in identification due to lack of taxonomic expertise or to the existence of cryptic species (Lindsay et al., 2015(Lindsay et al., , 2017Abad et al., 2017). Development of reference barcode libraries, grounded in taxonomic expertise (Wheeler, 2018;Pinheiro et al., 2019), for mesopelagic animals are essential for future applications of eDNA analyses in the mesopelagic.
Our goal, however, was primarily to identify higher level taxa rather than species, and V9 is better suited toward this task due to its relatively conserved nature compared to markers such as COI and 16S. Thus, an exact match to a reference V9 sequence might not necessarily indicate a species identification, because multiple closely related species could potentially share that sequence (Abad et al., 2017;Blanco-Bercial, 2020). On the other hand, because evolutionary rates are variable between lineages, a given percent similarity (e.g., 97%) might indicate different degrees of taxonomic classification depending on the lineage. In our Blast searches, we selected a 97% identity as our threshold for reporting Genbank results based on calibration of this marker in copepods (Wu et al., 2015). Similar calibration analyses should be conducted for other animal groups and markers as reference barcode libraries are expanded in order to better understand the taxonomic makeup of the sampled biodiversity.

CONCLUSION
More information on mesopelagic biodiversity is urgently needed as interest in harvesting these resources increases. eDNA analysis has tremendous potential to contribute to our understanding of this environment and fill important knowledge gaps, such as those related to the detection of animals that are poorly sampled by other methods. eDNA analysis is also highly efficient in terms of the number of taxa recovered per unit effort and can complement other traditional sampling methods. However, eDNA analysis is a new approach that to date has been primarily used in freshwater and coastal marine environments. Based on our results, we suggest the following future research directions for facilitating the application of eDNA analyses to the mesopelagic environment: (1) Experiments to determine eDNA shedding, decay, transport, and dispersal rates under mesopelagic physical and biological conditions to improve our interpretation of eDNA signals; (2) Assess relationships between eDNA and biomass; and (3) Populate genetic reference databases with sequences from mesopelagic species that have been morphologically identified by taxonomic experts for improved taxonomic assignments.

DATA AVAILABILITY STATEMENT
The datasets presented in this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found below: Dryad repository (doi: 10.5061/dryad.4xgxd255j).

AUTHOR CONTRIBUTIONS
AG conceived the study and conducted the DNA sequence analyses and drafted the manuscript. AG, AL, JMJ, WZ, JL, and PW planned, prepared, and conducted the shipboard sampling. RF conducted the lab work and assisted with the acoustics plots. JMJ conducted the acoustics analyses. WZ and PW conducted the CTD analyses. All authors contributed to data interpretation, structuring of the manuscript, reviewed and edited the manuscript, and approved the final version for submission.