An Acoustic Survey of Beaked Whales and Kogia spp. in the Mariana Archipelago Using Drifting Recorders

The distribution, abundance, and habitat of cryptic cetacean species such as beaked whales and dwarf/pygmy sperm whales (Kogia spp.) are challenging to study due to their long dive times and/or very limited surface behavior. Even less is known in minimally studied and remote regions, including the Mariana Archipelago and parts of the broader western Pacific. In 2018, we deployed a network of eight Drifting Acoustic Spar Buoy Recorders (DASBRs) on the west side of the Mariana Archipelago with the goal of examining the distribution and habitat of beaked whales and Kogia spp. in this region using passive acoustic monitoring. Concurrently, conductivity-temperature-depth (CTD) data were collected within the drift area and combined with satellite oceanographic data to build Ensemble Random Forest Models to identify specific oceanographic features that determine the distribution of these species. DASBRs deployed at locations ranging from 13°N to 18°N generally drifted from east to west between the Mariana Archipelago and the West Mariana Ridge. Spectral and temporal characteristics of echolocation signals were used to identify the presence of beaked whales and Kogia spp. species. This dataset contained frequency modulated (FM) pulses characteristic of Longman’s (Indopacetus pacificus), Cuvier’s (Ziphius cavirostris), and Blainville’s (Mesoplodon densirostris) beaked whales, as well as the unidentified beaked whale FM pulse known as the “BWC,” along with narrow-band high frequency clicks from Kogia spp. The detection rate was substantially higher for all species on the five tracks in the region north of 15.5°N than for those drifts occurring farther south. Species distribution models suggest that differences in the oceanographic characteristics between the northern and southern regions may impact foraging opportunities, possibly explaining the specific ecological niche for these species within this water mass. This is the first study of the distribution of cryptic cetacean species within the wider Mariana Archipelago region. We demonstrate that autonomous drifting acoustic recorders, combined with environmental sampling and remote satellite data are a powerful tool for studying the habitat dependent distribution of cryptic cetacean species.


INTRODUCTION
Beaked whales (Family Ziphiidae) and Kogia spp. (dwarf and pygmy sperm whales) are cryptic marine mammal clades that challenge researchers' ability to understand their ecology. Species in these groups are deep-diving, have long dive cycles with minimal surface time, and extremely low body profiles at the surface . All of these factors make them difficult to locate, identify, and track during visual-sighting based surveys, particularly during poor weather conditions (Barlow and Gisiner, 2006;. However, beaked whales produce vocalizations that are identifiable to the species level, and Kogia spp. vocalizations are readily identifiable to the genus level, establishing passive acoustic monitoring as an effective means of surveying for the presence of these species (Caldwell and Caldwell, 1987;Baumann-Pickering et al., 2013, 2014.
The vocal repertoire of beaked whales consists primarily of echolocation signals, though whistles (Dawson et al., 1998) and burst pulse sounds Rankin and Barlow, 2007) have also been described. Their echolocation signals include frequency-modulated (FM) pulses as well as clicks (Johnson et al., 2004(Johnson et al., , 2008Baumann-Pickering et al., 2013, 2014Keating et al., 2016). While full call repertoires have not been established for all species of beaked whales, species-specific FM pulses have been described for most. Beaked whales produce these FM pulses during deep foraging dives Zimmer et al., 2005), which typically last 60 min and can reach depths of 800-2,000 m ( Barlow et al., 2013Barlow et al., , 2020. The unique and species-specific features of beaked whale FM pulses and their reliable presence during foraging dives make them well suited to study using passive acoustic methods. Kogia spp. are small-bodied (2-2.7 m) deep diving cetaceans that travel in small groups (1-6 animals) (Willis and Baird, 1998;Baird, 2005). Their foraging dives are roughly 25 min long and are interspersed with short surfacings (Plön, 2004;West et al., 2009); at the surface they produce neither visible blow nor engage in visible surface behaviors (Willis and Baird, 1998;Baird, 2005). These factors make them extremely difficult to study, and almost all information on their distribution has been gained from records of stranded individuals (summaries in Willis and Baird, 1998;Taylor et al., 2012). However, Kogia spp. produce narrowband high-frequency (NBHF) echolocation clicks (Caldwell and Caldwell, 1987; with peak frequencies between 110 and 140 kHz. To date, researchers have not been able to discriminate the clicks of dwarf versus pygmy sperm whales; rather the NBHF clicks are identified only to the genus level (Griffiths et al., 2020). NBHF clicks are produced by several other cetacean species [e.g., Dall's porpoise (Phocoenoides dalli) & harbour porpoise (Phocoena phocoena)], but where these species are not known to be present, detection of NBHF clicks can be used to effectively identify Kogia spp. presence.
Visual sighting surveys often incorporate passive acoustic monitoring in the form of towed hydrophone arrays, however the array is generally at shallow depths (10-20 m), reducing its ability to detect species that dive and vocalize at deep depths. Networks of stationary recorders deployed to the seafloor have proven valuable in understanding beaked whales and Kogia spp. vocal behavior and distribution (Baumann-Pickering et al., 2013, 2014Hildebrand et al., 2019). However, broadscale acoustic networks spanning large geographical ranges are rare (Henderson et al., 2016), limiting their utility in describing beaked whales and Kogia spp. distribution across a range of habitats. The recent development of Drifting Acoustic Spar Buoy Recorders (DASBRs) provides a particularly excellent alternative (Griffiths and Barlow, 2015) for studying these cryptic cetacean species.
Drifting Acoustic Spar Buoy Recorders allow acoustic data collection at appropriate depths without added noise or other impacts from a survey vessel. They drift slowly, providing the opportunity to capture a full beaked whale or Kogia spp. dive cycle; and drift freely with ocean currents providing sampling opportunities across diverse habitats without potential bias that can arise from a priori decisions about monitoring locations (Griffiths and Barlow, 2016;Keating et al., 2018;Fregosi et al., 2020;Malinka et al., 2020). In addition, the slow drift rate and reduced noise levels allow for increased opportunities to detect the short wavelength and highly directional NBHF clicks produced by Kogia spp. Griffiths et al., 2020;Malinka et al., 2021).
At least five species of beaked whales and both species of Kogia are known to occur in the broader tropical western Pacific region. Cuvier's (Ziphius cavirostris) and Blainville's (Mesoplodon densirostris) beaked whales have been identified during visualsighting based surveys in the Mariana Archipelago (Hill et al., 2020) and detected at long-term passive acoustic monitoring sites in the southern portion of the archipelago (Baumann-Pickering et al., 2014;Simonis et al., 2020). Longman's beaked whales (Indopacetus pacificus) are recognized from strandings in the Philippines (Acebes et al., 2005) and southern Japan (Yatabe et al., 2010); ginkgo-toothed beaked whales (Mesoplodon ginkgodens) from rare sightings and strandings in Japan (Nishiwaki and Kamiya, 1958); and Deraniyagala's beaked whale (Mesoplodon hotaula) from a stranding in the Philippines (Lacsamana et al., 2015). Acoustic monitoring in the Mariana Archipelago has also revealed the presence of an unidentified beaked whale FM pulse known as the "BWC" (McDonald et al., 2009), which is possibly produced by the ginkgo-toothed beaked whale (Baumann-Pickering et al., 2014). Stationary passive acoustic monitoring efforts near Saipan and Tinian, ongoing since 2010, have revealed the year-round occurrence of Cuvier's, Blainville's, and BWC beaked whales in these regions, however the geographic distribution and environmental drivers of each species distribution are difficult to assess using data from fixed sites (Baumann-Pickering et al., 2014).
Detailed studies examining the linkage between cetacean occurrence and oceanographic features (e.g., MacLeod et al., 2003;Baumann-Pickering et al., 2014) are rare as such efforts require a rich dataset to reveal meaningful associations. Unique oceanographic features, with a strong zonal salinity front near 15 • N, bisect the Mariana Archipelago (Kimura et al., 2001). This archipelago lies in the path of the westward flowing North Equatorial Current (NEC) that carries two water masses: the North Pacific Tropical Water to the north and the equatorial water to the south. The North Pacific Tropical Water is advected from the subtropical gyre and characterized by high temperatures and high salinity, while the equatorial water is formed under the equatorial convergence zone and is cooler with low salinity (Suga et al., 2000;Suntsov and Domokos, 2013). Where these two waters meet, a salinity front is formed. This environment provides a unique opportunity to study the distribution of cetacean species in two distinct water masses over a relatively small geographic area.
We used acoustic data from DASBRs deployed within the Mariana Archipelago to examine the occurrence of beaked whales and Kogia spp. and compared those detections to both the conductivity-temperature-depth (CTD) and satellite-derived measures of local oceanographic conditions in order to better understand the ecological and oceanographic processes that drive beaked whales and Kogia spp. distribution within this poorly studied region.

Data Collection
The Pacific Islands Fisheries Science Center (PIFSC) Cetacean Research Program conducted Mariana Archipelago Cetacean Survey (MACS) from July 9 to August 1, 2018, aboard the NOAA R/V Oscar Elton Sette. The survey included ship-based visual search and towed hydrophone array passive acoustic monitoring for cetaceans, as well as the deployment of DASBRs and collection of oceanographic data using CTD casts. The study area extended from the Mariana Archipelago to the West Mariana Ridge, roughly 280 km to the west, and from 100 km south of Guam at 13 • N to the northern end of Pagan around 18 • N (Figure 1).
Drifting Acoustic Spar Buoy Recorders were deployed at random locations, generally near the island chain between the islands of Guam and Guguan and were left to drift while the shipbased visual and passive acoustic surveys continued. DASBRs were retrieved when the ship was in their general vicinity or when they were nearing the West Mariana Ridge. The survey area was divided into a 50 × 50 km grid, and a CTD cast to 1,000 m depth was conducted using a Sea-Bird Scientific SBE9 (#91660) sensor package within most grid cells across the study area (Figure 1 and Table 1).
The PIFSC DASBR is based on the design of Griffiths and Barlow (2015), but modified to withstand the rough seas and higher water temperatures common to the tropical central and western Pacific (Figure 2). The DASBR includes a spar-buoy surface float with trailing line and trailer buoy for recovery, 150 m of 1/2-inch (1.3 cm) line, and an anchor at the bottom. The spar-buoy contains an Iridium satellite tracker (NAL 9602-AB) that provides GPS location at programmed time intervals (typically every 2 h). A multi-channel (up to 4) SoundTrap ST4300-HF (Ocean Instruments, Auckland, New Zealand) is attached to the section of line 15 m above the anchor, with two hydrophones (HTI-92 & HTI-96; High Tech, Inc., Long Beach, MS, United States) spaced 10 m apart, one above and one below the ST4300-HF (Figure 2).
The SoundTrap ST4300-HF sampled at 576 kHz on two channels, with a custom low pass filter at 150 kHz to improve detection of NBHF Kogia spp. clicks (standard units filter at 90 kHz). Acoustic recordings were collected using a duty cycle with a 2-min sampling period starting at 5-min intervals. A Time-Depth-Recorder (TDR) (LAT1400-500m; Lotek, St. John's, Canada) was attached to the vertical line below the lower hydrophone to collect accurate hydrophone depth measurements (with 0.5 m resolution).

Detection and Classification of Beaked Whale and Kogia spp. Signals
We identified frequency-modulated beaked whale echolocation signals and Kogia spp. NBHF clicks within the acoustic data using the click detector module (IIR Butterworth 2 kHz high pass filter) within PAMGuard software v. 2.00.14c (Gillespie et al., 2009) and coded them with custom specifications based on peak frequency (Keating and Barlow, 2013). Secondarily, we manually extracted and grouped FM pulses for acoustic encounter analysis based on bearing angles and click trains. FM pulse bearing angles from PAMGuard were relative to the vertical hydrophone array (0 • = surface, 90 • = depth of array, 180 • = directly below array). Lastly, we manually classified detections of beaked whale FM pulses to the species level based on their spectral characteristics (e.g., peak frequency, −10 dB bandwidth, duration, inter-pulseinterval, encounter level spectrums) (Baumann-Pickering et al., 2013, 2014Keating et al., 2016), and detections of NBHF clicks were identified as Kogia spp. based on similarity of click characters to those described by  and Marten (2000). We did not attempt to calculate the detection range for any species. However, Barlow et al. (2021) performed multiple analysis of detection range for Cuvier's beaked whale FM pulses received on DASBRs and found a likely maximum detection range of ∼5 km.
To avoid oversampling the presence of each species, we aggregated adjacent files containing echolocation signals produced by the same species into "acoustic encounters." Beaked whales generally do not produce echolocation FM pulses near the surface or during recovery dives (between foraging dives) ( Barlow et al., 2013). Recovery dives typically last no less than 15 min, and we use these periods of silence to define the end of acoustic encounters, which for beaked whales included adjacent 2-min data periods with gaps in detections of less than 15 min. Beaked whale FM pulse detections occurring more than 15 min since the last detection are considered a new encounter, with no attempt to determine if the new encounter was from the same group of beaked whales detected previously. In this way, a beaked whale acoustic encounter represents an individual foraging dive by a group of beaked whales. Periods of Kogia spp. NBHF clicking in adjacent 2-min files were also aggregated into acoustic encounters. Kogia spp. have been observed echolocating near the surface Malinka et al., 2021) and their vocal behavior during foraging dives is not understood; therefore, we used a break in detections of one or more 2-min detection files to indicate a new acoustic encounter. Due to the duty cycled data collection, it is not possible to examine the specific start and stop  time of each acoustic encounter as echolocation signals may have begun or ended during a period with no recording, resulting in acoustic encounter duration lasting 0-6 additional minutes. Acoustic encounters of beaked whales and Kogia spp. were further aggregated for comparison to environmental covariates ( Table 2). For this analysis, we associated each acoustic encounter of beaked whales and Kogia spp. with the GPS location closest in time (either forward or backward) to the start of the encounter. This provided a binomial classification for each GPS point from the DASBR of either present or absent for each species, hereafter referred to as "GPS associated encounters." In some instances, a single GPS point could contain the presence of more than one beaked whale species and/or Kogia spp. producing echolocation signals.

Environmental Covariates
Fourteen CTD casts were conducted in the study region, but they did not provide adequate spatial range coverage of the temperature and salinity data for use in identifying spatial patterns of beaked whales and Kogia spp. GPS associated encounters. To obtain finer-resolution oceanographic data, we visually compared and calculated a Pearson's correlation of the CTD data with output from an ocean circulation model, the HYbrid Coordinate Ocean Model (HYCOM; hycom.org), from the same location and day as the CTD casts. HYCOM is a global eddy resolving ocean circulation model with a 1/12 • resolution and 32 depth layers and provides daily snapshots of temperature and salinity, as well as zonal and meridional current velocities. The HYCOM data adequately captured the subsurface oceanographic features (Figure 3) seen in the CTD data; we therefore used HYCOM data for measures of temperature and salinity in the environmental model. We identified 22 environmental covariates that potentially influence the distribution of beaked whales and Kogia spp. in the study area ( Table 2). They are a combination of bathymetric and dynamic covariates from in situ measurements, modeled, and satellite data. The HYCOM data were extracted for the location and date of each GPS associated encounter from the surface down to 1,000 m. To quantify covariates associated with the thermocline we used the variable representative isotherm method described in Fiedler (2010) to calculate thermocline temperature (TT) from HYCOM temperature data, as TT = T(MLD) -0.25[T(MLD) -T(400 m)], where the temperature at the base of the mixed layer is T(MLD) = SST−0.8 • C, and T(400) is the temperature at 400 m. The "thermocline depth" is defined here as the depth at which the temperature equals the TT. We define the "thermocline strength" as the slope of the thermocline layer [T(MLD) to T(400 m)] (Fiedler, 2010). Additionally, the seafloor slope was calculated from bathymetry data using the "slope" function in the "raster" package (Hijmans and Van Etten, 2012) in the statistical software R 3.6.3 (R Core Team, 2020).

Ensemble Random Forest Analysis
To determine which environmental covariates were predictive of whale distribution, we used a formulation of the Random Forest (RF) machine learning approach, Ensemble Random Forest (ERF), following Siders et al. (2020). Because beaked whales and Kogia spp. have different foraging patterns and habitat utilization one model might not be able to successfully predict the distribution of all species. Therefore, we built three different ERF models for GPS associated encounters: one for all detected species (beaked whales and Kogia spp.), one for Blainville's beaked whales, and one for Cuvier's beaked whales. There were not enough GPS associated encounters of Longman's beaked whales, BWC, or Kogia spp., to attempt to model their distributions within a single-species model. Given the clear oceanographic shift that is generally apparent near 15 • N within the Mariana Archipelago region and the uneven distribution of the encounters within the study area, in addition to building the three ERFs for the full study region, we built ERFs for each species grouping that included GPS associated encounters and environmental covariates from north of 15.5 • N only. Modeling the northern area separately allowed us to resolve finer scale environmental variability in the ERFs that might otherwise be overwhelmed by the more significant oceanographic differences between the northern and southern portions of the study region. Thus, we generated a total of six ERF models using the same construction methods for all areas. First, we subset the data using stratified random sampling with replacement into a training and a validation dataset, holding 20% FIGURE 3 | Salinity (A,C) and Temperature (B,D) depth profiles from south to north in the study area derived from data collected on conductivity-temperature-depth (CTD) casts (A,B) during the survey and output from an ocean circulation model, HYCOM (C,D). The light gray dotted lines show the location of the CTD cast, and the data have been interpolated using LOESS with a 20% smoother. Note that the longitudes for the CTD casts are not the same but collected within a 3-degree spatial area.
for the model validation dataset and using 80% for training the model. GPS associated encounters were sampled separately to ensure there were adequate encounters in the validation dataset. This was especially important for the less commonly encountered species that might otherwise have been underrepresented in the validation dataset. Second, because the GPS associated encounter data showed strong class imbalance with absences outnumbering presences, we down-sampled the majority class in the validation dataset to achieve a 1:1 ratio between presence and absence (following Stock et al., 2020). The Random Forest models were built using the R package "randomForest" (Liaw and Wiener, 2002). For each RF model within an ensemble, 1,000 decision trees were grown and five covariates drawn randomly at each split from the pool of 23 covariates ( Table 2). The number of covariates to pull at each split was optimized using a single RF for each of the six species-area grouping models. The ensemble was generated by running 200 RFs, using a new subsample of the training dataset for each RF. To evaluate model performance, we predicted each RF on the full dataset then used the average to form the ERF predictions (Siders et al., 2020), using the Area Under the Curve (AUC). It was calculated from the receiver operating characteristic curve (ROC) and the true skill statistic (TSS; Allouche et al., 2006) as evaluation criteria. ROC was calculated using the "ROCR" package (Sing et al., 2005), and TSS was calculated using the "BIOMOD" package (Thuiller et al., 2012) in R. AUC values range between 0 and 1 with 0.5 being equivalent to predicting presence or absence by chance; an AUC greater than 0.7 (a 70% chance of predicting a true presence) is generally considered a useful model (Phillips et al., 2006). The importance of the environmental covariates to the species distributions was ranked based on the mean decrease accuracy (mean of 200 RFs per model), a measure of the extent to which removing each covariate reduces the accuracy of the model. To monitor model overfitting, we included a covariate generated from random numbers and disregarded all environmental covariates with covariate importance lower than the random covariate (Siders et al., 2020). Additionally, to visualize the effects of any environmental covariates that had a mean decrease accuracy greater than 5% of the total number of whale encounters omitted when that covariate is not part of the particular model (e.g., >6.75 for all whale full area model, >2.9 for Blainville's full area model, >2.45 for Cuvier's full area model), we generated Accumulated Local Effects (ALE) plots for the ERF using the "ALEplot" package in R (Apley, 2018).

RESULTS
Eight DASBRs were deployed during the MACS effort and recorded data over a 19-day period, with an average deployment duration of 10 days per DASBR drift (Table 1 and Figure 1). All DASBR drifts traveled from east to west covering a total of 1,449 km of trackline. Cumulatively, the acoustic recorders collected 1,818 h of data.

Detection and Classification of Beaked Whales and Kogia spp. Signals
Out of 21,731 2-min recording files, 3% (561) contained acoustic detections of four species of beaked whales (Blainville's, Cuvier's, Longmans, BWC) as well as Kogia spp. (Table 3). There were two FM pulse detections with insufficient signal quality to differentiate between Blainville's or Cuvier's beaked whales.
Most acoustic encounters spanned more than one 2-min recording period, resulting in 152 total acoustic encounters with beaked whales and Kogia spp. over the full recording duration (Table 3), with 76% of the encounters classified as Blainville's or Cuvier's beaked whales. The median encounter duration for Longman's beaked whales (90 min) was substantially longer than the other species (6-15 min). The vast majority (89%) of acoustic encounters occurred on the five DASBR tracks between Saipan and Pagan (15.5 • N to 18 • N latitude). There were no encounters with Longman's beaked whales south of 15.5 • N (Saipan) and only seven encounters of (4) Cuvier's and (3) Blainville's beaked whales in this region. Although relatively few, encounters with BWC and Kogia spp. were more evenly distributed throughout the monitored area.
The DASBR drift tracks contained 980 GPS points representing the drift track at 2-h intervals, with 139 (14.2%) GPS associated encounters with beaked whales and Kogia spp. (Table 3). Twelve GPS points included more than one species.

Environmental Data
Fourteen CTD casts were conducted across the survey area (Figure 1). Evaluation of the temperature and salinity data from the CTDs revealed differences in the physical characteristics of the water. The stations sampled to the north of 15.5 • N have a strong salinity maximum at ∼100-200 m and a deep salinity minimum near 600 m, consistent with North Pacific Tropical Waters (Suga et al., 2000;Suntsov and Domokos, 2013). A surface salinity front can be seen near 15 • N ( Figure 3A). Waters to the south of this front have a surface salinity minimum and a deep salinity maximum weaker than that found to the north. Temperature and salinity values from the CTD and modeled HYCOM data from the same location on the same date were highly correlated (r = 0.998 and r = 0.730, respectively) (Figure 3).

Environmental Modeling
The six ERF analyses performed with varying success; the highest TSS and AUC (0.68 and 0.93) values for the ERF model included only Cuvier's beaked whale (Table 4). Counts consist of detections within 2-min individual recording files, acoustic encounters where adjacent 2-min files were binned together to represent dive-cycles, and GPS associated encounters where 2-min files were assigned presence/absence to the closest GPS point. Duration was only calculated for acoustic encounters to gain insight for vocal behavior given as median with 10th and 90th percentiles in parentheses.

All Species Model
The model using all species encounters for the full study region had lower skill than did the model restricted to the northern part of the region (TSS = 0.46 and 0.50, respectively). However, the AUC was higher for the full model (0.86) compared to the north only model (0.84) ( Table 4). The most important environmental covariates differed between the two models as well; the full area model found thermocline covariates more important while the northern only model was more sensitive to current velocities ( Figure 4A and Table 4). Two environmental covariates had a mean decrease accuracy greater than 5% of the encounters in the full area model: temperature at 400 m and thermocline temperature. The probability of encounters increased with increasing temperature and peaked around 11 and 24 • C for temperature at 400 m and at the thermocline, respectively (Figure 5). For the northern area model, bottom depth was the only covariate that had a mean decrease accuracy greater than 5% of encounters ( Figure 4B and Table 4) and with the highest probability of encounters in shallower (<3,000 m) waters ( Figure 5).

Blainville's Beaked Whale Model
The two ERF models focusing on only Blainville's beaked whale encounters had similar AUC values (0.86 and 0.83 for full area and northern models, respectively; Table 4) to the all species model, but the all area model performed with higher skill (TSS = 0.52) than both all species ERFs. The environmental covariates with the highest mean decreased accuracy varied between the full and northern region models and the variance between random forest models was greatest for this set of ERFs (Figures 4C,D). In the Blainville's full model, similar to the ERF for all species, covariates describing the thermocline were important, while in the northern model covariates describing salinity and current movement had higher importance ( Figure 4C,D and Table 4). Two environmental covariates had a mean decrease accuracy greater than 5% of the encounters: temperature at 400 m and thermocline temperature, and the probability of encounters increased with increasing temperature for both ( Figure 5).

Cuvier's Beaked Whale Model
The Cuvier's beaked whale encounter ERFs outperformed the four all species and Blainville's ERFs, with the northern model having a higher AUC (0.93 and 0.90 for full area and northern models respectively; Table 4) but lower skill (TSS = 0.65 and 0.68 for the full area and northern models, respectively). Environmental covariates with the highest importance were consistent for both geographic model extents with bottom depth, temperature at 400 m, chlorophyll a, thermocline temperature, mixed layer depth (full area model), and salinity at 400 m (northern model) ranking highest (Figures 4E,F and Table 4).
Eleven environmental covariates had a mean decrease accuracy greater than 5% in the full area model ( Figure 5); detections increased over shallower bathymetry with increasing chlorophyll, mixed layer depth, thermocline and halocline strengths, and TDR temperatures. In the north only model, ten environmental covariates had a mean decrease accuracy of greater than 5% (Figure 5), and patterns were similar to the full area model with detections increasing over shallow bathymetry, higher chlorophyll levels, and water movements at depth to the north and east. Less saline and lower thermocline temperature waters had higher probability of detections. For both the full area and north-only models the probability of detecting Cuvier's whales was highest when temperatures at 400 m were 10-12 degrees.

DISCUSSION
Cetacean distribution within the Mariana Archipelago and throughout much of the western Pacific is not well understood. Recent long-term passive acoustic monitoring efforts together with a variety of small-scale visual surveys (reviewed by Hill et al., 2020), have provided substantial new insight into cetacean occurrence and distribution in the archipelago, though inferences into cryptic beaked whale and Kogia species remain challenged by the rarity of visual sightings. Although longterm acoustic monitoring has provided information on the relative occurrence and seasonality of several beaked whale species at fixed monitoring locations, inferences about overall species distribution, abundance, and habitat are limited by their geographic scope. The rich DASBR dataset used in this study enabled a more detailed examination of the geographic distribution of beaked whales and Kogia spp. within the Mariana Archipelago than has been possible with other datasets. Using established detection and classification methods, we identified the presence of Kogia spp. and four species of beaked whales in the Mariana Archipelago and then paired the acoustic detection data with in situ and satellite derived oceanographic datasets to examine habitat characteristics. Models of the full study area resulted in a consistent set of oceanographic covariates that link Mean decrease accuracy is a measure of how much removing each covariate reduces the accuracy of the model. Covariates to the right of the dashed gray bar represent those where more than 5% of the total number of whale encounters are omitted when that covariate is excluded for that particular model. overall species distribution with the two prominent water masses that bisect the region. In contrast, each model describing only the northern region, which had the highest number of beaked whales and Kogia spp. encounters, had a largely distinctive set of important covariates, likely reflecting the unique habitat preferences of the individual species within this higher density region. Together these data and analyses provide new insights into the habitat dependent distribution of deep-diving cetaceans in this little-studied region.

Detection and Classification of Beaked Whales and Kogia spp. Signals
The bulk of what is known about beaked whale and Kogia spp. occurrence in the Mariana Archipelago comes from longterm fixed-site passive acoustic monitoring in the southern portion of the archipelago. Blainville's, Cuvier's, and BWC beaked whales have all been recorded year-round at fixed long-term monitoring sites near Saipan and Tinian (Baumann-Pickering et al., 2014;Simonis et al., 2020). Hill et al. (2020) compared the overall relative occurrence (based on proportion of recording days with each species present) of these species at the same monitoring locations as well as at a site near Pagan. They reported that Blainville's were the most commonly detected species at all three locations, with varying proportions of Cuvier's and BWC depending on the monitoring location. These studies did not specifically examine the occurrence of Longman's beaked whales and did not report echolocation signals other than those identified as one of these species. Although such long-term monitoring provides a detailed examination of the temporal variability in the occurrence of beaked whale species through time, inference about geographic distribution is limited to the monitoring location only. Our data similarly reveal the highest encounter rates for Blainville's beaked whale across the monitoring region (40% of beaked whale encounters), though with only slightly lower encounter rates for Cuvier's beaked whales (36%) when sampling across the broader region (Table 3). Our encounter rate for BWC (10%) is similar to the relative presence of this species reported by Hill et al. (2020). Together with a single towed array acoustic detection of Longman's beaked whales during the same survey that supported our DASBR deployments (discussed in Hill et al., 2020), our data show Longman's beaked whale to be present in this region despite its absence in previous visual or acoustic studies in the Mariana Archipelago. There were considerably fewer encounters of Longman's beaked whales than any other beaked whale species in this study. However, Longman's encounters were unique in that they were the only species detected strictly in the waters north of 15.5 • N, and their acoustic encounter duration was substantially longer than other species, representing 30% of the total duration of beaked whale detections (Table 3). Our data support the widespread distribution of all four of these beaked whale species throughout the archipelago, including previously unsampled waters as far west as the West Mariana Ridge.
Historical stranding records indicate that Deraniyagala's beaked whales may also be present in the broader western Pacific and Philippine Sea region (Lacsamana et al., 2015). The sounds of this species have been described previously (Baumann-Pickering et al., 2010), and our detection and classification approach would have allowed for their identification if they were present and vocalizing near our recorders. The absence of Deraniyagala's beaked whales within our dataset and from fixed site monitoring within the archipelago suggests that this species is rare in this region.
There are few visual sightings of Kogia species during visual surveys in the Mariana Archipelago, all of which were pygmy sperm whales (Kogia sima) in nearshore waters off Guam and Saipan (Hill et al., 2020). In contrast, Kogia spp. were detected on five DASBR drifts in deep, open waters between the Mariana Archipelago to the West Mariana Ridge, suggesting a pelagic distribution similar to that observed in other regions. Although their occurrence in deep waters near the Marianas is not surprising given what is known of global distribution patterns for the species, our DASBR detections are the first known data indicating that Kogia spp. are widely distributed throughout the Mariana region.
The echolocation FM pulses produced by Blainville's and Cuvier's beaked whales correspond to time spent foraging at depth Zimmer et al., 2005;Barlow et al., 2013). Additionally, the synchronous diving behavior of these beaked whales (Aguilar de Soto et al., 2020), followed by time spent near the surface between foraging bouts , represents a natural break in our acoustic encounters, where individual encounters likely represent a single foraging dive by the group of whales. Although DASBR acoustic data are duty cycled precluding precise measurement of encounter duration, we can compare encounter durations among species. Blainville's, Cuvier's, and BWC beaked whales averaged encounter times less than 16 min in sharp contrast to the relatively long (91 min) average encounter duration for Longman's beaked whales (Table 3). Visual sighting data for Blainville's and Cuvier's beaked whales indicate that both of these species are not particularly social, dive in sync, and generally occur in small groups of only 2-3 animals (Baird, 2005(Baird, , 2019Aguilar de Soto et al., 2020;Hill et al., 2020;Alcázar-Treviño et al., 2021). Visual group size estimates for Longman's beaked whales during surveys in the Hawaiian Islands varied from 15 to 60 animals (Bradford et al., , 2021 and have been observed breaking into smaller subgroups and diving in a staggered rotation (Yano et al., 2018). This continual, staggered diving rotation could explain the extremely long acoustic encounter durations for Longman's beaked whales.

Environmental Modeling
The broad geographic sampling achieved with our DASBR tracks revealed a clear geographic demarcation in the detection of beaked whales and Kogia spp. showing substantially more encounters of all species on those drift tracks to the north of Saipan compared to those in the south. This north-south division aligns with a distinct shift in oceanography; all three full study area ERF models are described by temperature and mixed layer covariates that partially define the gradient between the North Pacific Tropical Waters in the north and equatorial waters to the south. The break between water masses is generally defined by a salinity front (Suga et al., 2000;Suntsov and Domokos, 2013), and although only models for Cuvier's contained salinity covariates with a mean decrease accuracy greater than 5% of whale encounters (Table 4), all full study area models did include salinity metrics as significant predictors of whale occurrence (Figure 4).
Perhaps more interesting was the finer-scale assessment of habitat covariates within the northern water mass, where beaked whale encounters were substantially higher, and some of the influence of the oceanographic shift across the front were removed. Within this region, the most significant environmental covariates within each ERF model varied widely. Model predictive power was fairly high (TSS = 0.65-0.68, AUC = 0.90-0.93) for both the full study area and northern region Cuvier's models, and these models shared 4 of 5 of the top predictors, substituting only mixed layer depth for salinity at 400 m in the northern model. The similarity of the two models is not surprising because all but four of the 50 GPS associated encounters of Cuvier's beaked whales were recorded north of 15.5 • N. Only the bottom depth covariate was shared between the all species model and the Cuvier's model, suggesting our models may have captured a unique set of environmental predictors that collectively describe the specific ecological niche for these species within this water mass.
Differences in the number of whale encounters north and south of the salinity front could have been impacted by the availability of micronekton within each region. Abecassis et al. (2015) found that a high density of midwater micronekton was an important covariate for Blainville's beaked whale distribution off the island of Hawai'i, likely because it represents the prey for beaked whale food sources. Benoit-Bird et al. (2020) found that additional oceanographic features highly correlated with beaked whale habitat were associated to deep-sea squid habitat, highlighting the importance of these focused food sources. When Suntsov and Domokos (2013) sampled micronekton and macro zooplankton on three meridional transects between 10 • N and 17 • N in the Mariana Archipelago, they found relatively low concentrations within the oligotrophic NEC suggesting the prey availability for beaked whale and Kogia spp. may be more limited in this region.
Similar to previous studies of beaked whale habitat Hazen et al., 2011;Abecassis et al., 2015), we found bathymetry (bottom depth) to be an important predictor of beaked whale distribution. In particular, within the northern models, bottom depth was the most important covariate for the all species and Cuvier's model with increased encounters taking place over relatively shallow bathymetry (1,800-2,300 m). Stomach contents of Cuvier's beaked whales consist primarily of cephalopod species along with non-food source materials (asphalt, coal-like material) suggesting occasional foraging at the sea floor, perhaps by pinning prey items to the ground or foraging for prey within the substrate (Santos et al., 2001(Santos et al., , 2007Adams et al., 2015). Blainville's beaked whales and other Mesoplodon prey species differ from Cuvier's beaked whales, with stomach contents containing more fish species and some cephalopods (Santos et al., 2007) showing with no indication of foraging at the seafloor. In addition to bathymetry, chlorophyll a and mixed layer/thermocline covariates were also important. The ERF models suggest that these whale species prefer shallow/seamount areas that have higher chlorophyll concentrations and a shallower mixed layer depth and cooler thermocline temperature. This preference for shallow/seamount areas was also found in the tagged study of Cuvier's beaked whales off southern California waters . Ocean currents interacting with the physical structure of a seamount can generate current enhancement, turbulence, and upwelling to bring nutrient rich waters up into the euphotic zone causing a seamount induced chlorophyll enhancement (Lavelle and Mohn, 2010). Seamounts in the Mariana Archipelago region predominantly show chlorophyll enhancement with persistent increased primary productivity, which is transferred to higher trophic levels (Leitner et al., 2020), potentially providing richer foraging grounds for whales compared with surrounding waters. Additionally, seamounts can trap and concentrate micronekton that undergo diel vertical migration, further contributing to increased concentrations of potential forage (Cascão et al., 2019).
Overall, we gained significant new insight into the relative occurrence of beaked whale and Kogia spp. in a historically data deficient area, as well as providing a more nuanced view of their distribution and habitat associations. We presented a method of improving data collection of cryptic cetacean species in remote environments and of applying environmental covariates to explain their distribution, along with collecting viable data for efforts to estimate abundance where new techniques are currently being developed (Barlow et al. submitted). Our methods are applicable to a wide range of cetacean species, and could greatly enhance the spatial coverage of future marine mammal surveys.

DATA AVAILABILITY STATEMENT
The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.

AUTHOR CONTRIBUTIONS
EO generated the research funding. AA, EO, EN, and JM designed and carried out the field data collection. JM performed acoustic analysis. JW and ZS analyzed environmental data and carried our modeling efforts. All authors contributed to the manuscript and agreed to be accountable for the content of the work.

FUNDING
DASBRs and staff support for data analyses was provided by the NMFS Pacific Islands Fisheries Science Center. Field teams were supported by U.S. Navy Commander, Pacific Fleet and PIFSC.