Advancing the Interpretation of Shallow Water Marine Soundscapes

Soundscapes offer rich descriptions of composite acoustic environments. Characterizing marine soundscapes simply through sound levels results in incomplete descriptions, limits the understanding of unique features, and impedes meaningful comparisons. Sources that contribute to sound level metrics shift in time and space with changes in biological patterns, physical forces, and human activity. The presence of a constant or chronic source is often interwoven with episodic sounds. Further, the presence and intensity of sources can influence other sources, calling for a more integrated approach to characterizing soundscapes. Here, we illustrate this complexity using data from a national-scale effort, the Sanctuary Soundscape Monitoring Project (SanctSound), an initiative designed to support collection of biological, environmental, and human use data to complement the interpretation of sound level measurements. Using nine examples from this diverse dataset, we demonstrate the benefit of integrating source identification and site features to interpret sound levels across a diversity of shallow water marine soundscapes (<150 m). Sound levels from sites in high human use areas reflect the anthropogenic influences on the soundscape, especially when measuring broad frequency bands, whereas sites with relatively low human activity and high sound levels reveal biological features of the soundscape. At sites with large tidal changes, sound levels correlated with the magnitude of tidal flow, and sound levels during high tidal flow periods were similar to sound levels at sites near commercial shipping lanes. At sites in relatively close proximity (<30 km), sound levels diverge because of the presence of different proximate sound contributors and propagation features of the site. A review of emerging methodologies for integrated soundscape analysis, including acoustic scenes, provides a framework for interpreting soundscapes across a variety of conditions.
With a global growth in monitoring efforts collecting standardized measurements over widely distributed arrays, more integrated methods are needed to advance the utility of soundscapes in marine resource management.

INTRODUCTION
Shallow water marine environments present complex and dynamic blends of sounds and sonic relationships. Acoustic monitoring offers a unique and multi-dimensional view into an ecosystem that can be readily recorded and archived. Characterizing the collection of all sounds both near and far present at a given location provides a comprehensive view of all the acoustic information available to listeners. This collection of sound is often referred to as a soundscape (ISO-18405, 2017) and represents an interconnected landscape of information networks (Barber et al., 2010). Individual perceptions of soundscapes (ISO-12913, 2014) create unique acoustic habitats embedded within these soundscapes. Soundscapes include vital communication signals, as well as the sensory condition against which animals must detect and decipher acoustic signals from conspecifics, predators, and prey (Popper and Hawkins, 2019) and important cues on the conditions of an environment. These cues, referred to as soundscape orientation (Slabbekoorn and Bouton, 2008), can direct movement and help animals identify suitable habitat. The rich information contained within shallow water soundscapes affords many opportunities for ecological studies and conservation applications, yet real analytical challenges remain.
Soundscape analyses pursue diverse objectives, reflecting the diversity of information available (Figure 1). There is a growing need for soundscape measurements to aid in biodiversity assessments at regional and global scales. This approach parses biological soundscape features while accounting for variation related to abiotic and anthropogenic contributions (Mooney et al., 2020). Noise impact assessments focus on parsing noise sources and intensities, examining biological responses to these noise sources (Kunc and Schmidt, 2019; Duarte et al., 2021), and characterizing noise-free conditions (Buxton et al., 2017). Noise impact assessments also provide measures of acoustic habitat quality (Merchant et al., 2015). Acoustic scene assessments aim to understand all sources that are present, identify dominant sources in the sound field, and quantify modes of spatiotemporal variation with the goal of differentiating soundscapes. Acoustic scenes comprise identifiable sources against a background summation of unidentifiable sources; the identifiable sources are often the targets of classification schemes (Bregman, 1990; Barchiesi et al., 2015). The residual acoustic scene includes the fluctuating sound levels that cannot be attributed to specific sources, which sets the perceptual and detector performance limits for source identification (ANSI/ASA S3/SC1.100, 2014). Both components can have important implications for how animals respond to sound in the environment (Ellison et al., 2018). Our investigation of shallow water marine soundscapes focuses on acoustic scene assessments including current approaches, challenges, and future opportunities.
The sounds present in a soundscape fall into three general categories of sound sources: sounds generated by biological sources, human activity, and abiotic conditions, hereafter referred to as biotic, anthropogenic, and abiotic (Figure 1). While sounds in these categories can have similar acoustic properties and characteristics, separating them is essential for interpreting soundscapes. These sources, when analyzed together, provide insight into the acoustic scene. Disentangling the occurrence and characteristics of all sounds present in a soundscape and how they vary across space and time presents significant analytical challenges that remain an intriguing line of research. The goal of these methods is to provide a comprehensive assessment of ecosystems by simultaneously monitoring biological and physical conditions alongside human influences.
The commonly measured characteristics of soundscapes vary in terms of amplitude or sound level, frequency content, temporal patterns, spatial extent, and source occurrence (Figure 1). These characteristics relate to both the sound sources and the sound propagation conditions and features associated with the location of the listening station. Further, sound sources are interrelated and the presence of one sound may influence the production or detection of another. For example, anthropogenic sources can change the frequency content and temporal occurrence of biological sounds (e.g., Nemeth et al., 2013). Geospatial features, such as bathymetry and bottom type, as well as variation in water column stratification, determine the propagation of sounds. Further, these features along with receiver characteristics (i.e., instrument settings, animal hearing) limit the distances at which sources of interest are detected at a given location. Proximity to biological (e.g., spawning grounds, migration corridors) or human activity (e.g., commercial ports, fishing grounds, resource extraction sites) will influence the characteristics of the sound sources present in a soundscape. Temporal occurrence of sounds can be short transient signals that occur over seconds or minutes (e.g., animal calls, nearby boat passage), possibly in regular intervals. Alternatively, these sounds can have continuous presence in the soundscape over hours or days resulting in chronic contributions to the soundscape (e.g., rain or wind noise, biological choruses, distant shipping). This distinction between transient and chronic sounds is contingent upon perceptual rendering or analytical procedures pertaining to effects under investigation. Seasonal, latitudinal, and celestial factors will have an effect on the presence of some sounds (Staaterman et al., 2014; Širović et al., 2015; Haver et al., 2017, 2019). Understanding, measuring, and integrating these features aids the interpretation of common soundscape metrics such as sound levels. Soundscapes in shallow water environments tend to contain many sound sources, resulting in complex conditions and in many cases, relatively high sound levels. Further, the intricacies of sound propagation in shallow waters will influence relationships between sounds received at listening locations and actual acoustic activity in the surrounding environment (Jensen et al., 2011). Therefore, broadband sound levels in a soundscape may be similar at two sites, yet very different in terms of the composition of sounds present. Incorporating source identification and site features is therefore necessary when comparing soundscapes across broad spatial scales.
FIGURE 1 | Soundscapes can be interpreted from different perspectives: biodiversity assessment, noise assessment, or acoustic scene assessment. Each interpretation focuses on specific source types (shown by arrows, dashed arrows indicate possible source focus). There are many acoustic characteristics of a soundscape and this list represents some of the more commonly measured and discussed. The features of a listening location are generally site or habitat specific and these listed serve as examples for shallow water marine soundscapes, some of which can be measured using passive acoustics (indicated by +).
Using examples from a national-scale effort to characterize and compare widely distributed soundscapes in US National Marine Sanctuaries (NMS), we: (1) illustrate challenges faced when comparing sound levels in isolation, (2) outline the benefit of integrating contextual knowledge, and (3) describe next steps to advance the utility of soundscape analysis for natural resource management.

CURRENT APPROACHES TO CHARACTERIZATION OF MARINE SOUNDSCAPES
Methods and associated metrics used to characterize underwater soundscapes in terms of sound levels are diverse (Erbe et al., 2016;Miksis-Olds et al., 2018). Often the first step in understanding a soundscape is visualizing the sound field by transforming the waveform data into spectral content. The resulting graphics, known as spectrograms, reveal patterns in both frequency content of a soundscape as well as temporal patterns. While the resolution of these graphics is user defined, these visualization techniques are a means of data averaging and result in long-term spectral averages that can be used to visualize unique characteristics of a site or presence of acoustic events (Dias et al., 2021). These visual representations of sound are valuable for soundscape exploration, qualitative descriptions, and manual extraction of sound events; additional analytical steps are typically necessary to derive quantitative and comparable metrics for sound levels.
The most common way to characterize and compare soundscapes is by measuring the variation in pressure within a specified frequency band and time interval, that is, sound pressure levels (ANSI/ASA S1.1, 2013), hereafter referred to as sound levels. Previous studies have referred to these measurements as ambient noise levels (Wenz, 1962; Hildebrand, 2009) and typically used them in a signal detection framework (ANSI/ASA S1.1, 2013; ISO-18405, 2017). We retain the more specific measurement term, sound levels, when talking about measurements of all the sounds present within a soundscape to avoid confusion with multiple definitions of ambient (ANSI/ASA S3/SC1.100, 2014; ISO-1996-1, 2016; ISO-18405, 2017). There are multiple processing decisions when converting an audio recording into sound levels, and while the details are beyond the scope of this manuscript, calibrated and standardized methods of sound level measurements are necessary to ensure compatibility of metrics between soundscapes (e.g., Martin et al., 2021). Sound levels are often summarized using percentiles to compare both the average level of sound and the variation within a specified time-period. Sound level metrics have been quantified at various sites in nearly every ocean basin (e.g., Wenz, 1962; Miksis-Olds et al., 2013; Dziak et al., 2016; Širović et al., 2016; Haver et al., 2017, 2018; Heenehan et al., 2019). In complex and dynamic environments, especially shallow water, it is challenging to understand the sources and conditions driving sound levels. Using only sound levels or derivations thereof is informative for comparing overall sound energy in an environment, but in many cases limits the scope of interpretations and comparisons.
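The percentile summaries mentioned above can be illustrated with a minimal Python sketch. The function name `percentile_summary`, the choice of percentiles, and the synthetic hourly values are assumptions made for illustration; they are not taken from the SanctSound processing code.

```python
import statistics

def percentile_summary(levels_db, percentiles=(5, 25, 50, 75, 95)):
    """Return selected percentiles of a sequence of sound levels (dB)."""
    # With n=100, statistics.quantiles returns the 99 cut points
    # corresponding to the 1st through 99th percentiles.
    cuts = statistics.quantiles(levels_db, n=100, method="inclusive")
    return {p: cuts[p - 1] for p in percentiles}

# Hypothetical hourly sound levels (dB re 1 uPa), for illustration only
hourly_levels = [92.0, 95.5, 97.1, 99.8, 101.2, 103.4, 110.6, 96.3, 94.7, 98.9]

summary = percentile_summary(hourly_levels)
# The 50th percentile describes the typical level; the 5th-95th span
# describes how much the level varies within the period.
spread_db = summary[95] - summary[5]
```

Reporting the median together with a spread of this kind is one simple way to compare both the typical level and the variability between sites.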
Various methods have been applied to sound level measurements to extract more meaningful comparisons of sound levels. Mennitt et al. (2014) utilized copious spatial measurements of sound levels to fit a soundscape model to landscape features; pre-industrial conditions were predicted by minimizing the anthropogenic contributions in this model. In a comparison of contemporary ambient noise levels to historical measurements, all discrete sound sources were removed before calculating sound level statistics to match methods used in historical data (e.g., McDonald et al., 2006). Other studies identified the contributing sound sources and compared sound levels with only known sound sources present.
Another approach is using acoustic indices, or mathematical summaries of variation and patterns in sound levels. There are over 70 published acoustic indices used for a variety of research questions (Buxton et al., 2018). These approaches have had mixed results, especially in marine environments (Buxton et al., 2018; Dimoff et al., 2021; Nguyen Hong et al., 2021). Further, variation in sound levels measured by these indices can be driven by different features (biological, anthropogenic, or abiotic) and in some cases the indices do not distinguish between these features. Lastly, detection and classification of individual sources provides insight into one or multiple sources of interest, yet only represents certain aspects of the soundscape. Integrating identification of multiple sources with features of a listening location (Figure 1; e.g., weather patterns, distance to ports) holds promise for advancing soundscape comparisons and interpretation across broad spatial scales.

SOUNDSCAPE CHARACTERIZATION USING SOUND LEVELS WITH CONTEXTUAL KNOWLEDGE
Using a national-scale effort to characterize and compare soundscapes in widely distributed US National Marine Sanctuaries (NMS), coupled with a wealth of additional higher resolution observations available within these protected areas, we illustrate the benefits of contextual knowledge from identification of sources present and site information for soundscape interpretation.

Sanctuary Soundscape Monitoring Project (SanctSound)
The U.S. National Oceanic and Atmospheric Administration (NOAA) and the U.S. Navy engaged in a multi-year effort (2018-2022) to monitor underwater sound within the U.S. National Marine Sanctuary System. The agencies worked with numerous scientific partners to study sound within seven national marine sanctuaries and one marine national monument. The study included sites off the east and west coasts of the U.S. and in Hawaii. As the first coordinated monitoring effort of its kind for the U.S. National Marine Sanctuary System, SanctSound was designed to provide standardized acoustic data collection to document how much sound specific sources contribute within these protected areas, as well as the potential impacts of noise on the areas' marine taxa and habitats. To understand the features of a given listening station that influence measured sound levels, the SanctSound effort encompassed results from multiple acoustic detection algorithms (biological and anthropogenic), data from a wide range of non-acoustic variables (e.g., gliders, ship traffic data, weather stations), and sound propagation models to quantify variation in specific sound source detection ranges.
Acoustic data were collected using SoundTraps, which are compact, self-contained underwater sound recorders developed by Ocean Instruments, Inc. Instruments were set to record continuously at a sampling rate of 48 kHz or 96 kHz (Table 1). All instruments were the standard model with a working frequency range of 20 Hz to 60 kHz. Power spectral density (PSD) levels per hour were calculated as the median of mean-square pressure amplitude (µPa²) with a resolution of 1 Hz/1 second from 20 Hz to 24,000 Hz over no less than 1,800 seconds in each hour and converted to decibels (dB re 1 µPa²/Hz). Octave band sound pressure levels (OLs) were calculated by integration of PSD levels with a 1 Hz/1 second resolution over each octave band with nominal center frequencies ranging from 125 to 16,000 Hz (IEC, 2014). Resulting sound pressure levels in octave bands below 125 Hz were excluded due to uncertainty in propagation conditions and instrument sensitivity. Broadband sound pressure levels (BBLs) were calculated just as OLs but integrated over the full frequency range (88 to 22,387 Hz; IEC bands 20-43). The resulting OLs and BBLs with the 1 second resolution were then used to calculate hourly OLs and BBLs as a median over no less than 1,800 1-s values for that hour. The OLs and BBLs per hour were converted to decibels (dB re 1 µPa). The spectral and temporal resolutions presented in this paper were selected to be able to make broad comparisons across multiple sites and were adequate at capturing sound level features of interest. Data used in this analysis, as well as commonly used 1/3-octave band levels, are available via the project data portal, and code for calculating sound pressure levels from audio recordings is available on GitHub. Here, we used 31 days in spring 2019 from eight of the 30 recording locations and 60 days from one recording location to explore soundscape characterizations through sound level measurements across diverse regions (Figure 2 and Table 1). The 60-day recording period at a site in the Channel Islands NMS better represented a temporal feature of interest in sound levels from fish sounds that changed from individual calls to full chorus. For source identification, we used results from detection of the acoustic presence of vessels (Solsona-Berga et al., 2020) as well as detection results for different species of interest (e.g., fish, marine mammals) depending on the site (Mellinger and Clark, 1997; Urazghildiiev and Van Parijs, 2016). For contextual information, we integrated sound level measurements with the presence of ships equipped with Automatic Identification System (AIS) transponders within a 10 km buffer around a site, hourly average wind speed (m/s) calculated from downloaded data from the nearest NOAA buoy, and tidal flow measured as change in height above sea level from the previous hour.
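The band integration and hourly median steps described above can be sketched in Python as follows. This is a minimal illustration and not the project's released code: the function names, the base-2 band edge convention in `octave_edges`, and the flat synthetic spectrum are all assumptions.

```python
import math
import statistics

def band_level_db(psd_db, freqs_hz, f_lo, f_hi, df_hz=1.0):
    """Integrate PSD levels (dB re 1 uPa^2/Hz) over [f_lo, f_hi) and
    return the band sound pressure level (dB re 1 uPa)."""
    # Convert each bin from dB to linear power, multiply by the bin width,
    # and sum the bins that fall inside the band.
    power = sum(10 ** (level / 10.0) * df_hz
                for level, f in zip(psd_db, freqs_hz) if f_lo <= f < f_hi)
    return 10.0 * math.log10(power)

def octave_edges(fc_hz):
    """Octave band edges around a center frequency (assumed base-2 convention)."""
    return fc_hz / math.sqrt(2.0), fc_hz * math.sqrt(2.0)

def hourly_median(levels_1s, min_count=1800):
    """Median of the 1-s levels in one hour, or None when fewer than
    min_count values are available."""
    if len(levels_1s) < min_count:
        return None
    return statistics.median(levels_1s)

# Synthetic flat spectrum: 60 dB re 1 uPa^2/Hz in every 1-Hz bin, 20 Hz to 24 kHz
freqs = list(range(20, 24001))
psd = [60.0] * len(freqs)

f_lo, f_hi = octave_edges(500.0)
ol_500 = band_level_db(psd, freqs, f_lo, f_hi)
```

For a flat spectrum the band level is simply the spectrum level plus ten times the base-10 logarithm of the band width in hertz, which gives a quick sanity check on any implementation.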

Comparing Sound Levels
Sound levels, summarized as broadband (Figure 3A), octave bands (Figure 3B), and spectrum levels (Figure 3C), showed variation across sites and provided initial insight on the unique features of each soundscape. There were clear differences in terms of variation in absolute sound levels and frequency content (Figure 3). Highest BB sound levels were observed in Gray's Reef NMS, lowest levels were recorded at a site in Monterey Bay NMS, and a site in Olympic NMS had the highest variation (Figure 3A). Three sites (FK02, CI01, GR01) had elevated levels with low variation in levels in the higher octave bands (>2 kHz, Figure 3B). Two sites (OC02, SB02) had higher levels in the lower frequencies (<250 Hz, Figure 3B). The site with the lowest BB levels (MB01) had sound levels in the highest frequencies (>8 kHz) that were below the instrument noise floor (Figure 3C).
Combining these sound level measurements with an understanding of the environmental and human-use context and source identification revealed what sources were driving the observed differences. In some cases, higher levels related to biological activity of interest. In other cases, the presence of instrument-related strumming in low frequencies or electronic noise in high frequencies (Figure 3C, noise floor line) interfered with sound level measurements. Using examples from these nine sites, in the next sections, we explored specific ambiguities in interpreting sound level metrics in isolation and the importance of understanding sources and site features to enhance interpretation. Results are summarized in Table 2 with a synthesis statement for each example.

Spectral Features Expose Presence of Multiple Sources
Representing a soundscape over a broad range of frequencies summarized into one sound level metric (broadband) in a defined time-period provides a concise way to compare levels across multiple sites using a single value (Figure 3A). The metric is useful at distinguishing general conditions in a soundscape, especially when comparing across many sites. These broadband sound levels, however, are dominated by low frequency sources and do not account for energy in the higher frequency bands, unless a weighting function is applied, such as A-weighting (ANSI/ASA S1.42, 2020). Further, when sources in all frequency bands are combined into one metric, the contributions of different sources are obscured. At a site in the Hawaiian Islands Humpback Whale National Marine Sanctuary (HI01), broadband levels showed a decrease toward the end of the recording period, suggesting a change in the soundscape (Figure 4, top panel). When the sound levels for different octave bands were displayed, there was a decrease in the 500-Hz octave band, and an increase in the 16,000-Hz octave band (Figure 4, bottom panel). The decrease in sound levels in the 500-Hz octave band corresponded to a reduction in humpback whale song at the end of the breeding season, and the increase in the 16,000-Hz band corresponded to a change in snapping shrimp activity. Analyzing sound levels in octave level frequency bands informed by the presence of known biological sounds can reveal important temporal patterns for different sound sources at a particular site (Table 2, 3.2). These narrow bands, in some cases, will not represent the same sounds across sites, resulting in a clear tradeoff when trying to make broad spatial comparisons of sound levels.
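A short numerical sketch shows why a single broadband value can hide opposing changes in individual bands. The band levels below are hypothetical numbers chosen for illustration, not measurements from HI01, and `combine_levels_db` is an illustrative helper name.

```python
import math

def combine_levels_db(levels_db):
    """Sum band levels on a power basis and return the total in dB."""
    return 10.0 * math.log10(sum(10 ** (level / 10.0) for level in levels_db))

# Hypothetical octave band levels (dB re 1 uPa) early and late in a deployment
early = {500: 110.0, 16000: 90.0}
late = {500: 104.0, 16000: 96.0}

bb_early = combine_levels_db(early.values())
bb_late = combine_levels_db(late.values())

# The combined level follows the louder low-frequency band: it drops by
# roughly 5 dB even though the 16,000-Hz band rose by 6 dB.
bb_change = bb_late - bb_early
```

Because decibel values add on a power basis, the louder band sets the total, so the rise in the quieter high-frequency band is nearly invisible in the combined metric.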

Temporal Patterns Reveal Multiple Sources
In addition to spectral variation, temporal patterns in sound levels can be informative for identifying unique features in a soundscape. Temporal patterns occur as repeated features in sound levels that can relate to time of day or occurrence within a given time-period. At a site in Florida Keys National Marine Sanctuary (FK02), sound pressure levels in the 500-Hz octave band showed clear peaks in acoustic energy with periodic increases of 15-20 dB (Figure 5). Further examination of temporal patterns also showed a second peak in energy, ∼10 dB increase occurring during the same period but slightly out of sync with the larger peaks (Figure 5C). The peaks in acoustic energy were from two distinct fish choruses: the first was from an acoustically unidentified species and the second putatively from midshipman fish (Porichthys sp). These spectral features were prominent with a temporal averaging of one hour. Considerations of temporal resolution are crucial in determining the activity level of many marine sound sources (Table 2, 3.3).
Temporal resolution of several seconds or minutes may represent individual call or sound activity; hourly averages as presented here may be suitable for identifying the presence of multiple fish choruses. Other useful temporal averages are divisions into day- and nighttime with dusk and dawn; daily; full, waning, new, waxing moon; monthly or annual, depending on the soundscape features of interest. In most cases, the temporal resolution is informed by prior knowledge of sources present, either through visual confirmation in long-term spectral averages or non-acoustic confirmation of species presence in the area.
FIGURE 3 | Comparison of sound levels across sites. Hourly band sound pressure levels for the 31-day periods at nine sites are shown as (A) broadband sound level (BB) comparison as box plots with line at median value, box includes 25th-75th percentiles, (B) octave band (OL) percentile sound levels at each site (5, 25, 50, 75, 95th), and (C) narrow band (1 Hz) percentile spectrum levels at MB01 (5, 25, 50, 75, 95th). The blue line shows the noise floor of the instrument, measured in air under quiet conditions with the hydrophone connected and adjusted for the system transfer function. The accuracy of this method is frequency dependent and may result in mechanical noise transmitted through the room and floor at frequencies below 1 kHz. While the line was generated from a single instrument, we are using it to represent all instruments used in this study. Spectrum levels were truncated at 88 Hz for broadband and octave band sound level computation and hence excluded from the comparison due to uncertainty in propagation conditions and instrument sensitivity.
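One way to expose a repeating daily feature such as a chorus is to group hourly band levels by hour of day. The sketch below uses synthetic data with an invented evening increase; the function name, the dates, and the 10 dB threshold are assumptions for illustration only.

```python
import statistics
from collections import defaultdict
from datetime import datetime, timedelta

def median_by_hour_of_day(timestamps, levels_db):
    """Median sound level for each hour of the day (0-23)."""
    groups = defaultdict(list)
    for t, level in zip(timestamps, levels_db):
        groups[t.hour].append(level)
    return {hour: statistics.median(vals) for hour, vals in sorted(groups.items())}

# Synthetic example: 7 days of hourly levels with a 15 dB rise from 20:00 to 22:59
start = datetime(2019, 4, 1)
times = [start + timedelta(hours=h) for h in range(7 * 24)]
levels = [105.0 if 20 <= t.hour <= 22 else 90.0 for t in times]

diel = median_by_hour_of_day(times, levels)
baseline = min(diel.values())
# Hours whose median sits at least 10 dB above the quietest hour
peak_hours = [hour for hour, level in diel.items() if level - baseline >= 10.0]
```

The same grouping could be applied to other divisions named in the text, such as lunar phase or month, by changing the grouping key.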

Biological Chorus Mimics Wind Noise
In the absence of anthropogenic activity, wind at the surface of the ocean has a predictable effect on underwater sound levels: simplified, sound levels increase from surface agitation as wind speed increases (Wenz, 1962). In shallow water environments, this relationship is complicated by the propagation conditions (Ingenito and Wolf, 1989; Jensen et al., 2011), distant vessel noise, and the presence of biological choruses. At a site in the Channel Islands National Marine Sanctuary (CI01), when sound levels were combined with measurements of both wind speed (from a station 12 nautical miles away) and presence of a plainfin midshipman (Porichthys notatus) chorus, it was evident that not just wind speed was driving higher sound levels. When the fish chorus, measured in the 125-Hz octave band, was absent, wind speed dominated sound pressure level measurements (Figure 6). However, when the fish chorus increased in intensity, the highest sound pressure levels occurred when fish were calling, and the sound levels were no longer influenced by wind speed (Figure 6). This pattern indicates a switch in the dominant continuous feature of a soundscape (Table 2, 3.4). Pairing acoustic sensors with marine observation platforms provides a measure of abiotic conditions, which makes it possible to better quantify abiotic contributions to sound levels. When establishing long-term monitoring stations, colocating acoustic sensors with other environmental monitoring efforts provides continued opportunities for contextualizing sound level measurements, and in many cases, monitoring in marine protected areas, like the SanctSound project, will afford these benefits.

Ship Noise Masks Influence of Biological Activity
In coastal regions with access to major commercial ports, noise from passing ships is typically the dominant feature of a soundscape, resulting in elevated sound levels in low frequencies (<1 kHz). At these locations, sound levels (both narrow and broadband) provide an estimate of noise from vessel traffic (Hatch et al., 2008; Haver et al., 2018, 2019; McKenna et al., 2012); however, the ability to measure patterns in biological sounds using sound levels is limited. In Stellwagen Bank National Marine Sanctuary, for example, vessels are a continuous source of noise with commercial ships using the port of Boston, ships docking at nearby Liquefied Natural Gas terminals, and a variety of private and commercial vessels transiting the region (Hatch et al., 2008). At a listening location in this region (SB01), co-occurring with low-frequency vessel noise were a variety of biological sources (e.g., baleen whales and fish). Even when multiple species were present, specifically cod and sei, fin, and humpback whales, sound levels in low frequencies did not significantly increase (Figure 7, left). In contrast, both the number of vessels present (Figure 7, center) and wind speed (Figure 7, right) showed a positive relationship with sound level. To quantify biological activity using sound levels at sites with high levels of shipping noise, additional approaches are necessary to also quantify these less dominant sources present in the soundscape (Table 2, 3.5). One approach, employed in SanctSound, is running multiple automated detection algorithms to determine the presence of biological sources [e.g., Low Frequency Detection and Classification System to detect blue, fin, humpback, sei, and North Atlantic right whales (Baumgartner and Mussoline, 2011) and automatic grunt detector and recognizer for Atlantic cod (Urazghildiiev and Van Parijs, 2016)].
If presence of calls is known, sound levels can then be compared between periods of presence and periods when no calls are present to understand how sound levels differ (Haver et al., 2019). Other approaches minimize the background sounds before extracting complex biological calls (Helble et al., 2012, 2013).
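The presence versus absence comparison can be sketched as a difference of medians. The detector output and levels below are invented for illustration, and `presence_absence_difference` is a hypothetical helper rather than a function from any of the cited detection systems.

```python
import statistics

def presence_absence_difference(levels_db, call_present):
    """Median level during call presence minus median level during absence (dB).

    Returns None when either group is empty."""
    present = [lvl for lvl, flag in zip(levels_db, call_present) if flag]
    absent = [lvl for lvl, flag in zip(levels_db, call_present) if not flag]
    if not present or not absent:
        return None
    return statistics.median(present) - statistics.median(absent)

# Hypothetical hourly low-frequency levels (dB re 1 uPa) and detector output
levels = [101.0, 103.0, 102.0, 100.0, 104.0, 102.5]
calls = [True, False, True, False, True, False]

diff_db = presence_absence_difference(levels, calls)
```

A difference near zero, as in this invented example, would be consistent with the situation described for sites where vessel noise dominates the low frequencies and calling animals do not raise the measured level.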

Artificial Sound Resembles Sound Levels Associated With Anthropogenic Activity
Artificial sound can be introduced from flow around a sensor or strumming of cables in periods of high water movement, specifically during tidal changes. Flow noise over the hydrophone and cable strumming are not natural components experienced by animals in this environment but an artefact of the presence of the recording instrumentation. When this occurs, sound levels in lower frequencies (<1 kHz) will be artificially inflated and in some cases will resemble levels at sites with high shipping traffic.
A comparison of two sites in the Olympic Coast National Marine Sanctuary (OC01, OC02) illustrates this pattern when sound levels are combined with metrics on shipping traffic and tidal flow (Figure 8). Maximum sound levels are similar at both sites in octave bands below 1 kHz; however, the relationship of sound levels to ship operational hours (from AIS data) and tide differs between the sites (Table 3). At the site closer to shore (OC01) and in considerably shallower water (14 m vs. 94 m), tidal changes correlated with octave band sound levels up to the nominal 1-kHz octave band and shipping activity correlated less significantly in the 2- and 4-kHz octave bands (Table 3). At the deeper site (OC02), closer to the shipping lanes, sound levels correlated with shipping activity up to the nominal 1-kHz octave band but with tidal fluctuation only in the lowest 31.5-Hz octave band. Although some animals may experience and respond to sound generated by tidal flow around them, accurately quantifying this sound is challenging because the contribution is dependent on how water flows around the listener. Documenting when tidal noise is present on a sensor and removing these periods from sound level measurements provides more comparable metrics of sounds present in the soundscape, regardless of the listener. To accurately quantify sound levels at sites with high tidal influence (van Geel et al., 2020), periods with minimal tidal flow can be extracted and summarized to represent the soundscape (Table 2, 3.6). While this reduces the amount of data available and may exclude bioacoustic patterns related to tides (Johnston et al., 2005; Staaterman et al., 2014), the resulting sound levels will reflect the actual sound present in the environment.
... (Baumgartner and Mussoline, 2011) and Atlantic cod grunts using automatic grunt detector and recognizer for Atlantic cod (Urazghildiiev and Van Parijs, 2016). The same 31 days are shown in each panel.
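The correlation between band levels and tidal change, and the extraction of periods with minimal tidal flow, can be sketched as below. The text reports a correlation coefficient rho without naming the method here, so the use of Spearman rank correlation is an assumption, as are the use of the absolute hourly height change, the 0.25 m threshold, and all data values.

```python
import math

def ranks(values):
    """Average ranks (1-based); tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result

def spearman_rho(x, y):
    """Spearman rank correlation, computed as the Pearson correlation of ranks."""
    rx, ry = ranks(x), ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)

def low_flow_levels(levels_db, tide_change_m, max_change_m):
    """Keep only levels from hours with small absolute tidal height change."""
    return [lvl for lvl, dh in zip(levels_db, tide_change_m)
            if abs(dh) <= max_change_m]

# Hypothetical hourly tidal height change (m) and low-frequency band level (dB)
tide_change = [0.1, 0.8, -0.9, 0.2, 1.1, -0.05]
levels = [95.0, 108.0, 110.0, 97.0, 113.0, 94.0]

rho = spearman_rho([abs(dh) for dh in tide_change], levels)
quiet_hours = low_flow_levels(levels, tide_change, max_change_m=0.25)
```

Summarizing only the retained hours gives levels less affected by flow noise, at the cost of fewer data and the possible loss of tide-related biological patterns, as noted above.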

High Sound Levels Do Not Always Correlate With Human Activity
Biological activity can result in high sound levels, similar to those at sites with high levels of human activity. At sites with minimal human activity, sound levels represent the biological community, especially when measured under similar abiotic conditions. At a site in Gray's Reef National Marine Sanctuary (GR01), broadband sound levels are higher than those found at a site near the busy port of Boston (SB01, Figure 3). At the site in Gray's Reef, biological sound sources dominate both the low frequency bands (two species of fish) and the high frequency bands (snapping shrimp) (Figure 9). When making comparisons of sound levels across a variety of sites, first separating sites based on the amount of nearby human activity can provide important context for interpreting sound levels (Table 2, 3.7). In these relatively shallow water sites, measuring sound levels during periods of little to no human activity can also provide comparisons of natural variability in sound levels across sites.

Spatial Proximity Not Indicative of Similarity in Sound Levels
Soundscapes vary over small spatial scales due to variation in the occurrence and proximity of human use patterns as well as in sound propagation conditions and biological activity. These variations are typically reflected in sound level differences, especially when physical dynamics (wind and tidal change) are similar at nearby sites. Within Monterey Bay National Marine Sanctuary, vessel activity is more concentrated in certain areas, and sound levels at two sites less than 30 km apart reflect the difference in vessel activity (Figure 10). Sound pressure levels in the 500-Hz octave band are 5-10 dB higher at the site with higher small vessel activity, although these differences are reduced when wind is high (e.g., Figure 10, March 6). To capture the variation in sound levels within a region, spatial sampling needs to account for human use patterns and sound propagation conditions and not simply distance between sites (Table 2, 3.8). Features that can be used to inform spatial sampling include distance to designated shipping lanes and fishing grounds, bathymetric features, ocean stratification influencing sound propagation, and management area type.

[Table note: A higher correlation coefficient (rho) indicates a stronger correlation between variables; p indicates statistical significance (***p < 0.001; **p < 0.01; *p < 0.05; NS, not significant).]

INTEGRATED APPROACHES TO UNDERSTANDING SOUNDSCAPES
Distinguishing soundscape features of interest using spectral and temporal characteristics of sound levels with source identification and site features enhances comparisons and avoids ambiguous or in some cases erroneous interpretations (Table 2). Because soundscapes offer a unique window into an ecosystem with a view of biotic, abiotic, and anthropogenic features, there is growing interest in distinguishing these features across broad spatial and temporal scales. Building on the examples provided for characterizing diverse soundscapes, we offer a framework for how to approach these efforts at the scale of the complete SanctSound data set by identifying and summarizing existing approaches in the literature. The overall analytical goal is to separate out coarse patterns in the main soundscape components to aid in the interpretation of sound level products and identify transitions or shifts in dominant soundscape features over space and time. The intended applications for protected area management include establishing baseline conditions, detecting change in soundscapes from environmental (Gottesman et al., 2021) or societal changes (Derryberry et al., 2020), supporting comparison of conditions inside vs. outside protected areas (Gottesman et al., 2020), providing complementary information for other resource condition monitoring, as well as creating engaging content for a host of outreach and educational purposes.

Estimating Residual Soundscape Conditions
Characterization of residual sound levels yields information about the most common and persistent acoustic conditions, integrated across the largest spatial scales permitted by sound attenuation. Residual sound in the soundscape context refers to the sound remaining at a given position when all identifiable sounds under consideration are eliminated from sound level calculations (ANSI/ASA S3/SC1.100, 2014; ISO 1996-1, 2016). Residual sound levels include all the innumerable, indistinguishable sound producers present in a soundscape. Residual sound constrains the detection and perception of identifiable sources and offers cues about the environment. These considerations indicate that characterization of residual sound (also referred to as ambient sound) is crucial for ocean soundscape management. Residual sound levels are often reported as ambient noise in marine environments, and in many cases rely upon visual and aural review by analysts to identify time periods in a recording uninfluenced by transient sounds (e.g., McDonald et al., 2006). In other studies, ambient noise levels summarize existing sound levels that include all sources present (e.g., Chapman and Price, 2011; Haxel et al., 2013; Kaplan et al., 2015; Haver et al., 2018) and percentile statistics are used to express the gradient from chronic to rare. As demonstrated in the previous sections (Table 2), differing conditions can result in similar sound level statistics, and percentile summaries do not resolve this problem. Automated parsing of residual sound level patterns from transient sound events, such that both can be analyzed with minimal contamination from the other, will be an important advance for soundscape analyses. When successfully isolated, the residual components will be amenable to low-rank decompositions and the transient components will present more distinct signatures for feature extraction techniques.
One approach to parsing residual sound and transient sounds in soundscapes is the use of source separation methods (Figure 11). Source separation methods isolate specific transient signals in order to quantify the contributions of these signals to an acoustic recording, with demonstrated application in soundscape biodiversity assessments (Lin and Tsao, 2020). Each resulting waveform is intended to represent a single source for feature extraction or classification. These same methods can also be applied to remove transient signals from the residual waveform. "Denoising" procedures represent an alternative approach to clarifying distinctions among sounds (e.g., Helble et al., 2012; Abeßer, 2020).
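As a rough illustration of what automated parsing of residual and transient sound could look like at the level of a sound level time series, the sketch below uses a running median as the residual estimate and flags samples that rise well above it. This is a deliberately simple stand-in, not one of the source separation methods cited above and not a method used in this study; the 61-sample window, the 6 dB threshold, the synthetic data, and the function name `split_residual_transient` are assumptions chosen for the example.

```python
import numpy as np


def split_residual_transient(spl_db, window=61, threshold_db=6.0):
    """Estimate residual level with a running median and flag transient samples.

    window must be odd; samples more than threshold_db above the running
    median are flagged as transient.
    """
    if window % 2 == 0:
        raise ValueError("window must be odd")
    spl_db = np.asarray(spl_db, dtype=float)
    half = window // 2
    # Repeat the edge values so every sample has a full window around it.
    padded = np.pad(spl_db, half, mode="edge")
    windows = np.lib.stride_tricks.sliding_window_view(padded, window)
    residual = np.median(windows, axis=1)
    transient = (spl_db - residual) > threshold_db
    return residual, transient


# Synthetic day of 1-minute levels (hypothetical values).
rng = np.random.default_rng(1)
minutes = np.arange(1440)
background = 90 + 3 * np.sin(2 * np.pi * minutes / 1440)
spl_db = background + rng.normal(0, 0.5, minutes.size)
spl_db[600:620] += 15  # a 20-minute vessel passage

residual, transient = split_residual_transient(spl_db)
print("flagged minutes:", int(transient.sum()))
print("median of all samples: %.1f dB" % np.median(spl_db))
print("median of unflagged samples: %.1f dB" % np.median(spl_db[~transient]))
```

A running median only tolerates events shorter than about half the window, so a chronic source such as a nearby shipping lane would be absorbed into the residual estimate. That limitation is one reason the spectral and source-based methods discussed in this section are needed.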

Analytical Approaches to Soundscape Analyses
An analog to automated source separation methods is the ability of human analysts to subjectively differentiate acoustic scenes and classify them by listening to the sound components (e.g., this is 'a shallow water reef environment' or 'a busy shipping lane') or, in the case of trained acoustic analysts, by inspecting a long-term spectral average. The acoustic scene is a higher-level summary of both components and differs from the detection of singular acoustic events, such as a distinct fish call of a certain species or a container ship passage in the scenes above. However, the sum of singular acoustic events as well as the residual sound informs the classification of the scene.
For most automated classification scenarios, for acoustic events or acoustic scenes, the acoustic waveforms will need to be reduced in their complexity for a feature extraction step to be applied (Figure 11; e.g., Barchiesi et al., 2015). For this transformation, commonly used methods are Fourier transforms, cepstral and wavelet analyses with subsequent modulations of these methods (e.g., Abeßer, 2020). Redundancy in high-dimensional features may need to be addressed as this may convey unintended weighting or other biases to subsequent machine learning or statistical models.

FIGURE 10 | Divergence in sound pressure level at nearby sites. Top panel shows comparison of sound pressure levels (500-Hz octave band) at nearby sites in Monterey Bay National Marine Sanctuary (MB01 and MB02). The sites have different levels of vessel presence (middle panel), measured from all AIS-transmitting vessels within 10 km of each acoustic sensor; the majority of vessels transmitting AIS are in the small category (<20 m). Bottom panel shows regional hourly wind speed for context.
Unsupervised feature learning techniques have seen increasing use in recent years, particularly where acoustic data volumes are expansive (Serizel et al., 2018). In many cases, features are extracted from time-frequency matrices so that they reflect the underlying data structure and generalize to a high-level representation. Principal component analysis (PCA), independent component analysis (ICA), singular value decomposition (SVD), and non-negative matrix factorization are examples of these data transformations for unsupervised feature learning that have been applied in marine acoustic event detection (Figure 11B; Sattar et al., 2016; Lin et al., 2017a; Lin and Tsao, 2020; Butler et al., 2021).
The benefit of these unsupervised methodologies is that, in theory, both transient acoustic events and residual sound of marine soundscapes can be separated and characterized without prior knowledge of underlying sources. However, in practice, some of the caveats described in previous examples may still apply. Specifically, decisions on feature reduction are often not justified, even though choices such as the time and frequency binning applied to reduce data complexity are crucial (e.g., Figure 5). A study showing how these decisions impact feature learning and detectability of certain sources would be relevant. The performance of classifiers can be limited by the quality of these data representations (Serizel et al., 2018). Some acoustic scenes, such as crepuscular fish chorusing or nighttime delphinid foraging, have highly predictable patterns for which periodicity-focused algorithms are highly successful (e.g., Lin et al., 2017a). It is not yet clear how stable these feature learning algorithms are over the course of, for example, multi-year recordings when patterns seasonally disappear (e.g., Figure 4), switch from one dominant source to another with similar time and frequency components (e.g., Figure 6), or when a multitude of variable sources overlap each other (e.g., Figure 7). Assumptions and prior information can greatly increase the power of statistical models and machine learning, but they shape outcomes and increase the risk of confirmation bias.
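To make the idea of a low-rank decomposition of a time-frequency matrix concrete, the sketch below applies a truncated SVD, one of the transformations named above, to a synthetic linear-scale spectrogram. It is an illustration under stated assumptions rather than a reproduction of any cited method: the matrix size, the spectral shape, the injected event, the rank of 1, and the flagging rule (frame energy above 10 times the median) are all hypothetical choices. The rank-1 part stands in for the persistent residual component, and the remainder is where a short transient shows up.

```python
import numpy as np


def low_rank_split(spec, rank=1):
    """Split a frequency-by-time matrix into a rank-`rank` part and a remainder."""
    u, s, vt = np.linalg.svd(spec, full_matrices=False)
    # Keep only the leading singular triplets to form the low-rank approximation.
    low_rank = (u[:, :rank] * s[:rank]) @ vt[:rank]
    return low_rank, spec - low_rank


# Synthetic linear-scale spectrogram: 64 frequency bins x 200 time frames.
rng = np.random.default_rng(0)
n_freq, n_time = 64, 200
spectrum = np.exp(-np.linspace(0, 3, n_freq))                    # persistent spectral shape
gain = 1 + 0.2 * np.sin(2 * np.pi * np.arange(n_time) / n_time)  # slow level changes
spec = np.outer(spectrum, gain) + 0.01 * rng.random((n_freq, n_time))
spec[30:40, 100:110] += 0.5                                       # short band-limited event

low_rank, remainder = low_rank_split(spec, rank=1)
frame_energy = (remainder ** 2).sum(axis=0)
flagged = np.flatnonzero(frame_energy > 10 * np.median(frame_energy))
print("frames flagged as transient:", flagged.tolist())
```

The example works because the persistent component here is exactly separable into one spectral shape and one time course. Real soundscapes with several overlapping, time-varying sources would need a higher rank or a different factorization, and the caveats about binning choices and long-term stability raised above apply in full.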

Considerations for Comparisons at Large Spatial and Temporal Scales
While numerous large-scale acoustic monitoring efforts exist, methods for comparing characteristics of entire soundscapes are not as well developed (Mooney et al., 2020). When building analytical approaches to separate sounds and contextualize soundscapes (Figure 11), some of the analytical concerns are shared with other large-scale monitoring efforts. Templates or methods tuned to one data set may not generalize adequately to others. Recalibration may be necessary to tolerate relevant variation in identified sound sources or to account for new variation in soundscape conditions. For example, blue whale call detectors were created to compare signals across time and ocean basins (e.g., Mellinger and Clark, 1997; Širović et al., 2015). When these methods were applied to new data, they had to be recalibrated to account for changes in the peak frequency of blue whale calls (Širović, 2016).
To distinguish marine soundscapes as acoustic scenes, the methods used need to encompass the variation at relevant scales, using data that embody the variability that can be expected (low and high wind speeds, shallow and deep waters, different latitudes, and various kinds of bottom substrates). Subsequent use of these automated methods should include periodic expert evaluations of performance, so that deteriorations in performance are noticed and addressed (Kowarski and Moors-Murphy, 2020; Roch et al., 2021). The SanctSound project is in a unique position to develop and test methods of acoustic scene assessment and classification. It has generated a dataset representing a range of variability in soundscapes, with over three years of data at 30 sites in a diversity of shallow water habitats. It also benefits from ancillary non-acoustic datasets (e.g., glider biological surveys, human activity monitoring), many of which offer high levels of detail and resolution.

Soundscape Syntheses
Using soundscapes as an indicator of the quality or condition of the environment that is comparable across sites requires integration of the various metrics and approaches discussed in previous sections. We contend that soundscape metrics must colligate multiple dimensions of acoustic environments and identify the focus of the soundscape interpretation (Figure 1), given the coupled nature of the physical conditions, biological activity, and human presence. When the focus is on analyzing noise from human activity, natural sources of sound must be distinguished and segregated. For example, both terrestrial and marine soundscape assessments have applied noise exceedance metrics (Buxton et al., 2017; Merchant N. et al., 2018; Borgir, 2021) to quantify the influence of noise on sound levels or, in other words, how far noise elevated sound levels above natural ambient. In other cases, the interest is in tracking biodiversity, especially in changing environments. Given that many marine species produce sounds, either intentionally or unintentionally, bioacoustics offers a promising method to understand biological conditions (Mooney et al., 2020). Yet real challenges remain for devising algorithms and metrics that reliably extract equivalent information and present accessible ecological interpretations. Ocean soundscapes also offer a view into the physical environment as it changes rapidly with climate, from measuring retreating ice coverage (Haver et al., 2017) to changes in wind patterns (Shajahan et al., 2020). As metrics and approaches improve for estimating residual sound levels, source separation, and predictive models, integrated suites of methods will emerge to amplify the value of soundscape analysis for exploring scenarios and supporting resource management.
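The noise exceedance idea mentioned above can be illustrated with a small worked example. The sketch assumes one common reading of the term, namely the decibel difference between a measured level and an estimate of the natural ambient level, together with the share of time that difference is above a chosen threshold. The cited studies may define and compute the metric differently, and the levels, the 3 dB threshold, and the function names below are hypothetical.

```python
import numpy as np


def noise_exceedance_db(measured_db, natural_db):
    """Decibel difference between measured levels and a natural reference level."""
    return np.asarray(measured_db, dtype=float) - np.asarray(natural_db, dtype=float)


def fraction_of_time_exceeding(exceedance_db, threshold_db=3.0):
    """Share of samples whose exceedance is above threshold_db."""
    return float(np.mean(np.asarray(exceedance_db) > threshold_db))


# Hypothetical hourly band levels and a natural reference level, in dB.
measured_db = [92.0, 95.0, 101.0, 90.0, 98.0]
natural_db = 90.0

exceedance = noise_exceedance_db(measured_db, natural_db)
share = fraction_of_time_exceeding(exceedance, threshold_db=3.0)
print(exceedance.tolist())  # [2.0, 5.0, 11.0, 0.0, 8.0]
print(share)                # 0.6
```

The result depends entirely on how the natural reference level is obtained, which is why the residual sound estimation and source separation steps discussed earlier matter for this kind of metric.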

CONCLUSION
Soundscapes capture the collection of sounds present at a given location, which reveal multiple dimensions of an ecosystem (biological, human, and physical) and present invaluable opportunities for understanding complex and changing ecosystems. Yet parsing and analyzing the components of a soundscape, while necessary for interpretation and comparability, is still an emerging field of research. Interpreting sound level measurements with knowledge of sound source properties and non-acoustic data provided motivating examples of comparing soundscapes across US National Marine Sanctuaries. For example, analyzing sound levels in specific frequency bands informed by known biological sounds exposed important temporal patterns for different sound sources at a particular site. Identifying when sound levels deviated from expected relationships with surface wind speeds indicated when other sources were present or when a switch in the dominant continuous feature of a soundscape occurred. Separating sites based on levels of nearby human activity provided important context for interpreting sound levels. Applying and evaluating automated methods to characterize soundscapes as acoustic scenes, with the separation of residual sound from transient sounds, offers an advancement in using soundscape analyses for protected area management and in defining and tracking marine soundscapes as essential ocean variables (e.g., Miksis-Olds et al., 2018). These approaches aim to first separate the main soundscape components, to then aid in the interpretation of sound level products and identify shifts in dominant soundscape features over space and time. Advancing these methodologies will fulfill the promise of passive acoustics to provide a rich source of autonomous information for monitoring environmental health and realizing sustainable societies.

AUTHOR CONTRIBUTIONS
MM, SB-P, SP, KF, and LH contributed to conception and design of the study. JJ, TM, ACMK, LP, AS, SB-P, AK, BS, EZ, ML, JB, EK, JS, and TR facilitated acoustic data collection, evaluated data quality, and performed initial data analysis. CW, TM, ACMK, SB-P, JJ, and ML organized the data and databases. SP, LH, CW, LP, BS, and JG provided overall project management and guidance. MM, SB-P, ACMK, and WO performed the soundscape analysis. JA performed analysis of non-acoustic data. MM wrote the first draft of the manuscript. SB-P, ACMK, and WO wrote sections of the manuscript. All authors contributed to manuscript revision, read, and approved the submitted version.