Sounding the Call for a Global Library of Underwater Biological Sounds

Aquatic environments encompass the world’s most extensive habitats, rich with sounds produced by a diversity of animals. Passive acoustic monitoring (PAM) is an increasingly accessible remote sensing technology that uses hydrophones to listen to the underwater world and represents an unprecedented, non-invasive method to monitor underwater environments. This information can assist in the delineation of biologically important areas via detection of sound-producing species or characterization of ecosystem type and condition, inferred from the acoustic properties of the local soundscape. At a time when worldwide biodiversity is in significant decline and underwater soundscapes are being altered as a result of anthropogenic impacts, there is a need to document, quantify, and understand biotic sound sources–potentially before they disappear. A significant step toward these goals is the development of a web-based, open-access platform that provides: (1) a reference library of known and unknown biological sound sources (by integrating and expanding existing libraries around the world); (2) a data repository portal for annotated and unannotated audio recordings of single sources and of soundscapes; (3) a training platform for artificial intelligence algorithms for signal detection and classification; and (4) a citizen science-based application for public users. Although individually, these resources are often met on regional and taxa-specific scales, many are not sustained and, collectively, an enduring global database with an integrated platform has not been realized. 
We discuss the benefits such a program can provide, previous calls for global data-sharing and reference libraries, and the challenges that need to be overcome to bring together bio- and ecoacousticians, bioinformaticians, propagation experts, web engineers, and signal processing specialists (e.g., artificial intelligence) with the necessary support and funding to build a sustainable and scalable platform that could address the needs of all contributors and stakeholders into the future.

Keywords: soundscape, bioacoustics database, artificial intelligence, biodiversity, passive acoustic monitoring, ecological informatics

BACKGROUND

Aquatic (i.e., marine, brackish, and freshwater) environments encompass the world's most extensive habitats, rich with sounds produced by a diverse range of animals. Advances in data acquisition, storage and processing that enable increased recording durations at reduced costs, and easier logistics of sensor deployment and retrieval, have made passive acoustic monitoring (PAM) a more accessible and feasible tool than ever before (Lindseth and Lobel, 2018; Chapuis et al., 2021; Wall et al., 2021). Combined with an increasing appreciation of the ecological importance of acoustic cues to almost all aquatic fauna, these advances have expanded the field of underwater bioacoustic and ecoacoustic research to increasing numbers of researchers and organizations (Lindseth and Lobel, 2018). The result has been an almost exponential increase in the volume of aquatic PAM data being collected around the world, in conjunction with increases in soundscape research (Lindseth and Lobel, 2018; Mooney et al., 2020; Duarte et al., 2021). Researchers now routinely collect substantially more PAM data, on an increasing number of taxa, and in more locations than ever before, from freshwater to marine, from shallow waters to the deep, and from tropical to polar regions. Higher sampling frequencies and longer deployment durations mean that datasets may now easily exceed terabytes in size and years in duration, potentially containing millions of sounds and hundreds of different types (Waddell et al., 2021; Wall et al., 2021). This makes manual classification of underwater sounds by experts, the traditional method of verifying call presence and source identification, increasingly difficult (Mooney et al., 2020; Waddell et al., 2021).
With an increase in the use of PAM, there is increasing awareness of the impacts the acoustic environment (i.e., frequency-dependent propagation loss) can have on characteristics of recorded sound. For example, the same humpback whale song may produce different received spectra in two spatially separated recordings, depending on propagation conditions, and appear as two different types of calls. Sound production mechanisms (e.g., directionality of the source signal) can also influence recorded sound characteristics, such as the azimuth-dependent received spectra of some odontocete calls (Lammers and Au, 2003). Together with the impact of signal-to-noise ratio (SNR) on the clarity of a sound sample, these factors all affect the reproducibility of signals, which needs to be considered when assessing the sound samples provided to, and by, a sound library.
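As a rough illustration of frequency-dependent propagation loss, the sketch below combines spherical spreading with Thorp's empirical absorption formula for seawater. Both are simplifying assumptions that are not taken from this paper: real propagation, particularly in shallow water, also depends on bathymetry, seabed properties, and the sound-speed profile. The function names are hypothetical.

```python
import math

def thorp_absorption_db_per_km(freq_khz: float) -> float:
    """Thorp's empirical seawater absorption (dB/km); frequency in kHz."""
    f2 = freq_khz ** 2
    return 0.11 * f2 / (1 + f2) + 44 * f2 / (4100 + f2) + 2.75e-4 * f2 + 0.003

def transmission_loss_db(range_m: float, freq_khz: float) -> float:
    """Spherical spreading (re 1 m) plus absorption; an idealized assumption."""
    spreading = 20 * math.log10(range_m)
    absorption = thorp_absorption_db_per_km(freq_khz) * range_m / 1000
    return spreading + absorption

# Compare how much extra loss each frequency suffers between 500 m and 5 km.
for f in (0.1, 1.0, 20.0):
    near = transmission_loss_db(500, f)
    far = transmission_loss_db(5000, f)
    print(f"{f:5.1f} kHz: extra loss from 500 m to 5 km = {far - near:.1f} dB")
```

Under these assumptions the spreading term is identical at every frequency, so any difference between the printed values comes from absorption alone, which is one reason the high-frequency part of a call can fade first with range.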
Underpinning much of this work is the ability to identify or characterize sound sources either to assess them individually or understand their contribution to the overall soundscape (Mooney et al., 2020; McKenna et al., 2021). We are beginning to understand how these biological sounds, together with anthropogenic and geophysical sounds that make up the local soundscape (Schafer, 1969, 1977; Southworth, 1969; Krause, 2008; Hildebrand, 2009), can collectively provide information on physical habitats, biodiversity, and aquatic ecosystem health (Mooney et al., 2020). PAM is proving to be one of the most effective ways to monitor visually elusive but vocal species in aquatic environments, which can potentially aid in more effective conservation management, such as spatio-temporal zoning measures found in marine park areas or fishery closures (Coquereau et al., 2017; Nikolich et al., 2021). At a time when global biodiversity is in significant decline (Sala and Knowlton, 2006; Worm et al., 2006; Marques, 2020) and increasingly impacted by climate change (e.g., Poloczanska et al., 2013; Sydeman et al., 2015), there is a need to document and understand as many sound sources in the ocean as possible, potentially before they disappear.
There are 126 marine mammal species, approximately 35,000 known species of fish, and nearly 250,000 documented species of marine invertebrates in the world (Froese and Pauly, 2021;World Register of Marine Species, 2021), and the number of known soniferous (actively sound-producing) species underwater is consistently increasing (see Figure 1 for spectrograms of example sounds). There are even a handful of reports of sound produced by birds underwater (e.g., Thiebault et al., 2019). It is thought that all aquatic mammal species exhibit soniferous behavior underwater and reports have so far confirmed this trait for almost all of them (e.g., Mellinger and Clark, 2006;Richardson et al., 2013). Calls of many marine mammal species are often distinctive and can even show significant variability among individuals (e.g., Janik and Sayigh, 2013;McCordic et al., 2016;Bailey et al., 2021). Additionally, as comparatively large and charismatic species, mammals can often be verified as the source of a sound with nearby surface sightings or from studies of animals in human care (e.g., Rogers et al., 1996).
Overall, validated sounds have been attributed to a much lower proportion of species from the speciose groups of aquatic invertebrates and fishes than for marine mammals. Whereas almost all marine mammals are confirmed to produce sounds underwater, this behavior has been validated for fewer than 100 species of aquatic invertebrates (e.g., Popper et al., 2001; Coquereau et al., 2016) and approximately 1,000 fish species (Kaatz, 2002; Parmentier et al., 2017; Bolgan et al., 2020a; Looby et al., 2021; Rice et al., 2022); however, the former includes members of Alpheidae, the "snapping shrimp" family with over 500 species, and the latter represents over two-thirds of fish families, implying many more species are soniferous (Parmentier et al., 2021). Fishes and invertebrates are typically more difficult to validate in the field than mammals (e.g., Sprague and Luczkovich, 2001; Riera et al., 2017), though the source is on occasion confirmed visually (e.g., Lobel, 1992, 1996, 1998, 2001; Allen and Demer, 2003; Lobel et al., 2010; Parsons et al., 2013a), inferred by weight of evidence from the species present at the time of recording and their behavior (e.g., Tricas and Boyle, 2014; Pyć et al., 2021), or identified by localization (e.g., Parsons et al., 2009; Mouy et al., 2018). Sound travels much farther than light underwater and propagates efficiently through waters where visual source validation is often impossible more than a few meters from an observer or camera, or more than a few centimeters in turbid environments (Harvey et al., 2004; Jones et al., 2019). Visual validation is particularly problematic when the source in question is "small and cryptic," found within an assemblage of several species, at great depth, or within complex habitat. Moreover, many fish and invertebrate species are predominantly nocturnal, rendering simultaneous visual and audio observations arduous or impossible (e.g., Spence, 2017).
Thus, while some sources have been confirmed, the majority of fish and invertebrate sounds and choruses remain anonymous, uncharacterized and largely unreported, as they do not comprise sounds of a project's target species. Recordings taken under controlled conditions (e.g., within tanks or aquaria) can provide confirmation of species' sound production (e.g., Sprague and Luczkovich, 2001) as well as other information on sound-producing behavior that could be challenging to collect in the field (e.g., Montie et al., 2017; Riera et al., 2018); however, assessment of the acoustic characteristics and behavioral context of these sounds requires additional consideration. The material, dimensions and background noise within a constrained environment, for example, affect the received signal (e.g., Akamatsu et al., 2002). Additionally, soniferous behavior may be affected by captivity, such as the acclimation time, surroundings and number of other individuals within the environment, among other factors (e.g., Holt and Johnston, 2014). The nature and extent of the effects that captivity has on recorded sounds and overall acoustic behavior may vary between species and potentially even individuals (e.g., Bolgan et al., 2020b,c).
Although substantial work has been conducted on freshwater species, predominantly on fishes and initially in aquaria (Gerald, 1971; Desjonquères et al., 2020; Grabowski et al., 2020; Linke et al., 2020; Roca et al., 2020; Higgs and Beach, 2021), the majority of efforts to record aquatic biological sounds have historically focused on the marine environment (Greenhalgh et al., 2020). Freshwater recordings present a variety of complexities that are less common in the marine environment, such as terrestrial, aerial and water-surface sounds from birds, insects, and road or air traffic (Erbe et al., 2018; Linke et al., 2020; Leon-Lopez et al., 2021).
In addition to the difficulties in identifying soniferous species, there is also potential variability in sound types and characteristics of sound production for a given species (e.g., McIver et al., 2014;Parsons and McCauley, 2017;Bolgan et al., 2020c). There are very few species where the entire suite of calls has been captured and even at a single location, full repertoires are rarely confirmed or reported. Further, numerous taxa are cosmopolitan, either as wide-roaming individuals, such as the great whales, or as broadly distributed species, such as many fishes. Some of these global (and regional) travelers exhibit dialects, or completely different signal structures among regions, several of which evolve over time (e.g., Parmentier et al., 2005;Garland et al., 2011; Figure 1).
Alongside active sound production for the purported purpose of communication, many aquatic species produce "passive sounds" as a by-product of other life-functions, such as eating, swimming, and crawling (e.g., Fish, 1948; Moulton, 1958, 1960, 1963, 1964; Uno and Konagaya, 1960; Mallekh et al., 2003; Radford et al., 2008; Rountree et al., 2018; Ajemian et al., 2021; Tricas and Boyle, 2021; Figure 1). These passive sounds may be less acoustically complex or distinct than active sounds; however, they still provide important contributions to the soundscape and have demonstrated ecological signal potential in select circumstances (Banner, 1972; Connor et al., 2000; Tricas and Boyle, 2014; Rountree et al., 2018). Thus, while collating global records of known sound production may be feasible (e.g., for fishes; Looby et al., 2021), the variation in sound within and among species and individuals means that collecting and maintaining representative sounds for every species is a continuous and laborious process. Further, even when unidentifiable biological sounds are described in detail, there remains no global system with which to attempt to characterize or identify them (Anderson et al., 2008). Although some of these sources are obvious, pervasive, and readily observed in long-term recordings, many more are rare and of lower amplitude, often going undetected unless the observer is specifically searching for them (Mooney et al., 2020). Many studies are conducted with single- or limited-species objectives, although these recordings are often filled with a great diversity of sounds. Collectively, there are now millions of hours of recordings around the world that could potentially be assessed for a plethora of both known and, to date, unidentified biological sounds. Only recently have studies begun to address the groupings of such sounds, in the field of acoustic community ecology (Desiderà et al., 2019; Bolgan et al., 2020a; Di Iorio et al., 2021).

FIGURE 1 | Example spectrograms (1,024-point Hanning window, 0.9 overlap, frequency display 50-20,000 Hz, relative received levels) of: two simultaneous recordings of a humpback whale (Megaptera novaeangliae) song in (A) 20 m and (B) 40 m depth waters off Okinawa, Japan (recording locations separated by ≈500 m; note the lack of high-frequency energy in B); (C) a complex call and (D) a single grunt sound from gulf toadfish (Opsanus beta); (E) two sounds from a 20-30 cm-long sooty grunter (Hephaestus fuliginosus); (F) one sound from a 7 cm-long spangled grunter (Leiopotherapon unicolor); (G) 4 s of sounds made by a crawling kina urchin (Evechinus chloroticus); and (H) 4 s of sounds produced by a New Zealand paddle crab (Ovalipes catharus). Power spectral density axes in each spectrogram are relative and span 50 dB re 1 µPa²/Hz. Spectrograms are for comparative purposes and, as such, recording conditions and methods are not provided. All recordings were sampled at 44.1 ksps except that producing panel (E), which was sampled at 48 ksps.

Frontiers in Ecology and Evolution | www.frontiersin.org
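The spectrogram settings quoted in the Figure 1 caption (1,024-point Hanning window, 0.9 overlap, 50-20,000 Hz display, relative levels) can be reproduced approximately with a short-time Fourier transform. The following is a minimal sketch using NumPy, not the authors' processing chain; the function name, the synthetic 1 kHz test tone, and the choice to normalize to the strongest bin are all assumptions made for illustration.

```python
import numpy as np

def spectrogram_db(x, fs, nfft=1024, overlap=0.9, fmin=50.0, fmax=20_000.0):
    """Relative power spectrogram in dB; returns (freqs, times, s_db)."""
    hop = max(1, int(round(nfft * (1.0 - overlap))))   # samples between frames
    window = np.hanning(nfft)
    n_frames = 1 + (len(x) - nfft) // hop
    frames = np.stack([x[i * hop : i * hop + nfft] * window for i in range(n_frames)])
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    freqs = np.fft.rfftfreq(nfft, d=1.0 / fs)
    times = (np.arange(n_frames) * hop + nfft / 2) / fs  # frame centers (s)
    band = (freqs >= fmin) & (freqs <= fmax)             # displayed band only
    s_db = 10 * np.log10(power[:, band].T + 1e-20)       # rows = frequency
    return freqs[band], times, s_db - s_db.max()         # 0 dB = strongest bin

fs = 44_100
t = np.arange(fs) / fs               # 1 s synthetic signal, not real data
x = np.sin(2 * np.pi * 1000 * t)     # 1 kHz tone
freqs, times, s_db = spectrogram_db(x, fs)
peak_hz = freqs[s_db.mean(axis=1).argmax()]
print(f"{len(freqs)} frequency bins x {len(times)} frames, peak near {peak_hz:.0f} Hz")
```

Because the levels are relative, spectrograms produced this way are suitable for comparing call structure between samples but not for comparing absolute received levels.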

SOUND LIBRARIES
The provision of audio samples is an important activity as it is often difficult for a researcher to confirm that a sound they have recorded is the same as one that has been previously identified, based on a description in a journal or website. This is particularly true if the two were recorded under different environmental conditions. A library provides firsthand examples for comparison, preferably with a spectrogram that has clear annotations describing the specific time and frequency range of the target signals, along with sufficient metadata to facilitate comparison between user and library samples, to maximize the use of the library. The audio-visual combination provides the user with a good understanding of the call type (under the recorded conditions). This combination can be particularly important for high-biodiversity systems such as coral reefs, where even a short recording can pick up multiple animal sounds.
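One way to picture a library entry of the kind described above is as an audio file with time-frequency annotations and accompanying metadata. The sketch below is a hypothetical data structure: every class and field name is an assumption made for illustration, and no existing library is known to use this schema.

```python
from dataclasses import dataclass, field

@dataclass
class Annotation:
    """Time-frequency box around one target signal in a recording."""
    start_s: float
    end_s: float
    low_hz: float
    high_hz: float
    label: str = "unknown"   # species or sound type, if identified

    def overlaps(self, other: "Annotation") -> bool:
        """True when the two boxes intersect in both time and frequency."""
        return (self.start_s < other.end_s and other.start_s < self.end_s
                and self.low_hz < other.high_hz and other.low_hz < self.high_hz)

@dataclass
class LibraryEntry:
    """One audio sample plus metadata needed to compare it with user data."""
    audio_file: str
    sample_rate_hz: int
    location: str
    depth_m: float
    annotations: list[Annotation] = field(default_factory=list)

# Illustrative values only; not drawn from any real recording.
entry = LibraryEntry("example.wav", 44_100, "hypothetical reef site", 20.0)
entry.annotations.append(Annotation(1.2, 1.8, 100.0, 800.0, "fish grunt"))
candidate = Annotation(1.5, 2.0, 300.0, 1_200.0)
print(candidate.overlaps(entry.annotations[0]))
```

Storing the annotation box alongside the recording conditions lets a user check whether a candidate signal occupies a similar time-frequency region before listening to the sample.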
Several independent libraries of biological sounds, many of which either contain aquatic examples, or have an underwater focus, have been established around the world (see Table 1 for selected examples). Existing libraries often focus on species of interest that are targeted by the host institute's researchers and are often recorded from a particular phylum or more restricted taxon, with a smaller selection of opportunistically recorded species. A few libraries describe many known sound sources from a region as the basis for an article describing reported species distribution in the region, including standardized characteristics of each sound type for the species, with a link to a website where the sounds can be downloaded (e.g., Erbe et al., 2017). Other libraries are national and may be incrementally expanded by contributions of a handful of researchers with associated papers outlining sounds as they are observed (Table 1). The FishSounds website project, for example, began with a systematized, global review of fish species examined for sound production (with or without documented sonifery) in the peer-reviewed and gray literature, which is now being expanded to include representative recordings of fish sounds contributed by researcher donations of known and unknown fish species (Looby et al., 2021). This is a significant step; however, this library currently only accepts recordings of fish sounds that can be associated with some form of published reference.
In general, existing libraries are "silos," lacking the cohesiveness that a taxa-independent global library or network could provide. Moreover, PAM is not a traditional method of categorizing or preserving information on diversity. Thus, keeping such libraries up to date has not been a focus and, in recent years, many libraries have lagged in their updates. Sustainability and accessibility of a sound collection are critical, particularly when it is tied to a single researcher, rather than a host institution.
Finally, few libraries identify what is missing from their catalogs. While this is a more complex task for fish and invertebrates, examples like Cornell University's Macaulay Library have a list of target species for which they have fewer than 10 recordings. As our list of confirmed sources and known soniferous species increases, so does the ease with which the unconfirmed sources can be identified via a "weight of evidence" approach.
Here, we provide justification for the creation of a global bioacoustics platform that integrates and expands on existing libraries by describing five critical characteristics of such a program and what its extensions can bring to acoustic research and monitoring. The benefits of a global sound library include: (1) a full inventory of known underwater sound sources; (2) a baseline of unidentified biological sounds; (3) the foundation for a training platform for detection and classification algorithms (at both a source and soundscape level); (4) standardized metadata for understanding how, when, and where the recordings were made; and (5) an open-access (including for citizen science/public users) database to make aquatic biological sounds more accessible to the general public and allow them to upload sounds and add to the dataset (see Figure 2 for a conceptual diagram of such a potential integrated library). In addition to these benefits, the global sharing of such an expansive database, from potentially numerous contributors, holds the potential for multiple broad-scale collaborations on regional and international trends of PAM detections. Similar efforts have been achieved in related fields, like acoustic telemetry (e.g., Hussey et al., 2015; Sequeira et al., 2019; Lédée et al., 2021; Matley et al., 2021), and visual censusing of marine fauna (e.g., Langlois et al., 2020), fostered by open forums and working groups to develop such research. We also discuss some of the technical challenges in developing this platform, historical hurdles that may have prevented previous attempts at such global data sharing environments, and a potential way forward for building this resource.
The discussion presented in this paper originated within the "Working Group on Acoustic Measurement of Ocean Biodiversity Hotspots" of the International Quiet Ocean Experiment (Boyd et al., 2011), an international program of research, observation and modeling formed to better characterize and understand ocean sound fields and the effects of sound on marine life. This collaboration was then expanded to include authors that are involved in the development, presentation and maintenance of existing underwater bioacoustics repositories; i.e., an overall partnership that represents various stakeholder groups, including bio- and ecoacousticians, and research specialists of a variety of taxa and ecosystems. All of these authors are aware of the benefits a global library of underwater biological sounds offers to the scientific, environmental management and public sectors. This paper, however, is not meant to dictate the exact form such a global underwater biological sounds library should take, but is meant to renew and revitalize discussion on the topic, present some of the many considerations such an effort would require, and describe a possible path forward for us or others to undertake as opportunities and interests arise. A more detailed discussion of such a program, involving a wider network of contributors, is planned through upcoming stakeholder engagement and scoping workshops.

TABLE 1 (excerpt) | 155 sound samples of 153 fish species from the Western North Atlantic (Fish and Mowbray, 1970).

Applications of an Inventory of "Known" Sounds
Creating a reference library of aquatic sounds from known origins will broaden our reference list for confirming the sources of sounds that appear in recordings and help expand our knowledge of aquatic acoustic diversity, as well as our understanding of taxonomic biodiversity and ecology. Bringing known sounds together in a unified depository or single platform with links to multiple existing databases facilitates easy comparison among species, locations, species repertoires, and recording methodologies. Studies often focus on a single or a limited number of species and, therefore, so does a project's data analysis. Although the advent of multi-species automated detection platforms has brought significant advances in the terrestrial environment (Potamitis, 2014; Sueur and Farina, 2015; Farina et al., 2018; Kahl et al., 2021), such analyses are currently lagging behind in the underwater environment. Datasets are rarely processed for sources beyond the focal species, owing to the funding and effort required as well as a lack of individual knowledge about all the different sounds and sources that exist. Easing this burden requires a collective effort to detect, identify, characterize, and collate sources. A global reference library of underwater biological sounds would increase the ability for more researchers in more locations to broaden the number of species assessed within their datasets and to identify sounds they personally do not recognize. Such access would ultimately lead to a description and catalog of acoustic biodiversity around the globe, and an increased understanding of acoustic ecology. A global database could serve broader questions, like determining universal trends in underwater sound production, while individual, specialized repositories could continue to inform and detail other topics, such as documenting the presence of soniferous species in a particular region.

Spatiotemporal Species Mapping
The expansion of PAM data collection has increased our understanding of spatiotemporal patterns of individual species' presence and acoustic behavior. As a result, reported distributions of these species are being expanded. Even some of the great whales are being found in places they were not expected (Allen et al., 2021), and occasionally a new species (Rosel et al., 2021) or a new sound (Cerchio et al., 2020) is discovered. This fact could prove vital for soniferous fauna, as our ever-changing climate ensures that many species are modifying their distributions and broadening or reducing their ranges (e.g., Scheinin et al., 2011; Ramirez et al., 2017; Bonebrake et al., 2018). Biologically important areas can be mapped; spawning grounds, essential fish habitat, and migration pathways can be delineated (e.g., Luczkovich et al., 1999; Rountree et al., 2006; Mann et al., 2009; Morano et al., 2012a; Schärer et al., 2014; Bertucci et al., 2015; Lammers and Munger, 2016; Karaconstantis et al., 2020); the timing of reproductive activities can be associated with the environment (e.g., Mann and Grothues, 2008; Parsons, 2010; McWilliam et al., 2017; Zarada et al., 2019); and displacement from preferred habitats due to anthropogenic activities and noise, such as that from shipping lanes and exploration surveys, can be mapped (e.g., Tyack, 2008; Castellote et al., 2012; Rako et al., 2013). These and other questions can be queried on broader scales if we have a global catalog of sounds.

Comparisons of Signal Structure
Comparison of sounds from a single species across broad areas and times provides the ability to understand signal diversity and evolution, and to gain insights into species ecology (e.g., Tellechea et al., 2010). Fin whale (Balaenoptera physalus) calls, for example, differ among populations (Delarue et al., 2009), between the Northern and Southern hemispheres (Gedamke and Robinson, 2010; Širović et al., 2013; Aulich et al., 2019), as well as over seasons (Morano et al., 2012b). Pilot whales (Globicephala melas), on the other hand, produce similar call types across the hemispheres even though populations' home ranges do not (or no longer) cross the equator, raising interesting questions about their acoustic ecology and evolution (Courts et al., 2020). Fishes may also develop "dialects," such as the different acoustic characteristics of agonistic sounds produced by the skunk anemonefish (Amphiprion akallopisos) in Madagascar compared with those in Indonesia (Parmentier et al., 2005). Cultural evolution of humpback whale (Megaptera novaeangliae) song has been observed across ocean basins, providing greater understanding of population interactions across the Pacific Ocean (Garland et al., 2011) and around the coast of Australia (Allen et al., 2018). Call structure and source spectra of blue (Balaenoptera musculus) and pygmy blue whales (Balaenoptera musculus brevicauda) around the world evolve through time, with peak frequencies changing each year (e.g., McDonald et al., 2009; Gavrilov and McCauley, 2012), such that detection algorithms developed in one year may not be successful some years later; thus, keeping libraries up to date aids classification efforts. Sound production between similar species within a taxonomic family can also be compared, such as those of mulloway (Argyrosomus japonicus) and black jewfish (Protonibea diacanthus) in Australia (Parsons et al., 2012, 2013b, 2016) with those of French meagre (A. regius) in Europe (Lagardère and Mariani, 2006; Bolgan et al., 2020b), or those of various species of toadfishes in the Pacific, Indian, and Atlantic oceans (Thorson and Fine, 2002; Rice and Bass, 2009; Mosharo and Lobel, 2012; Alves et al., 2016; Staaterman et al., 2018; Pyć et al., 2021), to better understand the variation within families.

Acoustic Communities
There is increasing evidence that the study of acoustic communities, based on acoustic characteristics of the sounds emitted by animal communities, provides ecologically relevant information (Francis et al., 2009;Farina and James, 2016;Desiderà et al., 2019;Mooney et al., 2020). Soundscapes provide unique opportunities to investigate the biodiversity and community of soniferous species, frequency and temporal niche partitioning, and organism-environment relationships (Ruppe et al., 2015;Di Iorio et al., 2021;McKenna et al., 2021). However, this field is in its infancy and requires a catalog of identified sounds to develop reliable and time-efficient classification techniques that will be necessary for describing the acoustic communities and relating them to the underlying animal assemblages (Mooney et al., 2020).

Environmental Noise
Anthropogenic noise, such as that from vessels, exploration, construction, and aerial vehicles (Reine et al., 2014;Newhall et al., 2016;Pangerc et al., 2016;Erbe et al., 2018;Chion et al., 2019;McCauley et al., 2021;Parsons et al., 2021), is a growing pollutant in the underwater environment and high ambient noise levels, such as those found in areas of intense human activity, inhibit signal detection (Hildebrand, 2009;Duarte et al., 2021). If the observer knows a target species' signal characteristics, these sounds may be more easily detected, but without prior knowledge of either presence or structure of sounds, listening through the noise can be difficult. This has been highlighted by the recent COVID "anthropause" experienced at various aquatic locations around the world (e.g., Bates et al., 2021;De Clippele and Risch, 2021;Dunn et al., 2021;Gabriele et al., 2021;Ryan et al., 2021), where removal of the anthropogenic component of some soundscapes has provided an opportunity to observe sounds (and therefore presence) of marine fauna that might otherwise be lost in the noise (e.g., Pine et al., 2021). However, it is not just anthropogenic noise that limits acoustic detection of marine fauna. The ocean is naturally noisy and geophysical noise (such as from wind and ice) exceeds anthropogenic noise in many regions and seasons (e.g., Farcas et al., 2020;Erbe et al., 2021;Sertlek, 2021). The number and intensity of storms and extreme weather events are expected to increase with climate change (e.g., Cheal et al., 2017), inevitably contributing further noise to the underwater environment (e.g., Zhao et al., 2014;Ashokan et al., 2015;Zhang et al., 2018). A reference library of sounds, as well as detection algorithms, would significantly ease the detection of sounds in low SNR environments.

Assisting Unknown Source Identification
A sound catalog can provide a reference for comparison with unknown sounds to assist in their source identification, potentially through an online tool within the library. The associated metadata that accompanies recordings (discussed below) may also contribute to a weight-of-evidence approach in identifying sound sources. Sainburg et al. (2020) demonstrated the use of unsupervised learning to assemble acoustic signals displaying similar spectral-temporal modulation features into groups. Such exploratory data analysis tools will assist in identification of sound sources; however, their design and functionality may be dependent on the amount of data (general and source-specific acoustic data and validation data), the distribution of the source signal and potential sources, and the characteristics of the signal, among other factors. In selected species, where sound production has not yet been confirmed, it may be possible to use signals reported from a closely related species to assist in detecting sounds or choruses from the targeted species. Calls produced by terapontid fish species, for example, are often similar, but not all species within the family have been reported to produce sound (Parmentier et al., 2016; Looby et al., 2021). Although not definitive, sound production by related species can provide evidence toward soniferous behavior, though caution is warranted as species that appear morphologically similar can be acoustically different, such as Ophidiiformes (Mann et al., 1997; Parmentier et al., 2006, 2010).
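The idea of comparing an unknown sound against catalog references can be sketched as a simple nearest-reference ranking. This is a deliberately simpler technique than the unsupervised grouping of Sainburg et al. (2020): each sound is reduced to a short feature vector and ranked by cosine similarity. The function names, the three-band energy features, and all numeric values are made up for illustration and are not measured data.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def rank_matches(unknown, references):
    """Return (label, similarity) pairs sorted from best to worst match."""
    scores = [(label, cosine_similarity(unknown, vec))
              for label, vec in references.items()]
    return sorted(scores, key=lambda item: item[1], reverse=True)

# Hypothetical band-energy profiles (low, mid, high frequency).
references = {
    "type_A": [0.9, 0.1, 0.0],
    "type_B": [0.1, 0.8, 0.1],
    "type_C": [0.0, 0.2, 0.8],
}
unknown = [0.2, 0.7, 0.1]

ranking = rank_matches(unknown, references)
best_label, best_score = ranking[0]
print(best_label, round(best_score, 3))
```

A ranking like this offers only one strand of a weight-of-evidence assessment; location, depth, season, and the other metadata attached to the recordings would still need to be weighed before attributing a source.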

Basis for Machine Learning Development
A library of reference sounds requires only a handful of examples for each individual sound type. In contrast, a dataset for training artificial intelligence (AI) requires a far larger number of signals, ideally several thousands of examples (e.g., Madhusudhana et al., 2020, used > 11,000 replicates of the same call type to build robust detectors). The library itself can be of benefit as it provides the basis from which the AI datasets can be developed either through providing numerous examples for directly training models, or for facilitating event mining from previously collected data streams (Xie et al., 2008;Zhang et al., 2013). For example, Miller et al. (2021) initiated an open-access library of annotated recordings to train and evaluate automated detectors of Antarctic blue whale and fin whale (B. physalus) calls. The library was designed to include recordings from a broad range of instruments, locations, environmental conditions, and years, to ensure that robust detectors can be developed and tested across a suite of recording conditions. However, given the scope of this library initiated through the Southern Ocean Research Partnership Fin and Blue Whale Acoustics Group (Van Opzeeland et al., 2014), it is unlikely to be extended to other regions and taxa.
As the number of samples of a given sound type reaches critical mass, and recordings of them are also sufficiently rich in signals recorded under different conditions (e.g., SNR, acoustic environment, recording methods), that sound type can be flagged as available for the development of detection algorithms. Indeed, presenting the information on current sample numbers within each sound type could prompt contributors to target them, increasing the likelihood of collectively bringing the dataset up to the required level. If it can be achieved, a library that includes an entire species' sound repertoire will assist in validating detection algorithms and provide the ability to expand these algorithms to datasets where the call type was not the original target for analysis, and conduct this on a global, rather than local scale. For species that produce sounds that change with time, historical data and continual updating of the library could assist in predicting future evolution (Gavrilov and McCauley, 2012).
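The flagging idea described above can be sketched as a simple tally. The thresholds (`min_samples`, `min_conditions`), the condition labels, and the function name are hypothetical; what counts as "critical mass" would need to be agreed by the community.

```python
from collections import defaultdict


def readiness_report(annotations, min_samples, min_conditions):
    """Per sound type: sample count, distinct recording conditions, ready flag."""
    counts = defaultdict(int)
    conditions = defaultdict(set)
    for sound_type, condition in annotations:
        counts[sound_type] += 1
        conditions[sound_type].add(condition)
    return {
        st: {
            "samples": counts[st],
            "conditions": len(conditions[st]),
            "ready": counts[st] >= min_samples
            and len(conditions[st]) >= min_conditions,
        }
        for st in counts
    }


# Hypothetical annotations: (sound type, recording-condition label).
annotations = [
    ("type_A", "site1_highSNR"),
    ("type_A", "site2_lowSNR"),
    ("type_A", "site3_highSNR"),
    ("type_B", "site1_highSNR"),
]
report = readiness_report(annotations, min_samples=3, min_conditions=2)
```

Publishing a report of this kind alongside the library would show contributors which sound types are closest to the threshold.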

Database of Unknown Sounds
A database of unidentified sounds is, in some ways, as important as one for known sources; as the field progresses, new unidentified sounds will be collected, and more unidentified sounds can be matched to species. These sounds and the times and locations of their recording can form a basis for future identification and ease mapping of the species' distribution once the source has been confirmed. Given the increasing rate of data collection, it is better to start building a map of these sounds as soon as possible. The library can also provide evidence to help test hypotheses of source species for unknown sounds if there are sufficient recording locations that can be compared to distribution maps of potential source species (e.g., those produced from catch data or visual census).
Although the analysis of acoustic communities benefits from a baseline of cataloged sounds, most sound sources that contribute to the soundscape remain uncertain and most libraries only archive signals with known species' identity. In addition, we know more about the sounds of endangered or commercially important species than those of commonly encountered species (Luczkovich et al., 2008;Popper and Hawkins, 2019). This knowledge gap has impeded effective use of underwater soundscapes in monitoring marine biodiversity, but much information on acoustic ecology can still be gleaned from categorized sound types of unknown origin (Le Bot et al., 2015;Rountree et al., 2019;Bertucci et al., 2020;Bolgan et al., 2020a;Di Iorio et al., 2021). A library to archive unknown sounds and their recording times and locations will be crucial for guiding future studies of marine bioacoustics and biodiversity. This is especially important in areas that are rarely investigated or where source identification is particularly problematic, such as the twilight and midnight zones, where a description of unknown sounds can give us insights on biodiversity in the deep ocean (e.g., Mann and Jarvis, 2004;Rountree et al., 2012;Lin et al., 2019).
There have been considerable advances in the fields of facial and voice recognition that have advanced public use of phone-based apps to identify music, plants, and the calls of frogs and birds (Kahl et al., 2021). However, this success has been largely due to the enormity of the respective databases from which AI algorithms can be trained, a goal that has only recently become possible in the aquatic environment and only for selected call types. The sheer enormity of data now collected in many underwater acoustic studies, together with the myriad of signals often present and the extreme amount of time required to search these records in more "standard" methods, means there is a clear opportunity for AI to improve efficiency and extract more information from these datasets. Increasingly, neural networks and other AI methods are being used to detect marine mammals in historical recordings, such as those of humpback whales across the Hawaiian archipelago (Allen et al., 2021), and multiple cetacean species along the west coasts of Canada and Australia (Mellinger and Clark, 2006). Detections of pulse trains, typically based on previously identified or grouped inter-pulse timing (the time between pulses of sound), are being successfully applied to count the number of echolocating individuals and fish sounds within datasets (Bahoura and Simard, 2010;Le Bot et al., 2015;Ibrahim et al., 2018).
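To make the notion of inter-pulse timing concrete, the following minimal sketch groups detected pulse times into trains and summarizes the inter-pulse interval of each train. It assumes pulse detection has already been done; the `max_gap_s` threshold and the example times are hypothetical, and the cited studies use more elaborate methods.

```python
from statistics import median


def group_pulse_trains(pulse_times, max_gap_s):
    """Split pulse times into trains wherever the gap exceeds max_gap_s."""
    trains = []
    for t in sorted(pulse_times):
        if trains and t - trains[-1][-1] <= max_gap_s:
            trains[-1].append(t)
        else:
            trains.append([t])
    return trains


def inter_pulse_intervals(train):
    """Time differences between consecutive pulses in one train."""
    return [b - a for a, b in zip(train, train[1:])]


# Hypothetical pulse detection times in seconds.
pulse_times = [0.00, 0.10, 0.20, 0.30, 2.00, 2.25, 2.50]
trains = group_pulse_trains(pulse_times, max_gap_s=0.5)
median_ipis = [median(inter_pulse_intervals(tr)) for tr in trains]
```

Trains with clearly different median intervals, as in this example, are candidates for different sources or sound types, although overlapping trains from several individuals would need a more careful separation step.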
Many machine learning techniques have been developed under the framework of AI (e.g., Shamir et al., 2014). Application of these techniques has begun to coalesce and recent studies now extract multiple features to detect the different types of signals, such as Malfante et al. (2018), who tested 84 extracted features to detect and characterize four general classes of fish sounds: (1) impulsive (pulses with signal separation > 1 s); (2) trains of > 15 pulses audible as either a single tone or a series of knocks in quick succession; (3) wide-band signals of 10-30 s in duration; and (4) short signals with harmonic structure. Each of these sound types required the detection of different features and individual machine learning. The recent development of deep learning has significantly reduced the work required in feature extraction (Shiu et al., 2020;Kahl et al., 2021;Waddell et al., 2021) and even made end-to-end learning feasible, in which AI models automatically learn features from raw audio to perform signal detection and classification. Successful examples include avian calls (Bravo Sanchez et al., 2021), odontocete clicks (Luo et al., 2019;Roch et al., 2021), and frog calls (Xie et al., 2020). Hand-selected features, such as peak frequency and frequency bandwidth measured manually or automatically, are no longer a necessity.
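For readers unfamiliar with hand-selected features, the sketch below computes two of them, peak frequency and a -3 dB frequency band, from a short synthetic signal using a naive discrete Fourier transform. The test signal, sampling rate, and 3 dB criterion are assumptions for illustration; real analyses would use calibrated recordings and an FFT library.

```python
import cmath
import math


def power_spectrum(samples, fs):
    """Naive DFT power spectrum over the non-negative frequency bins."""
    n = len(samples)
    freqs, power = [], []
    for k in range(n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(samples)
        )
        freqs.append(k * fs / n)
        power.append(abs(coeff) ** 2)
    return freqs, power


def peak_and_band(freqs, power, drop_db=3.0):
    """Peak frequency, plus lowest and highest bins within drop_db of the peak."""
    peak_idx = max(range(len(power)), key=power.__getitem__)
    threshold = power[peak_idx] / (10 ** (drop_db / 10))
    in_band = [f for f, p in zip(freqs, power) if p >= threshold]
    return freqs[peak_idx], min(in_band), max(in_band)


# Hypothetical signal: two tones at 400 Hz and 408 Hz, sampled at 2048 Hz.
fs, n = 2048, 256
signal = [
    math.sin(2 * math.pi * 400 * i / fs) + 0.8 * math.sin(2 * math.pi * 408 * i / fs)
    for i in range(n)
]
freqs, power = power_spectrum(signal, fs)
peak_hz, low_hz, high_hz = peak_and_band(freqs, power)
```

End-to-end deep learning replaces this kind of explicit measurement with features learned directly from the audio, which is why such hand-selected values are no longer strictly required.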
Developing a global database that can assist in modifying or developing algorithms in the underwater environment holds significant potential for detecting, classifying, and quantifying spatiotemporal distribution and abundance of aquatic fauna. Such a global database of known and unknown sound sources can benefit both supervised and unsupervised machine learning. Supervised machine learning is effective when training data of detection/classification targets are available. However, most underwater biodiversity assessments are unable to make sure all sound sources are already covered in the training database. Unsupervised machine learning may help discover the structure of sound categories from a substantial number of unlabeled recordings and reduce the effort required in manually annotating signal types and characteristics (Frasier et al., 2017;Phillips et al., 2018;Lin et al., 2021;Ozanich et al., 2021). Deep-learning models that have learned the structure of labeled and unlabeled recordings archived in this global database will be adaptable to other applications, via transfer learning (Yosinski et al., 2014). Therefore, this global library will benefit the detection and quantification of signal types, which may be added to a suite of acoustic metrics, to be used collectively as scene classifiers, to routinely characterize the soundscape.
Any AI model must learn the difference between target signals and background noise. Voice recognition techniques often begin with clean recordings and synthesize training data by adding noise to the known reference (Lu et al., 2013;Xu et al., 2015). These are then used to train models to extract speech from recordings that contain real noise and target signals in long-term recordings. Such data augmentation techniques have proven effective in improving the performance of AI models and have been widely applied in speech and music enhancement models (see review in Lin and Tsao, 2020). Recordings of biological sounds with high SNR will therefore be crucial to the development of a marine fauna AI database. The ultimate goal of an AI algorithm is that it can accurately classify sounds through any or all recordings of any duration and noise level.
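The augmentation step described above, adding noise to a clean reference at a chosen SNR, can be sketched as follows. The tone standing in for a clean call, the Gaussian noise, and the list of SNR values are all hypothetical; in practice the noise would be drawn from real ambient recordings.

```python
import math
import random


def rms(samples):
    """Root-mean-square amplitude of a sequence of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def mix_at_snr(signal, noise, snr_db):
    """Scale `noise` so that signal-to-noise RMS ratio equals snr_db, then add."""
    if len(signal) != len(noise):
        raise ValueError("signal and noise must have the same length")
    noise_rms = rms(noise)
    if noise_rms == 0:
        raise ValueError("noise is silent; cannot scale it to a target SNR")
    target_noise_rms = rms(signal) / (10 ** (snr_db / 20))
    gain = target_noise_rms / noise_rms
    scaled_noise = [gain * n for n in noise]
    mixture = [s + n for s, n in zip(signal, scaled_noise)]
    return mixture, scaled_noise


# Hypothetical example: a 200 Hz tone as the clean call, plus Gaussian noise.
rng = random.Random(0)
fs = 8000
call = [math.sin(2 * math.pi * 200 * t / fs) for t in range(fs)]
noise = [rng.gauss(0.0, 1.0) for _ in range(fs)]
augmented = {snr: mix_at_snr(call, noise, snr)[0] for snr in (20, 10, 0, -5)}
```

Each clean recording therefore yields several training examples at different noise levels, which is why high-SNR reference recordings are so valuable to the library.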
An audio database for training AI models requires large numbers of recordings for one species or sound type of unknown origin. The goal of building such a database is to train a model that can effectively recognize species-specific or sound type acoustic features. This requires signals that have been recorded and processed to a certain set of criteria. Although it is difficult to assess how many recordings will be needed, in general the greater the number of sound samples and the higher the sound quality, the more reliable and precise the automatic classification becomes, as the algorithm learns and improves its performance with increasing data availability (Zhong et al., 2020). For species with more complex vocal repertoires, greater amounts of training data further improve classification. Thus, the prerequisite to apply these techniques is a robust and representative training dataset, which is what the library we propose here could provide.

Citizen Science
AI has facilitated the development of many highly popular image-based animal, plant and music recognition applications (apps). Possibly the best known is iNaturalist, 1 though other more taxonomically focused applications are emerging (e.g., Merlin BirdID, 2 WikiAves). 3 The iNaturalist app started as a crowd-sourced community, where people uploaded animal or plant photos to be identified by other users, and has become a place where images are identified by artificial intelligence. In the biological sounds space, FrogID 4 and BirdNet 5 have shown the possibility of using machine learning with signal processing to allow researchers and citizen scientists alike to identify frogs and birds by recording calls with a phone (Rowley et al., 2019;Kahl et al., 2021).
Much like BirdNet and FrogID, a library of underwater biological sounds and any automated detection algorithms would be useful not only for the scientific, industry and marine management communities, but also for users with a general interest. Acoustic technology has reached the stage where a hydrophone can be connected to a mobile phone so people can listen to fishes and whales in the rivers and seas around them. Therefore, sound libraries are becoming invaluable to citizen scientists and the general public, with signal-processing automated detection algorithms supporting the decision networks behind apps like FrogID and BirdNet for someone to record a sound and identify the source. FrogID has over 50,000 recordings uploaded for the > 240 species of frogs in Australia, and the Cornell Lab of Ornithology's BirdNet app has been downloaded over 1,000,000 times and has records of 3,000 species of bird calls across 40 countries (Kahl et al., 2021). As evidence of this type of application moving into the underwater world, the River Listening app, 6 which began in Australia (Barclay et al., 2018), encourages the general public to record sounds in rivers and coastal waters to listen to the sounds of fishes. Further, Chapuis et al. (2021) showed the utility of waterproof recreational recording systems (such as GoPros) to collect information on underwater soundscapes and in particular, recording and cataloging biological sounds, while Lamont et al. (2022) highlighted how low-cost alternative hydrophones and recording systems (such as the Hydromoth) are becoming increasingly available to scientists and the general public. These types of systems can provide valuable PAM data; however, the calibration, variability in sensitivity and directionality, and low signal-to-noise ratios mean additional considerations must be made, to be able to use the data within the library, in particular, for sound analysis purposes.
Increased sampling efforts from citizen scientists could be invaluable for the detection of vocal fauna in coastal and inland waters. For example, FishBase 7 uses community input to provide information on each species (Froese and Pauly, 2021), while Redmap (Range Extension Database & Mapping project) 8 is more explicit, inviting the general public to spot, log, and map marine species that are uncommon in Australia, or along particular parts of the coast to monitor changes in species distribution. Future online libraries, such as the Open Portal to Underwater Soundscapes (OPUS) 9 would be expected to facilitate public contributions, similar to WhaleFM, 10 a citizen science project that focused on categorization of call types produced by two cetacean species.

Metadata and Functionality
Creating a library with established metadata and information criteria will help standardize the format in which signals are reported, optimize use of the library, and ease future classifications of sounds (Frazao et al., 2019). It is important to provide guidelines on all the pertinent information that could be provided by a person collecting the original recording and that should be included when it is presented, for example, as a static spectrogram on a library website (Parsons, 2010;Warren et al., 2018;Frazao et al., 2019;Looby et al., 2021;Miller et al., 2021). Such metadata standardization can also build confidence in the library's utility and attract support from national bodies for its application, such as the ADEON noise reporting standards (e.g., Ainslie et al., 2017). Each criterion may not be required for entry of a sample into the library, but the level of information supplied determines the level of potential use of a sample within the database. There are three criteria that determine how useful a recording could be to the database and how it could fit with known information about the species and its soniferous behavior around the world. These relate to the information available about the recording and the source species:
• Metadata pertaining to the specific recording (e.g., recording equipment and pre-amplifier used, calibration, model, and sensitivity; recording settings such as gain, sampling rate, number of bits and duty cycle; recording methodology such as deployment configuration; environmental conditions such as depth and bottom characteristics; location and timing), and how it is presented (e.g., un-/calibrated waveform, spectrum with specified window lengths, resolution, FFT/DFT size, and overlap, or the code and settings associated with a plugin automatically generating visualizations of the recording). Recordings taken under controlled conditions (e.g., within tanks or aquaria) have additional acoustic and behavioral considerations and therefore require additional metadata, such as the tank material and dimensions, acclimation time and number of other individuals present.
• Information about the source species in general, such as recording-specific information (e.g., behavioral context if concurrent visual observations were made); more general information may be supplied by the contributor or updated by the host, through continued review of literature (e.g., known distribution, auditory ability, known sound types and their characteristics, and sound production mechanism and typical behavioral contexts associated with sound production, if known). This information provides context to place calls into a species' known behaviors.
• Associated information about the location relative to the broader region (e.g., description of community species composition, habitat, and local soundscape information). This information provides a broader picture of how the local environment may have affected the animal producing the sound.
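One way to picture the three groups of information is as a record in which every field is optional and the proportion supplied in each group indicates the potential uses of a sample. The sketch below is only an assumption about how such a record might look; the field names, the small number of fields, and the `completeness` helper are hypothetical and do not represent an agreed standard.

```python
from dataclasses import dataclass, field, fields
from typing import Optional


def _f(group):
    """Optional field tagged with the information group it belongs to."""
    return field(default=None, metadata={"group": group})


@dataclass
class SampleMetadata:
    # Group 1: metadata about the specific recording.
    recorder_model: Optional[str] = _f("recording")
    sampling_rate_hz: Optional[int] = _f("recording")
    gain_db: Optional[float] = _f("recording")
    depth_m: Optional[float] = _f("recording")
    # Group 2: information about the source species.
    species: Optional[str] = _f("species")
    behavioral_context: Optional[str] = _f("species")
    # Group 3: information about the location and broader region.
    habitat: Optional[str] = _f("location")
    community_description: Optional[str] = _f("location")


def completeness(sample):
    """Fraction of fields supplied within each information group."""
    totals, filled = {}, {}
    for f in fields(sample):
        group = f.metadata["group"]
        totals[group] = totals.get(group, 0) + 1
        if getattr(sample, f.name) is not None:
            filled[group] = filled.get(group, 0) + 1
    return {group: filled.get(group, 0) / totals[group] for group in totals}


# Hypothetical entry for an unidentified sound with partial metadata.
sample = SampleMetadata(sampling_rate_hz=48000, depth_m=12.5, habitat="seagrass")
scores = completeness(sample)
```

A sample with no species information, as here, could still be accepted into a database of unknown sounds, while its completeness scores signal which analyses it can support.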
Defining these requirements, their level of detail, and the final platform design requires the collective expertise of biologists (who have experience related to the potential numbers and types of sounds a species may exhibit), acousticians (who appreciate the impact that propagation losses, sampling methods, and processing techniques may have on the characteristics of the audio clips), signal processing experts (who develop and apply detection, classification, and recognition algorithms, and who can detail the needs of turning example signals into a database for automated detection), and data scientists/database developers (who can develop a scalable and searchable database that can be effectively used and accessed by a broader user community).
In such a way, data are optimized at a quality that is useful for future applications, such as AI development or global-scale meta-analyses and reviews of sound production. Users would benefit from not only the sounds themselves, but the associated metadata about the sounds (Teixeira et al., 2019;Kahl et al., 2021;Lin et al., 2021) and, if there are multiple recordings, a classification of the sound type in which it fits (Sainburg et al., 2020). This information can assist in categorization of an unknown sound and provide context around the recording from an environmental, methodological, or behavioral perspective. However, the level of data made available in the library for each species and each recording depends on the information provided by the contributor, and researchers from fields with different objectives, backgrounds, and experiences typically report information in different ways. The most common example in bioacoustics is the classification of signals by phonetic description (onomatopoeia), such as the "thwop," "muah," and "boop" of humpback whale social sounds (Recalde-Salas et al., 2020), and various onomatopoeic descriptions of species-specific and unidentified fish sounds (e.g., Tavolga, 1971;Thorson and Fine, 2002;Staaterman et al., 2018;Waddell et al., 2021). Such calls may also be reported in a physics-focused context that includes categories of frequency- or amplitude-modulated or continuous-wave signals.
Finally, the library and its potential as an AI database would benefit from a scalable design that allows frequent expansion and a web platform that can be continually updated. The database would include a user-friendly interface to investigate data and upload sounds, with automated quality control. Presentation of the data is also a consideration, not only with respect to the variety of signals, but also individual recording locations and their temporal distributions. Data portals, such as the interactive maps of the North West Atlas in Australia, 11 allow viewers to choose any study site in a map and view a synopsis of species composition and a video snapshot of the site, along with environmental data. This could also be achieved from an acoustic perspective. For example, the data portal of the Integrated Marine Observing System 12 allows viewers to peruse long-term spectrograms of recording sites for a user-defined period. An interactive map that can incorporate all these options becomes a user platform for the acoustic data. OPUS is a recent initiative driven by the International Quiet Ocean Experiment that is currently under development and includes some of these functions. This program was created to share underwater soundscapes through audio and synchronized spectral visualizations at staggered temporal resolution. It allows viewers to select locations from a map and explore local soundscapes while also logging events of interest, thereby inviting the public to participate in creating overall logs with acoustic events that can support further processing of the data.

HISTORIC HURDLES AND CHALLENGES
This is not the first time a global approach to data sharing has been suggested in underwater acoustics research. In recent years, multiple international research and blue economy-focused workshops have repeatedly identified the need for global sharing of data, technology, and best practices, to grow techniques and ensure that economic, environmental, and social benefits, developed through the application of knowledge, are realized to the benefit of all (e.g., World Wildlife Fund [WWF], 2017; European Commission, 2018). Most recently, the emergence of COVID-19 has provided a perfect example of the need for international transparency and collaboration to maximize research opportunities and rapidly respond to urgent needs, and it has highlighted how achievable this approach is with modern technology (Apuzzo and Kirkpatrick, 2020). In an acoustics forum, among other workshops, the special sessions of the Acoustical Society of America [ASA] (2018) identified that there are "an increasing number of applications of machine learning methods in ocean acoustics, particularly when working with large data sets" and discussions focused on data access, code-sharing, and reproducible research.
11 https://northwestatlas.org/nwa/map/gallery
12 https://portal.aodn.org.au/search
An integrated sound sharing platform begins with three main areas of development to focus and pool efforts: large-scale archives of annotated and unannotated audio data, the open-access reference library of identified and unidentified sound sources, and data mining processes, including AI algorithms. Acoustic repositories and data portals, such as OPUS and IMOS, are becoming increasingly common. Importantly, initiatives such as the allocation of 15 petabytes for a passive acoustic data portal by the National Center for Environmental Information (NCEI) of the National Oceanic and Atmospheric Administration (NOAA; United States of America) illustrate the growing appreciation and realization of this need at the national scale.
Reference libraries have existed on various scales for many years and advances in technology are quickly increasing their ability to expand and integrate user contributions. The Detection, Classification, Localization, and Density Estimation (DCLDE) and the Detection and Classification of Acoustic Scenes and Events (DCASE) workshop series (2003-2022 and 2013-2022, respectively) have focused on data mining and analytical approaches. These groups have been the main producers of public datasets to advance machine learning applications of biological sounds in the ocean (Frazao et al., 2019) and the workshops regularly provide training sets to test detection algorithms under different conditions (e.g., various frequency-dependent SNR and propagation losses). 13 Their outputs have shown what is achievable from data-sharing of comparatively "small" platforms (previously up to 10 TB), which complement the sharing of open-source code that individuals are increasingly providing with the publication of analytical works (e.g., Bergler et al., 2019;Bermant et al., 2019;Madhusudhana et al., 2020;Lin et al., 2021). Together these activities highlight the potential for data-sharing of acoustic information to be applied to larger repositories that are now more achievable with cloud-based options, such as AI for Social Good, 14 or government-supported platforms, such as the NCEI.
Although the development of a global, integrated, and open-access underwater sound reference library, repository, and sharing platform has been suggested previously, and despite the increasing appearance of and support for individual components of such a program, it has not been fully realized on an international level. The main barrier to creating an international database of aquatic bioacoustics may be as simple as sourcing adequate funding to achieve such a sizable task, due to a lack of awareness of the value and importance of the product among organizations with the financial resources to support its creation and continuation.
To make a global underwater sound library a success, broader engagement, buy-in, and support of the scientific community will be needed, as well as providing incentives for individuals to contribute their sounds and algorithms to the library (e.g., Bradbury et al., 1999;Gaunt et al., 2005). There are several nontrivial hurdles to establishing this buy-in. Firstly, researchers often need to be convinced about the value of open and accessible science that may counterbalance more individualistic benefits associated with their intellectual property and therefore encourage contribution of recorded sounds to a repository; this parallels the ongoing broader scientific cultural change toward promoting data sharing and accessibility. A repository that provides a way to have example sounds as citable data (such as through providing a DOI number) further motivates individuals to contribute by ensuring they receive appropriate future credit for their original recordings; however, this is matched with the consideration that in some cases, contribution may require signing over copyright and access rights for that acknowledgment. Secondly, a repository needs to reduce burdens for individuals to contribute sounds and provide a system that can easily ingest audio and relevant metadata (Bradbury et al., 1999). A third challenge is raising the awareness that many individual archives are not as permanent as individuals think; analog media often degrades over time (Gaunt et al., 2005) and hard drives are not immune from failure, so depositing sounds in a sustainable repository is an urgent need, particularly for older recordings. One example of such archiving is the recovery and digitization of the fish sound recordings taken by Fish and Mowbray (1970), as described in Rountree et al. (2002). Launching a new library is particularly taxing as it requires building the interest of potential contributors to maximize their donations, while having limited outputs to offer initially. 
This could be alleviated by integrating efforts from existing libraries and archives, rather than initiating an entirely new database, which will also increase the library's appeal to potential funding sources.
There may also be more nebulous factors that have limited the provision of appropriate funding, including the likely duration of the program (i.e., including long-term planning and on-going resources to maintain the platform) and facilitating the repeated meeting of numerous global partners needed to identify and agree on its structure and criteria. Securing the longevity of the program is vital to the usefulness of the platform as libraries that are not scalable, well-maintained, and continually updated can quickly become redundant or outdated. The world's increasing awareness around the environmental costs of data storage and processing mean that consideration of carbon neutrality will also be a key factor in the design and longevity of the program.
Passive acoustic research is now, it appears, rapidly approaching a nexus point. The changing environment and decreasing biodiversity are compelling the documentation of baseline acoustic observations. Technical advances associated with data collection and an increasing number of researchers and institutes collecting PAM data are providing the ability to create bioacoustic databases. Concurrently, awareness of the importance of acoustic cues to aquatic fauna, the impacts of noise on them and the potential for acoustic communities to provide an indication of ecosystem health has reached a stage where PAM is becoming appreciated as a mainstream data source across more species and ecosystems than ever. Finally, public interest and access to user applications means citizen scientists can drive widespread knowledge sharing. Now is the time to facilitate that progress by gathering the acoustic, ecological, and bioinformatic community together to realize an aquatic-sounds sharing platform.

FUTURE STEPS
The development of an international platform for sharing acoustic data is non-trivial and requires identifying and describing a number of inter-dependent factors including: (1) sources and protocols for securing and maintaining significant funding at national and international levels; (2) global interdisciplinary collaboration and stakeholder consultation to develop and agree on criteria for data supply and reporting and system configuration that produce the most useful, yet user-friendly, environment; (3) an appropriate scalable platform on which the facility can be hosted; (4) an open forum to facilitate open access and common development of AI algorithms; (5) continual system management and quality assurance; (6) establishment and agreement on the use of data and metadata standards; and (7) on-going promotion and engagement to ensure maximum use, such as open working groups to foster international collaborations focused on global spatiotemporal trends in detected aquatic fauna. These are multidisciplinary tasks requiring input from bio- and eco-acousticians, bioinformatics experts, AI engineers, web engineers, and stakeholders. To begin our journey along this shared pathway, we recommend a multi-disciplinary workshop to detail all the requirements for developing an appropriate library/database to fulfill the needs of all that may wish to access it and to detail the resources needed to support the work. Such an effort is critical and timely as we enter the UN Decade of Ocean Science for Sustainable Development.

AUTHOR CONTRIBUTIONS
MP, THL, TAM, CE, FJ, SoL, SiL, ML, AL, SLN, IVO, CR, ANR, LS, JS, EU, and LDI contributed to the conceptualization, writing, preparation, review, and editing of this manuscript. All authors have read and agreed to the published version of the manuscript.

FUNDING
Support for the initial author group to meet, discuss, and build consensus on the issues within this manuscript was provided by the Scientific Committee on Oceanic Research, Monmouth University Urban Coast Institute, and Rockefeller Program for the Human Environment. The U.S. National Science Foundation supported the publication of this article through Grant OCE-1840868 to the Scientific Committee on Oceanic Research.

ACKNOWLEDGMENTS
The IQOE Science Committee created the Working Group (WG) on Acoustic Measurement of Ocean Biodiversity Hotspots to address the examination of marine acoustic diversity. We thank them for their initiative and support. We would also like to thank the reviewers for their time and constructive comments in consideration of this manuscript.