How Loud Can You Go? Physical and Physiological Constraints to Producing High Sound Pressures in Animal Vocalizations

Sound is vital for communication and navigation across the animal kingdom and sound communication is unrivaled in accuracy and information richness over long distances both in air and water. The source level (SL) of the sound is a key factor in determining the range at which animals can communicate and the range at which echolocators can operate their biosonar. Here we compile, standardize and compare measurements of the loudest animals both in air and water. In air we find a remarkable similarity in the highest SLs produced across the different taxa. Within all taxa we find species that produce sound above 100 dBpeak re 20 μPa at 1 m, and a few bird and mammal species have SLs as high as 125 dBpeak re 20 μPa at 1 m. We next used pulsating sphere and piston models to estimate the maximum sound pressures generated in the radiated sound field. These data suggest that the loudest species within all taxa converge upon maximum pressures of 140–150 dBpeak re 20 μPa in air. In water, the toothed whales produce by far the loudest SLs up to 240 dBpeak re 1 μPa at 1 m. We discuss possible physical limitations to the production, radiation and propagation of high sound pressures. Furthermore, we discuss physiological limitations to the wide variety of sound generating mechanisms that have evolved in air and water of which many are still not well-understood or even unknown. We propose that in air, non-linear sound propagation forms a limit to producing louder sounds. While non-linear sound propagation may play a role in water as well, both sperm whale and pistol shrimp reach another physical limit of sound production, the cavitation limit in water. Taken together, our data suggests that both in air and water, animals evolved that produce sound so loud that they are pushing against physical rather than physiological limits of sound production, radiation and propagation.


INTRODUCTION
Sound is the medium through which animals, including humans, can communicate complicated and unambiguous signals: from laughter when we are happy, to terrified screaming when we fear for our lives. From a baby babbling whilst practicing speech, to Feynman presenting his famous "Lectures on physics." Humans, especially, are capable of combining vocal utterances into languages able to convey our most complicated concepts (Fitch, 2005, 2012). Sound production is critical to the social communication and survival of many arthropods and the majority of vertebrates. Almost 10,000 bird species, 7,000 frog species, 6,000 mammal species, and an unknown number of fish and arthropod species, have evolved the ability to produce sounds, many with highly specialized organs (Bradbury and Vehrencamp, 2011), driven by complex motor patterns, and executed by exceptional muscles (Elemans et al., 2008, 2011; Mead et al., 2017). Sound plays a pivotal role in many behaviors, including courtship and territorial display signals in insects, fish, frogs, birds and mammals, and orientation and prey capture in echolocating animals. No other communication modality combines the accuracy, speed, and richness of communication over long distances as does sound, both in air and in water (Bradbury and Vehrencamp, 2011).
One critical acoustic parameter for communication is sound pressure amplitude or source level (SL) of the animal vocalizations. SL affects the range of vocal communication in a network or the range of object detection and interpretation in echolocation, because with increasing SL animals can detect sound signals in ambient noise at longer ranges. Even though many animals may not benefit from producing loud sounds, some avian and mammalian species produce particularly high SLs. The term loud here refers to high sound pressures, which is different from, and should not be confused with loudness, a term reserved in psychoacoustics for the perceived level of a sound (Troscianko, 1982). Interestingly, in air, the highest reported SL values do not seem to exceed 120 dB peak re 20 µPa at 1 m (Surlykke and Kalko, 2008;Podos and Cohn-Haft, 2019), which suggests that there are certain limitations to produce high sound pressures. However, direct numerical comparison of published SL amplitudes is complicated by the different standards and methods used to compute them. We therefore currently lack a direct comparison of the highest SLs, which is critical for investigating potential limitations to producing loud sounds.
Here we compiled SLs of the loudest animals known both in air and in water and converted all reported values into standardized measures that are directly comparable. Furthermore, we use acoustic models to estimate the highest acoustic pressures generated in the entire acoustic field. We discuss what physical and physiological mechanisms could constrain the production, radiation and propagation of high sound pressures and if such boundaries are met by animals.

How to Compare Source Levels?
The SL of a sound source is defined as the sound pressure at a reference distance along its acoustic axis (Figure 1). Traditionally, the methodology of reporting SL values differs significantly between animal groups in bioacoustics research. However, comparing SLs can be done easily when considering five issues: First, the SI unit for pressure is the Pascal, but this physical property is often reported on the decibel (dB) scale, which first scales the data to a reference value and then applies a log transform. Because the reference value is typically 20 µPa in air and 1 µPa in water, the same absolute pressure in Pascal is represented by a numerical value 26 dB higher in water than in air when represented on the dB scale. To avoid confusion, we consistently report sound pressures both in Pascal and on the relevant dB scale (Also compare the two central pressure scales in Figure 1C).
Second, because it is not possible to measure the pressure at the location of the source, the SL is defined at some distance from the source. The reference distance varies between scientific fields but is one meter by convention in most biological and engineering applications. Many animals do not provide a convenient way to place a microphone or hydrophone at this reference position. In such cases, if the distance to the animal is known, the SL of the animal is estimated by accounting for the transmission loss of the pressure magnitude over the distance traveled (Urick, 1983; Madsen and Wahlberg, 2007; Wahlberg and Larsen, 2017). Often simple spherical spreading loss models are used to estimate transmission loss, but these can be imprecise especially at longer distances to the source, when acoustical properties of the environment play an important role.
Third, because sound sources are directional at high frequencies relative to the size of the sound source, it is important to record the sound on-axis or to reconstruct the radiation directionality pattern and report the on-axis SL (Figure 1A). Sound pressure is highest along the acoustic axis and attenuates continuously with increasing off-axis angles. For highly directional sounds produced by bats and toothed whales the direction of the acoustic axis and position of the animal can be determined by using microphone or hydrophone arrays (Madsen and Wahlberg, 2007; Jakobsen and Surlykke, 2010).
Fourth, there are several ways to quantify the amplitude of a time-varying pressure wave. Amplitude measurements are traditionally either taken peak-to-peak (ptp), zero-to-peak (peak), or root-mean-square (rms) and it is important to note the differences when comparing studies using different amplitude measures (Figure 1B). For a sine wave, the peak-to-peak value is 6 dB higher than the peak and 9 dB higher than the rms value. For most real-world signals these relationships are different. Especially the rms amplitude will differ and the difference between peak-to-peak and rms can be greater than 9 dB depending on the time window used for computing the rms. Sound level meters are also used for bioacoustics measurements and common measures are given as either Lpeak or Leq. Lpeak equals the peak amplitude measurement with no time averaging applied and is used widely in bioacoustics and human audiology research. Leq is the equivalent continuous level and the same as the rms measure.
Fifth, the frequency response and sensitivity of the recording chain need to be specified. For example, most sound level meters have different filters that can be selected, e.g., A, C, and Z weighting, where A and C relate to human loudness perception at different intensity levels, and Z has a constant reference pressure of 20 µPa across frequencies (i.e., unweighted) (International Standard IEC61672-1, 2002). Thus, A and C weighting can be used to make conclusions about human perception. Because hearing sensitivity varies significantly across species this type of weighting should be avoided in bioacoustics research and will especially affect low- and high-frequency sounds. Lastly, sound level meters come in two classes, 1 and 2, that have different tolerance limits for precision. Both perform almost equally between 20 Hz and 10 kHz, but class 2 has lower precision tolerance outside this frequency range. Therefore class 1 sound level meters are recommended for measurements at frequencies below 20 Hz and frequencies above 10 kHz.
FIGURE 1 | Source levels of the loudest animals in air and water. (A) Source level is defined as the on-axis radiated sound pressure at 1 m distance from the source. (B) Three commonly used measures of pressure amplitude; SLpeak is the highest absolute magnitude of the signal. SLptp is the difference between highest and lowest amplitude. SLrms is here shown as the rms amplitude over the duration set by using a 95% energy threshold criterion [see Madsen and Wahlberg (2007) for detail]. (C) SLs of the loudest reported animals in air and water (For data points and references see Table 1). The two vertical bars of pressure in the middle are on the same absolute pressure scale to allow direct comparison of the different dB scales in air and water.
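The first, second and fourth considerations above lend themselves to a short numerical illustration. The following Python sketch is not the authors' analysis code; the function names are hypothetical, and the back-calculation to 1 m assumes spherical spreading only, with no absorption or environmental effects. It converts between Pascal and dB for the two reference pressures, back-calculates an SL from a level received at a known distance, and compares peak, peak-to-peak and rms values for one cycle of a sine wave.

```python
import math

P_REF_AIR = 20e-6    # Pa, reference pressure conventionally used in air
P_REF_WATER = 1e-6   # Pa, reference pressure conventionally used in water


def pa_to_db(pressure_pa, p_ref):
    """Convert a pressure amplitude in Pa to a level in dB re p_ref."""
    return 20.0 * math.log10(pressure_pa / p_ref)


def db_to_pa(level_db, p_ref):
    """Convert a level in dB re p_ref back to a pressure amplitude in Pa."""
    return p_ref * 10.0 ** (level_db / 20.0)


def source_level_spherical(received_db, distance_m, ref_distance_m=1.0):
    """Back-calculate SL from a received level.

    Assumption: spherical spreading only (6 dB per doubling of distance),
    ignoring absorption and environmental effects.
    """
    return received_db + 20.0 * math.log10(distance_m / ref_distance_m)


def amplitude_measures(samples):
    """Return (peak, peak-to-peak, rms) of a sampled waveform."""
    peak = max(abs(s) for s in samples)
    ptp = max(samples) - min(samples)
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return peak, ptp, rms


# The same absolute pressure expressed on the two dB scales.
offset_db = pa_to_db(1.0, P_REF_WATER) - pa_to_db(1.0, P_REF_AIR)

# One full cycle of a unit-amplitude sine wave.
n = 1000
sine = [math.sin(2.0 * math.pi * i / n) for i in range(n)]
peak, ptp, rms = amplitude_measures(sine)
ptp_re_peak_db = 20.0 * math.log10(ptp / peak)
ptp_re_rms_db = 20.0 * math.log10(ptp / rms)

print(f"water/air scale offset: {offset_db:.1f} dB")
print(f"sine ptp re peak: {ptp_re_peak_db:.1f} dB, ptp re rms: {ptp_re_rms_db:.1f} dB")
```

For real vocalizations the rms value additionally depends on the analysis window, so the sine-wave offsets should not be applied to convert between amplitude measures in published data without checking the waveform.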

Which Animals Produce the Highest Source Levels?
To identify the loudest species, i.e., the species that produce the highest SLs, within and between all clades of vocal animals in air and water, we compiled SLs of animal vocalization per taxon (see section "Materials and Methods," Figure 1C and Table 1).
To prevent overrepresentation of species with lower SLs, we included only the four loudest species within each taxon. We included bats and toothed whales as separate groups because echolocation likely imposes a different evolutionary demand on the sound production system than does communication. The variable measuring conditions of acoustic fields in laboratory and field make comparing dB values with precision below 1 dB not very meaningful. In combination with the different methodologies used to measure peaks or average maxima, we should consider the maximal values reported here indicative within 2-3 dB of what the animals produce. Our efforts in trying to compile these data emphasized to us how infrequently SLs are reported in bioacoustics papers. Given the importance of SL for the biology of species, we thus would like to urge people to measure and report SL in their work.
20 µPa at 1 m (i.e., 20 Pa peak at 1 m) (Poole et al., 1988; Surlykke and Kalko, 2008; Hulgard et al., 2016). The loudest reported amphibian species call at 110 dB peak re. 20 µPa at 1 m (i.e., 6.3 Pa peak at 1 m) (Gerhardt, 1975; Passmore, 1981). The loudest reported reptile species are the alligators at around 105 dB peak re. 20 µPa at 1 m (i.e., 3.6 Pa peak at 1 m) (Todd, 2007; Wang et al., 2007). The loudest reported insects are several species of cicadas at 102 dB peak re. 20 µPa at 1 m (i.e., 2.5 Pa rms at 1 m) (Villet, 1987; Sanborn and Phillips, 1995). These SLs represent the highest values at species level. For the bat, bird, insect and toothed whale species included here, the SL values reported represent their reported loudest vocalizations. However, for the other species we do not know if the reported SLs encompass the maximal capabilities in the species-specific vocal repertoire, and we cannot exclude that they can emit higher SLs. Also, within species, SL variability can be expected. Humans deserve special attention because it is the only species where we have some information on the loudest individuals within a species. The human shouted voice is about 105 dB rms re. 20 µPa at 1 m (Lagier et al., 2017). However, The Guinness Book of World Records lists the loudest voice from a schoolteacher saying "Silence" at 122 dB re. 20 µPa at 1 m and the loudest nonspeech scream to be 129 dB re. 20 µPa at 1 m, which would rank humans up with the loudest mammals and birds. However, we have not been able to confirm the recording methodology of these records with Guinness, including what amplitude measure was used, and therefore do not include them here. Taken together, in air, the loudest animals all emit surprisingly similar maximum SLs around 120 dB peak re. 20 µPa at 1 m, which equals 20 Pa peak at 1 m.
In water, maximum SLs are much higher than in air. Toothed whales are by far the loudest group of animals in water; the sperm whale (Physeter macrocephalus) emits echolocation clicks with SLs up to 239 dB peak re. 1 µPa at 1 m (i.e., 900,000 Pa peak at 1 m) (Mohl et al., 2003). In comparison, the loudest baleen whale is the fin whale (Balaenoptera physalus) at 203 dB peak re. 1 µPa at 1 m (i.e., 14,000 Pa peak at 1 m) (Wang et al., 2016). The loudest teleost fish, the black drum (Pogonias cromis) (Locascio and Mann, 2011), is almost three orders of magnitude of pressure below the sperm whale at 183 dB peak re. 1 µPa at 1 m (i.e., 1,400 Pa peak at 1 m), as is the pistol shrimp (Synalpheus parneomeris) at 183 dB peak re. 1 µPa at 1 m (Au and Banks, 1998). Please note that the dB values in water are 26 dB higher than in air due to the different reference pressure of 1 µPa alone (see central, black labeled pressure scale in Figure 1C). In water, we thus do not observe that different animal clades converge upon a maximum SL.

Loudest Animals Are Independent of Size and Frequency in Air, but Not in Water
How much would a sound source need to move to achieve a SL of 125 dB peak re. 20 µPa in air or 240 dB peak re. 1 µPa in water? To approximate this, we considered the output of two simple sound sources: (1) a pulsating sphere and (2) a piston of equal diameter (see section "Materials and Methods," Figure 2). These models show that the velocity needed to achieve a certain fixed SL decreases with the radiated frequency and physical size in air and water (Figures 2A,B). We also considered the product of the wavenumber (k = 2π/λ = 2πf/c, with c the speed of sound) and size (a), the ka product. This dimensionless parameter represents the acoustic size of an emitter, i.e., the size relative to the wavelength it is emitting since ka = 2πa/λ. At a fixed SL, the velocity also decreases with ka for both air and water (Figure 2C). While the piston model shows a power relationship (linear on the double logarithmic axes), for the pulsating sphere the velocity required becomes constant at higher frequency, size and ka. This is because the source becomes large compared to the wavelength and tends to locally radiate a plane wave, for which the ratio of sound pressure to particle velocity is the characteristic impedance of the propagation medium, ρc [see also Equation (1) in section "Materials and Methods"]. By fixing other parameters, such as rms volume velocity of the source (see section "Materials and Methods"), the SL increases with frequency, size and ka product (Figures 2D-F). Again, for the pulsating sphere, the SL does not increase with frequency, size and ka for a fixed velocity above a certain frequency for the reason mentioned above.
These simple models illustrate three acoustic considerations important for generating sound. First, to produce higher frequencies at the same SL, the source needs to move less. Second, conversely, with the same source velocity, a higher SL can be achieved at higher frequencies or larger size. Third, due to the impedance difference between air and water, the same source motion produces a sound pressure about three orders of magnitude higher in water than in air. It is thus much easier to generate a high pressure in water.
The ka product determines how much of the power used to produce the sound is converted into acoustic power that radiates from the source, i.e., the efficiency of the source. For a pulsating sphere the maximum efficiency is at ka ≥ 2. Below ka = 2 efficiency drops by a factor of 100 for every order of magnitude of ka (Michelsen, 1992; Larsen and Wahlberg, 2017). While there is no increase in source efficiency at ka > 2, most sound sources will exhibit a substantial increase in SL because the sound source becomes increasingly directional with increasing ka, i.e., pressure is highest along the acoustic axis and progressively decreases at greater off-axis angles. Thus, a directional source radiating the same acoustic power as an omni-directional source will emit a higher SL on the acoustic axis. However, a pulsating sphere does not become directional at high ka.
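The dependence on frequency, size and ka described above can be explored with the standard textbook expression for a pulsating sphere. The sketch below is a minimal illustration under stated assumptions and is not the model implementation used for Figure 2: it assumes the far-field pressure amplitude |p(r)| = ρc·u·(a/r)·ka/√(1 + (ka)²), a normalized radiation efficiency of (ka)²/(1 + (ka)²), and nominal medium properties chosen here for illustration. All function and constant names are hypothetical.

```python
import math

# Nominal medium properties (assumed values for illustration only).
RHO_AIR, C_AIR = 1.2, 343.0          # kg/m^3, m/s
RHO_WATER, C_WATER = 1000.0, 1500.0  # kg/m^3, m/s


def ka_product(freq_hz, a_m, c):
    """Acoustic size ka = 2*pi*a/lambda = 2*pi*f*a/c."""
    return 2.0 * math.pi * freq_hz * a_m / c


def sphere_surface_velocity(p_at_r_pa, r_m, a_m, freq_hz, rho, c):
    """Surface velocity amplitude a pulsating sphere of radius a_m needs
    to produce pressure amplitude p_at_r_pa at range r_m (textbook model)."""
    ka = ka_product(freq_hz, a_m, c)
    return p_at_r_pa * r_m / (rho * c * a_m) * math.sqrt(1.0 + ka ** 2) / ka


def sphere_radiation_efficiency(ka):
    """Normalized radiation resistance of a pulsating sphere.
    Scales as (ka)^2 for ka << 1 and approaches 1 for ka >> 1."""
    return ka ** 2 / (1.0 + ka ** 2)


# Velocity needed by a hypothetical 5 mm radius sphere at 1 kHz for the
# two SLs discussed in the text (125 dB re 20 uPa, 240 dB re 1 uPa at 1 m).
p_air = 20e-6 * 10.0 ** (125.0 / 20.0)
p_water = 1e-6 * 10.0 ** (240.0 / 20.0)
u_air = sphere_surface_velocity(p_air, 1.0, 0.005, 1e3, RHO_AIR, C_AIR)
u_water = sphere_surface_velocity(p_water, 1.0, 0.005, 1e3, RHO_WATER, C_WATER)
print(f"required surface velocity: air {u_air:.3g} m/s, water {u_water:.3g} m/s")

# Efficiency drop per decade of ka in the small-source regime.
drop = sphere_radiation_efficiency(0.1) / sphere_radiation_efficiency(0.01)
print(f"efficiency ratio between ka = 0.1 and ka = 0.01: {drop:.1f}")
```

In this model the required velocity falls with frequency and radius while ka is small, and levels off at p·r/(ρc·a) once ka is large, which is the plane-wave limit described in the text.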
Because these simple acoustical models predict a clear dependency on frequency, size and ka product, we compiled SL of the loudest animals as a function of their peak frequency, body mass, acoustic radius and ka product (Figure 3 and Supplementary Table 1, see section "Materials and Methods"). We consider that applying descriptive statistics is not meaningful given the sparse nature of the data, but a few patterns do emerge. Although within a clade body size may be a good predictor of SL (Villet, 1987), for the loudest aerial species we observe no increase of highest SLs with radiated sound peak frequency over four orders of magnitude (Figure 3A), no increase with body mass across nearly five orders of magnitude (Figure 3B) and no increase with increasing ka over two orders of magnitude (Figure 3D). All loud insects, frogs, reptiles, birds and terrestrial mammals have ka between 0.1 and 1, which makes them omnidirectional sound emitters. The bats have ka > 2, which makes them efficient and more directional sound emitters. Thus, in contrast to simple linear acoustic models that show increase of SL with increasing frequency, radius and ka product, the maximal SL of around 120 dB re. 20 µPa at 1 m in air seems independent of weight, radius, frequency and ka product (Figure 3 and Supplementary Table 1).
FIGURE 2 | Pulsating sphere and piston models predict that source level depends on frequency and size. (A) Isolines of a 240 dB re. 1 µPa and 125 dB re. 20 µPa source show that producing sound requires less movement with higher frequency, (B) size and (C) ka product. (D) SL increases with frequency, (E) size and (F) ka product for both sphere and piston models. The lines shown here are at a volume velocity that makes the source of 10 mm diameter produce 240 dB re. 1 µPa and 125 dB re. 20 µPa at 1,000 Hz, in water and air, respectively (see section "Materials and Methods").
For aquatic animals, the sparse observations fit the simple acoustic models that highest SL increases with frequency (Figure 3C), body size (Figure 3D) and ka product (Figure 3E). However, due to the sparseness of the data, we should be cautious interpreting these data. For loud crustaceans, fish and baleen whales, the ka product is between 0.01 and 0.2, which makes them omnidirectional, but not very efficient sound emitters. For toothed whales the ka product is larger than 10, which makes them efficient and highly directional sound emitters. As a consequence, while toothed whale SLs are substantially higher than those of the baleen whales, the high directionality means that the difference in radiated acoustic power, i.e., the combined sound radiation in all directions, is much smaller. This is because when emitting sound directionally, sound pressure is concentrated in the frontal direction and much lower pressures are radiated off-axis whereas for omni-directional sources, sound pressure radiation is roughly equal in all directions.

Physical Upper Limits to Sound Pressure Generation and Radiation
The SL of bat echolocation calls has been suggested to be close to the physical limit of maximal pressure generation in air (Madsen and Surlykke, 2014). Are animals indeed so loud they are hitting certain physical limits to sound production?
In air, pressure fluctuates around atmospheric pressure of about 100 kPa and the negative crest is limited at 0 Pa. Sound waves that are symmetric around atmospheric pressure can therefore reach an amplitude of maximally 200 kPa peak-to-peak (194 dB peak re. 20 µPa). However, there is no theoretical physical upper limit to pressure, and extreme explosions can indeed surpass the 100 kPa positive crest. The supposed loudest explosion in recent human history was the 1883 Krakatoa volcano eruption with an estimated SL of about 270 dB peak re. 20 µPa at 1 m (Winchester, 2003). Besides many issues with approximating this particular SL, it is clear that in air, making sounds by exploding is not a viable option for animals, and vocalizations do not reach such enormous pressures.
FIGURE 3 | [...] In aquatic animals, we observe a trend that the highest SLs increase with (D) frequency, (E) body mass and (F) ka, but the sparsity of the data prevents statistical interpretability.
In water, the minimal sound pressure is limited by the formation of vapor-filled cavities, i.e., cavitation, at 0 Pa. Because the ambient water pressure depends on depth in the water column, the difference between ambient pressure and cavitation also depends on diving depth. Thus, a sound wave at the water surface and symmetrical around atmospheric pressure can also reach an amplitude of maximally 200 kPa peak-to-peak (220 dB peak re. 1 µPa). Again, there is no theoretical upper limit to pressure, but because the cavitation boundary poses a design constraint in human-made sonar systems (Woollett, 1962) it is reasonable to assume that this also is the case for biological systems. A sperm whale click of 239 dB peak re. 1 µPa would thus actually surpass the minimal crest limit when produced at depths shallower than 80 m.
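The depth dependence of this limit can be checked with a few lines of arithmetic. The sketch below assumes, as in the text, a symmetric waveform and a cavitation threshold at 0 Pa absolute pressure; the seawater density, gravitational acceleration and surface pressure are nominal values chosen for illustration, and the function names are hypothetical.

```python
P_ATM = 101_325.0      # Pa, nominal atmospheric pressure at the surface
RHO_SEAWATER = 1025.0  # kg/m^3, assumed seawater density
G = 9.81               # m/s^2, gravitational acceleration


def peak_pressure_pa(level_db_re_1upa):
    """Peak pressure in Pa for a level given in dB peak re 1 uPa."""
    return 1e-6 * 10.0 ** (level_db_re_1upa / 20.0)


def min_depth_without_cavitation(level_db_re_1upa):
    """Depth (m) at which the ambient pressure equals the peak acoustic
    pressure, so that the negative half-cycle just reaches 0 Pa.
    Returns 0.0 if the limit is not exceeded at the surface."""
    excess_pa = peak_pressure_pa(level_db_re_1upa) - P_ATM
    return max(0.0, excess_pa / (RHO_SEAWATER * G))


for level in (220.0, 230.0, 239.0):
    depth = min_depth_without_cavitation(level)
    print(f"{level:.0f} dB peak re 1 uPa: limit reached above about {depth:.0f} m depth")
```

With these assumptions a 239 dB peak pressure corresponds to a depth just under 80 m, in line with the value given above.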
The above physical limitations apply to acoustic pressure magnitudes irrespective of where they occur in the sound field of a source. However, what are the maximal sound pressures animals produce in the entire sound field that they radiate? Whereas SL is defined at the reference distance of 1 m, the highest pressures mostly occur much closer to most animals. To estimate the maximal acoustic pressures the loudest animals generate, we approximate them as two types of sound sources; a pulsating sphere and a piston in an infinite baffle (Figures 4A,B). In the far field sound pressure decreases by 6 dB per doubling of distance due to the spreading of the acoustic power over a larger area (Jacobsen and Juhl, 2013). A pulsating sphere only has a far field and the highest pressure produced is obtained at the surface of the sphere (Figure 4A, see section "Materials and Methods"). However, pistons and more complex sound sources also have a near-field where the pressure strongly depends on local conditions. For a piston in an infinite baffle the transition from near to far field can be conservatively approximated by: $D_{\mathrm{piston}} = k a^{2}$, where k is the wavenumber (k = 2π/λ) and a the radius of the emitter (Figure 4B; Foote, 2014). In the interference near field of a piston, pressure can be up to 12 dB higher than at the near/far field border we use for our approximation, and strong dips occur that are highly sensitive to local conditions and ka-values (Figure 4B). Given the near field conditions are very specific for each animal, we consider it safer to use the more conservative maximum pressure at the boundary between the geometric nearfield and the far field.
FIGURE 4 | The estimated highest occurring sound pressures in air and water. (A) For animals that are omni-directional sound radiators we used the monopole model to estimate the highest occurring pressure (red horizontal arrow). Because a monopole does not have a near field, we assumed the radius of the monopole to be the body wall (see section "Materials and Methods"). In far field conditions, the sound pressure decreases by 6 dB per doubling of distance. (B) For highly directional sound radiators (bats and cetaceans), we used the piston model to estimate the highest occurring pressure. We use the conservative estimate that the highest occurring pressure (red horizontal arrow) occurs at the border of the interference near field and far field (see section "Materials and Methods"). (C-E) Estimated highest produced sound pressures increase with frequency but plateau at about 150 dB ref. 20 µPa for animals vocalizing in air. (F-H) Estimated highest produced sound pressures seem to increase with frequency and size for animals vocalizing in water.
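The two approximations described above, a pulsating sphere whose highest pressure is at its surface and a piston whose highest pressure is taken at the near/far field boundary k × a², can be sketched as follows. This is an illustrative reading of the approach and not the authors' code: it assumes that the SL at 1 m can be extrapolated back to the chosen distance with spherical spreading (6 dB per doubling of distance), and the function names and example values are hypothetical.

```python
import math


def wavenumber(freq_hz, c):
    """Wavenumber k = 2*pi/lambda = 2*pi*f/c in rad/m."""
    return 2.0 * math.pi * freq_hz / c


def piston_boundary_distance(freq_hz, a_m, c):
    """Conservative near/far field transition distance D = k * a^2 (m)."""
    return wavenumber(freq_hz, c) * a_m ** 2


def level_at_distance(sl_db_at_1m, distance_m):
    """Level at distance_m extrapolated from the SL at 1 m.
    Assumption: spherical spreading only."""
    return sl_db_at_1m - 20.0 * math.log10(distance_m)


def max_level_sphere(sl_db_at_1m, a_m):
    """Highest level for a pulsating sphere: at its surface (r = a)."""
    return level_at_distance(sl_db_at_1m, a_m)


def max_level_piston(sl_db_at_1m, freq_hz, a_m, c):
    """Conservative highest level for a piston: at D = k * a^2."""
    return level_at_distance(sl_db_at_1m, piston_boundary_distance(freq_hz, a_m, c))


# Hypothetical examples: an omnidirectional caller with a 2 cm acoustic
# radius, and a directional 50 kHz emitter with a 5 mm radius in air.
print(f"sphere: {max_level_sphere(110.0, 0.02):.1f} dB")
print(f"piston: {max_level_piston(120.0, 50e3, 0.005, 343.0):.1f} dB")
```

The example radii, frequencies and SLs are placeholders; the species-specific values belong to Supplementary Table 1 and are not reproduced here.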
Using these two models, we estimated the maximum pressures the loudest animals generate (see section "Materials and Methods"). In air, below 2 kHz the estimated maximum sound pressure increases with frequency (Figure 4C). However, at 2 kHz, the pressure seems to reach a plateau at 150 dB peak re. 20 µPa with the exception of the bellbird that reaches 160 dB peak re. 20 µPa. This maximum pressure plateau is also maintained for animals under 10 kg but decreases with body mass over 10 kg (Figure 4D) and radius over 5 cm (Supplementary Table 1). When estimating the maximum pressure produced, the frogs and cicadas move up and, interestingly, all loudest mammals, birds, cicadas and frogs converge upon 140-150 dB peak re. 20 µPa.
In water there is a trend that maximum pressure increases with frequency with no indication of a plateau as seen in air ( Figure 4F). However, body mass, and ka product do not show clear relationships with the maximal pressure (Figures 4E,F). Both the pistol shrimp and the toothed whales produce estimated maximal pressures as high as 230 dB re. 1 µPa and reach cavitation limit pressures at depths less than 30 m.
Taken together, we observe that animals vocalizing in water roughly follow the source relations predicted by sphere or piston models. The loudest animals in water come close to or reach a physical limit (cavitation) when producing loud sounds at shallow depths. The loudest animals vocalizing in air are efficient sound producers, but do not get close to the maximal amplitude for a symmetrical wave. Our data thus suggests that they are limited to amplitudes of 140-150 dB peak re. 20 µPa.

Physical Upper Limits to Sound Propagation
The next physical limitation of sound production is the phenomenon that at high acoustic pressures sound propagation becomes non-linear and efficacy decreases. The non-linearities occur since the speed of sound is temperature dependent and pressure fluctuations are accompanied by temperature fluctuations. As a result, the positive pressure crest travels faster than the negative pressure crest. This effect accumulates over distance and eventually (depending on loss mechanisms) shockwaves may form, even from a waveform that is initially a sinusoid (Pierce, 1981). This distance from the source at which the shock wave is formed is called the shock formation distance. The relevant propagation (e.g., communication or prey detection) distance is thus a key factor to include when estimating shock formation distance. The creation of shockwaves is frequency and level dependent and the radiated waveshape at the source also plays a major role. The sound producing process itself might lead to a waveform that is close to that of a shockwave, thereby reducing the shock formation distance. Because of these propagation non-linearities, very loud sounds attenuate much more rapidly with distance than dictated by simple spherical spreading loss and atmospheric attenuation. The introduction of propagation non-linearity can (depending on level, frequency, and range) even give rise to a saturation effect for sound propagation in air and water, because increased SL beyond this level is not associated with an equivalent increase in signal range (Pierce, 1981).
However, the effects of spherical spreading and absorption counteract the formation and propagation of shockwaves. Since absorption in both air and water increases with frequency, the higher harmonics caused by the transition into a shockwave are attenuated more than the fundamental frequency, leading to a sinusoidal waveform at large distances (the so-called old-age region) (Pierce, 1981). The strength of this counteracting effect depends on amplitude, frequencies and propagation distances. This effect, along with the saturation effect, is particularly relevant for animals communicating over long distances.
Shock wave formation can thus be considered a realistic but "soft" limit to sound production in air and water, because it is frequency, level, waveshape and distance dependent. Due to the complicated non-linear acoustics involved, analytical models of the attenuation of shock waves are limited to approximate cases such as plane wave propagation of an initially sinusoidal waveform. As a rule of thumb and at moderate distances, sound pressure can reach 150 dB ref. 20 µPa in air and 240-250 dB ref. 1 µPa in water before physical non-linearity and additional losses significantly reduce amplitude. Thus at least in air, the loudest birds, mammals, frogs and insects create sound pressure levels that approach the level at which non-linear propagation losses become significant and further increase would be inefficient as a means to increase communicative distance. Thus, radiation non-linearities may provide a realistic physical limitation to making louder sounds. The resulting skewed sound waveforms are at least consistent with the bellbird calls and mammalian screams.
Definitively answering the question if propagation nonlinearities are physically limiting sound production requires non-linear modeling and precise measurements. The acoustic nearfield and spherical spreading have to be taken into account and can only be solved numerically. Measurements of shock waves and thereby high-order harmonics from animals producing high-frequency vocalizations should be definitive, but also impose high demands to the equipment in terms of sampling frequency and transducer response. The conditions are so different for each species that the question must be solved on a case-by-case basis, which is beyond the scope of this paper.

Physiological Limitations to the Production of Loud Sounds
All extant vocalizing species have undergone millions of years of evolution and sound production is only one of a multitude of trade-offs individuals face in their survival. Many factors could thus play an important role in explaining why most species do not produce loud vocalizations. First of all, making high acoustic pressures is also conspicuous and thus not necessarily an advantage. Another major factor is the energetics and efficiency of vocal production in relation to the ecology and behavior of a species. In frogs, birds, and bats it has been shown that high SLs come with a substantial increase in energy expenditure (Currie et al., 2020). Obviously, the duty cycle of calling is a major factor in this; some frog species call at high duty cycle for several hours, but other species may only produce a few vocalizations per day. However, if power plays a major role, we would hypothesize that large animals would be louder as they could afford more energy, but our data does not support this. Additionally, loud sounds can become too loud and may temporarily deafen the receiver (Finneran, 2015). These are just a few reasons why an animal may not invest in making high sound pressures. However, can we identify more principal constraints in the physiology that pose a limitation to producing high sound pressures?
To answer this question, we need to look at the different mechanisms animals use to generate sounds. Sound production mechanisms differ widely and pose phylogenetic and evolutionary constraints. In some cases they are not well understood or even unknown. Most air-breathing tetrapods produce vocalizations by converting respiratory flow to modulated flow by self-sustained oscillation of laryngeal vocal folds or analogous syringeal structures. The resulting air pressure disturbances constitute the acoustic excitation of the system (Titze, 2000). This framework is called the myo-elastic aerodynamic theory of sound production, or MEAD. MEAD is best studied in humans, but has also been found applicable to non-human mammals (Herbst et al., 2012) and birds (Elemans et al., 2015; Jiang et al., 2020). Amphibians and the few vocal reptiles probably also use MEAD (Rand and Dudley, 1993; Reber et al., 2015).
We identified at least four MEAD features that potentially pose limits to producing high SLs. A first limit is the efficiency by which aerodynamic energy is converted into acoustic energy. This efficiency is referred to as the glottal efficiency in laryngeal sound producers including humans (van den Berg, 1956; Bouhuys et al., 1968; Schutte, 1980) or vocal/mechanical efficiency (ME) (Titze et al., 2010; Zhang et al., 2019) and is defined as the ratio of radiated acoustic power to the driving aerodynamic power of the subglottal/subsyringeal air. Acoustic power is typically determined by combining the measured sound pressure, the impedance and an approximation of the area over which the energy is radiated. Aerodynamic power is calculated as the product of measured mean tracheal/bronchial airflow and pressure. When measured in vivo, ME captures (i) the transformation of aerodynamic power into acoustic flow within the vocal tract, (ii) the transmission efficiency through the airways, and (iii) the radiation of sound from the surface (mouth/beak/air sacs) to the environment (Titze and Palaparthi, 2018). ME varies greatly with bronchial pressure (Herbst, 2014), frequency (Zhang et al., 2019), vocal fold position, geometry and pathologies, and also between species (e.g., Brackenbury, 1979; Titze et al., 2010; Herbst, 2014; Maxwell et al., 2021); reported values range from 10⁻⁴ to 2% (i.e., a factor of roughly −60 to −20 dB).
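As a rough illustration of how ME follows from the quantities named above, the sketch below computes it from a sound pressure measurement and from mean flow and pressure. It assumes omnidirectional radiation into free space, so that acoustic power is 4πr²·p_rms²/(ρc). The function names, the air constants and the example numbers are hypothetical and are not taken from any of the cited studies.

```python
import math

RHO_AIR = 1.2   # kg/m^3, assumed density of air
C_AIR = 343.0   # m/s, assumed speed of sound in air


def acoustic_power(p_rms, r, rho=RHO_AIR, c=C_AIR):
    """Radiated power (W), assuming omnidirectional spherical radiation."""
    intensity = p_rms ** 2 / (rho * c)          # W/m^2
    return 4.0 * math.pi * r ** 2 * intensity   # integrate over a sphere of radius r


def aerodynamic_power(mean_pressure_pa, mean_flow_m3_s):
    """Aerodynamic power (W) as the product of mean pressure and mean flow."""
    return mean_pressure_pa * mean_flow_m3_s


def mechanical_efficiency(p_rms, r, mean_pressure_pa, mean_flow_m3_s):
    """ME as the ratio of radiated acoustic power to aerodynamic power."""
    return acoustic_power(p_rms, r) / aerodynamic_power(mean_pressure_pa, mean_flow_m3_s)


def efficiency_db(me):
    """Express a power ratio in decibels."""
    return 10.0 * math.log10(me)


# Hypothetical example: 0.1 Pa RMS at 1 m, driven by 2 kPa and 0.1 L/s.
me = mechanical_efficiency(p_rms=0.1, r=1.0, mean_pressure_pa=2000.0, mean_flow_m3_s=1e-4)
print(f"ME = {100 * me:.3f} %  ({efficiency_db(me):.1f} dB)")
```

Because ME is a power ratio, the decibel conversion uses 10·log10, so an efficiency of 10⁻⁴ % (a ratio of 10⁻⁶) corresponds to −60 dB.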
Many animals have evolved anatomical or behavioral adaptations that aid in radiating the sound energy from their vocal organs to the radiated sound field. Indeed, the ME of excised vocal organs is typically lower because there is no upper vocal tract (Titze, 2006). Anatomical adaptations that increase sound radiation efficiency include air sacs in frogs (Rand and Dudley, 1993), birds (Riede et al., 2004), and mammals (Riede et al., 2008), and enlarged larynges in howler monkeys (Dunn et al., 2015) and hammerhead bats (Schneider et al., 1967). Additionally, behavioral adaptations can be found, such as posture modifications that increase mouth/beak opening when emitting high SLs, as seen in the bellbird and howler monkeys. Models suggest that for mammals and birds, adjustments of head size, mouth opening, and beam direction can make the power transformation efficiency from vocal tract to radiated sound as high as 100% in the 1-50 kHz range (Titze and Palaparthi, 2018). Some animals even change their environment by constructing horns or baffles that aid in radiating the sound (Mhatre et al., 2017).
A second limitation is the amount of aerodynamic energy an animal can produce. In vivo and excised larynx and syrinx work has shown that SL increases with mean bronchial pressure (Schutte, 1980; Zhang et al., 2019). The increasing pressure leads to larger vocal fold displacement and sharper flow starts and stops, and therefore a higher SL. The maximal expiratory pressure is limited by the maximal effort of the respiratory muscles and in humans ranges from 5 to 7 kPa during crying in infants and up to 10-15 kPa in adults during shouting (Wilson et al., 1984; Dimitriou et al., 2000; Lagier et al., 2017). Without vocalizing, higher expiratory pressures of over 20 kPa can be achieved by adults, both those who play brass instruments and those who do not (Fiz et al., 1993).
However, before the maximal respiratory pressure or flow is achieved, a third limit is typically reached. As bronchial pressure and flow increase, at specific values the dynamics of vocal fold vibration bifurcate from regular to chaotic regimes. This point is called the phonation instability pressure or flow (Jiang and Titze, 1993; Hoffman et al., 2012). As pressures exceed the phonation instability pressure (PIP), the SL does not increase further in the few species studied (Jiang and Titze, 1993; Zhang et al., 2007; Hoffman et al., 2012), probably because the vocal efficiency decreases. Although using pressures above the PIP is unfavorable from an energetics point of view, irregular or chaotic vocal fold regimes are common in mammalian vocalizations (Wilden et al., 1998; Fitch et al., 2002) and their signaling function in communication thus likely outweighs the loss of energy efficiency.
Fourth, with increasing amplitude the collision force of the vocal folds, or impact stress, increases. Although short peak impacts may not be a limiting factor per se, cumulative vocal fold damage due to a large number of high impacts, i.e., the vibration dose, may be limiting. Through intense voice use, damage can accumulate over time and tissue stress is suggested as the tradeoff for peak performance (Titze and Hunter, 2015). Impact stress is also the main traumatizing mechanism in human voice production, and the main cause of vocal fold nodules (Horacek et al., 2009). In humans, many impact-related vocal fold pathologies are known, but to our knowledge there are no reports of vocal fold pathologies in animals.
Taken together, for animals using MEAD to produce vocalizations, at least the above four physiological constraints could pose limits to SL. However, we suggest that these constraints are not hard limits, but should rather be seen as tradeoffs in energy expenditure or vocal fold damage. Furthermore, our current dataset does not allow investigation of allometric scaling with anatomical and physiological parameters (e.g., Charlton and Reby, 2016), because we did not systematically sample across a range of SLs and taxa that use MEAD. Instead, we specifically mined the literature for the highest SLs. It would be interesting to see whether allometric relationships can be found within phylogenetically related taxa of animals using MEAD, as has been found between SL and size within the cicadas (Villet, 1987).
The loudest insects, the cicadas, use a fundamentally different mechanism to produce sound. Cicadas buckle ribs on their tymbals, which produces clicks that provide a resonant source driving the abdominal resonator, from which sound is radiated via the tympana (Young and Bennet-Clark, 1995). The limit to click production is unknown, but is most likely related to mechanical failure of the tymbal ribs.
Animals producing loud sounds in water do so by at least three mechanisms. The unique mechanism by which pistol shrimp produce sound using their large snapper claw is well-understood. Muscle co-contraction builds up tension that is released by contraction of another muscle. The rapid closure of the claw pushes a plunger into a socket, and creates an outward water jet at such velocity that a cavitation bubble forms. It is the implosion of this cavitation bubble that creates the loud snapping sound (Versluis et al., 2000).
Bony fishes have evolved perhaps the largest diversity of sound generating organs among vertebrates (Fine and Parmentier, 2015; Ladich and Winkler, 2017). For the few species studied, the most common mechanisms are muscle-driven vibration of a gas-filled bladder and stridulation of the pectoral girdle or fin (Ladich and Winkler, 2017). The loudest teleost fish reported here most likely produce sound by swim bladder vibration (Locascio and Mann, 2011). Because all vertebrate muscles trade off power against speed, even the fastest muscles can contract at rates of only about 270 Hz (Mead et al., 2017). These extreme contraction rates still produce low frequencies for sound. Given the size of the fish, these result in ka < 1, which makes them poor pressure radiators. However, many fish are mostly sensitive to particle motion, not pressure, and thus pressure may not be the most relevant cue for communication (Radford et al., 2012).
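To make the ka argument concrete, the following minimal sketch evaluates the acoustic size ka = 2πfa/c for a source driven at the muscle contraction rate quoted above. The sound speed in water and the 5 cm source radius are assumed, illustrative values, and the function names are hypothetical.

```python
import math

C_WATER = 1500.0  # m/s, assumed nominal speed of sound in water


def acoustic_size(frequency_hz, radius_m, sound_speed_m_s):
    """Return ka, the product of wavenumber k = 2*pi*f/c and source radius a."""
    return 2.0 * math.pi * frequency_hz * radius_m / sound_speed_m_s


def radius_for_unit_ka(frequency_hz, sound_speed_m_s):
    """Source radius (m) at which ka reaches 1 for a given frequency."""
    return sound_speed_m_s / (2.0 * math.pi * frequency_hz)


# Illustrative (assumed) source radius of 5 cm driven at 270 Hz.
ka = acoustic_size(270.0, 0.05, C_WATER)
print(f"ka = {ka:.3f}")
print(f"radius needed for ka = 1: {radius_for_unit_ka(270.0, C_WATER):.2f} m")
```

Under these assumptions the source would need a radius approaching one meter to reach ka = 1 at 270 Hz, which is why a swim bladder of a few centimeters is a poor pressure radiator.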
In cetaceans, sound production has received much attention; however, we have no convincing direct evidence of how the sounds are produced. Cetaceans share ancestry with the Artiodactyla and sound production is thought to be driven by air flow. In mysticetes, the hypothesis that sound is produced by laryngeal tissue vibration is based on anatomy (Damien et al., 2019) and we still lack direct experimental observations to test outstanding hypotheses. Their relatively low ka values make them suboptimal sound radiators, but the low-frequency emission may be favorable because of low absorption and thus allow long-range communication. The odontocetes produce the highest sound pressures of all animals (Mohl et al., 2003). Several lines of evidence suggest that sound production occurs at the phonic lips in the upper nasal passages, either by a muscle-driven catch-release mechanism or an air-flow driven MEAD system. The sound radiating from the melon is highly directional. In the sperm whale, the produced sound is collimated inside the enormous nasal complex, resulting in the most directional sound source known, where most energy is concentrated in a beam of only a few degrees (Mohl et al., 2003). The fact that odontocetes produce the highest sound pressures of any animal on the planet especially warrants further investigation to understand how they manage to produce 1 MPa sounds.
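The pressures and decibel levels quoted in this paper are related by the usual definition L = 20·log10(p/p_ref), with p_ref = 20 μPa in air and 1 μPa in water. The short sketch below, with hypothetical function names, converts between the two so that figures such as 1 MPa can be compared directly with levels in dB.

```python
import math

P_REF_AIR = 20e-6    # Pa, reference pressure used for levels in air
P_REF_WATER = 1e-6   # Pa, reference pressure used for levels in water


def pressure_to_level(pressure_pa, p_ref):
    """Sound pressure level in dB re p_ref."""
    return 20.0 * math.log10(pressure_pa / p_ref)


def level_to_pressure(level_db, p_ref):
    """Sound pressure in Pa for a level given in dB re p_ref."""
    return p_ref * 10.0 ** (level_db / 20.0)


print(f"1 MPa in water: {pressure_to_level(1e6, P_REF_WATER):.0f} dB re 1 uPa")
for level in (125.0, 140.0, 150.0):
    p = level_to_pressure(level, P_REF_AIR)
    print(f"{level:.0f} dB re 20 uPa in air: {p:.1f} Pa")
```

The same absolute pressure gives a level 26 dB higher when referenced to 1 μPa than to 20 μPa, so levels in air and water should be compared through their pressures rather than their dB values.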

CONCLUSION
Across the animal kingdom we find that the loudest animals span several orders of magnitude of size and frequency and can be found in all phylogenetic groups and habitats. To investigate what potential mechanisms could limit the generation of loud sounds, we compiled SL data for animals vocalizing in air and water. In air we see that SLs are limited to 125 dB peak re. 20 µPa at 1 m after correcting for scaling conventions. The maximum actual pressures generated are 140-150 dB peak re. 20 µPa, typically much closer to the source than one meter. Several physiological processes could be limiting, but given the many tradeoffs the different animals have faced during their evolutionary history, it is hard to point to a single constraint that explains the maximally observed values. Two physical constraints are of a magnitude to pose serious limitations. First, the acoustical size (ka) constrains the efficiency of sound radiation. The loudest animals in air all seem to be good radiators, with ka close to or above 1, with the possible exception of the elephant. Second, non-linear propagation makes it inefficient, but not impossible, to make louder sounds. Thus, in air, physical limitations and particularly non-linear propagation could play a major role in how loud animals can maximally get.
In water, pistol shrimp and odontocetes produce extreme acoustic pressures close to the zero pressure (cavitation) limit. The loudest fish reach a physiological limit in that muscle-powered swim bladder motion is limited to generating frequencies of up to roughly 300 Hz. The mechanisms of sound production in both baleen and toothed whales are not well understood, and how these animals achieve these incredible SLs is not well known.
Being loud is one of many strategies in the surprising tapestry of animal vocalizations. The loudest animals produce sound pressures where several physical processes become highly non-linear. Resolving which process limits the production of higher SLs requires the development and detailed testing of numerical models on a case-by-case basis. Although for the majority of animals being loud has not been an evolutionary strategy, we see that both in air and in water, species have evolved that are pushing against the physical limits of sound production.

Source Level Comparison and Compilation
We determined SLs by making the following two conversions to the literature data if relevant: First, we use sound pressure level (peak) as the proxy for sound amplitude (Figure 1A). For the particular purpose of this study, peak is a better measure than RMS because it represents the maximum pressure the animals are producing while RMS averages the pressure over the duration of the sound. We did this conversion using the relationship between peak, peak-to-peak and RMS for a simple sinusoid, i.e., by adding 3 dB to RMS values or subtracting 6 dB from peak-to-peak values. For RMS values this underestimates the peak value for non-sinusoid signals, which makes our SL peak values conservative estimates. Second, we calculate SL to the standard reference distance of 1 m using spherical spreading attenuation. While atmospheric attenuation becomes substantial in air at frequencies >20 kHz, it is negligible over the short distances we encounter here and very likely less than the overall uncertainty involved in the reported measures. All our values are based on the highest reported values in each study.
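The two conversions described above can be sketched as follows. This is an illustrative reading of the procedure and not the code used for the compilation: the function names are hypothetical, the offsets are the rounded 3 dB and 6 dB values for a sinusoid given in the text, and atmospheric attenuation is ignored.

```python
import math

# Offsets (dB) that convert a reported level to a peak level, assuming a sinusoid.
PEAK_OFFSET_DB = {"peak": 0.0, "rms": 3.0, "pp": -6.0}


def to_peak_db(level_db, kind):
    """Convert an RMS or peak-to-peak level to the equivalent peak level."""
    return level_db + PEAK_OFFSET_DB[kind]


def to_source_level(level_db, distance_m, kind="peak", ref_distance_m=1.0):
    """Back-calculate a peak level to the reference distance using spherical spreading."""
    spreading_db = 20.0 * math.log10(distance_m / ref_distance_m)
    return to_peak_db(level_db, kind) + spreading_db


# Hypothetical measurements, for illustration only.
print(f"{to_source_level(100.0, 0.5, 'rms'):.1f} dB peak at 1 m")   # measured at 0.5 m
print(f"{to_source_level(100.0, 10.0, 'pp'):.1f} dB peak at 1 m")   # measured at 10 m
```

A measurement taken closer than 1 m is reduced by the spreading term, while one taken further away is increased.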

Pulsating Sphere and Piston Model
To relate sound pressure measurements at one position to another we must adopt a model of the sound source and the propagation medium. For the medium we assume lossless free space and discuss air/water-attenuation at ranges where these effects are relevant. For the sound source we employ two models: the pulsating sphere and the piston in a baffle, which despite being simple approximations are quite often used in bioacoustics.
For a pulsating sphere the relation between pressure amplitude and surface velocity is (Jacobsen and Juhl, 2013):

$$|p(r)| = \frac{\rho c\, k a^{2} U}{r\sqrt{1 + (ka)^{2}}}$$

where ρ is the density of the medium, c is the speed of sound in the medium, k = 2πf/c is the wavenumber, a is the radius of the sphere, U is the velocity of the sphere surface and r is the distance to the center of the sphere. If the velocity is given as an RMS value, the resulting sound pressure is an RMS value as well, and likewise for peak or peak-to-peak values. The quantity 4πa²U is the volume velocity of the sphere, which is often used to characterize source strength in acoustics.
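A minimal numerical sketch of the pulsating sphere model is given below. It uses the standard textbook expression |p(r)| = ρck·a²U / (r·√(1 + (ka)²)), which is assumed here to be the relation referred to above. The function name and the example values are hypothetical.

```python
import math


def pulsating_sphere_pressure(surface_velocity, radius, distance, frequency, rho, c):
    """Pressure amplitude (Pa) at `distance` from the center of a pulsating sphere.

    The result has the same amplitude convention (RMS, peak, peak-to-peak)
    as `surface_velocity`. Valid for distance >= radius.
    """
    if distance < radius:
        raise ValueError("distance must be at or outside the sphere surface")
    k = 2.0 * math.pi * frequency / c            # wavenumber
    ka = k * radius                              # acoustic size
    return rho * c * k * radius ** 2 * surface_velocity / (distance * math.sqrt(1.0 + ka ** 2))


# Hypothetical source in air: 1 cm radius, 4 kHz, 0.1 m/s surface velocity.
p = pulsating_sphere_pressure(0.1, 0.01, 1.0, 4000.0, rho=1.2, c=343.0)
print(f"pressure amplitude at 1 m: {p:.3f} Pa")
```

For ka well below 1 the pressure grows in proportion to frequency, and for ka well above 1 it approaches ρcU·a/r, which is the radiation limit discussed in the text.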
For a piston in a baffle, we limit the discussion to the on-axis pressure, the amplitude of which can be calculated by (Jacobsen and Juhl, 2013):

$$|p(x)| = 2\rho c\, U \left|\sin\!\left(\frac{k}{2}\left(\sqrt{x^{2} + a^{2}} - x\right)\right)\right|$$

where x is the distance to the center of the piston. For high frequencies and close distances strong interference can occur (Figure 4B), whereas an approximate expression can be found for long distances (compared to both radius and wavelength):

$$|p(x)| \approx \frac{\rho c\, k a^{2} U}{2x}$$

Note that the volume velocity of the piston, πa²U, is one-fourth of that of the sphere. For a given radius and volume velocity, the frequency response of the sphere increases by 6 dB/octave at low frequencies before leveling off at ka = 1 (3 dB corner frequency). For the piston in a baffle, there is no such limit in the far-field, but the near-field extends further with increasing frequency.
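The on-axis behavior of the baffled piston can be explored with the sketch below. It uses the standard textbook expressions |p(x)| = 2ρcU·|sin(k/2·(√(x² + a²) − x))| and its far-field limit ρck·a²U/(2x), which are assumed here to be the relations referred to above. The function names and example values are hypothetical.

```python
import math


def piston_on_axis_pressure(surface_velocity, radius, x, frequency, rho, c):
    """Exact on-axis pressure amplitude (Pa) of a circular piston in an infinite baffle."""
    k = 2.0 * math.pi * frequency / c
    # sqrt(x^2 + a^2) - x, written in a form that avoids cancellation at large x
    path_difference = radius ** 2 / (math.sqrt(x ** 2 + radius ** 2) + x)
    return 2.0 * rho * c * surface_velocity * abs(math.sin(0.5 * k * path_difference))


def piston_far_field_pressure(surface_velocity, radius, x, frequency, rho, c):
    """Far-field approximation of the on-axis pressure amplitude (Pa)."""
    k = 2.0 * math.pi * frequency / c
    return rho * c * k * radius ** 2 * surface_velocity / (2.0 * x)


# Hypothetical source in water: 5 cm radius, 100 kHz, 1 cm/s surface velocity.
args = dict(surface_velocity=0.01, radius=0.05, frequency=100e3, rho=1000.0, c=1500.0)
for x in (0.05, 0.16, 1.0, 10.0):
    exact = piston_on_axis_pressure(x=x, **args)
    approx = piston_far_field_pressure(x=x, **args)
    print(f"x = {x:5.2f} m  exact = {exact:9.1f} Pa  far-field = {approx:9.1f} Pa")
```

Close to the piston the exact amplitude oscillates and never exceeds 2ρcU, while the far-field expression keeps growing as x decreases, so the approximation should only be used well beyond the interference near field.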

Estimation of Maximal Acoustic Pressure
For sources that can be considered equivalent to oscillating pistons, we used the theoretical boundary between the interference near field and far-field as the distance to the source where the highest sound pressure occurs. According to Foote (2014), this boundary can be approximated conservatively from the radius of the piston, a, and the wavelength of the sound, λ. We use this approximation for the toothed whales and bats, whose highly directional sound emission patterns have been shown earlier to fit well with piston model predictions (see e.g., Mohl et al., 2003; Jakobsen and Surlykke, 2010). For bats we used the piston fit to the measured directionality of E. fuscus as reported in Hulgard et al. (2016). We assumed that E. bottae has a directionality similar to that of E. fuscus and computed a using the emitted frequency as reported by Holderied et al. (2005). For the two Noctilio species, we assume higher directionality based on the much higher emission frequency relative to body size; we therefore adjusted the size by the difference in estimated maximum gape size as reported by Thiagavel et al. (2017). For toothed whales, the end of the near field of T. truncatus is ca. 0.5 m (Finneran et al., 2016). Given that P. crassidens has the same directionality as T. truncatus, and assuming that O. orca does so as well, we estimated a from the known nearfield of T. truncatus and the emitted frequencies of each species. Directionality is higher for P. macrocephalus and we accounted for this by multiplying the assumed a at equal directionality to T. truncatus by the difference in directivity index (2 dB = 1.25) [see Jensen et al. (2018) for directivity measures]. For sources that can be considered monopoles, the limitation is essentially the size of the animal as there is no interference nearfield. We approximate animals that emit sound with no apparent directionality as monopoles, i.e., a ka product < 1 (see Figure 4), which included all animals other than bats and toothed whales.
Acoustic size estimates are not commonly given in the literature, so we used approximations based on available morphological measures. For frogs we estimated the size of the vocal sac as half the length of the animal (snout-vent length) and assumed that the size of the vocal sac equals the size of the monopole. For the cicada we estimated the width of the body from the commonly given hemelytra length using the known relationship between hemelytra length and body width reported for Cyclochila australasiae (Young, 1990). For the pistol shrimp, we used the size of the cavitation bubble reported by Versluis et al. (2000). For the fish, we computed the radius of a cylinder based on reported lengths and weights, assuming the same density as water. For all other animals we used the half-width of the skull as the monopole radius. All values are given in Table 1.
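The cylinder approximation used for the fish can be written out as a short sketch. A cylinder of length L, radius r and density ρ has mass m = ρπr²L, so r = √(m/(ρπL)). The function name, the density value and the example fish are assumptions made for illustration.

```python
import math

RHO_WATER = 1000.0  # kg/m^3, assumed density (fish taken to be as dense as water)


def cylinder_radius(mass_kg, length_m, rho=RHO_WATER):
    """Radius (m) of a cylinder with the given mass, length and density."""
    volume = mass_kg / rho                       # m^3
    return math.sqrt(volume / (math.pi * length_m))


# Hypothetical fish: 1 kg body mass, 40 cm body length.
r = cylinder_radius(1.0, 0.4)
print(f"equivalent radius: {100 * r:.1f} cm")
```

The resulting radius can then be used as a in the ka product when treating the animal as a monopole.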

DATA AVAILABILITY STATEMENT
The original contributions presented in the study are included in the article/Supplementary Material, further inquiries can be directed to the corresponding author/s.

AUTHOR CONTRIBUTIONS
LJ, JC-D, PJ, and CE: conceptualization, formal analysis, and writing - review and editing. LJ, PJ, and CE: methodology. LJ and CE: writing - original draft. All authors contributed to the article and approved the submitted version.

FUNDING
This study was supported by the Villum Foundation (00025380) and the Danish Research Council (DFF 8021-00155) to LJ and the Novo Nordisk Foundation (NNF17OC0028928) to CE.