A Unified Theory of Psychophysical Laws in Auditory Intensity Perception

Psychophysical laws quantitatively relate perceptual magnitude to stimulus intensity. While most people have accepted Stevens’s power function as the psychophysical law, few believe in Fechner’s original idea of using just-noticeable differences (jnd) as a constant perceptual unit to educe psychophysical laws. Here I present a unified theory in hearing, starting with a general form of Zwislocki’s loudness function (1965) to derive a general form of Brentano’s law. I will arrive at a general form of the loudness-jnd relationship that unifies previous loudness-jnd theories. Specifically, the “slope,” “proportional-jnd,” and “equal-loudness, equal-jnd” theories are three additive terms in the new unified theory. I will also show that the unified theory is consistent with empirical data in both acoustic and electric hearing. Without any free parameters, the unified theory uses loudness balance functions to successfully predict the jnd function in a wide range of hearing situations. The situations include loudness recruitment and its jnd functions in sensorineural hearing loss and simultaneous masking, loudness enhancement and the midlevel hump in forward and backward masking, and abnormal loudness and jnd functions in cochlear implant subjects. Predictions of these loudness-jnd functions were thought to be questionable at best in simultaneous masking or not possible at all in forward masking. The unified theory and its successful applications suggest that although the specific form of Fechner’s law needs to be revised, his original idea is valid in the wide range of hearing situations discussed here.


INTRODUCTION
Psychophysical laws attempt to relate the amplitude of a physical stimulus to its perceived magnitude, such as loudness as a function of sound pressure or brightness as a function of luminance. The classic approach to uncovering psychophysical laws was advanced by Fechner (1966) in the mid-19th century (original work published in 1860). Fechner assumed that the just-noticeable difference (jnd), expressed as the Weber fraction ($\Delta I/I$), where $I$ is a standard sound intensity and $\Delta I$ is the intensity change required for the jnd, produced an equal increment in loudness sensation ($\Delta L$). Integrating this equation, namely $\Delta L = \Delta I/I$, he produced what is known as Fechner's law: loudness is a logarithmic function of sound intensity ($L = \log I$).
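As a minimal sketch of Fechner's reasoning, the snippet below counts jnd steps between two intensities under the assumption of a constant Weber fraction and compares the count with the logarithmic closed form. The Weber fraction of 0.1 and the function names are hypothetical choices for illustration only.

```python
import math

def count_jnd_steps(i_start, i_end, weber_fraction):
    """Count jnd steps between two intensities when every step
    multiplies the intensity by (1 + weber_fraction)."""
    steps = 0
    intensity = i_start
    while intensity * (1 + weber_fraction) <= i_end:
        intensity *= 1 + weber_fraction
        steps += 1
    return steps

def fechner_prediction(i_start, i_end, weber_fraction):
    """Closed-form step count: log(i_end / i_start) / log(1 + weber_fraction)."""
    return math.log(i_end / i_start) / math.log(1 + weber_fraction)

w = 0.1  # hypothetical constant Weber fraction
for i_end in (1e3, 1e6):
    counted = count_jnd_steps(1.0, i_end, w)
    predicted = fechner_prediction(1.0, i_end, w)
    print(f"I_end = {i_end:.0e}: counted {counted}, closed form {predicted:.2f}")
```

If each jnd is treated as one unit of sensation, the step count grows with the logarithm of intensity, so doubling the intensity range in log units doubles the number of steps.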
Not only was Fechner's logarithmic law replaced by Stevens's power law, $L = I^{\theta}$, where $\theta$ is a constant (Stevens, 1961), but his general approach was also questioned because integrating the jnd functions of two different sounds failed to predict their respective loudness functions (Newman, 1933; Miller, 1947). Thus, it was not too surprising that the Fechnerian approach of relating the stimulus jnd to the subjective magnitude was abandoned by some researchers. What was surprising is the grounds on which the Fechnerian approach was abandoned. For example, Stevens (1961) argued that the direct magnitude estimation technique obsolesced intensity discrimination as a measure of the stimulus-sensation relationship. He viewed the discrimination measure as "an engineer talking. . . the scatter of some dial settings." In a completely opposing view, Viemeister and Bacon (1988) stated that loudness estimation data were a measure with "probably strong involvement of non-sensory factors, (and) we did not attempt to relate these data to those for intensity discrimination." Other researchers have continued to advance the Fechnerian approach in searching for a unified theory relating intensity discrimination to the loudness function. Fechner's original assumption was sometimes referred to as the "slope" theory, because it predicted that the steeper the loudness function, the smaller the jnd or Weber fraction for a constant increment in loudness. This simple slope prediction turned out not to hold, at least in cases of loudness recruitment, where cochlear hearing loss or partial masking elevated the hearing threshold but produced abnormally steep loudness growth so that normal loudness was perceived at high sound levels (Fowler, 1937).
To account for the failure of Fechner's slope theory, several researchers proposed a "proportional-jnd" theory, in which the jnd size needed to be normalized by the total jnd number within a stimulus's dynamic range (Riesz, 1933; Teghtsoonian, 1971; Lim et al., 1977). On the other hand, the "equal-loudness, equal-jnd" theory argued that the jnd had no relation to the slope of the loudness function, but rather was determined by the total loudness (Zwislocki and Jordan, 1986). Despite significant effort in testing these loudness-jnd relationships, no consensus has been reached yet (Houtsma et al., 1980; Hellman et al., 1987; Schlauch and Wier, 1987; Rankovic et al., 1988; Johnson et al., 1993; Stillman et al., 1993; Schlauch et al., 1995; Allen and Neely, 1997; Hellman and Hellman, 2001).
Here I present a unified theory, starting with a general form of Zwislocki's (1965) loudness function to derive a general form of Brentano's law, and I will arrive at a general form of the loudness-jnd relationship that unifies previous loudness-jnd theories. Specifically, I find that the previous "slope," "proportional-jnd," and "equal-loudness, equal-jnd" theories are three additive terms in the new unified theory. I also show that the new theory is capable of predicting loudness and jnd data across a wide range of hearing situations, including sensorineural hearing loss, simultaneous masking, forward masking, and electric hearing.

DERIVATION OF A UNIFIED THEORY
Derivation of a General Form of Brentano's or Ekman's Law
I start with the general form of a loudness function proposed by Zwislocki (1965; Eq. 212):
$$L = k\left[(I + cI_0)^{\theta} - (cI_0)^{\theta}\right] \qquad (1)$$
where $I_0$ is the detection threshold for a particular type of sound, $c$ represents an internal noise scaling factor, and $k$ is a constant. Generality and symmetry are the two reasons for choosing Zwislocki's loudness function. First, at high intensities ($I \gg I_0$), Zwislocki's function can be simplified as Stevens's power law, namely, $L = kI^{\theta}$. At low intensities, Zwislocki made an implicit but important assumption to account for loudness recruitment near threshold: the slope ($\theta$) of the loudness function does not increase as initially thought (Fowler, 1937); instead, the loudness at threshold is increased. Setting $I = I_0$ in Eq. (1), the loudness at threshold, or
$$L(I_0) = k\left[(1 + c)^{\theta} - c^{\theta}\right] I_0^{\theta} \qquad (2)$$
is directly proportional to the threshold and "must be greater than zero (Zwislocki, 1965; p. 87)." Mathematically, the loudness at threshold is infinite when the internal noise is zero ($c = 0$), and vice versa. This is a fundamental argument for why the brain has or needs internal noise because infinite loudness is clearly biologically unacceptable. Zwislocki's internal noise concept was also expanded to form the basis for treating loudness recruitment as "softness imperception" (Buus and Florentine, 2002) and tinnitus as "additive central noise" (Zeng, 2013). In the interest of simplicity, I define loudness at threshold as:
$$L_0 = k(cI_0)^{\theta} \qquad (3)$$
Second, the mathematical symmetry can be shown by differentiating Eq. (1):
$$\Delta L = k\theta (I + cI_0)^{\theta - 1}\,\Delta I$$
Adding and subtracting the same component in the above equation, I obtain:
$$\Delta L = \theta k\left[(I + cI_0)^{\theta} - (cI_0)^{\theta} + (cI_0)^{\theta}\right]\frac{\Delta I}{I + cI_0} = \theta (L + L_0)\frac{\Delta I}{I + cI_0}$$
Rewriting the above equation, I obtain the general form of Brentano's law or Ekman's law, namely, $\Delta L/L = \Delta I/I$ (see Stevens, 1961, for discussion of these laws):
$$\frac{\Delta L}{L + L_0} = \theta\,\frac{\Delta I}{I + cI_0} \qquad (4)$$
Equation (4) is mathematically symmetrical and balanced, having a general form of Weber's law including a threshold-correction term in both the sensation domain ($L_0$) and the stimulus domain ($cI_0$).
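The symmetry argument can be checked numerically. The sketch below assumes that Zwislocki's function takes the form $L = k[(I + cI_0)^{\theta} - (cI_0)^{\theta}]$ with $L_0 = k(cI_0)^{\theta}$, and that the general form of Brentano's law reads $\Delta L/(L + L_0) = \theta\,\Delta I/(I + cI_0)$. The default parameter values are the ones quoted later for the 1000-Hz tone in quiet; the function names are illustrative only.

```python
import math

def loudness(I, k=3.1, theta=0.27, c=2.5, I0=1e-12):
    """Assumed Zwislocki form: L = k[(I + c*I0)^theta - (c*I0)^theta]."""
    return k * ((I + c * I0) ** theta - (c * I0) ** theta)

def threshold_term(k=3.1, theta=0.27, c=2.5, I0=1e-12):
    """Sensation-domain correction term L0 = k (c*I0)^theta."""
    return k * (c * I0) ** theta

def brentano_sides(I, rel_step=1e-6, k=3.1, theta=0.27, c=2.5, I0=1e-12):
    """Return (dL/(L + L0), theta*dI/(I + c*I0)) for a small increment dI."""
    dI = rel_step * I
    L = loudness(I, k, theta, c, I0)
    dL = loudness(I + dI, k, theta, c, I0) - L
    L0 = threshold_term(k, theta, c, I0)
    return dL / (L + L0), theta * dI / (I + c * I0)

for level_db in (0, 20, 40, 60, 80, 100):
    I = 1e-12 * 10 ** (level_db / 10)  # intensity in W/m^2 at this dB SPL
    lhs, rhs = brentano_sides(I)
    print(f"{level_db:3d} dB SPL: lhs = {lhs:.6e}, rhs = {rhs:.6e}")
```

Under these assumptions the two sides agree at every level, including near threshold where Stevens's power law alone would not balance.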
To the first-order approximation, Weber's law in the stimulus domain has been "replicated in hundreds of studies across all sensory modalities and many animal species over the last two centuries (Pardo-Vazquez et al., 2019)." In auditory intensity discrimination, the Weber fraction is constant for broadband noise but decreases slightly with increasing intensity for pure tones, resulting in a "near miss" to Weber's law (McGill and Goldberg, 1968). Therefore, Eq. (4) can be written as:
$$\frac{\Delta L}{L + L_0} = wI^{\alpha} \qquad (5)$$
where $w$ and $\alpha$ are both constants, with $\alpha = 0$ indicating perfect conformity to Weber's law.
According to the "proportional-jnd" theory (Lim et al., 1977), the constant $w$ is inversely proportional to the number of jnds ($N$) within the stimulus dynamic range. In other words, $w = 1/N$, which can be considered as a scaling factor to account for the fact that different subjects or different types of stimuli may have a different number of discriminable steps within their respective dynamic range (e.g., a normal-hearing listener has 100 steps but a cochlear-implant user has only 10), but they all have similar loudness growth from soft at the threshold to uncomfortably loud at the upper limit of the range. The "proportional-jnd" theory states that 10 jnd steps in the normal-hearing listener would produce the same amount of loudness change as one jnd step in the cochlear-implant user. Although the "proportional-jnd" theory did not assume or require any specific jnd-loudness function, Lim et al. (1977) hinted that Brentano's law "is nearly the correct one" (see footnote 7 on p. 1264 in Lim et al., 1977). In this case, a relative change in loudness is inversely proportional to the number of jnds with an intensity correction term, whose origin will be considered in section "Discussion":
$$\frac{\Delta L}{L + L_0} = \frac{1}{N} I^{\alpha} \qquad (6)$$
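The "10 steps versus one step" statement can be illustrated with a short calculation. The sketch assumes that each jnd raises $L + L_0$ by the fraction $1/N$ and ignores the intensity correction term ($\alpha = 0$); both simplifications, and the function name, are assumptions made for illustration.

```python
def relative_loudness_change(n_steps, n_total):
    """Compound n_steps jnds, each raising (L + L0) by the fraction 1/n_total."""
    return (1 + 1 / n_total) ** n_steps - 1

# Example values from the text: 100 steps (normal hearing), 10 steps (implant).
normal_hearing = relative_loudness_change(n_steps=10, n_total=100)
implant_user = relative_loudness_change(n_steps=1, n_total=10)
print(f"10 jnds with N = 100: {normal_hearing:.4f}")
print(f" 1 jnd  with N = 10 : {implant_user:.4f}")
```

The two relative changes come out close to each other (about 10% each), which is the sense in which the jnd size is normalized by the total number of steps.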

Prediction of the jnd Function From the Loudness Balance Function
Suppose that the loudness function for a tone in quiet is $L = f(I)$, and that the loudness balance function between the tone in quiet and the tone in masking has been obtained: $I = g(I_m)$. By definition, at $I = g(I_m)$, loudness is balanced so that the loudness function can be derived for a partially masked tone:
$$L_m = f[g(I_m)] \qquad (7)$$
Differentiating the above equation gives:
$$\Delta L_m = f'[g(I_m)]\,g'(I_m)\,\Delta I_m = \frac{\Delta L}{\Delta I}\,g'(I_m)\,\Delta I_m \qquad (8)$$
Rewriting the above equation:
$$\Delta I_m = \frac{1}{g'(I_m)}\,\frac{\Delta L_m}{\Delta L}\,\Delta I \qquad (9)$$
Replacing $\Delta L_m$ and $\Delta L$ with Eq. (6) gives:
$$\Delta I_m = \frac{1}{g'(I_m)}\,\frac{N}{N_m}\,\frac{L_m + L_{m0}}{L + L_0}\,\frac{I_m^{\alpha}}{I^{\alpha}}\,\Delta I \qquad (10)$$
To predict the jnd in the form of the Weber fraction at the same intensity, that is, $I_m = I$, one can cancel out the intensity correction term ($I_m^{\alpha}/I^{\alpha}$) and divide the above equation by $I$:
$$\frac{\Delta I_m}{I} = \frac{1}{g'(I_m)}\,\frac{N}{N_m}\,\frac{L_m + L_{m0}}{L + L_0}\,\frac{\Delta I}{I} \qquad (11)$$
Taking a logarithmic transformation, one can calculate the jnd in terms of the Weber fraction in dB (WFdB):
$$WF_m\mathrm{dB}(I) = WF\mathrm{dB}(I) - 10\log g'(I_m) + 10\log\frac{N}{N_m} + 10\log\frac{L_m + L_{m0}}{L + L_0} \qquad (12)$$
where $WF_m\mathrm{dB}(I) = 10\log(\Delta I_m/I)$ is the log Weber fraction for a masked tone and $WF\mathrm{dB}(I) = 10\log(\Delta I/I)$ is the log Weber fraction for a tone in quiet. Equation (12) indicates that, if $WF\mathrm{dB}(I)$ is known at a given intensity ($I$), then one can predict $WF_m\mathrm{dB}(I)$ at the same intensity from three additional measures: (1) the local slope of the loudness balance function [$g'(I_m)$], (2) a scaling factor ($N/N_m$), and (3) the local loudness ratio between the masked tone and the tone in quiet [$(L_m + L_{m0})/(L + L_0)$]. Interestingly, in theory, there is no need to know explicitly the detection threshold, nor the exact form of the loudness growth or intensity discrimination function for the tone in quiet.
I consider Eq. (12) as a unified theory of psychophysical laws in auditory intensity perception because the last three terms in the equation contain the three previous theories that attempted to relate the jnd function to the loudness function. The $10\log g'(I_m)$ term represents Fechner's original "slope" theory; the $10\log(N/N_m)$ term represents Riesz's "proportional-jnd" theory; and the final term represents Zwislocki's "equal-loudness, equal-jnd" theory.
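A minimal sketch of how the three terms could be combined is given below. It assumes that the slope term enters with a negative sign (a steeper balance function gives a smaller jnd) and that the step-count ratio and loudness ratio enter with positive signs. The function name, argument names, and input values are hypothetical and are not taken from any dataset.

```python
import math

def predict_masked_wf_db(wf_quiet_db, balance_slope, step_ratio, loudness_ratio):
    """Predicted masked Weber fraction in dB from the quiet Weber fraction.

    balance_slope  : local slope g'(I_m) of the loudness balance function
    step_ratio     : N / N_m, jnd steps in quiet over jnd steps in masking
    loudness_ratio : (L_m + L_m0) / (L + L_0) at the same intensity
    """
    slope_term = -10 * math.log10(balance_slope)        # "slope" theory
    proportional_term = 10 * math.log10(step_ratio)     # "proportional-jnd" theory
    loudness_term = 10 * math.log10(loudness_ratio)     # "equal-loudness, equal-jnd" theory
    return wf_quiet_db + slope_term + proportional_term + loudness_term

# Hypothetical inputs for illustration only.
predicted = predict_masked_wf_db(wf_quiet_db=-6.0, balance_slope=1.0,
                                 step_ratio=2.5, loudness_ratio=0.5)
print(f"Predicted masked Weber fraction: {predicted:.2f} dB")
```

When all three ratios equal 1, the prediction reduces to the quiet Weber fraction, which is a quick sanity check on the additive structure.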

Prediction of the jnd Functions in Simultaneous Masking
Simultaneous masking not only elevates a pure tone's threshold but also affects its loudness perception, similar to loudness recruitment in sensorineural hearing loss. Both loudness balance and intensity discrimination functions have been measured in the same group of listeners for pure tones in quiet and in simultaneous noise maskers (Houtsma et al., 1980; Rankovic et al., 1988; Schlauch et al., 1995).
Here, I use the Schlauch et al. (1995) data to predict the masked jnd from the quiet jnd because they had the most complete set of data. Figure 1 illustrates the relative contributions of the three special terms in Eq. (12) to predictions of the jnd data in simultaneous masking. Figure 1A shows three loudness balance functions: the solid line represents a hypothetical condition where the same tone is perfectly balanced in loudness (i.e., 1:1 ratio) between two ears in quiet, the dashed line represents the measured balance function for a masked tone in a 15-dB SPL/Hz broadband noise, and the dotted line represents the measured balance function for a masked tone in a 40-dB SPL/Hz broadband noise (from Figure 3 in Schlauch et al., 1995). An interpolation of the loudness balance function is then differentiated to derive the slopes as a function of intensity (X's represent the 15-dB SPL/Hz masking condition and O's represent the 40-dB SPL/Hz masking condition).
FIGURE 1 | Predictions in simultaneous masking, with data (lines) being from Schlauch et al. (1995). Panel (A) shows loudness balance functions between a tone in quiet (y-axis) and a tone in noise (x-axis): The solid line represents the control condition where the same tone was balanced between the two ears in quiet, the dashed line represents the balance function for a tone masked by a 15-dB SPL/Hz broadband noise, and the dotted line represents the balance function for a tone masked by a 40-dB SPL/Hz noise. The symbols represent slope values for the balance function. The slope values use the same scale as the balance function from 0 to 100, except the slopes are unitless. Panel (B) shows derived loudness growth functions. The symbols represent loudness ratio values between the masked tones and the tone in quiet. Panel (C) shows the measured jnd functions (lines) and predicted jnd values (symbols).
FIGURE 2 | Predictions in forward masking, with data (lines) from Zeng (1994). Panel (A) shows loudness balance functions between a tone in quiet (y-axis) and a tone in forward masking (x-axis): The solid line represents the control condition where the same tone was balanced between the two ears in quiet, while the dashed line represents the balance function for a tone in forward masking. The * symbols represent slope values for the balance function, which use the same scale as the balance function from 0 to 100, except the slopes are unitless. Panel (B) shows derived loudness growth functions. The symbols represent loudness ratio values between the masked tone and the tone in quiet. Panel (C) shows the measured jnd functions (lines) and predicted jnd values (symbols).
Figure 1B shows the loudness growth function for a 1000-Hz tone in quiet (solid line) based on Zwislocki's model [Eq. (1), using $k = 3.1$; $\theta = 0.27$; $c = 2.5$; $I_0 = 10^{-12}$ W/m$^2$ or 0 dB SPL], as well as the two masked loudness growth functions obtained by applying the loudness balance functions in Figure 1A to the loudness growth function in quiet. The X's and O's represent the loudness ratio between the corresponding quiet and masking conditions. Figure 1C shows measured jnd functions in quiet (solid line), 15-dB masking (dashed line), and 40-dB masking (dotted line). The X's and O's represent the predicted jnd values in the above two masking conditions based on Eq. (12). In addition to using the slope values in Figure 1A and loudness ratio values in Figure 1B, Eq. (12) uses a normalization factor of 4 dB and 8 dB for the 15-dB and 40-dB masking conditions, respectively. The 4-dB and 8-dB normalization factors were estimated from both the dynamic range and the jnd values (Nelson et al., 1996; see their Figure 9), with the quiet condition having 2.5 times and 6.3 times more jnd steps than the 15-dB and 40-dB masking conditions, respectively. There was no free parameter in this prediction. In terms of relative contributions to the successful prediction, the "equal-loudness, equal-jnd" theory was essential to the prediction of the overall trend (the same downward pattern in Figures 1B,C), while the slope theory (the relatively flat pattern of the X and O symbols in Figure 1A) behaved similarly to the proportional-jnd theory as a constant to shift the predicted function up or down.
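Two of the processing steps described above can be sketched in a few lines: converting a ratio of jnd step counts into a normalization factor in dB, and differentiating a sampled loudness balance function to obtain local slopes. The balance points below are hypothetical placeholders and are not the Schlauch et al. (1995) data; the finite-difference scheme is likewise an assumption, since the interpolation method is not specified.

```python
import math

def normalization_db(step_ratio):
    """Convert a ratio of jnd step counts (N / N_m) into decibels."""
    return 10 * math.log10(step_ratio)

def local_slopes(x_db, y_db):
    """Finite-difference slopes (dB/dB) of a sampled balance function.

    Central differences are used for interior points and one-sided
    differences at the two ends.
    """
    slopes = []
    last = len(x_db) - 1
    for i in range(len(x_db)):
        lo, hi = max(i - 1, 0), min(i + 1, last)
        slopes.append((y_db[hi] - y_db[lo]) / (x_db[hi] - x_db[lo]))
    return slopes

# Step-count ratios quoted in the text for the two masking conditions.
for ratio in (2.5, 6.3):
    print(f"{ratio} times more steps -> {normalization_db(ratio):.2f} dB")

# Hypothetical recruitment-like balance function (masked level -> quiet level).
masked_db = [40, 50, 60, 70, 80, 90]
quiet_db = [10, 30, 48, 64, 78, 90]
print(local_slopes(masked_db, quiet_db))
```

The two ratios round to the 4-dB and 8-dB factors used in the prediction, and the hypothetical balance function yields slopes above 1 that shrink toward 1 at high levels, as expected for recruitment-like growth.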

Prediction of the jnd Function in Forward Masking
Loudness and its jnd functions of a stimulus can also be affected by forward and backward masking. Loudness is enhanced and intensity discrimination is degraded in forward and backward masking, particularly at middle intensities (Zeng et al., 1991; Plack and Viemeister, 1992; Zeng and Turner, 1992). Although an early attempt to relate the "midlevel hump" (the jnd function) to loudness enhancement was not successful (Zeng, 1994), Oberfeld (2008) found a significant correlation between the elevated jnd and enhanced loudness when a wide range of masker-to-signal level differences was tested.
Using the same processing steps as in Figure 1, Figure 2 shows the loudness balance function between a 25-ms tone in quiet and in the presence of a 90-dB SPL, 100-ms forward masker (Figure 2A), the derived loudness growth function (Figure 2B), and the measured as well as predicted jnd functions in quiet and masking (Figure 2C). The slope theory (Figure 2A) predicted that forward masking would produce smaller than normal jnds for standard levels below 50 dB SPL but larger jnds for levels above 50 dB SPL. The "equal-loudness, equal-jnd" theory (Figure 2B) predicted the midlevel hump jnd function due to enhanced loudness in forward masking. A 7-dB normalization factor, or five times fewer jnd steps in forward masking, was used in the final successful prediction (Figure 2C) that combined all three special theories in Eq. (12). The similar pattern between Figures 2B,C is generally consistent with the observed correlation between enhanced loudness and elevated jnd (Oberfeld, 2008), but the quantitative prediction needs further investigation. It would also be interesting to know if the present unified theory could predict a similar jnd function observed for brief high-frequency tones under notched noise conditions (Carlyon and Moore, 1984). Oxenham and Moore (1995) hinted at such a possibility by proposing "a new theory [that] explain[s] the severe departure from Weber's law in terms of both the variance. . . and the loudness of partially masked signals."
FIGURE 3 | (A) ... Zeng and Shannon (1994). Reprinted with permission from AAAS. Symbols represent individual data and the solid line represents a logarithmic balance function. The dashed line represents a linear balance function, which clearly was not the true balance function. (B) JND data (symbols) and predicted functions (lines) using the same stimuli from the same subjects in (A), adapted from Figure 4 in Zeng and Shannon (1999). Reprinted with permission from Wolters Kluwer Health.

Predictions of the jnd Functions in Electric Hearing
In electric hearing, where hair cells are missing and the auditory nerve fibers are directly stimulated by electric currents, loudness generally has a narrow dynamic range of 10-20 dB (Zeng and Galvin, 1999). Zeng and Shannon (1994) found that, in cochlear implant users, loudness grows as a traditional power function of electric current for stimulus frequencies lower than 300 Hz, but as an exponential function for stimulus frequencies higher than 300 Hz. These two different loudness growth functions would produce a logarithmic loudness balance function between low- and high-frequency electric stimuli. Figure 3A shows, indeed, such a logarithmic balance function (solid lines) between a 100-Hz stimulus (sinusoid or pulse amplitude on y-axis) and a 1000-Hz sinusoid (x-axis):
$$\log E_{100\,\mathrm{Hz}} = \theta E_{1000\,\mathrm{Hz}} \qquad (13)$$
where $\theta$ is the slope of the logarithmic loudness balance function. Differentiating the above equation gives the following jnd function between the high- and low-frequency electric stimuli:
$$\frac{\Delta E_{100\,\mathrm{Hz}}}{E_{100\,\mathrm{Hz}}} = \theta\,\Delta E_{1000\,\mathrm{Hz}} \qquad (14)$$
Zeng and Shannon (1999) measured jnds of these stimuli in the same implant subjects (symbols in Figure 3B) and found that not only did this jnd function hold but, more importantly, the jnd function was nearly constant (the solid line in Figure 3B). Given the same power loudness growth function for the 100-Hz electric stimuli, it is not surprising that their Weber fraction was also constant. But why was the absolute difference ($\Delta E_{1000\,\mathrm{Hz}}$) constant for the 1000-Hz stimulus? Zeng and Shannon (1999) showed that this constant absolute difference was a result of the exponential loudness growth function:
$$L_{1000\,\mathrm{Hz}} = \exp(E_{1000\,\mathrm{Hz}}) \qquad (15)$$
Differentiating the above equation gives:
$$\frac{\Delta L_{1000\,\mathrm{Hz}}}{\Delta E_{1000\,\mathrm{Hz}}} = \exp(E_{1000\,\mathrm{Hz}}) = L_{1000\,\mathrm{Hz}} \qquad (16)$$
Rewriting the above equation gives:
$$\frac{\Delta L_{1000\,\mathrm{Hz}}}{L_{1000\,\mathrm{Hz}}} = \Delta E_{1000\,\mathrm{Hz}} \qquad (17)$$
Equation (17) means that Brentano's ratio is also constant in electric stimulation. The only difference between Eqs. (17) and (4) is that Eq. (17) does not contain a threshold term, probably due to a lack of spontaneous neural activity in the deafened ear (Kiang and Moxon, 1972).
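The link between the shape of the loudness function and the form of the jnd can be checked numerically. The sketch assumes an exponential loudness function $L = \exp(E)$ for the 1000-Hz stimulus and a power function $L = E^{a}$ for the 100-Hz stimulus, with a constant Brentano ratio $\Delta L/L = w$ defining one jnd. The values of $w$ and $a$, the stimulus amplitudes, and the function names are hypothetical.

```python
import math

def jnd_exponential(E, w):
    """Absolute increment dE that raises L = exp(E) by the fraction w."""
    L = math.exp(E)
    return math.log(L * (1 + w)) - E

def weber_fraction_power(E, w, exponent):
    """Relative increment dE/E that raises L = E**exponent by the fraction w."""
    L = E ** exponent
    return ((L * (1 + w)) ** (1 / exponent) - E) / E

w = 0.1  # hypothetical constant Brentano ratio per jnd
for E in (1.0, 2.0, 3.0):
    print(f"exponential growth, E = {E}: dE = {jnd_exponential(E, w):.4f}")
for E in (10.0, 100.0, 1000.0):
    print(f"power growth, E = {E}: dE/E = {weber_fraction_power(E, w, 2.0):.4f}")
```

Under these assumptions the exponential function yields a constant absolute difference, whereas the power function yields a constant Weber fraction, which mirrors the pattern described for the 1000-Hz and 100-Hz stimuli.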

DISCUSSION
None of the individual components in the present unified theory is new. Previous studies have proposed these individual theories and evaluated them separately (e.g., Zwislocki and Jordan, 1986; Hellman, 1990, 2001; Schlauch, 1994; Schlauch et al., 1995; Allen and Neely, 1997). The present study is novel in two respects. First, the present study integrates the previously disconnected individual components through a unified theoretical framework, namely, the general form of Brentano's law in Eq. (4). Second, the present study offers a new formula, namely, Eq. (12), which specifically combines these individual terms to successfully predict the loudness and jnd relationships in simultaneous and forward masking, as well as in cochlear implant users. The present unified theory and its successful applications suggest that although Weber's law needs to be replaced by the general form of Brentano's law, Fechner's original idea using jnds to derive psychophysical laws is valid at least in the wide range of hearing situations examined here.
The general form of Brentano's law can be used to examine how closely the actual jnd data follow Weber's law, and the potential mechanisms, by combining Eqs. (4) and (5):
$$\frac{\Delta I}{I} = w' I^{\alpha}\left(1 + \frac{cI_0}{I}\right) \qquad (18)$$
where both $w'$ ($= w/\theta$) and $\alpha$ are free parameters to be estimated, with $\alpha = 0$ indicating perfect conformity to Weber's law. Figure 4 shows the jnd data and the model estimation for a 1-kHz tone (Schlauch et al., 1995), 8-kHz broadband noise (6-14 kHz) and the same noise in a notched noise background (Viemeister, 1983). All three sets of data can be modeled by a two-stage function, with a steep first stage (∼10-20 dB SPL) reflecting the threshold influence and a shallower second stage (∼20-100 dB SPL) with its slope being $\alpha$ in Eq. (18). All three sets of data follow the near-miss to Weber's law (McGill and Goldberg, 1968), with $\alpha$ being −0.09 for the tone, −0.03 for the noise, and 0.04 for the noise in a notched noise background. The near-miss ranges from −9% to 4% and has an average of 3% for the three stimuli considered here.
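The two-stage shape can be reproduced with a short sketch. It assumes that the combined model has the form $\Delta I/I = w' I^{\alpha}(1 + cI_0/I)$, with intensity expressed relative to the threshold $I_0$ so that the level in dB SPL is $10\log(I/I_0)$. The value of $w'$ is a hypothetical placeholder, while $\alpha = -0.09$ and $c = 2.5$ are the values quoted in the text for the tone.

```python
import math

def weber_fraction_db(level_db, w_prime, alpha, c=2.5):
    """10*log10(dI/I) for dI/I = w' * I^alpha * (1 + c*I0/I), with I in units of I0."""
    rel_intensity = 10 ** (level_db / 10)  # I / I0
    wf = w_prime * rel_intensity ** alpha * (1 + c / rel_intensity)
    return 10 * math.log10(wf)

def slope_db_per_db(lo_db, hi_db, **params):
    """Average slope of the Weber fraction (in dB) between two levels."""
    rise = weber_fraction_db(hi_db, **params) - weber_fraction_db(lo_db, **params)
    return rise / (hi_db - lo_db)

params = dict(w_prime=0.5, alpha=-0.09)  # w_prime is hypothetical
print(f"slope near threshold (0-10 dB): {slope_db_per_db(0, 10, **params):.3f}")
print(f"slope at high levels (60-80 dB): {slope_db_per_db(60, 80, **params):.3f}")
```

Near threshold the correction term dominates and the function falls steeply; at higher levels the slope settles at $\alpha$, so that $\alpha = 0$ would give a flat function, that is, Weber's law.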
To provide a solution to the near-miss to Weber's law, McGill and Goldberg adopted a Poisson-like process, in which the loudness mean ($L$) and its variance ($\sigma^2$) are equal, where $\sigma$ is the standard deviation. To achieve 75% correct detection in a jnd task, signal detection theory requires $d' = \Delta L/\sigma = \Delta L/L^{0.5} = 1$ (Green and Swets, 1966). Replacing $\Delta L$ with $L^{0.5}$ in Eq. (4) at high intensities ($I \gg I_0$), where $L = kI^{\theta}$, produces:
$$\frac{\Delta I}{I} = \frac{1}{\theta} L^{-0.5} = \frac{1}{\theta\sqrt{k}}\, I^{-\theta/2} \qquad (19)$$
Compared with the −0.14 slope predicted by the Poisson-like process ($-\theta/2$ with $\theta = 0.27$), the estimated slope is 5% off for the tone, 11% off for the noise, and 18% off for the noise in a notched-noise background. As an overcorrection, McGill and Goldberg's solution has created a much greater difference (average = 11%) than the original problem, i.e., the near-miss (average = 3%) to Weber's law. Alternatively, the use of a spread-of-excitation cue is the more likely mechanism underlying the near-miss to Weber's law (Florentine and Buus, 1981; Viemeister, 1983), but a quantitative treatment of its predictive accuracy is still lacking. At least as a first-order approximation, Weber's law holds for sound intensity discrimination.
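The arithmetic behind this comparison can be laid out explicitly. The sketch assumes that the Poisson-like slope equals $-\theta/2$ with $\theta = 0.27$ (rounded to −0.14, as in the text) and uses the three estimated slopes quoted for the tone, the noise, and the noise in a notched-noise background; the variable names are illustrative.

```python
theta = 0.27
poisson_slope_exact = -theta / 2  # about -0.135
poisson_slope = -0.14             # rounded value used in the comparison

estimated = {
    "tone": -0.09,
    "noise": -0.03,
    "noise in notched noise": 0.04,
}

# Distance of each estimated slope from the Poisson-like prediction.
offsets = {name: alpha - poisson_slope for name, alpha in estimated.items()}
mean_offset = sum(offsets.values()) / len(offsets)

for name, off in offsets.items():
    print(f"{name}: {100 * off:.0f}% off the Poisson-like slope")
print(f"average offset from the Poisson-like slope: {100 * mean_offset:.0f}%")
```

Under these values the Poisson-like prediction sits further from each estimated slope than Weber's law ($\alpha = 0$) does, which is the basis for describing it as an overcorrection.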
While it is challenging, the search for a unified psychophysical law has continued to attract attention, especially regarding its biological basis (e.g., Shepard, 1987; Nieder and Miller, 2003; Dehaene et al., 2008; Dzhafarov and Colonius, 2011; Teghtsoonian, 2012; Pardo-Vazquez et al., 2019). In an influential paper, which drew 30 open peer commentaries, Krueger (1989) attempted to reconcile Fechner and Stevens by proposing a unified psychophysical law, in which (1) "each jnd has the same subjective magnitude for a given modality," (2) "subjective magnitude increases as approximately a power function of physical magnitude," and (3) "subjective magnitude depends primarily on peripheral sensory processes, that is, no non-linear central transformations occur." With regard to (1), Krueger preferred $\Delta S$ or, in the present terms, $\Delta L = c$ (constant) for the law of parsimony, but was willing to accept $\Delta L/L = c$ (Brentano's Law) or even $\Delta L/L = L^{-0.5}$ (McGill and Goldberg's Poisson process). The present study favors Brentano's Law with a threshold correction factor. The second point was the primary concern of Krueger's unified law, in which he attempted not only to reconcile the different ways to measure sensation magnitude (e.g., magnitude estimate versus categorical rating), but also to derive the subjective magnitude function from the jnd data. He explicitly examined the "proportional-jnd" theory (p. 260) and implicitly discussed the "slope" theory (his Table 1 on p. 261), but probably did not know about the "equal-loudness, equal-jnd" theory, let alone consider them as three independent factors that collectively contribute to the jnd-loudness function (the present study).
Krueger's third point, treating the brain as a linear device, is wrong: not only does the present study (Figure 3) show that electric stimulation of the auditory nerve, which bypasses the auditory hair cells, produces an exponential loudness function in cochlear implant users, but, more importantly, many studies on neuroplasticity have found abnormally increased gain in the brain in response to reduced input in the periphery (e.g., Qiu et al., 2000; Norena, 2011; Chambers et al., 2016).

DATA AVAILABILITY STATEMENT
The datasets generated for this study are available on request to the corresponding author.

ETHICS STATEMENT
Ethical approval was not required for this study as the human data were from previously published studies. The patients/participants provided their written informed consent to participate in this study.

AUTHOR CONTRIBUTIONS
F-GZ is solely responsible for the work presented here.