Abstract
Neuroimaging work has shed light on the cerebral architecture involved in processing the melodic and harmonic aspects of music. Here, recent evidence is reviewed illustrating that subcortical auditory structures contribute to the early formation and processing of musically relevant pitch. Electrophysiological recordings from the human brainstem and population responses from the auditory nerve reveal that nascent features of tonal music (e.g., consonance/dissonance, pitch salience, harmonic sonority) are evident at early, subcortical levels of the auditory pathway. The salience and harmonicity of brainstem activity are strongly correlated with listeners’ perceptual preferences and perceived consonance for the tonal relationships of music. Moreover, the hierarchical ordering of pitch intervals/chords described by Western music practice and their perceptual consonance is well predicted by the salience with which pitch combinations are encoded in subcortical auditory structures. While the neural correlates of consonance can be tuned and exaggerated with musical training, they persist even in the absence of musicianship or long-term enculturation. As such, it is posited that the structural foundations of musical pitch might result from innate processing performed by the central auditory system. A neurobiological predisposition for consonant, pleasant sounding pitch relationships may be one reason why these pitch combinations have been favored by composers and listeners for centuries. It is suggested that important perceptual dimensions of music emerge well before the auditory signal reaches cerebral cortex and prior to attentional engagement. While cortical mechanisms are no doubt critical to the perception, production, and enjoyment of music, the contribution of subcortical structures implicates a more integrated, hierarchically organized network underlying music processing within the brain.
In Western tonal music, the octave is divided into 12 equally spaced pitch classes (i.e., semitones). These elements can be further arranged into seven-tone subsets to construct the diatonic major/minor scales that define tonality and musical key. Music theory and composition stipulate that the pitch combinations (i.e., intervals) formed by these scale-tones carry different weight, or importance, within a musical framework (Aldwell and Schachter, 2003). That is, pitch intervals follow a hierarchical organization in accordance with their functional role in musical composition (Krumhansl, 1990). Intervals associated with stability and finality are regarded as consonant while those associated with instability (i.e., requiring resolution) are regarded as dissonant. Given their anchor-like function in musical contexts, it is perhaps unsurprising that consonant pitch relationships occur more frequently in tonal music than dissonant relationships (Budge, ; Vos and Troost, 1989). Ultimately, it is the ebb and flow between consonance and dissonance which conveys musical tension and establishes the structural foundations of melody and harmony, the fundamental building blocks of Western tonal music (Rameau, 1722/1971; Krumhansl, 1990).
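The division of the octave into 12 equal semitones, and the selection of seven-tone diatonic subsets from them, can be illustrated with a minimal sketch. The function names, the 440 Hz reference pitch, and the use of the natural minor pattern are assumptions made here for illustration; they are not taken from the studies reviewed.

```python
# Illustrative sketch only: names and the 440 Hz reference are assumptions.
SEMITONES_PER_OCTAVE = 12
MAJOR_STEPS = [2, 2, 1, 2, 2, 2, 1]          # whole/half-step pattern, major scale
NATURAL_MINOR_STEPS = [2, 1, 2, 2, 1, 2, 2]  # whole/half-step pattern, natural minor


def equal_tempered_frequency(semitones_from_ref, ref_hz=440.0):
    """Frequency of a pitch lying n equal-tempered semitones above the reference."""
    return ref_hz * 2.0 ** (semitones_from_ref / SEMITONES_PER_OCTAVE)


def scale_pitch_classes(tonic, steps):
    """Return the seven pitch classes (0-11) of the scale built on `tonic`."""
    pitch_classes = [tonic % SEMITONES_PER_OCTAVE]
    for step in steps[:-1]:  # the final step only returns to the tonic
        pitch_classes.append((pitch_classes[-1] + step) % SEMITONES_PER_OCTAVE)
    return pitch_classes


c_major = scale_pitch_classes(0, MAJOR_STEPS)          # C D E F G A B
a_minor = scale_pitch_classes(9, NATURAL_MINOR_STEPS)  # A B C D E F G
print(c_major, a_minor)
```

With pitch class 0 taken as C, the two scales draw on the same seven pitch classes but start from different tonics, which is one simple way to see how a key is defined by its reference pitch rather than by its note collection alone.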
The Perception of Musical Pitch: Sensory Consonance and Dissonance
The music cognition literature distinguishes the aforementioned musical definitions from those used to describe the psychological attributes of musical pitch. The term tonal- or sensory-consonance-dissonance refers to the perceptual quality of two or more simultaneous tones presented in isolation (Krumhansl, 1990) and is distinct from consonance arising from contextual or cognitive influences (see Dowling and Harwood, 1986, for a discussion of non-sensory factors). Perceptually, consonant pitch relationships are described as sounding more pleasant, euphonious, and beautiful than dissonant combinations which sound unpleasant, discordant, or rough (Plomp and Levelt, 1965). Consonance is often described parsimoniously as the absence of dissonance. A myriad of empirical studies have quantified the perceptual qualities of musical pitch relationships. In such behavioral experiments, listeners are typically played various two-tone pitch combinations (dyads) constructed from the musical scale and asked to rate their degree of consonance (i.e., “pleasantness”). Examples of such ratings, as reported in the seminal studies of Kameoka and Kuriyagawa (1969a,b), are shown in Figure 1A. The rank order of intervals according to their perceived consonance is shown in Figure 1B. Two trends emerge from the pattern of ratings across a number of studies: (i) listeners routinely prefer consonant pitch relationships (e.g., octave, fifth, fourth, etc.) to their dissonant counterparts (e.g., major/minor second, sevenths) and (ii) intervals are not heard in a strict binary manner (i.e., consonant vs. dissonant) but rather, are processed differentially based on their degree of perceptual consonance (e.g., Kameoka and Kuriyagawa, 1969a,b; Krumhansl, 1990). These behavioral studies demonstrate that musical pitch relationships are perceived hierarchically and in an arrangement that parallels their relative use and importance in music composition (Krumhansl, 1990; Schwartz et al., 2003).
Figure 1
Interestingly, the preference for consonance and the hierarchical nature of musical pitch perception is reported even for non-musician listeners (Van De Geer et al., 1962; Tufts et al., 2005; Bidelman and Krishnan,
The current review aims to provide a comprehensive overview of recent work examining the psychophysiological bases of consonance, dissonance, and the hierarchical foundations of musical pitch. Discussions of these musical phenomena have enjoyed a rich history of arguments developed over many centuries. As such, treatments of early explanations are first provided based on mathematical, acoustic, and psychophysical accounts implicating peripheral auditory mechanisms (e.g., cochlear mechanics) in musical pitch listening. Counterexamples are then provided which suggest that strict acoustic and cochlear theories are inadequate to account for the findings of recent studies examining human consonance judgments. Lastly, recent neuroimaging evidence is highlighted which supports the notion that the perceptual attributes of musical pitch are rooted in neurophysiological processing performed by the central nervous system. Particular attention is paid to recent studies examining the neural encoding of musical pitch using scalp-recorded brainstem responses elicited from human listeners. Brainstem evoked potentials demonstrate that the perceptual correlates of musical consonance and pitch hierarchy are well represented in subcortical auditory structures, suggesting that attributes important to music listening emerge well before the auditory signal reaches cerebral cortex. The contribution of subcortical mechanisms implies that music engages a more integrated, hierarchically organized network tapping both sensory (pre-attentive) and cognitive levels of brain processing.
Historical Theories and Explanations for Musical Consonance and Dissonance
The acoustics of musical consonance
Early explanations of consonance and dissonance focused on the underlying acoustic properties of musical intervals. It was recognized as early as the ancient Greeks, and later by Galilei (1638/1963), that pleasant sounding (i.e., consonant) musical intervals were formed when two vibrating entities were combined whose frequencies formed simple integer ratios (e.g., 3:2 = perfect fifth, 2:1 = octave). In contrast, “harsh” or “discordant” (i.e., dissonant) intervals were created by combining tones with complex ratios (e.g., 16:15 = minor second). By these purely mathematical standards, consonant intervals were regarded as divine acoustic relationships superior to their dissonant counterparts and, as a result, were heavily exploited by early composers (for a historic account, see Tenney, 1988). Indeed, the most important pitch relationships in music, including the major chord, can be derived directly from the first few components of the harmonic series (Gill and Purves, 2009). Yet, while attractive prima facie, the long held theory that the ear prefers simple ratios is no longer tenable when dealing with contemporary musical tuning systems. For example, the ratio of the consonant perfect fifth under modern equal temperament (442:295) is hardly a small integer relationship. Though intimately linked, explanations of consonance-dissonance based purely on these physical constructs (e.g., frequency ratios) are, in and of themselves, insufficient in describing all of the cognitive aspects of musical pitch (Cook and Fujisawa, 2006; Bidelman and Krishnan,
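The mismatch between simple integer ratios and modern tuning noted above can be checked numerically. The sketch below compares the just-intonation ratios quoted in the text with their equal-tempered counterparts, expressing the difference in cents (1200 cents per octave). The dictionary layout and function names are assumptions introduced for illustration.

```python
import math

# Ratios quoted in the text (numerator, denominator), paired with the number
# of equal-tempered semitones assigned to the same interval.
INTERVALS = {
    "minor second": ((16, 15), 1),
    "perfect fifth": ((3, 2), 7),
    "octave": ((2, 1), 12),
}


def cents(ratio):
    """Size of a frequency ratio in cents (1200 cents = one octave)."""
    return 1200.0 * math.log2(ratio)


def equal_tempered_ratio(semitones):
    """Frequency ratio spanned by n equal-tempered semitones."""
    return 2.0 ** (semitones / 12.0)


def tuning_deviation_cents(name):
    """Equal-tempered interval size minus just-intonation size, in cents."""
    (numerator, denominator), semitones = INTERVALS[name]
    return cents(equal_tempered_ratio(semitones)) - cents(numerator / denominator)


for name in INTERVALS:
    print(f"{name}: {tuning_deviation_cents(name):+.2f} cents")
```

Under these definitions the equal-tempered fifth is roughly 2 cents narrower than 3:2, and its ratio is irrational, so 442:295 is only a close rational approximation to 2^(7/12).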
Psychophysiology of musical consonance
Psychophysical roughness/beating and the cochlear critical band
Helmholtz (1877/1954) offered some of the earliest psychophysical explanations for sensory consonance-dissonance. He observed that when adjacent harmonics in complex tones interfere they create the perception of “roughness” or “beating,” percepts closely related to the perceived dissonance of tones (Terhardt, 1974). Consonance, on the other hand, occurs in the absence of beating, when low-order harmonics are spaced sufficiently far apart so as not to interact. Empirical studies suggest this phenomenon is related to cochlear mechanics and the critical-band hypothesis (Plomp and Levelt, 1965). This theory postulates that the overall consonance-dissonance of a musical interval depends on the total interaction of frequency components within single auditory filters. Pitches of consonant dyads have fewer partials which pass through the same critical bands and therefore, yield more pleasant percepts; in contrast, the partials of dissonant intervals compete within individual channels and as such, yield discordant percepts.
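The critical-band account can be made concrete by counting how many pairs of partials from the two tones of a dyad fall within a single auditory filter. The sketch below is a simplification under stated assumptions: the equivalent rectangular bandwidth (ERB) formula of Glasberg and Moore is used as a stand-in for the critical bandwidth, each tone is given six harmonics, and coincident partials are treated as non-interacting. None of these choices is taken from the studies cited above, and the count is not a model of perceived roughness.

```python
def erb_hz(center_hz):
    """Equivalent rectangular bandwidth (Glasberg and Moore formula), in Hz.

    Used here as an assumed proxy for the cochlear critical bandwidth.
    """
    return 24.7 * (4.37 * center_hz / 1000.0 + 1.0)


def interacting_partials(f1_hz, ratio, n_harmonics=6):
    """Count cross-tone partial pairs closer together than one ERB.

    Pairs that coincide exactly (e.g., shared harmonics) are skipped because
    they produce no beating.
    """
    lower = [f1_hz * k for k in range(1, n_harmonics + 1)]
    upper = [f1_hz * ratio * k for k in range(1, n_harmonics + 1)]
    count = 0
    for fa in lower:
        for fb in upper:
            separation = abs(fa - fb)
            if 1e-6 < separation < erb_hz((fa + fb) / 2.0):
                count += 1
    return count


for name, ratio in [("unison", 1.0), ("perfect fifth", 3 / 2), ("minor second", 16 / 15)]:
    print(name, interacting_partials(220.0, ratio))
```

With a 220 Hz lower tone, the minor second yields more within-band partial pairs than the perfect fifth, which is the qualitative pattern the hypothesis predicts.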
Unfortunately, roughness/beating is often difficult to isolate from consonance percepts given that both covary with the spacing between frequency components in the acoustic waveform, and are thus, intrinsically coupled. While within-channel interactions may produce some amount of dissonance, modern empirical evidence indicates that beating/roughness plays only a minor role in its perception. Indeed, at least three pieces of evidence support the notion that consonance may not be mediated by roughness/beating, per se. First, psychoacoustic findings indicate that roughness percepts are dominated by lower modulation rates (∼30–150 Hz) (Terhardt, 1974; McKinney et al., 2001, p. 2). Yet, highly dissonant intervals are heard for tones spaced well beyond this range (Bidelman and Krishnan,
Tonal fusion and harmonicity
Alternate theories have suggested musical consonance is determined by the sense of “fusion” or “tonal affinity” between simultaneously sounding pitches (Stumpf, 1890). Pitch fusion describes the degree to which multiple pitches are heard as a single, unitary tone (DeWitt and Crowder, 1987). Fusion is closely related to harmonicity, which describes how well a sound’s acoustic spectrum agrees with a single harmonic series (Gill and Purves, 2009; McDermott et al., 2010; Bidelman and Heinz,
Neurophysiology of musical consonance
The fact that these perceptual factors do not depend on long-term enculturation or musical training and have been reported even in non-human species (Izumi, 2000; Watanabe et al., 2005; Brooks and Cook,
Neural Correlates of Consonance, Dissonance, and Musical Pitch Hierarchy
Neuroimaging methods have offered a window into the cerebral architecture underlying the perceptual attributes of musical pitch. Functional magnetic resonance imaging (fMRI), for example, has shown differential and enhanced activation across cortical regions (e.g., inferior/middle frontal gyri, premotor cortices, inferior parietal lobule) when processing consonant vs. dissonant tonal relationships (Foss et al., 2007; Minati et al., 2009; Fujisawa and Cook, 2011). Scalp-recorded event-related brain potentials (ERPs) have proved to be a particularly useful technique to non-invasively probe the neural correlates of musical pitch. ERPs represent the time-locked neuroelectric activity of the brain generated by the activation of neuronal ensembles within cerebral cortex. The auditory cortical ERP consists of a series of voltage deflections (i.e., “waves”) within the first ∼250 ms after the onset of sound. Each deflection represents the subsequent activation in a series of early auditory cortical structures including thalamus and primary/secondary auditory cortex (Näätänen and Picton, 1987; Scherg et al., 1989; Picton et al., 1999). The millisecond temporal resolution of ERPs provides an ideal means to investigate the time-course of music processing within the brain not afforded by other, more sluggish neuroimaging methodologies (e.g., fMRI).
Cortical correlates of musical consonance
Using far-field recorded ERPs, neural correlates of consonance, dissonance, and musical scale pitch hierarchy have been identified at a cortical level of processing (Brattico et al.,
Figure 2

Cortical event-related potentials (ERPs) elicited by musical dyads. (A) Cortical ERP waveforms recorded at the vertex of the scalp (Cz lead) in response to chromatic musical intervals. Response trace color corresponds to the evoking stimulus denoted in music notation. Interval stimuli were composed of two simultaneously sounding pure tones. (B) Cortical N2 response magnitude is modulated by the degree of consonance; dissonant pitch relationships evoke larger N2 magnitude than consonant intervals. The shaded region demarcates the critical bandwidth (CBW); perceived dissonance created by intervals larger than the CBW cannot be attributed to cochlear interactions (e.g., beating between frequency components). Perfect consonant intervals (filled circles); imperfect consonant intervals (filled triangles); dissonant intervals (open circles). (C) Response magnitude is correlated with the degree of simplicity of musical pitch intervals; simpler, more consonant pitch relationships (e.g., P1, P8, P5) elicit smaller N2 than more complex, dissonant pitch relationships (e.g., M2, TT, M7). Figure adapted from Itoh et al. (2010) with permission from The Acoustical Society of America.
Brainstem correlates of musical consonance and scale pitch hierarchy
To assess human subcortical auditory processing, electrophysiological studies have utilized the frequency-following response (FFR). The FFR is a sustained evoked potential characterized by a periodic waveform which follows the individual cycles of the stimulus (for review, see Krishnan, 2007; Chandrasekaran and Kraus, 2010; Skoe and Kraus, 2010). Based on its latency (Smith et al., 1975), lesion data (Smith et al., 1975; Sohmer et al., 1977), and known extent of phase-locking in the brainstem (Wallace et al., 2000; Aiken and Picton,
In a recent study (Bidelman and Krishnan,
Figure 3

Human brainstem frequency-following responses (FFRs) elicited by musical dyads. Grand average FFR waveforms (A) and their corresponding frequency spectra (B) evoked by the dichotic presentation of four representative musical intervals. Consonant intervals, blue; dissonant intervals, red. (A) Clearer, more robust periodicity is observed for consonant relative to dissonant intervals. (B) Frequency spectra reveal that FFRs faithfully preserve the harmonic constituents of both musical notes of the interval (compare response spectrum, filled area, to stimulus spectrum, harmonic locations denoted by dots). Consonant intervals evoked more robust spectral magnitudes across harmonics than dissonant intervals. Amplitudes are normalized relative to the unison. (C) Correspondence between FFR pitch salience computed from brainstem responses and behavioral consonance ratings. Neural responses well predict human preferences for musical intervals. Note the systematic clustering of consonant and dissonant intervals and the maximal separation of the unison (most consonant interval) from the minor second (most dissonant interval) in the neural-behavioral space. Data from Bidelman and Krishnan (
Importantly, these strong brain-behavior relationships have been observed in non-musician listeners and under conditions of passive listening (most subjects fell asleep during EEG testing). These factors imply that basic perceptual aspects of music might be rooted in intrinsic sensory processing. Unfortunately, these brainstem studies employed adult human listeners. As such, they could not rule out the possibility that non-musicians’ brain responses might have been preferentially tuned via long-term enculturation and/or implicit exposure to the norms of Western music practice.
Auditory nerve correlates of musical consonance
To circumvent confounds of musical experience, enculturation, memory, and other top-down factors which influence the neural code for music, Bidelman and Heinz (
Auditory nerve population responses were obtained by pooling single-unit responses from 70 fibers with characteristic frequencies spanning the range of human hearing. Spike trains were recorded in response to 220 dyads within the range of an octave where f1/f2 separation varied from the unison (i.e., f2 = f1) to the octave (i.e., f2 = 2f1). First-order interspike interval histograms computed from raw spike times allowed for the quantification of periodicity information contained in the aggregate AN response (Figure 4A). Adopting techniques of (Bidelman and Krishnan,
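The construction of a pooled, first-order interspike interval histogram from raw spike times can be sketched as follows. This is a minimal illustration under stated assumptions, not the analysis pipeline of the studies cited: the 0.1 ms bin width, the 30 ms maximum interval, the function names, and the synthetic spike trains are all hypothetical, and the subsequent harmonic sieve step is not reproduced.

```python
def first_order_isih(spike_times_s, bin_width_s=0.0001, max_interval_s=0.03):
    """Histogram of intervals between consecutive spikes of one fiber."""
    n_bins = int(round(max_interval_s / bin_width_s))
    histogram = [0] * n_bins
    ordered = sorted(spike_times_s)
    for earlier, later in zip(ordered, ordered[1:]):
        # Assign each interval to its nearest bin; longer intervals are dropped.
        index = int(round((later - earlier) / bin_width_s))
        if 0 <= index < n_bins:
            histogram[index] += 1
    return histogram


def pooled_isih(fibers, bin_width_s=0.0001, max_interval_s=0.03):
    """Sum single-fiber histograms into one population-level histogram."""
    n_bins = int(round(max_interval_s / bin_width_s))
    pooled = [0] * n_bins
    for spike_times_s in fibers:
        single = first_order_isih(spike_times_s, bin_width_s, max_interval_s)
        pooled = [a + b for a, b in zip(pooled, single)]
    return pooled


# Two synthetic "fibers" firing every 10 ms with different onset delays.
fibers = [
    [k * 0.01 for k in range(50)],
    [0.003 + k * 0.01 for k in range(50)],
]
histogram = pooled_isih(fibers)
peak_interval_ms = histogram.index(max(histogram)) * 0.1
print(peak_interval_ms)
```

Because intervals are computed within each fiber before pooling, differences in onset delay across fibers do not smear the population histogram; only the shared firing periodicity accumulates.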
Figure 4

Auditory nerve (AN) responses to musical dyads. (A) Population level interspike interval histograms (ISIHs) for a representative consonant (perfect fifth: 220 + 330 Hz) and dissonant (minor second: 220 + 233 Hz) musical interval. ISIHs quantify the periodicity of spike discharges from a population of 70 AN fibers driven by a single two-tone musical interval. (B) Neural pitch salience profiles computed from ISIHs via harmonic sieve analyses quantify the salience of all possible pitches contained in AN responses based on harmonicity of the spike distribution. Their peak magnitude (arrows) represents a singular measure of neural pitch salience for the eliciting musical interval. (C) AN pitch salience across the chromatic intervals is more robust for consonant than dissonant intervals. Rank order of the intervals according to their neural pitch salience parallels the hierarchical arrangement of pitches according to Western music theory (i.e., Un > Oct > P5, >P4, etc.). (D) AN pitch representations predict the hierarchical order of behavioral consonance judgments of human listeners (behavioral data from normal-hearing listeners of Tufts et al., 2005). AN data reproduced from Bidelman and Heinz (
In follow-up analyses, it was shown that neither acoustic nor traditional psychophysical explanations (e.g., periodicity, roughness/beating) could fully account for human consonance ratings (Bidelman and Heinz,
The Hierarchical Nature and Basis of Subcortical Pitch Processing
To date, overwhelming evidence suggests that cortical integrity is necessary to support the cognitive aspects of musical pitch (Johnsrude et al., 2000; Ayotte et al.,
Figure 5

Comparison between auditory nerve, human brainstem evoked potentials, and behavioral responses to musical intervals. (Top left) AN responses correctly predict perceptual attributes of consonance, dissonance, and the hierarchical ordering of musical dyads. AN neural pitch salience is shown as a function of the number of semitones separating the interval’s lower and higher pitch over the span of an octave (i.e., 12 semitones). Consonant musical intervals (blue) tend to fall on or near peaks in neural pitch salience whereas dissonant intervals (red) tend to fall within trough regions, indicating more robust encoding for the former. Among intervals common to a single class (e.g., all consonant intervals), AN responses show differential encoding resulting in the hierarchical arrangement of pitch typically described by Western music theory (i.e., Un > Oct > P5, >P4, etc.). (Top middle) neural correlates of musical consonance observed in human brainstem responses. As in the AN, brainstem responses reveal stronger encoding of consonant relative to dissonant pitch relationships. (Top right) behavioral consonance ratings reported by human listeners. Dyads considered consonant according to music theory are preferred over those considered dissonant [minor second (m2), tritone (TT), major seventh (M7)]. For comparison, the solid line shows predictions from a mathematical model of consonance and dissonance (Sethares, 1993) where local maxima denote higher degrees of consonance than minima, which denote dissonance. (Bottom row) auditory nerve (left) and brainstem (middle) responses similarly predict behavioral chordal sonority ratings (right) for the four most common triads in Western music. Chords considered consonant according to music theory (i.e., major, minor) elicit more robust subcortical responses and show an ordering expected by music practice (i.e., major > minor ≫ diminished > augmented). AN data from Bidelman and Heinz (
As in language (Hickok and Poeppel, 2004), brain networks engaged during music likely involve a series of computations applied to the neural representation at different stages of processing. It is likely that higher-level abstract representations of musical pitch structure are first initiated in acoustics (Gill and Purves, 2009; McDermott et al., 2010). Physical periodicity is then transformed to musically relevant neural periodicity very early along the auditory pathway (AN; Tramo et al., 2001; Bidelman and Heinz,
Importantly, it seems that even the non-musician brain is especially sensitive to the pitch relationships found in music and is enhanced when processing consonant relative to dissonant chords/intervals. The preferential encoding of consonance might be attributable to the fact that it generates more robust and synchronous phase-locking than dissonant pitch intervals. A higher neural synchrony for the former is consistent with previous neuronal recordings in AN (Tramo et al., 2001), midbrain (McKinney et al., 2001), and cortex (Fishman et al., 2001) of animal models which show more robust temporal responses for consonant musical units. For these pitch relationships, neuronal firing occurs at precise, harmonically related pitch periods; dissonant relations on the other hand produce multiple, more irregular neural periodicities. Pitch encoding mechanisms likely exploit simple periodic (cf. consonant) information more effectively than aperiodic (cf. dissonant) information (Rhode, 1995; Langner, 1997; Ebeling, 2008), as the former is likely to be more compatible with pitch extraction templates and provides a more robust, unambiguous cue for pitch (McDermott and Oxenham, 2008). In a sense, dissonance may challenge the auditory system in ways that simple consonance does not. It is conceivable that consonant music relationships may ultimately reduce computational load and/or require fewer brain resources to process than their dissonant counterparts due to the more coherent, synchronous neural activity they evoke (Burns,
One important issue concerning the aforementioned FFR studies is the degree to which responses reflect the output of a subcortical, brainstem “pitch processor” or rather, a reflection of the representations propagated from more peripheral sites (e.g., AN). Indeed, IC architecture [orthogonal frequency-periodicity maps (Langner, 2004; Baumann et al.,
Subcortical Plasticity in Musical Pitch Processing
The aforementioned studies demonstrate a critical link between sensory coding and the perceptual qualities of musical pitch which are independent of musical training and long-term enculturation. Electrophysiological studies thus largely converge with behavioral work, demonstrating that musicians and non-musicians show both a similar bias for consonance and a hierarchical hearing of the pitch combinations in music (Roberts, 1986; McDermott et al., 2010). Yet, realizing the profound impact of musical experience on the auditory brain, recent studies have begun to examine how musicianship might impact the processing and perceptual organization of consonance, dissonance, and scale pitch hierarchy. Examining training-induced effects also provides a means to examine the roles of nature and nurture on the encoding of musical pitch as well as the influence of auditory experience on music processing.
Neuroplastic effects on pitch processing resulting from musical training
Comparisons between musicians and non-musicians reveal enhanced brainstem encoding of pitch-relevant information in trained individuals (Figure 6) (Musacchia et al., 2007; Bidelman and Krishnan,
Figure 6

Experience-dependent enhancement of brainstem responses resulting from musical training. (A) Brainstem FFR time-waveforms elicited by a chordal arpeggio (i.e., three consecutive tones) recorded in musician and non-musician listeners (red and blue, respectively). (B) Expanded time window around the onset response to the chordal third (≈117 ms), the defining note of the arpeggio sequence. Relative to non-musicians, musician responses are both larger and more temporally precise as evidenced by their shorter duration P-N onset complex (C) and more robust amplitude (D). Musical training thus improves both the precision and magnitude of time-locked neural activity to musical pitch. Error bars = SEM. Data from Bidelman et al. (
Experience-dependent changes in the psychophysiological processing of musical consonance
At a subcortical level, recent studies have demonstrated more robust and coherent brainstem responses to consonant and dissonant intervals in musically trained listeners relative to their non-musician peers (Lee et al., 2009). Brainstem phase-locking to the temporal periodicity of the stimulus envelope – a prominent correlate of roughness/beating (Terhardt, 1974) – is also stronger and more precise in musically trained listeners (Lee et al., 2009). These results suggest that brainstem auditory processing is shaped experientially so as to refine neural representations of musical pitch in a behaviorally relevant manner (for parallel effects in language, see Bidelman et al.,
Recent work also reveals similar experience-dependent effects at a cortical level. Consonant chords, for example, elicit differential hemodynamic responses in inferior and middle frontal gyri compared to dissonant chords regardless of an individual’s musical experience (Minati et al., 2009). Yet, the hemispheric laterality of this activation differs between groups; while right lateralized for non-musicians, activation is more symmetric in musicians, suggesting that musical expertise recruits a more distributed neural network for music processing. Cortical brain potentials corroborate fMRI findings. Studies generally show that consonant and dissonant pitch intervals elicit similar modulations in the early components of the ERPs (P1/N1) for both musicians and non-musicians alike. However, distinct variation in the later waves (N2) is found nearly exclusively in musically trained listeners (Regnault et al., 2001; Itoh et al., 2003, 2010; Schön et al., 2005; Minati et al., 2009). Thus, musicianship might have a differential effect on the time-course of cortical auditory processing; musical training might exert more neuroplastic effects on later, endogenous mechanisms (i.e., N2) than on earlier, exogenous processing (e.g., P1, N1). Indeed, variations in N2, which covaries with perceived consonance, are exaggerated in musicians (Itoh et al., 2010). These neurophysiological findings are consistent with recent behavioral reports which demonstrate musicians’ higher sensitivity and perceptual differentiation of consonant and dissonant pitches (McDermott et al., 2010; Bidelman et al.,
Limitations of these reports are worth mentioning. Most studies examining the effects of musical training on auditory abilities have employed cross-sectional and correlational designs. Such work has suggested that the degree of a musician’s auditory perceptual and neurophysiological enhancements is often positively associated with the number of years of his/her musical training and negatively associated with the age at which training initiated (e.g., Bidelman et al.,
Is There a Neurobiological Basis for Musical Pitch?
There are notable commonalities (i.e., universals) among many of the music systems of the world including the division of the octave into specific scale steps and the use of a stable reference pitch to establish key structure. In fact, it has been argued that culturally specific music is simply an elaboration of only a few universal traits (Carterette and Kendall, 1999), one of which is the preference for consonance (Fritz et al., 2009). Together, our recent findings from human brainstem recordings (Bidelman and Krishnan,
It is interesting to note that musical intervals and chords deemed more pleasant sounding by listeners are also more prevalent in tonal composition (Budge,
Conclusion
Brainstem evoked potentials and AN responses reveal robust correlates of musical pitch at subcortical levels of auditory processing. Interestingly, the ordering of musical intervals/chords according to the magnitude of their subcortical representations tightly parallels their hierarchical arrangement as described by Western music practice. Thus, information relevant to musical consonance, dissonance, and scale pitch structure emerges well before cortical and attentional engagement. The close correspondence between subcortical brain representations and behavioral consonance rankings suggests that listeners’ judgments of pleasant- or unpleasant-sounding pitch relationships may, at least in part, be rooted in early, pre-attentive stages of the auditory system. Of the potential correlates of musical consonance described throughout history (e.g., acoustical ratios, cochlear roughness/beating, neural synchronicity), results suggest that the harmonicity of neural activity best predicts human judgments. Although enhanced with musical experience, these facets of musical pitch are encoded in non-musicians (and even non-human animals), implying that certain fundamental attributes of music listening exist in the absence of training, long-term enculturation, and memory/cognitive capacity. It is possible that the preponderance of consonant pitch relationships and choice of intervals, chords, and tuning used in modern compositional practice may have matured based on the general processing and constraints of the sensory auditory system.
Statements
Acknowledgments
The author wishes to thank Dr. Kosuke Itoh for kindly sharing figures of the cortical ERP data and Dr. Jennifer Bidelman and Stefanie Hutka for comments on earlier versions of this manuscript. Preparation of this work was supported by a grant-in-aid awarded by GRAMMY Foundation to Gavin M. Bidelman.
Conflict of interest
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Footnotes
1. A binaural interaction component (BIC) is derived from scalp-recorded ERPs as the difference between the potential evoked by binaural stimulation and the sum of the responses evoked by monaural stimulation. Assuming confounding factors such as acoustic cross-talk and middle ear reflex are eliminated, the resulting BIC response is thought to reflect neural interaction in the outputs from both ears converging at or above the level of the brainstem (Krishnan and McDaniel, 1998). Binaural interaction has been observed in brainstem, middle-latency, and cortical auditory evoked potentials and can be used to investigate the central interaction of auditory information (McPherson and Starr, 1993).
References
Aiken, S. J., and Picton, T. W. (2008). Envelope and spectral frequency-following responses to vowel sounds. Hear. Res. 245, 35–47. doi: 10.1016/j.heares.2008.08.004
Aldwell, E., and Schachter, C. (2003). Harmony and Voice Leading. Boston: Thomson/Schirmer.
Alkhoun, I., Gallégo, S., Moulin, A., Ménard, M., Veuillet, E., Berger-Vachon, C., et al. (2008). The temporal relationship between speech auditory brainstem responses and the acoustic pattern of the phoneme /ba/ in normal-hearing adults. Clin. Neurophysiol. 119, 922–933. doi: 10.1016/j.clinph.2007.12.010
Ayotte, J., Peretz, I., and Hyde, K. (2002). Congenital amusia: a group study of adults afflicted with a music-specific disorder. Brain 125, 238–251. doi: 10.1093/brain/awf028
Baumann, S., Griffiths, T. D., Sun, L., Petkov, C. I., Thiele, A., and Rees, A. (2011). Orthogonal representation of sound dimensions in the primate midbrain. Nat. Neurosci. 14, 423–425. doi: 10.1038/nn.2771
Bidelman, G. M., Gandour, J. T., and Krishnan, A. (2011a). Cross-domain effects of music and language experience on the representation of pitch in the human auditory brainstem. J. Cogn. Neurosci. 23, 425–434. doi: 10.1162/jocn.2009.21362
Bidelman, G. M., Gandour, J. T., and Krishnan, A. (2011b). Musicians and tone-language speakers share enhanced brainstem encoding but not perceptual benefits for musical pitch. Brain Cogn. 77, 1–10. doi: 10.1016/j.bandc.2011.07.006
Bidelman, G. M., Gandour, J. T., and Krishnan, A. (2011c). Musicians demonstrate experience-dependent brainstem enhancement of musical scale features within continuously gliding pitch. Neurosci. Lett. 503, 203–207. doi: 10.1016/j.neulet.2011.08.036
Bidelman, G. M., Krishnan, A., and Gandour, J. T. (2011d). Enhanced brainstem encoding predicts musicians’ perceptual advantages with pitch. Eur. J. Neurosci. 33, 530–538. doi: 10.1111/j.1460-9568.2010.07527.x
Bidelman, G. M., and Heinz, M. G. (2011). Auditory-nerve responses predict pitch attributes related to musical consonance-dissonance for normal and impaired hearing. J. Acoust. Soc. Am. 130, 1488–1502. doi: 10.1121/1.3605559
Bidelman, G. M., Hutka, S., and Moreno, S. (2013). Tone language speakers and musicians share enhanced perceptual and cognitive abilities for musical pitch: evidence for bidirectionality between the domains of language and music. PLoS ONE 8:e60676. doi: 10.1371/journal.pone.0060676
Bidelman, G. M., and Krishnan, A. (2009). Neural correlates of consonance, dissonance, and the hierarchy of musical pitch in the human brainstem. J. Neurosci. 29, 13165–13171. doi: 10.1523/JNEUROSCI.3900-09.2009
Bidelman, G. M., and Krishnan, A. (2010). Effects of reverberation on brainstem representation of speech in musicians and non-musicians. Brain Res. 1355, 112–125. doi: 10.1016/j.brainres.2010.07.100
Bidelman, G. M., and Krishnan, A. (2011). Brainstem correlates of behavioral and compositional preferences of musical harmony. Neuroreport 22, 212–216. doi: 10.1097/WNR.0b013e328344a689
Boomsliter, P., and Creel, W. (1961). The long pattern hypothesis in harmony and hearing. J. Music Theory 5, 2–31. doi: 10.2307/842868
Brattico, E., Tervaniemi, M., Naatanen, R., and Peretz, I. (2006). Musical scale properties are automatically processed in the human auditory cortex. Brain Res. 1117, 162–174. doi: 10.1016/j.brainres.2006.08.023
Braun, M. (1999). Auditory midbrain laminar structure appears adapted to f0 extraction: further evidence and implications of the double critical bandwidth. Hear. Res. 129, 71–82. doi: 10.1016/S0378-5955(98)00223-8
Brooks, D. I., and Cook, R. G. (2010). Chord discrimination by pigeons. Music Percept. 27, 183–196. doi: 10.1525/mp.2010.27.3.183
Buch, E. (1900). Uber die verschmelzungen von empfindungen besonders bei klangeindrucken. Phil. Stud. 15, 240.
Budge, H. (1943). A Study of Chord Frequencies. New York: Teachers College, Columbia University.
Burns, E. M. (1999). “Intervals, scales, and tuning,” in The Psychology of Music, 2nd Edn, ed. D. Deutsch (San Diego: Academic Press), 215–264.
Carcagno, S., and Plack, C. J. (2011). Subcortical plasticity following perceptual learning in a pitch discrimination task. J. Assoc. Res. Otolaryngol. 12, 89–100. doi: 10.1007/s10162-010-0236-1
Carterette, E. C., and Kendall, R. A. (1999). “Comparative music perception and cognition,” in The Psychology of Music, 2nd Edn, ed. D. Deutsch (San Diego: Academic Press), 725–791.
Cazden, N. (1958). Musical intervals and simple number ratios. J. Res. Music Educ. 7, 197–220. doi: 10.2307/3344215
Cedolin, L., and Delgutte, B. (2005). Pitch of complex tones: rate-place and interspike interval representations in the auditory nerve. J. Neurophysiol. 94, 347–362. doi: 10.1152/jn.01114.2004
Chandrasekaran, B., and Kraus, N. (2010). The scalp-recorded brainstem response to speech: neural origins and plasticity. Psychophysiology 47, 236–246. doi: 10.1111/j.1469-8986.2009.00928.x
Cook, N. D., and Fujisawa, T. X. (2006). The psychophysics of harmony perception: harmony is a three-tone phenomenon. Empir. Musicol. Rev. 1, 1–21.
Cousineau, M., McDermott, J. H., and Peretz, I. (2012). The basis of musical consonance as revealed by congenital amusia. Proc. Natl. Acad. Sci. U.S.A. 109, 19858–19863. doi: 10.1073/pnas.1207989109
DeWitt, L. A., and Crowder, R. G. (1987). Tonal fusion of consonant musical intervals: the oomph in Stumpf. Percept. Psychophys. 41, 73–84. doi: 10.3758/BF03208216
Dowling, J., and Harwood, D. L. (1986). Music Cognition. San Diego: Academic Press.
Ebeling, M. (2008). Neuronal periodicity detection as a basis for the perception of consonance: a mathematical model of tonal fusion. J. Acoust. Soc. Am. 124, 2320–2329. doi: 10.1121/1.2968688
Eberlein, R. (1994). Die Entstehung der tonalen Klangsyntax [The Origin of Tonal-Harmonic Syntax]. Frankfurt: Peter Lang.
Faist, A. (1897). Versuche uber tonverschmelzung. Z. Psychol. Physiol. Sinnesorgane 15, 102–131.
Fishman, Y. I., Volkov, I. O., Noh, M. D., Garell, P. C., Bakken, H., Arezzo, J. C., et al. (2001). Consonance and dissonance of musical chords: neural correlates in auditory cortex of monkeys and humans. J. Neurophysiol. 86, 2761–2788.
Foss, A. H., Altschuler, E. L., and James, K. H. (2007). Neural correlates of the Pythagorean ratio rules. Neuroreport 18, 1521–1525. doi: 10.1097/WNR.0b013e3282ef6b51
Fritz, T., Jentschke, S., Gosselin, N., Sammler, D., Peretz, I., Turner, R., et al. (2009). Universal recognition of three basic emotions in music. Curr. Biol. 19, 573–576. doi: 10.1016/j.cub.2009.02.058
Fujisawa, T. X., and Cook, N. D. (2011). The perception of harmonic triads: an fMRI study. Brain Imaging Behav. 5, 109–125. doi: 10.1007/s11682-011-9116-5
Galbraith, G. C. (1994). Two-channel brain-stem frequency-following responses to pure tone and missing fundamental stimuli. Electroencephalogr. Clin. Neurophysiol. 92, 321–330. doi: 10.1016/0168-5597(94)90100-7
Galilei, G. (1638/1963). Discorsi e dimostrazioni matematiche interno à due nuove scienze attenenti alla mecanica ed i movimenti locali [Dialogues Concerning Two New Sciences], trans. H. Crew and A. de Salvio. New York: McGraw-Hill Book Co., Inc. [Original work published in 1638].
Gill, K. Z., and Purves, D. (2009). A biological rationale for musical scales. PLoS ONE 4:e8144. doi: 10.1371/journal.pone.0008144
Gockel, H. E., Carlyon, R. P., Mehta, A., and Plack, C. J. (2011). The frequency following response (FFR) may reflect pitch-bearing information but is not a direct representation of pitch. J. Assoc. Res. Otolaryngol. 12, 767–782. doi: 10.1007/s10162-011-0284-1
Goldstein, J. L. (1973). An optimum processor theory for the central formation of the pitch of complex tones. J. Acoust. Soc. Am. 54, 1496–1516. doi: 10.1121/1.1978261
Greenberg, S., Marsh, J. T., Brown, W. S., and Smith, J. C. (1987). Neural temporal coding of low pitch. I. Human frequency-following responses to complex tones. Hear. Res. 25, 91–114. doi: 10.1016/0378-5955(87)90083-9
Helmholtz, H. (1877/1954). On the Sensations of Tone, trans. A. J. Ellis. New York: Dover Publications, Inc. [Original work published in 1877].
Hickok, G., and Poeppel, D. (2004). Dorsal and ventral streams: a framework for understanding aspects of the functional anatomy of language. Cognition 92, 67–99. doi: 10.1016/j.cognition.2003.10.011
Hink, R. F., Kodera, K., Yamada, O., Kaga, K., and Suzuki, J. (1980). Binaural interaction of a beating frequency-following response. Audiology 19, 36–43. doi: 10.3109/00206098009072647
Houtsma, A. J., and Goldstein, J. L. (1972). The central origin of the pitch of complex tones: evidence from musical interval recognition. J. Acoust. Soc. Am. 51, 520–529. doi: 10.1121/1.1912873
Huron, D. (1991). Tonal consonance versus tonal fusion in polyphonic sonorities. Music Percept. 9, 135–154. doi: 10.2307/40285526
Hyde, K. L., Lerch, J., Norton, A., Forgeard, M., Winner, E., Evans, A. C., et al. (2009). Musical training shapes structural brain development. J. Neurosci. 29, 3019–3025. doi: 10.1523/JNEUROSCI.5118-08.2009
Itoh, K., Suwazono, S., and Nakada, T. (2003). Cortical processing of musical consonance: an evoked potential study. Neuroreport 14, 2303–2306. doi: 10.1097/00001756-200312190-00003
Itoh, K., Suwazono, S., and Nakada, T. (2010). Central auditory processing of noncontextual consonance in music: an evoked potential study. J. Acoust. Soc. Am. 128, 3781. doi: 10.1121/1.3500685
Izumi, A. (2000). Japanese monkeys perceive sensory consonance of chords. J. Acoust. Soc. Am. 108, 3073–3078. doi: 10.1121/1.1323461
Janata, P., Birk, J. L., Van Horn, J. D., Leman, M., Tillmann, B., and Bharucha, J. J. (2002). The cortical topography of tonal structures underlying Western music. Science 298, 2167–2170. doi: 10.1126/science.1076262
Johnsrude, I. S., Penhune, V. B., and Zatorre, R. J. (2000). Functional specificity in the right human auditory cortex for perceiving pitch direction. Brain 123, 155–163. doi: 10.1093/brain/123.1.155
Kameoka, A., and Kuriyagawa, M. (1969a). Consonance theory part I: consonance of dyads. J. Acoust. Soc. Am. 45, 1451–1459. doi: 10.1121/1.1911623
Kameoka, A., and Kuriyagawa, M. (1969b). Consonance theory part II: consonance of complex tones and its calculation method. J. Acoust. Soc. Am. 45, 1460–1469. doi: 10.1121/1.1911624
Kreuger, F. (1913). Consonance and dissonance. J. Philos. Psychol. Scientific Methods 10, 158. doi: 10.2307/2012516
Krishnan, A. (2007). “Human frequency following response,” in Auditory Evoked Potentials: Basic Principles and Clinical Application, eds R. F. Burkard, M. Don, and J. J. Eggermont (Baltimore: Lippincott Williams & Wilkins), 313–335.
Krishnan, A., and McDaniel, S. S. (1998). Binaural interaction in the human frequency-following response: effects of interaural intensity difference. Audiol. Neurootol. 3, 291–299. doi: 10.1159/000013801
Krohn, K. I., Brattico, E., Valimaki, V., and Tervaniemi, M. (2007). Neural representations of the hierarchical scale pitch structure. Music Percept. 24, 281–296. doi: 10.1525/mp.2007.24.3.281
Krumhansl, C. L. (1990). Cognitive Foundations of Musical Pitch. New York: Oxford University Press.
Langner, G. (1997). Neural processing and representation of periodicity pitch. Acta Otolaryngol. Suppl. 532, 68–76. doi: 10.3109/00016489709126147
Langner, G. (2004). “Topographic representation of periodicity information: the 2nd neural axis of the auditory system,” in Plasticity of the Central Auditory System and Processing of Complex Acoustic Signals, eds J. Syka and M. Merzenich (New York: Plenum Press), 21–26.
Lee, K. M., Skoe, E., Kraus, N., and Ashley, R. (2009). Selective subcortical enhancement of musical intervals in musicians. J. Neurosci. 29, 5832–5840. doi: 10.1523/JNEUROSCI.5273-08.2009
Malmberg, C. F. (1918). The perception of consonance and dissonance. Psychol. Monogr. 25, 93–133. doi: 10.1037/h0093119
McDermott, J., and Hauser, M. D. (2005). The origins of music: innateness, uniqueness, and evolution. Music Percept. 23, 29–59. doi: 10.1525/mp.2005.23.1.29
McDermott, J. H., Lehr, A. J., and Oxenham, A. J. (2010). Individual differences reveal the basis of consonance. Curr. Biol. 20, 1035–1041. doi: 10.1016/j.cub.2010.04.019
McDermott, J. H., and Oxenham, A. J. (2008). Music perception, pitch, and the auditory system. Curr. Opin. Neurobiol. 18, 452–463. doi: 10.1016/j.conb.2008.09.005
McKinney, M. F., Tramo, M. J., and Delgutte, B. (2001). “Neural correlates of the dissonance of musical intervals in the inferior colliculus,” in Physiological and Psychophysical Bases of Auditory Function, eds D. J. Breebaart, A. J. M. Houtsma, A. Kohlrausch, V. F. Prijs, and R. Schoonhoven (Maastricht: Shaker), 83–89.
McPherson, D. L., and Starr, A. (1993). Binaural interaction in auditory evoked potentials: brainstem, middle- and long-latency components. Hear. Res. 66, 91–98. doi: 10.1016/0378-5955(93)90263-Z
Meinong, A., and Witasek, S. (1897). Zur Experimentallen bestimmung der tonverschmelzungsgrade. Z. Psychol. Physiol. Sinnesorgane 15, 189–205.
Minati, L., Rosazza, C., D’Incerti, L., Pietrocini, E., Valentini, L., Scaioli, V., et al. (2009). Functional MRI/event-related potential study of sensory consonance and dissonance in musicians and nonmusicians. Neuroreport 20, 87–92. doi: 10.1097/WNR.0b013e328330c751
Moreno, S., Marques, C., Santos, A., Santos, M., Castro, S. L., and Besson, M. (2009). Musical training influences linguistic abilities in 8-year-old children: more evidence for brain plasticity. Cereb. Cortex 19, 712–723. doi: 10.1093/cercor/bhn120
Musacchia, G., Sams, M., Skoe, E., and Kraus, N. (2007). Musicians have enhanced subcortical auditory and audiovisual processing of speech and music. Proc. Natl. Acad. Sci. U.S.A. 104, 15894–15898. doi: 10.1073/pnas.0701498104
Näätänen, R., and Picton, T. (1987). The N1 wave of the human electric and magnetic response to sound: a review and an analysis of the component structure. Psychophysiology 24, 375–425. doi: 10.1111/j.1469-8986.1987.tb00311.x
Pear, T. H. (1911). Differences between major and minor chords. Br. J. Psychol. 4, 56–94.
Peretz, I., Brattico, E., Jarvenpaa, M., and Tervaniemi, M. (2009). The amusic brain: in tune, out of key, and unaware. Brain 132, 1277–1286. doi: 10.1093/brain/awp055
Picton, T. W., Alain, C., Woods, D. L., John, M. S., Scherg, M., Valdes-Sosa, P., et al. (1999). Intracerebral sources of human auditory-evoked potentials. Audiol. Neurootol. 4, 64–79. doi: 10.1159/000013823
Plomp, R., and Levelt, W. J. (1965). Tonal consonance and critical bandwidth. J. Acoust. Soc. Am. 38, 548–560. doi: 10.1121/1.1909741
Rameau, J.-P. (1722/1971). Treatise on Harmony, trans. P. Gossett. New York: Dover Publications, Inc. [Original work published in 1722].
Regnault, P., Bigand, E., and Besson, M. (2001). Different brain mechanisms mediate sensitivity to sensory consonance and harmonic context: evidence from auditory event-related brain potentials. J. Cogn. Neurosci. 13, 241–255. doi: 10.1162/089892901564298
Rhode, W. S. (1995). Interspike intervals as a correlate of periodicity pitch in cat cochlear nucleus. J. Acoust. Soc. Am. 97, 2414–2429. doi: 10.1121/1.411963
Roberts, L. (1986). Consonant judgments of musical chords by musicians and untrained listeners. Acustica 62, 163–171.
Schellenberg, E. G., and Trainor, L. J. (1996). Sensory consonance and the perceptual similarity of complex-tone harmonic intervals: tests of adult and infant listeners. J. Acoust. Soc. Am. 100, 3321–3328. doi: 10.1121/1.417355
Schellenberg, E. G., and Trehub, S. E. (1994). Frequency ratios and the perception of tone patterns. Psychon. Bull. Rev. 1, 191–201. doi: 10.3758/BF03200773
Scherg, M., Vajsar, J., and Picton, T. W. (1989). A source analysis of the late human auditory evoked potentials. J. Cogn. Neurosci. 1, 336–355. doi: 10.1162/jocn.1989.1.4.336
Schön, D., Regnault, P., Ystad, S., and Besson, M. (2005). Sensory consonance: an ERP study. Music Percept. 23, 105–117. doi: 10.1525/mp.2005.23.2.105
Schwartz, D. A., Howe, C. Q., and Purves, D. (2003). The statistical structure of human speech sounds predicts musical universals. J. Neurosci. 23, 7160–7168.
Sethares, W. A. (1993). Local consonance and the relationship between timbre and scale. J. Acoust. Soc. Am. 94, 1218–1228. doi: 10.1121/1.408175
Skoe, E., and Kraus, N. (2010). Auditory brain stem response to complex sounds: a tutorial. Ear Hear. 31, 302–324. doi: 10.1097/AUD.0b013e3181cdb272
Slaymaker, F. H. (1970). Chords from tones having stretched partials. J. Acoust. Soc. Am. 47, 1569–1571. doi: 10.1121/1.1974599
Smith, J. C., Marsh, J. T., and Brown, W. S. (1975). Far-field recorded frequency-following responses: evidence for the locus of brainstem sources. Electroencephalogr. Clin. Neurophysiol. 39, 465–472. doi: 10.1016/0013-4694(75)90047-4
Sohmer, H., Pratt, H., and Kinarti, R. (1977). Sources of frequency-following responses (FFR) in man. Electroencephalogr. Clin. Neurophysiol. 42, 656–664. doi: 10.1016/0013-4694(77)90282-6
Stumpf, C. (1890). Tonpsychologie [Tone Psychology]. Leipzig: Hirzel.
Stumpf, C. (1989). Konsonanz and dissonanz. Beitr. Akust. Musikwiss. 1, 91–107.
Sugimoto, T., Kobayashi, H., Nobuyoshi, N., Kiriyama, Y., Takeshita, H., Nakamura, T., et al. (2010). Preference for consonant music over dissonant music by an infant chimpanzee. Primates 51, 7–12. doi: 10.1007/s10329-009-0160-3
Tenney, J. (1988). A History of “Consonance” and “Dissonance”. New York: Excelsior.
Terhardt, E. (1974). On the perception of periodic sound fluctuations (roughness). Acustica 20, 215–224.
Terhardt, E., Stoll, G., and Seewann, M. (1982). Algorithm for the extraction of pitch and pitch salience from complex tonal signals. J. Acoust. Soc. Am. 71, 679–687. doi: 10.1121/1.387543
Trainor, L. (2008). Science and music: the neural roots of music. Nature 453, 598–599. doi: 10.1038/453598a
Trainor, L., Tsang, C., and Cheung, V. (2002). Preference for sensory consonance in 2- and 4-month-old infants. Music Percept. 20, 187. doi: 10.1525/mp.2002.20.2.187
Tramo, M. J., Cariani, P. A., Delgutte, B., and Braida, L. D. (2001). Neurobiological foundations for the theory of harmony in western tonal music. Ann. N. Y. Acad. Sci. 930, 92–116. doi: 10.1111/j.1749-6632.2001.tb05727.x
Trehub, S. E., and Hannon, E. E. (2006). Infant music perception: domain-general or domain-specific mechanisms? Cognition 100, 73–99. doi: 10.1016/j.cognition.2005.11.006
Trehub, S. E., Thorpe, L. A., and Trainor, L. J. (1990). Infants’ perception of good and bad melodies. Psychomusicology 9, 5–19. doi: 10.1037/h0094162
Tufts, J. B., Molis, M. R., and Leek, M. R. (2005). Perception of dissonance by people with normal hearing and sensorineural hearing loss. J. Acoust. Soc. Am. 118, 955–967. doi: 10.1121/1.1942347
Van De Geer, J. P., Levelt, W. J. M., and Plomp, R. (1962). The connotation of musical consonance. Acta Psychol. (Amst) 20, 308–319. doi: 10.1016/0001-6918(62)90010-0
Vos, P. G., and Troost, J. M. (1989). Ascending and descending melodic intervals: statistical findings and their perceptual relevance. Music Percept. 6, 383–396. doi: 10.2307/40285439
Wallace, M. N., Rutkowski, R. G., Shackleton, T. M., and Palmer, A. R. (2000). Phase-locked responses to pure tones in guinea pig auditory cortex. Neuroreport 11, 3989–3993. doi: 10.1097/00001756-200012180-00017
Watanabe, S., Uozumi, M., and Tanaka, N. (2005). Discrimination of consonance and dissonance in Java sparrows. Behav. Processes 70, 203–208. doi: 10.1016/j.beproc.2005.06.001
Wong, P. C., Skoe, E., Russo, N. M., Dees, T., and Kraus, N. (2007). Musical experience shapes human brainstem encoding of linguistic pitch patterns. Nat. Neurosci. 10, 420–422.
Zatorre, R., and McGill, J. (2005). Music, the food of neuroscience? Nature 434, 312–315. doi: 10.1038/434312a
Zendel, B. R., and Alain, C. (2013). The influence of lifelong musicianship on neurophysiological measures of concurrent sound segregation. J. Cogn. Neurosci. 25, 503–516. doi: 10.1162/jocn_a_00329
Zilany, M. S., and Bruce, I. C. (2006). Modeling auditory-nerve responses for high sound pressure levels in the normal and impaired auditory periphery. J. Acoust. Soc. Am. 120, 1446–1466. doi: 10.1121/1.2225512
Zilany, M. S., Bruce, I. C., Nelson, P. C., and Carney, L. H. (2009). A phenomenological model of the synapse between the inner hair cell and auditory nerve: long-term adaptation with power-law dynamics. J. Acoust. Soc. Am. 126, 2390–2412. doi: 10.1121/1.3238250
Summary
Keywords
musical pitch perception, consonance and dissonance, tonality, auditory event-related potentials, brainstem response, frequency-following response (FFR), musical training, auditory nerve
Citation
Bidelman GM (2013) The Role of the Auditory Brainstem in Processing Musically Relevant Pitch. Front. Psychol. 4:264. doi: 10.3389/fpsyg.2013.00264
Received
13 March 2013
Accepted
23 April 2013
Published
13 May 2013
Volume
4 - 2013
Edited by
Robert J. Zatorre, McGill University, Canada
Reviewed by
Erin Hannon, University of Nevada, USA; Chris Plack, The University of Manchester, UK
Copyright
© 2013 Bidelman.
This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in other forums, provided the original authors and source are credited and subject to any copyright notices concerning any third-party graphics etc.
*Correspondence: Gavin M. Bidelman, School of Communication Sciences and Disorders, University of Memphis, 807 Jefferson Avenue, Memphis, TN 38105, USA. e-mail: g.bidelman@memphis.edu
This article was submitted to Frontiers in Auditory Cognitive Neuroscience, a specialty of Frontiers in Psychology.