Enhanced representation of spectral contrasts in the primary auditory cortex

The role of early auditory processing may be to extract elementary features from an acoustic mixture in order to organize the auditory scene. To accomplish this task, the central auditory system may rely on the fact that sensory objects are often composed of spectral edges, i.e., regions where the stimulus energy changes abruptly over frequency. The processing of acoustic stimuli may therefore benefit from a mechanism enhancing the internal representation of spectral edges. While the visual system is thought to rely heavily on such a mechanism (enhancing spatial edges), it is still unclear whether a related process plays a significant role in audition. We investigated the cortical representation of spectral edges, using acoustic stimuli composed of multi-tone pips whose time-averaged spectral envelope contained suppressed or enhanced regions. Importantly, the stimuli were designed such that neural response properties could be assessed as a function of stimulus frequency during stimulus presentation. Our results suggest that the representation of acoustic spectral edges is enhanced in the auditory cortex, and that this enhancement is sensitive to the characteristics of the spectral contrast profile, such as depth, sharpness and width. Spectral edges are maximally enhanced for sharp contrast and large depth. Cortical activity was also suppressed at frequencies within the suppressed region. Notably, the suppression of firing was larger at frequencies near the lower edge of the suppressed region than at the upper edge. Overall, the present study gives critical insights into the processing of spectral contrasts in the auditory system.


INTRODUCTION
The main goal of the central auditory system is to organize the acoustic environment into a coherent auditory scene, namely to detect, localize, discriminate, segregate and identify the multiple sources composing a sound mixture (Bregman, 1990; Darwin, 1997; Griffiths and Warren, 2004; Shamma and Micheyl, 2010). At the initial steps of processing, the auditory system can rely on the fact that sensory objects are composed of spectral cues such as spectral edges or contours where the stimulus energy reaches a maximum or changes abruptly over frequency (Moore and Glasberg, 1983; Assmann and Summerfield, 2004; Palmer and Shamma, 2004). While spectral peaks produced by vocal tract resonances are known to play an important role in identifying conspecific vocalizations, in speech for example (Darwin, 1984; Assmann and Nearey, 1987; Henry et al., 2005), spectral troughs or notches produced by the head-related transfer function are also recognized as being critical for localizing sound sources on the sagittal plane (Carlile et al., 2005; Grothe et al., 2010).
A critical issue in auditory neuroscience is how the central auditory system represents acoustic stimuli, in particular the frequency-specific information that is critical for organizing the auditory scene. One possibility is that the central representation mirrors that found in the peripheral cochlear nerve, whereby peaks and troughs in the spectral envelope of the acoustic stimulus could be represented by peaks and troughs in the firing rate of neurons along the tonotopic axis (Sachs and Young, 1979; Blackburn and Sachs, 1990; Silkes and Geisler, 1991; Poon and Brugge, 1993; Conley and Keilson, 1995; May et al., 1998; Recio and Rhode, 2000). However, this peripheral "rate-place" representation has significant limitations. First, the rate-place representation strongly depends on the frequency resolution of the auditory system. In particular, the spectral decomposition carried out by the cochlea tends to smooth the internal representation of the spectral envelope of complex acoustic stimuli (Moore and Glasberg, 1987; Baer et al., 1993). Second, the rate-place representation of the spectral envelope in the cochlear nerve is degraded at high levels of stimulation, where the firing rate of cochlear neurons tends to saturate and/or the auditory filters broaden (Sachs and Young, 1979; Glasberg and Moore, 2000; Palmer and Shamma, 2004; Oxenham and Simonson, 2006). Finally, the peak-to-valley ratio of the rate-place representation is further decreased by the presence of background noise, which fills in the spectral valleys (Baer et al., 1993; Assmann and Summerfield, 2004).
The limitations of the cochlear nerve representation suggest that the central auditory system may have developed a strategy to overcome them, in particular to enhance the representation of spectral edges and spectral contrasts (energy ratios between adjacent peaks and valleys). The visual system, for instance, is thought to rely heavily on this mechanism. This is suggested, in particular, by the phenomenon of "Mach bands", which refers to illusory bands perceived at the spatial boundaries where the stimulus luminance changes abruptly over space (Von Bekesy, 1967, 1969a). While there have been some attempts to investigate this issue in audition (Von Bekesy, 1967; Carterette et al., 1969; Houtgast, 1972), it is still unclear whether a related process plays a significant role in this modality. Interestingly, however, some psychoacoustic phenomena are consistent with a mechanism enhancing spectral edges. For example, neural enhancement at spectral edges may account for the pitch induced by noise bands at their spectral edges (Small and Daniloff, 1967; Bilsen, 1977), and for the dominant role played in pitch perception by the lowest and highest partials of a harmonic complex, especially when the low-numbered (resolved) partials are removed from the complex (Dai, 2000; Moore and Gockel, 2011).
The aim of the present study is to investigate the sensory representation of the stimulus spectrum in auditory cortex, and in particular whether the representation of spectral edges is enhanced. This was accomplished by employing acoustic stimuli composed of multiple pure tones of various frequencies presented randomly over time. These stimuli can be thought of as mimicking acoustic environments with different spectral profiles when time-averaged over a few hundreds of milliseconds. Importantly, the fact that this particular stimulus was composed of a mixture of tone pips with non-synchronous onsets allowed for estimating the spectro-temporal receptive fields of cortical neurons for different time-averaged spectral envelopes (deCharms et al., 1998; Blake and Merzenich, 2002; Valentine and Eggermont, 2004; Norena et al., 2008). The present study extends earlier work in which the dependence of cortical neurons on the spectro-temporal acoustic context was investigated (Gourévitch et al., 2009).

ANIMAL PREPARATION
The care and use of animals used in this study were approved by the Animal Care Committee of Bouches-du-Rhône, France (# A 13-504). Ten guinea pigs weighing between 300 and 800 g were used for this study. All animals were deeply anesthetized with 50 mg/kg of ketamine hydrochloride (Imalgene 1000) and 3 mg/kg of xylazine (Rompun 2%), injected intramuscularly; 0.1 ml of atropine methyl nitrate and an analgesic (Tolfedine) were also administered. Throughout the experiment, anesthesia was maintained with half the dose of ketamine and xylazine administered every hour. The tissue overlying the frontal lobe was opened and two screws were fixed to the top of the skull (on the antero-posterior axis) with dental cement, and used to fixate the animal's head. The tissue overlying the right or left side of the skull above the temporal lobe was removed. The skull was opened and the dura was cut back to expose the primary auditory cortex (AI). We used the location of the electrodes (Wallace et al., 2000) as well as the characteristic frequency of the neurons to ensure that the electrodes were located in AI (i.e., progression of best frequencies across electrodes). The body temperature was maintained at 37 °C with a thermostatically controlled heating blanket. After the experiment, a lethal dose of sodium pentobarbital was administered.

Acoustic stimulation
Stimuli were generated in MATLAB and transferred to an RP2.1-based sound delivery system (Tucker Davis Technologies). Acoustic stimuli were presented in a sound booth room from a headphone (Sennheiser HD595) placed 10 cm in front of the ear contralateral to the cortex where the recordings were carried out. The amplitude of each tone pip was adjusted to the transfer function of the sound delivery system so that they were presented at the desired level in dB SPL.
Spectro-Temporal Receptive Fields (STRFs) were obtained from a 180-s multi-tone pip stimulus (Figure 1F) (deCharms et al., 1998; Blake and Merzenich, 2002; Valentine and Eggermont, 2004; Norena et al., 2008; Gourévitch et al., 2009). Tone pips (49 frequencies, 8 frequencies per octave covering 6 octaves) were presented randomly over time (independent Poisson process for each frequency with a rate of 2 Hz and a 50-ms dead time designed so that tones of the same frequency did not overlap in time). Tone pips of different frequencies could overlap in time. The envelope of the tone pips is given by $\gamma(t) = (t/4)^{2} e^{-t/4}$ with t in milliseconds (stimulus duration is 50 ms, maximum amplitude is reached at 8 ms). The average rate of tone pip presentation was around 16 Hz/octave (8 tone frequencies per octave, each presented at an average rate of 2 Hz). Control STRFs were obtained from multi-tone stimuli with tone pips presented at 70 dB SPL (ctrl-70) or 40 dB SPL (ctrl-40) (Figures 1A,B). In the attenuated frequency band (AFB) conditions, all pure tones were presented at 70 dB SPL, except those within the AFB, where pure tones were omitted or presented at 40 dB SPL, producing a large or moderate spectral contrast, respectively (Figures 1E and C,D). The frequencies immediately outside of the AFB were called the edge-out frequencies, while the frequencies immediately inside of the AFB were called the edge-in frequencies (Figure 1C). The bandwidth of the AFB was varied (0.5, 1, and 2 octaves). The slope of the spectral contrast (transition in dB/oct between the edge-in frequency and the edge-out frequency) was 240 dB/oct in all conditions (namely a 30-dB difference between the edge-in frequency and the edge-out frequency, which are 1/8 octave apart), except in one AFB condition (with 1-oct bandwidth) where the slope was 80 dB/oct until the level of the tone pip frequencies around the center frequency of the AFB was 40 dB (Figures 1C,D).
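As an illustration of the stimulus statistics described above, the following minimal Python sketch draws pip onset times for one frequency and evaluates the pip envelope. It is not the authors' MATLAB code. The lowest frequency (500 Hz), the function names, and the way the dead time is applied (a 50-ms refractory period added after each onset, which lowers the effective rate slightly below the nominal 2 Hz) are assumptions made for illustration only.

```python
import math
import random

N_FREQS = 49          # 8 per octave over 6 octaves, both end points included
PIPS_PER_OCTAVE = 8
RATE_HZ = 2.0         # nominal Poisson rate per frequency
DEAD_TIME_S = 0.050   # pips of the same frequency may not overlap
DURATION_S = 180.0
F_MIN_HZ = 500.0      # ASSUMPTION: lowest frequency, not stated for this stimulus


def pip_frequencies(f_min=F_MIN_HZ, n=N_FREQS, per_octave=PIPS_PER_OCTAVE):
    """Frequencies spaced by 1/per_octave octave, starting at f_min."""
    return [f_min * 2.0 ** (k / per_octave) for k in range(n)]


def gamma_envelope(t_ms):
    """Pip envelope gamma(t) = (t/4)^2 * exp(-t/4), with t in milliseconds."""
    if t_ms < 0:
        return 0.0
    return (t_ms / 4.0) ** 2 * math.exp(-t_ms / 4.0)


def onset_times(rate_hz=RATE_HZ, dead_time_s=DEAD_TIME_S,
                duration_s=DURATION_S, rng=None):
    """Onset times (s) for one frequency: exponential gaps plus a dead time.

    ASSUMPTION: the dead time is implemented as a refractory period that
    follows every onset; the source does not specify the exact scheme.
    """
    rng = rng or random.Random(0)
    times, t = [], 0.0
    while True:
        t += rng.expovariate(rate_hz)   # exponential waiting time
        if t >= duration_s:
            break
        times.append(t)
        t += dead_time_s                # enforce the 50-ms dead time
    return times
```

Calling `onset_times` once per entry of `pip_frequencies()`, each with its own random generator, would give the independent per-frequency processes described in the text.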
The center frequency of the AFB was set as follows. First, the BF for each cortical site was derived from the control stimulus (ctrl-70). The center frequency of the AFB was then set to the BF of a given cortical site. Cortical responses were obtained for all stimulus conditions (different widths, slopes and depths) for that specific center frequency of the AFB. Once a set of recordings was completed, another set of recordings was carried out with a different AFB stimulus (centered on the BF of another cortical site), and so on for all cortical sites with a significant STRF (see below). Note that, as we recorded from many cortical sites simultaneously, the BFs could correspond to the center frequency of the AFB, one edge frequency of the AFB, or a frequency remote from the AFB. An additional stimulus condition was investigated, which consisted of multi-tone pips where all pure tones were presented at 40 dB SPL, except at one frequency which was presented at 70 dB SPL (Figure 1G). Some example sound files are provided in the supplemental material.
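The per-frequency level profile of an AFB stimulus centered on a given BF can be sketched as follows. This is a hedged illustration, not the authors' stimulus code: the function name, the use of `None` for omitted tones, and the convention that the AFB spans the center index plus or minus half the bandwidth (in 1/8-octave steps) are assumptions.

```python
CTRL_LEVEL_DB = 70.0    # level outside the AFB
FLOOR_LEVEL_DB = 40.0   # lowest level inside a partially attenuated band


def afb_levels(n_freqs, center_idx, width_oct, slope_db_per_oct=240.0,
               omit=False, per_octave=8):
    """Return one level (dB SPL) per tone frequency; None marks an omitted tone.

    ASSUMPTION: a frequency is inside the AFB when its index lies within
    width_oct/2 octaves of center_idx (end points included).
    """
    half_steps = round(width_oct * per_octave / 2)
    step_db = slope_db_per_oct / per_octave   # level drop per 1/8-octave step
    levels = []
    for k in range(n_freqs):
        dist = abs(k - center_idx)
        if dist > half_steps:
            levels.append(CTRL_LEVEL_DB)      # outside the band
        elif omit:
            levels.append(None)               # fully attenuated band
        else:
            steps_inside = half_steps - dist + 1   # steps from the edge-out tone
            levels.append(max(FLOOR_LEVEL_DB,
                              CTRL_LEVEL_DB - step_db * steps_inside))
    return levels
```

With the default 240 dB/oct slope the level falls by 30 dB in a single 1/8-octave step, whereas an 80 dB/oct slope lowers the level by 10 dB per step until 40 dB is reached.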

MUA and LFP recording procedure
Each set of recordings was obtained with one array of 16 electrodes (Alpha-Omega Eng, Nazareth, Israel) arranged in an 8 by 2 pattern with 0.25 mm electrode separation within the long row and 0.5 mm separation between rows. The electrodes had impedances between 0.8 and 1.4 MΩ. The array was manually advanced using a Narishige microdrive into primary auditory cortex (according to the location provided by Wallace et al., 2000). The signals were then amplified 10,000 times with filter cutoff frequencies set at 2 Hz and 5 kHz. The amplified signals were processed by a TDT System 3 multichannel data acquisition system. Multi-unit activity (MUA) was sampled at 24,414 Hz and was extracted from the 300-Hz high-pass filtered signal. Local field potentials (LFPs) were sampled at 1061 Hz and were extracted from the 300-Hz low-pass filtered signal. In this way, we were able to record spikes and LFPs simultaneously.
At an initial stage of the experiments, a "search procedure" was used, which consisted of recording cortical activity induced by clicks, noise bursts and tone pips (from 500 Hz to 32 kHz, 1/8 octave step). This procedure provided a rough estimate of the tonotopy and the amplitude of LFPs. Moreover, electrodes were placed at a depth where the (negative) amplitude of stimulus-induced LFPs was near maximal (region of the border between layers III and IV; Szymanski et al., 2011).

Data analysis
All results were computed using custom MATLAB routines. Multi-unit activity or "spike events" were detected by using an amplitude threshold on the high-pass filtered data. The median was calculated on the negative values of the filtered signal; the threshold was then set to six times the median (see Quiroga et al., 2004 for a similar method). The spike waveforms were inspected visually throughout the experiments to ensure that they had a typical shape; insets in Figures 2, 3 show the typical shape of multi-unit activity.
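The thresholding rule described above can be sketched in a few lines of Python. This is an illustrative reading of the text rather than the authors' MATLAB routine; the function name and the choice to mark one event per downward threshold crossing are assumptions.

```python
import statistics


def detect_spike_events(signal, factor=6.0):
    """Return (event_indices, threshold) for a high-pass filtered trace.

    The threshold is `factor` times the median of the negative samples,
    so it is itself negative. ASSUMPTION: an event is the first sample of
    each excursion below the threshold.
    """
    negatives = [x for x in signal if x < 0]
    if not negatives:
        return [], None
    threshold = factor * statistics.median(negatives)
    events = []
    below = False
    for i, x in enumerate(signal):
        crossing = x < threshold
        if crossing and not below:
            events.append(i)       # start of a new sub-threshold excursion
        below = crossing
    return events, threshold
```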
FIGURE 2 | Neural tuning of individual recordings obtained from one selected example at a given location in primary auditory cortex. First row: long-term spectrum of all stimulus conditions. Each column corresponds to a stimulus condition. Second and third rows: STRFs derived from MUA and LFPs, respectively. Horizontal dotted lines represent the edge frequencies of the notch. Inset in the first column and second row represents the averaged waveform (±2 standard deviations) for the MUA at that particular location. Fourth and fifth rows: frequency "profile" obtained by taking the maximum firing rate of MUA, or by taking the minimal amplitude of LFPs, in the 10-40 ms time window, respectively. Vertical dotted lines represent the edge frequencies of the AFB. In order to permit direct comparison of neural responses between control and AFB conditions (red lines), the ctrl-70 (black lines) and ctrl-40 (dotted lines) conditions are added to each figure. Neural responses are greatly enhanced at the edges of the notch and decreased within the notch. The largest increase of responses is observed for the sharp contrast and the full-notch conditions (seventh column).
The methodology for computing STRFs was similar to that used in previous studies (Valentine and Eggermont, 2004; Norena et al., 2008). Briefly, STRFs for MUA were determined by constructing poststimulus time histograms (PSTHs), with time bins of 1 ms for each tone pip frequency. In other words, spikes falling in the averaging time window (starting at the stimulus onset and lasting 100 ms) are counted. Because the average interstimulus interval in the stimulus ensemble is smaller than the averaging time window, a spike can be counted in the PSTH of several pip frequencies. STRFs for LFPs were obtained by a similar procedure, except that the LFP waveforms (0-80 ms after stimulus onset) were averaged for each appropriate tone pip frequency.
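The PSTH-based STRF estimate for MUA can be illustrated with the following Python sketch. It is a simplified stand-in for the authors' procedure: the function name, the input layout (a list of `(onset_ms, freq_index)` pairs), and the normalization to spikes per pip per bin are assumptions.

```python
import bisect


def compute_strf(spike_times_ms, pip_onsets, n_freqs, window_ms=100, bin_ms=1):
    """Build one PSTH per tone frequency.

    pip_onsets: iterable of (onset_ms, freq_index) pairs.
    Returns a matrix strf[freq][bin] in spikes per pip per bin.
    A spike is counted for every pip whose window contains it, so it can
    contribute to the PSTHs of several frequencies.
    """
    n_bins = window_ms // bin_ms
    counts = [[0] * n_bins for _ in range(n_freqs)]
    n_pips = [0] * n_freqs
    spikes = sorted(spike_times_ms)
    for onset, f in pip_onsets:
        n_pips[f] += 1
        lo = bisect.bisect_left(spikes, onset)
        hi = bisect.bisect_left(spikes, onset + window_ms)
        for s in spikes[lo:hi]:
            # Guard against float rounding at the upper window border.
            b = min(n_bins - 1, int((s - onset) // bin_ms))
            counts[f][b] += 1
    return [[c / n_pips[f] if n_pips[f] else 0.0 for c in counts[f]]
            for f in range(n_freqs)]
```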
The maximal MUA response (or the minimal LFP amplitude) within the 10-40 ms time window after stimulus onset and over all frequencies was obtained from the ctrl-70 STRF. All STRFs (including those obtained from the ctrl-70 condition) are then normalized by dividing the mean neural activity by this single value. This normalization was aimed at minimizing the firing rate variability across recording sites. By definition, the maximum neural activity for the ctrl-70 condition was 1 (at the best frequency), and usually lower than 1 for the ctrl-40 condition. One notes that values above 1 are sometimes observed in the AFB conditions (i.e., at the edge frequencies of the AFB); this indicates that the maximum of absolute firing rate in the AFB conditions is larger than the maximum of absolute firing rate in the ctrl-70 condition (see Figures 2, 3). This normalized mean neural activity is the dependent variable displayed in the STRFs (Figures 2, 3).
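The normalization step can be sketched as follows for MUA. The function names and the dictionary layout (condition name mapped to a frequency-by-time matrix with 1-ms bins) are assumptions for illustration; for LFPs the same idea would apply with the minimal amplitude in place of the maximum.

```python
def window_peak(strf, t_start=10, t_stop=40):
    """Largest value over all frequencies within bins [t_start, t_stop)."""
    return max(max(row[t_start:t_stop]) for row in strf)


def normalize_strfs(strfs, reference="ctrl-70"):
    """Divide every STRF by the 10-40 ms peak of the reference condition."""
    ref_peak = window_peak(strfs[reference])
    if ref_peak <= 0:
        raise ValueError("reference STRF has no positive response in the window")
    return {name: [[v / ref_peak for v in row] for row in strf]
            for name, strf in strfs.items()}
```

After this step the ctrl-70 peak equals 1 by construction, and values above 1 in another condition indicate a response larger than the ctrl-70 maximum.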
In order to compare the STRFs obtained from control and stimuli producing an AFB (and for display purpose), the differences between their frequency profiles were computed. The frequency profiles were obtained from the normalized STRFs by taking the maximum neural activity within a time window of 10-40 ms for each tone pip frequency. For the frequencies outside the AFB, which were presented at 70 dB, the responses were compared to the corresponding frequencies obtained from the ctrl-70. For the frequencies inside the AFB, which were presented at 40 dB, the responses were compared to the corresponding frequencies obtained from the ctrl-40.
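A minimal sketch of the frequency profile and of the AFB-minus-control difference is given below. The function names and the boolean mask `inside_afb` (one entry per tone frequency) are assumptions, not part of the original analysis code.

```python
def frequency_profile(strf, t_start=10, t_stop=40):
    """Maximum normalized activity per tone frequency in bins [t_start, t_stop)."""
    return [max(row[t_start:t_stop]) for row in strf]


def profile_difference(afb_profile, ctrl70_profile, ctrl40_profile, inside_afb):
    """AFB profile minus the level-matched control profile.

    Frequencies inside the AFB (presented at 40 dB) are compared with
    ctrl-40; frequencies outside (presented at 70 dB) with ctrl-70.
    """
    return [a - (c40 if inside else c70)
            for a, c70, c40, inside in zip(afb_profile, ctrl70_profile,
                                           ctrl40_profile, inside_afb)]
```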
FIGURE 3 | Neural tuning of individual recordings obtained from one selected example at a given location in primary auditory cortex. Otherwise as Figure 2. Note that the neural receptive field is unchanged when the notch is at remote frequencies.
Finally, the patterns of excitation, namely the neural population activity over the tonotopic axis, were obtained for the AFB conditions: for each tone pip frequency, the averaged normalized activity is derived for all MUA. Then, instead of plotting neural activity (for a given cortical site) as a function of stimulus frequency, neural activity evoked by a given frequency is plotted as a function of the best frequency of neurons (obtained from many cortical sites). The pattern of excitation could be obtained and plotted for each tone pip frequency. Assuming that auditory information is represented as a "rate-place" code in the auditory cortex, the pattern of excitation may be closer to what downstream neurons read out during stimulation. In other words, the cortex may not directly detect changes in the neurons' best frequencies but rather read out the population activity, namely neural activity along the tonotopic axis. We were particularly interested in the pattern of excitation of the edge-in and edge-out frequencies of the AFB, as we suspected that the pattern of excitation would be modified at these frequencies in the AFB conditions compared to the control conditions.
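The change of axis that turns per-site frequency profiles into a pattern of excitation can be sketched as follows. The function name, the indexing of BFs on the same 1/8-octave grid as the tone pips, and the use of `None` for BFs with no recording are assumptions.

```python
def excitation_pattern(profiles, bf_indices, stim_freq_idx, n_freqs):
    """Average activity evoked by one stimulus frequency, as a function of BF.

    profiles: one normalized frequency profile per recording site.
    bf_indices: index of each site's best frequency on the tone-pip grid.
    Returns a list of length n_freqs (None where no site has that BF).
    """
    sums = [0.0] * n_freqs
    counts = [0] * n_freqs
    for profile, bf in zip(profiles, bf_indices):
        sums[bf] += profile[stim_freq_idx]   # response of this site to the tone
        counts[bf] += 1
    return [s / c if c else None for s, c in zip(sums, counts)]
```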

Statistics
Before applying any statistical test, we first verified the normality of the distributions in order to determine whether parametric tests could be used. As all distributions were normal, we used the parametric Student's t-test to compare two distributions or one distribution against zero. Significance thresholds were adjusted according to the number of comparisons (Bonferroni correction). First, the group analysis was carried out on sites with "significant" STRFs for the ctrl-70 condition: the maximal response of the ctrl-70 STRFs within the 10-40 ms time window had to be significantly larger than the "background activity" (computed from the neural activity over the 49 frequencies and 100 time bins, so 4900 values). As 1470 comparisons were made (30 time bins × 49 frequencies), the significance threshold was adjusted accordingly (Bonferroni correction, p = 0.05/1470). Second, all other comparisons between data sets and a reference value (or between data sets) were also Bonferroni corrected when needed. For instance, when the differences in firing rate between control and AFB conditions were compared to 0 for 32 different frequencies (±2 octaves on either side of the AFB center), the significance threshold was adjusted accordingly (p = 0.05/32). As the AFB was usually centered on a neuron's BF, the number of recordings was larger for sites with BF near the center frequency of the AFB than at remote frequencies. Overall, the number of recordings as a function of the distance from the center frequency of the AFB ranged from 24 (BFs remote from AFB center) to 117 (BFs at or near the AFB center).
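The two Bonferroni-adjusted thresholds quoted above follow directly from dividing the nominal level by the number of comparisons, as the short Python sketch below shows (the function name is an assumption).

```python
def bonferroni_threshold(alpha, n_comparisons):
    """Per-comparison significance threshold under Bonferroni correction."""
    if n_comparisons < 1:
        raise ValueError("n_comparisons must be at least 1")
    return alpha / n_comparisons


# STRF significance: 30 time bins (10-40 ms) times 49 frequencies.
strf_threshold = bonferroni_threshold(0.05, 30 * 49)

# Frequency-profile differences: 32 frequencies around the AFB center.
profile_threshold = bonferroni_threshold(0.05, 32)
```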

RESULTS
The aim of the present study was to investigate the cortical representation of spectral edges in auditory cortex. A total of 317 multi-unit activity (MUA) recordings were obtained from the primary auditory cortex of 10 anesthetized guinea pigs. The median for the distribution of best frequencies derived from the STRFs was 11,314 Hz (lower and upper quartiles were 7336 Hz and 20,749 Hz, respectively).
Frontiers in Systems Neuroscience www.frontiersin.org June 2013 | Volume 7 | Article 21 | 5

CORTICAL REPRESENTATION OF A BROAD-BAND STIMULUS WITH AN ATTENUATED FREQUENCY BAND (AFB)
Here, we investigated the cortical representation of the frequencies composing a multi-tone stimulus. In particular, we focused on the representation of spectral edges (edge-in and edge-out frequencies) of the AFB. As we were also interested in studying whether the representation of the edge frequencies is sensitive to their local acoustic context (the spectral shape around the edge frequencies), the width, depth and sharpness of the AFB were varied.

INDIVIDUAL EXAMPLES
Figure 2 depicts a representative example of MUA and LFP responses obtained for the different conditions of multi tone stimuli (the long-term frequency spectrum of the stimuli is shown in the first row). For this example, the center frequency of the AFB was chosen to correspond to the best frequency (BF) of the MUA (around 2348 Hz). When comparing the responses at the edge-out frequencies with those obtained from the 70-dB control condition, one observes a dramatic increase. Remarkably, there was a clear neural response (in terms of multi-unit activity) at the upper edge-out frequencies for the 2-octave AFB condition (column 5), even though both spectral edges fall outside of the MUA receptive field recorded in control conditions. The increase of responses at the edge-out frequencies was larger for the sharp contrast (compare columns 4 and 6) and for the large contrast (compare columns 4 and 7) conditions. On the other hand, when comparing the responses within the AFB with the 40 dB control condition, one observes a dramatic decrease of responses in all conditions, especially for edge-in frequencies. Figure 3 shows an additional example, where the BF (4362 Hz) of MUA was almost 1 octave above the center of the AFB (2378 Hz). This example illustrates that responses are not modified when the frequency range of the AFB is far away from the MUA's receptive field (column 3). However, neural responses were broadly increased when the upper edge of the AFB was near (columns 4 and 7) or overlapped with the receptive field (column 5). Once again, this example shows that the neural enhancement is larger for the conditions with sharp (compare columns 4 and 6) and large (compare columns 4 and 7) spectral contrast.

GROUP DATA
The frequency profiles averaged over the recordings where the BF corresponded to the center of the AFB are shown in Figure 4. On average, the cortical responses are greatly enhanced at both the upper and lower edge-out frequencies, and decreased within the frequency range of the AFB. The enhancement of responses at both upper and lower edge-out frequencies was maximal for the sharp and the large contrast conditions. We next computed the difference between the frequency profiles of neural responses obtained from the AFB stimuli and those obtained from the control stimuli (see methods). This comparison was carried out for three specific positions of BF relative to the center of the AFB: when BF corresponded to the lower edge (±1/8 octave), the upper edge (±1/8 octave) and the center of the AFB (±1/8 octave). The average differences between the frequency profiles of AFB and control stimuli for these three positions of BF are shown in Figure 5. The effects of the AFB stimuli relative to the control stimuli were tested statistically for both MUA and LFPs. As the results were generally not different between MUA and LFPs, we do not discriminate between these two signals in the rest of the manuscript. In other words, when a statistical difference is reported, it applies to both MUA and LFPs.
The neural enhancement for the edge-out frequencies was statistically significant for all widths of AFB and for the three positions of BF relative to the center of the AFB (p < 0.0014). It is worth mentioning that cortical responses were increased by about 70% for the fully AFB condition (when BF corresponded to either the lower or the upper spectral edge). Moreover, the neural enhancement for the edge-out frequencies was sensitive to the sharpness and the depth of the contrast. Indeed, cortical responses at edge-out frequencies were significantly larger for the sharp contrast condition (vs. the shallow-slope condition, p < 0.05) and for the 1-octave large contrast condition (vs. the 1-octave moderate contrast condition, p < 0.05), for all three positions of BF. The enhancement at the edge-out frequencies was also sensitive to the width of the AFB; indeed, the enhancement was smaller for the 0.5-octave condition compared to the 1- and 2-octave conditions (p < 0.05; the 1- and 2-octave conditions were not different from each other).
Besides the neural enhancement at the edge-out frequencies, there was a significant suppression of responses at the edge-in frequencies for the three positions of BF and for all notch widths (p < 0.0014, except for the 2-octave condition when BF was positioned at the upper edge frequency of the AFB). Interestingly, this neural suppression at edge-in frequencies was asymmetric for the 2-octave condition: the suppression was stronger when BF corresponded to the lower edge of the AFB than when BF corresponded to the upper edge of the AFB (p < 0.05). Finally, the suppression at frequencies around the center of the AFB was largest for the 0.5-octave condition, when the BF corresponded to the center of the AFB.
While we found, on average, a clear (and significant) effect of the AFB on neural responses at the edge-in (suppression) and edge-out (enhancement) frequencies (Figures 4, 5), the prevalence of these changes, namely whether they concern a majority of recording sites or not, remained to be established. The percentages of recording sites showing at least a 20% increase or decrease as a function of frequency for the three positions of BF relative to the center of the AFB are shown in Figure 6. Nearly 90% of the recording sites showed an increase of neural responses at the lower and upper edge-out frequencies, while a decrease of neural responses at edge-in frequencies was observed in around 60% of the recordings. This suggests that the cortical changes induced by the notched stimuli are very systematic. It is also worth mentioning that while the percentage of sites showing an increase at the edge-out frequencies is similar whether the BF corresponded to the lower edge or the upper edge of the AFB, the percentage of sites showing a decrease at the edge-in frequencies is larger when BF corresponded to the lower edge of the AFB (around 60% of the sites) than when BF corresponded to the upper edge of the AFB (around 20-30% of the sites). This result is consistent with the asymmetry in suppression observed from the averaged data (Figure 5), showing that neural suppression of edge-in frequencies is stronger at the lower edge of the AFB than at the upper edge (Figure 6) (see discussion for putative functional implications).
FIGURE 4 | Average frequency profile for all stimulus conditions, for one BF position relative to the AFB center (neural BF corresponding to the AFB center) for MUA (first row) and LFPs (second row). Frequency profiles are obtained by taking the maximal normalized firing rate (for MUA, or the minimal normalized amplitude for LFPs) between 10 and 40 ms after stimulus onset. Each column corresponds to an AFB condition. Schematics illustrating the long-term spectrum of the acoustic conditions are shown at the top of the figure. First column: half-an-octave partially AFB. Second column: one-octave partially AFB. Third column: two-octave partially AFB. Fourth column: shallow-slope partially AFB. Fifth column: fully AFB. Both control conditions (at 70 dB SPL, continuous black line and at 40 dB SPL, dashed black line) are shown in all panels to permit a direct comparison between control and AFB conditions (red line). Neural responses were greatly enhanced at both edges of the notch, and reduced within the notch. The neural enhancement at both edges of the AFB was sensitive to the sharpness and the depth of the spectral contrast (compare 2nd and 4th columns, and 2nd and 5th columns, respectively).

POPULATION ACTIVITY OVER THE TONOTOPIC AXIS
Thus far, neural data were analyzed with an emphasis on the characteristics of neural tuning. From a neural decoding point of view, on the other hand, a more relevant representation may be the spatio-temporal distribution of population activity. The cortex processes this dynamic and distributed population activity in real time over remote cortical regions. In order to provide a representation of neural activity closer to what may be relevant in the auditory cortex, we derived an excitation pattern (thought to approximate population activity) from MUA and LFP recordings. One notes that this representation was made possible by our matrix electrodes which allowed a relatively dense sampling of cortical responses over the tonotopic axis. The excitation patterns were obtained for each tone pip frequency presented in the multi-tone stimuli. Instead of representing the individual or average activity of cortical neurons as a function of frequency, the average neural activity was represented as a function of neural BF for each given stimulus frequency. This representation gives an estimate of the spatial representation (or population activity) of each tone pip frequency over the tonotopic axis (Figure 7). The resulting excitation patterns in the control condition (stimulus with a flat spectrum) were very homogeneous over frequency and resembled a Gaussian-shaped curve: for a given tone pip frequency, the activity is maximal for neurons with BF corresponding to that frequency (by definition), while neural activity decreases gradually for neurons whose BF is further from that frequency. More interestingly, the excitation patterns obtained from AFB stimuli were very different from those derived from control conditions. At the edge-out frequencies, the excitation patterns were not only increased (in terms of neural response amplitude, as already shown above) but they became broader. 
On the other hand, at the edge-in frequencies, the excitation patterns were decreased in amplitude and became narrower.
In order to investigate whether the AFB stimuli modified the cortical representation of edge-in and edge-out frequencies, the width of the excitation patterns was derived (at the normalized neural activity of 0.2). The respective widths obtained from control and AFB stimuli were then compared statistically (Figure 8). During the stimulation with the AFB stimuli, the representation of the edge-out frequencies was expanded (p < 0.05), while the representation of the edge-in frequencies was narrowed (p < 0.05). These results suggest that the cortical representation of stimulus frequencies (in terms of the amplitude of the response and number of the neurons involved) is highly dynamic and depends heavily on the overall acoustic spectrum or acoustic context.
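The width of an excitation pattern at the 0.2 criterion can be estimated as in the sketch below. The source does not describe how the crossing points were located, so the linear interpolation between neighboring BF bins, the function name, and the assumption of a complete (gap-free) pattern sampled every 1/8 octave are illustrative choices only.

```python
def pattern_width_octaves(pattern, criterion=0.2, per_octave=8):
    """Width (octaves) of the region around the peak that stays >= criterion.

    ASSUMPTION: the criterion crossings on each side of the peak are found
    by linear interpolation between adjacent BF bins.
    """
    peak = max(range(len(pattern)), key=lambda i: pattern[i])
    if pattern[peak] < criterion:
        return 0.0

    def crossing(step):
        i = peak
        # Walk away from the peak while the pattern stays above criterion.
        while 0 <= i + step < len(pattern) and pattern[i + step] >= criterion:
            i += step
        j = i + step
        if not 0 <= j < len(pattern):
            return float(i)   # still above criterion at the array border
        frac = (pattern[i] - criterion) / (pattern[i] - pattern[j])
        return i + step * frac

    return (crossing(+1) - crossing(-1)) / per_octave
```

Applying the function to the control and AFB patterns of the same tone frequency gives the two widths that are compared in the text.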

CORTICAL REPRESENTATION OF A BROAD-BAND STIMULUS WITH AN ENHANCED FREQUENCY (EF)
In order to gain further insight into the properties of the firing rate reduction on either side of a spectral edge (later called "lateral suppression"), in particular its width and asymmetry, an additional experiment was carried out. In this experiment, cortical responses were obtained from a multi-tone stimulus where all pure tones were presented at 40 dB SPL, except at one frequency which was presented at 70 dB SPL; that frequency is referred to as the enhanced frequency (EF). The main purpose of this experiment was to investigate the cortical representation of tone pip frequencies surrounding the EF. Indeed, if a central mechanism exists that sharpens the neural representation of spectral edges, then one expects a decrease of neural responses at frequencies adjacent to the EF, as this would produce an increase of the peak-to-valley ratio.
The average differences between the frequency profiles of the EF stimulus and control stimuli for three positions of BF relative to the EF are shown in Figure 9. Neural responses obtained from the EF stimulus at the EF were compared to the ctrl-70 and those at other frequencies were compared to the ctrl-40. Cortical responses at the EF were largely increased in the EF condition compared to those in the control condition (p < 0.0014). On the other hand, neural responses were significantly decreased on either side of the EF (up to ¼ octave away from the EF, p < 0.0014, in the condition where the EF corresponded to BF). Interestingly, this decrease was largely asymmetric over frequency: the decrement of cortical responses was stronger (and slightly wider) towards high frequencies than towards low frequencies. In the condition where BF was lower than the EF, only one frequency below the EF was significantly suppressed (p < 0.0014). In the condition where BF was higher than the EF, frequencies up to 3/8 octave above the EF were suppressed (p < 0.0014). The width and the asymmetry of the suppressed sidebands observed in this stimulus condition are broadly consistent with the neural changes produced by the AFB stimuli (Figures 5, 6) (see Discussion).

DISCUSSION
The present study was aimed at investigating whether there is an enhancement, in auditory cortex, of the representation of spectral edges in acoustic stimuli. Overall, we show that the cortical representation of the acoustic spectrum tends to enhance the spectral edges. As the stimuli used in this study have spectral contrasts or edges only when they are time-averaged over a few hundreds of milliseconds, our results imply that auditory centers integrate the stimulus spectrum over hundreds of milliseconds. More specifically, in the condition where a frequency band was attenuated, we observed that cortical responses were increased near the edge-out frequencies, whereas they were reduced for the edge-in frequencies. Interestingly, by estimating the neural population activity over the tonotopic axis, we also found that the cortical response profile following the presentation of a stimulus with an AFB was greatly altered: the relative number of sites responsive to the edge-out frequencies was increased, while the relative number of sites responsive to the edge-in frequencies was decreased (compared to the number of sites representing the frequencies remote from the AFB). These cortical changes were sensitive to the properties of the AFB, namely its width, depth and sharpness. These changes were highly systematic, being present in the majority of cortical recording sites. In the condition where the sound level of a single tone frequency was increased, neural activity was reduced at the neighboring frequencies of the enhanced frequency.
Frontiers in Systems Neuroscience www.frontiersin.org June 2013 | Volume 7 | Article 21 | 9
FIGURE 7 | Averaged "patterns of excitation" for MUA (first and second rows) and LFPs (third and fourth rows) in the 1-octave AFB condition. Neural activity was normalized to the maximal activity obtained in the ctrl-70 condition and was smoothed with a moving average. In the first (for MUA) and third (for LFPs) rows, excitation patterns for the ctrl-70 (first column) and the ctrl-40 (second column) are represented. In the first column, the second (for MUA) and fourth (for LFPs) rows represent the excitation patterns for the fully AFB condition. The "excitation patterns" for frequencies at the upper and lower edge-out frequencies are represented by thick red lines. In the second column, the second (for MUA) and fourth (for LFPs) rows represent the excitation patterns for the partially AFB condition. The excitation patterns for frequencies at the upper and lower edge-in frequencies are represented by thick blue lines (excitation pattern at center frequency of the AFB is represented by thick black line). This figure shows that the representations of frequencies at both edge-out frequencies of the AFB as well as at edge-in frequencies are greatly "distorted".

THE SPECTRO-TEMPORAL INTEGRATION INVESTIGATED BY OTHER STUDIES
The effects "at a distance" between frequencies presented in a given temporal sequence reported in the present study are reminiscent of those reported previously. In particular, presenting a pulsated tone pip at a given frequency for seconds to minutes has been shown to produce a decrement of cortical responses not only at the tone pip frequency but also at nearby frequencies (Condon and Weinberger, 1991;Ulanovsky et al., 2004). The spectrotemporal interactions of acoustic stimuli in auditory cortex have been investigated using two-tone sequences (Shamma, 1985;Shamma and Symmes, 1985;Calford and Semple, 1995;Rajan, 1998). A complex pattern of firing suppression and facilitation has been reported, which depends on the frequency separation and the delay between the two tones (Brosch and Schreiner, 1997, 2000;Brosch and Scheich, 2008;Sadagopan and Wang, 2010). One notes that little difference has been found between these effects in multi-unit and single unit activity (Brosch and Schreiner, 1997, 2000;Brosch et al., 1999). Our demonstration that cortical responses (multi-unit activity and local field potentials) are either suppressed or enhanced depending on the stimulus context is consistent with these studies. However, the above studies did not specifically address the cortical representation of spectral edges embedded in spectrally complex acoustic stimuli or the sensitivity of this representation to the characteristics of the spectral edges (width, depth, sharpness). Indeed, while neural enhancement at spectral edges has been predicted by various computational models (Shamma, 1985;Yost, 1986;Gerken, 1996;Parra and Pearlmutter, 2007), there is to our knowledge only one experimental study showing cortical enhancement near the cutoff frequency of 2-octave wide multitone stimuli (Gourévitch et al., 2009). The present study, however, extends the latter by reporting, for the first time, the effects of the physical characteristics of the spectral contrast (sharpness, depth and width) and by documenting neural responses for frequencies nearby and within spectral notches. Our study also provides new information about the width and asymmetry of lateral suppression that are crucial for computational models and the functional implications of these mechanisms (see below).
FIGURE 8 | Difference (in octaves) in the width of the "excitation patterns" between the two AFB conditions (one-octave and fully AFB) and the control conditions (40-dB within the AFB and 70-dB outside the AFB). Positive values indicate increased width, while negative values indicate decreased width. The width of the "excitation patterns" gives an estimate of the number of neurons involved in the representation of a given frequency.
FIGURE 9 | Average difference between the stimulus condition with an EF and the control conditions for three positions of the EF relative to the neural BF, as a function of frequency. Each column represents a position of the EF relative to the neural BF. At the top of the figure, schematics illustrate all stimulus conditions, namely the location of the receptive field relative to the EF (see text for more details). First column: average data for neurons with BF lower than the EF. Second column: average data for neurons with BF corresponding to the EF. Third column: average data for neurons with BF higher than the peak frequency. One notes the neural suppression at frequencies adjacent to the peak frequency and the asymmetry of this neural suppression (suppression is stronger from low to high frequencies).
Finally, while we did not specifically investigate the effects of rate and spectral density of our acoustic stimuli on cortical responses (Blake and Merzenich, 2002;Valentine and Eggermont, 2004;Norena et al., 2008), it is likely that the central changes reported in the present study are sensitive to these parameters. Presentation rate and spectral density have to be high enough to fall within the spectro-temporal integration constants of cortical neurons. Indeed, a very low presentation rate should not produce any edge enhancement, even for spectrally dense stimuli, and, conversely, neither should a very low spectral density, even at high presentation rates.

MECHANISMS OF NEURAL ENHANCEMENTS AND DECREMENTS AT THE SPECTRAL EDGES
The malleability of cortical responses reported in the present study is produced by acoustic stimuli presented passively for only 3 min, in contrast to studies that reported rapid modifications in frequency tuning but during/after active listening (Edeline et al., 1993;Fritz et al., 2003, 2005;Elhilali et al., 2007) or after prolonged (on the order of weeks) stimulation (Norena et al., 2006;Kim and Bao, 2009). The rapidity of these cortical changes precludes the involvement of slow cortical changes such as those involved in homeostatic plasticity (Watt and Desai, 2010) or long-term depression and potentiation (Buonomano and Merzenich, 1998). Instead, they are likely the result of one or a combination of relatively fast mechanisms, occurring on the order of milliseconds to seconds or minutes. The first (fast) mechanism that comes to mind to account for our results is lateral inhibition. Indeed, it has long been recognized that lateral inhibition could be used by sensory systems to sharpen/enhance the representation of stimulus contrasts (Hartline et al., 1956;Ratliff and Hartline, 1959;Von Bekesy, 1967, 1969a;Marr and Hildreth, 1980). The presence of lateral inhibition has been suspected at virtually all levels of the central auditory system using various methodologies such as whole-cell recordings (Wu et al., 2008), electrophysiology (two-tone sequences, effects of hearing loss) (Shamma and Symmes, 1985;Calford et al., 1993;Rhode and Greenberg, 1994;Calford and Semple, 1995;Suga et al., 1997;Rajan, 1998, 2001;Wang et al., 2002) and pharmacology (Yang et al., 1992;LeBeau et al., 2001). Lateral inhibition is likely to contribute to our results when tones of different frequencies overlap in time.
Moreover, tones are presented in random temporal sequences with a relatively short average inter-stimulus interval (500 ms for one frequency, or nearly 60 ms for a one-octave frequency band, roughly the width of the STRFs), suggesting that the cortical activity induced by a tone at a given time also depends on the tones presented shortly before it (Brosch and Schreiner, 1997, 2000;Brosch and Scheich, 2008;Sadagopan and Wang, 2010). These post-stimulatory effects on neural activity have been shown to result from synaptic inhibition up to 100 ms after stimulus presentation (Wehr and Zador, 2005). At longer delays, on the other hand, other mechanisms involved in synaptic depression, such as receptor desensitization, vesicle depletion and changes in presynaptic release probability, are thought to be at work.
In summary, we propose that the cortical changes reported in the present study are likely the result of different mechanisms such as synaptic inhibition and synaptic depression. Reduced synaptic inhibition and/or synaptic depression produced by frequencies falling in the AFB could result in a relative increase of responses at the edge-out frequencies, while the enhanced synaptic inhibition and/or synaptic depression produced by the (enhanced) edge-out frequencies could in turn reduce the responses at the edge-in frequencies. One further important question relative to the mechanisms enhancing spectral contrasts is whether they operate at the cortical level or are inherited from earlier stages of the auditory pathway. The very similar pattern of responses for MUA and LFPs (the latter known to represent mainly the thalamic inputs sent toward the cortex) (Mitzdorf, 1985;Steinschneider et al., 1992) suggests that the enhancement of spectral contrasts observed in cortex is largely inherited from lower levels. Consistent with a sub-cortical contribution to the cortical changes reported in the present study, a complex pattern of firing suppression has been reported in the cochlear nucleus (Rhode and Greenberg, 1994). Further studies will be needed to investigate this important question. Finally, the present study has been carried out on anesthetized animals (mixture of ketamine and xylazine); consequently, it is unclear whether the results reported here also apply to awake animals. However, a study carried out in the primary auditory cortex of awake macaques shows that anesthesia only marginally modifies the pattern of neural suppression and facilitation produced by two-tone sequences (Brosch and Scheich, 2008). The latter study suggests that the cortical response patterns produced by our stimuli may also be observed in awake preparations.
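To make the proposed account concrete, the following Python sketch shows how a purely subtractive, asymmetric lateral suppression acting on a notched input profile yields larger responses at the edge-out frequencies and smaller responses at the edge-in frequencies. It is a toy model under stated assumptions and not the mechanism identified by this study: the function `suppressed_response`, the 1/16-octave channel spacing, the drive values (1.0 outside and 0.3 inside the notch) and the total inhibitory weight of 0.5 are all hypothetical. The suppression ranges (about 0.4 octave toward high frequencies and 0.25 octave toward low frequencies) are borrowed from the sideband extents reported for the EF experiment.

```python
import numpy as np

STEP = 1 / 16  # channel spacing in octaves (hypothetical)

def suppressed_response(drive, up_oct=0.4, down_oct=0.25, strength=0.5):
    """Subtract a weighted sum of neighbouring drive from each channel.

    up_oct   : how far (octaves) a tone suppresses channels above it
    down_oct : how far (octaves) a tone suppresses channels below it
    strength : total inhibitory weight, shared equally by all neighbours
    """
    drive = np.asarray(drive, dtype=float)
    n_up = int(round(up_oct / STEP))      # neighbours below a channel that reach up to it
    n_down = int(round(down_oct / STEP))  # neighbours above a channel that reach down to it
    weight = strength / (n_up + n_down)
    out = np.empty_like(drive)
    for i in range(len(drive)):
        below = drive[max(0, i - n_up):i]
        above = drive[i + 1:i + 1 + n_down]
        # Half-wave rectification: responses cannot be negative.
        out[i] = max(0.0, drive[i] - weight * (below.sum() + above.sum()))
    return out

# Frequency axis in octaves relative to the centre of a 1-octave notch.
freqs = np.arange(-3, 3 + STEP / 2, STEP)
drive = np.where(np.abs(freqs) < 0.5, 0.3, 1.0)  # assumed drive inside / outside the notch
resp = suppressed_response(drive)

def at(f_oct):
    """Response of the channel closest to f_oct."""
    return resp[int(np.argmin(np.abs(freqs - f_oct)))]

remote, centre = at(-1.75), at(0.0)
edge_out_low, edge_out_high = at(-0.5), at(0.5)
edge_in_low, edge_in_high = at(-0.5 + STEP), at(0.5 - STEP)
print(remote, edge_out_low, edge_out_high, centre, edge_in_low, edge_in_high)
```

In this toy setting both edge-out channels respond more than a channel remote from the notch, and both edge-in channels respond less than the channel at the notch centre. The sketch says nothing about where in the auditory pathway such suppression arises, nor about the relative contributions of synaptic inhibition and synaptic depression.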

SENSORY INPUT CONDITIONS MIMICKED BY OUR STIMULI WITH ATTENUATED FREQUENCY BAND
The stimuli used in the present study can be thought of as producing acoustic environments with different spectral profiles when time-averaged over a few hundreds of milliseconds. One can wonder whether these synthetic stimuli mimic natural sensory input conditions for the auditory system. One pattern of sensory inputs that may be mimicked by our notched stimuli is that produced by a complex broadband environment in the presence of a sharp notched hearing loss. Hearing losses restricted to a given frequency band (i.e., audiograms with notches) have been reported in many studies (Gates et al., 2000;McBride and Williams, 2001;Rabinowitz et al., 2006;Nondahl et al., 2009;Etchelecou et al., 2011). Assuming that the time-averaged acoustic background is "flat", this particular shape of hearing loss is thought to result in an averaged pattern of (rate-place) activity in the cochlear nerve with a dip corresponding to the hearing loss region. More specifically, frequency regions outside the hearing loss are evenly stimulated, while the frequency region of hearing loss receives only weak stimulation, if any (Gerken, 1996). In other words, our notched stimuli mimic the contrast in the averaged rate-place sensory inputs over the tonotopic axis in the presence of hearing loss. As the pattern of sensory inputs provided by the AFB stimuli resembles the averaged pattern of sensory inputs in the presence of notched hearing loss, the AFB stimuli can be interpreted as producing an acute "functional deafferentation" or "artificial hearing loss" (Pantev et al., 1999;Norena et al., 2000;Okamoto et al., 2007). In this context, our notched stimuli can be considered as an equivalent of the stimulus used in vision to produce an "artificial scotoma," i.e., moving lines or random dots stimulating the visual field around a small non-stimulated area (Ramachandran and Gregory, 1991;Pettet and Gilbert, 1992;Das and Gilbert, 1995;DeAngelis et al., 1995).
One notes, however, that our stimuli do not model some typical characteristics accompanying cochlear damage, such as the decrease in spontaneous activity in the cochlear nerve within the frequency range of cochlear damage (Liberman and Dodds, 1984), the neural degeneration of cochlear fibers (Kujawa and Liberman, 2006, 2009) and/or the broadening of auditory filters (Glasberg and Moore, 1986).
The results of the present study may give some insights into the sensitivity of the auditory cortex to the characteristics of acute hearing loss. These properties are potentially important for the understanding of the functional implications of the cortical changes produced by acute hearing loss, such as tinnitus, for example (Norena, 2011;Noreña and Farley, 2013). One notes that the relationship between cortical changes and the characteristics of hearing loss is relatively difficult to study in practice as (noise-induced) hearing loss is generally variable (Loeb and Smith, 1967;Atherley et al., 1968). In conclusion, our study suggests that the cortical changes produced by acute hearing loss could be sensitive to the sharpness, depth and width of hearing loss. Moreover, while the cortical changes observed in the present study are short-term, it is possible that a more prolonged exposure to the AFB stimulus could induce long-lasting changes such as those produced by chronic hearing loss or reported in previous studies (Robertson and Irvine, 1989;Rajan et al., 1993;Norena and Eggermont, 2005;Norena et al., 2006;Pienkowski et al., 2013).

PROPERTIES OF CENTRAL INHIBITION
Our results also provide some information about the properties of lateral suppression of firing (whether it is produced by synaptic inhibition and/or synaptic depression) in the central auditory system. The bandwidth of suppressed sidebands derived from this study (0.25-0.4 octave) closely approximates the lateral inhibitory networks described by Shamma (1985, 0.3 octave) and Yost (1986, 0.2 octave). We also show that lateral suppression is asymmetric as a function of frequency, with a stronger and wider suppression produced toward high frequencies (suppression was significant up to around 0.4 octaves above the spectral peak) than toward low frequencies (suppression was significant up to 0.25 octaves below the spectral peak) (Figure 9). This particular pattern of asymmetric inhibition is consistent with the results of Zhang et al. (2003) for high BF neurons. As the pattern of vibration of the basilar membrane is asymmetric (slope is shallower on the basal side of the cochlea compared to the apical side), leading to the corresponding asymmetric pattern of excitation in the cochlear nerve, it has been suggested that the asymmetry of central inhibition (stronger inhibition from low to high frequencies) may further refine the central representation of spectral edges (Suga, 1995;Okamoto et al., 2007).
We have proposed that the "Zwicker tone," the tonal and faint illusory percept produced after the presentation of a notched noise (broadband noise containing a suppressed frequency band) (Zwicker, 1964;Lummis and Guttman, 1972;Wiegrebe et al., 1996;Franosch et al., 2003), could be interpreted as a model of transient tinnitus (Norena et al., 2000;Parra and Pearlmutter, 2007). The "Zwicker tone" can also be induced by low-pass or high-pass noises, although the former is more effective at producing the sensation (Lummis and Guttman, 1972). It is interesting to note that this asymmetry for producing the "Zwicker tone" might be related to the asymmetry in neural suppression reported in the present study (larger neural suppression at lower edge-in frequencies vs. upper edge-in frequencies).

IMPLICATIONS FOR NEURAL CODING
The present study shows that the cortical representation of spectral edges is enhanced (more neurons are dedicated to the representation of spectral edges). A putative link between stimulus importance and its representational size in the primary auditory cortex has been suggested (Rutkowski and Weinberger, 2005). Our study further suggests that the representational size of spectral cues may be dynamically enhanced in cortex. This may improve the processing of relevant spectral cues (edges) within the ever changing acoustic environment.
It has been suggested that the responsiveness (gain) of subcortical and cortical neurons could be dynamically adapted to the statistics (mean and variance) of stimuli. This mechanism provides an elegant solution to the dynamic range problem (Viemeister, 1988) by adjusting the input-output function of neurons to the distribution of input levels (Dean et al., 2005, 2008;Watkins and Barbour, 2008, 2011;Rabinowitz et al., 2011, 2012). These studies varied the mean and variance of stimulus level across conditions but the mean stimulus level was fixed (for single pure tone) or uniform (for noise bursts or multi-tone pips) over frequency for a given condition. Our study can be considered as an extension of these studies as the mean level was varied over frequency (mean level was low in the AFB, and high elsewhere). While the hypothesis of gain control predicts a decrease of gain for high contrast stimuli (and the reverse for low contrast stimuli: neurons become more sensitive to small level variations), our results suggest the opposite: the firing rate difference between edge-out and edge-in frequencies is maximally enhanced for sharp and deep contrast. These results emphasize the importance of considering the effects of the spectral dimension (spectral envelope) in future studies investigating contrast gain control.