Mismatch Brain Response to Speech Sound Changes in Rats

Understanding speech is based on neural representations of individual speech sounds. In humans, such representations are capable of supporting an automatic and memory-based mechanism for auditory change detection, as reflected by the mismatch negativity (MMN) of event-related potentials. There are also findings of neural representations of speech sounds in animals, but it is not known whether these representations can support a change detection mechanism analogous to that underlying the MMN in humans. To this end, we presented synthesized spoken syllables to urethane-anesthetized rats while local field potentials were epidurally recorded above their primary auditory cortex. In an oddball condition, a deviant stimulus /ga/ or /ba/ (probability 1:12 for each) was rarely and randomly interspersed among a frequently presented standard stimulus /da/ (probability 10:12). In an equiprobable condition, 12 syllables, including /da/, /ga/, and /ba/, were presented in a random order (probability 1:12 for each). We found evoked responses of higher amplitude to the deviant /ba/, albeit not to /ga/, relative to the standard /da/ in the oddball condition. Furthermore, the responses to /ba/ were higher in amplitude in the oddball condition than in the equiprobable condition. The findings suggest that the anesthetized rat brain can form representations of human speech sounds, and that these representations can support a memory-based change detection mechanism analogous to that underlying the MMN in humans. Our findings show a striking parallel in speech processing between humans and rodents and may thus pave the way for feasible animal models of memory-based change detection.


INTRODUCTION
The ability to represent individual speech sounds is a necessary condition for understanding speech. Interestingly, this ability is not unique to humans. For example, rodents (e.g., Kuhl and Miller, 1975; Kuhl and Miller, 1978; Reed et al., 2003; Engineer et al., 2008), birds (e.g., Dooling and Brown, 1990), and monkeys (e.g., Sinnott et al., 1976; Steinschneider et al., 1995; Sinnott et al., 1998) can be trained to discriminate speech sounds, suggesting a deep evolutionary basis of this ability.
In humans, neural representations of speech sounds are formed pre-attentively. They can also support automatic detection of changes, as reflected by a component of event-related potentials of the brain named mismatch negativity (MMN, Näätänen et al., 1978; for a review, see Näätänen et al., 2007). MMN can be elicited in an oddball condition comprising a rare auditory event ("deviant," e.g., speech sound /ba/) interspersed with a repetitive one ("standard," e.g., /da/). The disappearance of MMN in control conditions in which the standard stimuli are removed (the so-called deviant-alone condition or equiprobable condition, e.g., Alho et al., 1990; Jacobsen and Schröger, 2001) suggests that standard stimuli form a memory trace against which incoming sounds are compared at the neural level. The mismatch response is elicited when a discrepancy between the memory representation and the deviant input is detected (memory-comparison explanation of MMN: Näätänen, 1990; Näätänen et al., 2005).
An alternative explanation for the generation of the MMN response suggests that it originates from the differential activity of afferent neuronal populations. Specifically, it is assumed that because the standard stimuli repeatedly activate their afferent pathways, this leads to greater refractoriness in those pathways than in the afferent neural pathways activated by the rare deviant stimuli. In this way, deviant stimulus pathways may better retain their capacity to respond, leading to larger neural responses to deviant stimuli than to standard stimuli (for the refractoriness explanation of MMN, see Näätänen, 1990; see also May and Tiitinen, 2010 for the related adaptation hypothesis).
In animals, MMN-like responses to deviant stimuli (i.e., higher amplitude responses to deviants than to standards; mismatch response), for example to changes in frequency and duration, have been reported in many different species (e.g., Csépe et al., 1987, 1989; Javitt et al., 1992; Kraus et al., 1994; Ruusuvirta et al., 1996, 1998; Umbricht et al., 2005; Astikainen et al., 2006). However, there are also negative findings in rodents (frequency changes: Lazar and Metherate, 2003; von der Behrens et al., 2009; speech sound changes: Eriksson and Villa, 2005). Furthermore, while experiments in humans have consistently supported the memory-comparison explanation (e.g., Schröger and Wolff, 1996; Jacobsen and Schröger, 2001; Jacobsen et al., 2003), animal studies have provided inconsistent results (support for the memory-comparison explanation: Ruusuvirta et al., 1998; Tikhonravov et al., 2008, 2010; Ruusuvirta et al., 2010; Astikainen et al., 2011; support for the refractoriness explanation: Lazar and Metherate, 2003; Umbricht et al., 2005). Moreover, to contrast the memory-comparison and the refractoriness explanations, animal studies have mainly relied on the deviant-alone condition, in which standards are omitted, leaving only "control-deviant" stimuli in the series (Kraus et al., 1994; King et al., 1995; Ruusuvirta et al., 1998; Lazar and Metherate, 2003; Umbricht et al., 2005; Tikhonravov et al., 2008, 2010; for the equiprobable control condition, see Ruusuvirta et al., 2010; Astikainen et al., 2011). However, the equiprobable condition is a more valid method than the deviant-alone condition for testing the memory-comparison hypothesis. This is because, unlike the deviant-alone condition, the equiprobable condition preserves the same overall stimulation rate as the oddball condition.
In the equiprobable condition, standards are replaced with heterogeneous stimuli (with respect to the feature that differentiates deviants from standards), all presented with equal probability. It thus enables comparison of responses to oddball-deviant stimuli with responses to "control-deviant" stimuli, which are physically the same sounds presented with the same probability and differ only in their background stimuli.
The present study addresses whether the brains of urethane-anesthetized rats generate local field potentials to speech sounds that are functionally analogous to the human MMN. Although discrimination of speech sounds as indexed by the mismatch response has been reported earlier in rodents (Kraus et al., 1994), there is also a negative finding in anesthetized rats (Eriksson and Villa, 2005). Furthermore, the process underlying speech sound discrimination as indexed by the mismatch response (memory comparison versus neural refractoriness) has remained unclear. To clarify this aspect, we applied the equiprobable control condition.

SUBJECTS
The subjects were nine male adult Sprague-Dawley rats from Harlan Laboratories (England, UK). The rats were between 13 and 18 weeks of age and weighed 410-500 g at the time of the recordings. The animals were housed in groups of 2-4 in standard plastic cages under controlled temperature and a 12-h light/dark cycle, with free access to water and food pellets, in the Experimental Animal Unit of the University of Jyväskylä, in Jyväskylä, Finland. Experiments were carried out in accordance with the European Communities Council Directive (86/609/EEC) regarding the care and use of animals for experimental procedures. The license for the present experiments was approved by the County Administrative Board of Southern Finland (Permit code: ESLH-2007-00662).

SURGICAL PROCEDURES
All surgical procedures were done under urethane anesthesia. The animals were initially anesthetized with intraperitoneal injections of urethane (1.2 g/kg dose, 0.24 g/ml concentration, Sigma-Aldrich, St. Louis, MO, USA) and given an additional 10% of the original dosage if necessary until they appeared completely unresponsive to painful stimuli (firm toe or tail pinch). Animals were rehydrated with a 2-ml injection of saline under the skin every 2 h.
The anesthetized animal was moved into a Faraday cage and mounted in a standard stereotactic frame (David Kopf Instruments, Model 962, California, USA). The animal's head was fixed to the stereotactic frame using blunt ear bars. Under additional local anesthesia of the skin and muscles (lidocaine 20%, Orion Pharma, Espoo, Finland) the skull was exposed. Two stainless steel skull screws (0.9 mm diameter, World Precision Instruments, Berlin, Germany) positioned on the right side of the brain above the cerebellum (AP −11.0, ML 3.0) and frontal cortex (AP +4.0, ML 3.0) served as reference and ground electrodes, respectively.
Unilateral craniotomy was performed in order to expose a 2 mm × 2 mm region over the left primary auditory cortex (4.5-6.5 mm posterior to the bregma and 2-4 mm lateral to the bone edge of the upper skull surface).
Before the electrocorticogram recording, a headstage composed of a screw and dental acrylic was attached to the right prefrontal part of the skull to hold the head in place and allow removal of the right ear bar. After the experiment, the animals were further anesthetized with urethane and then sacrificed by cervical dislocation.

STIMULI
Two stimulus conditions were applied: the oddball condition and the equiprobable condition. In the oddball condition, a frequently repeated "standard" stimulus, the consonant-vowel (CV) syllable /da/ (p = 10:12), was pseudorandomly replaced by one of two rare "deviant" stimuli, /ga/ or /ba/ (p = 1:12 each). At least two standard stimuli were presented between two consecutive deviant stimuli. The stimuli in the oddball condition were five-formant stop CV syllables synthesized with a male voice. The most prominent difference between the syllables was in the direction of the CV frequency transition and in the duration of this transition in the second formant (F2; Figure 1). The F2 frequency increased slightly during the transition in /ba/ but decreased during the /ga/ and /da/ transitions. For F2, the duration of the CV transition was 20 ms for /ba/ (deviant), 35 ms for /da/ (standard), and 45 ms for /ga/ (deviant). The transition duration for all other formants was fixed to 40 ms. These stimuli were modified with Praat software (Paul Boersma and David Weenink, University of Amsterdam, the Netherlands), shortened to a uniform length (110 ms), and normalized for peak intensity. The waveforms and spectrograms of the oddball stimuli are presented in Figure 1. The main frequencies in the formants of the stimuli are shown in Table 1.
In the equiprobable condition, the three syllables applied in the oddball condition (/da/, /ba/, and /ga/) were presented together with nine other syllables to set their probabilities of occurrence equal to that of the syllables presented as deviant stimuli in the oddball condition (p = 1:12 for each). /Ba/ and /da/ syllables presented in the equiprobable condition are hereafter called "control-deviant" stimuli. Half of the stimuli in the equiprobable condition were synthesized similarly to the stimuli in the oddball condition and half were naturally produced (see Table 1). The natural sounds were spoken by a male voice and normalized for peak intensity. In order to make the stimulation heterogeneous, syllables in the equiprobable condition were of two different durations. The shorter syllables (/da/, /ba/, /ga/, /ta/, /pa/, /ka/) were 110 ms and the longer ones (/daa/, /baa/, /gaa/, /taa/, /paa/, /kaa/) 250 ms in duration. The stimuli were partly the same as those applied previously in human studies (Molfese and Molfese, 1997; Guttorm et al., 2001). Please note that even if the relatively low frequencies of the stimuli were suboptimal for rat hearing, our previous studies have shown that rats are able to represent low frequencies such as 750 and 1000 Hz (Astikainen et al., 2006). The offset-to-onset intervals between consecutive syllables were 350 ms, and a total of 1200 stimuli were presented in both the oddball and equiprobable conditions. Across animals, the two conditions were presented in a counterbalanced order, aiming to control for possible effects of the order of the sequences and possible variations in the level of anesthesia.
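The sequence constraints described above (1200 stimuli per condition, the stated probabilities, and at least two standards between consecutive deviants) can be sketched in code. This is only an illustration under stated assumptions: the function names, the seed, and the block-shuffling strategy are hypothetical, and the source does not describe how the actual pseudorandomization was implemented.

```python
import random

def make_oddball_sequence(n_total=1200, standard="da", deviants=("ga", "ba"),
                          min_gap=2, seed=0):
    """Illustrative oddball sequence: each deviant has probability 1:12 and
    at least `min_gap` standards separate consecutive deviants."""
    rng = random.Random(seed)
    n_each = n_total // 12                      # 100 presentations per deviant
    deviant_list = [d for d in deviants for _ in range(n_each)]
    # Each deviant is glued to the standards that must follow it, so any
    # ordering of the blocks respects the minimum gap.
    blocks = [[d] + [standard] * min_gap for d in deviant_list]
    n_free = n_total - len(deviant_list) * (1 + min_gap)
    if n_free < 0:
        raise ValueError("min_gap is too large for the requested sequence")
    blocks += [[standard] for _ in range(n_free)]   # remaining free standards
    rng.shuffle(blocks)
    return [s for block in blocks for s in block]

def make_equiprobable_sequence(syllables, n_total=1200, seed=0):
    """Illustrative equiprobable sequence: every syllable equally often."""
    rng = random.Random(seed)
    seq = [s for s in syllables for _ in range(n_total // len(syllables))]
    rng.shuffle(seq)
    return seq

oddball = make_oddball_sequence()
equiprobable = make_equiprobable_sequence(
    ["da", "ba", "ga", "ta", "pa", "ka", "daa", "baa", "gaa", "taa", "paa", "kaa"])
```

Gluing the required standards to each deviant before shuffling guarantees the minimum gap by construction, at the cost of never sampling some otherwise valid orderings.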
The stimulus presentation was controlled by E-prime software (Pittsburgh, PA, USA), and stimuli were played from a PC via an active loudspeaker system (Studiopro 3, M-audio, Irwindale, CA, USA). The stimulation was presented with the passive part of the loudspeaker system directed toward the right ear of the animal at a distance of 20 cm. The sound pressure level for each tone was 70 dB with C-weighting (optimized for 40-100 dB measurement), as measured with a sound level meter (type 2235, Bruel and Kjaer, Naerum, Denmark) at the location of the animal's right pinna during the recording.

RECORDING AND ANALYSIS
After the surgery, the right ear bar was removed and recording started. The continuous electrocorticogram was first 10-fold amplified using the AI 405 amplifier (Molecular Devices Corporation, Union City, CA, USA), high-pass filtered at 0.1 Hz, 200-fold amplified, and low-pass filtered at 400 Hz (CyberAmp 380, Molecular Devices Corporation), and finally sampled with 16-bit precision at 2 kHz (DigiData 1320A, Molecular Devices Corporation). The data were stored on a computer hard disk using Axoscope 9.0 data acquisition software (Molecular Devices Corporation, Union City, CA, USA).
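As a rough check on this acquisition chain, the arithmetic can be written out. The two gain stages, the sampling rate, the low-pass cutoff, and the bit depth come from the text; the ADC input range of ±10 V is an assumption made here for illustration and is not stated in the source.

```python
# Values stated in the text
GAIN_STAGE_1 = 10        # first amplification stage
GAIN_STAGE_2 = 200       # second amplification stage
FS_HZ = 2000             # sampling rate
LOWPASS_HZ = 400         # analog low-pass cutoff
ADC_BITS = 16

# ASSUMPTION (not in the source): ADC input range of +/-10 V
ADC_RANGE_V = 20.0

total_gain = GAIN_STAGE_1 * GAIN_STAGE_2          # overall amplification
nyquist_hz = FS_HZ / 2                            # highest representable frequency
sample_interval_ms = 1000 / FS_HZ                 # time between samples

# Smallest voltage step at the ADC, referred back to the electrode (in microvolts)
adc_step_uv = ADC_RANGE_V / (2 ** ADC_BITS) * 1e6
input_referred_step_uv = adc_step_uv / total_gain

print(total_gain, nyquist_hz, sample_interval_ms, round(input_referred_step_uv, 4))
```

Under that assumed range, one ADC step corresponds to roughly 0.15 μV at the electrode, and the 400 Hz low-pass cutoff sits well below the 1000 Hz Nyquist frequency.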
Local field potentials were recorded with a teflon-coated stainless steel wire (200 μm in diameter; Medwire, Sigmund Cohn Corp., Mount Vernon, NY, USA) located on the dura surface above the left primary auditory cortex of the animal. The position of the electrode on the dura was set on the basis of evoked potentials (a sharp peak of positive polarity approximately at 35 ms latency from stimulus onset) to the tones of 4000 Hz presented 1/s. The anesthetic state of animals was monitored periodically throughout the experiment.
The data were filtered offline (0.1-30 Hz, 24 dB/octave), baseline corrected (based on the average amplitude of the 50-ms pre-stimulus period), and averaged separately for deviants and standards for each animal. In order to have the same number of standard and deviant stimulus responses in the analysis (i.e., 100 responses for both), only the responses to the standard stimuli immediately preceding the deviant stimuli were analyzed. First, the responses to the deviant stimuli in the oddball condition were compared to the responses to the standard stimuli on a point-by-point basis (1-300 ms from stimulus onset) with a two-tailed t-test. p-Values smaller than 0.05 for at least 20 consecutive sample points (i.e., for a period of 10 ms) were required for the difference in local field potentials to be considered robust (see also Guthrie and Buchwald, 1991). Second, in order to test whether the mismatch response was dependent on the context provided by the standard stimuli, as assumed, the amplitude of the responses to the deviant stimulus in the oddball condition was compared, using point-by-point t-tests, with the amplitude of the responses to the control-deviant stimulus (i.e., the same stimulus presented in the equiprobable condition).
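The baseline correction and the point-by-point criterion can be illustrated with a minimal sketch. It assumes a paired (within-animal) t-test, which is inferred here from the nine animals and the reported 8 degrees of freedom; the text itself says only "two-tailed t-test". The function names and data layout are hypothetical, and the critical value 2.306 is the standard two-tailed threshold for p < 0.05 with 8 degrees of freedom.

```python
import math

FS_HZ = 2000            # sampling rate stated in the text
T_CRIT_DF8 = 2.306      # two-tailed critical t, alpha = 0.05, df = 8

def baseline_correct(epoch, n_baseline):
    """Subtract the mean of the first n_baseline (pre-stimulus) samples."""
    base = sum(epoch[:n_baseline]) / n_baseline
    return [v - base for v in epoch]

def paired_t(a, b):
    """Paired t statistic; a and b hold one value per animal."""
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    mean = sum(diffs) / n
    var = sum((d - mean) ** 2 for d in diffs) / (n - 1)
    if var == 0:
        return 0.0 if mean == 0 else math.copysign(math.inf, mean)
    return mean / math.sqrt(var / n)

def significant_runs(cond_a, cond_b, t_crit=T_CRIT_DF8, min_run=20):
    """cond_a, cond_b: per-animal averaged waveforms (animal x sample).
    Returns (start, end) sample indices, end exclusive, where |t| exceeds
    t_crit for at least min_run consecutive samples."""
    n_samples = len(cond_a[0])
    sig = [abs(paired_t([w[i] for w in cond_a], [w[i] for w in cond_b])) > t_crit
           for i in range(n_samples)]
    runs, start = [], None
    for i, flag in enumerate(sig + [False]):    # sentinel closes a final run
        if flag and start is None:
            start = i
        elif not flag and start is not None:
            if i - start >= min_run:
                runs.append((start, i))
            start = None
    return runs

def samples_to_ms(n_samples, fs_hz=FS_HZ):
    """Convert a sample count to milliseconds."""
    return n_samples * 1000 / fs_hz
```

At 2 kHz, the 50-ms baseline corresponds to 100 samples and the 20-sample criterion to 10 ms, so `samples_to_ms(20)` gives 10.0. The same `significant_runs` call would serve for both comparisons described above (deviant versus standard, and deviant versus control-deviant).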

RESULTS
In the oddball condition, responses to the standard /da/ and the deviant /ba/ stimulus differed significantly [t(8) = 2.33-2.98, p = 0.017-0.049] in amplitude at 30-80 ms from stimulus onset (Figure 2A). Responses to the deviant stimulus were higher in amplitude than those to the standard stimulus (mean values in the latency range of 30-80 ms were 49.8 μV for the deviant stimuli and 11.6 μV for the standard stimuli). There was no significant difference in the response amplitude between the standard /da/ and the deviant /ga/ stimulus (Figure 2B).
Next, we tested whether the observed differential response depended on the context of the repetitive standard stimuli. Responses to the deviant stimulus /ba/, which differed in amplitude from responses to the standard stimulus /da/ in the oddball condition, were compared to responses to the control-deviant stimulus (i.e., physically the same /ba/ as in the oddball condition) interspersed with other syllables in the equiprobable condition. Responses to the /ba/ syllable were found to be significantly [t(8) = 2.36-3.40, p = 0.009-0.046] higher in amplitude in the oddball condition than in the equiprobable condition at a latency of 29-69 ms (Figure 2C). The mean response amplitude in this latency range was 58.1 μV for the /ba/ in the oddball condition and 23.1 μV for the /ba/ in the equiprobable condition.

FIGURE 2 | Grand-averaged responses to syllables in the oddball and equiprobable conditions. (A) Responses to the standard syllable /da/ and deviant syllable /ba/ presented in the oddball condition; (B) responses to the standard syllable /da/ and deviant syllable /ga/ presented in the oddball condition; (C) responses to the deviant syllable /ba/ in the oddball condition and control-deviant syllable /ba/ in the equiprobable condition. The latency ranges that indicated significant differences between the responses in t-tests are marked with a rectangle. The y-axis indicates the stimulus onset.

DISCUSSION
We found that the anesthetized rat brain responds to one type of stimulus contrast in a repeated human spoken syllable (change from /da/ to /ba/) but not to the other (change from /da/ to /ga/). The difference between the local field potential responses to /ba/ versus /da/ was observed at the latency range of 30-80 ms after the stimulus onset. The neural mechanism underlying this response was examined by applying the equiprobable control condition. The response to the deviant syllable /ba/ was found to be specific to the repetitive context provided by the standard stimuli. Namely, the response was higher in amplitude in the oddball than in the equiprobable condition at the latency of 29-69 ms. In this respect, at the latency of 30-69 ms, the response was analogous to human MMN (e.g., Schröger and Wolff, 1996;Jacobsen and Schröger, 2001), suggesting that in the animals, the mismatch response is, as MMN in humans, triggered by a memory-comparison process rather than due to neural refractoriness.
It is noteworthy that the equiprobable control condition applied in the current experiment preserves the overall rate of auditory stimulation present in the oddball condition and, therefore, provides a more valid control than the deviant-alone control condition, which alters this rate by completely omitting standard stimuli from the series. There are a few studies in humans (e.g., MMN to sound intensity: Schröger and Wolff, 1996; MMN to sound frequency: Jacobsen and Schröger, 2001; Jacobsen et al., 2003) and also in animals (mismatch response to sound frequency: Ruusuvirta et al., 2010; Astikainen et al., 2011) that have applied the equiprobable control condition. However, in studies of speech sound processing this control has mostly been ignored (see, however, Jacobsen and Schröger, 2004 in humans). There is thus an obvious need for studies exploring the underlying mechanism of MMN to speech sounds.
In the current study, the infrequently presented syllable /ga/ interspersed with the standard /da/ in the oddball condition did not elicit the mismatch response. The presence of an observable mismatch response for the /da/-/ba/ contrast, but not for the /da/-/ga/ contrast, is most probably explained by the differences between these stimuli during about the first 50 ms in the F2 formant (see Materials and Methods and Figure 1). First, the stimuli differed in their CV transition direction in F2. The frequency increased slightly during the transition in /ba/ but decreased during the /ga/ and /da/ transitions, which probably allowed better discrimination of /ba/ than of /ga/ from /da/. Second, there were also differences in CV transition durations between these syllables in F2 (the transition duration for all other formants was fixed to 40 ms). These durations were 35 ms for /da/ (standard), 20 ms for /ba/ (deviant), and 45 ms for /ga/ (deviant). This also resulted in a faster rise in sound energy level for /ba/ than for /da/ or /ga/, which is also evident in Figure 1. The current experiment cannot, however, determine to what extent the mismatch response is due to the difference in the direction of the CV transition (decrease versus increase), the difference in the duration of this transition (15-ms shortening versus 10-ms lengthening), or other differences in spectro-temporal aspects between the stimuli. Nevertheless, the results show that urethane-anesthetized rats, even without previous training, were capable of discriminating subtle changes in these spectrally complex auditory stimuli.
Our results from the epidural recording above the primary auditory cortex, showing neurophysiological discrimination of the /da/-/ba/ contrast but not the /da/-/ga/ contrast, are partly in line with the results obtained from the thalamus (caudomedial portion of the medial geniculate nucleus) of anesthetized guinea pigs (Kraus et al., 1994). Namely, a mismatch response was not found to the /ga/-/da/ contrast, while another contrast (/ba/-/wa/) elicited the response. However, at the epidural midline surface (secondary albeit not primary auditory cortex), a significant mismatch response was found to both contrasts. The current positive finding of the mismatch response to speech sounds in anesthetized rats is contrary to a previous negative one (Eriksson and Villa, 2005). Future studies are needed to resolve whether, for example, the different anesthetic agent (urethane versus ketamine-xylazine) or differences in stimulation (e.g., inter-stimulus interval of 350 versus 750 ms) may have caused these inconsistent findings in rats.
Using partly the same stimuli as in the present study, albeit in different stimulus conditions, Guttorm et al. (2001) have shown that newborns at risk for dyslexia differ from controls mostly in their responses to the syllable /ga/ (although some group differences were also found to /da/ and /ba/) and that the response to /ga/ predicts later pre-reading skills (Guttorm et al., 2010). In addition, Kraus et al. (1996) found that children with learning disabilities differ from controls in their mismatch response to the deviant syllable /ga/ interspersed with /da/. These findings are somewhat reminiscent of the present results of a mismatch response to the /da/-/ba/ contrast but not to the /da/-/ga/ contrast in rats, suggesting that discriminating acoustic features of formant transitions in /ga/ is particularly difficult for both the human and the rat auditory cortex.
Our finding of speech sound discrimination may pave the way for feasible animal models of memory-based speech sound discrimination. Pharmacological and genetic modulations (for electrophysiological studies utilizing the mismatch response, see, e.g., Javitt et al., 1996;Ehrlichman et al., 2008;Tikhonravov et al., 2008) as well as subcortical recordings (e.g., Astikainen et al., 2005) can only be conveniently done in animals. Animal models also provide a unique possibility to address how linguistic and non-linguistic stimuli are dealt with by non-linguistic brains, allowing theoretically important implications to be drawn from parallel findings in humans with linguistic brains.
In conclusion, we found a mismatch response in urethane-anesthetized rats to one type of contrast in stop CV syllables, change from /da/ to /ba/, but not to the other type, change from /da/ to /ga/. As indicated by the equiprobable control condition, this response had its origin in a memory-based mechanism analogous or homologous to that underlying human MMN. Neural representations of the syllables were thus similarly accessible to the auditory change detection mechanism as they are in humans, suggesting a fundamental parallel in processing of spectro-temporally complex speech sounds between humans and animals.