Front. Hum. Neurosci., 15 July 2016
Sec. Brain Health and Clinical Neuroscience

Mismatch Negativity to Threatening Voices Associated with Positive Symptoms in Schizophrenia

Chenyi Chen1†, Chia-Chien Liu1,2†, Pei-Yuan Weng2† and Yawei Cheng1,3*
  • 1Institute of Neuroscience, National Yang-Ming University, Taipei, Taiwan
  • 2Department of Psychiatry, National Yang-Ming University Hospital, Yilan, Taiwan
  • 3Department of Rehabilitation, National Yang-Ming University Hospital, Yilan, Taiwan

Although the general consensus holds that emotional perception is impaired in patients with schizophrenia, the extent to which neural processing of emotional voices is altered in schizophrenia remains to be determined. This study enrolled 30 patients with chronic schizophrenia and 30 controls and measured their mismatch negativity (MMN), a component of auditory event-related potentials (ERP). In a passive oddball paradigm, happily or angrily spoken deviant syllables dada were randomly presented within a train of emotionally neutral standard syllables. Results showed that MMN in response to angry syllables and angry-derived non-vocal sounds was significantly decreased in individuals with schizophrenia. P3a to angry syllables showed stronger amplitudes but longer latencies. Weaker MMN amplitudes were associated with more positive symptoms of schizophrenia. Receiver operating characteristic analysis revealed that angry MMN, angry-derived MMN, and angry P3a could help predict whether someone had received a clinical diagnosis of schizophrenia. The findings suggested general impairments of voice perception and acoustic discrimination in patients with chronic schizophrenia. Emotional salience processing of voices was atypical at the preattentive level and was associated with positive symptoms in schizophrenia.


Schizophrenia, a chronic and disabling brain disorder, has three categories of symptoms: positive, negative, and cognitive symptoms. Hearing voices is the most common type of hallucination associated with positive symptoms. Deficits in the ability to recognize emotions from vocal expressions are treatment resistant and associated with poor outcomes (Bach et al., 2009; Leitman et al., 2010, 2011). To advance our understanding of the relationship between the symptoms of schizophrenia and the perception of emotional voices, this study used a neurophysiological approach to clarify whether emotional voice processing is impaired per se and, further, whether it is associated with sensory dysfunction or attention abnormalities.

The extent to which basic auditory processing contributes to impaired voice perception in schizophrenia is unclear. Some studies reported that deficits of emotional prosodic identification in individuals with schizophrenia reflect, at least in part, a relative inability to process the acoustic characteristics of prosodic stimuli (Leitman et al., 2005, 2010, 2011). They have argued that schizophrenia is associated with structural and functional disturbances at the primary auditory cortex (Leitman et al., 2007). However, other studies found that individuals with schizophrenia had more difficulty with emotional prosody comprehension than controls, but were equivalently proficient at stress prosody comprehension (Murphy and Cutting, 1990). Their performance was worse at identifying high-clarity emotional prosodic stimuli, but not at identifying low-clarity stimuli (Bach et al., 2009). Individuals with schizophrenia, relative to healthy controls, showed comparable performance in discriminating among terminal pitch changes, but more difficulty with internal pitch discrimination (Matsumoto et al., 2006).

Mismatch negativity (MMN) and P3a are event-related potentials (ERPs) that can be elicited by a passive oddball paradigm. MMN and P3a have been used as neurophysiological biomarkers in schizophrenia research (Javitt et al., 2008; Javitt and Sweet, 2015). MMN reflects a preattentive stage of auditory information processing. For MMN generation, oddball stimuli may differ from standards based on a number of physical dimensions, including sensory modality, frequency, duration, or intensity (Näätänen et al., 2007). Primary generators for MMN are located in the primary auditory cortex (Alho, 1995; Maess et al., 2007). Through a meta-analysis, deficits in MMN generation were suggested to be a robust feature in chronic schizophrenia, indicating abnormalities in automatic context-dependent auditory information processing in these patients (Umbricht and Krljes, 2005). MMN reduction was associated with global impairments in everyday functioning in schizophrenia patients (Light and Braff, 2005). MMN appeared to be reduced, even at illness onset (Salisbury et al., 2002, 2007; Umbricht et al., 2006; Jahshan et al., 2012). In addition, P3a is an ERP-index of an involuntary attention switch (Escera et al., 2000). Auditory P3a is the earliest ERP abnormality to be studied in schizophrenia (Roth and Cannon, 1972). P3a was reduced in patients with chronic schizophrenia (Mathalon et al., 2000; Jeon and Polich, 2003). P3a might serve as a risk or trait marker of the genetic risk of schizophrenia (Winterer, 2000; Hall et al., 2006).

Until recently, emotional MMN and P3a were not utilized to assess the automaticity and involuntary attention of emotional salience processing of voices, respectively (Schirmer et al., 2005). The unexpected presence of emotionally spoken syllables embedded in a passive oddball paradigm can trigger emotional MMN and P3a. In particular, an emotional mismatch response, an infant analog of the adult emotional MMN, was identified in newborns, reflecting the emergence of emotional arousal during the first days of life (Cheng et al., 2012). Females exhibited stronger emotional MMN and P3a than did males, implying sex hormone-mediated processing of emotional voices (Hung and Cheng, 2014). Testosterone administration could alter emotional MMN and P3a, lending support to the involvement of the amygdala in the generator sources (Chen et al., 2015). These findings support the notion that emotional MMN and P3a can probe emotional voice processing. In the same vein, emotional MMN and P3a were reduced in individuals with autism spectrum conditions, and lower angry MMN amplitudes were associated with higher levels of autistic traits (Fan and Cheng, 2014). However, to the best of our knowledge, emotional MMN and P3a have not been examined in individuals with schizophrenia.

To understand the extent to which basic auditory processing contributes to impaired emotional salience processing of voices, we presented the emotionally spoken meaningless syllables dada and acoustically matched non-vocal sounds in a passive oddball paradigm to individuals with chronic schizophrenia and matched controls. It is worth mentioning that disrupted activity in the amygdala might lead to abnormal assignment of salience to ambiguous, potentially threatening stimuli, such as angry voices, in patients with schizophrenia, particularly in those with positive symptoms (Holt and Philops, 2009). One neuroimaging study demonstrated that the amygdala was activated by angry syllables presented in a passive oddball paradigm (Schirmer et al., 2008). We thus hypothesized that, if general deficits in auditory processing existed, then patients with schizophrenia would exhibit altered MMN and P3a responses to angry and happy syllables and the corresponding non-vocal sounds. If the deficit were selective for threatening voices, then individuals with schizophrenia would show MMN and P3a to angry syllables distinct from those of controls. In addition, to further explore the relationship between neurophysiological responses and symptom severity, we conducted correlation analyses to test the extent to which emotional MMN and P3a covaried with the Positive and Negative Syndrome Scale (PANSS).

Materials and Methods


Thirty schizophrenia patients and thirty controls were enrolled. Individuals with schizophrenia were recruited from a local hospital. Using the Structured Clinical Interview from the Diagnostic and Statistical Manual of Mental Disorders Fourth Revised Patient Edition, psychiatrists reconfirmed that the illness was in a non-acute and stable phase. All subjects were ethnic Chinese. The age- and handedness-matched controls were recruited from the local community and screened for major psychiatric illnesses by using the Structured Clinical Interview for DSM-IV Axis I Disorders (SCID-I). Table 1 lists demographics and clinical variables. Subjects with comorbid psychiatric or neurological disorders (e.g., dementia or seizures), a history of head injury, or alcohol or substance abuse or dependence were excluded. All of the participants exhibited normal peripheral hearing bilaterally (pure tone average thresholds < 15 dB HL) at the time of testing. For the handedness, medications, and doses for each patient, please refer to Table 2. All subjects provided written informed consent and assent for the study, which was approved by the local ethics committee (Yang–Ming University Hospital) and conducted in accordance with the Declaration of Helsinki.


TABLE 1. Demographic and clinical variables related to study participants.


TABLE 2. Handedness and medications for each individual with schizophrenia.

Auditory Stimuli

The stimuli comprised two categories: emotional syllables and acoustically matched non-vocal sounds. For emotional syllables, a young female speaker from a performing arts school produced the meaningless syllable dada with three sets of emotional (angry, happy, neutral) prosodies. Within each set of emotional syllables, the speaker produced the syllables more than ten times. Emotional syllables were edited to become equally long (550 ms) and loud (min: 57 dB; max: 62 dB; mean 59 dB) using Cool Edit Pro 2.0 and Sound Forge 9.0. Each syllable set was rated for emotionality on a 5-point Likert-scale (see Cheng et al., 2012; Fan et al., 2013; Fan and Cheng, 2014 for validation). Two emotional syllables that were consistently identified as ‘extremely angry’ and ‘extremely happy’ and one neutral syllable rated as the most emotionless were selected as the stimuli. The ratings on the Likert-scale (mean ± SD) were 4.26 ± 0.85, 4.04 ± 0.91, and 2.47 ± 0.87 for angry, happy, and neutral syllables, respectively.

To create a set of control stimuli that retain acoustical correspondence with emotional syllables, we synthesized non-vocal sounds by using Praat (Boersma, 2001) and MATLAB (The MathWorks, Inc., USA). The center of gravity of frequency (fn) of each original syllable was defined as $f_n = \int |X(f)|^2 \, f \, df \,/\, \int |X(f)|^2 \, df$, where X(f) was the Fourier spectrum of the emotional syllable. The fn of angry, happy, and neutral syllables was 1249 Hz, 1159 Hz, and 1156 Hz, respectively. We then produced non-vocal sounds by multiplying the sine waveform with two Hamming windows that were temporally centered at each of the syllables [non-vocal sound = fn(t) × Hamming window(t), where fn(t) denotes the sine waveform at fn]. This method has been used to synthesize non-vocal sounds for controlling the temporal envelope and core spectral element of emotional syllables (Fan et al., 2013; Chen et al., 2014; Hung and Cheng, 2014). The time-course and frequency spectrum of emotional syllables and corresponding non-vocal sounds are illustrated in Figure 1. In addition, non-vocal sounds had emotionality ratings on the Likert-scale (2.47 ± 0.87) comparable with those of neutral syllables (P > 0.1), as well as below-chance hits on the emotional categorization task (Chen et al., 2014), indicating emotional neutrality of the acoustic controls.
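The synthesis described above can be sketched in a few lines of Python with NumPy. This is only an illustration under stated assumptions, not the authors' Praat/MATLAB code: the function names, the 44.1 kHz sampling rate used in the usage note, and the choice to split the 550-ms stimulus into two equal halves (one Hamming window per "da") are all hypothetical.

```python
import numpy as np


def spectral_centroid(signal, sample_rate):
    """Power-weighted mean frequency (Hz): sum(|X|^2 * f) / sum(|X|^2)."""
    spectrum = np.fft.rfft(signal)
    power = np.abs(spectrum) ** 2
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)
    return float(np.sum(power * freqs) / np.sum(power))


def synthesize_nonvocal(centroid_hz, sample_rate, duration_s=0.55, n_windows=2):
    """Sine wave at the centroid frequency shaped by consecutive Hamming windows."""
    n_samples = int(round(duration_s * sample_rate))
    t = np.arange(n_samples) / sample_rate
    carrier = np.sin(2.0 * np.pi * centroid_hz * t)
    # Assumption: the windows tile the stimulus in equal parts, one per syllable.
    bounds = np.linspace(0, n_samples, n_windows + 1).astype(int)
    envelope = np.zeros(n_samples)
    for start, stop in zip(bounds[:-1], bounds[1:]):
        envelope[start:stop] = np.hamming(stop - start)
    return carrier * envelope
```

For example, `synthesize_nonvocal(1249.0, 44100)` would give an angry-derived control sound, and passing the result back through `spectral_centroid` should return a value close to 1249 Hz, which is the property the control stimuli are meant to preserve.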


FIGURE 1. Acoustic properties of stimulus materials. (A) Oscillogram of auditory stimulus. (B) Spectrogram of auditory stimulus. Non-vocal sounds retain the spectral centroid (fn) as well as the temporal envelope of emotional syllables.

EEG Apparatus, Procedures, Recording, and Data Analysis

Before EEG recordings, psychiatrists administered the PANSS (Kay et al., 1987; Phillips et al., 1991) to evaluate the symptom severity of schizophrenia. The EEG recording was conducted in a sound-attenuated and electrically shielded room. During EEG recording, participants were required to watch a silent movie with subtitles while task-irrelevant stimuli in oddball sequences were presented. Notably, instead of presenting physically identical stimuli as both standards and deviants (Schirmer et al., 2007), we followed the same rationale as previous work for controlling the mismatch paradigm (Čeponienë et al., 2003; Chen et al., 2014). The passive oddball paradigm for emotional syllables employed happy and angry syllables as deviants (D1, D2) and neutral syllables as standards (S). The corresponding non-vocal sounds were applied in the same oddball paradigm, but were presented as separate blocks so that relative acoustic features among S, D1, and D2 were controlled across blocks. Each stimulus category (emotional syllables vs. non-vocal sounds) comprised two blocks, the order of which was counter-balanced and randomized across participants. Each block consisted of 600 trials, of which 80% were neutral syllables or neutral-derived sounds, 10% were angry syllables or angry-derived sounds, and 10% were happy syllables or happy-derived sounds. The sequences of blocks and stimuli were quasi-randomized to avoid successive blocks and successive deviants from identical stimulus categories. A minimum of two standards was always presented between any two deviants. The stimulus-onset-asynchrony was 1200 ms, including a stimulus length of 550 ms and an inter-stimulus interval of 650 ms.
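A block with these proportions and the two-standard spacing rule can be generated as in the following sketch. It is a hypothetical illustration, not the presentation script used in the study: the function name, the labels "S", "D1", and "D2", and the decision to also start each block with at least two standards are assumptions, and only the spacing constraint is enforced.

```python
import random


def make_oddball_sequence(n_trials=600, p_deviant_each=0.10, min_gap=2, seed=None):
    """Return a list of 'S', 'D1', 'D2' labels with >= min_gap standards before each deviant."""
    rng = random.Random(seed)
    n_each = round(n_trials * p_deviant_each)
    deviants = ["D1"] * n_each + ["D2"] * n_each
    rng.shuffle(deviants)

    n_standards = n_trials - len(deviants)
    spare = n_standards - min_gap * len(deviants)
    if spare < 0:
        raise ValueError("not enough standards to satisfy the spacing rule")

    # Scatter the standards that are not needed for spacing over all gaps,
    # including the gap after the final deviant.
    extra = [0] * (len(deviants) + 1)
    for _ in range(spare):
        extra[rng.randrange(len(extra))] += 1

    sequence = []
    for deviant, n_extra in zip(deviants, extra):
        sequence.extend(["S"] * (min_gap + n_extra))
        sequence.append(deviant)
    sequence.extend(["S"] * extra[-1])
    return sequence
```

With the defaults, each call yields 480 standards and 60 deviants of each type; with the 1200-ms stimulus-onset asynchrony reported above, one such block would last 12 minutes.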

The MMN and P3a amplitudes were analyzed as an average within a 50-ms window surrounding the peak at selected electrode sites. Based on prior literature (Näätänen et al., 2007, 2011), the MMN peak was defined as the largest negativity after subtracting the standard ERP from the deviant ERP during a period of 150 to 350 ms after stimulus onset. Only the standards before the deviants were included in the analysis. The P3a peak was defined as the largest positivity within the period of 250 to 450 ms. Three-way mixed ANOVAs were separately performed on MMN and P3a for each category (emotional syllables or non-vocal sounds) with deviant type (happy or angry) and electrode site (F3, Fz, F4, C3, Cz, or C4) as the within-subject factors and group (schizophrenia or control) as the between-subject factor. The dependent variables were the mean amplitudes and peak latencies of the MMN and P3a components at the selected electrode sites. Degrees of freedom were corrected using the Greenhouse–Geisser method when sphericity was violated. A Bonferroni-corrected t-test was only conducted when preceded by significant main effects. Spearman’s correlation analysis was conducted between emotional MMN or P3a and the PANSS subscales.
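The peak definition and window averaging described above can be expressed compactly. The sketch below is a plain-Python illustration for a single electrode, assuming that the 50-ms window is centered on the peak (±25 ms) and that the ERPs are already averaged and sampled on a common time axis; the function name and arguments are hypothetical.

```python
def peak_and_mean_amplitude(times_ms, deviant_erp, standard_erp,
                            search_window=(150, 350), half_width_ms=25,
                            polarity=-1):
    """Return (peak latency in ms, mean amplitude around the peak) of the difference wave.

    polarity=-1 finds the largest negativity (MMN);
    polarity=+1 finds the largest positivity (P3a, e.g. search_window=(250, 450)).
    """
    # Difference wave: deviant ERP minus standard ERP.
    diff = [d - s for d, s in zip(deviant_erp, standard_erp)]
    lo, hi = search_window
    candidates = [i for i, t in enumerate(times_ms) if lo <= t <= hi]
    if not candidates:
        raise ValueError("search window contains no samples")
    if polarity < 0:
        peak_idx = min(candidates, key=lambda i: diff[i])
    else:
        peak_idx = max(candidates, key=lambda i: diff[i])
    peak_time = times_ms[peak_idx]
    # Assumption: the 50-ms averaging window is symmetric about the peak.
    window = [diff[i] for i, t in enumerate(times_ms)
              if abs(t - peak_time) <= half_width_ms]
    return peak_time, sum(window) / len(window)
```

Averaging around the peak rather than reporting the single peak sample makes the amplitude estimate less sensitive to high-frequency noise in the difference wave.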


Neurophysiological Measures between Groups

Preattentive discrimination of emotional voices was studied using MMN, determined by subtracting neutral ERP from angry and happy ERPs (Table 3). The ANOVA on MMN amplitudes to emotional syllables reflected main effects exerted by the deviant type [F(1,58) = 70.60, p < 0.001, ηp2 = 0.55], electrode [F(1,58) = 4.97, p < 0.001, ηp2 = 0.08], and group [F(1,58) = 4.45, p = 0.039, ηp2 = 0.07] (Figure 2). As compared to controls (mean ± SE: 1.9 ± 0.16 μV), individuals with schizophrenia displayed weaker emotional MMN (1.44 ± 0.16) irrespective of happy or angry deviants. Angry MMN (2.15 ± 0.14) was significantly stronger than happy MMN (1.19 ± 0.11) irrespective of schizophrenia or control. MMN had the strongest amplitudes at electrode Fz (1.83 ± 0.13) as compared to F3 (1.68 ± 0.11), F4 (1.72 ± 0.12), C3 (1.57 ± 0.1), Cz (1.66 ± 0.13), or C4 (1.56 ± 0.11). In addition, none of their interactions reached significance, including electrode × group [F(5,290) = 1.84, p = 0.132, ηp2 = 0.03], deviant type × group [F(1,58) = 1.11, p = 0.296, ηp2 = 0.02], electrode × deviant type [F(5,290) = 0.19, p = 0.914, ηp2 < 0.01] or electrode × deviant type × group [F(5,290) = 1.72, p = 0.160, ηp2 = 0.03].


TABLE 3. Mean amplitude and latencies of MMN to emotional syllables and non-vocal sounds within a time window of 150 to 350 ms at frontal electrodes in individuals with schizophrenia and controls (Mean ± SD)


FIGURE 2. Mismatch negativity (MMN) amplitudes to emotional syllables and non-vocal sounds at electrode site Fz between individuals with schizophrenia and their matched controls. Non-vocal deviants retaining acoustical features of emotional syllables were derived from angry (angry-derived) and happy (happy-derived) syllables. MMN to angry and angry-derived deviants (black solid and dotted lines) were significantly weaker in individuals with schizophrenia than in the controls (p = 0.02; p = 0.007), whereas no differences between these two groups emerged in the MMN to happy and happy-derived deviants (gray solid and dotted lines).

Mismatch negativity amplitudes to non-vocal sounds showed main effects for the deviant type [F(1,58) = 132.58, p < 0.001, ηp2 = 0.70], electrode [F(1,58) = 11.75, p < 0.001, ηp2 = 0.17], and group [F(1,58) = 5.47, p = 0.023, ηp2 = 0.09], as well as interactions for deviant type × electrode [F(1,58) = 8.02, p < 0.001, ηp2 = 0.12] and deviant type × group [F(1,58) = 5.42, p = 0.023, ηp2 = 0.09] (Figure 2). Post hoc analyses revealed that angry-derived MMN exerted an electrode effect (p < 0.001) with the strongest amplitude at the electrode site Fz (2.47 ± 0.15 μV), but happy-derived MMN did not exhibit this effect (p = 0.17). Individuals with schizophrenia relative to the controls showed weaker angry-derived MMN (p = 0.037) but comparable happy-derived MMN (0.75 ± 0.12 vs. 0.65 ± 0.12, p = 0.56).

Mismatch negativity peak latencies to emotional syllables and non-vocal sounds did not reveal any effect involving the group factor. This indicated that individuals with schizophrenia did not differ from controls in terms of the speed of preattentive processing.

Involuntary attention switches to emotional voices were studied using P3a (Table 4). P3a amplitudes to emotional syllables reflected main effects for the deviant type [F(1,58) = 4.66, p = 0.035, ηp2 = 0.07] and electrode [F(1,58) = 7.60, p < 0.001, ηp2 = 0.12] as well as a statistically non-significant trend toward the group effect [F(1,58) = 3.01, p = 0.088, ηp2 = 0.05]. As compared to controls (1.13 ± 0.18 μV), individuals with schizophrenia displayed numerically stronger P3a to emotional syllables (1.58 ± 0.18) irrespective of happy or angry deviants. Angry P3a (1.52 ± 0.16 μV) was stronger than happy P3a (1.19 ± 0.14) irrespective of group or electrode. P3a had the strongest amplitude at Fz (1.54 ± 0.15) as compared to F3 (1.45 ± 0.14), F4 (1.37 ± 0.14), C3 (1.21 ± 0.12), Cz (1.44 ± 0.16), or C4 (1.14 ± 0.13).


TABLE 4. Mean amplitude and latencies of P3a to emotional syllables and non-vocal sounds within a time window of 250 to 450 ms at frontal electrodes in individuals with schizophrenia and controls (Mean ± SD).

P3a amplitudes to non-vocal sounds showed an electrode effect [F(1,58) = 5.09, p < 0.001, ηp2 = 0.03] and a deviant type × group interaction [F(1,58) = 5.32, p = 0.025, ηp2 = 0.08]. Post hoc analyses indicated that angry-derived P3a was stronger than happy-derived P3a for the control group [F(1,58) = 4.93, p = 0.034, ηp2 = 0.15], whereas no difference emerged for the schizophrenia group [F(1,58) = 0.79, p = 0.38, ηp2 = 0.03].

P3a peak latencies to emotional syllables reflected an effect for the deviant type [F(1,58) = 14.12, p < 0.001, ηp2 = 0.20] and a deviant type × group interaction [F(1,58) = 5.47, p = 0.023, ηp2 = 0.09]. Post hoc analyses indicated that individuals with schizophrenia had longer latencies for angry P3a than controls [t(58) = 2.27, p = 0.027, d = 0.59], whereas the two groups displayed similar happy P3a latencies [t(58) = 0.74, p = 0.46, d = 0.19]. In addition, P3a peak latencies to non-vocal sounds reflected an effect for the deviant type [F(1,58) = 11.51, p = 0.001, ηp2 = 0.17], indicating earlier latencies for happy-derived P3a than angry-derived P3a irrespective of group or electrode.

Correlation between Emotional MMN or P3a and Symptom Severity

There were significant correlations between angry MMN amplitudes and positive symptoms (Table 5). The Holm–Bonferroni step-down procedure was conducted to control the family-wise error rate (FWER, p < 0.05) for multiple comparisons. Spearman’s correlation analyses on the PANSS subscales indicated that angry MMN amplitudes at C3 were negatively correlated with positive symptoms (ρ = -0.52, p = 0.003) (Figure 3). No such correlation was observed for either the negative symptom or general psychopathology scores. P3a was not correlated with the PANSS. In addition, neither MMN nor P3a exhibited any age-related correlation.
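For readers unfamiliar with the Holm–Bonferroni step-down procedure, a minimal sketch follows. It is a generic textbook implementation, not the authors' analysis script, and the function name is hypothetical. The sorted p values are compared against α/m, α/(m − 1), and so on, and testing stops at the first non-rejection.

```python
def holm_bonferroni(p_values, alpha=0.05):
    """Return a list of booleans: True where the null hypothesis is rejected."""
    m = len(p_values)
    # Indices of the p values from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, idx in enumerate(order):
        # The k-th smallest p value (k = rank + 1) is tested against alpha / (m - k + 1).
        if p_values[idx] <= alpha / (m - rank):
            reject[idx] = True
        else:
            break  # step-down: all larger p values are retained as well
    return reject
```

As an illustration with partly hypothetical inputs, if the three PANSS subscale correlations had p values of 0.003 (the reported positive-symptom result), 0.2, and 0.6, only the first would survive, because 0.003 falls below 0.05/3 whereas 0.2 exceeds 0.05/2.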


TABLE 5. The correlation analysis between PANSS (positive, negative, and general psychopathology) and angry MMN/P3a amplitudes, including the correlation coefficients (ρ) and P values corrected for multiple comparisons.


FIGURE 3. Correlations between emotional MMN amplitudes and symptom severities in individuals with schizophrenia. The severity of positive and negative symptoms was assessed with the PANSS.

Relationship between Sensitivity and Specificity for Angry MMN

Receiver operating characteristic (ROC) analysis was conducted to measure the ability of emotional and non-vocal MMN amplitudes to differentiate between schizophrenia and control individuals (Figure 4). The area under the ROC curve (AUC) is indicative of the overall accuracy of the measure, representing the probability that a randomly selected true-positive individual scores higher on the measure than a randomly selected true-negative individual, with 50% representing chance level.
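The probabilistic reading of the AUC given above can be computed directly by comparing every patient with every control. The sketch below is a generic illustration with a hypothetical function name; ties are counted as one half, which is the usual convention.

```python
def auc_from_scores(positive_scores, negative_scores):
    """Probability that a random positive case scores higher than a random negative case."""
    wins = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties contribute half a win
    return wins / (len(positive_scores) * len(negative_scores))
```

A value of 1.0 means the two groups are perfectly separated by the measure, and 0.5 means the measure does no better than chance.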


FIGURE 4. Receiver operating characteristic (ROC) analysis. Angry and angry-derived MMN amplitudes were used as a predictor to differentiate schizophrenia patients from the controls.

Receiver operating characteristic analysis for angry MMN resulted in AUC values of 0.65 (p = 0.049) over frontal electrodes. The most appropriate cut-off point for angry MMN with sensitivity of 70% and specificity of 70% was -1.89 μV. The AUC values for angry-derived MMN and angry P3a were 0.70 (p = 0.007) and 0.66 (p = 0.037). This indicated that angry and angry-derived MMN as well as angry P3a could help predict whether someone had received a clinical diagnosis of schizophrenia or not.
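To make the meaning of a cut-off concrete, the following sketch computes sensitivity and specificity at a given threshold. It assumes, as an interpretation rather than something stated in the text, that MMN amplitudes are expressed as negative voltages and that a weaker (less negative) value than the cut-off counts as a positive test; the function name and the example amplitudes are hypothetical.

```python
def sensitivity_specificity(patient_values, control_values, cutoff):
    """Classify values above the cut-off (weaker MMN) as 'patient' and score the result."""
    # Assumption: a less negative amplitude than the cut-off is a positive test.
    true_positives = sum(1 for v in patient_values if v > cutoff)
    true_negatives = sum(1 for v in control_values if v <= cutoff)
    sensitivity = true_positives / len(patient_values)
    specificity = true_negatives / len(control_values)
    return sensitivity, specificity


# Hypothetical amplitudes in microvolts, for illustration only.
patients = [-1.0, -1.5, -2.5]
controls = [-2.0, -3.0, -1.2]
sens, spec = sensitivity_specificity(patients, controls, cutoff=-1.89)
```

Sweeping the cut-off over all observed amplitudes and plotting sensitivity against one minus specificity traces out the ROC curve.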


This study aims to clarify the extent to which basic auditory processing contributes to impaired emotional prosodic detection in schizophrenia. The results indicated that abnormal assignment of salience to threatening voices could help predict positive symptoms in schizophrenia. MMN, indexing preattentive detection of emotional salience of voices, was significantly reduced to angry syllables and angry-derived non-vocal sounds in schizophrenia. P3a, an index for selective attention control, showed greater amplitudes but longer latencies to angry syllables in schizophrenia. Weaker MMN amplitudes were associated with more positive symptoms of PANSS. ROC analyses suggested that angry MMN and P3a could predict whether someone had received a clinical diagnosis of schizophrenia or not.

Mismatch negativity amplitudes decreased for angry syllables and angry-derived non-vocal sounds in chronic schizophrenia. This finding might support the proposal that basic auditory processing abnormalities contribute to affective prosody dysfunction in schizophrenia (Leitman et al., 2005, 2007, 2010, 2011). Similarly, affective prosody recognition and MMN amplitudes elicited by infrequent high-pitched tones in the oddball paradigm were significantly associated (Jahshan et al., 2013). The emotional-derived non-vocal sounds in this study may partially reflect analog frequency (pitch) changes in pure tones. Studies on schizophrenia patients have reported decreased MMN in response to pitch deviants (e.g., Javitt et al., 1993, 1998; Catts et al., 1995). As indicated by reduced MMN in response to angry syllables and angry-derived non-vocal sounds, this study demonstrates that people with chronic schizophrenia process emotional voices in an atypical fashion at the preattentive level. Emotional voice processing abnormalities might be partially driven by impaired processing of low-level acoustic parameters.

Angry P3a, an index for selective attention control, had longer latencies in schizophrenia patients than in controls. Despite general consensus that P3a indexes attention switching to novel stimuli associated with psychopathology, the findings on increases or decreases of P3a amplitudes in psychotic patients are mixed (Javitt et al., 2008). Some reports have stated that patients at risk for schizophrenia exhibit weaker P3a (Mathalon et al., 2000; Jeon and Polich, 2003), whereas another found that stronger P3a was associated with an increased risk (Winterer, 2000). Atypical P3a might reflect pathological distractibility in chronic psychiatric patients (Escera et al., 2000). Emotional voices usually attract involuntary attention (Grandjean et al., 2008). Disturbed reciprocal fronto-limbic pathways might impair prefrontal dominance for controlling the hyperactive limbic system, resulting in failure to inhibit irrelevant information (Weinberger, 1987). Schizophrenia patients with auditory hallucination symptoms find it more difficult to control their selective attention, particularly in the presence of emotional distracters (Alba-Ferrara et al., 2013). In this study, the presence of P3a differentiation between angry and happy syllables, along with the absence of P3a differentiation between angry-derived and happy-derived non-vocal sounds among schizophrenia patients, could be ascribed to an imbalance of involuntary attention switching between emotional voices and acoustic attributes. Consistent with P3 being a quantitative phenotype (Winterer et al., 2003), ROC analyses indicated that angry P3a could help predict whether someone had received a clinical diagnosis of schizophrenia.

Positive symptoms coupled with angry MMN amplitudes within schizophrenia patients support the hypothesis that prosodic dysfunction may mediate the misattribution of auditory hallucination (David, 1994). Hearing voices is the most common type of hallucination associated with positive schizophrenia symptoms. Deficits of emotional prosodic perceptions were proposed as critical contributors to the formation of auditory hallucinations (Leitman et al., 2005; Rossell and Boundy, 2005). Patients experiencing auditory hallucinations were not as successful at recognizing prosodic cues as the non-hallucinating patients (Shea et al., 2007). Hallucinating patients exhibited reduced activations in the amygdala and insula when hearing crying sounds (Kang et al., 2009). Sensory gating deficits reflect the inability to filter out extraneous noise from meaningful sensory inputs (Freedman et al., 1987). They cause a cascade failure, rendering the malfunctioned limbic system unable to detect the emotional salience of incoming stimuli (Anticevic et al., 2012).

Some limitations of this study must be acknowledged. First, regarding sample homogeneity, the generalizability of the results may be limited because people with acute schizophrenia were not included. Second, the MMN recording here may not be state of the art. Unlike designs that use the same stimuli as both standards and deviants to control the mismatch paradigm (Schirmer et al., 2007), the MMN effect in this study may be partly driven by physical stimulus characteristics. However, following the same rationale as previous work (Čeponienë et al., 2003), we have conducted a series of studies to verify emotional and non-vocal MMN in the strict sense of disentangling emotional salience from physical properties (Cheng et al., 2012; Hung et al., 2013; Fan and Cheng, 2014; Chen et al., 2015). This may not be the optimal design, and future studies are warranted with a larger sample size, in which people with acute schizophrenia are recruited and stimuli with greater acoustic correspondence are included.

This study demonstrates that patients with chronic schizophrenia exhibited reduced MMN responses to both angry syllables and angry-derived non-vocal sounds, indicating general impairments of voice perception and acoustic discrimination. The atypical processing of emotional salience at the preattentive level might be partially driven by impaired processing of low-level acoustic parameters. The failure to tune their attention to contextually irrelevant stimuli of emotional voices could be ascribed to pathological distractibility. In particular, the MMN amplitudes to angry voices were associated with the severity of positive symptoms. These findings could provide evidence for bottom–up (i.e., perceptually based) cognitive remediation approaches, and indicate that emotional MMN and P3a can be potential neurophysiological endophenotypes of schizophrenia (Turetsky et al., 2007; Javitt and Sweet, 2015).

Author Contributions

CC, P-YW, and YC took part in designing the study. CC and C-CL undertook the statistical analysis. C-CL and YC managed the literature search and wrote the first draft of the manuscript. All authors have contributed to and approved the manuscript.


The study was funded by the Ministry of Science and Technology (MOST 103-2410-H-010-003-MY3; 104-2420-H-010-001-; 105-2420-H-010-003-), National Yang-Ming University Hospital (RD2015-004; RD2016-004), Ministry of Education (Aim for the Top University Plan), and Department of Health, Taipei City Government (10401-62-023). The funders had no role in the study design, data collection, and analysis, decision to publish, or preparation of the manuscript.

Conflict of Interest Statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.


We thank the study participants and clinicians involved in recruitment and assessment. We also thank Yu-Fen Kuan and Shin-Yi Lee for assisting with data collection.


Alba-Ferrara, L., de Erausquin, G. A., Hirnstein, M., Weis, S., and Hausmann, M. (2013). Emotional prosody modulates attention in schizophrenia patients with hallucinations. Front. Hum. Neurosci. 7:59. doi: 10.3389/fnhum.2013.00059

Alho, K. (1995). Cerebral generators of mismatch negativity (MMN) and its magnetic counterpart (MMNm) elicited by sound changes. Ear Hear. 16, 38–51. doi: 10.1097/00003446-199502000-00004

Anticevic, A., Repovs, G., and Barch, D. M. (2012). Emotion effects on attention, amygdala activation, and functional connectivity in schizophrenia. Schizophr. Bull. 38, 967–980. doi: 10.1093/schbul/sbq168

Bach, D. R., Buxtorf, K., Grandjean, D., and Strik, W. K. (2009). The influence of emotion clarity on emotional prosody identification in paranoid schizophrenia. Psychol. Med. 39, 927–938. doi: 10.1017/S0033291708004704

Boersma, P. (2001). Praat, a system for doing phonetics by computer. Glot Int. 5, 341–345.

Catts, S. V., Shelley, A. M., Ward, P. B., Liebert, B., McConaghy, N., Andrews, S., et al. (1995). Brain potential evidence for an auditory sensory memory deficit in schizophrenia. Am. J. Psychiatry 152, 213–219. doi: 10.1176/ajp.152.2.213

Čeponienë, R., Lepistö, T., Shestakova, A., Vanhala, R., Alku, P., Näätänen, R., et al. (2003). Speech-sound-selective auditory impairment in children with autism: they can perceive but do not attend. Proc. Natl. Acad. Sci. U.S.A. 100, 5567–5572. doi: 10.1073/pnas.0835631100

Chen, C., Chen, C. Y., Yang, C. Y., Lin, C. H., and Cheng, Y. (2015). Testosterone modulates preattentive sensory processing and involuntary attention switches to emotional voices. J. Neurophysiol. 113, 1842–1849. doi: 10.1152/jn.00587.2014

Chen, C., Lee, Y. H., and Cheng, Y. (2014). Anterior insular cortex activity to emotional salience of voices in a passive oddball paradigm. Front. Hum. Neurosci. 8:743. doi: 10.3389/fnhum.2014.00743

Cheng, Y., Lee, S. Y., Chen, H. Y., Wang, P. Y., and Decety, J. (2012). Voice and emotion processing in the human neonatal brain. J. Cogn. Neurosci. 24, 1411–1419. doi: 10.1162/jocn_a_00214

David, A. S. (1994). “The neuropsychology of auditory-verbal hallucinations,” in The Neuropsychology of Schizophrenia, eds A. David and J. Cutting (New York, NY: Psychology Press), 269–312.

Escera, C., Alho, K., Schröger, E., and Winkler, I. (2000). Involuntary attention and distractibility as evaluated with event-related brain potentials. Audiol. Neurootol. 5, 151–166. doi: 10.1159/000013877

Fan, Y. T., and Cheng, Y. (2014). Atypical mismatch negativity in response to emotional voices in people with autism spectrum conditions. PLoS ONE 9:e102471. doi: 10.1371/journal.pone.0102471

Fan, Y. T., Hsu, Y. Y., and Cheng, Y. (2013). Sex matters: n-back modulates emotional mismatch negativity. Neuroreport 24, 457–463. doi: 10.1097/WNR.0b013e32836169b9

Freedman, R., Adler, L. E., Gerhardt, G. A., Waldo, M., Baker, N., Rose, G. M., et al. (1987). Neurobiological studies of sensory gating in schizophrenia. Schizophr. Bull. 13, 669–678. doi: 10.1093/schbul/13.4.669

Grandjean, D., Sander, D., Lucas, N., Scherer, K. R., and Vuilleumier, P. (2008). Effects of emotional prosody on auditory extinction for voices in patients with spatial neglect. Neuropsychologia 46, 487–496. doi: 10.1016/j.neuropsychologia.2007.08.025

Hall, M. H., Schulze, K., Bramon, E., Murray, R. M., Sham, P., and Rijsdijk, F. (2006). Genetic overlap between P300, P50, and duration mismatch negativity. Am. J. Med. Genet. B Neuropsychiatr. Genet. 141B, 336–343. doi: 10.1002/ajmg.b.30318

Holt, D. J., and Phillips, M. L. (2009). “The human amygdala in schizophrenia,” in The Human Amygdala, eds P. J. Whalen and E. A. Phelps (New York, NY: Guilford Press), 344–361.

Hung, A. Y., Ahveninen, J., and Cheng, Y. (2013). Atypical mismatch negativity to distressful voices associated with conduct disorder symptoms. J. Child Psychol. Psychiatry 54, 1016–1027. doi: 10.1111/jcpp.12076

Hung, A. Y., and Cheng, Y. (2014). Sex differences in preattentive perception of emotional voices and acoustic attributes. Neuroreport 25, 464–469. doi: 10.1097/WNR.0000000000000115

Jahshan, C., Cadenhead, K. S., Rissling, A. J., Kirihara, K., Braff, D. L., and Light, G. A. (2012). Automatic sensory information processing abnormalities across the illness course of schizophrenia. Psychol. Med. 42, 85–97. doi: 10.1017/S0033291711001061

Jahshan, C., Wynn, J. K., and Green, M. F. (2013). Relationship between auditory processing and affective prosody in schizophrenia. Schizophr. Res. 143, 348–353. doi: 10.1016/j.schres.2012.11.025

Javitt, D. C., Doneshka, P., Zylberman, I., Ritter, W., and Vaughan, H. G. Jr. (1993). Impairment of early cortical processing in schizophrenia: an event-related potential confirmation study. Biol. Psychiatry 33, 513–519. doi: 10.1016/0006-3223(93)90005-X

Javitt, D. C., Grochowski, S., Shelley, A. M., and Ritter, W. (1998). Impaired mismatch negativity (MMN) generation in schizophrenia as a function of stimulus deviance, probability, and interstimulus/interdeviant interval. Electroencephalogr. Clin. Neurophysiol. 108, 143–153. doi: 10.1016/S0168-5597(97)00073-7

Javitt, D. C., Spencer, K. M., Thaker, G. K., Winterer, G., and Hajos, M. (2008). Neurophysiological biomarkers for drug development in schizophrenia. Nat. Rev. Drug Discov. 7, 68–83. doi: 10.1038/nrd2463

Javitt, D. C., and Sweet, R. A. (2015). Auditory dysfunction in schizophrenia: integrating clinical and basic features. Nat. Rev. Neurosci. 16, 535–550. doi: 10.1038/nrn4002

Jeon, Y. W., and Polich, J. (2003). Meta-analysis of P300 and schizophrenia: patients, paradigms, and practical implications. Psychophysiology 40, 684–701. doi: 10.1111/1469-8986.00070

Kang, J. I., Kim, J. J., Seok, J. H., Chun, J. W., Lee, S. K., and Park, H. J. (2009). Abnormal brain response during the auditory emotional processing in schizophrenic patients with chronic auditory hallucinations. Schizophr. Res. 107, 83–91. doi: 10.1016/j.schres.2008.08.019

Kay, S. R., Fiszbein, A., and Opler, L. A. (1987). The positive and negative syndrome scale (PANSS) for schizophrenia. Schizophr. Bull. 13, 261–276. doi: 10.1093/schbul/13.2.261

Leitman, D. I., Foxe, J. J., Butler, P. D., Saperstein, A., Revheim, N., and Javitt, D. C. (2005). Sensory contributions to impaired prosodic processing in schizophrenia. Biol. Psychiatry 58, 56–61. doi: 10.1016/j.biopsych.2005.02.034

Leitman, D. I., Hoptman, M. J., Foxe, J. J., Saccente, E., Wylie, G. R., Nierenberg, J., et al. (2007). The neural substrates of impaired prosodic detection in schizophrenia and its sensorial antecedents. Am. J. Psychiatry 164, 474–482. doi: 10.1176/appi.ajp.164.3.474

Leitman, D. I., Laukka, P., Juslin, P. N., Saccente, E., Butler, P., and Javitt, D. C. (2010). Getting the cue: sensory contributions to auditory emotion recognition impairments in schizophrenia. Schizophr. Bull. 36, 545–556. doi: 10.1093/schbul/sbn115

Leitman, D. I., Wolf, D. H., Laukka, P., Ragland, J. D., Valdez, J. N., Turetsky, B. I., et al. (2011). Not pitch perfect: sensory contributions to affective communication impairment in schizophrenia. Biol. Psychiatry 70, 611–618. doi: 10.1016/j.biopsych.2011.05.032

Light, G. A., and Braff, D. L. (2005). Mismatch negativity deficits are associated with poor functioning in schizophrenia patients. Arch. Gen. Psychiatry 62, 127–136. doi: 10.1001/archpsyc.62.2.127

Maess, B., Jacobsen, T., Schröger, E., and Friederici, A. D. (2007). Localizing pre-attentive auditory memory-based comparison: magnetic mismatch negativity to pitch change. Neuroimage 37, 561–571. doi: 10.1016/j.neuroimage.2007.05.040

Mathalon, D. H., Ford, J. M., and Pfefferbaum, A. (2000). Trait and state aspects of P300 amplitude reduction in schizophrenia: a retrospective longitudinal study. Biol. Psychiatry 47, 434–449. doi: 10.1016/S0006-3223(99)00277-2

Matsumoto, K., Samson, G. T., O’Daly, O. D., Tracy, D. K., Patel, A. D., and Shergill, S. S. (2006). Prosodic discrimination in patients with schizophrenia. Br. J. Psychiatry 189, 180–181. doi: 10.1192/bjp.bp.105.009332

Murphy, D., and Cutting, J. (1990). Prosodic comprehension and expression in schizophrenia. J. Neurol. Neurosurg. Psychiatry 53, 727–730. doi: 10.1136/jnnp.53.9.727

Näätänen, R., Kujala, T., and Winkler, I. (2011). Auditory processing that leads to conscious perception: a unique window to central auditory processing opened by the mismatch negativity and related responses. Psychophysiology 48, 4–22. doi: 10.1111/j.1469-8986.2010.01114.x

Näätänen, R., Paavilainen, P., Rinne, T., and Alho, K. (2007). The mismatch negativity (MMN) in basic research of central auditory processing: a review. Clin. Neurophysiol. 118, 2544–2590. doi: 10.1016/j.clinph.2007.04.026

Phillips, M. R., Xiong, W., Wang, R. W., Gao, Y. H., Wang, X. Q., and Zhang, N. P. (1991). Reliability and validity of the Chinese versions of the Scales for Assessment of Positive and Negative Symptoms. Acta Psychiatr. Scand. 84, 364–370. doi: 10.1080/00048670902873672

Rossell, S. L., and Boundy, C. L. (2005). Are auditory-verbal hallucinations associated with auditory affective processing deficits? Schizophr. Res. 78, 95–106. doi: 10.1016/j.schres.2005.06.002

Roth, W. T., and Cannon, E. H. (1972). Some features of the auditory evoked response in schizophrenia. Arch. Gen. Psychiatry 27, 466–471. doi: 10.1001/archpsyc.1972.01750280034007

Salisbury, D. F., Kuroki, N., Kasai, K., Shenton, M. E., and McCarley, R. W. (2007). Progressive and interrelated functional and structural evidence of post-onset brain reduction in schizophrenia. Arch. Gen. Psychiatry 64, 521–529. doi: 10.1001/archpsyc.64.5.521

Salisbury, D. F., Shenton, M. E., Griggs, C. B., Bonner-Jackson, A., and McCarley, R. W. (2002). Mismatch negativity in chronic schizophrenia and first-episode schizophrenia. Arch. Gen. Psychiatry 59, 686–694. doi: 10.1001/archpsyc.59.8.686

Schirmer, A., Escoffier, N., Zysset, S., Koester, D., Striano, T., and Friederici, A. D. (2008). When vocal processing gets emotional: on the role of social orientation in relevance detection by the human amygdala. Neuroimage 40, 1402–1410. doi: 10.1016/j.neuroimage.2008.01.018

Schirmer, A., Simpson, E., and Escoffier, N. (2007). Listen up! Processing of intensity change differs for vocal and nonvocal sounds. Brain Res. 1176, 103–112. doi: 10.1016/j.brainres.2007.08.008

Schirmer, A., Striano, T., and Friederici, A. D. (2005). Sex differences in the preattentive processing of vocal emotional expressions. Neuroreport 16, 635–639. doi: 10.1097/00001756-200504250-00024

Shea, T. L., Sergejew, A. A., Burnham, D., Jones, C., Rossell, S. L., Copolov, D. L., et al. (2007). Emotional prosodic processing in auditory hallucinations. Schizophr. Res. 90, 214–220. doi: 10.1016/j.schres.2006.09.021

Turetsky, B. I., Calkins, M. E., Light, G. A., Olincy, A., Radant, A. D., and Swerdlow, N. R. (2007). Neurophysiological endophenotypes of schizophrenia: the viability of selected candidate measures. Schizophr. Bull. 33, 69–94. doi: 10.1093/schbul/sbl060

Umbricht, D., and Krljes, S. (2005). Mismatch negativity in schizophrenia: a meta-analysis. Schizophr. Res. 76, 1–23. doi: 10.1016/j.schres.2004.12.002

Umbricht, D. S., Bates, J. A., Lieberman, J. A., Kane, J. M., and Javitt, D. C. (2006). Electrophysiological indices of automatic and controlled auditory information processing in first-episode, recent-onset and chronic schizophrenia. Biol. Psychiatry 59, 762–772. doi: 10.1016/j.biopsych.2005.08.030

Weinberger, D. R. (1987). Implications of normal brain development for the pathogenesis of schizophrenia. Arch. Gen. Psychiatry 44, 660–669. doi: 10.1001/archpsyc.1987.01800190080012

Winterer, G. (2000). Schizophrenia: reduced signal-to-noise ratio and impaired phase-locking during information processing. Clin. Neurophysiol. 111, 837–849. doi: 10.1016/S1388-2457(99)00322-3

Winterer, G., Egan, M. F., Raedler, T., Sanchez, C., Jones, D. W., Coppola, R., et al. (2003). P300 and genetic risk for schizophrenia. Arch. Gen. Psychiatry 60, 1158–1167. doi: 10.1001/archpsyc.60.11.1158

Keywords: schizophrenia, emotional salience, voices, mismatch negativity, receiver operator characteristic

Citation: Chen C, Liu C-C, Weng P-Y and Cheng Y (2016) Mismatch Negativity to Threatening Voices Associated with Positive Symptoms in Schizophrenia. Front. Hum. Neurosci. 10:362. doi: 10.3389/fnhum.2016.00362

Received: 20 November 2015; Accepted: 05 July 2016;
Published: 15 July 2016.

Edited by:

Peter Sörös, University of Oldenburg, Germany

Reviewed by:

Mitja Bodatsch, University of Cologne, Germany
Chun-Yu Tse, The Chinese University of Hong Kong, Hong Kong

Copyright © 2016 Chen, Liu, Weng and Cheng. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Yawei Cheng

†These authors have contributed equally to this work.