

Front. Behav. Neurosci., 02 August 2017
Sec. Individual and Social Behaviors
Volume 11 - 2017

Face Recognition, Musical Appraisal, and Emotional Crossmodal Bias

  • 1Human Anatomy and Neuroscience Lab, Department of Environmental Science and Technology, University of Salento, Lecce, Italy
  • 2Department of Psychology and Cognitive Sciences, University of Trento, Trento, Italy
  • 3Santa Chiara Institute, Lecce, Italy
  • 4Department of Electrical and Information Engineering, Polytechnic University of Bari, Bari, Italy
  • 5Department of Medical Science, Neuroscience, and Sense Organs, University Aldo Moro, Bari, Italy

Recent research on the crossmodal integration of visual and auditory perception suggests that evaluations of emotional information in one sensory modality may tend toward the emotional value generated in another sensory modality. This implies that the emotions elicited by musical stimuli can influence the perception of emotional stimuli presented in other sensory modalities, through a top-down process. The aim of this work was to investigate how crossmodal perceptual processing influences emotional face recognition and how potential modulation of this processing induced by music could be influenced by the subject's musical competence. We investigated how emotional face recognition processing could be modulated by listening to music and how this modulation varies according to the subjective emotional salience of the music and the listener's musical competence. The sample consisted of 24 participants: 12 professional musicians and 12 university students (non-musicians). Participants performed an emotional go/no-go task whilst listening to music by Albeniz, Chopin, or Mozart. The target stimuli were emotionally neutral facial expressions. We examined the N170 Event-Related Potential (ERP) and behavioral responses (i.e., motor reaction time to target recognition and musical emotional judgment). A linear mixed-effects model and a decision-tree learning technique were applied to N170 amplitudes and latencies. The main findings of the study were that musicians' behavioral responses and N170 are more affected by the emotional value of the music administered during the emotional go/no-go task, and that this bias is also apparent in responses to the non-target emotional faces. This suggests that emotional information coming from multiple sensory channels activates a crossmodal integration process that depends upon the emotional salience of the stimuli and the listener's appraisal.


The wide discussion of recent research on the interaction between music and emotion addresses various issues, mainly those relating to comparisons between emotional processing and sensory experience, and the definition of music as a process of “sense making” that involves and influences aspects of perception and cognition, as posited in a joint model of embodied mind (Reybrouck, 2005; Reybrouck and Brattico, 2015; Schiavio et al., 2016). Pioneering research on the crossmodal integration of visual and auditory perception suggests that evaluations of emotional information in one sensory modality may tend toward the emotional value generated in another (de Gelder and Vroomen, 2000; Logeswaran and Bhattacharya, 2009; Balconi and Carrera, 2011). For example, a realistic model able to explain the emotional recognition process and crossmodal integration is that of Balconi and Carrera (2011). This model is based on an experiment in which a face recognition task was combined, in a crossmodal condition, with prosody, and analyzed through the P2 ERP component. The model highlights how an early ERP component (i.e., P2) can be considered a cognitive marker in multisensory processing. Thus, the emotion produced by musical stimulation, like that produced by prosody in the previous model, may influence the perception of stimuli presented in other sensory modalities through a top-down process (Sekuler et al., 1997; Jolij and Meurs, 2011; Wong and Gauthier, 2012). Different musical genres can also modulate arousal and other psychophysiological parameters, eliciting different emotions (Schellenberg, 2005; Caldwell and Riby, 2007; Sammler et al., 2007; Fritz et al., 2009; Ladinig and Schellenberg, 2012; Schellenberg and Mankarious, 2012; Kawakami et al., 2014; Bhatti et al., 2016). For example, Baumgartner and colleagues investigated the psychophysiological effect of the interaction between emotional visual images, music, and a crossmodal presentation (music and images; Baumgartner et al., 2006).
More intensely perceived emotions emerged in the crossmodal condition, and this was accompanied by predominant alpha band activity in the EEG.

It has been proposed that music primes emotional responses to information in the visual domain (Logeswaran and Bhattacharya, 2009). Logeswaran and Bhattacharya demonstrated that musical priming (positive or negative) can modulate perceptions of emotional faces. In their study, participants were asked to rate the emotional salience of the faces, and the results demonstrated the existence of a crossmodal priming effect. Happy faces were rated as happier when they were presented after a happy piece of music and vice versa. The priming effect of music is evident with neutral targets. Analysis of Event-Related Potential (ERP) components showed that the N1 response to neutral faces was larger when stimulus presentation was preceded by happy music than when it was preceded by sad music. Previous studies have observed an increased N1 component in the auditory cortex during simultaneous presentation of an emotionally congruent face (i.e., face–voice pairs; Pourtois et al., 2002). The N1 component was distributed over the frontal regions, suggesting the involvement of top-down psychophysiological mechanisms (Zanto et al., 2011; Gilbert and Li, 2013). Moreover, the perception of music is affected by the listener's emotional and cognitive state (Kawakami et al., 2013, 2014). Many studies have highlighted differences between the cognitive processing and cortical responses of musicians and non-musicians (Pantev et al., 1998; Brattico et al., 2010; Müller et al., 2010; Pallesen et al., 2010; Herholz and Zatorre, 2012; Proverbio et al., 2013).

Recent studies suggest that musical stimulation may interact with fatigue and motor activity, thereby affecting the motivation of individuals who are under intense physical stress (Bigliassi et al., 2016a,b). Music can modulate perception and cognition via a complex interaction between the perceptual and emotional characteristics of a musical stimulus and the physical (i.e., sex differences; Miles et al., 2016), psychophysiological (Gosselin et al., 2007), and cognitive characteristics of the listener. Because of this interaction, the emotion evoked by music can result in biased responses (Chen et al., 2008). The aim of our study was to investigate how cross-modal perception (in this instance, the processing of emotional faces whilst performing a task that involves listening to music) varies with the subjective emotional salience of the music and with musical competence. This effect can be seen at the cognitive and behavioral level, in decisions and appraisals (Ellsworth and Scherer, 2003), and at the motor level (in motor reaction time; Brattico et al., 2013). In fact, the motor and perceptual systems can be subject to early, top-down modulation induced by crossmodal stimulation, which can induce emotional bias, reflected at the behavioral level and in cortical responses (i.e., at the electrophysiological level). We also evaluated whether this bias could be modulated by the participant's appraisal of the musical stimulus (Brattico and Jacobsen, 2009), choosing an electrophysiological investigation of the N170 ERP component. The N170 is the ERP component most sensitive to modulation in face recognition tasks (Eimer, 2000, 2011; Heisz et al., 2006; Kolassa et al., 2009; Ibanez et al., 2012; Leleu et al., 2015; Almeida et al., 2016). In particular, N170 is strictly linked to automatic processes (Heisz et al., 2006), unlike P2, which is a demonstrated cognitive marker in crossmodal cognition (Balconi and Carrera, 2011; Peretz, 2012).
Still, when music is processed with cognitive expertise, the emotional salience of the observed stimulus (i.e., facial expression) may be affected by emotional bias, and this effect may be observable early through N170 modulations.

Materials and Methods


Twenty-four participants were recruited at the University and at the Musical Conservatory, and were selected according to their musical skills. Twelve musicians, graduates of a Musical Conservatory (5 men and 7 women; mean age = 29.8 years; SD ± 7.2), were compared to a group of 12 non-musicians, University students (holding a three-year degree and attending the specialist degree) without formal musical training (7 men and 5 women; mean age = 26.9 years; SD ± 4.5). The instruments played by the group of musicians included piano, guitar, trumpet, and trombone; one musician was a singer. All participants were right-handed, had normal hearing, and normal or corrected-to-normal vision. Participants provided written, informed consent to participation in accordance with the Helsinki Declaration. Participants did not receive any financial compensation. The local ethics committee (ASL Lecce, Apulia Region, Italy) approved the study.


Participants performed an emotional go/no-go task (emo go/no-go), presented using E-Prime 2.0 (Richard and Charbonneau, 2009), during the EEG recordings.

The emotional go/no-go task (Schulz et al., 2007; Waters and Valvoi, 2009; Yerys et al., 2013) is a variant of the cognitive go/no-go task (Gomez et al., 2007) in which emotional information, measured through a decision-making process, is accompanied by a motor response. Generally, during an emo go/no-go task, the participant has to press the spacebar of a keyboard in response to an emotional face (neutral, angry, fearful, or happy). The choice of the face emotional expression depends on the task and on the process being investigated. The emo go/no-go task is a paradigm often used in ERP studies investigating a mismatch in response to stimulus salience (Jodo and Kayama, 1992; Smith et al., 2013; Moreno et al., 2014; Invitto et al., 2016). The N170 ERP component is the most sensitive in face recognition tasks (Eimer, 2000, 2011; Heisz et al., 2006; Blau et al., 2007). In this study, the computerized behavioral task required participants to press the spacebar when they identified a neutral face; EEG data were recorded whilst they were performing the task. Facial expressions were extracted from the NimStim Set of Facial Expressions (

The NimStim Set is a collection of 672 images of the faces of 70 professional actors displaying various emotional expressions. The actors are of varying ethnicity, and women and men are represented in the same proportions. The collection consists of images of eight emotional facial expressions: fear, happiness, sadness, anger, surprise, disgust, neutral, and calm. In this experiment, we presented a sample of 64 images of fearful, happy, and neutral faces; the expression categories were matched for sex and ethnicity.

In each condition, the go/no-go task was accompanied by one of the following pieces from the classical piano repertoire:

• Chopin: Nocturne Op. 9 n. 1 and Nocturne Op. 9 n. 2.

• Mozart: Sonata in D major, K.V. 311.

• Albeniz: In Iberia, Rondeña.

Musical stimuli were delivered via two earphones, with a Windows 7 reproduction intensity of 60% (−6.4 dB), Conexant Smart Audio HD, Roland Sound Canvas, with a sampling rate of 48,000 Hz and 24-bit depth (system information: professional quality).

Each condition began with the onset of a musical piece, selected from the pieces above, which played whilst participants performed the emo go/no-go task, looking at the emotional faces displayed on the screen.

Participants rated the pleasure, sadness, and happiness that each piece of music evoked using visual analog scales (VASs). The scales were administered immediately after each condition. Each VAS consisted of a ten-centimeter line with the poles labeled 0 (absence of pleasure, sadness, or happiness) and 10 (highest possible degree of pleasure, sadness, or happiness).

Each condition lasted approximately 500 s. Images of neutral (target), fearful and happy (non-target) faces were presented in pseudo-random order. Both target and non-target images were presented for 1,500 ms and the interstimulus interval was 1,500 ms.
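As a rough check on the timing figures above, the number of stimulus cycles that fit into one condition can be estimated. The sketch below uses only the durations given in the text; the resulting trial count is a derived upper bound under the assumption that trials ran back to back, not a figure reported in the study.

```python
# Durations taken from the text; the trial count is derived, not reported.
CONDITION_S = 500    # approximate duration of one music condition (s)
STIMULUS_MS = 1500   # face presentation time (ms)
ISI_MS = 1500        # interstimulus interval (ms)


def trials_per_condition(condition_s, stimulus_ms, isi_ms):
    """Whole number of stimulus + ISI cycles that fit in one condition."""
    cycle_ms = stimulus_ms + isi_ms
    return (condition_s * 1000) // cycle_ms


n_trials = trials_per_condition(CONDITION_S, STIMULUS_MS, ISI_MS)
print(n_trials)
```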

Participants were instructed to sit so that there was a gap of about 75 cm between the front edge of the chair and the base of the computer screen. They had to listen to the pieces of classical music and respond to the presentation of a neutral face on the screen by pressing the spacebar of the computer keyboard. At the end of each condition, participants rated the emotions the accompanying music had elicited using the VASs described above.

N170 ERP Recording

EEGs were recorded from 64 active channels, mounted in an electrode cap according to the International 10–20 system. Signals were recorded through Brain Vision actiCHamp (Brain Products GmbH); the recording software was Brain Vision Recorder and the analysis software was Brain Vision Analyzer (Brain Products GmbH). Electrode impedance was kept below 15 kΩ. The EEG was amplified (band pass 0.1–40 Hz, 24 dB), with a sampling rate of 1000 Hz. Electrodes were referenced online to FpZ. One electrode was placed at the outer canthus of the right eye and used to monitor horizontal eye movements. Vertical eye movements and blinks were monitored by electrodes above and below the left eye. Trials contaminated by eye movements, amplifier conditioning, or other artifacts were rejected. The signal was filtered offline (0.01–50 Hz, 24 dB), and the threshold for artifact rejection was set at > |125| μV. Ocular artifacts were removed through independent component analysis (ICA). The ERP epochs included a 100-ms pre-stimulus baseline period and a 500-ms post-stimulus segment. Separate averages were calculated for each facial expression (neutral, happy, and fearful) in each music condition (Albeniz, Mozart, and Chopin). The onset of ERP N170 peaks was estimated from grand average waveforms, according to the ERP latency definition (Heisz et al., 2006; De Vos et al., 2012; Smith et al., 2013). Peaks were automatically detected for all channels, using the global maxima in interval method (Giroldini et al., 2016).
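The epoching, baseline correction, amplitude-based rejection, and peak search described above can be illustrated with a minimal single-channel sketch. This is not the Brain Vision Analyzer pipeline used in the study: the search window for the N170 (130 to 200 ms) is an assumption, the peak is taken as the most negative value in that window because the N170 is a negative deflection, and the signal is synthetic.

```python
SAMPLING_HZ = 1000           # sampling rate (from the text)
BASELINE_MS = 100            # pre-stimulus baseline (from the text)
POST_MS = 500                # post-stimulus segment (from the text)
REJECT_UV = 125.0            # artifact rejection threshold (from the text)
N170_WINDOW_MS = (130, 200)  # ASSUMPTION: search window, not given in the text


def ms_to_samples(ms):
    return int(round(ms * SAMPLING_HZ / 1000))


def extract_epoch(signal, onset_idx):
    """Cut one epoch around a stimulus onset and subtract the baseline mean."""
    n_base = ms_to_samples(BASELINE_MS)
    start = onset_idx - n_base
    stop = onset_idx + ms_to_samples(POST_MS)
    if start < 0 or stop > len(signal):
        return None
    epoch = signal[start:stop]
    baseline = sum(epoch[:n_base]) / n_base
    return [v - baseline for v in epoch]


def is_clean(epoch):
    """Keep the epoch only if no sample exceeds the rejection threshold."""
    return max(abs(v) for v in epoch) <= REJECT_UV


def average_epochs(epochs):
    """Sample-by-sample mean across epochs (the ERP)."""
    n = len(epochs)
    return [sum(col) / n for col in zip(*epochs)]


def n170_peak(erp, window_ms=N170_WINDOW_MS):
    """Return (amplitude, latency in ms) of the most negative sample in the window."""
    base = ms_to_samples(BASELINE_MS)
    lo = base + ms_to_samples(window_ms[0])
    hi = base + ms_to_samples(window_ms[1])
    idx = min(range(lo, hi + 1), key=lambda i: erp[i])
    return erp[idx], (idx - base) * 1000 / SAMPLING_HZ


# Synthetic demo: three trials with a dip 170 ms after onset, one with an artifact.
signal = [0.0] * 3000
onsets = [500, 1200, 2000]
for onset in onsets:
    signal[onset + 170] = -5.0
signal[2050] = 300.0  # artifact inside the third epoch

epochs = [extract_epoch(signal, o) for o in onsets]
clean = [e for e in epochs if e is not None and is_clean(e)]
erp = average_epochs(clean)
amplitude, latency_ms = n170_peak(erp)
print(len(clean), amplitude, latency_ms)
```

In practice the same steps would be applied per channel and per condition before averaging across participants.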

Data Analysis and Results

To investigate the role of the experimental manipulation on behavioral and psychophysiological data, we combined linear mixed modeling with a decision-tree learning approach. Statistical analyses on linear mixed models were performed with the lme4, car, and lmerTest packages supplied in the R environment, whereas the decision-tree model was built by means of a tailor-made algorithm (Menolascina et al., 2007).

Behavioral Data

Independent-samples t-tests were used to analyze data from the three VASs for each condition (see Table 1).


Table 1. Independent-samples t-tests of VAS results.

A repeated-measures ANOVA was performed to analyze behavioral reaction times to neutral faces in the emo go/no-go paradigm. The analysis considered Music (Albeniz, Chopin, Mozart) as a within-subjects factor (3 levels) and Group as a between-subjects factor (2 levels). The model showed a significant effect of Group (F = 57.055, df = 1, p = 0.01), an effect of Music condition just above the threshold of statistical significance (F = 2.947, df = 2, p = 0.053), and a Music condition × Group interaction (F = 3.012, df = 2, p = 0.049). The results showed a trend toward longer response times in the musicians group, with slower reaction times in the Chopin session (Table 2).


Table 2. Mean behavioral reaction times (in milliseconds) in response to neutral faces during the emo go/no-go task.

Psychophysiological Data

The latency and amplitude of the N170 component were analyzed using separate linear mixed-effects models (LMMs) fitted with the lme4 package (Bates et al., 2015) in the R environment (Bates et al., 2013, 2014). In both models, Group (musicians; non-musicians) and Music (Albeniz; Mozart; Chopin) were defined as fixed factors and participant and channel were coded as random effects. The interaction between Group and Music was also examined in the models. Sixty-one EEG electrodes were clustered into four main regions of interest (ROIs): left anterior (L-Ant), right anterior (R-Ant), left posterior (L-Post), and right posterior (R-Post). Left and right were defined according to the standard international 10–20 system, whereas anterior and posterior were defined according to the following rule: ANT (F, Fp, FC, FT, C, T, AF) and POST (TP, CP, P, PO, O; Frömer et al., 2012; Bornkessel-Schlesewsky and Schlesewsky, 2013), in line with recent suggestions on reducing data dimensionality (Luck and Gaspelin, 2017). To investigate potential regional differences, separate LMM analyses were run for each ROI. In all these models, the Face variable was kept fixed at the Neutral emotional level (i.e., the Target in the behavioral task, as described in the Materials and Methods section). To display the ROIs graphically, the channels were pooled in Analyzer to create four new areas: Right Anterior (R-Ant), Right Posterior (R-Post), Left Anterior (L-Ant), and Left Posterior (L-Post).
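The ROI rule above can be written as a small lookup on standard 10–20 channel labels. The sketch below is illustrative only: it relies on the 10–20 convention that odd-numbered electrodes lie over the left hemisphere and even-numbered ones over the right, and it returns `None` for midline (z) electrodes, which is an assumption because the text does not say how midline channels were assigned. The function name `roi_for` is hypothetical.

```python
import re

# Letter prefixes from the rule in the text.
ANTERIOR = {"F", "FP", "FC", "FT", "C", "T", "AF"}
POSTERIOR = {"TP", "CP", "P", "PO", "O"}


def roi_for(channel):
    """Map a 10-20 channel label (e.g. 'PO8') to one of the four ROIs."""
    match = re.fullmatch(r"([A-Za-z]+?)(\d+|[zZ])", channel)
    if match is None:
        raise ValueError(f"unrecognized channel label: {channel!r}")
    prefix, suffix = match.group(1).upper(), match.group(2)

    if suffix in ("z", "Z"):
        return None  # ASSUMPTION: midline handling is not specified in the text

    side = "L" if int(suffix) % 2 == 1 else "R"  # odd = left, even = right
    if prefix in ANTERIOR:
        return f"{side}-Ant"
    if prefix in POSTERIOR:
        return f"{side}-Post"
    raise ValueError(f"prefix {prefix!r} is not covered by the ROI rule")


for name in ("F3", "T8", "TP9", "PO8", "Cz"):
    print(name, roi_for(name))
```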

N170 Amplitude

Table 3 shows the results of LMMs for N170 amplitude. In the L-Ant region there was no effect of Group or Music, although there was an interaction (B = −2.152, t896 = −2.152, p = 0.03). In the R-Ant region there were main effects of Group (B = −0.419, t32 = −2.11, p = 0.04; Figure 1) and Music (B = 0.299, t906 = 2.69, p = 0.007), reflecting a larger N170 in the musicians group and in the Chopin condition (Figure 2). There was also an interaction between Group and Music: musicians showed an increased amplitude in the Chopin condition (B = −0.360, t910 = −2.13, p = 0.03) and a decreased amplitude in the Mozart condition (B = 0.541, t913 = 3.16, p = 0.001; Figure 3). In the L-Post region there was an effect of Group (Figures 2, 4), reflecting increased N170 amplitude in musicians (B = −1.276, t26 = −6.13, p = 0.002). There was also a Group × Music interaction reflecting an increase in N170 amplitude in non-musicians during the Mozart condition (B = 0.857, t537 = 2.59, p = 0.009; Figure 5). In the R-Post region there was an effect of Group (Figure 6): N170 amplitude was greater in the musicians (B = −1.187, t25 = −2.16, p = 0.04; Figure 5). There was also a Group × Music interaction: non-musicians showed an increase in N170 amplitude in the Mozart condition (B = 0.827, t552 = 2.42, p = 0.01). In line with these results, topographic mapping shows more negative components in musicians than in non-musicians (Figures 7, 8).


Table 3. Results of linear mixed-effects model: fixed effects for group and music on N170 amplitude.


Figure 1. N170 amplitudes in non-musicians and musicians in R-ANT ROI (Right Anterior Region of Interest).


Figure 2. Comparison of the grand-average ERPs elicited by the emo go/no-go face recognition task in non-musicians (black line) and musicians (red line) in right anterior, right posterior, left anterior, and left posterior regions. The ROI plots were obtained by pooling the channels.


Figure 3. Grand average of the ERP components elicited by the emo go/no-go face recognition task in non-musicians in the Albeniz (black line), Chopin (red line), and Mozart condition (blue line) in right anterior, right posterior, left anterior, and left posterior regions.


Figure 4. N170 amplitudes in non-musicians and musicians in L-POST ROI (Left Posterior Region of Interest).


Figure 5. Grand average of the ERP components elicited by the emo go/no-go face recognition task in non-musicians in the Albeniz (black line), Chopin (red line), and Mozart conditions (blue line) in right anterior, right posterior, left anterior, and left posterior regions.


Figure 6. N170 amplitudes in non-musicians and musicians in R-POST ROI (Right Posterior Region of Interest).


Figure 7. Topographies of N170 amplitude elicited by neutral facial expressions in non-musicians.


Figure 8. Topographies of N170 amplitude elicited by neutral facial expressions in musicians.

N170 Latency

Table 4 shows the results of LMMs for N170 latency. There were no effects of Group or Music in the L-Ant region. In the R-Ant region (Figure 9), latencies were shorter in the Chopin condition (B = −15.800, t919 = −4.25, p < 0.001). The opposite effect was found in the L-Post region (Figure 10), with slower latencies in the Chopin condition (B = −8.743, t730 = −2.16, p = 0.03). Finally, in the R-Post region (Figure 11), latencies were shorter in both the Chopin (B = −15.285, t741 = −4.88, p < 0.001) and Mozart (B = −15.543, t743 = −4.81, p < 0.001) conditions.


Table 4. Results of linear mixed-effects model: fixed effects for group and music on N170 latency.


Figure 9. N170 Latency in non-musicians and musicians in R-ANT ROI (Right Anterior Region of Interest).


Figure 10. N170 Latency in non-musicians and musicians in L-POST ROI (Left Posterior Region of Interest).


Figure 11. N170 Latency in non-musicians and musicians in R-POST ROI (Right Posterior Region of Interest).

Assessing the Gender Effect

To evaluate whether the bias could be related to a gender effect, we compared the fixed-effects structures of the previous linear mixed models with and without the factor Gender. The results were evaluated in terms of model fit by using an information-theoretic approach (McElreath, 2016). To do so, for each ROI we considered two models, M0 (simple model: excluding the Gender variable) and M1 (complex model: including the Gender variable), and we fitted the models via maximum likelihood. The Bayesian information criterion (BIC) was then computed on the log-likelihood of the models, along with Vuong's statistic (Vuong, 1989; Merkle et al., 2016). Finally, asymptotic confidence intervals (CIs) on the BIC differences between the models (ΔBIC) were also computed. All the computations were performed by means of the nonnest2 package in the R environment.
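The BIC part of this comparison can be sketched with the standard definition BIC = k ln(n) − 2 ln(L), where k is the number of estimated parameters, n the number of observations, and L the maximized likelihood. The log-likelihoods, parameter counts, and sample size below are hypothetical placeholders, not values from the study, and the sketch does not reproduce Vuong's test or the confidence intervals computed with nonnest2.

```python
import math


def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion; lower values indicate a preferred model."""
    return n_params * math.log(n_obs) - 2.0 * log_likelihood


# HYPOTHETICAL values, for illustration only.
n_obs = 900
bic_m0 = bic(log_likelihood=-1000.0, n_params=6, n_obs=n_obs)  # without Gender
bic_m1 = bic(log_likelihood=-999.5, n_params=7, n_obs=n_obs)   # with Gender

delta_bic = bic_m1 - bic_m0
print(round(bic_m0, 2), round(bic_m1, 2), round(delta_bic, 2))

# A positive difference means the extra parameter did not pay for itself.
if delta_bic > 0:
    print("simpler model M0 has the lower BIC")
```

With these placeholder numbers the small gain in log-likelihood is outweighed by the ln(n) penalty for the additional parameter, which is the situation in which the simpler model is retained.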

Table 5 shows the results of the model comparisons for the amplitude and latency of N170. As in the previous analyses (see Tables 2, 3), four models were considered, one for each of the four ROIs previously defined. Overall, Vuong's test did not allow us to reject the null hypothesis that the models with and without the Gender variable are indistinguishable. In all ROIs, the models showed very similar BICs, which suggests that the evidence for the models is the same. Indeed, the 95% confidence intervals of ΔBIC included zero, implying that the models are close enough that the M1s cannot be preferred over the M0s. These results suggest that including the Gender variable (M1s) did not significantly improve the evidence relative to the previous models (M0s), given the sample data. Therefore, following Occam's razor, we retained the simplest models in terms of parameters, according to the principle of simplifying the variables in an experiment (Srinagesh, 2006; Luck and Gaspelin, 2017).


Table 5. Model comparison with respect to the gender effect.

Decision-Tree Modeling: Target and Non-target Stimuli

To validate that the emotional bias, generated by combined stimuli, is correlated with the class of participant (musicians/non-musicians), we processed the input data calculating, for each participant, the relative variation between the music conditions considering each EEG channel.

To do this, we used the following equation (Equation 1):

$$\Delta x_{ij} = \left| \frac{x_k - x_a}{x_a} \right| \qquad (1)$$

where i ∈ {1, …, 24} indexes the participant, j ∈ {1, …, 61} indexes the EEG channel, a = Albeniz, and k = Chopin or Mozart.
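Reading Equation 1 as the absolute relative change of a measure in the Chopin or Mozart condition with respect to the Albeniz condition (an interpretation based on the phrase "relative variation" in the text), the per-channel computation can be sketched as follows. The channel names and amplitude values are hypothetical.

```python
def relative_variation(x_k, x_a):
    """Absolute relative change of condition k with respect to the Albeniz baseline a."""
    if x_a == 0:
        raise ValueError("baseline value must be non-zero")
    return abs((x_k - x_a) / x_a)


# HYPOTHETICAL N170 amplitudes (in microvolts) for one participant.
albeniz = {"P7": -4.0, "P8": -5.0}
chopin = {"P7": -6.0, "P8": -5.5}

delta = {ch: relative_variation(chopin[ch], albeniz[ch]) for ch in albeniz}
print(delta)
```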

The output data were further processed to evaluate which EEG channels showed the best discrimination capability for the classification between the two groups. A predictive model was implemented using a tree-building algorithm (Menolascina et al., 2007) that generates prediction rules from partially pruned decision trees built using Quinlan's C4.5 heuristics (Quinlan, 1993), whose main goal is to minimize the number of tree levels and nodes, thereby maximizing data generalization. This technique uses an information-theoretic procedure to select, at each choice point in the tree, the attribute that maximizes the information gained from splitting the data.
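The splitting criterion can be illustrated with a minimal information-gain computation for a single numeric feature. This is a simplified sketch, not the tailor-made algorithm cited above: it covers only the choice of a threshold on one feature, and the feature values and group labels are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(values, labels, threshold):
    """Entropy reduction obtained by splitting at value <= threshold."""
    left = [y for x, y in zip(values, labels) if x <= threshold]
    right = [y for x, y in zip(values, labels) if x > threshold]
    n = len(labels)
    remainder = (len(left) / n) * entropy(left) + (len(right) / n) * entropy(right)
    return entropy(labels) - remainder


def best_threshold(values, labels):
    """Candidate thresholds are the observed values except the largest one."""
    candidates = sorted(set(values))[:-1]
    return max(candidates, key=lambda t: information_gain(values, labels, t))


# HYPOTHETICAL relative variations on one channel for six participants.
values = [0.1, 0.2, 0.3, 0.8, 0.9, 1.1]
labels = ["non", "non", "non", "mus", "mus", "mus"]

threshold = best_threshold(values, labels)
gain = information_gain(values, labels, threshold)
print(threshold, gain)
```

A full tree repeats this choice over all channels at every node and then prunes the result.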

Predictive Model Results

The predictive model was trained and tested 200 times considering different random combinations of training and test sets, obtained from the input dataset considering a splitting percentage of 81.82%. The results are expressed as mean values, considering 200 iterations, of Accuracy, Sensitivity, Specificity and Area Under the Curve (AUC) and are reported in Tables 6, 7.
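The repeated random-split evaluation can be sketched as below. The number of iterations and the training fraction come from the text; everything else is an assumption made for illustration: the classifier is a toy one-feature threshold rule standing in for the decision tree, the data are synthetic, AUC is omitted, and iterations in which sensitivity or specificity is undefined (no positives or no negatives in the test set) are skipped when averaging.

```python
import random
from statistics import mean

TRAIN_FRACTION = 0.8182  # splitting percentage (from the text)
N_ITERATIONS = 200       # number of repetitions (from the text)


def confusion_metrics(y_true, y_pred, positive=1):
    """Accuracy, sensitivity, and specificity; None where undefined."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    accuracy = (tp + tn) / len(pairs)
    sensitivity = tp / (tp + fn) if tp + fn else None
    specificity = tn / (tn + fp) if tn + fp else None
    return accuracy, sensitivity, specificity


def fit_stump(xs, ys):
    """ASSUMPTION: toy classifier; threshold halfway between the class means."""
    m1 = mean(x for x, y in zip(xs, ys) if y == 1)
    m0 = mean(x for x, y in zip(xs, ys) if y == 0)
    threshold = (m0 + m1) / 2
    sign = 1 if m1 > m0 else -1
    return lambda x: 1 if sign * (x - threshold) > 0 else 0


def repeated_holdout(xs, ys, n_iter=N_ITERATIONS, train_fraction=TRAIN_FRACTION, seed=0):
    rng = random.Random(seed)
    idx = list(range(len(xs)))
    n_train = round(train_fraction * len(xs))
    results = []
    for _ in range(n_iter):
        rng.shuffle(idx)
        train, test = idx[:n_train], idx[n_train:]
        model = fit_stump([xs[i] for i in train], [ys[i] for i in train])
        y_pred = [model(xs[i]) for i in test]
        results.append(confusion_metrics([ys[i] for i in test], y_pred))

    def column_mean(col):
        return mean(r[col] for r in results if r[col] is not None)

    return column_mean(0), column_mean(1), column_mean(2)


# Synthetic, perfectly separable data: 11 samples per class.
xs = [i * 0.1 for i in range(11)] + [2 + i * 0.1 for i in range(11)]
ys = [0] * 11 + [1] * 11
mean_metrics = repeated_holdout(xs, ys)
print(mean_metrics)
```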


Table 6. Mean performances of the predictive models – N170 amplitude.


Table 7. Mean performances of the predictive models – N170 latency.

We tried to improve the performance of the previous predictive model by reducing the number of the considered EEG channels using a correlation-based filter that selects the most highly correlated features. A fast correlation-based filter (FCBF) algorithm (Yu and Liu, 2003) was adopted.
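FCBF ranks features by symmetrical uncertainty, SU(X, Y) = 2 [H(X) − H(X|Y)] / [H(X) + H(Y)], which is 0 for independent variables and 1 when each variable fully determines the other. The sketch below computes only this measure for discrete values; it is not a full FCBF implementation, and it assumes that continuous EEG features have already been discretized, a step the text does not describe.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a list of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def conditional_entropy(x, y):
    """H(X | Y): entropy of x within each group of equal y, weighted by group size."""
    groups = {}
    for xi, yi in zip(x, y):
        groups.setdefault(yi, []).append(xi)
    n = len(x)
    return sum((len(g) / n) * entropy(g) for g in groups.values())


def symmetrical_uncertainty(x, y):
    hx, hy = entropy(x), entropy(y)
    if hx + hy == 0:
        return 0.0
    return 2.0 * (hx - conditional_entropy(x, y)) / (hx + hy)


# HYPOTHETICAL discretized channel features and group labels.
group = [0, 0, 1, 1]
informative = [0, 0, 1, 1]    # identical to the labels
uninformative = [0, 1, 0, 1]  # independent of the labels

su_informative = symmetrical_uncertainty(informative, group)
su_uninformative = symmetrical_uncertainty(uninformative, group)
print(su_informative, su_uninformative)
```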

The same procedure discussed in the previous section was applied considering the obtained subset of EEG channels, and a new predictive model was implemented and evaluated.

The performance of the new predictive model is reported in Tables 8, 9.


Table 8. Mean performances of the FCBF-filtered predictive models – N170 amplitude.


Table 9. Mean performances of the FCBF-filtered predictive models – N170 latency.


Our aim was to investigate modulation of emotional face recognition by cross-modal perception, treated as a function of background music. Synesthesia and crossmodal perception can have a strong modulatory effect on cortical processing, conditioning or facilitating perception and interpretation of the administered stimulus. We analyzed how musicians' recognition of facial expressions was affected by music-induced emotions. These data allow us to suggest that the presence of emotional information from another sensory channel (i.e., auditory information from background music) activates cross-modal integration of information and that this process can be modulated by the perception of the musical stimulus. This salience of emotional faces could be explained in adaptive terms: identifying emotions at an early stage is a skill that, developmentally, can be crucial for survival and, in proximal and contingent terms, is an indispensable social competence (Niu et al., 2012). So, in a condition where participants are more “emotionally involved,” the neutral face, which is emotionally ambiguous and therefore more difficult to recognize, can be more affected by music-induced emotion. This is consistent with the fact that the musicians evaluated the music as more pleasant and emotional (happy and sad) than non-musicians did, and this judgment of emotional engagement is in agreement with their musical appraisal and competence. This emotional involvement leads to a delay in reaction times. These results imply that the motor and perceptual systems can be modulated, in a top-down process, by music-induced emotions. The electrophysiological data revealed increased N170 amplitudes in musicians in all conditions. The background music had less impact on non-musicians and may therefore have produced less bias in the task.
By contrast, an earlier onset of the global processing of the stimulus indicates that music interacts with the interpretation of salience, producing a behavioral delay and increased cortical arousal in musicians. This result suggests that perception of facial expressions can vary according to perceptions of a concurrent auditory stimulus and an individual's musical background.

The decreased ERP amplitude, faster reaction times, and lower VAS scores in the non-musicians group suggest that non-musicians found the background music less engaging and emotionally arousing. Hence their top-down processes, being less modulated by listening to music, do not bias face perception. The relative changes in arousal during the face recognition process are driven by the subjective emotional reaction and top-down processing. The evidence for this concept was obtained from the comparison of responses to the neutral face (Target) whilst listening to music by Albeniz (pleasant), Mozart (happy), and Chopin (judged, at the same time, both sad and pleasant). We also assessed whether, within our model, there was a gender effect (Miles et al., 2016), but in our study adding the Gender variable did not significantly improve the evidence relative to the simpler model, given the sample data. We chose to keep the simpler model, also in accordance with the latest methodological ERP guidelines (Luck, 2005; Luck and Gaspelin, 2017). In a future study with a larger sample, which would allow the gender effect to be analyzed within the model, we could implement the complex model.

In view of these results, and to investigate other possible sources of bias, we sought to determine whether the bias effect could be present not only for neutral faces, as the literature highlights. To test this hypothesis, we used the predictive model to examine the N170 components for the other emotional facial expressions shown during the task (happy and fearful).

The predictive model allowed us to determine the most significant decision-tree features; in fact, the classification performances obtained using the trained predictive model were high, regardless of the training and test sets. In this case, we found modulation of the response even for happy faces, but not for fearful faces. This could also be explained by theories of emotion in which the stimulus that produces fear is the least susceptible to alterations because it is the one most immediately and easily perceived (Vuilleumier et al., 2001; Phelps and LeDoux, 2005; Almeida et al., 2016).

Emotional salience allows the recognition and discrimination of neutral expressions. Our data indicate that the simultaneous presence of emotional information from multiple sensory channels activates a process of crossmodal integration that could be facilitated by music. Further research using different neuroscientific and behavioral techniques and paradigms is needed to improve our understanding of emotional crossmodal integration.

Ethics Statement

The Lecce Ethical Committee, ASL Hospital Vito Fazzi, approved the study with Verbal n. 2 all.21 on October 2, 2013.

Author Contributions

SI: Study design and coordination, design of the whole data analysis, manuscript preparation, editing, and reviewing. AC: Statistical data analysis. AM: Subject selection and data recording. GP: Subject selection and data recording. RS: Subject selection, choice of classical music pieces. DT: Data recording. ID, AB, and VB: Machine learning data analysis. MdT: Manuscript editing.


CUIS Project, Award for Young Researcher, Interprovincial Consortium University of Salento.

Conflict of Interest Statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.


Almeida, P. R., Ferreira-Santos, F., Chaves, P. L., Paiva, T. O., Barbosa, F., and Marques-Teixeira, J. (2016). Perceived arousal of facial expressions of emotion modulates the N170, regardless of emotional category: time domain and time–frequency dynamics. Int. J. Psychophysiol. 99, 48–56. doi: 10.1016/j.ijpsycho.2015.11.017

Balconi, M., and Carrera, A. (2011). Cross-modal integration of emotional face and voice in congruous and incongruous pairs: the P2 ERP effect. J. Cogn. Psychol. 23, 132–139. doi: 10.1080/20445911.2011.473560

Bates, D., Maechler, M., and Bolker, B. (2013). Lme4: Linear Mixed-Effects Models using S4 Classes. R Package version 0.999999–2.999999.

Bates, D., Maechler, M., Bolker, B., and Walker, S. (2014). Lme4: Linear Mixed-Effects Models Using S4 Classes. R package version 1.1-6. R.

Bates, D., Maechler, M., Bolker, B., and Walker, S. (2015). Fitting linear mixed-effects models using lme4. J. Stat. Softw. 67, 1–91. doi: 10.18637/jss.v067.i01

Baumgartner, T., Esslen, M., and Jäncke, L. (2006). From emotion perception to emotion experience: emotions evoked by pictures and classical music. Int. J. Psychophysiol. 60, 34–43. doi: 10.1016/j.ijpsycho.2005.04.007

Bhatti, A. M., Majid, M., Anwar, S. M., and Khan, B. (2016). Human emotion recognition and analysis in response to audio music using brain signals. Comput. Hum. Behav. 65, 267–275. doi: 10.1016/j.chb.2016.08.029

Bigliassi, M., Karageorghis, C. I., Nowicky, A. V., Orgs, G., and Wright, M. J. (2016a). Cerebral mechanisms underlying the effects of music during a fatiguing isometric ankle-dorsiflexion task. Psychophysiology 53, 1472–1483. doi: 10.1111/psyp.12693

Bigliassi, M., Silva, V. B., Karageorghis, C. I., Bird, J. M., Santos, P. C., and Altimari, L. R. (2016b). Brain mechanisms that underlie the effects of motivational audiovisual stimuli on psychophysiological responses during exercise. Physiol. Behav. 158, 128–136. doi: 10.1016/j.physbeh.2016.03.001

Blau, V. C., Maurer, U., Tottenham, N., and McCandliss, B. D. (2007). The face-specific N170 component is modulated by emotional facial expression. Behav. Brain Funct. 3:7. doi: 10.1186/1744-9081-3-7

Bornkessel-Schlesewsky, I., and Schlesewsky, M. (2013). Reconciling time, space and function: a new dorsal-ventral stream model of sentence comprehension. Brain Lang. 125, 60–76. doi: 10.1016/j.bandl.2013.01.010

Brattico, E., Bogert, B., and Jacobsen, T. (2013). Toward a neural chronometry for the aesthetic experience of music. Front. Psychol. 4:206. doi: 10.3389/fpsyg.2013.00206

Brattico, E., and Jacobsen, T. (2009). Subjective appraisal of music: neuroimaging evidence. Ann. N.Y. Acad. Sci. 1169, 308–317. doi: 10.1111/j.1749-6632.2009.04843.x

Brattico, E., Jacobsen, T., De Baene, W., Glerean, E., and Tervaniemi, M. (2010). Cognitive vs. affective listening modes and judgments of music - An ERP study. Biol. Psychol. 85, 393–409. doi: 10.1016/j.biopsycho.2010.08.014

Caldwell, G. N., and Riby, L. M. (2007). The effects of music exposure and own genre preference on conscious and unconscious cognitive processes: a pilot ERP study. Conscious. Cogn. 16, 992–996. doi: 10.1016/j.concog.2006.06.015

Chen, J., Yuan, J., Huang, H., Chen, C., and Li, H. (2008). Music-induced mood modulates the strength of emotional negativity bias: an ERP study. Neurosci. Lett. 445, 135–139. doi: 10.1016/j.neulet.2008.08.061

de Gelder, B., and Vroomen, J. (2000). The perception of emotions by ear and by eye. Cogn. Emot. 14, 289–311. doi: 10.1080/026999300378824

De Vos, M., Thorne, J. D., Yovel, G., and Debener, S. (2012). Let's face it, from trial to trial: comparing procedures for N170 single-trial estimation. Neuroimage 63, 1196–1202. doi: 10.1016/j.neuroimage.2012.07.055

Eimer, M. (2000). The face-specific N170 component reflects late stages in the structural encoding of faces. Neuroreport 11, 2319–2324. doi: 10.1097/00001756-200007140-00050

Eimer, M. (2011). “The face-sensitive N170 component of the event-related brain potential,” in The Oxford Handbook of Face Perception, eds A. J. Calder, G. Rhodes, M. Johnson, and J. V. Haxby (Oxford: Oxford University Press), 329–344.

Ellsworth, P. C., and Scherer, K. R. (2003). “Appraisal processes in emotion,” in Handbook of Affective Sciences, eds R. J. Davidson, K. R. Scherer, and H. H. Goldsmith (New York, NY: Oxford University Press), 572–595.

Fritz, T., Jentschke, S., Gosselin, N., Sammler, D., Peretz, I., Turner, R., et al. (2009). Universal recognition of three basic emotions in music. Curr. Biol. 19, 573–576. doi: 10.1016/j.cub.2009.02.058

Frömer, R., Hafner, V., and Sommer, W. (2012). Aiming for the bull's eye: preparing for throwing investigated with event-related brain potentials. Psychophysiology 49, 335–344. doi: 10.1111/j.1469-8986.2011.01317.x

Gilbert, C. D., and Li, W. (2013). Top-down influences on visual processing. Nat. Rev. Neurosci. 14, 350–363. doi: 10.1038/nrn3476

Giroldini, W., Pederzoli, L., Bilucaglia, M., Melloni, S., and Tressoldi, P. (2016). A new method to detect event-related potentials based on Pearson's correlation. EURASIP J. Bioinforma. Syst. Biol. 11, 1–23. doi: 10.1186/s13637-016-0043-z

Gomez, P., Ratcliff, R., and Perea, M. (2007). A model of the Go/No-Go task. J. Exp. Psychol. Gen. 136, 389–413. doi: 10.1037/0096-3445.136.3.389

Gosselin, N., Peretz, I., Johnsen, E., and Adolphs, R. (2007). Amygdala damage impairs emotion recognition from music. Neuropsychologia 45, 236–244. doi: 10.1016/j.neuropsychologia.2006.07.012

Heisz, J. J., Watter, S., and Shedden, J. M. (2006). Automatic face identity encoding at the N170. Vis. Res. 46, 4604–4614. doi: 10.1016/j.visres.2006.09.026

Herholz, S. C., and Zatorre, R. J. (2012). Musical training as a framework for brain plasticity: behavior, function, and structure. Neuron 76, 486–502. doi: 10.1016/j.neuron.2012.10.011

Ibanez, A., Melloni, M., Huepe, D., Helgiu, E., Rivera-Rei, A., Canales-Johnson, A., et al. (2012). What event-related potentials (ERPs) bring to social neuroscience? Soc. Neurosci. 7, 632–649. doi: 10.1080/17470919.2012.691078

Invitto, S., Faggiano, C., Sammarco, S., De Luca, V., and De Paolis, L. (2016). Haptic, virtual interaction and motor imagery: entertainment tools and psychophysiological testing. Sensors 16:394. doi: 10.3390/s16030394

Jodo, E., and Kayama, Y. (1992). Relation of a negative ERP component to response inhibition in a Go/No-go task. Electroencephalogr. Clin. Neurophysiol. 82, 477–482. doi: 10.1016/0013-4694(92)90054-L

Jolij, J., and Meurs, M. (2011). Music alters visual perception. PLoS ONE 6:e18861. doi: 10.1371/journal.pone.0018861

Kawakami, A., Furukawa, K., Katahira, K., and Okanoya, K. (2013). Sad music induces pleasant emotion. Front. Psychol. 4:311. doi: 10.3389/fpsyg.2013.00311

Kawakami, A., Furukawa, K., and Okanoya, K. (2014). Music evokes vicarious emotions in listeners. Front. Psychol. 5:431. doi: 10.3389/fpsyg.2014.00431

Kolassa, I.-T., Kolassa, S., Bergmann, S., Lauche, R., Dilger, S., Miltner, W. H. R., et al. (2009). Interpretive bias in social phobia: an ERP study with morphed emotional schematic faces. Cogn. Emot. 23, 69–95. doi: 10.1080/02699930801940461

Ladinig, O., and Schellenberg, E. G. (2012). Liking unfamiliar music: effects of felt emotion and individual differences. Psychol. Aesthetics Creat. Arts 6, 146–154. doi: 10.1037/a0024671

Leleu, A., Godard, O., Dollion, N., Durand, K., Schaal, B., and Baudouin, J. Y. (2015). Contextual odors modulate the visual processing of emotional facial expressions: an ERP study. Neuropsychologia 77, 366–379. doi: 10.1016/j.neuropsychologia.2015.09.014

Logeswaran, N., and Bhattacharya, J. (2009). Crossmodal transfer of emotion by music. Neurosci. Lett. 455, 129–133. doi: 10.1016/j.neulet.2009.03.044

Luck, S. (2005). “Ten simple rules for designing ERP experiments,” in Event-Related Potentials: A Methods Handbook, ed T. C. Handy (Cambridge, MA; London, UK: The MIT Press), 17–32.

Luck, S. J., and Gaspelin, N. (2017). How to get statistically significant effects in any ERP experiment (and why you shouldn't). Psychophysiology 54, 146–157. doi: 10.1111/psyp.12639

McElreath, R. (2016). Statistical Rethinking. Boca Raton, FL: Chapman & Hall/CRC.

Menolascina, F., Tommasi, S., Paradiso, A., Cortellino, M., Bevilacqua, V., and Mastronardi, G. (2007). “Novel data mining techniques in aCGH based breast cancer subtypes profiling: the biological perspective,” in 2007 IEEE Symposium on Computational Intelligence and Bioinformatics and Computational Biology (Honolulu, HI), 9–16.

Merkle, E. C., You, D., and Preacher, K. J. (2016). Testing nonnested structural equation models. Psychol. Methods 21, 151–163. doi: 10.1037/met0000038

Miles, S. A., Miranda, R. A., and Ullman, M. T. (2016). Sex differences in music: a female advantage at recognizing familiar melodies. Front. Psychol. 7:278. doi: 10.3389/fpsyg.2016.00278

Moreno, S., Wodniecka, Z., Tays, W., Alain, C., and Bialystok, E. (2014). Inhibitory control in bilinguals and musicians: Event related potential (ERP) evidence for experience-specific effects. PLoS ONE 9:e94169. doi: 10.1371/journal.pone.0094169

Müller, M., Höfel, L., Brattico, E., and Jacobsen, T. (2010). Aesthetic judgments of music in experts and laypersons - an ERP study. Int. J. Psychophysiol. 76, 40–51. doi: 10.1016/j.ijpsycho.2010.02.002

Niu, Y., Todd, R. M., and Anderson, A. K. (2012). Affective salience can reverse the effects of stimulus-driven salience on eye movements in complex scenes. Front. Psychol. 3:336. doi: 10.3389/fpsyg.2012.00336

Pallesen, K. J., Brattico, E., Bailey, C. J., Korvenoja, A., Koivisto, J., Gjedde, A., et al. (2010). Cognitive control in auditory working memory is enhanced in musicians. PLoS ONE 5:e11120. doi: 10.1371/journal.pone.0011120

Pantev, C., Oostenveld, R., Engelien, A., Ross, B., Roberts, L. E., and Hoke, M. (1998). Increased auditory cortical representation in musicians. Nature 392, 811–814. doi: 10.1038/33918

Peretz, I. (2012). “Music, language, and modularity in action,” in Language and Music as Cognitive Systems, eds P. Rebuschat, M. Rohrmeier, J. A. Hawkins, and I. Cross (New York, NY: Oxford University Press), 254–268.

Phelps, E. A., and LeDoux, J. E. (2005). Contributions of the amygdala to emotion processing: from animal models to human behavior. Neuron 48, 175–187. doi: 10.1016/j.neuron.2005.09.025

Pourtois, G., Debatisse, D., Despland, P.-A., and de Gelder, B. (2002). Facial expressions modulate the time course of long latency auditory brain potentials. Cogn. Brain Res. 14, 99–105. doi: 10.1016/S0926-6410(02)00064-2

Proverbio, A. M., Manfredi, M., Zani, A., and Adorni, R. (2013). Musical expertise affects neural bases of letter recognition. Neuropsychologia 51, 538–549. doi: 10.1016/j.neuropsychologia.2012.12.001

Quinlan, J. (1993). C4.5: Programs for Machine Learning. San Mateo, CA: Morgan Kaufmann.

Reybrouck, M. (2005). Body, mind and music: musical semantics between experiential cognition and cognitive economy. Rev. Transcult. Música 9, 1–37.

Reybrouck, M., and Brattico, E. (2015). Neuroplasticity beyond sounds: neural adaptations following long-term musical aesthetic experiences. Brain Sci. 5, 69–91. doi: 10.3390/brainsci5010069

Richard, L., and Charbonneau, D. (2009). An introduction to E-Prime. Tutor. Quant. Methods Psychol. 5, 68–76. doi: 10.20982/tqmp.05.2.p068

Sammler, D., Grigutsch, M., Fritz, T., and Koelsch, S. (2007). Music and emotion: electrophysiological correlates of the processing of pleasant and unpleasant music. Psychophysiology 44, 293–304. doi: 10.1111/j.1469-8986.2007.00497.x

Schellenberg, E. G. (2005). Music and cognitive abilities. Curr. Dir. Psychol. Sci. 14, 317–320. doi: 10.1111/j.0963-7214.2005.00389.x

Schellenberg, E. G., and Mankarious, M. (2012). Music training and emotion comprehension in childhood. Emotion 12, 887–891. doi: 10.1037/a0027971

Schiavio, A., van der Schyff, D., Cespedes-Guevara, J., and Reybrouck, M. (2016). Enacting musical emotions. Sense-making, dynamic systems, and the embodied mind. Phenomenol. Cogn. Sci. doi: 10.1007/s11097-016-9477-8. [Epub ahead of print].

Schulz, K. P., Fan, J., Magidina, O., Marks, D. J., Hahn, B., and Halperin, J. M. (2007). Does the emotional go/no-go task really measure behavioral inhibition? Convergence with measures on a non-emotional analog. Arch. Clin. Neuropsychol. 22, 151–160. doi: 10.1016/j.acn.2006.12.001

Sekuler, R., Sekuler, A. B., and Lau, R. (1997). Sound alters visual motion perception. Nature 385, 308. doi: 10.1038/385308a0

Smith, J. L., Jamadar, S., Provost, A. L., and Michie, P. T. (2013). Motor and non-motor inhibition in the Go/NoGo task: An ERP and fMRI study. Int. J. Psychophysiol. 87, 244–253. doi: 10.1016/j.ijpsycho.2012.07.185

Srinagesh, K. (2006). The Principles of Experimental Research. Amsterdam; Burlington, MA: Elsevier/Butterworth-Heinemann.

Vuilleumier, P., Armony, J. L., Driver, J., and Dolan, R. J. (2001). Effects of attention and emotion on face processing in the human brain: an event-related fMRI study. Neuron 30, 829–841. doi: 10.1016/S0896-6273(01)00328-2

Vuong, Q. H. (1989). Likelihood ratio tests for model selection and non-nested hypotheses. Econom. J. Econom. Soc. 57, 307–333. doi: 10.2307/1912557

Waters, A. M., and Valvoi, J. S. (2009). Attentional bias for emotional faces in paediatric anxiety disorders: an investigation using the emotional go/no go task. J. Behav. Ther. Exp. Psychiatry 40, 306–316. doi: 10.1016/j.jbtep.2008.12.008

Wong, Y. K., and Gauthier, I. (2012). Music-reading expertise alters visual spatial resolution for musical notation. Psychon. Bull. Rev. 19, 594–600. doi: 10.3758/s13423-012-0242-x

Yerys, B. E., Kenworthy, L., Jankowski, K. F., Strang, J., and Wallace, G. L. (2013). Separate components of emotional go/no-go performance relate to autism versus attention symptoms in children with autism. Neuropsychology 27, 537–545. doi: 10.1037/a0033615

Yu, L., and Liu, H. (2003). “Feature selection for high-dimensional data: a fast correlation-based filter solution,” in Proceedings of the Twentieth International Conference on Machine Learning (ICML-2003), (Washington, DC), 856–863.

Zanto, T. P., Rubens, M. T., Thangavel, A., and Gazzaley, A. (2011). Causal role of the prefrontal cortex in top-down modulation of visual processing and working memory. Nat. Neurosci. 14, 656–661. doi: 10.1038/nn.2773

Keywords: music cognition, face recognition, N170 ERP, emotional salience, crossmodal integration, emotional biases, musical appraisal

Citation: Invitto S, Calcagnì A, Mignozzi A, Scardino R, Piraino G, Turchi D, De Feudis I, Brunetti A, Bevilacqua V and de Tommaso M (2017) Face Recognition, Musical Appraisal, and Emotional Crossmodal Bias. Front. Behav. Neurosci. 11:144. doi: 10.3389/fnbeh.2017.00144

Received: 20 June 2017; Accepted: 19 July 2017;
Published: 02 August 2017.

Edited by:

Giuseppe Placidi, University of L'Aquila, Italy

Reviewed by:

Michela Balconi, Università Cattolica del Sacro Cuore, Italy
Anna Esposito, Università degli Studi della Campania “Luigi Vanvitelli” Caserta, Italy

Copyright © 2017 Invitto, Calcagnì, Mignozzi, Scardino, Piraino, Turchi, De Feudis, Brunetti, Bevilacqua and de Tommaso. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Sara Invitto,