Harmony Perception in Prelingually Deaf, Juvenile Cochlear Implant Users

Zimmer, Victoria; Verhey, Jesko L.; Ziese, Michael; Böckmann-Barthel, Martin

doi:10.3389/fnins.2019.00466

ORIGINAL RESEARCH article

Front. Neurosci., 08 May 2019

Sec. Auditory Cognitive Neuroscience

Volume 13 - 2019 | https://doi.org/10.3389/fnins.2019.00466

This article is part of the Research Topic: Music and Cochlear Implants: Recent Developments and Continued Challenges


Victoria Zimmer

Jesko L. Verhey

Michael Ziese

Martin Böckmann-Barthel^*

Department of Experimental Audiology, Otto von Guericke University of Magdeburg, Magdeburg, Germany

Prelingually deaf children listening through cochlear implants (CIs) face severe limitations on their experience of music, since the hearing device degrades relevant details of the acoustic input. An important parameter of music is harmony, which conveys emotional as well as syntactic information. The present study addresses musical harmony in three psychoacoustic experiments in young, prelingually deaf CI listeners and normal-hearing (NH) peers. The discrimination and preference of typical musical chords were studied, as well as cadence sequences conveying musical syntax. The ability to discriminate chords depended on the hearing age of the CI listeners, and was less accurate than for the NH peers. The groups did not differ with respect to the preference of certain chord types. NH listeners were able to categorize cadences, and performance improved with age at testing. In contrast, CI listeners were largely unable to categorize cadences. This dissociation is in accordance with data found in postlingually deafened adults. Consequently, while musical harmony is available to a limited degree to CI listeners, they are unable to use harmony to interpret musical syntax.

Introduction

For young humans, music is a beneficial factor in language, social, and creative development (see Hallam, 2010), and plays a role in adolescents’ mood regulation (Saarikallio and Erkkilä, 2007). Although cochlear implant (CI) users face substantial degradations of sound details, many of them enjoy listening to music, and its contribution to their quality of life has been reported repeatedly (Leal et al., 2003; Lassaletta et al., 2007). This was mainly studied in adults, but a positive attitude toward music may also be regarded as an important objective for young prelingually deaf listeners who acquire their musical experience via the CI only. However, music appreciation is impaired by the unavoidable reduction of spectral and dynamical sound information that comes with electrical stimulation (for a review, see Limb and Roy, 2014), partly due to technical shortcomings such as the limited number of electrodes and reduced fine temporal details, which result in reduced pitch cues, and partly due to neuronal deprivation over the period of deafness. CI listeners perceive pitch less accurately than normal-hearing (NH) listeners (Pretorius and Hanekom, 2008; Kang et al., 2009), as well as other spectral parameters in music, such as melody contour (Galvin et al., 2009) and instrument timbre (Kang et al., 2009; Brockmeier et al., 2011). Roy et al. (2014) found similar results in CI children, who exhibited difficulty in discriminating pitch and timbre, but less so in discriminating chord sequences.

Western music makes use of distinct tone combinations that may convey pleasantness or rest, as opposed to agitation or tension, commonly seen as different degrees of dissonance. Discrimination and preference of two-tone intervals or of chords combining three or more tones have been addressed in only a few studies involving CI listeners, showing, for example, that they may be able to discriminate chords from natural piano recordings, but with significantly more effort than NH listeners (Brockmeier et al., 2011; Böckmann-Barthel et al., 2013). CI users may also assign valences to these chords (Brockmeier et al., 2011). The Mu.S.I.C. Perception test used in Brockmeier et al. (2011) was replicated in children with comparable results (Stabej et al., 2012), although the authors reported only the average valence of all chords. Roy et al. (2014) investigated five musical discrimination tasks in young CI listeners aged about seven years and in NH peers. Whereas on average CI listeners were outperformed by the NH peers, both groups performed at the same level in distinguishing three-chord sequences that differed only in the central chord. Thus, although chords may be distinguishable through a CI, the perceived harmonic valence remains unclear.

The concept of harmony has been defined more precisely in music literature as the “combining of musical notes, simultaneously, to produce chords, and successively, to produce chord progressions” (Dahlhaus, 1980). Thus, this definition comprises “vertical” consonance of simultaneous musical tones as well as the “horizontal” relation of consecutive tone combinations. Vertical consonance itself consists of sensory factors such as roughness (Plomp and Levelt, 1965), and music-cultural factors acquired implicitly by exposure (Tramo et al., 2001; Cook et al., 2007). With respect to vertical consonance, the major triad, containing a note four semitones and another one seven semitones above the root note, is generally regarded as the most consonant chord. Several studies showed that (i) minor triads are perceived as somewhat less consonant than major triads, and (ii) augmented and diminished triads are rather dissonant (Roberts, 1986; Cook et al., 2007; Johnson-Laird et al., 2012), in accordance with music theory.

The “horizontal” succession of tone combinations structures a musical piece, along with the melody, by means of harmonic tension and release. It requires characteristic chord sequences that indicate the conclusion of a musical phrase, and thus carry syntactic information, just as a full stop does in speech (Rockstro et al., 1980). The most general archetype is the authentic (or perfect) cadence, which is concluded by the dominant (a major chord with a root on the fifth step of the scale) followed by the tonic chord (on the root note of the scale). The present study addresses both the consonance of isolated chords and their functional role in authentic cadences. Following Tramo et al. (2001), we restrict the use of the term “consonance” to the vertical impression that can be derived from isolated chords. In contrast, “harmony” also comprises the horizontal arrangement of chords and their functional roles.

Koelsch et al. (2004) addressed the availability of such horizontal harmony to CI users by means of event-related brain potentials (ERP). The presence of components associated with musical syntax suggested that a certain harmonic irregularity, the Neapolitan sixth chord, is indeed transmitted, although the respective ERP amplitudes are considerably smaller than in NH listeners. Knobloch et al. (2018) varied authentic cadences by replacing the final tonic chord by an unexpected, ill-fitting chord. NH listeners easily detect such an alteration. In contrast, the vast majority of CI listeners were unsuccessful in this task, no matter whether the final chord was a vertically consonant transposition of the tonic, or a vertically dissonant chord. This finding indicates a different perception of chords within a cadence in contrast with chords in isolation, since the CI listeners judged the major chords (which ended the original cadences in one experimental condition) as clearly more consonant than the more dissonant types when presented alone.

Difficulties in perceiving musical harmony through a CI may also depend on musical experience. In NH listeners, substantial aspects of musical harmony perception develop with age. For example, the identification of the musical modes major and minor with happy and sad emotions, respectively, is, in accompanied melodies, available by the age of eight but not at the age of four (Gregory et al., 1996). Horizontal aspects of harmony are significantly more subject to development. Processing of authentic cadences is not completely available to children at the age of 5 years when compared to children at 11 years (Schellenberg et al., 2005). These authors also concluded that acquisition of knowledge on horizontal aspects of harmony mostly relies on implicit learning. Only sensory consonance of isolated tone combinations is regarded as predominantly innate (Trainor and Heinmiller, 1998; however, see Plantinga and Trehub, 2014).

Such findings suggest that lack of exposure to music contributes to the above-mentioned difficulties of CI listeners in grasping harmonic syntax (Knobloch et al., 2018). Whereas these data were obtained from experienced, postlingually deafened listeners who were exposed to music prior to implantation, it is largely unclear to what degree harmonic concepts, such as vertical consonance or horizontal cadences, might be transferred from previous acoustical experience to the perception of the CI signal. The findings of Knobloch et al. (2018) argue against such a benefit, because except for a single case their CI listeners were largely unable to recognize authentic cadences. It is, however, possible that the comparison with the acoustic music experience renders the music experience via the CI uncomfortable and confusing, because the dissimilar sound sensation of the electrical stimulation might conflict with the memory of previously experienced musical nuances. In this case, prelingually deafened CI listeners might respond more like NH listeners, especially with respect to deviant cadences.

In order to separate the contribution of prior musical experience from the signal-driven percept, this study focused on prelingually deafened children, whose only hearing experience is through a CI. This study includes three experiments, each focusing on a different aspect of harmony perception. Isolated chords had to be discriminated in the first experiment, providing a prerequisite for a correct perception of cadences. The hypothesis is that the CI listeners may be able to perform this task, although less accurately than the NH listeners, since the representation of the stimuli in the CI should be different for the different chords. The second experiment tested vertical consonance by investigating which chord types were preferred as more pleasant over others. If vertical harmony was preserved by the degraded CI signal, CI listeners with some musical experience, at least implicitly acquired, would actually prefer the same chords as their NH peers. The third experiment investigated the ability of the CI users to evaluate the musically correct chord progression in the form of authentic cadences with respect to horizontal harmony. If previous musical experience interfered with the experience of music through the CI, the prelingually deaf participants would be expected to be more successful here than the postlingually deaf adults in Knobloch et al. (2018). The cohort of NH listeners covered the hearing age of the CI listeners and was included in the study to test if the tasks were appropriate even for the youngest participants.

General Methods

Participants

Cochlear implant listeners were recruited from regular follow-up visitors at the university hospital in Magdeburg and the Cecilienstift Cochlear Implant Rehabilitation Center in Halberstadt. Twelve children with bilateral congenital or prelingual deafness (four males and eight females) participated in the study. Except for listener CI02, all were implanted bilaterally and used both devices in daily life. Their age ranged between 7.6 and 18.9 years, with a mean of 14.4 ± 3.4 years. They had a CI experience between 6.0 and 17.2 years with a mean of 12.5 ± 3.3 years which is referred to as hearing age below. CI experience was highly correlated with age at testing, r = 0.989, p < 0.01. Seven of them used devices by MED-EL, four by Cochlear and one by Advanced Bionics. All of them were profoundly deaf by 2 years of age. No cases of known neurologic disorders or meningitis were included. Demographic and device data are specified in Table 1. All CI listeners spoke German as their first language. None of them had received any musical training beyond school, which usually covers some singing and basic musical knowledge. In particular, they did not participate in any individual instrument training.

TABLE 1

Table 1. Demographic and device data of the CI participants.

Twenty-four NH children (14 males, 10 females) without musical training beyond school served as control group. They were recruited through internet announcements. Their age ranged between 5.8 and 18.2 years with a mean of 12.3 ± 3.5 years, thus matching the hearing age of the CI group. Normal hearing was verified prior to the experiments with pure-tone audiometry at audiometric frequencies from 125 to 8000 Hz. To be considered a NH listener, a child had to have thresholds better than or equal to 25 dB hearing loss at all frequencies in both ears. All NH children spoke German as their first language. All 12 CI users and 24 NH listeners completed the three experiments described below. Written informed consent to the study was obtained before the measurement from a parent or legal guardian or, in the case of the older children, from the participants themselves. The study was approved by the local institutional review board as fulfilling the Declaration of Helsinki.

Apparatus

The chords used in all three experiments were constructed of four harmonic complex tones, as in our previous study with postlingually deaf adults (Knobloch et al., 2018). Each harmonic tone complex consisted of the fundamental frequency (F0) and the next four partials (2 F0 to 5 F0) with random phases and a decay of 6 dB per partial.
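The tone construction described above can be illustrated with a minimal Python sketch. It is not the authors' MATLAB routine. The sampling rate, the function names, and the reading of "6 dB per partial" as a level drop of 6 dB for each step above the fundamental are assumptions made for illustration only.

```python
import math
import random


def partial_amplitudes(n_partials=5, decay_db=6.0):
    """Linear amplitudes of partials 1..n_partials.

    Assumption: each partial is decay_db lower in level than the one below it.
    """
    return [10 ** (-decay_db * (k - 1) / 20) for k in range(1, n_partials + 1)]


def harmonic_complex(f0, duration_s=1.5, fs=44100, n_partials=5,
                     decay_db=6.0, rng=None):
    """Samples of a harmonic complex tone with partials F0, 2 F0, ..., n_partials F0.

    fs (sampling rate) is a hypothetical value; the source does not state it.
    """
    rng = rng or random.Random()
    amps = partial_amplitudes(n_partials, decay_db)
    # One random starting phase per partial, as stated in the text.
    phases = [rng.uniform(0.0, 2.0 * math.pi) for _ in amps]
    n_samples = int(round(duration_s * fs))
    samples = []
    for i in range(n_samples):
        t = i / fs
        value = 0.0
        for k, (amp, phase) in enumerate(zip(amps, phases), start=1):
            value += amp * math.sin(2.0 * math.pi * k * f0 * t + phase)
        samples.append(value)
    return samples


# Example: a short excerpt of the tone on the lowest root frequency (125 Hz).
tone = harmonic_complex(125.0, duration_s=0.05, rng=random.Random(1))
```

A four-note chord would then be the sample-wise sum of four such tones, one per chord note; the specific voicing is given by the scores in the original article and is not reproduced here.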

The children were tested separately in a large sound-attenuated room. Sounds were presented through a single frontal monitor loudspeaker (Reveal R5A, Tannoy Ltd., Coatbridge, United Kingdom) at a distance of 1.3 m to the forehead of the child. The sound level was chosen to be clear enough and comfortable to the listener, and did not exceed 85 dB SPL. If the child preferred so, a parent was allowed to be present within the room but outside the child’s view and without the opportunity to interact.

Stimulus presentation and response collection were administered by a MATLAB graphical user interface (The Mathworks Inc., Natick MA, United States). Instructions were provided and responses were given on a touchscreen monitor display in front of the listener.

Procedure

In order to familiarize the children with the tasks and the setup, the experiment was preceded by a short visual two-interval, two-alternative task that was a visual analog to the first discrimination experiment and used the same graphical user interface. Two pictures (drawn from a cartoon animal set of an orange mouse, a blue elephant, and a yellow duck) were shown in succession, together with the instruction: “Are the following images identical?” After each presentation, two answer buttons were shown: one marked “Ja” (“Yes” in German) with two identical pink triangles, and one marked “Nein” (“No” in German) with two different symbols (a pink triangle and a yellow circle). The symbols were added to enable even children without perfect reading skills to respond adequately. It was evident after only a few presentations that all children (including the youngest) responded perfectly and were thus able to perform the discrimination task.

Blocks of the three experimental tasks were interspersed. In each experiment, the listener started the next trial by pressing the button marked “Listen.” Stimuli started then without any cue sound after 500 ms. Repeated listening was allowed in all experiments, but this was rarely used by most of the CI and NH listeners. The specific tasks are described in detail in the following sections, and include description of the statistical analysis specific to each experiment.

Statistical analyses were performed with SPSS Statistics version 24 (IBM, Armonk NY, United States). For all experiments, Pearson correlations were used to analyze age correlations.

Experiment 1

Methods

Experiment 1 assessed the discrimination of two chords. All chords were presented in open harmony. At least six semitone steps separated adjacent notes within each chord. Four chord types were used: major, minor, augmented, and diminished chords. Their musical scores are shown in Figure 1. The fundamental F0 of the chord root was randomly chosen from five values separated by one semitone step: 125, 132, 140, 148, and 157 Hz. Each chord had a duration of 1500 ms including 80 ms raised cosine ramps at the beginning and end. The two chords of a pair were separated by silence of 2000 ms. The chord pairs were generated on demand by a MATLAB routine. Equal numbers of all chord types were presented in 48 pairs, 24 comprising identical chords and 24 comprising differing chords. Thus, each chord type occurred in six identical pairs and twelve times in a differing pair. The pairs were divided into four blocks of 12 trials each.
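The temporal shaping of a single chord can be sketched as follows. This is a minimal illustration rather than the original MATLAB routine: the sampling rate and the function name are hypothetical, and the ramp is assumed to follow the usual raised-cosine form 0.5 (1 - cos(pi t / T)).

```python
import math


def raised_cosine_envelope(duration_ms=1500, ramp_ms=80, fs=44100):
    """Gain envelope with raised-cosine onset and offset ramps.

    duration_ms and ramp_ms follow the text; fs is an assumed sampling rate.
    """
    n_total = int(round(duration_ms * fs / 1000))
    n_ramp = int(round(ramp_ms * fs / 1000))
    envelope = [1.0] * n_total
    for i in range(n_ramp):
        # Gain rises smoothly from 0 toward 1 over the ramp duration.
        gain = 0.5 * (1.0 - math.cos(math.pi * i / n_ramp))
        envelope[i] = gain                # onset ramp
        envelope[n_total - 1 - i] = gain  # mirrored offset ramp
    return envelope


# Example with a low sampling rate so the values are easy to inspect.
env = raised_cosine_envelope(fs=1000)
```

Multiplying the chord waveform sample by sample with this envelope yields the gated stimulus. With the timing given in the text, one trial consists of 1500 ms of chord, 2000 ms of silence, and another 1500 ms of chord, i.e., 5000 ms in total.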

FIGURE 1

Figure 1. Musical scores for the four different chords used in experiments 1 and 2.

The instruction of the graphical user interface read (English translation of the original German instruction): “You will hear two sounds one after another. Are they the same?” It was followed by the same two answer buttons as above.

For the statistical analysis of the data, each response of type “Yes” (same) following an identical pair was considered a hit, and each response of type “Yes” following a differing pair a false alarm. For each participant, the occurrence rates of hits (HR) and false alarms (FR) were converted into a sensitivity index according to signal detection theory as d′ = z(HR) – z(FR) (Macmillan and Creelman, 1990). In order to avoid infinite values, perfect false alarm rates of 0 were replaced by 1/(2n), n being the number of differing pairs, and perfect hit rates of 1 were replaced by 1 – 1/(2m), m being the number of identical pairs (cf. Stanislaw and Todorov, 1999). With this correction, perfect performance results in a value of d′ = 4.07. A one-sample t-test was used to examine if d′ was different from zero, i.e., chance performance, in the groups. To examine a possible bias in the answer behavior, the decision criterion c = –[z(HR) + z(FR)]/2 was also calculated (Macmillan and Creelman, 1990). A listener’s c < 0 would indicate a bias toward judging even differing pairs as identical. Again, a one-sample t-test examined if c was different from zero in the groups. An independent-samples t-test was used to compare the mean d′ values of the two groups.
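The conversion of response counts into d′ and c can be sketched in a few lines of Python. This is an illustrative sketch, not the analysis code used in the study; the function name and argument names are hypothetical. The criterion is computed with the sign convention in which a negative c corresponds to a tendency to answer “same,” matching the interpretation given above. Only the two corrections named in the text are applied, so a hit rate of 0 or a false alarm rate of 1 would raise an error here.

```python
from statistics import NormalDist


def sdt_indices(hits, n_same, false_alarms, n_diff):
    """Return (d_prime, c) from counts of "same" responses.

    hits:          "same" responses to identical pairs (out of n_same)
    false_alarms:  "same" responses to differing pairs (out of n_diff)
    """
    hit_rate = hits / n_same
    fa_rate = false_alarms / n_diff
    # Corrections described in the text to avoid infinite z scores.
    if hit_rate == 1.0:
        hit_rate = 1.0 - 1.0 / (2 * n_same)
    if fa_rate == 0.0:
        fa_rate = 1.0 / (2 * n_diff)
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    d_prime = z(hit_rate) - z(fa_rate)
    # Negative c: bias toward responding "same" (assumed sign convention).
    criterion = -(z(hit_rate) + z(fa_rate)) / 2.0
    return d_prime, criterion


# Perfect performance with 24 identical and 24 differing pairs.
d_perfect, c_perfect = sdt_indices(24, 24, 0, 24)
print(round(d_perfect, 2))  # 4.07, the ceiling stated in the text

# A hypothetical listener who often answers "same" to differing pairs.
d_biased, c_biased = sdt_indices(22, 24, 10, 24)
```

For the hypothetical second listener, c comes out negative because both the hit rate and the false alarm rate are shifted toward “same” responses.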

Results

In the discrimination of chord types, five of the twelve CI listeners obtained a sensitivity index d′ < 1. According to signal detection theory, this means that the probability density functions of the responses to targets and distractors are separated by less than one standard deviation (Stanislaw and Todorov, 1999). In other words, these listeners did not discriminate the chord types. In contrast, only two of the 24 NH listeners performed at such a low level. The CI listeners obtained a group mean d′ = 1.19 (SD = 0.86). The NH control listeners reached a group mean d′ = 2.00 (SD = 0.90), indicating that they were mostly able to discriminate the different chords. One-sample t-tests showed that the sensitivity indices of both groups were significantly above chance level, t(11) = 4.77, p < 0.01 for the CI listeners and t(23) = 10.85, p < 0.001 for the NH listeners. The performance of the CI listeners was significantly lower than that of the NH listeners, t(34) = 2.58, p < 0.05.

To display the perceived differences between the chord types, Table 2 lists the correct rejection rates, i.e., the percentage of differing pairs rated as different, for each combination of chord types. The NH listeners discriminated the pairs involving a minor chord with greater accuracy than the others. The pattern is similar but less pronounced in the CI listeners.

Figure 2 shows d′ values as a function of hearing age for individual subjects. A significant correlation was found in CI listeners between hearing age and d′, r = 0.654, p < 0.05. The correlation with age at testing was also significant, r = 0.654, p < 0.05. For the NH control listeners, the correlation between age and d′ was not significant, r = 0.378, p > 0.05.

The mean decision criterion, which tests for a tendency toward one of the two response alternatives, was c = -0.25 (SD = 0.64) for the CI listeners. This value was not significantly different from zero, t(11) = -1.33, p > 0.05. For the NH listeners, however, the mean decision criterion was c = -0.34 (SD = 0.47), which was significantly different from zero, t(23) = -3.68, p < 0.01, indicating a bias toward judging the chords as “same.”
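As a plausibility check, the t statistics for d′ reported in this section can be approximately recomputed from the rounded group means and standard deviations. The sketch below is illustrative only. It assumes a pooled-variance (Student) t-test for the group comparison, which is consistent with the reported 34 degrees of freedom, and the function names are hypothetical. Because the inputs are rounded summary values, small deviations from the reported statistics are expected.

```python
import math


def one_sample_t(mean, sd, n, mu0=0.0):
    """t statistic for testing a sample mean against mu0."""
    return (mean - mu0) / (sd / math.sqrt(n))


def pooled_two_sample_t(mean1, sd1, n1, mean2, sd2, n2):
    """Student t statistic and degrees of freedom, assuming equal variances."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    se = math.sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))
    return (mean1 - mean2) / se, df


# Rounded summary values for d' as given in the text.
t_ci = one_sample_t(1.19, 0.86, 12)   # reported: t(11) = 4.77
t_nh = one_sample_t(2.00, 0.90, 24)   # reported: t(23) = 10.85
t_groups, df_groups = pooled_two_sample_t(2.00, 0.90, 24, 1.19, 0.86, 12)
print(round(t_groups, 2), df_groups)  # 2.58 34
```

The one-sample values come out close to, but not exactly at, the reported statistics, as would be expected when starting from means and standard deviations rounded to two decimals.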

TABLE 2

Table 2. Detailed percent correct rejection rates, i.e., differing pairs rated as different, of experiment 1, for the combinations of chord types.

FIGURE 2

Figure 2. Sensitivity index d′ for experiment 1 (chord discrimination task) for individual CI (circles) and NH listeners (diamonds), as a function of hearing age. The lines show linear regressions to the data.

Discussion

Although CI listeners on average were able to discriminate the chords, discrimination performance was significantly poorer than that of the NH peers. This was expected, since CI listeners typically face difficulties in tasks that rely on accurate spectral information (Limb and Roy, 2014). CI listeners often exhibit pitch difference limens for single tones on the order of several semitones (see, e.g., Pretorius and Hanekom, 2008; Kang et al., 2009). In the present experiment, a given pair of chords differed by only one or two semitones in the top two notes of the chords. Taking this small difference into account, an even larger discrepancy between the two groups might have been expected. In some cases, children listening through a CI have been reported to discriminate chords on the same level as their NH peers (Roy et al., 2014). In their experiment, the target chords were framed by harmonically related major chords. Whereas this framing is not expected to facilitate the discrimination, the good performance might be related to large contrasts between the center chords in the frequency range. The present study showed that when using only chords with the same fundamental, still half of the CI listeners were able to discriminate these chords, although with more difficulty than the NH peers. It should be noted that the NH listeners showed a significant bias toward judging the pairs as same, underlining that even to them the stimuli sounded rather similar.