Individuality That is Unheard of: Systematic Temporal Deviations in Scale Playing Leave an Inaudible Pianistic Fingerprint

Van Vugt, Floris  Tijmen; Jabusch, Hans-Christian; Altenmüller, Eckart

doi:10.3389/fpsyg.2013.00134

ORIGINAL RESEARCH article

Front. Psychol., 21 March 2013

Sec. Cognitive Science

Volume 4 - 2013 | https://doi.org/10.3389/fpsyg.2013.00134

This article is part of the Research Topic “Individuality in music performance.”

Floris Tijmen Van Vugt¹,²*

Hans-Christian Jabusch³

Eckart Altenmüller¹

¹Institute of Music Physiology and Musicians’ Medicine, University of Music, Drama, and Media, Hanover, Germany
²Lyon Neuroscience Research Center, CNRS-UMR 5292, INSERM U1028, University Lyon-1, Lyon, France
³Institute of Musicians’ Medicine, University of Music “Carl Maria von Weber,” Dresden, Germany

Whatever we do, we do it in our own way, and we recognize master artists by small samples of their work. This study investigates individuality of temporal deviations in musical scales in pianists in the absence of deliberate expressive intention. Note-by-note timing deviations away from regularity form a remarkably consistent “pianistic fingerprint.” First, eight professional pianists played C-major scales in two sessions, separated by 15 min. Euclidian distances between deviation traces originating from different pianists were reliably larger than those between traces originating from the same pianist. As a result, a simple classifier that matched deviation traces by minimizing their distance was able to recognize each pianist with up to 100% accuracy. Furthermore, within each pianist, fingerprints produced by the same movements were more similar than fingerprints resulting in the same scale sound. This allowed us to conclude that the fingerprints are mostly neuromuscular rather than intentional or expressive in nature. However, human listeners were not able to distinguish the temporal fingerprints by ear. Next, 18 pianists played C-major scales on a normal or muted piano. Recognition rates ranged from 83 to 100%, further supporting the view that auditory feedback is not implicated in the creation of the temporal signature. Finally, 20 pianists were recognized 27 months later at above chance level, showing signature effects to be long lasting. Our results indicate that even non-expressive playing of scales reveals consistent, partially effector-unspecific, but inaudible inter-individual differences. We suggest that machine learning studies into individuality in performance will need to take into account unintentional but consistent variability below the perceptual threshold.

Introduction

Our actions are highly individual and we can tell people apart by how they move (Flach et al., 2004; Loula et al., 2005; Prasad and Shiffrar, 2009; Sevdalis and Keller, 2011). People may recognize those close to them by the way they sneeze or walk the stairs. Even when trying to achieve the same aim, the actions that are selected toward this aim and the way in which they are executed vary considerably between individuals. The human observer seems to rely on action simulation to recognize individuals by their movements, since recognition is generally stronger when distinguishing one’s own performance from that of others (Jeannerod, 2003).

A first question is how movements from different individuals vary physically. Why are certain parameters of our actions remarkably stable between multiple iterations by the same person, and yet strikingly different between individuals? A second question is to what extent movements vary perceptually. For example, some movements may differ so subtly that the individual features are not distinguishable to a human observer under normal conditions.

Music is a suitable paradigm to study individuality since actions are directed toward a clearly defined auditory goal: when we play music, the aim is to make a certain sound. Furthermore, differences between performers are sometimes so salient that listeners will often refuse to listen to a musical piece that is a mere “cover” of the original. Music played by different individuals varies physically. For example, machine ensemble learning approaches are able to tell musical performers apart based on structural features such as timing and loudness differences (Stamatatos and Widmer, 2005) or kinematics (Dalla Bella and Palmer, 2011). The individuality is also perceptual. Indeed, non-musicians and musicians alike were able to recognize performances reliably (Gingras et al., 2011). Again, action simulation in the form of musical imagery appears to play a role in the recognition process. For example, piano players turn out to be capable of recognizing their own playing from a few months previously, even if the sound was switched off at the time of the recording (Repp and Knoblich, 2004).

In music performance recognition the differences in sound that different players produce are often understood as a result of their artistic individuality. However, there is no reason to assume that the individuality in the way we walk serves any particular purpose. Indeed, even task-irrelevant sounds matching a golf swing are recognized significantly better than chance (Murgia et al., 2012). On the other hand, individuality in music performance is tacitly assumed to define a performer’s unique artistic identity. But we have to date no empirical validation of the extent to which individuality in music performance is deliberate. The study coming closest to answering this question requested pianists to play mechanically, and found that recognition was somewhat impaired for these inexpressive recordings (Gingras et al., 2011). However, even metronomic playing has been shown to contain the same timing patterns as expressive playing, but to a lesser extent (Repp, 1999a). To avoid this problem, we instead investigated the playing of musical scales (Wagner, 1971; MacKenzie and Van Eerd, 1990). When participants are instructed to play a scale as regularly as possible and in a legato style, there is a clear auditory target of perceptual evenness and it is understood that the task at hand is not to play scales in one’s own particular way. In other words, isolated scales are not thought of as expressive musical materials. There is some objective standard and trying to meet it is a merely technical task.

Yet, it is found that musical scales show systematic temporal deviations (MacKenzie and Van Eerd, 1990; van Vugt et al., 2012). These deviations are thought of as the result of perceptual distortions (Drake, 1993), residual expressive timing (Repp, 1999a), or of some note transitions involving more difficult movements (Engel et al., 1997).

Our question is whether these temporal deviations are individual in the same way that expressive performance is. We restrict our attention to the timing of note onsets, discarding information such as differences in loudness and note duration. In Experiment I, we first establish the timing deviations of individual notes (van Vugt et al., 2012). The resulting timing profile is then used to recognize pianists across two sessions, separated by 15 min. In this way, we aim to establish individuality that is physically present in the timing of musical scales. In Experiment II, we assess whether the timing differences can be perceived by musically trained observers. In Experiment III, we investigate the role of auditory feedback in the formation of these timing profiles. Finally, in order to investigate to what extent these timing deviation profiles are stable, we follow a group of pianists over 27 months in Experiment IV.

Experiment I

Materials and Methods

The data reported here were collected as part of a validation procedure for a scale unevenness quantification method published elsewhere (Jabusch et al., 2004). Eight pianists (six female) were recruited from the student/teacher pool at the Hanover University of Music and were 24.3 (SD 2.4) years old. All but one were right-handed (M = 57.2, SD = 66% right-handed according to the Edinburgh handedness inventory). None of the participants reported any neurological condition. Participants played on a MP 9000 MIDI keyboard (Kawai, Krefeld, Germany). The keyboard’s digital music interface (MIDI out) signal was captured on a PC using a commercially available sequencer software (Musicator Win, version 2.12; Music Interactive Technology, Bergen, Norway).

Participants were requested to play two-octave C-major scales beginning with the C (131 Hz) one octave below the middle C and ending with the C (523 Hz) one octave higher than the middle C. Ascending and descending scales were interleaved. The instruction to the participants was to play as evenly as possible, without expression, and in a legato style at mezzo-forte loudness. A metronome gave a beat at 120 BPM and the instructions were to play at four notes per metronome beat, resulting in eight notes per second. Participants performed 10–15 scales with the right hand and with the left hand (first measurement). After a 15 min break, the procedure was repeated (follow-up).

Analysis of Scale Timing

First, we isolated correctly performed scale runs, discarding those containing errors or surplus notes. We then converted the note values to their rank in the C-major scale (i.e., C has rank 0, D has rank 1, E has rank 2, etc., up to C″ with rank 14) and performed a least-squares straight line fit to this set of pairs of rank and onset time. This allowed us to compute for each note the expected onset time (according to this fit) and then the deviation of the actually measured onset from it (in ms) (van Vugt et al., 2012). We performed this fit for all scale runs and then pooled the results by hand (left or right), playing direction (inward or outward) and note, calculating the mean lateness (in ms) for that condition. The result was a 2 (hands) × 2 (directions) × 15 (notes) matrix of timing deviations, which we will refer to as our irregularity trace. As an illustration, Figure 1A shows the irregularity trace for right hand ascending scales in one pianist in the two measurement sessions, and Figure 1B for two different pianists. The irregularity traces originating from the same pianist (Figure 1A) are strikingly similar, whereas those originating from different pianists (Figure 1B) are qualitatively different. This is the observation that our analysis (described below) aims to capture.
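The fitting step described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' code: the function names `timing_deviations` and `irregularity_trace` are hypothetical, onsets are assumed to be given in ms in the order played, and each run is assumed to be a correct 15-note scale.

```python
from statistics import mean


def timing_deviations(onsets_ms):
    """Deviation (ms) of each onset from a least-squares line through (rank, onset).

    Positive values mean the note came later than the fitted line predicts.
    """
    ranks = list(range(len(onsets_ms)))  # rank 0 for the first note, 14 for the last
    mean_rank = mean(ranks)
    mean_onset = mean(onsets_ms)
    sxx = sum((r - mean_rank) ** 2 for r in ranks)
    sxy = sum((r - mean_rank) * (t - mean_onset) for r, t in zip(ranks, onsets_ms))
    slope = sxy / sxx                      # fitted inter-onset interval in ms
    intercept = mean_onset - slope * mean_rank
    return [t - (intercept + slope * r) for r, t in zip(ranks, onsets_ms)]


def irregularity_trace(runs):
    """Mean per-note deviation over several correct runs of one hand and direction."""
    per_run = [timing_deviations(run) for run in runs]
    return [mean(note_values) for note_values in zip(*per_run)]


# Hypothetical data: a perfectly even run at 125 ms per note, and one run
# in which the eighth note arrives 10 ms late.
even_run = [125.0 * i for i in range(15)]
late_run = list(even_run)
late_run[7] += 10.0
trace = irregularity_trace([even_run, late_run])
```

Because the line is fitted per run, a pianist who plays slightly faster or slower than the metronome is not penalized; only departures from that run's own regular grid enter the trace.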

FIGURE 1

Figure 1. Illustration of the note onset timing traces of two typical pianists, showing only the right hand ascending scale timings. One pianist (CA) was recorded playing two-octave C-major scales. Using a previously established technique, we are able to determine the precise timing of each individual note (for further details see text). (A) The note-by-note temporal deviation (in ms) is strikingly similar between the two recordings (blue and green line). The red vertical bars and shaded area indicate the temporal distance between the traces, which is on average around 3 ms. (B) Comparison of CA’s temporal deviation trace with that of a different pianist (MD). The traces are qualitatively different, which is captured by a higher temporal distance of around 7 ms.

Additionally, we calculated the unevenness of the scale in accordance with a previously established protocol (Jabusch et al., 2004) as follows. For each correct scale run, the intervals between the consecutive note onsets were calculated and then we took the standard deviation of these. For each hand, direction, and recording (first or follow-up) we took the median of the standard deviations of the scale runs (in ms). The higher this unevenness score, the more temporally irregular the scales.
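As a rough sketch of this unevenness score, the following Python computes the standard deviation of the inter-onset intervals per run and the median across runs. The function names are hypothetical, and the use of the sample standard deviation (n − 1 denominator) is an assumption, since the text does not specify which variant was used.

```python
from statistics import median, stdev


def run_unevenness(onsets_ms):
    """Standard deviation (ms) of the intervals between consecutive note onsets."""
    intervals = [later - earlier for earlier, later in zip(onsets_ms, onsets_ms[1:])]
    # Assumption: sample standard deviation; the source does not say which variant.
    return stdev(intervals)


def condition_unevenness(runs):
    """Median of the per-run unevenness values for one hand, direction, and recording."""
    return median(run_unevenness(run) for run in runs)


# Hypothetical onsets: intervals alternate between 120 and 130 ms.
uneven_run = [0.0, 120.0, 250.0, 370.0, 500.0]
score = run_unevenness(uneven_run)
```

Note that this score collapses a whole run into one number, so two pianists with very different note-by-note profiles can obtain the same unevenness.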

In ANOVAs we report $η_{G}^{2}$ as the generalized effect size (Bakeman, 2005). Following musicological notational convention, we will refer to the notes in the scale as 1, 2, 3, 4, 5, 6, 7, 1′, 2′, 3′, 4′, 5′, 6′, 7′, 1″, in ascending order.

Results

Preliminaries

First, we isolated the correctly played scales, yielding an average total of 11.7 (SD 0.97) scales per person and condition. As a control analysis, we used the number of scales as an outcome measurement in an ANOVA that revealed no significant difference according to hand [F(1, 7) = 3.43, p = 0.11], direction [F(1, 7) ≈ 0.00, p ≈ 1.00], recording session [F(1, 7) = 1.19, p = 0.74] nor any interaction effect [all F(1, 7) < 0.11]. We can conclude that there is no selection bias due to the discarding of scales.

Now we turn to the unevenness measure (the standard deviation of the inter-keystroke-intervals). ANOVA yielded a significant main effect of hand [F(1, 7) = 5.73, p < 0.05, $η_{G}^{2}$ = 0.04], showing that left hand scales were played more unevenly (mean unevenness 9.19 ms, SD 1.67) than right hand scales (mean unevenness 8.44 ms, SD 1.81). This replicates a previous finding (Kopiez et al., 2011). There was no main effect of playing direction [F(1, 7) = 0.01, p = 0.92] nor of recording session [F(1, 7) = 1.00, p = 0.35] but there was a two-way interaction between direction and recording [F(1, 7) = 7.00, p = 0.03, $η_{G}^{2}$ = 0.02], showing that although outward scales were played equally evenly across the sessions, inward scales were more even in the follow-up session (unevenness 8.43 ms, SD 1.86) than in the first session (unevenness 9.13 ms, SD 2.33), perhaps revealing a habituation effect.

Recognizing individual pianists

A salient feature of the temporal traces is that they are highly individual: traces from the same individual but different sessions vary little, whereas traces from different pianists vary much more (Figure 1). To quantify this observation, we define the temporal distance as the Euclidian distance between any pair of vectors representing the irregularity traces. That is, we calculated the sum of squares of the item-by-item differences. Then we divided this by the number of notes in the traces (15 notes for a two-octave scale). Finally, we took the square root to yield a distance value in ms. We first calculated these distances separately for each of the two hands and two directions. We find that irregularity traces originating from the same pianist have a distance of 3.42 ms (SD 0.89), whereas those originating from different pianists have a distance of 7.24 ms (SD 0.54) (Figure 4). ANOVA with distance as dependent variable shows a significant main effect of self vs. other [F(1, 7) = 108.18, p < 0.001, $η_{G}^{2}$ = 0.79] but no effect of hand [F(1, 7) = 0.55, p = 0.48] nor playing direction [F(1, 7) = 0.30, p = 0.60] nor any interaction effect [all F(1, 7) < 1.1].
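The distance computation described in this paragraph can be written as a short Python sketch. The function name `temporal_distance` is hypothetical. Because the sum of squares is divided by the number of notes before the square root is taken, the quantity is a root-mean-square difference, that is, the ordinary Euclidian distance scaled by 1/√n.

```python
from math import sqrt


def temporal_distance(trace_a, trace_b):
    """Root-mean-square difference (ms) between two irregularity traces."""
    if len(trace_a) != len(trace_b):
        raise ValueError("traces must contain the same number of notes")
    squared = sum((a - b) ** 2 for a, b in zip(trace_a, trace_b))
    return sqrt(squared / len(trace_a))


# Hypothetical traces: every note differs by exactly 3 ms.
d = temporal_distance([0.0, 1.0, -2.0, 4.0], [3.0, 4.0, 1.0, 7.0])
```

Dividing by the number of notes keeps the result on a per-note millisecond scale, which is what makes values such as 3.42 ms and 7.24 ms directly interpretable.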

As a result, we designed the simplest possible classification algorithm as follows. Our algorithm is given a database of the irregularity traces for the first measurements of each of the eight pianists. Then it is presented each of the follow-up irregularity traces, without the player label, and its task is to match each pianist to one of the traces in its database. Our algorithm simply chooses the irregularity trace that matches most closely.
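The matching rule amounts to a nearest-neighbor lookup. The sketch below illustrates it under assumptions: the names `recognize` and `temporal_distance`, the dictionary layout of the database, and the pianist labels and trace values are all hypothetical.

```python
from math import sqrt


def temporal_distance(trace_a, trace_b):
    """Root-mean-square difference (ms) between two equally long traces."""
    squared = sum((a - b) ** 2 for a, b in zip(trace_a, trace_b))
    return sqrt(squared / len(trace_a))


def recognize(unknown_trace, database):
    """Return the label of the stored trace closest to the unlabeled trace."""
    return min(database, key=lambda label: temporal_distance(unknown_trace, database[label]))


# Hypothetical first-session traces (shortened to three notes for readability).
first_session = {
    "pianist_1": [0.0, 2.0, -1.0],
    "pianist_2": [5.0, -4.0, 3.0],
}
# Hypothetical follow-up trace without a label.
guess = recognize([0.5, 1.5, -1.2], first_session)
```

Each follow-up trace is matched independently here, so two follow-up traces may be assigned to the same stored pianist; whether the original procedure enforced a one-to-one assignment is not stated in the text.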

This procedure is performed separately for the four sets of average irregularity traces from the two hands and two playing directions. Classification was flawless (100%) for all the right hand scales (inward and outward), as well as the left hand outward scales. In the left hand inward scales, six pianists are classified correctly and two incorrectly. Chance is at 0.125 recognition rate, meaning that in all cases classification is significantly better than chance [binomial p < 0.001, 95% confidence interval = (0.35, 0.97) for the left hand inward scales and (0.63, 1.0) for the other cases]. When instead of the complete irregularity trace (15 data points per two-octave scale) we used only the unevenness (one data point per two-octave scale) classification rate dropped to between 0.25 and 0.5, which exceeded chance performance only for the right hand inward scales [binomial p = 0.01, 95% confidence interval = (0.16, 0.84)].
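To make the comparison against chance concrete, the following sketch computes the probability of obtaining at least the observed number of correct matches if each guess had a 1/8 chance of being right. The function name is hypothetical, and treating the test as a one-sided exact binomial test is an assumption, since the text does not specify the variant used.

```python
from math import comb


def binomial_tail(successes, trials, chance):
    """Probability of at least `successes` hits in `trials` independent guesses."""
    return sum(
        comb(trials, k) * chance ** k * (1.0 - chance) ** (trials - k)
        for k in range(successes, trials + 1)
    )


CHANCE = 1 / 8  # eight pianists in the database

p_six_of_eight = binomial_tail(6, 8, CHANCE)   # left hand inward scales
p_four_of_eight = binomial_tail(4, 8, CHANCE)  # best unevenness-only result
p_two_of_eight = binomial_tail(2, 8, CHANCE)   # worst unevenness-only result
```

Under this assumption, six of eight correct falls well below 0.001, four of eight gives roughly 0.011, and two of eight gives roughly 0.26, which agrees with the pattern reported above.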

The Euclidian distance is not necessarily the only or best way to quantify the (dis)similarity between irregularity traces. To illustrate this, we perform the same analysis, but this time we compute the correlation (Pearson r) between pairs of irregularity traces. ANOVA on the Fisher r-to-z transformed correlation coefficients shows a main effect of self vs. other [F(1, 7) = 63.92, p < 0.001, $η_{G}^{2}$ = 0.74], showing that correlations between irregularity traces from the same pianists are higher [z(r) = 1.39, SD 0.42] than irregularity traces from different pianists [z(r) = 0.40, SD 0.21]. There is no effect of hand except for a trend [F(1, 7) = 5.40, p = 0.05, $η_{G}^{2}$ = 0.03], nor a main effect of direction [F(1, 7) = 2.76, p = 0.14]. Of the interaction effects only that between hand and direction [F(1, 7) = 11.50, p = 0.01, $η_{G}^{2}$ = 0.10] is significant [all other F(1, 7) < 1.05], revealing that whereas left hand traces correlate equally in both playing directions, right hand inward scales correlate higher than outward scales.

We re-ran our recognition algorithm with the only difference that this time, given an irregularity trace to recognize, it chose the irregularity trace that showed the greatest correlation. Recognition rates are identical to those for Euclidian distance: flawless in all but the case of left hand inward scales, with six out of eight correctly classified (hence still exceeding chance performance).
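The correlation-based variant can be sketched as follows. The function names and the example data are hypothetical; the Fisher transform itself is the standard z = artanh(r).

```python
from math import atanh, sqrt, tanh


def pearson_r(trace_a, trace_b):
    """Pearson correlation between two equally long irregularity traces."""
    n = len(trace_a)
    mean_a = sum(trace_a) / n
    mean_b = sum(trace_b) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(trace_a, trace_b))
    var_a = sum((a - mean_a) ** 2 for a in trace_a)
    var_b = sum((b - mean_b) ** 2 for b in trace_b)
    if var_a == 0 or var_b == 0:
        raise ValueError("correlation is undefined for a constant trace")
    return cov / sqrt(var_a * var_b)


def fisher_z(r):
    """Fisher r-to-z transform, applied before averaging or running an ANOVA."""
    return atanh(r)


def recognize_by_correlation(unknown_trace, database):
    """Return the label whose stored trace correlates most with the unlabeled trace."""
    return max(database, key=lambda label: pearson_r(unknown_trace, database[label]))


# Back-transforming the reported mean z values to the correlation scale.
r_same_pianist = tanh(1.39)
r_different_pianists = tanh(0.40)
```

Back-transformed, the reported means correspond to correlations of about 0.88 within a pianist and about 0.38 between pianists. Unlike the distance measure, correlation ignores any overall scaling of a trace, so it compares only the shape of the deviation profile.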

Comparing irregularity traces of the same pianist

So far, we have only compared irregularity traces produced by the same hand and in the same playing direction. How do the traces produced by the same pianist but by different hands and different directions compare? We argue that these comparisons may provide crucial insight into what causes the timing deviations (Figure 2A). Our reasoning was as follows. If the temporal deviations result from remnants of expressive timing (Repp, 1999a), then we expect irregularity traces that sound similar to be more similar. That is, we expect the left hand inward and right hand outward traces to be closest together (since they have the same auditory result, modulo octave differences), and similarly the right hand inward and left hand outward traces to be close. If, on the other hand, the temporal deviation traces are mostly determined by biomechanical or neuromuscular factors, then we expect traces generated by the same movements to be closer together than those generated by different movements (Figure 2B). More specifically, the pairs of inward and pairs of outward scales are expected to be closer together than pairs with an inward and outward scale.

FIGURE 2

Figure 2. (A) Overview of the body-central directions (inward and outward, in blue) and the keyboard-central directions (ascending and descending, in green). (B) Predictions of the two hypotheses. If the irregularity traces mostly stem from neuromuscular constraints, we expect traces originating from the same movements to be similar. If they originate mostly from residual expression, we expect traces producing the same sounds to be similar. (C) Experimental results, in line with the neuromuscular hypothesis.

Furthermore, note that in all these comparisons we have aligned the irregularity traces in time (in the order in which they are played) and not in space (the order in which they appear on the keyboard). That means, when we compare left hand inward and right hand inward scales, they are the same movement in time, but mirrored in space.
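The relation between body-central and keyboard-central directions is easy to mix up, so a small sketch may help. It assumes the usual seating position, in which the left hand moves inward when ascending and the right hand moves inward when descending; the function names are hypothetical.

```python
def keyboard_direction(hand, body_direction):
    """Map a hand and a body-central direction onto ascending or descending."""
    if hand not in ("left", "right") or body_direction not in ("inward", "outward"):
        raise ValueError("unknown hand or direction")
    # Left hand inward and right hand outward both move up the keyboard.
    rising = (hand == "left") == (body_direction == "inward")
    return "ascending" if rising else "descending"


def compare_conditions(cond_a, cond_b):
    """Return (same_movement, same_sound) for two (hand, body_direction) pairs."""
    same_movement = cond_a[1] == cond_b[1]
    same_sound = keyboard_direction(*cond_a) == keyboard_direction(*cond_b)
    return same_movement, same_sound


# Across hands, a pair shares either the movement or the sound, never both.
pair_same_sound = compare_conditions(("left", "inward"), ("right", "outward"))
pair_same_movement = compare_conditions(("left", "inward"), ("right", "inward"))
```

This is why the cross-hand comparisons can separate the two hypotheses: each cross-hand pair matches on exactly one of the two dimensions.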

An ANOVA with distance as dependent measure revealed a main effect of movement [F(1, 7) = 7.63, p = 0.03, $η_{G}^{2}$ = 0.10], reflecting that distances between irregularity traces produced by the same movement are shorter (6.47 ms, SD 0.85) than those produced by different movements (7.64 ms, SD 1.58) (Figure 2C). That is, the results are in line with the hypothesis that the temporal deviations are mostly neuromuscular in nature. No other factor has a main effect [all F(1, 7) < 1.6] and there were no interactions [all F(1, 7) < 2.0].

Effector-specificity of the individuality

To what extent is the individuality in the traces specific to the effector (i.e., hand)? To answer this question, we repeated the analysis above, but comparing the distances across hands within and between pianists. That is, we computed the distance between left and right hand irregularity traces for the same movement direction (inward or outward) and for either the same pianist or different pianists. We found a main effect of same vs. different pianist [F(1, 7) = 28.35, p = 0.001, $η_{G}^{2}$ = 0.01], revealing that cross-hand distances are smaller between traces from the same pianist (M = 6.41, SD = 0.87 ms) than traces from different pianists (M = 7.47, SD = 0.42 ms). There were no main effects of hand, direction or recording, nor any interaction effects [all F(1, 7) < 2.74, p > 0.14].

Discussion

Let us pause an instant to take stock. We have shown that pianists do not play scales perfectly regularly. Rather, consistent temporal deviations are present. For the first time we show that these deviations are not mere noise, since they are reliably reproduced across two recording sessions. Furthermore, differences between individuals are so pronounced that a surprisingly simple recognition algorithm is able to recognize pianists nearly flawlessly using the average timing profile of a dozen runs of two-octave scales. The algorithm works equally well when it matches irregularity traces by minimizing distance or by maximizing correlation.

An important observation is that the pianists’ temporal irregularities are qualitatively different. If the irregularity profiles had been qualitatively the same, that is, the same vector simply multiplied by a coefficient, then recognition based on the overall unevenness would perform as well as recognition using the entire irregularity trace. But we find the contrary: recognition using a simple overall unevenness metric (the standard deviation of the inter-keystroke intervals) was barely above chance. We can conclude that it is the qualitative differences in the scale timing that enable us to tell the different pianists apart. Hence we can speak of a pianistic fingerprint.

What determines this temporal fingerprint? We showed that temporal irregularity traces generated by the same movement are more similar than those generating the same sound. This suggests that the contribution of biomechanical or neuromuscular constraints to these timing profiles is stronger than that of expressive or perceptual influences. Furthermore, we found that the individuality in the traces is to some extent effector-independent: the two hands of the same pianist are less different than hands of different pianists. This suggests that the individuality is represented in cortical areas accessible to both effectors (Rijntjes et al., 1999).

In sum, temporal differences are physically present in the produced timing in musical scales. At this point, it remains unclear whether this individuality is also perceptually present: are human observers able to identify performers in the same way our algorithm could?