Event Abstract

Decoding Musicianship from Neural Processing of Musical Features

  • 1 University of Jyvaskyla, Department of Music, Finland
  • 2 Aarhus University, Department of Clinical Medicine, Denmark

Introduction

Music listening recruits a wide range of areas in the brain [1]. The decoding approach, which aims to describe stimuli or conditions in terms of changes in neural activation [2], has been employed to investigate the neural processing of musical features using functional magnetic resonance imaging (fMRI) during naturalistic music listening. Modest accuracy has been reached in predicting low- and high-level musical features [3], and these accuracy levels suggest subject-specific differences in the neural processing of musical material. In one of the few studies on such differences in neural recruitment during music listening, [4] linked musical training to increased functional symmetry in multiple regions of the brain. Decoding demographic and clinical attributes, such as psychiatric conditions, from whole-brain connectivity has received increasing attention [5]. However, this type of approach has not been employed with musical stimuli and musically relevant variables such as musicianship. Moreover, past research has not attempted to model subject-specific differences on the basis of patterns of brain responses to musical stimulus features. The present study aims to decode musicianship from the neural processing of musical features during naturalistic music listening.

Materials and Methods

fMRI data acquisition and preprocessing

Musically trained (n = 18) and untrained (n = 18) participants took part in an fMRI experiment; these data have previously been used in [4]. Three musical pieces covering distinct musical genres were used: (i) Stream of Consciousness by Dream Theater (rock/metal); (ii) Adios Nonino by Astor Piazzolla (tango); and (iii) The Rite of Spring, Part I, episodes 1-3, by Igor Stravinsky (classical). Participants' brain responses were acquired while they attentively listened to each of the musical stimuli. Scanning was performed with a whole-body scanner, and slices covering the whole brain were acquired at 0.5 Hz. The fMRI data were preprocessed as in [4]. Finally, the fMRI volumes were parcellated into 116 regions according to the Automated Anatomical Labeling (AAL) atlas, and regional mean time series were estimated by within-region voxel averaging.

Musical feature processing

An approach similar to that of [1] was employed for musical feature extraction and processing. Twenty-five musical features related to timbre, rhythm and tonality were extracted from the stimuli with the MIRToolbox [6], using a frame-based approach to capture their temporal evolution. To make the data comparable with the fMRI data, each feature was convolved with a double-gamma hemodynamic response function, detrended by spline interpolation and temporal filtering, and down-sampled to 0.5 Hz. The musical features were then subjected to Principal Component Analysis (PCA) and Varimax rotation, yielding six components labeled Fullness, Brightness, Activity, Timbral Complexity, Pulse Clarity, and Key Clarity.

Decoding

A multi-stage modeling approach was employed for the decoding. First, the region-based fMRI data of all participants were concatenated in time, and each region was z-score transformed to zero mean and unit standard deviation. Group-PCA [7] with a low number of components was then run on the data to smooth out noise and to reduce dimensionality. Separately for each participant, the data were time-aligned with the musical features, the cross-covariance between regions and features was computed, and the resulting matrix was vectorized. The vectorized cross-covariances were then z-score transformed across participants. An L1-norm penalized linear Support Vector Machine (SVM) classifier, chosen for its robustness against irrelevant features, was trained to predict musicianship from the cross-covariances. The cost parameter controlling the number of retained features was optimized for each model in an inner cross-validation (CV) loop. To obtain realistic estimates of decoding performance, an outer CV loop across participants was run ten times following the leave-one-subject-per-group-out (LOSGO) strategy [5]; participants were partitioned randomly at each run and the obtained accuracies were pooled. For the main analysis, the three musical stimuli were concatenated in time, providing an overall view of the neural processing related to multiple musical styles.
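As an illustration of the musical feature processing step, a minimal Python sketch is given below. It is not the authors' implementation (feature extraction was performed with the MATLAB MIRToolbox); the feature frame rate, placeholder array shapes, and HRF parameterization are assumptions, and the spline detrending and temporal filtering steps are omitted.

```python
# Sketch: convolve frame-based musical features with a double-gamma HRF,
# downsample to the fMRI rate (0.5 Hz), and reduce to six components with PCA.
import numpy as np
from scipy.stats import gamma
from scipy.signal import resample_poly
from sklearn.decomposition import PCA

def double_gamma_hrf(dt, duration=32.0):
    """Canonical double-gamma HRF sampled every `dt` seconds (assumed shape)."""
    t = np.arange(0, duration, dt)
    peak = gamma.pdf(t, 6)           # positive response peaking around 5-6 s
    undershoot = gamma.pdf(t, 16)    # later undershoot
    hrf = peak - 0.35 * undershoot
    return hrf / hrf.sum()

feature_rate = 2.0                    # assumed frame rate of the extracted features (Hz)
fmri_rate = 0.5                       # fMRI sampling rate, as in the abstract (Hz)
rng = np.random.default_rng(0)
features = rng.standard_normal((1200, 25))   # placeholder: time x 25 musical features

hrf = double_gamma_hrf(dt=1.0 / feature_rate)
convolved = np.column_stack(
    [np.convolve(features[:, i], hrf)[: len(features)] for i in range(features.shape[1])]
)

# Downsample from the feature rate to the fMRI rate (here 2 Hz -> 0.5 Hz).
downsampled = resample_poly(convolved, up=1, down=int(feature_rate / fmri_rate), axis=0)

# Reduce the 25 features to six components; the Varimax rotation of the
# loadings used in the study would be applied on top of this.
pca = PCA(n_components=6)
feature_components = pca.fit_transform(downsampled)
```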
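As an illustration of the decoding stage, the following minimal Python sketch uses scikit-learn's LinearSVC as a stand-in for the L1-penalized SVM described above; the Group-PCA brain components and musical feature components are replaced by placeholder arrays, the parameter grid and array shapes are assumptions, and the LOSGO partitioning is written out directly.

```python
# Sketch: per-subject cross-covariance between brain components and musical
# feature components, vectorized, z-scored across subjects, and classified
# with an L1-penalized linear SVM in a nested cross-validation.
import numpy as np
from sklearn.svm import LinearSVC
from sklearn.model_selection import GridSearchCV, StratifiedKFold
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
n_subjects, n_time, n_components, n_features = 36, 1500, 3, 6

# Placeholder data: Group-PCA brain components and musical feature components.
brain = rng.standard_normal((n_subjects, n_time, n_components))
music = rng.standard_normal((n_time, n_features))
y = np.repeat([0, 1], n_subjects // 2)        # 0 = non-musician, 1 = musician

# Per-subject cross-covariance, vectorized into one feature vector per subject.
X = np.stack([
    (b - b.mean(0)).T @ (music - music.mean(0)) / (n_time - 1)
    for b in brain
]).reshape(n_subjects, -1)
X = StandardScaler().fit_transform(X)          # z-score across subjects

# L1-penalized linear SVM; the cost parameter C is tuned in an inner CV loop.
inner = GridSearchCV(
    LinearSVC(penalty="l1", dual=False, max_iter=10000),
    param_grid={"C": np.logspace(-2, 2, 9)},
    cv=StratifiedKFold(5),
)

# Outer loop: leave one musician and one non-musician out, repeated ten times
# with random pairings (an approximation of the LOSGO strategy).
accuracies = []
for run in range(10):
    perm0, perm1 = rng.permutation(18), 18 + rng.permutation(18)
    for i in range(18):
        test = np.array([perm0[i], perm1[i]])
        train = np.setdiff1d(np.arange(n_subjects), test)
        inner.fit(X[train], y[train])
        accuracies.append(inner.score(X[test], y[test]))
print(np.mean(accuracies))
```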
Results and Discussion

The classification accuracy (Figure 1) obtained with the concatenated stimulus depended on the number of principal components used, peaking at three components (M = 0.681, SD = 0.026). This accuracy is statistically significant at p < 0.05 based on Monte-Carlo sampling. When training and testing with each individual stimulus (Figure 2), the mean classification accuracy did not reach the significance level of p = 0.05, except for Piazzolla at three components. This may indicate a lack of variability in the musical material within the individual stimuli, leading to less stable covariances. The contributions of different neural regions and musical features in the optimal models (three components, concatenated stimulus) were assessed by examining the feature coefficients of the SVM models as well as the principal component loadings, averaged across CV runs and folds. The most important features predicting group membership included increased activation in the auditory network in musicians, and in the salience network in non-musicians, during moments of high timbral complexity and low fullness (PC3). This may indicate musicians' greater ability to process complex musical material efficiently. Furthermore, moments of high musical activity and unclear tonality were associated with higher activation in the executive control and salience networks and in frontal regions in musicians (PC2), which may suggest that musicians, owing to their training, performed more efficient tonal prediction. Finally, high overall activation during moments of high brightness was found to predict musicianship (PC1), possibly indicating more efficient timbral processing in musicians.

Conclusions

Moderately high accuracy was achieved in decoding musicianship from the neural processing of musical features, and distinct processing patterns in various brain areas were found between musicians and non-musicians for different musical features. This suggests a profound connection between musical training and music perception, and highlights the need to take demographic attributes into account when studying music processing in the field of neuroinformatics. However, the modest and unstable accuracy obtained with stimuli representing individual musical styles warrants further study of the effect of musical style on music perception and processing.

Figure 1
Figure 2

References

[1] Alluri V, Toiviainen P, Jääskeläinen IP, Glerean E, Sams M, Brattico E. Large-scale brain networks emerge from dynamic processing of musical timbre, key and rhythm. Neuroimage (2012) 59(4):3677–89. doi: 10.1016/j.neuroimage.2011.11.019

[2] Naselaris, T., Kay, K.N., Nishimoto, S., Gallant, J.L., Encoding and decoding in fMRI. NeuroImage (2011) 56 (2), 400–410. doi: 10.1016/j.neuroimage.2010.07.073

[3] Toiviainen, P., Alluri, V., Brattico, E., Wallentin, M., Vuust, P., Capturing the musical brain with Lasso: Dynamic decoding of musical features from fMRI data. Neuroimage (2014) 88:170-180. doi:10.1016/j.neuroimage.2013.11.017

[4] Burunat I, Brattico E, Puoliväli T, Ristaniemi T, Sams M, Toiviainen P. Action in Perception: Prominent Visuo-Motor Functional Symmetry in Musicians during Music Listening. PLoS ONE (2015) 10(9):e0138238. doi: 10.1371/journal.pone.0138238

[5] Rosa M.J., Portugal L., Hahn T., Fallgatter A.J., Garrido M.I., Shawe-Taylor J., Mourao-Miranda J. Sparse network-based models for patient classification using fMRI. Neuroimage (2015) 105:493-506. doi: 10.1016/j.neuroimage.2014.11.021

[6] Lartillot, O., Toiviainen, P., MIR in Matlab (II): a toolbox for musical feature extraction from audio. Proceedings of 8th International Conference on Music Information Retrieval (2007) (Available online at http://ismir2007.ismir.net/proceedings/ISMIR2007_p127_lartillot.pdf).

[7] Smith, S.M., Hyvärinen, A., Varoquaux, G., Miller, K.L., Beckmann, C.F., Group-PCA for very large fMRI datasets. Neuroimage (2014) 101:738–749. doi: 10.1016/j.neuroimage.2014.07.051

Keywords: fMRI, musical features, Decoding, musical training, music perception

Conference: Neuroinformatics 2016, Reading, United Kingdom, 3 Sep - 4 Sep, 2016.

Presentation Type: Demo

Topic: Computational neuroscience

Citation: Saari P, Brattico E and Toiviainen P (2016). Decoding Musicianship from Neural Processing of Musical Features. Front. Neuroinform. Conference Abstract: Neuroinformatics 2016. doi: 10.3389/conf.fninf.2016.20.00092

Copyright: The abstracts in this collection have not been subject to any Frontiers peer review or checks, and are not endorsed by Frontiers. They are made available through the Frontiers publishing platform as a service to conference organizers and presenters.

The copyright in the individual abstracts is owned by the author of each abstract or his/her employer unless otherwise stated.

Each abstract, as well as the collection of abstracts, are published under a Creative Commons CC-BY 4.0 (attribution) licence (https://creativecommons.org/licenses/by/4.0/) and may thus be reproduced, translated, adapted and be the subject of derivative works provided the authors and Frontiers are attributed.

For Frontiers’ terms and conditions please see https://www.frontiersin.org/legal/terms-and-conditions.

Received: 24 Jul 2016; Published Online: 01 Sep 2016.

* Correspondence: Dr. Pasi Saari, University of Jyvaskyla, Department of Music, Jyvaskyla, Finland, pasirikhard@gmail.com