Event Abstract

Cross-linguistic analysis of speech rhythm with decomposition of the amplitude envelope

  • 1 University of Kent, English Language and Linguistics, United Kingdom
  • 2 Cornell University, Linguistics, United States

This paper presents a new approach to characterizing speech rhythm, based upon empirical mode decomposition (EMD) [Huang et al., Proc. R. Soc. London Ser. A 454, 903–995 (1998)]. Intrinsic mode functions (IMFs) are obtained from EMD of a vocalic energy envelope of speech, a smoothly varying signal representing primary vocalic resonance energy. EMD uses an iterative sifting process to decompose the signal into IMFs with zero mean and zero-crossings between extrema. Investigations of the IMFs revealed that the last two appear to capture foot- and syllable-timescale oscillations in the envelope, respectively, and that the ratio of signal power in the foot- and syllable-associated IMFs can be used as a metric of the relative influence of foot-based timing on speech. EMD was applied to speech corpora of English, German, Greek, Italian, Korean, and Spanish obtained from eight speakers of each language with three elicitation methods: read sentences, read running text, and spontaneous speech. For all languages, the metrics indicate that spontaneous speech exhibits more “stress-timing”-like characteristics than read speech, having higher interval variability and more dominant stress-timescale periodicity in the envelope. Cross-linguistic differences emerged in some cases, but these were not entirely consistent across metrics and were affected by the elicitation method. Overall, the data suggest that the elicitation effects (i.e., read versus spontaneous speech) are larger than differences between languages. These similarities between languages further indicate the presence of a common basis for speech rhythm, quite likely foot structure, cross-linguistically.
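The power-ratio metric described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that the IMFs have already been computed by some EMD routine, that they are ordered from fastest to slowest with any residual trend excluded, and that the last IMF is the foot-associated one and the one before it the syllable-associated one. The function names `mean_power` and `foot_syllable_power_ratio` are hypothetical, and the sinusoids merely stand in for real IMFs.

```python
import math
from typing import Sequence


def mean_power(signal: Sequence[float]) -> float:
    """Mean squared amplitude of one component."""
    if not signal:
        raise ValueError("signal must not be empty")
    return sum(x * x for x in signal) / len(signal)


def foot_syllable_power_ratio(imfs: Sequence[Sequence[float]]) -> float:
    """Ratio of power in the foot-associated IMF to the syllable-associated IMF.

    Assumption (not from the abstract): imfs are ordered fastest to slowest,
    the residual is excluded, imfs[-1] is foot-timescale and imfs[-2] is
    syllable-timescale.
    """
    if len(imfs) < 2:
        raise ValueError("need at least two IMFs")
    syllable_imf, foot_imf = imfs[-2], imfs[-1]
    return mean_power(foot_imf) / mean_power(syllable_imf)


# Synthetic stand-ins for IMFs: 2 s of signal sampled at 100 Hz.
fs = 100
t = [n / fs for n in range(2 * fs)]
fast = [0.3 * math.sin(2 * math.pi * 20 * x) for x in t]     # faster detail
syllable = [1.0 * math.sin(2 * math.pi * 5 * x) for x in t]  # ~5 Hz component
foot = [2.0 * math.sin(2 * math.pi * 2 * x) for x in t]      # ~2 Hz component

ratio = foot_syllable_power_ratio([fast, syllable, foot])
print(f"foot/syllable power ratio: {ratio:.3f}")
```

In this toy case the foot-timescale component has twice the amplitude of the syllable-timescale component, so the ratio comes out at approximately 4. Under the abstract's interpretation, a larger ratio would correspond to a stronger influence of foot-based timing.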

Acknowledgements

We thank UCSD Speech Lab members Younah Chung, Noah Girgis, Page Piccinini, Tristie Ross and Nadav Sofer for help with data collection and analysis. The financial support of the UCSD Committee on Research through Grant no. LIN201G to Amalia Arvaniti with Tristie Ross as GSR is hereby gratefully acknowledged.

Keywords: speech rhythm, Empirical Mode Decomposition, rhythm classes, English, German, Greek, Korean, Italian, Spanish

Conference: 14th Rhythm Production and Perception Workshop Birmingham 11th - 13th September 2013, Birmingham, United Kingdom, 11 Sep - 13 Sep, 2013.

Presentation Type: Poster Presentation

Topic: Rhythm Production and Perception

Citation: Arvaniti A and Tilsen S (2013). Cross-linguistic analysis of speech rhythm with decomposition of the amplitude envelope. Conference Abstract: 14th Rhythm Production and Perception Workshop Birmingham 11th - 13th September 2013. doi: 10.3389/conf.fnhum.2013.214.00026

Copyright: The abstracts in this collection have not been subject to any Frontiers peer review or checks, and are not endorsed by Frontiers. They are made available through the Frontiers publishing platform as a service to conference organizers and presenters.

The copyright in the individual abstracts is owned by the author of each abstract or his/her employer unless otherwise stated.

Each abstract, as well as the collection of abstracts, is published under a Creative Commons CC-BY 4.0 (attribution) licence (https://creativecommons.org/licenses/by/4.0/) and may thus be reproduced, translated, adapted and be the subject of derivative works provided the authors and Frontiers are attributed.

For Frontiers’ terms and conditions please see https://www.frontiersin.org/legal/terms-and-conditions.

Received: 15 Jul 2013; Published Online: 24 Sep 2013.

* Correspondence: Prof. Amalia Arvaniti, University of Kent, English Language and Linguistics, Canterbury, United Kingdom, a.arvaniti@kent.ac.uk