Event Abstract

Speech onset latency detection in naming: a pattern recognition approach

  • 1 Aristotle University of Thessaloniki, Greece

Speech onset latencies in naming experiments are commonly estimated with technologies that rely on sound pressure changes (amplitude thresholds), such as voice-key devices. These devices are used in online experiments and are prone to data loss. Moreover, several studies have shown that voice-key devices suffer from low accuracy because they detect unvoiced speech poorly. Consequently, to obtain reliable data, researchers often favour offline manual waveform analysis (visual inspection), which is time-consuming and hard to scale. In this study, we applied state-of-the-art machine learning (ML) techniques as an alternative way of measuring speech onset latency. Our classification method followed the basic steps of a pattern recognition approach. Two participants (Rec1, male; Rec2, female) were recorded during picture naming. Four categories of prototypes were adopted: a) beep tone, a sound indicating that a picture was presented on the computer screen; b) noise, a possible noise signal; c) response, the participant’s response; and d) unclassified signal, the silent and other parts of the recording. The onsets of these four prototypes were annotated. For further analysis and processing, the annotated segments were converted into feature values. A feature ranking algorithm was then used to identify the most discriminative features, and a group of 33 low-level audio features was selected for the final model. Lastly, a train-test experimental setup was used to train a Random Forest model. Preliminary results indicate that the proposed method achieves an accuracy of 84.27% in detecting the voice prototypes (labels). Moreover, our approach has a timing accuracy advantage over E-Prime voice-key measures on the same data. The timing error of both our method and the E-Prime voice key was computed against ground-truth measures (visual inspection). The mean absolute difference (MAD) was 389 ms (Rec1) and 379 ms (Rec2) for the E-Prime voice-key measures, compared with 180 ms (Rec1) and 280 ms (Rec2) for our method. In summary, our results indicate that the proposed pattern recognition approach provides a promising and efficient alternative to voice-key detection methods in speech onset latency detection problems, and that it scales better while employing standard sound/speech processing technologies.
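As an illustration of the timing comparison described above, the following is a minimal sketch of how a mean absolute difference against manually annotated onsets could be computed. It is not the authors' code: the function name `mean_absolute_difference` and all latency values are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def mean_absolute_difference(estimated_ms, reference_ms):
    """Mean absolute difference between paired onset latencies, in ms."""
    if len(estimated_ms) != len(reference_ms):
        raise ValueError("onset lists must have the same length")
    if not estimated_ms:
        raise ValueError("at least one trial is required")
    # Average the per-trial absolute timing errors.
    total = sum(abs(e - r) for e, r in zip(estimated_ms, reference_ms))
    return total / len(estimated_ms)


# Hypothetical latencies (ms) for four trials; NOT data from the study.
manual_onsets = [650, 720, 810, 590]       # visual inspection (ground truth)
voice_key_onsets = [700, 900, 820, 1000]   # amplitude-threshold device
classifier_onsets = [640, 760, 800, 650]   # pattern recognition method

mad_voice_key = mean_absolute_difference(voice_key_onsets, manual_onsets)
mad_classifier = mean_absolute_difference(classifier_onsets, manual_onsets)
print(mad_voice_key, mad_classifier)
```

Under this measure, a lower value means the estimated onsets lie closer to the visually inspected ones, which is the sense in which the abstract reports a timing advantage.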

Keywords: naming, reaction time, pattern recognition, sound processing, speech processing

Conference: SAN2016 Meeting, Corfu, Greece, 6 Oct - 9 Oct, 2016.

Presentation Type: Poster Presentation in SAN2016 Conference

Topic: Posters

Citation: Siatra V, Vrysis L, Papanikolaou G, Kalliris G and Foroglou N (2016). Speech onset latency detection in naming: a pattern recognition approach. Conference Abstract: SAN2016 Meeting. doi: 10.3389/conf.fnhum.2016.220.00085

Copyright: The abstracts in this collection have not been subject to any Frontiers peer review or checks, and are not endorsed by Frontiers. They are made available through the Frontiers publishing platform as a service to conference organizers and presenters.

The copyright in the individual abstracts is owned by the author of each abstract or his/her employer unless otherwise stated.

Each abstract, as well as the collection of abstracts, are published under a Creative Commons CC-BY 4.0 (attribution) licence (https://creativecommons.org/licenses/by/4.0/) and may thus be reproduced, translated, adapted and be the subject of derivative works provided the authors and Frontiers are attributed.

For Frontiers’ terms and conditions please see https://www.frontiersin.org/legal/terms-and-conditions.

Received: 29 Jul 2016; Published Online: 01 Aug 2016.

* Correspondence: Mrs. Vasiliki Siatra, Aristotle University of Thessaloniki, Thessaloniki, Greece, vasiliki.siatra@gmail.com