Impact Factor 2.870 | CiteScore 2.96

Original Research Article | Provisionally accepted

Front. Hum. Neurosci. | doi: 10.3389/fnhum.2019.00335

The principle of inverse effectiveness in audiovisual speech perception

Luuk P. van de Rijt¹, Anja Roye², Emmanuel A. Mylanus¹, A. J. van Opstal² and Marc M. Van Wanrooij²*
  • ¹ Department of Otorhinolaryngology, Donders Institute for Brain, Cognition, and Behaviour, Radboud University Nijmegen Medical Centre, Netherlands
  • ² Department of Biophysics, Donders Institute for Brain, Cognition, and Behaviour, Radboud University Nijmegen, Netherlands

We assessed how synchronous speech listening and lipreading affect speech recognition in acoustic noise. In simple audiovisual perceptual tasks, inverse effectiveness is often observed: the weaker the unimodal stimuli, or the poorer their signal-to-noise ratio, the stronger the audiovisual benefit. So far, however, inverse effectiveness has not been demonstrated for complex audiovisual speech stimuli. Here we assess whether this multisensory integration effect can also be observed for the recognizability of spoken words.
To that end, we presented audiovisual sentences to 18 native-Dutch, normal-hearing participants, who had to identify the spoken words from a finite list. Speech-recognition performance was determined for auditory-only, visual-only (lipreading), and audiovisual conditions. To modulate acoustic task difficulty, we systematically varied the auditory signal-to-noise ratio. In line with the commonly observed multisensory enhancement of speech recognition, audiovisual words were more easily recognized than auditory-only words (recognition thresholds of -15 dB and -12 dB, respectively).
We show here that the difficulty of recognizing a particular word, either acoustically or visually, determines the occurrence of inverse effectiveness in audiovisual word integration: words that are better heard, or better recognized through lipreading, benefit less from bimodal presentation.
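The inverse-effectiveness claim can be made concrete with a standard relative-gain measure from the multisensory literature. The sketch below is not the authors' analysis code, and all numbers in it are invented for illustration; it only shows how a per-word audiovisual benefit, expressed relative to the best unimodal score, comes out larger for a word that is hard to recognize unimodally than for an easy one.

```python
def multisensory_gain(p_a: float, p_v: float, p_av: float) -> float:
    """Relative audiovisual benefit over the best unimodal score.

    p_a, p_v, p_av: proportions correct (0..1) for auditory-only,
    visual-only (lipreading), and audiovisual presentation.
    """
    best_unimodal = max(p_a, p_v)
    if best_unimodal == 0.0:
        # No unimodal recognition at all: any AV success is pure gain.
        return float("inf") if p_av > 0.0 else 0.0
    return (p_av - best_unimodal) / best_unimodal

# Inverse effectiveness: the harder word (lower unimodal scores)
# shows the larger relative bimodal benefit.
easy_word = multisensory_gain(0.80, 0.60, 0.90)  # small relative gain
hard_word = multisensory_gain(0.20, 0.15, 0.40)  # large relative gain
print(easy_word, hard_word)
```

Here the easy word gains about 12.5% over its best unimodal score, while the hard word doubles its score, which is the inverse-effectiveness pattern described above.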
Audiovisual performance at the lowest acoustic signal-to-noise ratios (45% correct) fell below visual-only recognition (60% correct), reflecting a genuine deterioration of lipreading in the presence of excessive acoustic noise. This suggests that the brain may adopt a strategy in which attention must be divided between listening and lipreading.

Keywords: multisensory, lipreading, hearing, speech recognition in noise, listening, audiovisual

Received: 29 Apr 2019; Accepted: 11 Sep 2019.

Copyright: © 2019 van de Rijt, Roye, Mylanus, van Opstal and Van Wanrooij. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

* Correspondence: Dr. Marc M. Van Wanrooij, Radboud University Nijmegen, Department of Biophysics, Donders Institute for Brain, Cognition, and Behaviour, Nijmegen, 6500 HC, Gelderland, Netherlands.