ORIGINAL RESEARCH article

Front. Psychol.

Sec. Auditory Cognitive Neuroscience

Volume 16 - 2025 | doi: 10.3389/fpsyg.2025.1512851

This article is part of the Research Topic "Crossing Sensory Boundaries: Multisensory Perception Through the Lens of Audition" (13 articles).

Exploring Cross-Modal Perception in a Virtual Classroom: The Effect of Visual Stimuli on Auditory Selective Attention

Provisionally accepted
  • 1Institute for Hearing Technology and Acoustics, RWTH Aachen University, Aachen, Germany
  • 2Cognitive and Developmental Psychology, RPTU Kaiserslautern-Landau, Kaiserslautern, Rhineland-Palatinate, Germany
  • 3Audiovisual Technology Group, Technical University Ilmenau, Ilmenau, Germany

The final, formatted version of the article will be published soon.

In virtual reality research, distinguishing between auditory and visual influences on perception has become increasingly challenging. To study auditory selective attention in settings closer to real life, an established auditory task was adapted to a virtual classroom. The new environment yielded evidence of increased attention, possibly driven by the visual representation, gamification effects, and immersion, which may engage participants more effectively. To examine cross-modal effects more closely, the paradigm was extended with visual stimuli. Participants were first tasked with directing their auditory attention to a cued spatial position and categorizing animal names played from that position while ignoring distracting sounds. The animal pictures introduced in Experiment 1 were either congruent or incongruent with the auditory target stimuli, thus either supporting or competing with the auditory information. Presenting animal pictures concurrently with the animal names increased response times compared with the auditory-only condition, and incongruent visual stimuli increased response times more than congruent ones. Fewer errors were made with congruent than with incongruent pictures, with error rates in the auditory-only condition falling in between. When the visual stimulus was presented 750 ms or 500 ms before the auditory stimuli in Experiment 2, auditory and visual congruence effects interacted: at 500 ms, visually congruent stimuli decreased error rates in auditory incongruent trials, whereas at 750 ms, visually incongruent stimuli decreased error rates in auditory incongruent trials. This reversal suggests a positive priming effect at 500 ms and a semantic inhibition-of-return effect at 750 ms. Taken together, these findings indicate that cross-modal priming is at least partially distinct from multisensory integration.

Keywords: Audio-visual attention, auditory selective attention, binaural hearing, virtual reality, visual priming, attention switching

Received: 17 Oct 2024; Accepted: 08 Oct 2025.

Copyright: © 2025 Breuer, Vollmer, Leist, Fremerey, Raake, Klatte and Fels. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

* Correspondence:
Carolin Breuer, carolin.breuer@akustik.rwth-aachen.de
Lukas Vollmer, lukas.vollmer@akustik.rwth-aachen.de

Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.