ORIGINAL RESEARCH article

Front. Robot. AI

Sec. Human-Robot Interaction

Volume 12 - 2025 | doi: 10.3389/frobt.2025.1597276

This article is part of the Research Topic: Errors and Mistakes in Human-Robot Interactions

Should Robots Display What They Hear? Mishearing as a Practical Accomplishment

Provisionally accepted
Damien Rudaz1*, Christian Licoppe2,3
  • 1University of Copenhagen, Copenhagen, Denmark
  • 2Telecom Paris, Palaiseau, France
  • 3Institut Polytechnique de Paris (IP Paris), Palaiseau, France

The final, formatted version of the article will be published soon.

As a contribution to research on transparency and failures in human-robot interaction, our study investigates whether the informational ecology configured by publicly displaying a robot's automatic speech recognition (ASR) results is consequential for how miscommunications emerge and are dealt with. After a preliminary quantitative analysis of participants' gaze behavior during an experiment in which they interacted with a conversational robot, we rely on a micro-analytic approach to detail how the interpretation of the robot's conduct as inadequate was configured by what it displayed having "heard" on its tablet. We investigate cases in which an utterance or gesture by the robot was treated by participants as sequentially relevant only as long as they had not read the ASR transcript, but was re-evaluated as troublesome once they had read it. In doing so, we contribute to HRI by showing that systematically displaying an ASR transcript can play a crucial role in participants' interpretation of a co-constructed action (such as shaking hands with a robot) as having "failed". We demonstrate that "mistakes" and "errors" can be approached as practical accomplishments that emerge as such over the course of interaction, rather than as social or technical phenomena pre-categorized by the researcher according to criteria exogenous to the activity being analyzed. Finally, focusing on two video fragments, we find that this peculiar informational ecology did not merely affect how the robot was responded to: it modified the very definition of "mutual understanding" that the human participants enacted and oriented to as relevant in these fragments. Beyond social robots, we caution that systematically providing such transcripts is a design decision not to be taken lightly; depending on the setting, it may have unintended consequences for interactions between humans and any form of conversational interface.

Keywords: automatic speech recognition, errors and mistakes, transparency, action ascription, conversation analysis, ethnomethodology, repair, mishearing

Received: 20 Mar 2025; Accepted: 11 Jul 2025.

Copyright: © 2025 Rudaz and Licoppe. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

* Correspondence: Damien Rudaz, University of Copenhagen, Copenhagen, Denmark

Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.