ORIGINAL RESEARCH article
Front. Robot. AI
Sec. Human-Robot Interaction
Volume 12 - 2025 | doi: 10.3389/frobt.2025.1662819
Exploring Multimodal Collaborative Storytelling with Pepper: A Preliminary Study with Zero-Shot LLMs
Provisionally accepted
Faculty of Informatics, University of the Basque Country, Donostia, Spain
With the rise of large language models (LLMs), collaborative storytelling with virtual agents and chatbots has gained popularity. Although storytelling has long been employed in social robotics as a means to educate, entertain, and persuade audiences, the integration of LLMs into such platforms remains largely unexplored. This paper presents the initial steps toward a novel multimodal collaborative storytelling system in which users co-create stories with the social robot Pepper through natural language interaction and by presenting physical objects. The robot employs a YOLO-based vision system to recognize these objects and incorporate them into the narrative. Story generation and adaptation are handled autonomously by the Llama model in a zero-shot setting, with the aim of assessing the usability and maturity of such models in interactive storytelling. To enhance immersion, the robot performs the final story using expressive gestures, emotional cues, and speech modulation. User feedback, collected through questionnaires and semi-structured interviews, indicates a high level of acceptance.
Keywords: collaborative storytelling, social robotics, zero-shot LLM, gesture generation, human-robot interaction
Received: 09 Jul 2025; Accepted: 18 Sep 2025.
Copyright: © 2025 Zabala, Echevarria, Rodriguez and Lazkano. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
* Correspondence:
Unai Zabala, unai.zabalac@ehu.eus
Elena Lazkano, e.lazkano@ehu.eus
Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.