ORIGINAL RESEARCH article
Front. Robot. AI
Sec. Computational Intelligence in Robotics
Volume 12 - 2025 | doi: 10.3389/frobt.2025.1621033
This article is part of the Research Topic: Synergizing Large Language Models and Computational Intelligence for Advanced Robotic Systems.
Large Language Model-Driven Natural Language Interaction Control Framework for Single-Operator Bimanual Teleoperation
Provisionally accepted
1 Lancaster University, Lancaster, United Kingdom
2 Tsinghua University, Beijing, China
3 Dalian Jiaotong University, Dalian, China
4 South China University of Technology, Guangdong, China
5 Shanghai Jiaotong University, Shanghai, China
Bimanual teleoperation imposes heavy cognitive and coordination demands on a single human operator tasked with simultaneously controlling two robotic arms. Although assigning each arm to a separate operator can distribute the workload, doing so often leads to ambiguities in decision authority and degrades overall efficiency. To overcome these challenges, we propose a novel bimanual teleoperation large language model assistant (BTLA) framework, an intelligent copilot that augments a single operator's motor control capabilities. In particular, BTLA enables the operator to directly control one robotic arm through conventional teleoperation while directing a second assistive arm via simple voice commands, thereby commanding two robotic arms simultaneously. By integrating the GPT-3.5-turbo model, BTLA interprets contextual voice instructions and autonomously selects among six predefined manipulation skills, including real-time mirroring, trajectory following, and autonomous object grasping. Experimental evaluations in bimanual object-manipulation tasks demonstrate that BTLA increased task coverage by 76.1% and success rate by 240.8% relative to solo teleoperation, and outperformed dyadic (two-operator) control with a 19.4% gain in coverage and a 69.9% gain in success rate. Furthermore, NASA Task Load Index (NASA-TLX) assessments revealed a 38-52% reduction in operator mental workload, and 85% of participants rated the voice-based interaction as "natural" and "highly effective."
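The abstract names the skill-selection mechanism (GPT-3.5-turbo mapping a voice command to one of six predefined skills) but not its implementation. The following is a minimal illustrative sketch of that idea, not the authors' BTLA code: it assumes the OpenAI chat-completions API, uses identifiers derived from the three skills named above (the remaining three of the six are not listed in the abstract), and invents the prompt wording and function name for illustration.

    # Illustrative sketch: LLM-based skill selection for the assistive arm.
    # NOTE: reconstruction under stated assumptions, not the published BTLA code.
    from openai import OpenAI

    client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

    # Only the three skills named in the abstract are listed here.
    SKILLS = [
        "real_time_mirroring",          # mirror the operator-controlled arm
        "trajectory_following",         # track a stored reference trajectory
        "autonomous_object_grasping",   # locate and grasp a described object
    ]

    SYSTEM_PROMPT = (
        "You are a bimanual teleoperation copilot. Given the operator's voice "
        "command, reply with exactly one skill name from this list: "
        + ", ".join(SKILLS)
    )

    def select_skill(voice_command: str) -> str:
        """Map a transcribed voice command to one predefined manipulation skill."""
        response = client.chat.completions.create(
            model="gpt-3.5-turbo",
            messages=[
                {"role": "system", "content": SYSTEM_PROMPT},
                {"role": "user", "content": voice_command},
            ],
            temperature=0,  # deterministic skill choice
        )
        skill = response.choices[0].message.content.strip()
        # Fall back to a safe default if the model replies with an unknown label.
        return skill if skill in SKILLS else "real_time_mirroring"

    # Example: select_skill("grab the red cup on the left")
    # would be expected to return "autonomous_object_grasping".

In such a design, constraining the reply to a fixed skill vocabulary (and validating it before dispatch) keeps the LLM in a supervisory role while the low-level controllers execute the selected skill.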
Keywords: human-robot collaboration, teleoperation, bimanual manipulation, embodied AI, large language model (LLM)
Received: 30 Apr 2025; Accepted: 30 Jun 2025.
Copyright: © 2025 Fei, Xue, Lin, Du, Guo and Wang. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
* Correspondence: Ziwei Wang, Lancaster University, Lancaster, United Kingdom
Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.