
ORIGINAL RESEARCH article

Front. Robot. AI, 19 December 2025

Sec. Human-Robot Interaction

Volume 12 - 2025 | https://doi.org/10.3389/frobt.2025.1708987

Human intention recognition by deep LSTM and transformer networks for real-time human-robot collaboration

  • Humanoid and Cognitive Robotics Laboratory, Department of Automatics, Biocybernetics, and Robotics, Jožef Stefan Institute, Ljubljana, Slovenia

Collaboration between humans and robots is essential for optimizing the performance of complex tasks in industrial environments, reducing worker strain, and improving safety. This paper presents an integrated human-robot collaboration (HRC) system that leverages advanced intention recognition for real-time task sharing and interaction. By utilizing state-of-the-art human pose estimation combined with deep learning models, we developed a robust framework for detecting and predicting worker intentions. Specifically, we employed LSTM-based and transformer-based neural networks with convolutional and pooling layers to classify human hand trajectories, achieving higher accuracy compared to previous approaches. Additionally, our system integrates dynamic movement primitives (DMPs) for smooth robot motion transitions, collision prevention, and automatic motion onset/cessation detection. We validated the system in a real-world industrial assembly task, demonstrating its effectiveness in enhancing the fluency, safety, and efficiency of human-robot collaboration. The proposed method shows promise in improving real-time decision-making in collaborative environments, offering a safer and more intuitive interaction between humans and robots.

1 Introduction

The growing availability of collaborative robots in the market has paved the way for the development of human-robot collaboration (HRC) approaches, designed to enhance the efficiency of workspace sharing between humans and robots. The primary goal of HRC is to enable robots to perform tasks that would be too complex to execute independently, while simultaneously alleviating the burden on human workers by delegating the most challenging and repetitive aspects of the work to the robots.

One possible way to increase the efficiency of cooperation is through the recognition and anticipation of human worker activity. Human pose estimation and prediction are crucial in this context, as they enable robots to better understand and respond to human behavior. By anticipating and classifying human motions, we can predict intentions and adapt the robots’ behavior accordingly. This leads to more natural and intuitive human-robot interactions. For example, if a robot can predict that a human is reaching for an object, it can offer assistance or adjust its own movements to avoid interference.

Machine vision can be employed to obtain useful information about the state of the cooperative workspace and control the robot in a way that increases the safety and fluidity of collaboration. Recurrent neural networks (RNNs) are a promising technology for collaborative tasks that require anticipation of an agent’s motion. One such variant is the RNN with long short-term memory (LSTM) units (Hochreiter and Schmidhuber, 1997), which can analyze time-dependent processes based on partially observed data and predict future states, as opposed to vanilla neural networks, which require entire inputs, e.g., motion trajectories, to provide predictions. RNNs have, for example, been utilized for labeling or predicting human motion based on measurements of past poses or captured images (Zhang et al., 2020; Wang et al., 2019; Yang et al., 2021; Mavsar et al., 2024). Another deep neural architecture that has seen a big rise in popularity is the transformer network with an attention mechanism (Vaswani et al., 2017), which is largely used for natural language processing tasks, as well as for trajectory prediction (Giuliari et al., 2021). While mostly employed for sequence-to-sequence tasks, transformers are also used for sequence classification, e.g., trajectory classification. These techniques, along with increasingly capable and affordable sensors such as depth cameras, enable efficient processing of information in an HRC system and, consequently, optimization of the collaborative process.

In this paper, we present an integrated system for supervision of a collaborative environment that facilitates dynamic and safe task sharing by utilizing a single RGB-D camera (Figure 1). We employ an existing human body pose estimation method and combine it with depth information from the camera to obtain position of the worker’s hand in global coordinates. To increase task fluency, we perform human intention recognition by classifying the observed hand trajectory in real time, where we compare several deep learning architectures for sequence classification. Moreover, we successfully integrated collision prevention and automatic initiation of motion prediction to enhance the autonomous functionality of our system. The first one is ensured by monitoring the distance between the human and the robot, enabling us to identify and avoid potential collisions, and the second one by detecting the onset and cessation of worker’s motion to determine when the intention recognition system should start and stop forecasting the worker’s hand trajectory.

Figure 1
A robotic arm equipped with a sensor is positioned over a table. A person's hand is extended towards the sensor. In the foreground, a tablet displays the robotic arm and hand from a different angle, capturing motion tracking data, indicated by colored dots.

Figure 1. Experimental collaborative setup in a real industrial workcell. Human and robot cooperate in picking up copper rings from the loading area (bottom left part of the image) and placing them into one of the slots inside the casting model on the table. Since they are performing the task simultaneously, we employ an integrated supervision system to detect the worker’s hand and predict their intention, in order to adapt the robot motion accordingly.

The main contributions of this paper are:

1. Two intention recognition architectures (LSTM- and transformer-based with convolution/pooling layers) that achieve higher real-time classification accuracy than a recent transformer baseline.

2. A third-order quaternion formulation of Dynamic Movement Primitives (DMPs) in Cartesian space, enabling smooth trajectory and orientation switching when goals change.

3. An integrated HRC supervision system that combines intention recognition, motion onset/cessation detection, DMP-based motion generation, and adaptive collision prevention, validated in an industrial assembly scenario for improved fluency and safety.

2 Related work

2.1 Human-robot collaboration (HRC)

The field of human-robot collaboration (HRC) has seen significant progress in recent years, driven by the increasing demand for service robots in both home and industrial environments (Ajoudani et al., 2018). In such settings, robots must operate seamlessly with humans to accomplish shared tasks. A review by Matheson et al. (2019) highlights the growing use of collaborative robots (cobots) in HRC research. Cobots incorporate features such as force and torque sensors, force limits, and anti-collision systems to enhance safe and effective collaboration. Key research goals in HRC include improving task performance, enabling robot learning through physical interaction (Simonič et al., 2021), and ensuring both fluency (Hoffman, 2019) and safety (Marvel and Norcross, 2017; Byner et al., 2019).

2.2 Intention recognition and prediction methods

A crucial step toward achieving fluent collaboration is enabling robots to recognize and predict human intentions. Accurate prediction enhances control efficiency and boosts overall productivity. Spatial-Temporal Graph Convolutional Networks (ST-GCNs) have been proposed for skeleton-based action recognition (Yan et al., 2018), automatically learning both spatial and temporal patterns of human joints. These networks demonstrate strong generalization capabilities without relying on hand-crafted features, although they require complete trajectories before making predictions.

Recurrent neural networks (RNNs) have been applied for activity recognition, such as predicting description labels from RGB-D videos (Wang et al., 2017) or classifying 2D trajectories into travel categories (Liu et al., 2019). Other methods employ skeleton motion data to predict future poses (Zhang et al., 2020; Yasar and Iqbal, 2021) or action probability distributions (Schydlo et al., 2018), while Abuduweili et al. (2019) use an RNN with attention, but do not explore the effect of preprocessing. Moon and Seo (2019) combine robot, haptic and depth image data to predict future human positions using an RNN, but do not perform classification. An alternative to RNNs is the framework by Callens et al. (2020), which uses probabilistic principal component analysis to learn motion models. Hybrid methods, combining both learning and model-based optimization for human intention recognition and robot control, have also been proposed; Gao et al. (2021) introduced a hybrid recurrent neural network combining improved bidirectional and unidirectional LSTM layers for intention recognition from human motion data, while Tassi et al. (2022) developed a vision-based ergonomic HRC framework to ensure smooth, adaptive, and ergonomic robot motion.

Ali et al. (2024) explore the use of Large Language Models (LLMs) to infer human intentions in a collaborative object categorization task with a physical robot, while Jing et al. (2025) employ LLMs for intention recognition in the context of spacecraft. Although these works demonstrate the feasibility of intention recognition using LLMs, our task requires specialized classifiers trained explicitly on continuous hand-trajectory data. Unlike general-purpose LLMs designed for high-level reasoning over textual or multimodal inputs, our system demands a lightweight, domain-specific model capable of real-time inference.

While originally developed for natural language processing, transformer networks have recently been adapted for motion-related tasks. They have shown strong performance in pedestrian intention recognition (Sui et al., 2021), pedestrian trajectory forecasting (Yu et al., 2020; Sui et al., 2021), and trajectory classification (Liang et al., 2022). A simpler architecture with adaptive pooling layers has also proven successful in action recognition tasks (Abdu-Aguye et al., 2020). Moreover, Pettersson and Falkman (2023) compared gaze-based human intention recognition using both LSTM and transformer-based networks. We thus decided to combine the efficacy of pooling layers, transformers and LSTM networks into novel architectures and compare them to baselines without preprocessing.

2.3 Dynamic movement primitives

Dynamic Movement Primitives (DMPs) provide smooth, time-scalable motion representations for dynamic tasks (Nemec and Ude, 2012). In HRC, DMPs have been adapted to increase adaptability and robustness. Tu et al. (2022) used coupled DMPs to coordinate arm and base motions for compliant whole-body control, while Cai and Liu (2023) introduced a probabilistic DMP framework that removes frame dependency, and Sidiropoulos and Doulgeri (2024) improved spatial generalization with dynamic via-points.

Earlier studies applied DMPs to classical HRC tasks such as handovers and repetitive actions. Prada et al. (2013) showed that DMPs adapt effectively to moving goals in handover scenarios, while Gams et al. (2014) used periodic DMPs for tasks like surface wiping with force feedback and human coaching. Overall, these works highlight the ability of DMPs to adapt trajectories online, though safety and collision avoidance are often handled by separate modules. The integration of motion onset detection or partial trajectory data into DMP frameworks remains underexplored, offering potential for combining predictive intention recognition with adaptive control.

While modern learning-based approaches such as imitation learning and reinforcement learning are powerful (Byeon et al., 2025; Qi and Zhu, 2018), they often require large, task-specific datasets, extensive online interaction, and careful reward design, which can conflict with the real-time and safety constraints of collaborative industrial cells. Moreover, their learned policies may produce discontinuous or non-deterministic behavior during task switching. In contrast, Dynamic Movement Primitives (DMPs) provide a compact and analytically stable motion representation that guarantees smooth, continuous transitions between motion goals. This property is essential for maintaining safe, predictable, and fluent robot motion in our system, where the robot must adapt instantaneously to updated human intention estimates.

2.4 Motion onset and cessation detection

In addition to intention recognition, key challenges in human-robot collaboration include detecting motion onset and cessation and ensuring collision prevention. Motion onset detection determines when prediction should begin. Hassani et al. (2022) used machine learning for onset recognition in rehabilitation, while surveys highlight electromyography (EMG) as a physiological cue for early detection (Carvalho et al., 2023). Other works fuse IMU and EMG to recognize onset and direction in real time (Tortora et al., 2019), and probabilistic motion models have been applied to jointly recognize and predict human motions (Callens et al., 2020). We focus on a simpler approach by leveraging the knowledge of hand positions to determine when the hand enters or leaves areas of interest.

2.5 Our previous work and limitations

In our previous work (Mavsar et al., 2021), we developed LSTM networks capable of classifying observed motions using both RGB-D videos and position sequences obtained via a marker-based tracking system. However, when using RGB-D images, variations in background and camera angles strongly affected predictions, while marker-based systems are costly and impractical in dynamic environments. Yan et al. (2019) addressed this by combining skeleton data from a Kinect depth camera with an LSTM for pose prediction, but they did not exploit detailed RGB information. In contrast, publicly available pose estimation methods (Lugaresi et al., 2019; Fang et al., 2022) can provide 2D landmarks that, when combined with depth data, yield 3D human motion trajectories.

Building on this, we propose LSTM- and transformer-based networks with additional convolutional and pooling layers, inspired by the work of Abdu-Aguye et al. (2020), to capture both local and global temporal features. We compare our models against the transformer network by Pettersson and Falkman (2023), which was most effective in gaze-based intention recognition. While their architecture relies on fixed-size data windows, our dataset consists of variable-length and partial trajectories, enabling real-time prediction from incomplete motion sequences.

3 Materials and methods

In this section we present the methods used to automate the collaborative process, where a human and a robot concurrently perform a task. The proposed approach involves the estimation and prediction of human motion, as well as adaptive robot control that reacts to the actions of the human worker. The diagram of the control system, integrating the proposed methods, is shown in Figure 2.

Figure 2
Flowchart depicting a robotic system for motion and intention prediction. It includes sections for motion detection, human hand position estimation, and collision detection. The estimation uses images and 3D hand trajectories processed by the intention recognition network, predicting probabilities for different motion classes. The robot control system adjusts based on task interference.

Figure 2. The proposed human-robot collaboration system with intention recognition. The pose of the worker’s hand is passed through a motion classification network to provide predictions of human intention, as well as into a module for collision detection and a module for motion onset and cessation detection. Based on the signal from the motion onset detection module, the intention recognition system is activated and its results are passed into the robot control system to switch to a different robot task if necessary.

3.1 Hand position estimation

Several open-source frameworks for human motion estimation are available, typically trained on large datasets, which enhances their effectiveness in diverse environments. By leveraging a general human motion estimation system, we can avoid the need to train custom image processing networks for specific applications. One of the most widely used frameworks for human pose estimation is MediaPipe (Lugaresi et al., 2019), which offers high accuracy in tracking body, hand, and face landmarks while maintaining fast processing speeds, even without the use of GPUs. Our approach utilizes MediaPipe’s hand detection solution, which predicts the 2D pixel locations of 21 hand landmarks from an RGB image. For motion classification, it suffices to focus on a single landmark, as our primary goal is to distinguish the destinations of different hand trajectories. We select the landmark at the top of the index finger, as it is typically the most stable part of the hand when holding an object between the index finger and thumb. However, since the output from MediaPipe consists of 2D pixel locations, we additionally use the camera’s depth image to convert these coordinates into 3D positions within the camera’s coordinate system.
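As an illustration of this step, the following sketch shows how the index-fingertip pixel coordinates can be obtained with MediaPipe's hand solution; the confidence threshold and camera source are illustrative placeholders, not values from our setup.

```python
# A minimal sketch of fingertip extraction with MediaPipe Hands. The confidence threshold
# and camera index are illustrative placeholders, not values used in the described system.
import cv2
import mediapipe as mp

mp_hands = mp.solutions.hands

def fingertip_pixel(frame_bgr, hands):
    """Return (u, v) pixel coordinates of the index fingertip, or None if no hand is detected."""
    h, w, _ = frame_bgr.shape
    # MediaPipe expects RGB input; OpenCV provides BGR frames.
    results = hands.process(cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2RGB))
    if not results.multi_hand_landmarks:
        return None
    tip = results.multi_hand_landmarks[0].landmark[mp_hands.HandLandmark.INDEX_FINGER_TIP]
    # Landmark coordinates are normalized to [0, 1]; convert them to pixel coordinates.
    return int(tip.x * w), int(tip.y * h)

if __name__ == "__main__":
    cap = cv2.VideoCapture(0)
    with mp_hands.Hands(max_num_hands=1, min_detection_confidence=0.5) as hands:
        ok, frame = cap.read()
        if ok:
            print(fingertip_pixel(frame, hands))
    cap.release()
```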

Each hand landmark is represented by its corresponding pixel coordinates $\mathbf{u}=[u,v]^T$, where $u \in [0, W-1]$, $v \in [0, H-1]$, and $W$ and $H$ denote the width and height of the input camera frames $F(t) \in \mathbb{R}^{W \times H \times 3}$. Our aim is to obtain the world coordinates of the detected hand landmark, $\mathbf{c}=[x,y,z]^T$. Let’s denote the landmark position in camera coordinates as $\mathbf{c}_c=[x_c,y_c,z_c]^T$. MediaPipe returns the pixel coordinates of each hand landmark. When using a depth camera with aligned color and depth frames, we can take the calculated pixel coordinates of the landmark, $u$ and $v$, and find the value of the depth image at this location. This value is the distance of the landmark from the origin of the camera coordinate system, i.e., $z_c$. For the calculation of the remaining camera coordinates $x_c$ and $y_c$, we start with the relationship between a 3D point $\mathbf{c}$ and its image projection $\mathbf{u}$ as per Zhang (2000):

$$ s\begin{bmatrix}\mathbf{u}\\ 1\end{bmatrix} = \mathbf{A}\begin{bmatrix}\mathbf{R} & \mathbf{t}\end{bmatrix}\begin{bmatrix}\mathbf{c}\\ 1\end{bmatrix} = \mathbf{A}\,\mathbf{c}_c. $$

Here $\mathbf{A}$ is the camera intrinsic matrix,

$$ \mathbf{A} = \begin{bmatrix}\alpha & \gamma & u_0\\ 0 & \beta & v_0\\ 0 & 0 & 1\end{bmatrix} $$

with $(u_0, v_0)$ being the coordinates of the principal point, $\alpha$ and $\beta$ the scale factors along the image $u$ and $v$ axes, and $\gamma$ the skew of the image axes. $\mathbf{R}$ and $\mathbf{t}$ are the extrinsic parameters, denoting the rotation and translation of the world coordinate system with respect to the camera coordinate system.

Writing out the equation system

$$ s\begin{bmatrix}u\\ v\\ 1\end{bmatrix} = \mathbf{A}\begin{bmatrix}x_c\\ y_c\\ z_c\end{bmatrix}, $$

we can compute $x_c$ and $y_c$:

$$ x_c = z_c\left(\frac{u-u_0}{\alpha} - \frac{\gamma\,(v-v_0)}{\alpha\beta}\right), \qquad y_c = z_c\,\frac{v-v_0}{\beta}. $$

In order to obtain the landmark location in the world coordinate system, we use a matrix that defines the transformation from world to camera coordinates:

$$ \mathbf{c} = \mathbf{R}^T\mathbf{c}_c - \mathbf{R}^T\mathbf{t}. $$

To compute the intrinsic camera parameters $\mathbf{A}$, we move the calibration board to several different locations within the workcell and gather the calibration data. For the last location, we place the calibration board at the position and orientation coinciding with the origin of the world coordinate system. The intrinsic camera parameters and the transformation matrices from the camera coordinate system to all locations of the calibration board can then be computed using the method described by Zhang (2000). As the location of the last placement of the calibration board coincides with the world coordinate system, it corresponds to the transformation matrix $[\mathbf{R}, \mathbf{t}]$ from world to camera coordinates.

Using the above procedure, we can sample a sequence of hand positions $\mathbf{c}(t_j) \in \mathbb{R}^3$, $j = 1, \ldots, n$, from a camera stream of the observed human worker’s motion. This way we obtain the input data for our system for intention recognition, which is the basis for guiding the robot in a collaborative setting and realizing safe human-robot collaboration.
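The back-projection and world transformation above can be summarized in a few lines of code. The following is a minimal sketch; the intrinsic and extrinsic values are placeholders standing in for the calibrated parameters.

```python
# Sketch of the back-projection described above: a pixel (u, v) with aligned depth z_c is
# mapped to camera coordinates and then to world coordinates. The intrinsic and extrinsic
# values below are placeholders for the parameters obtained from calibration.
import numpy as np

alpha, beta, gamma, u0, v0 = 615.0, 615.0, 0.0, 320.0, 240.0   # placeholder intrinsics
R = np.eye(3)                                                  # world-to-camera rotation
t = np.zeros(3)                                                # world-to-camera translation

def pixel_to_world(u, v, z_c):
    """Convert pixel coordinates (u, v) and depth z_c [m] to world coordinates."""
    y_c = z_c * (v - v0) / beta
    x_c = z_c * ((u - u0) / alpha - gamma * (v - v0) / (alpha * beta))
    c_c = np.array([x_c, y_c, z_c])          # landmark in camera coordinates
    return R.T @ c_c - R.T @ t               # c = R^T c_c - R^T t

print(pixel_to_world(350, 260, 0.8))
```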

3.2 Intention recognition

We propose a system for classification of the human worker’s motion based on partial hand position trajectories. The output consists of predicted motion classes from a predefined set of possible motions, denoted as $k \in \{1, \ldots, m\}$, where $m$ represents the total number of classes. These motion classes correspond to different versions of the collaborative task the worker can perform. In the practical experiment described in Section 4, several distinct slots are available for the human worker to complete an assembly task. The system predicts the specific motion category towards a goal slot where the worker intends to place a workpiece and directs the robot to simultaneously perform the assembly task at a different slot. Although human motions in our experiment are related to the goal slots, the proposed system is adaptable and can classify a wide range of motion classes.

As described in Section 3.1, the RGB image sequences of the observed human motion are processed using MediaPipe and combined with depth images to obtain hand position trajectories $\mathbf{c}(t) \in \mathbb{R}^3$. These trajectories are passed to the intention recognition neural network that classifies the observed motion. We compare three different architectures to classify human hand position trajectories, namely a custom LSTM network, a custom transformer-based network, and a transformer-based network proposed by Pettersson and Falkman (2023).

The transformer architecture originally consists of an encoder and a decoder network. For sequence-to-sequence tasks, both the encoder and decoder part of the transformer architecture are used. However, for classification tasks it is customary to utilize only the encoder part of the transformer (Pettersson and Falkman, 2023; Liang et al., 2022), which is also how we design our proposed transformer architecture. Inspired by Abdu-Aguye et al. (2020), the two proposed networks (i.e., LSTM and transformer) also include convolutional and pooling layers. The aim of the convolutional layers is to extract spatial information from input trajectories, while the pooling layers of different sizes retrieve both local and global temporal properties.

Although LSTM and transformer architectures are well-established, our novelty lies in adapting and integrating them for real-time intention recognition from partial and variable-length hand trajectories. We also introduce a lightweight preprocessing module with multi-scale convolutional and pooling layers to extract both local and global temporal features, improving early prediction accuracy from incomplete motion data. Our networks operate continuously at 15 Hz, enabling real-time updates. The proposed models are further integrated with a DMP-based control scheme to achieve smooth, adaptive robot motion, representing the key methodological contribution of this work.

The two proposed networks are graphically shown in Figure 3. In both cases, the input is a sampled variable-length hand trajectory $\mathbf{c}_n = [\mathbf{c}(t_1), \mathbf{c}(t_2), \ldots, \mathbf{c}(t_n)]^T \in \mathbb{R}^{n \times 3}$. Both networks are trained on partial to full hand trajectories. The input data are first processed by convolutional layers to increase the number of channels, and pooling layers with kernels (or windows) of different sizes and a stride of 1 in order to extract local and global temporal changes. The resulting channels are then concatenated into $n$ vectors of size 960 and added to a signal of size $n \times 960$, obtained by passing the input trajectory through a fully connected layer with 960 neurons, which bypasses the convolutional and pooling layers. This forms a residual connection, which has been shown to ease training and improve performance of neural networks (Quan et al., 2021). Each vector in the resulting sum is finally processed by the same fully connected network with 256 output neurons, giving an overall output of size $n \times 256$. The data processed this way is then fed into either a transformer network or an LSTM network.

Figure 3
Diagram of two neural network architectures for processing input trajectories. Both have preprocessing modules with 1D convolution and average pooling layers, followed by fully connected layers. The upper module includes a transformer encoder with multihead attention and feed-forward layers, outputting class probabilities through a softmax function. The lower module utilizes LSTM layers instead, leading to a similar output process.

Figure 3. The proposed networks for intention recognition based on input human hand trajectory. The lefthand side of both architectures includes the preprocessing modules with convolutional and pooling layers, which process the input trajectory and pass the output into either transformer-based or LSTM-based network. All convolutional, fully connected and LSTM layers are followed by nonlinear activation layers, which are not shown for simplicity.
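To make the preprocessing module concrete, the sketch below implements the structure described above in PyTorch. The channel count and pooling window sizes are illustrative assumptions chosen so that the concatenated features have 960 dimensions per time step; the exact values are those given in Figure 3.

```python
# A sketch of the preprocessing module: a 1D convolution, average pooling with several window
# sizes and stride 1, a fully connected residual bypass of size 960, and a projection to 256
# features per time step. Channel count and pooling window sizes are illustrative assumptions
# that reproduce the 960-dimensional concatenated features mentioned in the text.
import torch
import torch.nn as nn
import torch.nn.functional as F

class PreprocessingModule(nn.Module):
    def __init__(self, in_dim=3, channels=192, pool_sizes=(2, 4, 8, 12, 16), out_dim=256):
        super().__init__()
        self.conv = nn.Conv1d(in_dim, channels, kernel_size=3, padding=1)
        self.pool_sizes = pool_sizes
        feat_dim = channels * len(pool_sizes)            # 192 * 5 = 960
        self.bypass = nn.Linear(in_dim, feat_dim)        # residual connection around conv/pooling
        self.proj = nn.Linear(feat_dim, out_dim)

    def forward(self, traj):                             # traj: (batch, n, 3)
        h = F.relu(self.conv(traj.transpose(1, 2)))      # (batch, channels, n)
        pooled = []
        for k in self.pool_sizes:
            # left-pad so that pooling with stride 1 preserves the sequence length
            pooled.append(F.avg_pool1d(F.pad(h, (k - 1, 0)), kernel_size=k, stride=1))
        feats = torch.cat(pooled, dim=1).transpose(1, 2) # (batch, n, 960)
        feats = feats + self.bypass(traj)                # residual sum with the bypassed input
        return F.relu(self.proj(feats))                  # (batch, n, 256)

# Example: a partial trajectory of 40 samples.
print(PreprocessingModule()(torch.randn(1, 40, 3)).shape)   # torch.Size([1, 40, 256])
```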

When using the transformer (top architecture in Figure 3), the input is first augmented by adding positional encodings, as proposed by Vaswani et al. (2017). Positional encoding injects information about the relative position of individual sequence elements. The result is passed through 4 heads of the multihead attention module with learnable weights. This module allows processing of variable-length sequences and extracts dependencies between different elements in the input sequence, i.e., the hand trajectory. Let’s denote the module’s input as $\mathbf{h} \in \mathbb{R}^{n \times 256}$. Each element in the input sequence is first passed separately through three different linear networks of each attention module head, followed by a scaled dot product of the three resulting sequences

$$ \mathbf{a}_i = \mathrm{softmax}\!\left(\frac{(\mathbf{h}\mathbf{W}_{q,i})(\mathbf{h}\mathbf{W}_{k,i})^T}{\sqrt{d_{\mathrm{in}}}}\right)\mathbf{h}\mathbf{W}_{v,i}, $$

where $d_{\mathrm{in}}$ is the dimension of the input samples, in our case 256, and $\mathbf{W}_{q,i}, \mathbf{W}_{k,i}, \mathbf{W}_{v,i} \in \mathbb{R}^{256 \times 256}$ are the weights of the linear networks included in the $i$-th head. The resulting output vectors $\mathbf{a}_i \in \mathbb{R}^{n \times 256}$, $i = 1, \ldots, 4$, are concatenated into $\mathbf{a} \in \mathbb{R}^{n \times 1024}$ and multiplied with learnable weights $\mathbf{W}_o \in \mathbb{R}^{1024 \times 256}$ to again obtain an output of size $n \times 256$. This is added to the input $\mathbf{h}$ and normalized. Each row vector of dimension 256 is then passed through a feed forward network with a hidden dimension of 256, producing the overall output of size $n \times 256$. This is followed by another add and norm operation. The entire process is repeated 4 times, i.e., we use 4 encoder layers. From the transformer output matrix, the vector in the last, i.e., $n$-th, row is extracted and finally passed through another fully connected network with an output layer of size $m$, where $m$ denotes the number of motion classes. A softmax function is applied to obtain a probability distribution across motion classes.
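A corresponding sketch of the transformer-based head is given below; the model dimension, the number of heads, the number of encoder layers, and the feed-forward size follow the description above, while the maximum positional-encoding length is an assumption.

```python
# A sketch of the transformer-based classification head: positional encoding, four encoder
# layers with four attention heads, and a classifier applied to the last element. The maximum
# positional-encoding length is an assumption.
import math
import torch
import torch.nn as nn

class TransformerIntentionHead(nn.Module):
    def __init__(self, d_model=256, n_heads=4, n_layers=4, n_classes=4, max_len=500):
        super().__init__()
        # Sinusoidal positional encoding (Vaswani et al., 2017), precomputed up to max_len.
        pos = torch.arange(max_len).unsqueeze(1)
        div = torch.exp(torch.arange(0, d_model, 2) * (-math.log(10000.0) / d_model))
        pe = torch.zeros(max_len, d_model)
        pe[:, 0::2] = torch.sin(pos * div)
        pe[:, 1::2] = torch.cos(pos * div)
        self.register_buffer("pe", pe)
        layer = nn.TransformerEncoderLayer(d_model, n_heads, dim_feedforward=256, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=n_layers)
        self.classifier = nn.Linear(d_model, n_classes)

    def forward(self, h):                     # h: (batch, n, 256) from the preprocessing module
        h = h + self.pe[: h.shape[1]]         # add positional encodings
        enc = self.encoder(h)                 # (batch, n, 256)
        return self.classifier(enc[:, -1])    # logits from the last (n-th) element

logits = TransformerIntentionHead()(torch.randn(1, 40, 256))
print(torch.softmax(logits, dim=-1))          # probability distribution over motion classes
```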

When the LSTM network (bottom architecture in Figure 3) is used, the output of the preprocessing module is passed through three consecutive LSTM layers with a recurrent structure, which stores information through different time steps. Elements in the input sequence are therefore processed one after the other, with internal LSTM states being updated in each iteration. The output of the final LSTM layer is added to the input into the first LSTM in each time step, forming a residual connection, similar to the one in the preprocessing module. The result is passed through a fully connected network with output layer of size m, and again the softmax function is applied to obtain a probability distribution across motion classes. Note that for an efficient LSTM implementation, the partial input sequences of length n do not need to be passed in their entirety through the LSTM architecture in each time step, since only the last 16 samples change in each time step (due to the use of convolutional filters and average pooling). For this reason, the internal states of all LSTM layers at 16 time steps in the past must be stored so that we compute the new LSTM output using only the latest 16 elements in the input sequence.
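The LSTM-based head can be sketched analogously; for clarity, the stateful bookkeeping needed to reuse internal LSTM states from 16 time steps in the past is omitted, and the whole partial sequence is processed in one call.

```python
# A sketch of the LSTM-based head: three stacked LSTM layers, a residual connection from the
# head's input to the final LSTM output, and a classifier on the last time step. Carrying the
# hidden/cell states over between calls (for efficient real-time use) is omitted for brevity.
import torch
import torch.nn as nn

class LSTMIntentionHead(nn.Module):
    def __init__(self, d_model=256, n_classes=4):
        super().__init__()
        self.lstm = nn.LSTM(d_model, d_model, num_layers=3, batch_first=True)
        self.classifier = nn.Linear(d_model, n_classes)

    def forward(self, h, state=None):          # h: (batch, n, 256) from the preprocessing module
        out, state = self.lstm(h, state)
        out = out + h                           # residual connection around the LSTM stack
        return self.classifier(out[:, -1]), state

logits, _ = LSTMIntentionHead()(torch.randn(1, 40, 256))
print(torch.softmax(logits, dim=-1))
```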

The performance of the proposed networks was compared to the best-performing transformer architecture by Pettersson and Falkman (2023) that was used for gaze-based human intention recognition. The original implementation has two parallel output layers, which we merged into one output layer with $m$ outputs, representing motion classes, for use in our experiments. The rest of the architecture was kept the same.

The data used for training of the intention recognition network consists of the following trajectory sample–motion class pairs:

$$ \mathcal{D} = \left\{\left(\{\mathbf{c}_{i,j}\}_{i=1}^{L_j},\, k_j\right)\right\}_{j=1}^{M}, \qquad \mathbf{c}_{i,j} \in \mathbb{R}^3,\; k_j \in \{1, \ldots, m\}, \tag{1} $$

where $M$ is the number of training trajectories and $L_j$ denotes the number of samples for the $j$-th trajectory. To implement the loss function, cross entropy minimization is employed. The networks from Figure 3 output a probability distribution $\mathbf{p} = \{p_1, p_2, \ldots, p_m\}$ over $m$ possible classes (versions of the task) by utilizing the softmax layer. Given the predicted probability distribution $\mathbf{p}_n$ for a partial trajectory with $n$ samples and a correct target class $k$, the loss is defined as

$$ L_n(\mathbf{p}_n, k) = -\log p_{n,k}, $$

where $p_{n,k}$ represents the predicted probability of the correct class $k$ when using a partial trajectory with $n$ samples as input to the neural network.

A weighted sum of losses at each sampling step is employed to decrease the significance of early input values and increase the significance of later values. For each input trajectory of length L, the total loss is given as

$$ \mathcal{L} = \frac{1}{L}\sum_{n=1}^{L} \gamma_n L_n(\mathbf{p}_n, k), $$

where $\gamma_n$ represents the weight for the $n$-th input, computed using a logistic function

$$ \gamma_n = \frac{1}{1 + e^{-\alpha\frac{n-1}{L-1}}} + 0.5. $$
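As an illustration, the sketch below evaluates a classification network on every prefix of a training trajectory and combines the per-step cross-entropy terms with the logistic weights defined above; the value of $\alpha$ and the per-prefix loop are illustrative, not the exact training implementation.

```python
# A sketch of the weighted loss over partial trajectories. `model` maps a (1, n, 3) trajectory
# prefix to class logits; alpha is an illustrative choice for the logistic weighting.
import math
import torch
import torch.nn.functional as F

def weighted_trajectory_loss(model, traj, target, alpha=10.0):
    """traj: (1, L, 3) full trajectory with L > 1; target: (1,) correct class index."""
    L = traj.shape[1]
    losses = []
    for n in range(1, L + 1):
        logits = model(traj[:, :n])                                    # prediction from n samples
        gamma = 1.0 / (1.0 + math.exp(-alpha * (n - 1) / (L - 1))) + 0.5
        losses.append(gamma * F.cross_entropy(logits, target))
    return torch.stack(losses).mean()                                  # (1/L) * sum of weighted losses
```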

3.2.1 Detection of motion onset and cessation

Intention recognition processes sequences of hand positions to predict the task currently performed by the human worker. However, predictions should only be made during active motion, which requires determining when a movement begins and ends. To address this, we implemented a motion onset and cessation detection mechanism that automatically activates and deactivates the intention recognition system based on the worker’s hand position. Two 3D regions are defined within the workcell: a starting area, where workpieces are picked up, and a goal area, where they are placed.

Motion onset is detected when the worker’s hand leaves the starting area, marking the beginning of the prediction process. Motion cessation is recognized when the hand enters the goal area, indicating the completion of a movement. To improve detection robustness, a small number of samples before and after each transition are also included in the processing pipeline. This approach enables precise timing of prediction sessions and ensures that the intention recognition operates autonomously. The system continuously tracks goal-slot occupancy based on predicted intentions, allowing the robot to adapt its motion and continue the assembly task at available locations.
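A minimal sketch of this logic is shown below; the axis-aligned bounds of the starting and goal areas are illustrative placeholders in world coordinates.

```python
# A minimal sketch of the onset/cessation logic: prediction is active while the hand is outside
# the starting area and has not yet reached the goal area. The box bounds are placeholders.
import numpy as np

START_AREA = (np.array([-0.40, -0.60, 0.00]), np.array([-0.20, -0.40, 0.30]))  # (min, max) corners
GOAL_AREA  = (np.array([ 0.20, -0.30, 0.00]), np.array([ 0.50, -0.10, 0.30]))

def in_box(c, box):
    lo, hi = box
    return bool(np.all(c >= lo) and np.all(c <= hi))

def update_prediction_state(c, active):
    """c: current 3D hand position; active: whether intention recognition is running."""
    if not active and not in_box(c, START_AREA):
        return True        # motion onset: the hand left the starting area
    if active and in_box(c, GOAL_AREA):
        return False       # motion cessation: the hand reached the goal area
    return active
```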

3.3 Robot trajectory generation and switching

We use Dynamic Movement Primitive (DMP) (Ijspeert et al., 2013) representation specified in Cartesian space (Ude et al., 2014) to specify robot motion in a collaborative task. DMPs are well-suited to represent robot trajectories in HRC environments because they can be used to smoothly pull the robot towards a new motion trajectory when the desired motion changes. This makes it possible to generate a smooth transition when switching from one trajectory to another. By specifying trajectories in Cartesian space, we ensure that transitions are smooth in Cartesian space, which reduces the chance of collisions with the environment, since the switching between DMPs results in predictable trajectories.

In our setup, each of the classes is associated with a specific collaborative robot motion. The proposed neural network architectures in Figure 3 generate a new class prediction after each input sample is processed. As the intention recognition system is not perfect, the predicted class for the observed motion may change as more frames become available. In general, the prediction accuracy improves as more input data points become available. This requires that the robot is capable of smoothly switching from one trajectory to another when the predicted class changes. DMPs are well suited for this purpose.

We adapted the third-order DMP system (Schaal et al., 2005; Nemec and Ude, 2012), originally formulated for joint space trajectories, to Cartesian space robot trajectories. In a Cartesian space DMP, the robot’s motion is specified by its position $\mathbf{y}(t) \in \mathbb{R}^3$ and orientation trajectory $\mathbf{q}(t) \in \mathbb{R}^4$, where $\mathbf{q}(t)$ denotes a unit quaternion at time $t$. In the third-order DMP system, the position trajectory can be described by the following system of differential equations:

$$ \tau\dot{\mathbf{v}} = \mathbf{K}_p(\mathbf{r} - \mathbf{y}) - \mathbf{D}_p\mathbf{v} - x\mathbf{K}_p(\mathbf{r} - \mathbf{y}_0) + \mathbf{K}_p\mathbf{f}_p(x), \tag{2} $$
$$ \tau\dot{\mathbf{y}} = \mathbf{v}, \tag{3} $$
$$ \tau\dot{\mathbf{r}} = \mathbf{H}_p(\mathbf{g}_p - \mathbf{r}), \tag{4} $$

where $\mathbf{r}, \mathbf{v} \in \mathbb{R}^3$ are auxiliary variables, $\mathbf{y}_0, \mathbf{g}_p \in \mathbb{R}^3$ are the start and end position, respectively, $\mathbf{K}_p, \mathbf{H}_p \in \mathbb{R}^{3 \times 3}$ are spring matrices, $\mathbf{D}_p \in \mathbb{R}^{3 \times 3}$ is a damping matrix, and $\tau > 0$ is a temporal scaling factor, usually set equal to the duration of motion. We set $\mathbf{K}_p = K_p\mathbf{I}$, $\mathbf{D}_p = D_p\mathbf{I}$, $\mathbf{H}_p = H_p\mathbf{I}$, with $D_p = 2\sqrt{K_p}$, $H_p = K_p$, $K_p > 0$, which provides for the critical damping of the dynamic system. The phase variable $x$ is used to remove the direct time dependency from the DMP formulation

$$ \tau\dot{x} = -\alpha_x x, \tag{5} $$

where $\alpha_x > 0$ is a positive constant. The forcing term $\mathbf{f}_p$ from Equation 2 is defined as a linear combination of $M$ radial basis functions

$$ \mathbf{f}_p(x) = \frac{\sum_{i=1}^{M} x\,\Psi_i(x)\,\mathbf{w}_i^p}{\sum_{i=1}^{M}\Psi_i(x)}, \qquad \Psi_i(x) = \exp\!\left(-h_i\left(x - c_i\right)^2\right), \tag{6} $$

with weights $\mathbf{w}_i^p \in \mathbb{R}^3$ set in such a way that by integrating the equation system Equations 2–5, we obtain the desired trajectory $\mathbf{y}$ starting at the initial position $\mathbf{y}_0$ and ending at the goal position $\mathbf{g}_p$. See (Ude et al., 2020) for more details about how to compute $\mathbf{w}_i^p$. The desired robot trajectory is obtained by integrating the differential equation system Equations 2–5, with the initial values set to $\mathbf{y} = \mathbf{y}_0$, $\mathbf{v} = \mathbf{0}$, $\mathbf{r} = \mathbf{g}_p$, and $x = 1$. Note that if the goal position $\mathbf{g}_p$ changes abruptly, $\mathbf{r}$ and consequently $\mathbf{y}$ converge to the new goal position without causing any discontinuities in the velocity and acceleration of $\mathbf{y}$.
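The following sketch integrates Equations 2–5 with a simple Euler scheme; the gains, the number of basis functions, and the zero forcing-term weights are illustrative, so the result is a plain point-to-point motion rather than a learned trajectory.

```python
# Euler integration sketch of the third-order Cartesian DMP (Equations 2-5). Gains, basis
# functions, and the zero forcing-term weights are illustrative; in practice w is fitted to a
# demonstrated trajectory.
import numpy as np

M, K, tau, alpha_x = 25, 100.0, 2.0, 2.0
D, H = 2.0 * np.sqrt(K), K                       # critical damping, as in the text
c = np.exp(-alpha_x * np.linspace(0, 1, M))      # basis-function centres along the phase
h = np.full(M, 2.0 * M**2)                       # basis-function widths (illustrative)
w = np.zeros((M, 3))                             # forcing-term weights (learned in practice)

def forcing(x):
    psi = np.exp(-h * (x - c) ** 2)
    return x * (psi @ w) / (np.sum(psi) + 1e-10)

def dmp_step(y, v, r, x, g, y0, dt):
    """One Euler step of Equations 2-5."""
    v_dot = (K * (r - y) - D * v - x * K * (r - y0) + K * forcing(x)) / tau
    y_dot = v / tau
    r_dot = H * (g - r) / tau
    x_dot = -alpha_x * x / tau
    return y + dt * y_dot, v + dt * v_dot, r + dt * r_dot, x + dt * x_dot

# Initial values as in the text: y = y0, v = 0, r = g, x = 1.
y0, g = np.zeros(3), np.array([0.3, 0.2, 0.1])
y, v, r, x = y0.copy(), np.zeros(3), g.copy(), 1.0
for _ in range(int(2 * tau / 0.002)):
    y, v, r, x = dmp_step(y, v, r, x, g, y0, 0.002)
print(y)   # converges towards g as the phase decays
```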

A DMP equation system for standard, i.e., second-order DMPs in a unit quaternion space has been proposed by Ude et al. (2014). Building on this approach, we propose the following equations for third-order quaternion DMPs:

$$ \tau\dot{\boldsymbol{\eta}} = \mathbf{K}_o\, 2\log\!\left(\mathbf{o} * \bar{\mathbf{q}}\right) - \mathbf{D}_o\boldsymbol{\eta} + \mathbf{K}_o\mathbf{f}_o(x), \tag{7} $$
$$ \tau\dot{\mathbf{q}} = \frac{1}{2}\,\boldsymbol{\eta} * \mathbf{q}, \tag{8} $$
$$ \tau\boldsymbol{\omega}_o = \mathbf{H}_o\, 2\log\!\left(\mathbf{g}_o * \bar{\mathbf{o}}\right), \tag{9} $$

where $\boldsymbol{\eta} \in \mathbb{R}^3$ and $\mathbf{o} \in \mathbb{R}^4$ are auxiliary variables, and $\boldsymbol{\omega}_o \in \mathbb{R}^3$ is the angular velocity of the auxiliary unit quaternion trajectory $\mathbf{o}$. $*$ denotes the quaternion product and $\bar{\mathbf{q}}$ is the conjugate of quaternion $\mathbf{q}$. Note that Equation 7 is not completely analogous to Equation 2. Namely, we have omitted the term $x\mathbf{K}_o\, 2\log(\mathbf{o} * \bar{\mathbf{q}}_0)$ because this term causes problems when computing the quaternion DMP’s initial state. Variables $\mathbf{q}_0, \mathbf{g}_o \in \mathbb{R}^4$ are the unit quaternions specifying the start and end orientation, respectively. $\tau$ is the temporal scaling factor, just like in Equations 2–5. $\mathbf{K}_o, \mathbf{H}_o, \mathbf{D}_o \in \mathbb{R}^{3 \times 3}$ are diagonal positive definite matrices defined similarly as in Equations 2–4. The forcing term $\mathbf{f}_o$ is defined in the same way as $\mathbf{f}_p$ in Equation 6. Finally, the unit quaternion logarithm is defined as follows

$$ \log(\mathbf{q}) = \log\!\left([v,\, \mathbf{u}^T]^T\right) = \begin{cases} \arccos(v)\,\dfrac{\mathbf{u}}{\|\mathbf{u}\|}, & \mathbf{u} \neq \mathbf{0}, \\[2mm] [0,\, 0,\, 0]^T, & \text{otherwise}, \end{cases} $$

where $v \in \mathbb{R}$ and $\mathbf{u} \in \mathbb{R}^3$ are the scalar and vector part of the unit quaternion $\mathbf{q}$.

The differential Equation 5 is used jointly with the position DMP to integrate the phase. The distinguishing property of the third-order system Equations 7–9 compared to the standard second-order quaternion DMP (Ude et al., 2014) is that it smoothly transitions the orientation trajectory to a new goal when the goal orientation $\mathbf{g}_o$ changes.
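A sketch of one integration step of the quaternion system, Equations 7–9, is given below, with the forcing term set to zero for brevity; quaternions are stored as [w, x, y, z] and the gains are illustrative.

```python
# One Euler step of the third-order quaternion DMP (Equations 7-9), with f_o = 0 for brevity.
# Quaternions are stored as [w, x, y, z]; gains are illustrative.
import numpy as np

Ko, tau = 100.0, 2.0
Do, Ho = 2.0 * np.sqrt(Ko), Ko

def q_mul(q1, q2):
    w1, v1 = q1[0], q1[1:]
    w2, v2 = q2[0], q2[1:]
    return np.concatenate(([w1 * w2 - v1 @ v2], w1 * v2 + w2 * v1 + np.cross(v1, v2)))

def q_conj(q):
    return np.concatenate(([q[0]], -q[1:]))

def q_log(q):                                   # quaternion logarithm as defined above
    v, u = q[0], q[1:]
    norm_u = np.linalg.norm(u)
    return np.arccos(np.clip(v, -1.0, 1.0)) * u / norm_u if norm_u > 1e-12 else np.zeros(3)

def q_exp(r):                                   # inverse of the logarithm, maps R^3 to a unit quaternion
    norm_r = np.linalg.norm(r)
    if norm_r < 1e-12:
        return np.array([1.0, 0.0, 0.0, 0.0])
    return np.concatenate(([np.cos(norm_r)], np.sin(norm_r) * r / norm_r))

def quat_dmp_step(q, eta, o, g_o, dt):
    """One Euler step of Equations 7-9 with zero forcing term."""
    eta_dot = (Ko * 2.0 * q_log(q_mul(o, q_conj(q))) - Do * eta) / tau
    omega_o = Ho * 2.0 * q_log(q_mul(g_o, q_conj(o))) / tau
    q_new = q_mul(q_exp(dt * eta / (2.0 * tau)), q)      # integrates tau*q_dot = (1/2) eta * q
    o_new = q_mul(q_exp(dt * omega_o / 2.0), o)          # integrates the goal-filter quaternion o
    return q_new / np.linalg.norm(q_new), eta + dt * eta_dot, o_new / np.linalg.norm(o_new)
```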

In our practical experiment, the robot and human worker both start moving towards a slot where they intend to perform the required assembly operation. As explained in Section 3.1, the human worker motion is observed by an RGB-D camera and if the intention recognition system described in Section 3.2 estimates that the human worker’s target slot is the same as the currently selected robot’s target slot, the robot motion is adapted towards a different slot.

Let’s denote the current DMP integration state as $\mathbf{y}_p, \mathbf{v}_p, \mathbf{r}_p$ and the terms defined by the previous and next DMP (temporal scaling factor, forcing term, end and initial configuration) as $\tau_p, \mathbf{f}_p, \mathbf{g}_p, \mathbf{y}_{0,p}$ and $\tau_n, \mathbf{f}_n, \mathbf{g}_n, \mathbf{y}_{0,n}$, respectively. To ensure that the position, velocity and acceleration of robot motion remain smooth when switching to a different goal position, we initialize the next DMP integration states $\mathbf{y}_n, \mathbf{v}_n, \mathbf{r}_n$ as

$$ \mathbf{y}_n = \mathbf{y}_p, \tag{10} $$
$$ \mathbf{v}_n = \frac{\tau_n}{\tau_p}\,\mathbf{v}_p, \tag{11} $$
$$ \mathbf{r}_n = \frac{\tau_n^2}{\tau_p^2}\,\mathbf{r}_p + \frac{1 - \tau_n^2/\tau_p^2}{1 - x}\,\mathbf{y}_p + \frac{(\tau_p - \tau_n)\,\tau_n}{\tau_p^2\,(1 - x)}\,\mathbf{K}_p^{-1}\mathbf{D}_p\mathbf{v}_p + \frac{1}{1 - x}\left(\frac{\tau_n^2}{\tau_p^2}\big(x\,\mathbf{y}_{0,p} + \mathbf{f}_p(x)\big) - x\,\mathbf{y}_{0,n} - \mathbf{f}_n(x)\right). \tag{12} $$

We continue the integration from the current phase $x$ using the DMP parameters of the new trajectory, starting with the values given by Equations 10–12. These initial values are not guaranteed to lie on the initially programmed trajectory. However, since every DMP defines a control policy, the integration converges to the new desired motion.
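In code, the switch amounts to re-initializing the integration state from Equations 10–12; the sketch below assumes scalar gains $K_p$ and $D_p$ (i.e., diagonal spring and damping matrices) and forcing-term functions of the previous and next DMP.

```python
# A sketch of the goal-switching initialization (Equations 10-12). Inputs are numpy arrays;
# fp and fn are the forcing-term functions of the previous and next DMP, Kp and Dp the scalar
# spring and damping gains. Valid for x < 1, i.e., after the motion has started.
import numpy as np

def switch_dmp_state(yp, vp, rp, x, tau_p, tau_n, y0p, y0n, fp, fn, Kp, Dp):
    s = tau_n**2 / tau_p**2
    yn = np.copy(yp)                                                       # Equation 10
    vn = (tau_n / tau_p) * vp                                              # Equation 11
    rn = (s * rp                                                           # Equation 12
          + (1.0 - s) / (1.0 - x) * yp
          + (tau_p - tau_n) * tau_n / (tau_p**2 * (1.0 - x)) * (Dp / Kp) * vp
          + (s * (x * y0p + fp(x)) - x * y0n - fn(x)) / (1.0 - x))
    return yn, vn, rn
```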

Switching to a new quaternion DMP occurs in a similar way. For the initialization of the variables of the orientational part of the trajectory, let’s denote the current Cartesian DMP integration state as $\mathbf{q}_p, \boldsymbol{\eta}_p, \mathbf{o}_p$ and the terms defined by the current and next Cartesian DMP (temporal scaling factor, forcing term, end and initial orientation) as $\tau_p, \mathbf{f}_{o,p}, \mathbf{g}_{o,p}, \mathbf{q}_{0,p}$ and $\tau_n, \mathbf{f}_{o,n}, \mathbf{g}_{o,n}, \mathbf{q}_{0,n}$, respectively. The next DMP integration state $\mathbf{q}_n, \boldsymbol{\eta}_n, \mathbf{o}_n$ should be initialized so that the orientation, angular velocity and angular acceleration of the robot motion remain smooth, i.e., $\mathbf{q}_p = \mathbf{q}_n$, $\boldsymbol{\omega}_p = \boldsymbol{\omega}_n$, $\dot{\boldsymbol{\omega}}_p = \dot{\boldsymbol{\omega}}_n$. By taking into account that $\mathbf{q}_n$ is a unit quaternion, we can use Equations 7–9 to compute the following initialization values for the integration of the next quaternion DMP, starting at the current phase $x$:

$$ \mathbf{q}_n = \mathbf{q}_p, $$
$$ \boldsymbol{\eta}_n = \frac{\tau_n}{\tau_p}\,\boldsymbol{\eta}_p, $$
$$ \mathbf{o}_n = \exp\!\left(\left(1 - \frac{\tau_n}{\tau_p}\right)\frac{\tau_n}{2\tau_p}\,\mathbf{K}_o^{-1}\mathbf{D}_o\boldsymbol{\eta}_p + \frac{\tau_n^2}{\tau_p^2}\left(\log\!\left(\mathbf{o}_p * \bar{\mathbf{q}}_p\right) + \frac{\mathbf{f}_{o,p}(x)}{2}\right) - \frac{\mathbf{f}_{o,n}(x)}{2}\right) * \mathbf{q}_n. $$

3.4 Collision prevention

To prevent the worker and the robot from colliding while simultaneously performing a collaborative task, we developed a robot control system that adapts the robot’s speed based on the distance between the end effector and the worker’s hand. This is necessary to prevent interference and injury. By utilizing the RGB-D camera and the procedure from Section 3.1, we obtain the 3D position of the worker’s hand in world coordinates at all times.

We adapt the speed of the robot by changing the parameter $\tau$, used in Equations 2–9 for the generation of the robot trajectory. The parameter is adapted based on the distance of the robot end effector to the worker’s hand in such a way that the speed, and thus $\tau$, is not changed when the distance is larger than 30 cm, and the speed is set to zero when the distance is less than 5 cm, i.e., $\tau$ is increased towards infinity. For distances between these two values, a minimum jerk polynomial (Spong et al., 2006) is defined to smoothly increase $\tau$. By denoting the distance between the robot and the hand as $d = \|\mathbf{y} - \mathbf{c}\|$, the “safe” distance of 30 cm as $d_s$ and the “unsafe” distance of 5 cm as $d_u$, we can then write the expression for $k_\tau$, which modifies the parameter $\tau$ to obtain the adapted parameter $\tilde{\tau}$,

$$ \tilde{\tau} = \frac{1}{k_\tau(d)}\,\tau, $$

where

$$ k_\tau(d) = \begin{cases} 1, & d \geq d_s, \\[2mm] 10\,\dfrac{(d - d_u)^3}{(d_s - d_u)^3} - 15\,\dfrac{(d - d_u)^4}{(d_s - d_u)^4} + 6\,\dfrac{(d - d_u)^5}{(d_s - d_u)^5}, & d_s > d \geq d_u + \delta, \\[2mm] \varepsilon, & d < d_u + \delta, \end{cases} $$

where $\delta = 10^{-8}$ and $\varepsilon = k_\tau(d_u + \delta)$. As $k_\tau$ approaches 0, $\tilde{\tau}$ increases towards infinity. We set the value of $k_\tau$ for $d < d_u + \delta$ to $\varepsilon$ to prevent division by zero. The coefficients for the minimum jerk polynomial, which is a function of $d$, were calculated by considering the boundary conditions, i.e., the first and second derivatives of the polynomial at $d = d_u$ and $d = d_s$ are zero, while $k_\tau(d_u) = 0$ and $k_\tau(d_s) = 1$. This way we ensure the smoothness of robot motion even when $\tau$ starts changing.
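The speed-scaling rule can be summarized as follows; the distances $d_s$, $d_u$, and $\delta$ take the values given above, while the example positions are illustrative.

```python
# A sketch of the speed-scaling factor k_tau(d) and the adapted time constant from the text.
import numpy as np

D_S, D_U, DELTA = 0.30, 0.05, 1e-8      # "safe" distance, "unsafe" distance, delta

def min_jerk(d):
    s = (d - D_U) / (D_S - D_U)         # normalized distance in [0, 1]
    return 10 * s**3 - 15 * s**4 + 6 * s**5

def k_tau(d):
    if d >= D_S:
        return 1.0
    if d < D_U + DELTA:
        return min_jerk(D_U + DELTA)    # epsilon: avoids division by zero below
    return min_jerk(d)

def adapted_tau(tau, robot_pos, hand_pos):
    d = float(np.linalg.norm(np.asarray(robot_pos) - np.asarray(hand_pos)))
    return tau / k_tau(d)

print(adapted_tau(2.0, [0.4, 0.0, 0.3], [0.5, 0.1, 0.3]))   # hand ~14 cm away: the robot slows down
```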

4 Experiments and results

We aimed to evaluate the performance of our proposed intention recognition system and to test the integrated HRC supervision system in a real use case scenario. We compared several different architectures for human hand trajectory classification, namely a transformer-based network by Pettersson and Falkman (2023) and our two proposed LSTM and transformer networks. The networks were used to detect the goal slot where the human worker intended to place an object. The best performing network was tested in a real-life industrial scenario, along with the collision prevention system and the motion onset and cessation detection system.

4.1 Experimental setup

The experimental setting is presented in Figure 1. The setup is the same as in a real-life industrial scenario for the production of car starters. The robot and the human perform the same task, i.e., they pick up a copper ring from the loading area and insert it into the casting model on the table. They execute the task simultaneously, which means that the human may attempt to insert a ring into the same slot as the robot.

The casting model is composed of four insertion slots, where the robot and the human can access all slots. The aim of the intention recognition system is to quickly predict the slot where the human operator intends to place the copper ring and adapt the robot motion plan to prevent interference in the workspace.

4.1.1 Data acquisition

During the process of data gathering, the human worker was given instructions to move an object from the designated starting area and place it into one of the four available slots of the casting model on the table. The subjects performed various motions, replicating the actions typically carried out in production environments, where workers execute smooth and deliberate movements.

At the initiation of each subject’s motion, video recording was started at a rate of 30 Hz, using the Intel RealSense Depth Camera D435. The recording ceased when the subject reached one of the slots on the casting model. This procedure resulted in 215 samples, consisting of RGB-D videos and task version labels $k \in \{1, \ldots, 4\}$, indicating the slot where the object was placed (recorded trajectories are shown in Figure 4). By employing the techniques described in Section 3.1, we transformed the RGB-D videos into 3D trajectories of hand motion $\mathbf{c}(t)$, finally obtaining a dataset from Equation 1.

Figure 4
Plot showing multiple line trajectories in different colors, labeled with \( k = 1, 2, 3, 4 \), indicating different groupings. The x-axis is labeled in meters from \(-0.3\) to \(0.5\) and the y-axis from \(-0.5\) to \(-0.1\).

Figure 4. Raw recorded trajectories shown in the xy-plane in world coordinates, with different colors denoting different goal slots ($k \in \{1, \ldots, 4\}$). The motions begin on the bottom left side of the graph and end on the upper right side, where the four different goal areas are clearly distinguishable.

To evaluate the performance of our networks, we randomly split the available data five times into non-overlapping training, validation, and testing subsets. The testing subsets between the splits are non-overlapping. The division of data into different splits allows us to evaluate the statistical significance of our results. After each subdivision, we digitally enlarged our training, validation, and testing subsets by randomizing the available data (trajectories). As explained above, we employ DMPs to specify the desired robot motions, but their properties allow us to utilize them for data randomization as well, since they result in a smooth and natural motion even if random noise is introduced in their parameters. Thus for digital data augmentation, we encoded the recorded trajectories using DMPs and randomly changed the initial position $\mathbf{y}_0$, the final position $\mathbf{g}_p$, and the trajectory duration $\tau$, where the scale of noise was set based on the variation of these parameters in the training data. We then integrated Equations 2–5 to obtain modified trajectories and finally introduced random Gaussian noise to hand position measurements. This resulted in a dataset with higher variability, encompassing a wide range of subjects’ motions. The total number of training, validation and test trajectories obtained in this way was 890, 300 and 235, respectively, for each data split.

4.1.2 Training method

We train all networks using trajectories in the training dataset D from Equation 1, including partial trajectories. This approach enables the network to learn and predict not only from complete trajectories but also from partial observations, which is crucial in real-time HRC scenarios, where the system needs to make predictions based on incomplete or partial data.

The proposed networks were implemented using the PyTorch (Paszke et al., 2019) framework and trained using the Adam optimization algorithm (Kingma and Ba, 2015) with a learning rate of 0.001 and a batch size of 40, where the training was stopped after 100 consecutive epochs of no error reduction on the validation set.
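A generic training-loop sketch with these settings is shown below; `model`, `train_loader`, `loss_fn`, and `val_loss_fn` are placeholders for the components described in the text (e.g., the weighted loss from Section 3.2).

```python
# A generic sketch of the training setup: Adam with learning rate 0.001, batch size 40 (set in
# the data loader), and early stopping after 100 epochs without improvement on the validation
# set. `model`, `train_loader`, `loss_fn`, and `val_loss_fn` are placeholders.
import copy
import torch

def train(model, train_loader, loss_fn, val_loss_fn, max_epochs=5000, patience=100):
    opt = torch.optim.Adam(model.parameters(), lr=0.001)
    best_val, best_state, stall = float("inf"), None, 0
    for _ in range(max_epochs):
        model.train()
        for traj, target in train_loader:            # batches of (partial) trajectories
            opt.zero_grad()
            loss = loss_fn(model, traj, target)
            loss.backward()
            opt.step()
        val = val_loss_fn(model)                     # validation error after each epoch
        if val < best_val:
            best_val, best_state, stall = val, copy.deepcopy(model.state_dict()), 0
        else:
            stall += 1
            if stall >= patience:                    # stop after 100 epochs with no improvement
                break
    model.load_state_dict(best_state)
    return model
```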

4.2 Results

The intention recognition networks were evaluated on five different datasets, obtained through the process described in Section 4.1.1. The input samples (sequences of hand position measurements c) were passed through the proposed networks to obtain the predicted intention of the human worker, i.e. the label of the target slot the worker is moving the object towards.

Upon processing each element of the input sample, the networks compute the probability distributions across four target slots. These distributions represent the predicted probabilities for each slot. As new position measurements (calculated by processing camera frames) are received, the predicted probabilities are continuously updated. This enables the prediction of the worker’s intention in real-time as the motion is being performed.

To evaluate and compare the accuracy of the intention recognition architectures, we calculated prediction accuracy for all three networks in relation to the percentage of the input motion processed, as shown in Figure 5a. This was done for all 5 data splits, and box-and-whiskers plots were generated to show the variability of results across splits. For all tested networks, the results demonstrate that as a larger portion of input motion becomes available, the average accuracy of intention recognition improves. The accuracy is above 90% at the end of motion. For the best performing network, the average accuracy reaches approximately 46% at 40% of motion completion, 92% at 70% of motion and improves to nearly 100% at the end of motion. The network with LSTM layers on average performed best at human motion classification in our experiments, especially towards the end of motion, although its structure is significantly simpler than the transformer architecture. This may be due to the dataset being relatively small, while the advantages of transformer networks typically become more pronounced with extremely large datasets (Xu et al., 2021; Wang et al., 2022). Both of our proposed architectures performed significantly better than the transformer-based network by Pettersson and Falkman (2023), with the accuracy being around 15% to 30% higher in the middle part of motion. At 70% and 100% of the processed motion, the average accuracy of the LSTM network is 23% and 9% higher than that of the network by Pettersson and Falkman (2023), respectively.

Figure 5
Side-by-side bar graphs compare intention prediction accuracy across different models and preprocessing conditions against the percentage of processed input motion. Graph (a) shows results for Transformer network, Transformer network with preprocessing, and 3-layer LSTM with preprocessing. Graph (b) displays Transformer network without preprocessing, with preprocessing, and 3-layer LSTM with and without preprocessing. Accuracy generally increases with higher percentages of processed inputs.

Figure 5. Temporal accuracy of the tested motion classification networks. The left graph (a) shows accuracy comparison between the adapted transformer network from (Pettersson and Falkman, 2023) and our proposed LSTM and transformer networks with the preprocessing module (comprising convolutional and pooling layers), while the right graph (b) demonstrates the performance of our networks with and without convolution and pooling layers. The presented results were calculated after partial observations of input trajectories, from 10% to 100%. The bars show the mean accuracy across all five data splits for each network at a certain percentage of processed input motion. The box-and-whisker plots display the variation of results across data splits. Boxes show the range of data between the first quartile Q1 (25%) and the third quartile Q3 (75%), black line is the median, and the whiskers extend from the 5-th to the 95-th percentile.

To show the prediction accuracy of the best performing network also in terms of Cartesian distance to the goal slots, we plotted all test trajectories from data split number 1 in the xy-plane and highlighted the parts of trajectories where the network’s classification was correct in green (see Figure 6) and the parts where the classification was wrong in red. This gives a better overview of how far from the goal slots the networks first give a correct prediction. As expected, the percentage of green, i.e., correctly classified trajectory parts, increases the closer the worker’s hand is to the goal slots. We can notice that there are slightly more green parts closer to the goal slots when the LSTM network is used for classification, which is in line with the results from Figure 5. Some trajectories are green from the beginning, more so when the transformer is employed; however, this is most probably due to a lucky guess, since the available information at the start of motion is minimal.

Figure 6
Side-by-side comparison of trajectory plots for LSTM and Transformer models with preprocessing modules. Both graphs show multiple lines representing paths, with the x-axis labeled x[m] and the y-axis labeled y[m]. The LSTM plot is on the left, and the Transformer plot is on the right.

Figure 6. Classification accuracy of the networks on one of the test datasets, shown in the xy-plane. Motions start in the left lower part of the graph and end in the right upper part of the graph at one of the four goal slots. Parts of the trajectories where the networks’ predictions were correct are shown in green, while the trajectory parts where the predictions were incorrect are shown in red.

4.3 Ablation study

The two types of the proposed network architectures, i.e., the LSTM and the transformer network, were tested with and without convolutional and pooling layers (the preprocessing module) to assess their effect. The accuracy of the resulting networks across five data splits is displayed in Figure 5b.

The positive effect of the preprocessing module is noticeable with both the LSTM architecture and the transformer network. The average accuracy of motion classification is higher when the preprocessing module is included regardless of the percentage of the processed motion, except at the beginning of motions where limited information is available and the correct predictions are mostly the result of chance. Convolutional and pooling layers improve the accuracy of the LSTM architecture by around 3–9%, with the highest improvement in the middle part of the observed motion. The increase is slightly lower with the transformer, possibly because transformers process entire input sequences in each time step, and may therefore benefit less from preprocessing. However, even a small increase in prediction accuracy can prove important when optimizing HRC tasks, especially towards the end of motion where predictions should be as accurate as possible.

4.4 Implementation of a human-robot collaborative task

The proposed intention recognition system was deployed in real-time to control a robot in a collaborative task. In the scenario, a human and a Franka Emika robot collaborated to simultaneously transfer copper rings to one of the available target slots. A video demonstrating the combined methods in the HRC use case is available as Supplementary Material for this paper.

Robot Operating System (ROS) was used to enable communication between the Intel RealSense camera, LSTM network and the Franka Emika robot. The robot was controlled in Cartesian impedance mode, which ensures that it is compliant and yields in the case of collisions, reducing impact forces. Four robot trajectories to each target slot were first recorded using kinesthetic guiding and encoded with DMP parameters (see Section 3.3). During collaborative task execution, the worker’s motion was continuously observed by an RGB-D camera and the frames were processed by MediaPipe to obtain a 3D trajectory of hand motion (see Section 3.1). This trajectory was passed through the LSTM architecture to obtain the predicted probability distribution across target slots Ψ, which was then sent to the robot control system. Slot k with the highest probability was selected as the intended target of the worker’s motion and the robot reacted accordingly; if the predicted worker’s target slot was the same or adjacent to the goal slot of the robot, the robot would change the goal slot to the farthest one possible (e.g., if the robot was moving towards slot k=2, and the worker’s intention was predicted as k=1, the robot would switch to slot k=4). The adjustment of robot motion in response to new predictions was implemented by switching from one DMP to another (as described in Section 3.3).
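The slot-selection rule described above can be written compactly; the adjacency notion (neighbouring slot indices) and the occupancy bookkeeping are simplifying assumptions for illustration.

```python
# A sketch of the slot-selection rule: if the worker's predicted slot is the same as or
# adjacent to the robot's current goal slot, the robot switches to the farthest free slot.
# Slots are assumed to be indexed 1-4 along the casting model.
def select_robot_slot(robot_slot, predicted_human_slot, occupied=frozenset()):
    """Return the slot the robot should move towards."""
    if abs(robot_slot - predicted_human_slot) > 1:
        return robot_slot                            # no interference: keep the current goal
    candidates = [k for k in (1, 2, 3, 4) if k not in occupied and k != predicted_human_slot]
    if not candidates:
        return robot_slot
    return max(candidates, key=lambda k: abs(k - predicted_human_slot))   # farthest free slot

print(select_robot_slot(2, 1))   # the robot switches from slot 2 to slot 4
```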

The HRC supervision system proved to enhance the efficiency of human-robot cooperation during the task, since it quickly adapted the robot motion to prevent placing an object into the same slot as the human, which resulted in a robust collaborative task execution. The overall safety was also improved, as the robot slowed down or stopped if the worker’s hand came too close to the robot end effector. The predictions were output at a rate of 15 Hz, showing that real-time performance can be achieved. Note that the robot is controlled with a much higher frequency (1 kHz). The effectiveness of the proposed DMP-based control system is demonstrated in Figure 7. We can see that the robot can smoothly switch from one motion to another without abrupt changes.

Figure 7
Two line graphs display various parameters over time steps. The top graph shows positions in meters: x (blue), y (orange), and z (green), each with solid, dashed, and dash-dotted lines. The bottom graph depicts quaternion components: qx (blue), qy (orange), qz (green), and qw (red), with similar line styles. Both graphs indicate changes in the respective variables over 350 time steps with values ranging from approximately -1 to 0.75.

Figure 7. Example switching between two DMPs for positional (top) and quaternion (bottom) part of the trajectory. The robot switched from a motion towards slot k=1 (dashed lines) to a motion towards slot k=4 (dash-dotted lines). The executed motion is shown with solid lines. The robot was controlled with a frequency of 100 Hz.

5 Discussion

In this paper we propose an integrated system for supervision and control of a human-robot collaboration task. We combine several techniques for ensuring a safe and dynamic cooperation between a human worker and a robot, such as predicting the worker’s intention, detecting the position of the worker’s hand to prevent collisions, and automatic motion onset and cessation detection.

To implement the worker’s intention recognition, we compared three different approaches to classify the human’s hand trajectory. We trained two custom architectures with convolutional and pooling layers followed by LSTM or transformer layers, and an existing transformer-based architecture. Both of our networks performed significantly better than the existing one, with the LSTM-based network performing slightly better than the transformer-based one. Although their structure is more complex and allows for powerful sequence processing, transformers typically require large amounts of training data to perform well (Xu et al., 2021; Wang et al., 2022), while data is rather limited in our use case. Another reason for the LSTM’s better performance may be the nature of the input data. We process continuous motion trajectories, meaning that the output should not change abruptly when new measurements are processed. LSTM networks inherently take this into account by iteratively adapting the hidden states with each new input, reducing the chance of abrupt changes, while transformers compute attention over the entire input sequence each time. We also showed that the added preprocessing module positively contributes to the performance of both the LSTM network and the transformer network. The increase in accuracy is slightly higher with the LSTM network, which may be due to the fact that transformers process entire input sequences, thus obtaining less additional information from convolution and pooling. The methods were applied to a relatively simple task, i.e., four different classes; however, they can easily be extended to a more complex problem by increasing the output layer size.

The obtained results show that a transformer network, while more complex, may not be fit for all tasks, especially where only a limited amount of training data is available. One advantage of LSTMs over transformers that is especially important for real-time processing is also that the sensor data at each time step can be fed continuously into the LSTM to obtain a new output state, while entire partial sequences of input data must be fed into the transformer network at each time step.

Another important contribution of this paper is the third-order, quaternion-based DMP representation, which allows for smooth switching up to the second-order derivatives. This is important for ensuring smooth robot motion when the intended goal position and orientation change. We demonstrated the effectiveness of the proposed HRC system in a real industrial scenario, where it was shown to improve the safety and fluency of human-robot collaboration through better robot task selection and interference avoidance.
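To make the smoothness argument explicit, a generic third-order attractor system can be written as follows (shown here for a single positional degree of freedom only, and not necessarily in the exact parameterization used in this work):

\begin{aligned}
\tau \dot{y} &= v, \\
\tau \dot{v} &= a, \\
\tau \dot{a} &= \alpha_1 \bigl( \alpha_2 \bigl( \alpha_3 (g - y) - v \bigr) - a \bigr) + f(x),
\end{aligned}

where y, v, and a denote the position, scaled velocity, and scaled acceleration states, g is the goal, f(x) is the learned forcing term, τ is the temporal scaling factor, and α1, α2, α3 are positive gains. Because a change of the goal g enters the dynamics only through the highest derivative, y and its first and second time derivatives remain continuous when the target slot, and hence g, is switched; an analogous construction applies to the orientation part via the quaternion logarithmic map (Ude et al., 2014).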

The proposed framework currently applies to tasks in which the robot has prior knowledge of all possible goals and trajectories, acquired through programming by demonstration. This is the case in most practical industrial settings. While this ensures robustness and safety in structured collaborative tasks, it limits flexibility in more dynamic environments. Additional supporting systems would be needed for the robot to fully exploit the estimation of human movement and re-plan its motions and goals in real time.

For future work we plan to address a potential issue that occurs when the robot blocks the view of the worker’s hand, preventing hand position estimation and, consequently, intention recognition. This could be avoided by including multiple cameras in the workcell so that the worker’s hand always remains in the line of sight of at least one camera. Another improvement would be for the robot to smoothly avoid the worker’s arm without significantly reducing its speed. In the future, we also plan to test transformer architectures that are optimized for smaller datasets, such as those proposed by Xu et al. (2021) and Wang et al. (2022).

Data availability statement

The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.

Author contributions

MM: Writing – review and editing, Writing – original draft. MS: Writing – review and editing, Writing – original draft. AU: Writing – review and editing, Writing – original draft.

Funding

The authors declare that financial support was received for the research and/or publication of this article. This work has received funding from the program group Automation, robotics, and biocybernetics (P2-0076), supported by the Slovenian Research Agency, from DIGITOP, GA no. TN-06-0106, funded by the Ministry of Higher Education, Science and Innovation of Slovenia, the Slovenian Research and Innovation Agency, and the European Union - NextGenerationEU, and from the EU’s Horizon Europe grant euRobin (GA no. 101070596).

Conflict of interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Generative AI statement

The authors declare that Generative AI was used in the creation of this manuscript. Generative AI was used to improve grammar and correct mistakes.

Any alternative text (alt text) provided alongside figures in this article has been generated by Frontiers with the support of artificial intelligence and reasonable efforts have been made to ensure accuracy, including review by the authors wherever possible. If you identify any issues, please contact us.

Publisher’s note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

Supplementary material

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/frobt.2025.1708987/full#supplementary-material

References

Abdu-Aguye, M. G., Gomaa, W., Makihara, Y., and Yagi, Y. (2020). “Adaptive pooling is all you need: an empirical study on hyperparameter-insensitive human action recognition using wearable sensors,” in 2020 international joint conference on neural networks (IJCNN) (IEEE), 1–6.

Abuduweili, A., Li, S., and Liu, C. (2019). Adaptable human intention and trajectory prediction for human-robot collaboration. CoRR abs/1909.05089.

Ajoudani, A., Zanchettin, A. M., Ivaldi, S., Albu-Schäffer, A., Kosuge, K., and Khatib, O. (2018). Progress and prospects of the human–robot collaboration. Aut. Robots 42, 957–975. doi:10.1007/s10514-017-9677-2

Ali, H., Allgeuer, P., and Wermter, S. (2024). “Comparing apples to oranges: Llm-powered multimodal intention prediction in an object categorization task,” in International conference on social robotics (Springer), 292–306.

Byeon, H., Al-Kubaisi, M., Quraishi, A., Nimma, D., Ahanger, T. A., Keshta, I., et al. (2025). Reinforcement learning for dynamic optimization of lane change intention recognition for transportation networks. IEEE Trans. Intelligent Transp. Syst., 1–11. doi:10.1109/tits.2025.3529299

Byner, C., Matthias, B., and Ding, H. (2019). Dynamic speed and separation monitoring for collaborative robot applications–concepts and performance. Robotics Computer-Integrated Manuf. 58, 239–252. doi:10.1016/j.rcim.2018.11.002

Cai, C., and Liu, S. (2023). A probabilistic dynamic movement primitives framework on human hand motion prediction for an object transfer scenario. IFAC-PapersOnLine 56, 8327–8332. doi:10.1016/j.ifacol.2023.10.1022

Callens, T., van der Have, T., Van Rossom, S., De Schutter, J., and Aertbeliën, E. (2020). A framework for recognition and prediction of human motions in human-robot collaboration using probabilistic motion models. IEEE Robotics Automation Lett. 5, 5151–5158. doi:10.1109/lra.2020.3005892

Carvalho, C. R., Fernández, J. M., Del-Ama, A. J., Oliveira Barroso, F., and Moreno, J. C. (2023). Review of electromyography onset detection methods for real-time control of robotic exoskeletons. J. Neuroengineering Rehabilitation 20, 141. doi:10.1186/s12984-023-01268-8

Fang, H.-S., Li, J., Tang, H., Xu, C., Zhu, H., Xiu, Y., et al. (2022). Alphapose: whole-body regional multi-person pose estimation and tracking in real-time. IEEE Trans. Pattern Analysis Mach. Intell. 45, 7157–7173. doi:10.1109/tpami.2022.3222784

Gams, A., Nemec, B., Ijspeert, A. J., and Ude, A. (2014). “Coupling movement primitives: interaction with the environment and bimanual tasks,” in 2014 IEEE-RAS international conference on humanoid robots (IEEE), 509–514.

Gao, X., Yan, L., Wang, G., and Gerada, C. (2021). Hybrid recurrent neural network architecture-based intention recognition for human–robot collaboration. IEEE Trans. Cybern. 53, 1578–1586. doi:10.1109/tcyb.2021.3106543

Giuliari, F., Hasan, I., Cristani, M., and Galasso, F. (2021). “Transformer networks for trajectory forecasting,” in 25th international conference on pattern recognition (ICPR), 10335–10342.

Hassani, R. H., Bolliger, M., and Rauter, G. (2022). “Recognizing motion onset during robot-assisted body-weight unloading is challenging but seems feasible,” in 2022 31st IEEE international conference on robot and human interactive communication (RO-MAN), 666–671. doi:10.1109/RO-MAN53752.2022.9900533

Hochreiter, S., and Schmidhuber, J. (1997). Long short-term memory. Neural Comput. 9, 1735–1780. doi:10.1162/neco.1997.9.8.1735

Hoffman, G. (2019). Evaluating fluency in human–robot collaboration. IEEE Trans. Human-Machine Syst. 49, 209–218. doi:10.1109/thms.2019.2904558

Ijspeert, A. J., Nakanishi, J., Hoffmann, H., Pastor, P., and Schaal, S. (2013). Dynamical movement primitives: learning attractor models for motor behaviors. Neural Comput. 25, 328–373. doi:10.1162/neco_a_00393

Jing, H., Sun, Q., Dang, Z., and Wang, H. (2025). Intention recognition of space noncooperative targets using large language models. Space Sci. Technol. 5, 0271. doi:10.34133/space.0271

Kingma, D. P., and Ba, J. (2015). “Adam: a method for stochastic optimization,” in 3rd international conference for learning representations (ICLR).

Liang, Y., Ouyang, K., Wang, Y., Liu, X., Chen, H., Zhang, J., et al. (2022). “Trajformer: efficient trajectory classification with transformers,” in Proceedings of the 31st ACM international conference on information and knowledge management, 1229–1237.

Liu, H., Wu, H., Sun, W., and Lee, I. (2019). “Spatio-temporal GRU for trajectory classification,” in IEEE international conference on data mining (ICDM) (IEEE), 1228–1233.

Lugaresi, C., Tang, J., Nash, H., McClanahan, C., Uboweja, E., Hays, M., et al. (2019). MediaPipe: a framework for building perception pipelines. arXiv preprint arXiv:1906.08172.

Marvel, J. A., and Norcross, R. (2017). Implementing speed and separation monitoring in collaborative robot workcells. Robotics Computer-Integrated Manuf. 44, 144–155. doi:10.1016/j.rcim.2016.08.001

Matheson, E., Minto, R., Zampieri, E. G., Faccio, M., and Rosati, G. (2019). Human–robot collaboration in manufacturing applications: a review. Robotics 8, 100. doi:10.3390/robotics8040100

Mavsar, M., Deniša, M., Nemec, B., and Ude, A. (2021). “Intention recognition with recurrent neural networks for dynamic human-robot collaboration,” in International conference on advanced robotics (ICAR), 208–215.

Mavsar, M., Ridge, B., Pahič, R., Morimoto, J., and Ude, A. (2024). Simulation-aided handover prediction from video using recurrent image-to-motion networks. IEEE Trans. Neural Netw. Learn. Syst. 35, 494–506. doi:10.1109/tnnls.2022.3175720

Moon, H.-S., and Seo, J. (2019). “Prediction of human trajectory following a haptic robotic guide using recurrent neural networks,” in 2019 IEEE world haptics conference (WHC). (IEEE), 157–162.

Nemec, B., and Ude, A. (2012). Action sequencing using dynamic movement primitives. Robotica 30, 837–846. doi:10.1017/s0263574711001056

Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., et al. (2019). PyTorch: an imperative style, high-performance deep learning library. Adv. Neural Inf. Process. Syst. 32, 8024–8035.

Pettersson, J., and Falkman, P. (2023). Comparison of LSTM, transformers, and MLP-mixer neural networks for gaze based human intention prediction. Front. Neurorobotics 17, 1157957. doi:10.3389/fnbot.2023.1157957

Prada, M., Remazeilles, A., Koene, A., and Endo, S. (2013). “Dynamic movement primitives for human-robot interaction: comparison with human behavioral observation,” in 2013 IEEE/RSJ international conference on intelligent robots and systems (IEEE), 1168–1175.

Qi, S., and Zhu, S.-C. (2018). “Intent-aware multi-agent reinforcement learning,” in 2018 IEEE international conference on robotics and automation (ICRA) (IEEE), 7533–7540.

Quan, T. M., Hildebrand, D. G. C., and Jeong, W.-K. (2021). Fusionnet: a deep fully residual convolutional neural network for image segmentation in connectomics. Front. Comput. Sci. 3, 613981. doi:10.3389/fcomp.2021.613981

Schaal, S., Peters, J., Nakanishi, J., and Ijspeert, A. (2005). “Learning movement primitives,” in Robotics research; the eleventh international symposium (Berlin, Heidelberg: Springer: Springer Tracts in Advanced Robotics), 561–572.

Schydlo, P., Rakovic, M., Jamone, L., and Santos-Victor, J. (2018). “Anticipation in human-robot cooperation: a recurrent neural network approach for multiple action sequences prediction,” in IEEE international conference on robotics and automation (ICRA) (Brisbane), 5909–5914.

Sidiropoulos, A., and Doulgeri, Z. (2024). Dynamic via-points and improved spatial generalization for online trajectory generation with dynamic movement primitives. J. Intelligent Robotic Syst. 110, 24. doi:10.1007/s10846-024-02051-0

Simonič, M., Petrič, T., Ude, A., and Nemec, B. (2021). Analysis of methods for incremental policy refinement by kinesthetic guidance. J. Intelligent Robotic Syst. 102 (5), 5. doi:10.1007/s10846-021-01328-y

Spong, M. W., Hutchinson, S., and Vidyasagar, M. (2006). Robot modeling and control. New York, NY: Wiley.

Sui, Z., Zhou, Y., Zhao, X., Chen, A., and Ni, Y. (2021). “Joint intention and trajectory prediction based on transformer,” in IEEE/RSJ international conference on intelligent robots and systems (IROS) (IEEE), 7082–7088.

Tassi, F., Iodice, F., De Momi, E., and Ajoudani, A. (2022). “Sociable and ergonomic human-robot collaboration through action recognition and augmented hierarchical quadratic programming,” in 2022 IEEE/RSJ international conference on intelligent robots and systems (IROS) (IEEE), 10712–10719.

Tortora, S., Michieletto, S., Stival, F., and Menegatti, E. (2019). “Fast human motion prediction for human-robot collaboration with wearable interface,” in 2019 IEEE international conference on cybernetics and intelligent systems (CIS) and IEEE conference on robotics, automation and mechatronics (RAM) (IEEE), 457–462.

Tu, Z., Zhang, T., Yan, L., and lun Lam, T. (2022). “Whole-body control for velocity-controlled mobile collaborative robots using coupling dynamic movement primitives,” in 2022 IEEE-RAS 21st international conference on humanoid robots (humanoids) (IEEE), 119–126.

Ude, A., Petrič, T., Nemec, B., and Morimoto, J. (2014). “Orientation in Cartesian space dynamic movement primitives,” in International conference on robotics and automation (ICRA), 2997–3004.

Ude, A., Gams, A., Asfour, T., and Morimoto, J. (2020). Task-specific generalization of discrete and periodic dynamic movement primitives. IEEE Trans. Robotics 26, 800–815. doi:10.1109/tro.2010.2065430

Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., et al. (2017). Attention is all you need. Adv. Neural Inf. Process. Syst. 30. doi:10.48550/arXiv.1706.03762

Wang, Z., Wang, B., Liu, H., and Kong, Z. (2017). “Recurrent convolutional networks based intention recognition for human-robot collaboration tasks,” in IEEE international conference on systems, man, and cybernetics (SMC). (Banff), 1675–1680.

Wang, W., Li, R., Diekel, Z. M., Chen, Y., Zhang, Z., and Jia, Y. (2019). Controlling object hand-over in human–robot collaboration via natural wearable sensing. IEEE Trans. Human-Machine Syst. 49, 59–71. doi:10.1109/thms.2018.2883176

Wang, W., Zhang, J., Cao, Y., Shen, Y., and Tao, D. (2022). “Towards data-efficient detection transformers,” in European conference on computer vision (ECCV), 88–105.

Xu, P., Kumar, D., Yang, W., Zi, W., Tang, K., Huang, C., et al. (2021). “Optimizing deeper transformers on small datasets,” in Proceedings of the 59th annual meeting of the association for computational linguistics and the 11th international joint conference on natural language processing (volume 1: long papers), 2089–2102.

Yan, S., Xiong, Y., and Lin, D. (2018). Spatial temporal graph convolutional networks for skeleton-based action recognition. Proc. AAAI Conf. Artif. Intell. 32. doi:10.1609/aaai.v32i1.12328

Yan, L., Gao, X., Zhang, X., and Chang, S. (2019). “Human-robot collaboration by intention recognition using deep LSTM neural network,” in 2019 IEEE 8th international conference on fluid power and mechatronics (FPM) (IEEE), 1390–1396.

Yang, W., Paxton, C., Mousavian, A., Chao, Y.-W., Cakmak, M., and Fox, D. (2021). “Reactive human-to-robot handovers of arbitrary objects,” in International conference on robotics and automation (ICRA), 3118–3124.

Yasar, M. S., and Iqbal, T. (2021). A scalable approach to predict multi-agent motion for human-robot collaboration. IEEE Robotics Automation Lett. 6, 1686–1693. doi:10.1109/lra.2021.3058917

Yu, C., Ma, X., Ren, J., Zhao, H., and Yi, S. (2020). “Spatio-temporal graph transformer networks for pedestrian trajectory prediction,” in European conference on computer vision (ECCV) (Springer), 507–523.

Zhang, Z. (2000). A flexible new technique for camera calibration. IEEE Trans. Pattern Analysis Mach. Intell. 22, 1330–1334. doi:10.1109/34.888718

Zhang, J., Liu, H., Chang, Q., Wang, L., and Gao, R. X. (2020). Recurrent neural network for motion trajectory prediction in human-robot collaborative assembly. CIRP Ann. 69, 9–12. doi:10.1016/j.cirp.2020.04.077

Keywords: human-robot collaboration, deep neural networks, LSTM, transformer, intention recognition

Citation: Mavsar M, Simonič M and Ude A (2025) Human intention recognition by deep LSTM and transformer networks for real-time human-robot collaboration. Front. Robot. AI 12:1708987. doi: 10.3389/frobt.2025.1708987

Received: 19 September 2025; Accepted: 26 November 2025;
Published: 19 December 2025.

Edited by:

Daniel Tozadore, University College London, United Kingdom

Reviewed by:

Francesco Tassi, Italian Institute of Technology (IIT), Italy
Duidi Wu, Shanghai Jiao Tong University, China

Copyright © 2025 Mavsar, Simonič and Ude. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Matija Mavsar, matija.mavsar@ijs.si
