
ORIGINAL RESEARCH article

Front. Neurorobot.

Volume 19 - 2025 | doi: 10.3389/fnbot.2025.1649870

This article is part of the Research Topic: Advancing Neural Network-Based Intelligent Algorithms in Robotics: Challenges, Solutions, and Future Perspectives - Volume II

Imitation-Relaxation Reinforcement Learning for Sparse Badminton Strikes via Dynamic Trajectory Generation

Provisionally accepted
  • 1Zhejiang University, Hangzhou, China
  • 2ZJU-Hangzhou Global Scientific and Technological Innovation Center, Hangzhou, China
  • 3Zhejiang University State Key Laboratory of Fluid Power and Mechatronic Systems, Hangzhou, China
  • 4Zhejiang University Institute of Applied Mechanics, Hangzhou, China

The final, formatted version of the article will be published soon.

Robotic racket sports present exceptional benchmarks for evaluating the dynamic motion control capabilities of robots. Due to the highly nonlinear dynamics of the shuttlecock, the stringent demands on the robot's dynamic response, and the convergence difficulties caused by sparse rewards in reinforcement learning, badminton strikes remain a formidable challenge for robotic systems. To address these challenges, this work proposes DTG-IRRL, a novel learning framework for badminton strikes that integrates imitation-relaxation reinforcement learning with dynamic trajectory generation. The framework demonstrates significantly improved training efficiency and performance, achieving faster convergence and twice the landing accuracy. Analysis of the reward function within a specific parameter-space hyperplane intuitively reveals the convergence difficulties arising from the inherently sparse rewards in racket sports and demonstrates the framework's effectiveness in mitigating local and slow convergence. Implemented on hardware with zero-shot transfer, the framework achieves a 90% hitting rate and a 70% landing accuracy, enabling sustained human-robot rallies. Cross-platform validation on a UR5 robot demonstrates the framework's generalizability while highlighting the need for highly dynamic robotic arms in racket sports.
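The abstract describes combining an imitation term with a sparse strike reward, where the imitation weight is relaxed as training proceeds. The paper's actual reward terms are not given here, so the following is only a minimal sketch of that general idea; the function name, the squared-error imitation similarity, and all weights are illustrative assumptions, not the authors' formulation.

```python
import math

def imitation_relaxation_reward(step, total_steps, hit_success,
                                joint_pos, ref_joint_pos,
                                w0=1.0, sparse_bonus=10.0):
    """Hypothetical shaped reward: a sparse strike bonus plus an
    imitation term whose weight is relaxed (annealed) during training."""
    # Imitation term: similarity between the current joint configuration
    # and a reference swing trajectory (assumed squared-error kernel).
    sq_err = sum((q - qr) ** 2 for q, qr in zip(joint_pos, ref_joint_pos))
    imitation = math.exp(-sq_err)
    # Relaxation: linearly anneal the imitation weight toward zero so the
    # sparse task reward dominates late in training.
    w = w0 * max(0.0, 1.0 - step / total_steps)
    # Sparse task reward: granted only on a successful strike.
    sparse = sparse_bonus if hit_success else 0.0
    return sparse + w * imitation
```

Early in training the dense imitation term provides gradient signal even when no strike succeeds; once the policy can hit reliably, the annealed weight lets the sparse objective take over.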

Keywords: reinforcement learning, robotic badminton, sparse reward, nonlinear dynamics, state prediction, trajectory generation

Received: 19 Jun 2025; Accepted: 08 Aug 2025.

Copyright: © 2025 Yuan, Tao, Cheng, Liang, Jin and WANG. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

* Correspondence:
Yongbin Jin, Zhejiang University, Hangzhou, China
Hongtao WANG, Zhejiang University, Hangzhou, China

Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.