ORIGINAL RESEARCH article
Front. Mar. Sci.
Sec. Ocean Observation
Volume 12 - 2025 | doi: 10.3389/fmars.2025.1641093
This article is part of the Research Topic: Integrating Unmanned Platforms and Deep Learning Technologies for Enhanced Ocean Observation and Risk Mitigation in Ocean Engineering
A novel reinforcement learning framework-based path planning algorithm for unmanned surface vehicle
Provisionally accepted
- 1 Yantai University, Yantai, China
- 2 National CAD Supported Software Engineering Centre, School of Mechanical Science and Engineering, Huazhong University of Science and Technology, Wuhan, China
- 3 Suzhou Tongyuan Software & Control Technology Co., Ltd., Suzhou, China
Unmanned surface vehicles (USVs) are now widely used in ocean observation missions, helping researchers monitor climate change, collect environmental data, and observe marine ecosystem processes. However, USV path planning during ocean observation missions faces several inherent difficulties: heavy dependence on environmental information, long convergence times, and low-quality generated paths. To address these problems, this article proposes a path planning algorithm based on a novel artificial potential field-heuristic reward-averaging deep Q-network (APF-RADQN) framework, aimed at finding optimal paths for USVs. First, USV path planning is modeled as a Markov decision process (MDP). Second, a comprehensive reward function incorporating artificial potential field (APF) heuristics is designed to guide the USV toward the target region. Subsequently, an optimized deep neural network with a reward-averaging strategy is constructed to accelerate the learning and convergence of the algorithm, further improving the global search capability and inference performance of USV path planning. In addition, a Bézier curve is applied to make the generated path more feasible. Finally, the effectiveness of the proposed algorithm is verified against the DQN, A*, and APF algorithms in simulation experiments. Simulation results demonstrate that APF-RADQN improves inference ability and path quality, significantly enhancing USV navigation safety and the operational efficiency of ocean observation missions.
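Two of the building blocks named in the abstract, an APF-inspired reward and Bézier path smoothing, can be illustrated with a minimal sketch. The function names (`apf_reward`, `bezier`) and the coefficients `k_att`, `k_rep`, and the influence radius `d0` are illustrative assumptions, not the paper's actual formulation or parameters.

```python
import numpy as np

def apf_reward(pos, goal, obstacles, k_att=1.0, k_rep=0.5, d0=2.0):
    """Hypothetical APF-inspired reward: attraction toward the goal,
    repulsion near obstacles (only inside the influence radius d0)."""
    r = -k_att * np.linalg.norm(goal - pos)  # closer to the goal is better
    for obs in obstacles:
        d = np.linalg.norm(obs - pos)
        if d < d0:  # penalize proximity to an obstacle
            r -= k_rep * (1.0 / d - 1.0 / d0)
    return r

def bezier(p0, p1, p2, p3, n=20):
    """Sample n points on a cubic Bézier curve through control
    points p0..p3, smoothing a piecewise-linear waypoint path."""
    t = np.linspace(0.0, 1.0, n)[:, None]
    return ((1 - t) ** 3 * p0 + 3 * (1 - t) ** 2 * t * p1
            + 3 * (1 - t) * t ** 2 * p2 + t ** 3 * p3)

# Example: reward at the start of a 2-D grid-world episode,
# then a smoothed segment between two waypoints.
pos = np.array([0.0, 0.0])
goal = np.array([5.0, 5.0])
obstacles = [np.array([1.0, 1.0])]
print(apf_reward(pos, goal, obstacles))
curve = bezier(np.array([0.0, 0.0]), np.array([1.0, 2.0]),
               np.array([3.0, 2.0]), goal, n=10)
print(curve.shape)  # (10, 2)
```

In an MDP formulation such as the one the abstract describes, a reward of this shape gives the agent a dense learning signal at every step instead of a sparse goal-only reward, which is one common way a potential-field heuristic speeds up DQN convergence.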
Keywords: Unmanned surface vehicles, reinforcement learning, Deep Q-learning algorithm, Artificial potential field algorithm, path planning
Received: 04 Jun 2025; Accepted: 14 Jul 2025.
Copyright: © 2025 Mou, Shi, Wang, Yu, Wang, Zhong, Zheng, Wang and Li. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
* Correspondence: Junjie Li, Yantai University, Yantai, China
Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.