ORIGINAL RESEARCH article

Front. Mar. Sci.

Sec. Ocean Observation

Volume 12 - 2025 | doi: 10.3389/fmars.2025.1581794

A lightweight YOLO network using temporal features for high-resolution sonar segmentation

Provisionally accepted
Sen Gao1,2, Wei Guo1,2*, Gaofei Xu1,2, Ben Liu1,2, Yu Sun1, Bo Yuan1
  • 1Institute of Deep-Sea Science and Engineering, Chinese Academy of Sciences (CAS), Sanya, China
  • 2University of Chinese Academy of Sciences, Beijing, China

The final, formatted version of the article will be published soon.

High-resolution sonar is important to underwater robots because it provides precise and detailed environmental information. For underwater robots to respond swiftly to uncertain and dynamic underwater environments, the processing speed of sonar images must improve, which remains a significant challenge. This paper proposes a YOLO-based segmentation model that adopts a lightweight backbone and uses temporal features to simplify the network structure and speed up the segmentation of consecutive frames. First, we divide consecutive frames into keyframes and non-keyframes using multi-scale feature similarity, and train a bypass BiLSTM network to learn temporal features across consecutive frames throughout the entire training process. Then, on non-keyframes, we use the trained BiLSTM model to predict the semantic vectors for segmentation, skipping certain layers to significantly reduce computational complexity. In addition, we present a high-resolution sonar dataset that includes various obstacles. The sonar data were collected in two real underwater environments using an AUV-mounted Oculus MD750d multibeam forward-looking sonar and were annotated with dedicated labels. The proposed method was validated on this dataset and compared with existing works. On an NVIDIA Jetson TX2, the latency of the algorithm is reduced to 87.4 ms on keyframes and 35.3 ms on non-keyframes while maintaining satisfactory segmentation accuracy.
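The keyframe/non-keyframe split described in the abstract can be illustrated with a minimal sketch. The abstract does not state which similarity measure, aggregation rule, or threshold the authors use, so everything below is an assumption for illustration only: cosine similarity per scale, a plain mean across scales, a threshold of 0.9, and the hypothetical function names `cosine_similarity`, `multiscale_similarity`, and `split_keyframes`. Each frame is represented as a list of feature arrays, one per scale.

```python
import numpy as np


def cosine_similarity(a, b, eps=1e-8):
    """Cosine similarity between two feature arrays, flattened to vectors."""
    a = np.asarray(a, dtype=np.float64).ravel()
    b = np.asarray(b, dtype=np.float64).ravel()
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + eps))


def multiscale_similarity(feats_a, feats_b):
    """Mean cosine similarity over matching scales (assumed aggregation)."""
    sims = [cosine_similarity(fa, fb) for fa, fb in zip(feats_a, feats_b)]
    return float(np.mean(sims))


def split_keyframes(frame_feats, threshold=0.9):
    """Return one boolean per frame: True marks a keyframe.

    A frame becomes a keyframe when its features differ enough from the
    most recent keyframe; the threshold value is illustrative only.
    """
    flags = []
    reference = None  # multi-scale features of the most recent keyframe
    for feats in frame_feats:
        if reference is None or multiscale_similarity(reference, feats) < threshold:
            flags.append(True)
            reference = feats
        else:
            flags.append(False)
    return flags


# Toy features at two scales: `near` resembles `base`, `far` does not.
base = [np.array([1.0, 0.0, 0.0, 0.0]), np.array([1.0, 0.0])]
near = [np.array([1.0, 0.01, 0.0, 0.0]), np.array([1.0, 0.01])]
far = [np.array([0.0, 1.0, 0.0, 0.0]), np.array([0.0, 1.0])]
flags = split_keyframes([base, near, far, far])
```

In a pipeline of this shape, frames flagged `True` would go through the full network, while frames flagged `False` would take the cheaper path in which a temporal model predicts the semantic vectors. Comparing against the most recent keyframe rather than the immediately preceding frame is a design choice of this sketch, not something the abstract specifies.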

Keywords: lightweight network, image segmentation, autonomous underwater vehicles, forward-looking sonar, collision avoidance

Received: 23 Feb 2025; Accepted: 08 May 2025.

Copyright: © 2025 Gao, Guo, Xu, Liu, Sun and Yuan. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

* Correspondence: Wei Guo, Institute of Deep-Sea Science and Engineering, Chinese Academy of Sciences (CAS), Sanya, China

Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.