ORIGINAL RESEARCH article

Front. Mar. Sci.

Sec. Ocean Observation

Volume 12 - 2025 | doi: 10.3389/fmars.2025.1586237

This article is part of the Research Topic: Remote Sensing Applications in Oceanography with Deep Learning

A deep learning-based data augmentation method for marine mammal call signals

Provisionally accepted
Jiaming Jiang, Wanlu Cheng, Shengwen Gong*, Jingjing Wang*
  • Qingdao University of Science and Technology, Qingdao, Shandong Province, China

The final, formatted version of the article will be published soon.

In marine ecology research, it is crucial to accurately identify the marine mammal species active in a target area during a given season, because this helps researchers understand the behavioral patterns of different species and their ecological environment. However, marine mammal calls are difficult and costly to collect, and publicly available datasets are limited, so there is often too little data to obtain accurate and reliable identification results. To address this problem, we propose MarGEN, a deep learning-based augmentation method for marine mammal call signal data. The method converts the call data into Mel spectrograms, uses a purpose-designed self-attention conditional generative adversarial network to generate new Mel spectrogram samples that are highly similar to the real data, and finally reconstructs them into call signals using WaveGlow. This study marks the first application of generative adversarial networks to data augmentation of marine mammal call signals. Experimental results showed that MarGEN significantly increased the number and diversity of call signals and improved the model's recognition accuracy on marine mammal call signals by an average of 4.7%, effectively promoting marine ecology research.
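
The first stage of the pipeline described above, turning a call recording into a Mel spectrogram, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract gives no parameters, so the HTK-style mel formula, the Hann window, the frame size `n_fft`, the hop length `hop`, and the number of bands `n_mels` are all assumptions chosen for illustration. The sketch uses only the Python standard library so that it is self-contained; in practice an audio library would normally be used for this step.

```python
import cmath
import math


def hz_to_mel(hz):
    """Convert a frequency in Hz to the mel scale (HTK formula, assumed here)."""
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_filterbank(n_mels, n_fft, sample_rate):
    """Build triangular filters, one row per mel band, over n_fft // 2 + 1 bins."""
    n_bins = n_fft // 2 + 1
    top = hz_to_mel(sample_rate / 2.0)
    # n_mels + 2 edge frequencies, evenly spaced on the mel scale.
    edges_hz = [mel_to_hz(top * i / (n_mels + 1)) for i in range(n_mels + 2)]
    bin_hz = [k * sample_rate / n_fft for k in range(n_bins)]
    bank = []
    for m in range(n_mels):
        lo, mid, hi = edges_hz[m], edges_hz[m + 1], edges_hz[m + 2]
        row = []
        for f in bin_hz:
            rising = (f - lo) / (mid - lo)
            falling = (hi - f) / (hi - mid)
            row.append(max(0.0, min(rising, falling)))
        bank.append(row)
    return bank


def power_spectrum(frame):
    """Power spectrum of one Hann-windowed frame, using a naive DFT."""
    n = len(frame)
    windowed = [x * (0.5 - 0.5 * math.cos(2 * math.pi * i / n))
                for i, x in enumerate(frame)]
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(windowed))
        spectrum.append(abs(acc) ** 2)
    return spectrum


def log_mel_spectrogram(signal, sample_rate, n_fft=256, hop=128, n_mels=20):
    """Return a list of frames, each a list of n_mels log mel-band energies."""
    bank = mel_filterbank(n_mels, n_fft, sample_rate)
    frames = []
    for start in range(0, len(signal) - n_fft + 1, hop):
        spec = power_spectrum(signal[start:start + n_fft])
        mel = [sum(w * p for w, p in zip(row, spec)) for row in bank]
        # Small offset avoids log(0) on silent bands.
        frames.append([math.log(e + 1e-10) for e in mel])
    return frames
```

Calling `log_mel_spectrogram(samples, sample_rate)` on a list of audio samples yields a time-by-band matrix of the kind a generative model could be trained on. The generation and WaveGlow reconstruction stages are not sketched here, since the abstract does not describe their architecture in enough detail.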

Keywords: marine ecology, marine mammal call signals, MarGEN, deep learning, data augmentation, self-attention conditional generative adversarial network

Received: 02 Mar 2025; Accepted: 21 May 2025.

Copyright: © 2025 Jiang, Cheng, Gong and Wang. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

* Correspondence:
Shengwen Gong, Qingdao University of Science and Technology, Qingdao, 266061, Shandong Province, China
Jingjing Wang, Qingdao University of Science and Technology, Qingdao, 266061, Shandong Province, China

Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.