ORIGINAL RESEARCH article
Front. Mar. Sci.
Sec. Ocean Observation
Volume 12 - 2025 | doi: 10.3389/fmars.2025.1603090
Recognition and Classification Techniques of Marine Mammal Calls Based on LSTM and Expanded Causal Convolution
Provisionally accepted
- 1 Qingdao University of Science and Technology, Qingdao, China
- 2 Shandong Key Laboratory of Deep Sea Equipment Intelligent Networking, Qingdao, China
- 3 Technische Universität Ilmenau, Ilmenau, Thuringia, Germany
The calls of marine mammals are crucial tools for navigation and for localization between individuals. Classifying these calls effectively is important for ecological monitoring, species conservation, and covert underwater acoustic communication in military biomimetics. However, classification methods based on traditional machine learning struggle with complex patterns, which degrades model performance and classification accuracy. Existing deep learning methods typically rely on frequency-domain feature extraction alone, which cannot fully capture important time-domain features and therefore yields incomplete features. These methods also require large amounts of training data, so their classification performance is insufficient when training data for marine mammal calls is limited. To improve the accuracy and robustness of marine mammal call recognition and classification, we propose a method based on a time-attention Long Short-Term Memory (LSTM) network and a multi-scale dilated causal convolutional network. The method consists of three main components. The first is the frequency-domain feature extraction module, which uses a multi-scale dilated causal convolutional network to extract features from the Mel spectrogram of the audio. This module connects dilated causal convolutional networks at three different scales, each with a distinct dilation factor. The second is the time-domain feature extraction module, which feeds the Mel-frequency cepstral coefficient (MFCC) features of the audio into an LSTM network to extract time-domain features, thereby supplementing the temporal information in the audio. A time-attention mechanism is then introduced to strengthen the focus on important time-domain features.
Third, we adopt transfer learning, in which a pre-trained neural network is further trained on real marine mammal call data and replaces the convolutional network used for classification. Extensive experiments on four species of marine mammals demonstrate that our method achieves the best performance across four metrics, including accuracy, precision, and recall.
Wanlu Cheng et al.
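The two feature-extraction ideas named in the abstract, dilated causal convolution at several scales and attention over time steps, can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the function names, the dilation factors (1, 2, 4), the parallel arrangement of the three scales, and the softmax form of the attention are all assumptions made for illustration, since the abstract does not specify them. In a real model the kernel weights and attention scores would be learned, and the attention would be applied to LSTM hidden states.

```python
import math


def dilated_causal_conv1d(x, kernel, dilation):
    """Compute y[t] = sum_k kernel[k] * x[t - k * dilation].

    Samples before the start of the sequence count as zero, so y[t]
    never depends on any x[s] with s > t (the causal property).
    """
    out = []
    for t in range(len(x)):
        acc = 0.0
        for k, w in enumerate(kernel):
            idx = t - k * dilation
            if idx >= 0:
                acc += w * x[idx]
        out.append(acc)
    return out


def receptive_field(kernel_size, dilation):
    """Number of past-and-present time steps one output sample can see."""
    return (kernel_size - 1) * dilation + 1


def multi_scale_features(x, kernel, dilations=(1, 2, 4)):
    """Apply the same kernel at several dilation factors (assumed values),
    returning one output sequence per scale."""
    return [dilated_causal_conv1d(x, kernel, d) for d in dilations]


def time_attention_pool(hidden_states, scores):
    """Softmax the per-time-step scores, then return the weights and the
    weighted sum of the hidden-state vectors (a context vector)."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(hidden_states[0])
    context = [
        sum(w * h[i] for w, h in zip(weights, hidden_states))
        for i in range(dim)
    ]
    return weights, context


if __name__ == "__main__":
    signal = [1.0, 0.0, 0.0, 0.0, 0.0, 0.0]  # toy impulse, not real audio
    for d, y in zip((1, 2, 4), multi_scale_features(signal, [1.0, 1.0])):
        print("dilation", d, "receptive field", receptive_field(2, d), y)
    print(time_attention_pool([[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0]))
```

Larger dilation factors widen the receptive field without adding kernel weights, which is one plausible reason to combine several scales; the attention weights sum to one, so time steps with higher scores contribute more to the pooled feature.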
Keywords: marine mammals, Marine mammal call recognition and classification, Transfer Learning, LSTM, Expansive causal convolutional networks
Received: 31 Mar 2025; Accepted: 30 Apr 2025.
Copyright: © 2025 Cheng, Chen, Jiang, Li, Wang and Zhou. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
* Correspondence:
Jingjing Wang, Qingdao University of Science and Technology, Qingdao, China
Yanping Zhou, Qingdao University of Science and Technology, Qingdao, China
Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.