- 1Australian Centre for Robotics, University of Sydney, Sydney, NSW, Australia
- 2Department of Marine Technology, Norwegian University of Science and Technology, Trondheim, Norway
Visual surveys by autonomous underwater vehicles (AUVs) and other underwater platforms provide a valuable method for analysing and understanding the benthic environment. Scientists can measure the presence and abundance of benthic species by manually annotating survey images with online annotation software or other tools. Neural network object detectors can reduce the effort involved in this process by locating and classifying species of interest in the images. However, accurate object detectors often rely on large numbers of annotated training images which are not currently available for many marine applications. To address this issue, we propose a novel pipeline for generating large amounts of synthetic annotated training data for a species of interest using 3D modelling and rendering software. The detector is trained with synthetic images and annotations along with real unlabelled images to improve performance through domain adaptation. Our method is demonstrated on a sea urchin detector trained only with synthetic data, achieving a performance slightly lower than an equivalent detector trained with manually labelled real images (AP50 of 84.3 vs 92.3). Using realistic synthetic data for species or objects with few or no annotations is a promising approach to reducing the manual effort required to analyse imaging survey data.
1 Introduction
Increasing human activity in our oceans, such as wind energy, aquaculture and mining, can lead to changes in marine communities. Regular photographic surveys of the seafloor allow scientists to measure changes in the presence and abundance of a variety of marine species that may be impacted. Detection of specific benthic species supports the study and exploration of a healthy underwater environment (Estes et al., 2021; Perkins et al., 2022; Er et al., 2023; Peng et al., 2021). Images of the seafloor captured by Autonomous Underwater Vehicles (AUVs), Remotely Operated Vehicles (ROVs) and other underwater platforms can be used to detect endangered (Stuart-Smith et al., 2020), invasive (Liu et al., 2022, 2021b) and sentinel species (Perkins et al., 2022). Using neural network object detectors to locate and classify a species of interest can help marine scientists efficiently review the large volume of images captured during underwater surveys. Ideally, the detector is trained in a supervised manner with large amounts of data to provide high performance (Inoue et al., 2018; Zhang et al., 2022; Munir et al., 2023), but such data is often not available for underwater images (Liu et al., 2020; Er et al., 2023).
A high-performing detector often relies on training the model with large numbers of images annotated by experts who can identify the species of interest. These annotations may comprise bounding boxes around the objects of interest but may also include simple point annotations, object boundaries or full pixel-level semantic labelling. For underwater scenes, there are often too few annotated images to train supervised models such as an object detector (Xu et al., 2023). Underwater images present particular challenges for reliable object detection due to variations in the attenuation of light in water, water clarity and camera platforms. A small number of annotations may not provide a good distribution of examples to train a model that generalises across these variations in light, water and camera platforms. Generating synthetic images with 3D modelling software offers a new way to alleviate these issues by providing abundant annotations with variations in simulated water, light and camera conditions, improving the detector's ability to generalise (Zhang et al., 2022; Oza et al., 2024; Er et al., 2023).
Synthetic images have been used to address the lack of annotated data in other domains such as autonomous driving (Johnson-Roberson et al., 2017) but only in a limited manner in the underwater domain (Sans-Muntadas et al., 2022; Zwilgmeyer et al., 2021; Lin et al., 2023). 3D modelling and simulation software such as Blender1, Unity2 and Unreal Engine3 are increasingly popular for producing photorealistic images and animations that supplement or replace real data (Saini et al., 2022; Ebadi et al., 2022; Peñarroya et al., 2023; Lu et al., 2023; Becktor et al., 2022; Diamanti et al., 2024). Blender combines the functionality needed to produce large numbers of varied and realistic synthetic renderings of underwater scenes. Through its Python interface, it supports procedural generation of randomised scenes and assets (Santos et al., 2024; Raistrick et al., 2024, 2023), add-ons that generate annotations for training machine learning models (Denninger et al., 2023; Raistrick et al., 2023) and simulation of the effect of light in water (Zwilgmeyer et al., 2021; Sans-Muntadas et al., 2022).
Infinigen (Raistrick et al., 2023) is a framework based on Blender that generates natural scenes using procedurally generated objects, providing the opportunity to generate a vast range of realistic images. Creating realistic 3D models of the morphospecies or object of interest allows the generation of a tailored synthetic dataset. The 3D model can be placed in various benthic scenes with random variations of water conditions and camera configurations.
Our work proposes a new method for training an object detector for a target marine species or object by generating large volumes of realistic benthic images with Blender and Infinigen, combined with domain adaptation during training. While the synthetic data provides large volumes of annotations for supervised training, the distribution shift between synthetic and real images must be addressed for the detector to perform effectively during real-world deployment. Previous work has used image-to-image translation to bridge this gap, but this requires training an additional translation network (Sans-Muntadas et al., 2022; Lin et al., 2023). In our method, we apply two domain adaptation methods to reduce the domain gap during training without image-to-image translation.
The contributions of this paper are:
● a new method to train an object detector of a target marine species or object with no manually annotated training data
● a framework to generate realistic and varied synthetic images of benthic scenes for a target underwater object with annotations for object detection and semantic segmentation
● a training pipeline for high-performing object detection in underwater images using synthetic data and unlabelled real images using state-of-the-art unsupervised domain adaptation and semi-supervised methods
● experimental validation of our method on two real datasets with images of sea urchins captured with different underwater vehicles, water conditions and locations
Our results show that synthetic data can be used to train good-quality detectors without requiring any manually annotated real images. While we focus on object detection, our method could also be applied to semantic segmentation.
The remainder of this paper is organised as follows. Section 2 presents an overview of work using synthetic training images and training object detectors with domain adaptation. Section 3 describes our method to generate synthetic training images and train an object detector with domain adaptation. Section 4 describes the application of the method to the detection of black spiny urchins in images from AUV and ROV platforms, while Section 5 provides a summary of the insights from this method followed by concluding remarks in Section 6.
2 Related work
2.1 Object detection of marine species
Underwater object detection with low numbers of annotations has been improved with enhancements to network architectures, augmentation and pre-processing strategies (Liu et al., 2023, 2020; Israk Ahmed et al., 2024). Recent work has addressed the lack of labelled data for training underwater detectors with augmentation using style transfer of different water types as well as a domain alignment step during training (Liu et al., 2020). Israk Ahmed et al. (2024) trained an object detector for green sea urchins using labelled images from an AUV with data augmentation, colour correction and enhancements to the detector architecture. Our method is independent of image pre-processing and detector architecture and could be applied in addition to these enhancements without the need for labelled data.
2.2 Synthetic data
Synthetic images have been generated in other domains to address the lack of labelled data using 3D modelling software such as the Unity game engine, Unreal Engine and Blender, which provide photorealistic renderings of scenes (Saini et al., 2022; Ebadi et al., 2022; Diamanti et al., 2024). The Sim10k dataset (Johnson-Roberson et al., 2017) provides 10,000 images of street scenes captured from a car dashboard, generated from the Grand Theft Auto video game. Mayer et al. (2016) created large-scale datasets generated from Blender to train optical flow models that performed well on real video data. Procedural generation of assets and scenes using Blender’s geometry nodes and Python interface has provided randomisation in synthetic images (Santos et al., 2024).
Synthetic data has also been created to train neural networks for underwater image tasks (Lin et al., 2023; Sans-Muntadas et al., 2022; Zwilgmeyer et al., 2021). OysterNet (Lin et al., 2023) created simulated oyster reefs using Blender, followed by an image-to-image translation step to provide realistic water effects for a segmentation task. Sans-Muntadas et al. (2022) used simulated underwater images of a simple box structure, followed by an image-to-image translation step, to train a segmentation model for robotic localisation. The effect of water on light was modelled using Blender’s volume shaders to simulate absorption and scattering (Sans-Muntadas et al., 2022; Zwilgmeyer et al., 2021; Diamanti et al., 2024). Our framework generates more visually complex and varied 3D models of the scene and of the objects to be detected than previous work, and combines simulated water effects with domain adaptation methods that do not require a separate image-to-image translation network.
2.3 Domain adaptation
Unsupervised Domain Adaptation (UDA) methods update a model during training to reduce the domain gap between labelled source data and unlabelled target data. UDA methods for classification tasks commonly update feature representations to be domain-invariant at an image level using adversarial learning (Ganin et al., 2016; Tzeng et al., 2017) or other alignment strategies (Baktashmotlagh et al., 2013; Sun and Saenko, 2016). Domain adaptive object detection (DAOD) uses several approaches to reduce the domain gap, including aligning feature distributions (Saito et al., 2019; Chen et al., 2018, 2021) and pseudo-labelling (Chen et al., 2022; Zhu et al., 2023; Maurya et al., 2023). We use ALDI++ (Kay et al., 2025), which is state-of-the-art for the synthetic-to-real object detection benchmark between Sim10k and Cityscapes (Cordts et al., 2016). Semi-supervised training is closely related to UDA as it uses the same training inputs: a small amount of labelled data and a larger amount of unlabelled data (Zhang et al., 2021). Mean Teacher (Tarvainen and Valpola, 2017) uses pseudo-labelling to reduce the domain gap and can be applied to other object detector architectures and other tasks such as semantic segmentation (Zhou et al., 2023). We have used Mean Teacher as an additional method to reduce the domain gap.
3 Method
Our method uses synthetic training data generated from a Blender-based framework followed by a training pipeline using UDA to create an object detector with good performance for detecting a target marine species. An overview of our method is shown in Figure 1.

Figure 1. Overview of our method. (A) 3D models of marine objects, terrain and water are combined to create a unique scene for generating images and annotations. The training pipeline (B) uses the labelled synthetic data and real unlabelled target images to train the object detector with domain adaptation.
3.1 Synthetic data generation
Synthetic training data is generated to provide a large volume of annotated data for training an object detector, addressing the lack of annotated real images. Images and annotations are generated using the Blender-based framework Infinigen (Raistrick et al., 2023), with our enhancements for the benthic environment. The framework generates large numbers of realistic images from a simulated camera rig mounted on a vehicle such as an AUV or ROV following a typical survey path. Blender provides physically-based rendering (PBR) of light (Cornetto and Suway, 2019), which can be combined with Blender’s Volume Absorption and Volume Scattering shaders to simulate absorption and scattering of light underwater. Figure 2 shows examples of scenes with and without the water effects provided by the volume shaders. Bounding box annotations, semantic segmentation masks and depth maps are also generated by the framework.
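One common way to derive a bounding box label from a rendered instance mask is to take the extents of its non-zero pixels. The sketch below illustrates this; it is a minimal example under that assumption and not necessarily how the framework itself produces its annotations.

```python
import numpy as np

def mask_to_bbox(mask: np.ndarray):
    """Derive a tight [x_min, y_min, x_max, y_max] bounding box from a binary
    instance mask, e.g. one rendered for a single object in the scene.
    Illustrative sketch only."""
    ys, xs = np.nonzero(mask)
    if ys.size == 0:
        return None  # the object is not visible in this frame
    return [int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max())]
```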

Figure 2. Synthetic images showing two scenes with no absorption or scattering (A, D) then two different combinations of absorption and scattering (B, C, E, F). The absorption and scattering parameters are shown below the images with colour shown in hue, saturation, value (HSV).
Based on Blender, Infinigen generates realistic natural scenes using procedural generation, allowing an effectively unlimited variety of natural terrain and assets to diversify the training data (Raistrick et al., 2023). Our method enhances Infinigen with a simulated camera rig that follows a ‘mow-the-lawn’ survey track over the scene, similar to paths used in AUV missions. The camera rig features vehicle-mounted lighting, lens distortion based on a real AUV camera calibration (Williams et al., 2012), sensor noise and motion blur. Updated natural assets and new man-made assets have also been added (see Supplementary Material for examples). Blender v3.6.0 has been used with Infinigen v1.6.6. The code for the benthic scene enhancements is available on GitHub4.
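A ‘mow-the-lawn’ track can be expressed as a simple sequence of camera waypoints over the scene. The sketch below generates such a path; the extent, line spacing, step and altitude values are illustrative assumptions, not the parameters used in our framework.

```python
import numpy as np

def lawnmower_waypoints(x_extent=20.0, y_extent=20.0, line_spacing=2.0,
                        step=0.5, altitude=2.0):
    """Generate (x, y, z) camera waypoints for a 'mow-the-lawn' survey:
    parallel lines along the x-axis, offset by line_spacing in y, with the
    direction of travel reversed on every second line. Values are illustrative."""
    waypoints = []
    y_lines = np.arange(0.0, y_extent + 1e-9, line_spacing)
    for i, y in enumerate(y_lines):
        xs = np.arange(0.0, x_extent + 1e-9, step)
        if i % 2 == 1:  # reverse direction on alternate lines
            xs = xs[::-1]
        waypoints.extend((float(x), float(y), altitude) for x in xs)
    return waypoints
```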
To simulate the effect of water on light, we use Blender’s Volume Absorption and Volume Scattering shaders. This builds on previous work that uses the ‘Principled Volume’ shader (Zwilgmeyer et al., 2021; Sans-Muntadas et al., 2022; Diamanti et al., 2024), which combines the absorption and scattering shaders but does not allow all variables to be set. Using the separate Volume Absorption and Volume Scattering shaders allows a density to be defined for each shader, giving more control over the amount of absorption and scattering than a single density value shared by both. Figure 2 gives examples with two different sets of parameters for the absorption and scattering shaders at the upper and lower ends of the ranges of values used. For each scene, a random value between the lower and upper thresholds is set for each parameter.
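A minimal sketch of this shader setup using Blender’s Python API is shown below. The HSV colours and density ranges are placeholders, not the thresholds used to generate our dataset, and the shaders are attached to the world volume here for simplicity.

```python
import colorsys
import random
import bpy

def add_water_volume(world,
                     absorb_hsv=(0.55, 0.9, 0.3), absorb_density_range=(0.05, 0.25),
                     scatter_hsv=(0.45, 0.6, 0.5), scatter_density_range=(0.01, 0.10)):
    """Attach separate Volume Absorption and Volume Scatter shaders to the world
    so the scene is filled with simulated water. Parameter values are illustrative."""
    world.use_nodes = True
    nodes, links = world.node_tree.nodes, world.node_tree.links

    absorb = nodes.new("ShaderNodeVolumeAbsorption")
    absorb.inputs["Color"].default_value = (*colorsys.hsv_to_rgb(*absorb_hsv), 1.0)
    absorb.inputs["Density"].default_value = random.uniform(*absorb_density_range)

    scatter = nodes.new("ShaderNodeVolumeScatter")
    scatter.inputs["Color"].default_value = (*colorsys.hsv_to_rgb(*scatter_hsv), 1.0)
    scatter.inputs["Density"].default_value = random.uniform(*scatter_density_range)

    add_shader = nodes.new("ShaderNodeAddShader")
    links.new(absorb.outputs["Volume"], add_shader.inputs[0])
    links.new(scatter.outputs["Volume"], add_shader.inputs[1])

    # Route the combined shader into the Volume socket of the world output node.
    links.new(add_shader.outputs["Shader"], nodes["World Output"].inputs["Volume"])
```

Defining the two densities independently in this way is what allows the amount of absorption and scattering to be randomised separately for each scene.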
3.2 Object detector training
Our method uses the two-stage object detection architecture Faster R-CNN. The detector is trained using both labelled synthetic images and unlabelled real images from underwater surveys. Labelling provides a tight-fitting bounding box around the species or object of interest with a classification label. In this work, a single target species is classified.
The domain gap between the generated synthetic data and the real images is reduced using domain adaptation with unlabelled real images. Following the naming conventions of domain adaptation methodologies, the labelled training data is called the source and the unlabelled data is called the target. The state-of-the-art UDA method ALDI++ (Kay et al., 2025) is used to reduce the domain gap between the synthetic images and the target images. In addition, the semi-supervised training method Mean Teacher (Tarvainen and Valpola, 2017) is also used. Mean Teacher is a flexible semi-supervised approach that can be applied to other detection architectures and is a state-of-the-art method for semi-supervised semantic segmentation (Tarvainen and Valpola, 2017). These approaches to reducing the domain gap have the advantage of not requiring an additional image-to-image translation network, as used in Lin et al. (2023) and Sans-Muntadas et al. (2022).
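At the core of both Mean Teacher and ALDI++ is a teacher model whose weights are an exponential moving average (EMA) of the student's, with the teacher's predictions on the unlabelled real images serving as pseudo-labels. A minimal sketch of the EMA update in PyTorch is shown below; the decay value is illustrative, not the one used in our experiments.

```python
import torch

@torch.no_grad()
def ema_update(teacher: torch.nn.Module, student: torch.nn.Module, decay: float = 0.999):
    """Update teacher weights as an exponential moving average of the student's
    weights (Tarvainen and Valpola, 2017). The decay value here is illustrative."""
    for t_param, s_param in zip(teacher.parameters(), student.parameters()):
        t_param.mul_(decay).add_(s_param, alpha=1.0 - decay)
```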
If some labelled target data is available, the synthetic images can be used to augment the labelled target dataset. In this scenario, the object detector can be trained without domain adaptation. This method can be applied to any object detection architecture.
4 Experiment and results
We have demonstrated our method on the detection of black spiny urchins and performed evaluation on two publicly available datasets of images taken from AUVs and ROVs.
4.1 Datasets
Synthetic data A dataset of 2,406 synthetic images with 23,248 urchin annotations was generated using the enhanced Infinigen framework. Images were generated with an ROV-like setup with variations in camera angle and altitude, and with an AUV-like configuration with a downward-looking camera at around 2 m altitude. The scenes provide a variety of benthic terrain and marine assets such as kelp, seaweed, seashells and fish. Water absorption and scattering were varied in each scene produced. Figure 3 provides examples of the generated synthetic images with bounding box annotations. The synthetic images and annotations are available at https://huggingface.co/datasets/hdoi5324/SyntheticUrchin.

Figure 3. Examples of synthetically generated images from two underwater scenes with urchins, coral and kelp with bounding box annotations (blue for urchins, pink for coral). (A) shows a forward-looking view and (B) shows a downward-looking view.
Target data Two target datasets with black spiny urchins have been used to test the ability to train a good-quality detector with synthetic data. Each dataset has a training and test split to provide evaluation with unseen data. Table 1 provides the numbers of images and annotations in each dataset. The first dataset is UDD (Underwater Detection Dataset) (Liu et al., 2021a) with images captured by divers and ROVs in an open-sea farm with sea urchins, sea cucumbers and scallops. Only the sea urchin annotations are used.
The second urchin dataset, called IMOS AUV, is from AUV surveys off the coast of Tasmania, Australia, performed as part of the Integrated Marine Observing System5 (IMOS). The IMOS AUV dataset has training images from 2009 and test images from 2011 to investigate whether the model can generalise to images from a different time period. The AUV images and annotations are available at the underwater image repository Squidle+6.
4.2 Evaluation
To validate our method, we follow other UDA methodologies (Deng et al., 2021; Chen et al., 2021; Kay et al., 2025; Chen et al., 2022) by comparing the performance of models trained with our method to an oracle, a reference upper limit given by a fully-supervised model trained with the available labelled target training data. The base model refers to a model trained without any domain adaptation. A separate test split of the target data is used for evaluation using the average precision for detection at an intersection over union (IoU) threshold of 0.5 (AP50). Performance is measured using the model from the final iteration of training. Due to the scarcity of labelled training data, there is no separate validation dataset for selecting the best-performing model from an earlier iteration.
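AP50 can be computed with the standard COCO evaluation tools. The sketch below shows this with pycocotools; the annotation and prediction file names are hypothetical placeholders.

```python
from pycocotools.coco import COCO
from pycocotools.cocoeval import COCOeval

# Hypothetical file names: COCO-format ground truth and detector outputs.
coco_gt = COCO("urchin_test_annotations.json")
coco_dt = coco_gt.loadRes("urchin_test_predictions.json")

evaluator = COCOeval(coco_gt, coco_dt, iouType="bbox")
evaluator.evaluate()
evaluator.accumulate()
evaluator.summarize()

ap50 = evaluator.stats[1]  # AP at IoU=0.50 in the standard COCO summary
print(f"AP50: {ap50:.3f}")
```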
4.3 Training setup and results
In this section, we describe the setup for the object detector training. We use a two-stage object detector, Faster R-CNN, which has state-of-the-art performance for UDA (Kay et al., 2025; Chen et al., 2018, 2021). Our implementation is based on ALDI++ and the Detectron2 framework (Kay et al., 2025; Wu et al., 2019). In all experiments, we use the same training parameters as Detectron2's Faster R-CNN training, with changes noted below.
We use a total batch size of 16 across two GPUs and an initial learning rate of 0.02. Training runs for 9,000 iterations, one-tenth of the iterations used by Detectron2, due to the smaller size of the datasets in this research. We use a ResNet-50 backbone with FPN (Lin et al., 2017) initialised with weights from pre-training on the ImageNet dataset (Deng et al., 2009). Weak augmentation uses horizontal and vertical flips; strong augmentation additionally uses random adjustments to brightness, contrast and saturation, and random blurring. Training was performed on Nvidia P100 GPUs.
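A sketch of the corresponding Detectron2 configuration is given below; the registered dataset names are hypothetical placeholders, and the augmentation and domain adaptation settings handled by ALDI++ are omitted.

```python
from detectron2 import model_zoo
from detectron2.config import get_cfg

cfg = get_cfg()
# Faster R-CNN with a ResNet-50 FPN backbone; the base config initialises the
# backbone from ImageNet pre-trained weights.
cfg.merge_from_file(model_zoo.get_config_file("COCO-Detection/faster_rcnn_R_50_FPN_3x.yaml"))
cfg.MODEL.ROI_HEADS.NUM_CLASSES = 1               # single class: sea urchin
cfg.DATASETS.TRAIN = ("synthetic_urchin_train",)  # hypothetical registered dataset names
cfg.DATASETS.TEST = ("udd_urchin_test",)
cfg.SOLVER.IMS_PER_BATCH = 16                     # total batch size across two GPUs
cfg.SOLVER.BASE_LR = 0.02
cfg.SOLVER.MAX_ITER = 9000
```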
Results are provided in Figure 4 for base models trained only with synthetic data, for models trained with synthetic data and unlabelled target data using Mean Teacher and ALDI++, and finally for the fully-supervised oracle model. Base models are trained with weak augmentation and with strong augmentation. Training with strong augmentation uses a student-teacher configuration that updates the weights of the teacher model using an exponential moving average (EMA) of the student model’s weights, based on Laine and Aila (2017) (Strong Augmentation with EMA). Mean Teacher and ALDI++ start training from the base model trained with strong augmentation with EMA. The oracle is also trained with strong augmentation with EMA. Table 2 shows the AP50 for each model trained for the UDD and IMOS AUV datasets. Figure 5 provides annotated examples from each dataset.

Figure 4. AP50 for urchin detection for UDD and IMOS AUV datasets using different training methods. The base models are only trained with synthetic data, while Mean Teacher and ALDI++ also use unlabelled target data. The oracle is the performance from the fully supervised model for reference.

Figure 5. Examples of the target images with ground truth shown with a green border with the label ‘seaurchin’ and predictions in blue with the probability. (A, B) are from the UDD dataset, and (C, D) are from the IMOS AUV dataset.
4.4 Augmenting labelled target data with synthetic data
The labelled synthetic data can also be used to augment labelled target data to create a larger and more varied training dataset. Synthetic data was added to a subset of labelled real data and the detector was trained using strong augmentation with EMA. The model from the final iteration of training is used for evaluation. The results were averaged over three training runs due to the variation when training with small amounts of data (Gao et al., 2022). Figure 6 shows AP50 for training with increasing amounts of labelled target data, with and without synthetic data, with the standard deviation shown around each line. The UDD dataset benefited the most from the addition of synthetic data, achieving higher performance than the oracle. The IMOS AUV dataset benefited from the synthetic augmentation data only when there were fewer than 200 labelled target images in the training dataset.

Figure 6. Augmenting labelled target data with synthetic data: AP50 when using synthetic data to augment a subset of labelled target data for the UDD and IMOS AUV datasets. The number of labelled target images is shown on a log2 scale to emphasise the lower numbers of labelled images, where the most significant differences in performance occur. Models are trained with strong augmentation with EMA, and performance is measured on the model from the last iteration of training. Results are averaged from three training runs and the standard deviation is shown in a lighter colour around each line. The oracle performance is shown by the dashed orange line.
5 Discussion
Our method trained an object detector for a marine species without any manually labelled training data. Performance using synthetic data with Mean Teacher adaptation was within 13.9% and 8.6% of the oracle performance for the UDD and IMOS AUV datasets, respectively (see Table 2). Strong augmentation with student-teacher training updating weights with EMA (Strong Augmentation with EMA in Table 2) gave the largest increase in AP50, followed by the domain adaptation methods. Strong augmentation is also part of the Mean Teacher and ALDI++ methods. These results may be further improved with other strong augmentation strategies (e.g., mixup (Zhang et al., 2018)) or with improvements to image pre-processing and detector architecture (Israk Ahmed et al., 2024).
Both domain adaptation methods successfully improved performance, with the Mean Teacher method producing a slightly better result than ALDI++. Ideally, when training a neural network, a separate validation dataset is used for hyperparameter tuning and model selection. In our case, we have no labelled target data, which prevents any model selection. We have used the model from the final training iteration for evaluation. Being able to measure performance without a labelled validation dataset could lead to further increases in performance, as seen in domain adaptation for classification (Hu et al., 2023).
While scarce labelled target data could be used for model selection during training, it could also be used as training data augmented with synthetic data. Augmentation with synthetic data was beneficial for both datasets when training with fewer than 100 labelled images, as shown in Figure 6. For the UDD dataset, training data augmented with synthetic data also produced better performance with any amount of labelled target data. The UDD images had more cases of blurring and of low-clarity, turbid water with a strong yellow-green tinge (see Figure 5). The IMOS AUV images were generally better lit and had little blurring. For the UDD dataset, the improvement from synthetic data augmentation may have continued with increasing amounts of labelled target data because the synthetic data behaved like a colour space augmentation (Shorten and Khoshgoftaar, 2019), training the model to ignore colour.
The species or objects used in the underwater scenes could be generated based on physical examples. 3D Blender models of marine species and man-made structures have been generated for volumetric measurements and simulations (Zhang et al., 2023; Adamczak et al., 2019; Diamanti et al., 2024). Scanning images or a physical object to generate a 3D model in Blender could extend the variety and availability of realistic assets to place within an underwater scene. Using these models for the procedural generation of assets could provide a range of realistic variations.
6 Conclusion
Our method has trained a high-performing detector of marine species in images from photographic surveys using generated synthetic labelled data combined with domain adaptation during training. The synthetic images are generated with a flexible framework based on Infinigen, with variations in water conditions and camera configurations. New marine species or man-made objects could be added by creating new procedural models based on existing assets or on scans of images or physical objects. Blender’s modelling capability would allow a variety of marine species to be created, including rare species with few real images available. Future research into addressing the scarcity of annotations in benthic images could investigate performing model selection without a labelled validation dataset and how synthetic data can best be used as an augmentation strategy.
Data availability statement
The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.
Author contributions
HD: Conceptualization, Formal Analysis, Investigation, Methodology, Software, Validation, Writing – original draft, Writing – review & editing. OP: Resources, Supervision, Writing – review & editing. SW: Resources, Supervision, Writing – review & editing.
Funding
The author(s) declare that no financial support was received for the research and/or publication of this article.
Acknowledgments
The object detection training was performed on resources provided by Sigma2 - the National Infrastructure for High-Performance Computing and Data Storage in Norway. The IMOS dataset was sourced from Australia’s Integrated Marine Observing System (IMOS) – IMOS is enabled by the National Collaborative Research Infrastructure Strategy (NCRIS). It is operated by a consortium of institutions as an unincorporated joint venture, with the University of Tasmania as Lead Agent.
Conflict of interest
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Generative AI statement
The author(s) declare that no Generative AI was used in the creation of this manuscript.
Publisher’s note
All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.
Supplementary material
The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fmars.2025.1581778/full#supplementary-material
Footnotes
- ^ https://www.blender.org/
- ^ https://unity.com/
- ^ https://www.unrealengine.com/
- ^ https://github.com/hdoi5324/infinigenBenthic
- ^ https://imos.org.au/
- ^ https://squidle.org/
References
Adamczak S. K., Pabst A., McLellan W. A., and Thorne L. H. (2019). Using 3D models to improve estimates of marine mammal size and external morphology. Front. Mar. Sci. 6. doi: 10.3389/fmars.2019.00334
Baktashmotlagh M., Harandi M. T., Lovell B. C., and Salzmann M. (2013). “Unsupervised domain adaptation by domain invariant projection,” in Proceedings of the IEEE International Conference on Computer Vision. (Sydney, Australia: IEEE) 769–776. doi: 10.1109/ICCV.2013.100
Becktor J., Schöller F. E. T., Boukas E., Blanke M., and Nalpantidis L. (2022). Bolstering maritime object detection with synthetic data. IFAC-PapersOnLine 55, 64–69. doi: 10.1016/j.ifacol.2022.10.410
Chen M., Chen W., Yang S., Song J., Wang X., Zhang L., et al. (2022). “Learning domain adaptive object detection with probabilistic teacher,” in Proceedings of the 39th International Conference on Machine Learning, (Baltimore, Maryland, USA) Vol. 162. 3040–3055.
Chen Y., Li W., Sakaridis C., Dai D., and Van Gool L. (2018). “Domain adaptive faster R-CNN for object detection in the wild,” in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition. (Salt Lake City, UT, USA: IEEE) 3339–3348. doi: 10.1109/CVPR.2018.00352
Chen Y., Wang H., Li W., Sakaridis C., Dai D., and Van Gool L. (2021). Scale-aware domain adaptive faster R-CNN. Int. J. Comput. Vision 129, 2223–2243. doi: 10.1007/s11263-021-01447-x
Cordts M., Omran M., Ramos S., Rehfeld T., Enzweiler M., Benenson R., et al. (2016). “The cityscapes dataset for semantic urban scene understanding,” in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition. (Las Vegas, NV, USA: IEEE) 3213–3223. doi: 10.1109/CVPR.2016.350
Cornetto A. D. and Suway J. (2019). Validation of the cycles engine for creation of physically correct lighting models. SAE Tech. papers. doi: 10.4271/2019-01-1004
Deng J., Dong W., Socher R., Li L., Li K., and Fei-Fei L. (2009). “Imagenet: A large-scale hierarchical image database,” in 2009 IEEE Conference on Computer Vision and Pattern Recognition (IEEE). (Miami, FL, USA: IEEE) 248–255. doi: 10.1109/CVPR.2009.5206848
Deng J., Li W., Chen Y., and Duan L. (2021). “Unbiased mean teacher for cross-domain object detection,” in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition. (Nashville, TN, USA: IEEE) 4089–4099. doi: 10.1109/CVPR46437.2021.00408
Denninger M., Winkelbauer D., Sundermeyer M., Boerdijk W., Knauer M., Strobl K. H., et al. (2023). BlenderProc2: A Procedural Pipeline for Photorealistic Rendering. Journal of Open Source Software 8, 4901. doi: 10.21105/joss.04901
Diamanti E., Yip M., Stahl A., and Ødegård O. (2024). Advancing data quality of marine archaeological documentation using underwater robotics: from simulation environments to real-world scenarios. J. Comput. Appl. Archaeology 7, 153–169. doi: 10.5334/jcaa.147
Ebadi S. E., Dhakad S., Vishwakarma S., Wang C., Jhang Y.-C., Chociej M., et al. (2022). PSP-HDRI+: A synthetic dataset generator for pre-training of human-centric computer vision models. arXiv preprint arXiv:2207.05025. doi: 10.48550/arXiv.2207.05025
Er M. J., Chen J., Zhang Y., and Gao W. (2023). Research challenges, recent advances, and popular datasets in deep learning-based underwater marine object detection: A review. Sensors 23. doi: 10.3390/s23041990
Estes M., Anderson C., Appeltans W., Bax N., Bednarsek N., Canonico G., et al. (2021). Enhanced monitoring of life in the sea is a critical component of conservation management and sustainable economic growth. Mar. Policy 132. doi: 10.1016/j.marpol.2021.104699
Ganin Y., Ustinova E., Ajakan H., Germain P., Larochelle H., Laviolette F., et al. (2016). Domain-adversarial training of neural networks. J. Mach. Learn. Res. 17, 2096–2030. doi: 10.48550/arXiv.1505.07818
Gao B. B., Chen X., Huang Z., Nie C., Liu J., Lai J., et al. (2022). “Decoupling classifier for boosting few-shot object detection and instance segmentation,” in Advances in Neural Information Processing Systems, (Red Hook, NY, USA: Curran Associates, Inc) Vol. 35.
Hu D., Liang J., Liew J. H., Xue C., Bai S., and Wang X. (2023). “Mixed samples as probes for unsupervised model selection in domain adaptation,” in Advances in Neural Information Processing Systems, (Red Hook, NY, USA: Thirty-seventh Conference on Neural Information Processing Systems) Vol. 36.
Inoue N., Furuta R., Yamasaki T., and Aizawa K. (2018). “Cross-domain weakly-supervised object detection through progressive domain adaptation,” in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition. (Salt Lake City, UT, USA: IEEE) 5001–5009. doi: 10.1109/CVPR.2018.00525
Israk Ahmed M., Peña-Castillo L., Vardy A., and Gagnon P. (2024). Improving detection and localization of green sea urchin by adding attention mechanisms in a convolutional network. J. Ocean Technol. 19, 81–97.
Johnson-Roberson M., Barto C., Mehta R., Sridhar S. N., Rosaen K., and Vasudevan R. (2017). “Driving in the Matrix: Can virtual worlds replace human-generated annotations for real world tasks?,” in Proceedings - IEEE International Conference on Robotics and Automation. (Singapore: IEEE) 746–753. doi: 10.1109/ICRA.2017.7989092
Kay J., Haucke T., Stathatos S., Deng S., Young E., Perona P., et al. (2025). Align and distill: unifying and improving domain adaptive object detection. Trans. Mach. Learn. Res.
Laine S. and Aila T. (2017). “Temporal ensembling for semi-supervised learning,” in 5th International Conference on Learning Representations, ICLR 2017 - Conference Track Proceedings. (Toulon, France: OpenReview.net)
Lin T. Y., Dollár P., Girshick R., He K., Hariharan B., and Belongie S. (2017). “Feature pyramid networks for object detection,” in Proceedings - 30th IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2017. (Honolulu, HI, USA: IEEE) 936–944. doi: 10.1109/CVPR.2017.106
Lin X., Sanket N., Karapetyan N., and Aloimonos Y. (2023). “OysterNet: Enhanced oyster detection using simulation,” in 2023 IEEE International Conference on Robotics and Automation (ICRA) (IEEE). (London, United Kingdom: IEEE) 5170–5176. doi: 10.1109/ICRA48891.2023.10160830
Liu M., Jiang W., Hou M., Qi Z., Li R., and Zhang C. (2023). A deep learning approach for object detection of rockfish in challenging underwater environments. Front. Mar. Sci. 10. doi: 10.3389/fmars.2023.1242041
Liu J., Kusy B., Marchant R., Do B., Merz T., Crosswell J., et al. (2021b). The CSIRO crown-of-thorn starfish detection dataset. arXiv preprint arXiv:2111.14311. doi: 10.48550/arXiv.2111.14311
Liu C., Li H., Wang S., Zhu M., Wang D., Fan X., et al. (2021a). “A dataset and benchmark of underwater object detection for robot picking,” in 2021 IEEE International Conference on Multimedia and Expo Workshops, ICMEW 2021. doi: 10.1109/ICMEW53276.2021.9455997
Liu H., Song P., and Ding R. (2020). “Towards domain generalization in underwater object detection,” in Proceedings - International Conference on Image Processing, ICIP. (Abu Dhabi, United Arab Emirates: IEEE) 1971–1975. doi: 10.1109/ICIP40778.2020.9191364
Liu C., Wang Z., Wang S., Tang T., Tao Y., Yang C., et al. (2022). A new dataset, poisson GAN and aquaNet for underwater object grabbing. IEEE Trans. Circuits Syst. Video Technol. 32, 2831–2844. doi: 10.1109/TCSVT.2021.3100059
Lu Q., Jing Y., and Zhao X. (2023). Bolt loosening detection using key-point detection enhanced by synthetic datasets. Appl. Sci. (Switzerland) 13. doi: 10.3390/app13032020
Maurya J., Ranipa K. R., Yamaguchi O., Shibata T., and Kobayashi D. (2023). “Domain adaptation using self-training with mixup for one-stage object detection,” in 2023 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV). (Waikoloa, HI, USA: IEEE) 4178–4187. doi: 10.1109/WACV56688.2023.00417
Mayer N., Ilg E., Hausser P., Fischer P., Cremers D., Dosovitskiy A., et al. (2016). “A large dataset to train convolutional networks for disparity, optical flow, and scene flow estimation,” in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition. (Las Vegas, NV, USA: IEEE) 4040–4048. doi: 10.1109/CVPR.2016.438
Munir M., Khan M., Sarfraz M., and Ali M. (2023). Domain adaptive object detection via balancing between self-training and adversarial learning. IEEE Trans. Pattern Anal. Mach. Intell. doi: 10.1109/TPAMI.2023.3290135
Oza P., Sindagi V. A., Vs V., and Patel V. M. (2024). Unsupervised domain adaptation of object detectors: A survey. IEEE Trans. Pattern Anal. Mach. Intell. 46, 4018–4040. doi: 10.1109/TPAMI.2022.3217046
Peñarroya P., Pugliatti M., Ferrari F., Centuori S., Topputo F., Vetrisano M., et al. (2023). CubeSat landing simulations on small bodies using blender. Adv. Space Res. 72, 2971–2993. doi: 10.1016/j.asr.2022.07.044
Peng F., Miao Z., Li F., and Li Z. (2021). S-FPN: A shortcut feature pyramid network for sea cucumber detection in underwater images. Expert Syst. Appl. 182, 115306. doi: 10.1016/j.eswa.2021.115306
Perkins N. R., Monk J., Soler G., Gallagher P., and Barrett N. S. (2022). Bleaching in sponges on temperate mesophotic reefs observed following marine heatwave events. Climate Change Ecol. 3. doi: 10.1016/j.ecochg.2021.100046
Raistrick A., Lipson L., Ma Z., Mei L., Wang M., Zuo Y., et al. (2023). “Infinite photorealistic worlds using procedural generation,” in 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). (Vancouver, BC, Canada: IEEE) 12630–12641. doi: 10.1109/CVPR52729.2023.01215
Raistrick A., Mei L., Kayan K., Yan D., Zuo Y., Han B., et al. (2024). “Infinigen indoors: photorealistic indoor scenes using procedural generation,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). (Seattle, WA, USA: IEEE)
Saini N., Bonetto E., Price E., Ahmad A., and Black M. J. (2022). Airpose: Multi-view fusion network for aerial 3d human pose and shape estimation. IEEE Robotics Automation Lett. 7, 4805–4812. doi: 10.1109/LRA.2022.3145494
Saito K., Ushiku Y., Harada T., and Saenko K. (2019). “Strong-weak distribution alignment for adaptive object detection,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. (Long Beach, CA, USA: IEEE) 6956–6965. doi: 10.1109/CVPR.2019.00712
Sans-Muntadas A., Skaldebo M. B., Nielsen M. C., and Schjolberg I. (2022). Unsupervised domain transfer for task automation in unmanned underwater vehicle intervention operations. IEEE J. Oceanic Eng. 47(2), 312–321. doi: 10.1109/JOE.2021.3126016
Santos A. N. P., Magboo M. S. A., and Magboo V. P. C. (2024). “Procedural modeling for sustainable urban development and planning: A blender plugin for 3D modeling of philippine cities (Springer nature Singapore),” in Proceedings of the 4th International Conference on Advances in Computational Science and Engineering. (Singapore: Springer Nature Singapore) 81–97.
Shorten C. and Khoshgoftaar T. M. (2019). A survey on image data augmentation for deep learning. J. Big Data 6. doi: 10.1186/s40537-019-0197-0
Stuart-Smith J., Edgar G., Last P., Linardich C., Lynch T., Barrett N., et al. (2020). Conservation challenges for the most threatened family of marine bony fishes (handfishes: Brachionichthyidae). Biol. Conserv. 252, 108831. doi: 10.1016/j.biocon.2020.108831
Sun B. and Saenko K. (2016). “Deep CORAL: Correlation alignment for deep domain adaptation,” in Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics). 443–450. doi: 10.1007/978-3-319-49409-8_35
Tarvainen A. and Valpola H. (2017). “Mean teachers are better role models: Weight-averaged consistency targets improve semi-supervised deep learning results,” in Advances in Neural Information Processing Systems. (Red Hook, NY, United States: Curran Associates Inc.) 1196–1205.
Tzeng E., Hoffman J., Saenko K., and Darrell T. (2017). “Adversarial discriminative domain adaptation,” in Proceedings - 30th IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2017. (Honolulu, HI, USA: IEEE) 2962–2971. doi: 10.1109/CVPR.2017.316
Williams S. B., Pizarro O. R., Jakuba M. V., Johnson C. R., Barrett N. S., Babcock R. C., et al. (2012). Monitoring of benthic reference sites: Using an autonomous underwater vehicle. IEEE Robotics Automation Magazine 19, 73–84. doi: 10.1109/MRA.2011.2181772
Wu Y., Kirillov A., Massa F., Lo W.-Y., and Girshick R. (2019). Detectron2. Available online at: https://github.com/facebookresearch/detectron2. (Accessed July 1, 2024).
Xu S., Zhang M., Song W., Mei H., He Q., and Liotta A. (2023). A systematic review and analysis of deep learning-based underwater object detection. Neurocomputing 527, 204–232. doi: 10.1016/j.neucom.2023.01.056
Zhang H., Luo G., Li J., and Wang F. Y. (2022). C2FDA: coarse-to-fine domain adaptation for traffic object detection. IEEE Trans. Intelligent Transportation Syst. 23, 12633–12647. doi: 10.1109/TITS.2021.3115823
Zhang H., Cisse M., Dauphin Y. N., and Lopez-Paz D. (2018). “mixup: Beyond empirical risk minimization,” in International Conference on Learning Representations (ICLR). (Vancouver, BC, Canada: OpenReview.net). doi: 10.48550/arXiv.1710.09412
Zhang Y., Zhang H., Deng B., Li S., Jia K., and Zhang L. (2021). Semi-supervised models are strong unsupervised domain adaptation learners. arXiv preprint arXiv:2106.00417. doi: 10.48550/arXiv.2106.00417
Zhang C., Zhou H., Christiansen F., Hao Y., Wang K., Kou Z., et al. (2023). Marine mammal morphometrics: 3D modeling and estimation validation. Front. Mar. Sci. 10. doi: 10.3389/fmars.2023.1105629
Zhou H., Jiang F., and Lu H. (2023). SSDA-YOLO: Semi-supervised domain adaptive YOLO for cross-domain object detection. Comput. Vision Image Understanding 229. doi: 10.1016/j.cviu.2023.103649
Zhu Y., Guo P., Wei H., Zhao X., and Wu X. (2023). “Disentangled discriminator for unsupervised domain adaptation on object detection,” in 2023 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). 5685–5691. doi: 10.1109/IROS55552.2023.10341878
Zwilgmeyer P., Yip M., Teigen A., Mester R., and Stahl A. (2021). “The VAROS synthetic underwater data set: Towards realistic multi-sensor underwater data with ground truth,” in Proceedings of the IEEE/CVF International Conference on Computer Vision. (Montreal, BC, Canada: IEEE) 3722–3730. doi: 10.1109/ICCVW54120.2021.00415
Keywords: benthic monitoring, object detection, unsupervised domain adaptation, synthetic images, benthic imaging
Citation: Doig H, Pizarro O and Williams S (2025) Training marine species object detectors with synthetic images and unsupervised domain adaptation. Front. Mar. Sci. 12:1581778. doi: 10.3389/fmars.2025.1581778
Received: 23 February 2025; Accepted: 18 June 2025;
Published: 11 July 2025.
Edited by:
Johannes Karstensen, Helmholtz Association of German Research Centres (HZ), Germany
Copyright © 2025 Doig, Pizarro and Williams. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
*Correspondence: Oscar Pizarro, oscar.pizarro@ntnu.no