ORIGINAL RESEARCH article

Front. Mar. Sci.

Sec. Ocean Observation

Volume 12 - 2025 | doi: 10.3389/fmars.2025.1469396

Assisting human annotation of marine images with foundation models

Provisionally accepted
  • ¹ Monterey Bay Aquarium Research Institute (MBARI), Moss Landing, United States
  • ² National Oceanography Centre, Southampton, United Kingdom
  • ³ CVision AI, Medford, MA, United States

The final, formatted version of the article will be published soon.

Marine scientists have been leveraging supervised machine learning (ML) algorithms to analyze image and video data for nearly two decades. Despite many advances, the cost of generating expert human annotations to train new models remains extremely high. There is broad recognition in both computer science and the domain sciences that generating training data remains the major bottleneck when developing ML models for targeted tasks. Increasingly, rather than attempting to produce highly optimized models from general annotation frameworks, computer scientists are focusing on adaptation strategies to tackle new data challenges. Taking inspiration from large language models, computer vision researchers are now thinking in terms of "foundation models" that can yield reasonable zero- and few-shot detection and segmentation performance with human prompting. Here we consider the utility of this approach for ocean imagery, leveraging Meta's Segment Anything Model to enrich ocean image annotations based on existing labels. This workflow yields promising results, especially for modernizing existing data repositories. Moreover, it suggests that future human annotation efforts could use foundation models to speed progress towards a sufficient training set to address domain-specific problems.

Keywords: foundation model, marine imagery, segmentation, object detection, human-in-the-loop

Received: 23 Jul 2024; Accepted: 25 Jun 2025.

Copyright: © 2025 Orenstein, Woodward, Lundsten, Barnard, Schlining and Katija. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

* Correspondence: Eric Orenstein, Monterey Bay Aquarium Research Institute (MBARI), Moss Landing, United States

Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.