AUTHOR=Jia Meijuan , Mao Xiaodong , Guo Shuai , Li Xin TITLE=Retrieval algorithm based on locally sensitive hash for ocean observation data JOURNAL=Frontiers in Marine Science VOLUME=Volume 12 - 2025 YEAR=2025 URL=https://www.frontiersin.org/journals/marine-science/articles/10.3389/fmars.2025.1534900 DOI=10.3389/fmars.2025.1534900 ISSN=2296-7745 ABSTRACT=As an important technology for eliminating redundant data, data deduplication significantly impacts today’s era of explosive data growth. In recent years, due to the rapid development of a series of related industries, such as ocean observation, ocean observation data has also shown a speedy growth trend, leading to the continuous increase in storage costs of ocean observation stations. Faced with the constant increase in data scale, our first consideration is to use data deduplication technology to reduce storage costs. While using duplicate data deletion technology to achieve our goals, we also need to pay attention to some of the actual situations of ocean observation stations. The fingerprint retrieval process in duplicate data deletion technology plays a key role in the entire process. Therefore, this paper proposes a fast retrieval strategy based on locally sensitive hashing. The fast retrieval algorithm based on locally sensitive hashing can enable us to quickly complete the retrieval process in duplicate data deletion technology and achieve the goal of saving computing resources. At the same time, we proposed a bucket optimization strategy for retrieval algorithms based on locally sensitive hashing. We utilized visual information to address the bottleneck problem in duplicate data deletion technology. At the end of the article, we conducted careful experiments to compare hash retrieval algorithms and concluded the strategy’s feasibility.