
ORIGINAL RESEARCH article

Front. Remote Sens.

Sec. Data Fusion and Assimilation

Volume 6 - 2025 | doi: 10.3389/frsen.2025.1703239

This article is part of the Research Topic: Advanced Artificial Intelligence for Remote Sensing: Methods and Applications in Earth and Environmental Monitoring.

Residual State-Space Networks with Cross-Scale Fusion for Efficient Underwater Vision Reconstruction

Provisionally accepted
Nei Xiong1, Yuhan Zhang2*
  • 1Capital Normal University, Beijing, China
  • 2Zunyi Medical University - Zhuhai Campus, Zhuhai, China

The final, formatted version of the article will be published soon.

Underwater vision is inherently difficult due to wavelength-dependent light absorption, non-uniform illumination, and scattering, which collectively degrade both perceptual quality and downstream task utility. We propose a novel architecture (ResMambaNet) that addresses these challenges through explicit decoupling of chromatic and structural cues, residual state-space modeling, and cross-scale feature alignment. Specifically, a dual-branch design processes RGB and Lab representations separately, promoting complementary recovery of color and spatial structure. A residual state-space module then unifies local convolutional priors with efficient long-range dependency modeling, avoiding the quadratic complexity of attention. Finally, a cross-attention-based fusion with adaptive normalization aligns multi-scale features for consistent restoration across diverse conditions. Experiments on standard benchmarks (EUVP and UIEB) show that the proposed approach establishes new state-of-the-art performance, improving colorfulness, contrast, and fidelity metrics by large margins while requiring only ∼0.5M parameters. These results demonstrate the effectiveness of residual state-space modeling as a principled framework for underwater image enhancement.
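
To make the design concrete, the sketch below outlines the three components the abstract describes: a dual-branch stem over RGB and Lab inputs, a residual state-space block that pairs a depthwise-convolutional local prior with a linear-time sequential scan, and a cross-attention fusion of the two branches. This is a minimal PyTorch illustration under stated assumptions, not the authors' implementation: the class names (ResMambaNetSketch, SimpleSSM, ResidualSSMBlock, CrossScaleFusion), all dimensions, the toy diagonal scan standing in for a selective state-space (Mamba-style) layer, and the plain LayerNorm standing in for adaptive normalization are all hypothetical.

```python
# Minimal sketch of the architecture described in the abstract.
# Module names, sizes, and the simplified diagonal scan are illustrative
# assumptions, not the authors' code.
import torch
import torch.nn as nn


class SimpleSSM(nn.Module):
    """Toy diagonal state-space scan: h_t = a*h_{t-1} + b*x_t, y_t = c*h_t.

    Stands in for the paper's residual state-space module; a real model
    would use a selective scan for efficient long-range modeling."""

    def __init__(self, dim):
        super().__init__()
        self.log_a = nn.Parameter(torch.zeros(dim))  # per-channel decay logits
        self.b = nn.Parameter(torch.ones(dim))
        self.c = nn.Parameter(torch.ones(dim))

    def forward(self, x):                            # x: (B, L, C)
        a = torch.sigmoid(self.log_a)                # keep recurrence stable in (0, 1)
        h = x.new_zeros(x.size(0), x.size(2))
        ys = []
        for t in range(x.size(1)):                   # naive O(L) sequential scan
            h = a * h + self.b * x[:, t]
            ys.append(self.c * h)
        return torch.stack(ys, dim=1)


class ResidualSSMBlock(nn.Module):
    """Local convolutional prior + state-space scan, joined by a residual path."""

    def __init__(self, dim):
        super().__init__()
        self.conv = nn.Conv2d(dim, dim, 3, padding=1, groups=dim)  # local prior
        self.norm = nn.LayerNorm(dim)
        self.ssm = SimpleSSM(dim)

    def forward(self, x):                            # x: (B, C, H, W)
        b, c, h, w = x.shape
        seq = self.conv(x).flatten(2).transpose(1, 2)  # pixels as a (B, H*W, C) sequence
        seq = self.ssm(self.norm(seq))
        return x + seq.transpose(1, 2).view(b, c, h, w)  # residual connection


class CrossScaleFusion(nn.Module):
    """One branch's features query the other's via cross-attention, then a
    plain LayerNorm stands in for the paper's adaptive normalization."""

    def __init__(self, dim, heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, q_feat, kv_feat):              # both: (B, C, H, W)
        b, c, h, w = q_feat.shape
        q = q_feat.flatten(2).transpose(1, 2)
        kv = kv_feat.flatten(2).transpose(1, 2)
        fused, _ = self.attn(self.norm(q), kv, kv)
        return (q + fused).transpose(1, 2).view(b, c, h, w)


class ResMambaNetSketch(nn.Module):
    """Dual-branch (RGB + Lab) stem, residual SSM blocks, cross-attention fusion."""

    def __init__(self, dim=16):
        super().__init__()
        self.rgb_in = nn.Conv2d(3, dim, 3, padding=1)
        self.lab_in = nn.Conv2d(3, dim, 3, padding=1)
        self.rgb_blocks = nn.Sequential(ResidualSSMBlock(dim), ResidualSSMBlock(dim))
        self.lab_blocks = nn.Sequential(ResidualSSMBlock(dim), ResidualSSMBlock(dim))
        self.fuse = CrossScaleFusion(dim)
        self.out = nn.Conv2d(dim, 3, 3, padding=1)

    def forward(self, rgb, lab):                     # lab: precomputed Lab image
        rgb_f = self.rgb_blocks(self.rgb_in(rgb))
        lab_f = self.lab_blocks(self.lab_in(lab))
        return torch.sigmoid(self.out(self.fuse(rgb_f, lab_f)))


if __name__ == "__main__":
    rgb = torch.rand(1, 3, 32, 32)
    lab = torch.rand(1, 3, 32, 32)  # stand-in; convert RGB to Lab in practice
    print(ResMambaNetSketch()(rgb, lab).shape)  # torch.Size([1, 3, 32, 32])
```

The sequential loop in SimpleSSM is O(L) in sequence length, which is the property the abstract contrasts with the O(L²) cost of self-attention; a practical implementation would replace it with a hardware-efficient selective scan.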

Keywords: Underwater vision reconstruction, residual state-space modeling, dual-branch feature decoupling, cross-attention fusion, Computational Imaging

Received: 11 Sep 2025; Accepted: 15 Oct 2025.

Copyright: © 2025 Xiong and Zhang. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

* Correspondence: Yuhan Zhang, zhangyuhan_tj@foxmail.com

Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.