ORIGINAL RESEARCH article
Front. Remote Sens.
Sec. Data Fusion and Assimilation
Volume 6 - 2025 | doi: 10.3389/frsen.2025.1703239
This article is part of the Research Topic: Advanced Artificial Intelligence for Remote Sensing: Methods and Applications in Earth and Environmental Monitoring.
Residual State-Space Networks with Cross-Scale Fusion for Efficient Underwater Vision Reconstruction
Provisionally accepted
Affiliations: 1 Capital Normal University, Beijing, China; 2 Zunyi Medical University - Zhuhai Campus, Zhuhai, China
Underwater vision is inherently difficult due to wavelength-dependent light absorption, nonuniform illumination, and scattering, which collectively reduce both perceptual quality and task utility. We propose a novel architecture (ResMambaNet) that addresses these challenges through explicit decoupling of chromatic and structural cues, residual state-space modeling, and cross-scale feature alignment. Specifically, a dual-branch design separately processes RGB and Lab representations, promoting complementary recovery of color and spatial structures. A residual state-space module is then employed to unify local convolutional priors with efficient long-range dependency modeling, avoiding the quadratic complexity of attention. Finally, a cross-attention–based fusion with adaptive normalization aligns multi-scale features for consistent restoration across diverse conditions. Experiments on standard benchmarks (EUVP and UIEB) show that the proposed approach establishes new state-of-the-art performance, improving colorfulness, contrast, and fidelity metrics by large margins, while maintaining only ∼0.5M parameters. These results demonstrate the effectiveness of residual state-space modeling as a principled framework for underwater image enhancement.
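The abstract's core claim is that a residual state-space module combines local convolutional priors with long-range sequence mixing at linear cost, avoiding the quadratic complexity of self-attention. The sketch below illustrates that idea in miniature on a 1-D signal: a causal local convolution, followed by a diagonal linear state-space recurrence, wrapped in a residual skip connection. All function names, the kernel, and the recurrence coefficients are illustrative assumptions for exposition, not the authors' implementation.

```python
def depthwise_conv1d(seq, kernel):
    """Causal 1-D convolution over a scalar sequence (local prior)."""
    out = []
    for t in range(len(seq)):
        acc = 0.0
        for i, w in enumerate(kernel):
            if t - i >= 0:
                acc += w * seq[t - i]
        out.append(acc)
    return out

def ssm_scan(seq, a=0.9, b=1.0, c=1.0):
    """Diagonal state-space recurrence: h_t = a*h_{t-1} + b*x_t, y_t = c*h_t.

    One pass over the sequence gives every output access to the full past,
    i.e. long-range dependency modeling in linear time.
    """
    h, out = 0.0, []
    for x in seq:
        h = a * h + b * x
        out.append(c * h)
    return out

def residual_ssm_block(seq, kernel=(0.5, 0.3, 0.2)):
    """Residual state-space block: local conv -> SSM scan -> skip add."""
    local = depthwise_conv1d(seq, kernel)
    mixed = ssm_scan(local)
    return [x + y for x, y in zip(seq, mixed)]

# An impulse input shows both the residual identity path and the
# slowly decaying long-range response of the state-space scan.
print(residual_ssm_block([1.0, 0.0, 0.0, 0.0]))
```

In the paper's setting the same pattern would operate per channel on flattened image features, with learned parameters in place of the fixed constants used here; the residual path preserves the input signal while the scan injects global context.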
Keywords: underwater vision reconstruction, residual state-space modeling, dual-branch feature decoupling, cross-attention fusion, computational imaging
Received: 11 Sep 2025; Accepted: 15 Oct 2025.
Copyright: © 2025 Xiong and Zhang. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
* Correspondence: Yuhan Zhang, zhangyuhan_tj@foxmail.com
Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.