ORIGINAL RESEARCH article
Front. Bioinform.
Sec. Computational BioImaging
ZR2ViM: A Recursive Vision Mamba Model for Boundary-Preserving Medical Image Segmentation
Provisionally accepted
1 Sichuan University of Science and Engineering, Zigong, China
2 Zigong First People's Hospital, Zigong, China
Introduction: Medical image segmentation is fundamental to quantitative disease analysis and therapeutic decision-making. However, constrained by limited computational resources, existing deep learning methods often struggle to simultaneously model long-range dependencies and preserve boundary precision, particularly when delineating structures with complex morphology or blurred edges. Method: To overcome these challenges, we propose ZR2ViM, a recursion-enhanced visual state space model designed for medical image segmentation. ZR2ViM augments the Vision Mamba framework with a Zigzag Recursive Reinforced (ZR2) Block that incorporates Stacked State Redistribution (SSR) and a Nested Recursive Connection (NRC). The NRC employs dual inner and outer pathways to iteratively fuse local details with global context while preserving 2D spatial adjacency. Furthermore, a Cross-directional Zigzag WKV (CZ-WKV) module executes multi-step recursive updates along multiple zigzag trajectories, injecting spatial directional information via Quad-Directional Token Shift (Q-Shift) priors. Collectively, these mechanisms mitigate serialization-induced banding artifacts and enhance the representation of fine, elongated, and low-contrast structures, all while maintaining near-linear computational complexity. Results: Comprehensive evaluations across four medical imaging domains—spanning dermatoscopic images, breast ultrasound, colorectal polyps, and abdominal multi-organ CT—on five public datasets demonstrate that ZR2ViM consistently outperforms representative convolutional, attention-based, and visual state space architectures in both region consistency and boundary localization. Notably, ZR2ViM achieves a 2.15 mm reduction in HD95 on the Synapse multi-organ CT dataset relative to the CC-ViM baseline, substantiating its superior capability for precise, clinically relevant boundary delineation. Conclusion: The ZR2ViM framework delivers accurate, boundary-preserving segmentation across diverse imaging modalities and anatomically complex structures, achieving these gains with near-linear computational complexity. These findings demonstrate that ZR2ViM offers a robust and efficient solution for medical image analysis, establishing a promising foundation for advanced clinical and research applications.
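The abstract attributes the mitigation of serialization-induced banding artifacts to zigzag scanning, which keeps consecutive tokens spatially adjacent when a 2D feature map is flattened into a 1D sequence. The sketch below is only a minimal illustration of that generic idea (a boustrophedon raster over an H×W grid); it is not the paper's actual CZ-WKV module, whose multiple cross-directional trajectories are not specified in the abstract.

```python
import numpy as np

def zigzag_order(h: int, w: int) -> np.ndarray:
    """Flattening order for a boustrophedon (zigzag) raster scan.

    Even rows are read left-to-right and odd rows right-to-left, so
    every pair of consecutive tokens in the 1D sequence is adjacent
    in the 2D grid -- unlike a plain raster scan, which jumps from the
    end of one row to the start of the next.
    """
    idx = np.arange(h * w).reshape(h, w)
    idx[1::2] = idx[1::2, ::-1]  # reverse every other row
    return idx.ravel()

# For a 3x4 grid, token 3 (end of row 0) is followed by token 7
# (directly below it), preserving spatial adjacency across rows.
print(zigzag_order(3, 4))  # [ 0  1  2  3  7  6  5  4  8  9 10 11]
```

A sequence model scanning tokens in this order never sees a large spatial jump between neighboring sequence positions, which is the property the zigzag trajectories exploit.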
Keywords: Boundary preservation, deep learning, Medical image segmentation, State space models, vision mamba, Zigzag scanning
Received: 18 Dec 2025; Accepted: 29 Jan 2026.
Copyright: © 2026 Hua, Xiang, Li and Zhou. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
* Correspondence:
Liuying Li
Xia Zhou
Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.
