ORIGINAL RESEARCH article

Front. Plant Sci.

Sec. Sustainable and Intelligent Phytoprotection

RAM-UNet: An Improved U-Net–Based Semantic Segmentation Model for the Main Stem of Mature Soybean Plants

  • Jilin Agricultural University, Changchun, China

The final, formatted version of the article will be published soon.

Abstract

The main stem is the key structure connecting the vegetative and reproductive organs of soybean plants, and its morphological parameters serve as core phenotypic indicators for evaluating plant growth, lodging resistance, and yield potential. At the mature stage, the main stem closely resembles the pods in color and texture, is often strongly curved, and is severely occluded by pods and leaves, which makes accurate and continuous extraction challenging for conventional segmentation methods. To address this, this study proposes RAM-UNet, a high-precision semantic segmentation model based on an improved U-Net architecture. The model adopts ResNet50 as the backbone and replaces standard convolutions with deformable convolutions to capture curved stem morphology and improve feature extraction for low-contrast edges. In the encoder, the Convolutional Block Attention Module (CBAM) is combined with an improved atrous spatial pyramid pooling (ASPP) module (C-ASPP) that uses four dilation rates, enhancing multi-scale feature representation compared with the original three-rate design. A multi-scale attention aggregation (MSAA) module in the decoder improves the continuity and integrity of stem boundaries. During training, a composite loss function combining Dice loss and cross-entropy loss is employed to mitigate foreground pixel sparsity. Experimental results on a self-constructed dataset show that RAM-UNet achieves a mean Intersection over Union (mIoU) of 90.58%, with Recall and Precision reaching 94.99% and 94.58%, respectively. Compared with U-Net, DeepLabv3+, PSPNet, and SegNet, RAM-UNet improves mIoU by 6.41%, 10.51%, 22.41%, and 17.37%, respectively. Automatically measured stem lengths show high agreement with manual measurements (R² = 0.9746), validating practical applicability. RAM-UNet also generalizes well to the public PASCAL VOC 2012 dataset, achieving an mIoU of 73.14%. The results indicate that the proposed model enables high-precision and continuous segmentation of main stems in mature soybean plants, providing an effective technical solution for automated and nondestructive measurement of crop phenotypic parameters.
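
The composite loss and the mIoU metric mentioned in the abstract can be illustrated with a minimal, dependency-free sketch. This is not the authors' implementation: the function names `dice_ce_loss` and `mean_iou`, the equal weighting `ce_weight=0.5`, the smoothing constant `eps`, and the two-class (stem vs. background) setup are all assumptions made for illustration, and masks are treated as flat lists of pixels.

```python
import math


def dice_ce_loss(probs, targets, ce_weight=0.5, eps=1e-7):
    """Composite loss for a binary mask: ce_weight * CE + (1 - ce_weight) * Dice.

    probs   -- predicted foreground probabilities, flat list in [0, 1]
    targets -- ground-truth labels, flat list of 0/1
    The weighting is an assumption; the abstract does not state it.
    """
    # Binary cross-entropy averaged over all pixels (clamped to avoid log(0)).
    ce = -sum(
        t * math.log(max(p, eps)) + (1 - t) * math.log(max(1 - p, eps))
        for p, t in zip(probs, targets)
    ) / len(probs)
    # Soft Dice: overlap relative to total predicted plus true foreground.
    # It is normalized by foreground size, so sparse foreground still counts.
    inter = sum(p * t for p, t in zip(probs, targets))
    dice = (2 * inter + eps) / (sum(probs) + sum(targets) + eps)
    return ce_weight * ce + (1 - ce_weight) * (1 - dice)


def mean_iou(pred, truth, num_classes=2):
    """Mean IoU over the classes present in the prediction or ground truth."""
    ious = []
    for c in range(num_classes):
        inter = sum(1 for p, t in zip(pred, truth) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, truth) if p == c or t == c)
        if union:  # skip classes absent from both masks
            ious.append(inter / union)
    return sum(ious) / len(ious)
```

In a real training pipeline the same quantities would be computed on tensors per batch; the list-based form here is only meant to make the arithmetic behind the two terms explicit.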

Keywords

attention mechanism, deformable convolution, mature soybean plants, multi-scale feature fusion, semantic segmentation

Received

02 January 2026

Accepted

17 February 2026

Copyright

© 2026 Zhu, Li, Fu, Li and Feng. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Yuxuan Feng

Disclaimer

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.
