ORIGINAL RESEARCH article

Front. Artif. Intell.

Sec. Machine Learning and Artificial Intelligence

Volume 8 - 2025 | doi: 10.3389/frai.2025.1572433

This article is part of the Research Topic: Artificial Intelligence in Visual Inspection

HMA-Net: A Hybrid Mixer Framework with Multihead Attention for Breast Ultrasound Image Segmentation

Provisionally accepted
Soumya Sara Koshy, JANI ANBARASI L*
  • Vellore Institute of Technology (VIT), Chennai, India

The final, formatted version of the article will be published soon.

Breast cancer is a severe illness that predominantly affects women and is often fatal if left undetected. Early detection can significantly reduce the associated mortality rate. Ultrasound imaging is widely used to detect the disease, and segmenting breast ultrasound images aids in identifying and localizing tumors, thereby improving detection accuracy. Numerous computer-aided methods have been proposed for this segmentation task. This work proposes a deep learning architecture for segmenting breast ultrasound images that combines a ConvMixer-based encoder and a ConvNeXt-based decoder with convolution-enhanced multihead attention. The enhanced ConvMixer modules use spatial filtering and channel-wise integration to capture local and global contextual features efficiently, and they improve feature relevance, and thus segmentation accuracy, through dynamic channel recalibration and residual connections. The attention-based bottleneck uses multihead attention to capture long-range dependencies, enabling the model to focus on relevant features across distinct regions. The enhanced ConvNeXt modules with squeeze-and-excitation use depthwise convolution for efficient spatial filtering, layer normalization to stabilize training, and residual connections to preserve relevant features for accurate segmentation. The model is trained with a combined loss function that integrates binary cross-entropy and Dice loss. Comprehensive experiments on two breast ultrasound datasets, BUSI and BrEaST, show that the proposed model performs well in segmenting intricate structures. It achieved Jaccard indices of 98.04% and 94.84% and Dice similarity coefficients of 99.01% and 97.35% on the BUSI and BrEaST datasets, respectively.
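The abstract names a combined binary cross-entropy and Dice loss and reports results as Jaccard index and Dice similarity coefficient. The following is a minimal pure-Python sketch of those quantities on flattened masks, not the authors' implementation. The function names, the smoothing constant `eps`, and the equal weighting `bce_weight=0.5` are assumptions made for illustration; the abstract does not state how the two loss terms are weighted.

```python
import math


def dice_coefficient(pred, target, eps=1e-7):
    """Dice coefficient between two flat sequences of values in [0, 1]."""
    intersection = sum(p * t for p, t in zip(pred, target))
    return (2.0 * intersection + eps) / (sum(pred) + sum(target) + eps)


def jaccard_index(pred, target, eps=1e-7):
    """Jaccard index (intersection over union) for the same inputs."""
    intersection = sum(p * t for p, t in zip(pred, target))
    union = sum(pred) + sum(target) - intersection
    return (intersection + eps) / (union + eps)


def binary_cross_entropy(pred, target, eps=1e-7):
    """Mean per-pixel binary cross-entropy; probabilities are clipped for stability."""
    total = 0.0
    for p, t in zip(pred, target):
        p = min(max(p, eps), 1.0 - eps)
        total -= t * math.log(p) + (1.0 - t) * math.log(1.0 - p)
    return total / len(pred)


def combined_loss(pred, target, bce_weight=0.5):
    """Weighted sum of BCE and Dice loss (the weighting is an assumption)."""
    dice_loss = 1.0 - dice_coefficient(pred, target)
    bce = binary_cross_entropy(pred, target)
    return bce_weight * bce + (1.0 - bce_weight) * dice_loss


# Toy example: a flattened 2x3 ground-truth mask and predicted probabilities.
target = [0.0, 1.0, 1.0, 0.0, 1.0, 0.0]
probs = [0.1, 0.9, 0.8, 0.2, 0.7, 0.1]
hard = [1.0 if p >= 0.5 else 0.0 for p in probs]  # threshold at 0.5

loss = combined_loss(probs, target)
dice = dice_coefficient(hard, target)
jaccard = jaccard_index(hard, target)
```

For a single pair of binary masks the two metrics are tied by J = D / (2 - D). The reported pairs are consistent with this relation (0.9901 / 1.0099 ≈ 0.9804 and 0.9735 / 1.0265 ≈ 0.9484), although values averaged over a dataset need not satisfy it exactly.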

Keywords: breast cancer, deep learning, ultrasound images, segmentation, ConvNeXt

Received: 07 Feb 2025; Accepted: 12 May 2025.

Copyright: © 2025 Sara Koshy and ANBARASI L. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

* Correspondence: JANI ANBARASI L, Vellore Institute of Technology (VIT), Chennai, India

Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.