ORIGINAL RESEARCH article

Front. Comput. Sci.

Sec. Computer Vision

Volume 7 - 2025 | doi: 10.3389/fcomp.2025.1569017

This article is part of the Research Topic: Foundation Models for Healthcare: Innovations in Generative AI, Computer Vision, Language Models, and Multimodal Systems

Heterogeneous ensemble learning: modified ConvNextTiny for detecting molecular expression of breast cancer on standard biomarkers

Provisionally accepted
Indo Intan1*, Sitti Harlina1, Andrea Stevens Karnyoto2, Berti Julian Nelwan3, Devin Setiawan4, Amalia Yamin5, Ririn Endah Puspitasari3
  • 1Department of Informatics Engineering, Dipa Makassar University, Makassar, Indonesia
  • 2Computer Science Department, BINUS Graduate Program-Master of Computer Science, Faculty of Engineering, Binus University, West Jakarta, Jakarta, Indonesia
  • 3Department of Anatomical Pathology, Faculty of Medicine, Hasanuddin University, Makassar, Indonesia
  • 4Department of Electrical Engineering and Computer Science, School of Engineering, University of Kansas, Lawrence, Kansas, United States
  • 5Laboratory of Anatomical Pathology, Wahidin Sudirohusodo General Hospital, Makassar, Indonesia

The final, formatted version of the article will be published soon.

Breast cancer ranks first among cancer types by incidence, with 2.3 million new cases diagnosed each year. Immunohistochemistry (IHC) is the gold standard for determining the expression of cancer malignancy in patients, with the ultimate goal of guiding prognosis and therapy. IHC relies on the four WHO standard biomarkers: estrogen receptor (ER), progesterone receptor (PR), human epidermal growth factor receptor-2 (HER-2), and Ki-67. Assessment of the four biomarkers focuses on the quantity of cell nuclei and the intensity of brown staining of cell membranes. Our study aims to detect the expression of breast cancer malignancy as an initial step in determining prognosis and therapy. We implemented homogeneous and heterogeneous ensemble learning models. The homogeneous ensemble learning model uses the majority vote technique to select the best-performing model among Xception, ResNet50V2, InceptionResNet50V2, and ConvNextTiny. The heterogeneous ensemble learning model builds on ConvNextTiny, the best-performing model. Feature engineering in ConvNextTiny fuses convolutional features with cell-quantification features. With this feature fusion, ConvNextTiny can detect the expression of cancer malignancy. Heterogeneous ensemble learning outperforms homogeneous ensemble learning. The model achieves accuracy, precision, recall, F1-score, and ROC-AUC of 0.997, 0.973, 0.991, 0.982, and 0.994, respectively. These results indicate that the model classifies malignancy expression of breast cancer well. The model still needs to be configured with a visual laboratory device to test its real-time capabilities.
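The two mechanisms named in the abstract, majority voting across models and fusion of convolutional with cell-quantification features, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `majority_vote` and `fuse_features`, the array shapes, and the tie-breaking rule are assumptions made for illustration, and the abstract does not specify how votes are combined or how the features are fused.

```python
import numpy as np


def majority_vote(predictions: np.ndarray) -> np.ndarray:
    """Combine per-model class labels by plurality vote (illustrative only).

    predictions: integer array of shape (n_models, n_samples), one row per
    model. Ties are broken toward the lowest class index, which is an
    assumption of this sketch, not something stated in the article.
    """
    n_classes = int(predictions.max()) + 1
    # counts[c, s] = number of models that predicted class c for sample s
    counts = np.stack([(predictions == c).sum(axis=0) for c in range(n_classes)])
    return counts.argmax(axis=0)


def fuse_features(conv_features: np.ndarray, cell_features: np.ndarray) -> np.ndarray:
    """Fuse two feature sets by concatenating along the feature axis.

    conv_features: (n_samples, d_conv), e.g. pooled CNN embeddings.
    cell_features: (n_samples, d_cell), e.g. hypothetical nuclei counts or
    stain-intensity statistics. Concatenation is one common fusion choice,
    assumed here for illustration.
    """
    return np.concatenate([conv_features, cell_features], axis=1)


# Hypothetical labels from four models for three image patches.
votes = np.array([
    [0, 1, 2],
    [0, 1, 1],
    [1, 1, 2],
    [0, 2, 2],
])
ensemble_labels = majority_vote(votes)

# Hypothetical feature matrices for two patches.
fused = fuse_features(np.zeros((2, 4)), np.ones((2, 3)))
```

In this sketch the fused matrix would feed a classification head; whether the article fuses features by concatenation or by another operator is not stated in the abstract.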

Keywords: breast cancer, ConvNextTiny, ensemble learning, Canny, Otsu, IHC, ER/PR/Ki-67, HER-2

Received: 31 Jan 2025; Accepted: 24 Jun 2025.

Copyright: © 2025 Intan, Harlina, Karnyoto, Nelwan, Setiawan, Yamin and Puspitasari. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

* Correspondence: Indo Intan, Department of Informatics Engineering, Dipa Makassar University, Makassar, Indonesia

Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.