ORIGINAL RESEARCH article

Front. Med.

Sec. Gastroenterology

Volume 12 - 2025 | doi: 10.3389/fmed.2025.1655612

This article is part of the Research Topic: Advancing Gastrointestinal Disease Diagnosis with Interpretable AI and Edge Computing for Enhanced Patient Care

Comparative Analysis of Optimized Logistic Regression with State-of-the-Art Models for Complex Gastroenterological Image Analysis

Provisionally accepted
  • 1 George Emil Palade University of Medicine, Pharmacy, Sciences and Technology of Târgu Mureş, Târgu Mureş, Romania
  • 2 Universitatea Babes-Bolyai, Cluj-Napoca, Romania

The final, formatted version of the article will be published soon.

To classify gastrointestinal (GI) polyps detected in colonoscopy images, an essential task in colorectal cancer prevention, we optimised multiclass Logistic Regression (LR), a machine learning (ML) algorithm that physicians favour for its interpretability. Given the diagnostic ambiguity of serrated polyps, which share features with both hyperplastic and adenomatous lesions, we focused on multiclass classification using a structured dataset of 152 instances and 698 extracted features. A key objective was to explore hyperparameter tuning approaches for ML models. We proposed a decision rule to guide model selection and conducted a statistical analysis of 88 distinct LR configurations, varying solvers, penalties, and regularisation strengths. The best-performing LR model (liblinear solver, L1 penalty, C = 0.01) achieved 70.39% accuracy, exceeding the average accuracy of physician evaluators (experts: 65.00%, beginners: 58.42%). To improve classification performance further, four additional ML algorithms were implemented and benchmarked: k-Nearest Neighbours (kNN), Support Vector Machine (SVM), Random Forest (RF), and XGBoost. For each classifier, hyperparameters were tuned using grid search, and stratified cross-validation was performed. In the multiclass setting, XGBoost achieved the highest macro-average F1-score (0.88) and overall accuracy (89%), followed by Random Forest (F1 = 0.85, accuracy = 86%), SVM (F1 = 0.83, accuracy = 84%), and kNN (F1 = 0.56, accuracy = 66%).
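
The abstract reports both overall accuracy and macro-average F1-score, and the two can diverge noticeably (for kNN, 66% versus 0.56). One way such a gap can arise is when a small class is rarely predicted correctly: accuracy is dominated by the larger classes, whereas macro-averaging weights every class equally. The following is a minimal sketch of both metrics in plain Python. The class names and toy labels are illustrative assumptions, not the study's data, and the helper functions are hypothetical rather than the authors' code.

```python
def per_class_f1(y_true, y_pred, label):
    """F1-score for one class, treating it as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    if tp == 0:
        # No true positives: F1 is conventionally taken as 0.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over the classes present in y_true."""
    labels = sorted(set(y_true))
    return sum(per_class_f1(y_true, y_pred, c) for c in labels) / len(labels)


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Toy, made-up labels (assumption): an imbalanced three-class problem in
# which the single "serrated" case is misclassified as "adenoma".
y_true = ["adenoma"] * 6 + ["hyperplastic"] * 3 + ["serrated"]
y_pred = ["adenoma"] * 6 + ["hyperplastic"] * 2 + ["adenoma", "adenoma"]

print(f"accuracy = {accuracy(y_true, y_pred):.2f}")
print(f"macro F1 = {macro_f1(y_true, y_pred):.2f}")
```

In this toy case the classifier gets eight of ten cases right, yet the macro-average F1 is much lower because the missed minority class contributes an F1 of zero to the mean. This is why reporting macro-average F1 alongside accuracy is informative for imbalanced multiclass problems.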

Keywords: Gastrointestinal polyps, Colorectal disease, machine learning, Logistic regression algorithm, Multinomial classifier, random forest, k-nearest neighbours, Support vector machine

Received: 28 Jun 2025; Accepted: 20 Oct 2025.

Copyright: © 2025 Cristea, Sima and Iantovics. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

* Correspondence: Laszlo Barna Iantovics, barna.iantovics@umfst.ro

Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.