ORIGINAL RESEARCH article

Front. Artif. Intell.

Sec. Machine Learning and Artificial Intelligence

Optimized Ensemble Machine Learning Model for Cyberattack Classification in Industrial IoT

Provisionally accepted
Batool Alabdullah, Suresh Sankaranarayanan*
  • Department of Computer Science, King Faisal University, Al Hofuf, Saudi Arabia

The final, formatted version of the article will be published soon.

The increasing cyber threats targeting Industrial Control Systems (ICS) and the Internet of Things (IoT) pose significant risks, especially in critical infrastructures such as the oil and gas sector. Existing machine learning (ML) approaches for cyberattack detection often rely on binary classification and lack computational efficiency. This study proposes two optimized stacked ensemble models to enhance attack detection accuracy while reducing computational overhead. The main contribution of our approach lies in the strategic selection and integration of diverse base models, chosen to address the characteristics of security datasets, such as class imbalance, noise, and complex attack patterns. This curated combination aims to leverage different decision boundaries and learning mechanisms, thereby enhancing the ensemble's ability to detect sophisticated threats. The proposed model combines linear and non-linear classifiers (Logistic Regression, Extra Tree Classifier, XGBoost, and LightGBM) as base learners, with a Random Forest Classifier (RFC) as the final estimator for improved generalization. Compared with traditional models such as Logistic Regression, Decision Tree (DT), RFC, Bagging, Boosting (XGBoost, CatBoost, LightGBM), and Extra Tree Classifier, our Stacked Ensemble_2 model achieves 97% accuracy, with computation times of 54 and 1 minute for training and testing, respectively. Stacked Ensemble_2 has also been evaluated against the traditional Stacked Ensemble_1, which integrates two boosting algorithms (AdaBoost and XGBoost) as base learners with RFC as the final estimator. In addition, both stacked ensemble models have been tested for generalization on the CICDOS 2017 dataset, where the best-performing Stacked Ensemble_2 achieved an accuracy of 100% with an AUROC of 99% on new, unseen data. This approach provides a scalable, real-time detection mechanism for securing ICS and IoT environments.

Keywords: Cyberattack, ensemble learning, Industrial control systems (ICS), Industrial Internet of Things (IIoT), Internet of Things (IoT), machine learning, malicious behavior, oil and gas

Received: 13 Aug 2025; Accepted: 03 Dec 2025.

Copyright: © 2025 Alabdullah and Sankaranarayanan. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

* Correspondence: Suresh Sankaranarayanan

Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.