- 1 Escuela de Ingeniería en Ciberseguridad, FICA, Universidad de Las Américas, Quito, Ecuador
- 2 Escuela de Informática y Telecomunicaciones, Universidad Diego Portales, Santiago, Chile
The growing interconnection of industrial devices in IIoT networks has significantly increased the exposure of critical infrastructures to sophisticated cyberattacks, including 0-day threats, sensor spoofing, and lateral propagation. Conventional intrusion detection systems, based on static rules or supervised learning, often fail to generalize to unknown patterns and lack adaptability in decentralized edge environments. Moreover, most AI-based approaches do not offer real-time interpretability, hindering their deployment in regulated and auditable industrial contexts. This work proposes an autonomous and distributed defense system for IIoT networks based on Deep Deterministic Policy Gradient agents deployed at the edge, coordinated through asynchronous federated learning. Each agent performs local inference using traffic features extracted in real time, such as entropy, command frequency, and inter-packet time, and integrates an embedded SHAP-based XAI module for real-time explainability. The model is trained in an open-world setting, excluding entire attack classes during training to simulate realistic 0-day conditions. Experimental validation using the TON_IoT and N-BaIoT datasets demonstrates that the system maintains a detection F1-score of 92.0%, a false positive rate of 4.1%, and an inference latency of 182 ms under multi-node attack conditions. The federated architecture ensures robustness and model continuity even with unstable node participation, while the embedded interpretability mechanism enables on-site auditability and decision traceability.
1 Introduction
The large-scale deployment of cyber-physical devices across industrial plants, production lines, and critical infrastructures has accelerated the adoption of the Industrial Internet of Things (IIoT). In these environments, sensors, actuators, programmable controllers, and monitoring platforms interoperate through distributed and modular architectures to improve efficiency and reduce downtime (Safa et al., 2023). However, this same interconnection has expanded the attack surface of Industrial Control Systems (ICS), which now face sophisticated threats, including sensor spoofing, lateral propagation, and 0-day attacks. These attacks exploit previously unknown vulnerabilities and remain undetected by traditional intrusion detection systems (IDS) based on signatures, static rules, or centralized monitoring mechanisms (Serhane et al., 2023).
The increasing complexity and velocity of industrial threats highlight the need for autonomous, adaptive, and explainable defense mechanisms that operate directly on edge devices. Detecting 0-day attacks, events with no historical representation, remains particularly challenging, requiring inference models that generalize beyond previously observed patterns while ensuring interpretability for operational auditing. Regulatory frameworks such as ISA/IEC 62443, ISO/IEC 27001, and NIST SP 800-82 further mandate decision traceability and system transparency in industrial environments (Madsen et al., 2023; Djebbar and Nordstrom, 2023).
To address these challenges, this work proposes an autonomous defense system for IIoT networks based on distributed intelligent agents trained via Deep Deterministic Policy Gradient (DDPG) (Shruthi and Siddesh, 2023). Each agent processes temporally segmented traffic windows in real time, extracting discriminative indicators such as packet entropy, industrial command frequency, inter-packet timing, and physical process variations (Suhail et al., 2023). These metrics enable continuous-action inference with latencies below 200 ms, allowing timely containment of unfolding attacks, including multi-node lateral propagation. Unlike prior intrusion detection approaches that consider reinforcement learning, federated learning, or explainable artificial intelligence as isolated or loosely coupled enhancements, the proposed system integrates these paradigms into a single operational pipeline. Autonomous reinforcement learning agents perform real-time detection and mitigation at the edge, federated learning enables synchronized model adaptation without sharing raw traffic data, and an embedded explainability module provides on-device, real-time decision transparency directly on resource-constrained edge hardware.
A key component of the architecture is an asynchronous federated synchronization mechanism. Agents periodically share model weights with a coordination server, which aggregates them using weighted Federated Averaging (Chen et al., 2022). This design preserves data locality, mitigates unstable contributions from compromised nodes, and allows the global model to evolve without requiring centralized data collection or uninterrupted connectivity. In addition, each agent incorporates an embedded interpretability layer using SHAP, enabling real-time, locally grounded explanations aligned with industrial auditing requirements. This framework integrates SHAP-based interpretability directly into the inference cycle of federated reinforcement learning agents while maintaining sub-200 ms reaction times under realistic IIoT resource constraints.
System performance was validated in an experimental environment built on TON_IoT (Moustafa, 2019) and N-BaIoT (Meidan et al., 2018), executed on resource-constrained edge hardware (ARM Cortex-A72, 4 GB RAM). During training, entire attack classes were excluded to simulate realistic 0-day conditions. The system achieved a macro F1-score of 92.0%, a false-positive rate of 4.1%, and an inference latency of 182 ms. Agents also produced SHAP-based explanations correlating their decisions with temporal variations in key features (Mallampati and Hari, 2023), improving operational traceability and facilitating inspection by human operators. Energy use was not measured directly; real-time feasibility on ARM edge devices was assessed through computational load, CPU utilization, memory footprint, and inference latency, complemented by indirect per-inference energy estimates.
Operational tests further demonstrated the system’s resilience. In attacks combining lateral propagation and sensor spoofing, containment was achieved in under 250 ms, isolated nodes were reintegrated within 8.2 s, and unaffected nodes preserved 100% service continuity. Compared with rule-based methods, supervised classifiers, and non-federated variants, the proposed system showed superior adaptability, robustness, and independence from centralized oversight.
The main contribution of this work is to demonstrate that a fully distributed intrusion detection and response architecture can operate effectively under realistic IIoT constraints by integrating reinforcement learning, federated learning, and explainable artificial intelligence into a single operational pipeline. Autonomous agents based on deterministic deep-policy learning perform real-time detection and mitigation at the edge, asynchronous federated aggregation enables coordinated model adaptation without sharing raw data, and an embedded SHAP-based explainability module provides transparent, on-device decision traces. The experimental validation on open datasets and resource-constrained hardware demonstrates the viability of this unified architecture for real-world cyber-physical infrastructures.
This paper is organized as follows: Section 2 reviews research on 0-day detection, autonomous agents, and dataset limitations. Section 3 presents the proposed architecture, agent modeling, datasets, and methodology. Section 4 reports the experimental results. Section 5 provides a detailed discussion, and Section 6 concludes the work and outlines future research directions.
2 Literature review
Protecting Industrial Internet of Things (IIoT) networks against 0-day attacks remains one of the most complex challenges in cyber-physical security. The absence of known signatures, the heterogeneity of industrial devices, and the dynamic nature of operational traffic reduce the effectiveness of traditional IDS. As a result, recent research has increasingly focused on autonomous learning-driven architectures capable of adapting to evolving threat conditions and performing real-time mitigation.
Zhang and Maple (2023) emphasize that IIoT infrastructures exhibit extreme variability in device behavior, communication protocols, and computational constraints. In this setting, deep reinforcement learning (DRL) approaches offer advantages over static anomaly detectors by continually tuning policies as traffic evolves. Ren et al. (2024) further show that high-performance IDS models can be executed on resource-constrained industrial devices through model compression and TensorRT-optimized inference, validating their applicability in real-world environments with tight latency requirements.
DRL-based detection is complemented by memory-driven anomaly models such as the Adaptive SAMKNN algorithm proposed by Agbedanu et al. (2025), which achieves 98%–99% accuracy under gradual and recurrent concept drift. However, supervised and instance-based approaches cannot autonomously choose and execute mitigation strategies, making them insufficient for real-time defense against previously unseen attacks. This limitation has motivated increasing interest in deterministic DRL algorithms such as DDPG, which can operate on continuous action spaces needed for industrial control decisions.
Parallel efforts aim to develop autonomous agents capable of distributed intelligence across IIoT networks. Chaturvedi et al. (2024) combine SVM–LSTM anomaly detection with a DRL-based response module, improving adaptability but retaining a centralized architecture that limits scalability. Agbedanu et al. (2025) again demonstrate strong detection performance, but without collaborative model evolution, they lack generalization to multi-domain IIoT environments. Hybrid approaches, such as the genetic-algorithm–enhanced deep model proposed by Alkhafaji et al. (2024), show promising generalization to novel attack types. In contrast, Verma et al. (2024) employ federated learning (FL), a decentralized training paradigm in which model updates are shared across clients without exchanging raw data, combined with synchronized autoencoders to detect 0-day threats across multiple clients.
More recent studies attempt to unify learning, distribution, and adaptation. Hesham et al. (2025) introduce an attention-enhanced FL framework integrated with DRL that improves decision robustness through feature prioritization; however, inference remains centralized and cannot operate independently on edge nodes. Complementarily, Hathout et al. (2025) design a federated trust mechanism that isolates compromised clients using gradient deviation metrics and historical reputation scores. While this significantly improves robustness against poisoning attacks, it does not provide autonomous detection or coordinated real-time response.
These contributions highlight meaningful progress yet underline persistent gaps. DRL models excel in adaptability but usually lack federated coordination; FL-based IDSs preserve privacy but rarely incorporate edge-level inference or continuous-action decision-making; and both approaches often omit explainability, which is essential for industrial operators and regulatory compliance.
Limitations in available datasets further complicate rigorous evaluation. N-BaIoT (Hairab et al., 2023), although widely used, captures only botnet-related IoT traffic and lacks the operational diversity required for industrial anomaly detection. TON_IoT, as used in Booij et al. (2022), offers richer multimodal telemetry and a diverse set of attack types, but remains constrained to laboratory conditions and cannot fully reproduce production-scale variability. To overcome these limitations, Hazman et al. (2024) generate interpretable synthetic data using AutoML, meta-learning, and integrated gradients, while Al-Hawawreh and Hossain (2024) rely on digital-twin-derived datasets such as X-IIoTID to emulate complex industrial processes and broaden behavioral diversity. Moreover, explainable artificial intelligence (XAI) is gaining traction, as demonstrated by the integrated-gradients approach of Hazman et al. (2024), although most XAI techniques remain computationally unsuitable for the decentralized, latency-critical environments typical of IIoT.
Despite advances across anomaly detection, distributed learning, and explainability, the literature still lacks a unified framework that combines:
• deterministic DRL policies suited for continuous industrial control;
• asynchronous federated aggregation enabling decentralized, privacy-preserving adaptation; and
• embedded real-time explainability compatible with resource-constrained edge devices.
This gap motivates the development of an autonomous defense architecture capable of detecting and mitigating 0-day attacks in real time, while ensuring interpretability, federated consistency, and operational viability in heterogeneous IIoT environments. To contextualize this contribution, Table 1 provides a comparative summary of the representative intrusion detection approaches analyzed, highlighting their learning paradigms, degree of decentralization, edge operation, and key limitations.
Table 1. Summary of representative intrusion detection approaches for 0-day threats in IIoT environments.
3 Materials and methods
3.1 General architecture of the autonomous defense system
The autonomous defense system adopts a distributed and hierarchical architecture designed to detect and mitigate 0-day attacks directly at the edge of IIoT environments (Yang et al., 2024). Intelligent agents are deployed on operational nodes to enable local inference and autonomous response, reducing dependency on continuous connectivity and ensuring resilience under stringent industrial latency requirements. At the device layer, heterogeneous IIoT components (sensors, actuators, PLCs) generate operational and network-level signals. A lightweight monitoring module performs real-time preprocessing, including normalization, integrity checks, and traffic encapsulation, to produce structured representations suitable for automated analysis.
Edge nodes host the autonomous agents, each implemented as a deep learning model refined through deep reinforcement learning (DRL) and tailored to the behavioral profile of the monitored devices (Tareq et al., 2024). Agents classify activity patterns as normal or anomalous and can autonomously initiate local mitigation (e.g., temporary isolation, policy adjustment, alert escalation). Limited temporal memory enables the detection of progressive or stealthy behaviors characteristic of advanced 0-day attacks.
Coordination across agents is achieved through a federated learning process that aggregates parameter updates rather than raw data. During each federation cycle, edge nodes submit local model deltas that are combined using weighted averaging at a central aggregator (Chen et al., 2024). The updated global model is then redistributed, enabling consistent adaptation across nodes while preserving data privacy and maintaining local sensitivity to environmental variations. The coordination center, deployed on a cloud or high-performance edge server, manages global orchestration, initializes the shared model, orchestrates aggregation cycles, and performs forensic analysis of detected incidents. It also integrates XAI modules to provide interpretability of agent decisions and interfaces for operator supervision.
Figure 1 summarizes the architecture, highlighting the decentralized inference workflow, the federated communication backbone, and the layered defense strategy.
Overall, this architecture satisfies key industrial requirements for autonomous cyber-defense: timely detection of unknown threats, privacy-preserving learning, operational resilience under constrained computational budgets, and adaptability through incremental training cycles. Its modular structure supports both horizontal scaling by incorporating additional agents at new edge nodes and vertical scaling by integrating advanced analytics modules at the coordination layer, resulting in a robust and extensible defense solution for IIoT networks.
3.2 Intelligent agent modeling for detection and response
The intelligent agents deployed at edge nodes constitute the core analytical component of the autonomous defense system. Each agent performs real-time inference on local traffic patterns to detect and mitigate emerging 0-day behaviors. Agents are implemented using the DDPG algorithm (Kim et al., 2021), selected for its stable deterministic policy updates and low-latency inference properties, which are critical in continuous-action industrial environments. Unlike stochastic methods (e.g., SAC, PPO), DDPG avoids sampling-induced variance and reduces computational overhead, enabling efficient deployment on resource-constrained IIoT hardware.
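For concreteness, the sketch below outlines a minimal actor–critic pair in PyTorch (one of the frameworks listed in Section 3.5.1); the layer widths and dimensionality are illustrative placeholders rather than the configuration used in the experiments.

```python
import torch
import torch.nn as nn

class Actor(nn.Module):
    """Deterministic policy: maps a traffic-feature window to a continuous action."""
    def __init__(self, state_dim: int, action_dim: int, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, action_dim), nn.Tanh(),   # bounded mitigation actions
        )

    def forward(self, state):
        return self.net(state)

class Critic(nn.Module):
    """Q-function: scores a (state, action) pair for the policy-gradient update."""
    def __init__(self, state_dim: int, action_dim: int, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + action_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, state, action):
        return self.net(torch.cat([state, action], dim=-1))
```

The Tanh output keeps actions bounded, matching the continuous, range-limited mitigation actions described in the following subsections.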
3.2.1 State and action spaces
Each agent processes an observation vector assembled from the features extracted over each temporal window, including traffic entropy, industrial command frequency, mean inter-packet time, and flow-level variance, together with recent physical process readings.
The action space is continuous: each action encodes a graded mitigation response, such as traffic-throttling intensity, temporary node isolation, or alert-escalation level, bounded to the operational limits of the monitored node.
3.2.2 DDPG core algorithm
DDPG employs an actor–critic architecture with a deterministic policy $\mu(s \mid \theta^{\mu})$ and a critic $Q(s, a \mid \theta^{Q})$, each paired with a slowly updated target network that stabilizes temporal-difference learning.
The actor is optimized using the deterministic policy gradient, as formally defined in Equation 3:

$$\nabla_{\theta^{\mu}} J \approx \mathbb{E}_{s \sim \mathcal{D}} \left[ \nabla_{a} Q(s, a \mid \theta^{Q}) \big|_{a = \mu(s \mid \theta^{\mu})} \, \nabla_{\theta^{\mu}} \mu(s \mid \theta^{\mu}) \right] \tag{3}$$
Exploration is introduced through Ornstein–Uhlenbeck noise, which enables controlled deviation in continuous spaces without the overhead of sampling complete action distributions. Target networks ensure stable convergence under highly dynamic traffic patterns.
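As a reference for this exploration mechanism, a minimal Ornstein–Uhlenbeck process is sketched below; the theta, sigma, and dt values are conventional defaults, not parameters reported in this work.

```python
import numpy as np

class OrnsteinUhlenbeckNoise:
    """Temporally correlated exploration noise for continuous action spaces."""
    def __init__(self, action_dim: int, mu=0.0, theta=0.15, sigma=0.2, dt=1.0):
        self.mu, self.theta, self.sigma, self.dt = mu, theta, sigma, dt
        self.x = np.full(action_dim, mu, dtype=np.float64)

    def reset(self):
        self.x[:] = self.mu

    def sample(self):
        # dx = theta * (mu - x) * dt + sigma * sqrt(dt) * N(0, 1)
        dx = (self.theta * (self.mu - self.x) * self.dt
              + self.sigma * np.sqrt(self.dt) * np.random.randn(*self.x.shape))
        self.x = self.x + dx
        return self.x.copy()
```

During training, the exploratory action is formed as the actor output plus `noise.sample()`, clipped to the valid action range.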
3.2.3 Reward function and federated synchronization
The reward combines defensive effectiveness, operational impact, and false-positive penalties, as defined in Equation 4.
Each agent trains locally using its own transition buffer and periodically participates in a federated synchronization round. Instead of transmitting raw operational data, agents send only their updated parameters to the coordination server, which applies weighted Federated Averaging, as formally expressed in Equation 5 (Adekunle et al., 2024):

$$\theta^{t+1}_{\text{global}} = \sum_{k=1}^{K} \frac{n_k}{\sum_{j=1}^{K} n_j} \, \theta^{t}_{k} \tag{5}$$

where $\theta^{t}_{k}$ are the parameters submitted by agent $k$ at round $t$ and $n_k$ is the size of its local training buffer.
Given the homogeneous hardware profile of the edge nodes and synchronized sampling intervals, standard asynchronous FedAvg achieves stable convergence without requiring proximal regularization or quantile-based aggregation. The global model is redistributed to all agents, ensuring consistency while preserving data locality and maintaining minimal communication overhead.
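The server-side aggregation step reduces to a weighted average of parameter tensors. The sketch below assumes each agent ships its parameters as a dictionary of NumPy arrays and that unstable updates have already been filtered out upstream.

```python
import numpy as np

def federated_average(updates, weights):
    """Weighted FedAvg (Equation 5) over per-agent parameter dictionaries.

    updates: list of {param_name: np.ndarray} submitted by participating agents
    weights: relative contributions (e.g., local buffer sizes); unstable or
             compromised updates are assumed filtered out before this step.
    """
    w = np.asarray(weights, dtype=np.float64)
    w = w / w.sum()                                   # normalize contributions
    return {name: sum(wi * u[name] for wi, u in zip(w, updates))
            for name in updates[0]}
```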
The complete interaction loop—including state acquisition, deterministic action selection, local learning, and federated synchronization—is summarized in Algorithm 1.
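A schematic rendering of this loop is given below; `agent`, `env`, `buffer`, and `fed_client` are hypothetical interfaces standing in for the components described in this section, so the sketch illustrates the control flow rather than a runnable deployment.

```python
def agent_cycle(agent, env, noise, buffer, fed_client, sync_every=500):
    """Schematic edge-agent loop: observe -> act -> learn -> federate.

    agent, env, buffer, fed_client are hypothetical interfaces for the
    DDPG learner, traffic environment, replay buffer, and FedAvg client.
    """
    state = env.observe_window()                      # features of one 10-s window
    for step in range(1, 100_001):
        action = agent.act(state) + noise.sample()    # deterministic policy + OU noise
        next_state, reward = env.step(action)         # mitigation executed locally
        buffer.add(state, action, reward, next_state)
        agent.learn(buffer.sample_batch())            # local DDPG update
        if step % sync_every == 0:                    # asynchronous federation round
            fed_client.push(agent.weights())          # parameters only, never raw data
            agent.set_weights(fed_client.pull())      # adopt redistributed global model
        state = next_state
```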
3.2.4 Statistical analysis of comparative performance
To rigorously assess the operational effectiveness of the proposed RL–FL defense model, we conducted a controlled comparative evaluation against four baseline detectors under identical node-level and temporal conditions. The analysis focused on four metrics that capture complementary aspects of defensive performance: reaction time, false positive rate, precision, and recall.
Reaction time (Equation 6) is measured as the interval between the onset of anomalous behavior and the execution of the corresponding mitigating action, and the false positive rate (Equation 7) is computed over the same paired evaluation windows, so that all detectors are compared on identical samples.
Pairwise comparisons between the proposed system and each baseline were performed using the Wilcoxon signed-rank test (two-sided), with Holm–Bonferroni correction applied across metrics and baselines. The standardized effect size was computed to quantify the magnitude of the observed differences, and is formally defined as shown in Equation 8:

$$r = \frac{Z}{\sqrt{N}} \tag{8}$$
where $Z$ is the standardized Wilcoxon test statistic and $N$ is the number of paired evaluation windows.
To complement the continuous-metric analysis, McNemar’s test was applied to compare misclassification patterns. Let $n_{01}$ denote the windows misclassified only by the baseline and $n_{10}$ the windows misclassified only by the proposed system; the test statistic over these discordant pairs is given in Equation 9:

$$\chi^2 = \frac{(n_{01} - n_{10})^2}{n_{01} + n_{10}} \tag{9}$$
The improvement in detection reliability was quantified using the odds ratio on discordant pairs, as formally defined in Equation 10:

$$\mathrm{OR} = \frac{n_{01}}{n_{10}} \tag{10}$$
All statistical tests were conducted with a significance level of $\alpha = 0.05$ after correction.
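The test battery can be reproduced with SciPy along the following lines; this is a sketch of the procedure described above, with |Z| recovered from the two-sided p-value and McNemar’s statistic computed without continuity correction.

```python
import numpy as np
from scipy import stats

def wilcoxon_with_effect_size(x, y):
    """Two-sided Wilcoxon signed-rank test; effect size r = Z / sqrt(N) (Equation 8)."""
    _, p = stats.wilcoxon(x, y, alternative="two-sided")
    z = stats.norm.isf(p / 2.0)         # |Z| recovered from the two-sided p-value
    return p, z / np.sqrt(len(x))

def holm_bonferroni(pvalues, alpha=0.05):
    """Holm-Bonferroni step-down correction; returns a boolean rejection mask."""
    p = np.asarray(pvalues)
    order = np.argsort(p)
    reject = np.zeros(len(p), dtype=bool)
    for rank, idx in enumerate(order):
        if p[idx] > alpha / (len(p) - rank):
            break                        # once one test fails, all larger p-values fail
        reject[idx] = True
    return reject

def mcnemar_with_odds_ratio(err_proposed, err_baseline):
    """McNemar test on paired boolean error arrays (Equation 9) and the
    odds ratio on discordant pairs (Equation 10)."""
    n01 = int(np.sum(~err_proposed & err_baseline))   # baseline-only errors
    n10 = int(np.sum(err_proposed & ~err_baseline))   # proposed-only errors
    chi2 = (n01 - n10) ** 2 / max(n01 + n10, 1)       # guard empty discordant set
    return stats.chi2.sf(chi2, df=1), n01 / max(n10, 1)
```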
3.3 Data sets used
The training, validation, and testing of the autonomous agents relied on three heterogeneous datasets designed to emulate a wide range of IIoT traffic conditions and cyberattack behaviors, including scenarios resembling 0-day events. Selection criteria included structural diversity, temporal resolution, label richness, and compatibility with industrial environments. Due to differences in feature schemas and granularity, each dataset was used to train an independent agent. A unified subset of core variables—traffic entropy, command frequency, inter-packet time, and flow statistics—was identified for cross-dataset comparability (Table 2).
3.3.1 Characteristics of the data sources used for training and evaluation
The evaluation of the autonomous defense model relied on three complementary datasets, TON_IoT, N-BaIoT, and CICIDS2019, selected to cover industrial telemetry, botnet-driven device compromise, and hybrid enterprise-like traffic. Together, they provide heterogeneous structural patterns, diverse attack taxonomies, and multimodal feature distributions necessary to assess generalization and 0-day resilience.
TON_IoT (Moustafa, 2019) constitutes the primary industrial reference, integrating network flows, operational sensor telemetry (temperature, pressure, vibration, duty cycles), and control commands generated by PLC and SCADA systems (Saheed et al., 2023). Its more than 20 million labeled records and 1-s temporal granularity enable reinforcement-learning-oriented window construction, where agents must react to abrupt or persistent deviations while preserving operational continuity. Both sensor logs and labeled packets were incorporated to reproduce realistic cyber-physical behavior under attack.
N-BaIoT (Meidan et al., 2018) complements this perspective by exposing the system to device-level compromises induced by Mirai and Bashlite variants. Its 115-dimensional high-frequency (100 Hz) feature vectors provide long-duration traces of malicious traffic, enabling fine-grained anomaly modeling. Although not industrial in origin, its representation of amplified DDoS behavior and coordinated botnet actions is essential for improving cross-domain robustness, given that industrial infrastructures are frequently targeted through compromised IoT devices.
CICIDS2019 (Adekunle et al., 2024) provides a third evaluation axis, incorporating realistic user–device–server interactions with detailed flow and packet annotations for attacks such as Brute Force, PortScan, DoS, Botnet, and Infiltration. Its structural patterns (e.g., port sweeps, bursty connections, protocol saturation) resemble hybrid industrial conditions and serve as a standardized benchmark for validating hyperparameters, ensuring model comparability, and quantifying false positives in non-industrial yet structurally relevant environments (Saikam and Koteswararao, 2024).
The joint use of these datasets ensures a broad operational envelope, spanning cyber-physical telemetry, botnet-driven device compromise, and hybrid network intrusions, thereby enabling rigorous evaluation of the proposed RL–FL agent under both in-domain and cross-domain conditions.
3.3.2 Preprocessing and temporal data segmentation
A standardized preprocessing pipeline was implemented to harmonize heterogeneous data sources and improve learning efficiency. Noise filtering and deduplication removed redundant PCAP/CSV entries and inconsistent sensor readings using hash-based comparison and sliding-window verification. Invalid physical values (e.g., negative temperatures) were eliminated.
To stabilize learning across highly variable distributions, two normalization strategies were applied. Min–Max scaling was used for low-variability features, whereas skewed variables (traffic entropy, inter-packet time) were normalized using Z-scores. Categorical fields (protocol type, attack label) were encoded using one-hot vectors, with semantic grouping applied when necessary to control dimensionality.
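As an illustration of this mixed scheme, the sketch below applies Min–Max scaling, Z-score standardization, and one-hot encoding with pandas; the column names are hypothetical placeholders for the dataset-specific schemas.

```python
import pandas as pd

def normalize_features(df: pd.DataFrame) -> pd.DataFrame:
    """Mixed normalization: Min-Max for low-variability features, Z-score for
    skewed ones, one-hot for categorical fields. Column names are placeholders."""
    out = df.copy()
    for col in ["command_frequency", "flow_variance"]:      # low-variability features
        lo, hi = out[col].min(), out[col].max()
        out[col] = (out[col] - lo) / (hi - lo + 1e-9)
    for col in ["traffic_entropy", "inter_packet_time"]:    # skewed distributions
        out[col] = (out[col] - out[col].mean()) / (out[col].std() + 1e-9)
    return pd.get_dummies(out, columns=["protocol_type"])   # one-hot encoding
```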
Temporal segmentation used sliding windows of 10 s with 25% overlap (the configuration validated empirically in Section 3.4), so that each window aggregates the feature statistics that form the agent’s observation vector while preserving continuity across window boundaries.
To evaluate 0-day resilience, an open-world protocol was used (Suryotrisongko et al., 2022). Specific attack classes were held out during training (e.g., “DDoS-HTTP Flood” in TON_IoT, “Bashlite” in N-BaIoT), appearing only during validation and testing (Gaspar et al., 2024). This design forces agents to generalize from familiar operational patterns to unseen adversarial behaviors.
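Operationally, the open-world protocol reduces to removing the held-out classes from the training partition, as in the sketch below (the label strings are illustrative):

```python
import pandas as pd

HELD_OUT = {"ddos-http flood", "bashlite"}   # illustrative held-out class labels

def open_world_split(df: pd.DataFrame, label_col: str = "attack_type"):
    """Training never sees the held-out classes; they appear only at test time."""
    is_held_out = df[label_col].str.lower().isin(HELD_OUT)
    return df[~is_held_out], df[is_held_out]   # (train pool, 0-day test pool)
```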
This preprocessing pipeline ensures interoperability across datasets and reinforces the central objective of the system: learning autonomous defensive policies capable of responding to unknown, emerging threats.
3.3.3 Open-world setup and federated training protocol
Table 3 summarizes the configuration used in the open-world evaluation, specifying which attack classes were included for training, which were excluded to emulate 0-day scenarios, the dataset splits, and the number of federated learning rounds executed per cycle.
This configuration provides a consistent and controlled framework for evaluating 0-day generalization across datasets while maintaining alignment with the federated reinforcement learning protocol adopted in this study.
3.4 Feature engineering and traffic representation
Effective 0-day detection in IIoT requires feature representations that capture cyber-physical dynamics, preserve interpretability, and expose deviations from nominal operating conditions (Zhang et al., 2023). The system uses a compact set of discriminative variables (traffic entropy, industrial command rate, inter-packet time, and flow-level variance), selected for their sensitivity to structural anomalies such as scanning, spoofed control activity, bursty DoS patterns, and covert traffic modulation. These features summarize both network-level irregularities and deviations in process logic, allowing the RL agent to characterize evolving threat states with minimal dimensional overhead.
All features are aggregated into 10-s sliding windows (25% overlap), a configuration validated empirically to preserve temporal coherence while maintaining real-time feasibility. Prior experiments showed that shorter windows (5–10 s) increased variance in anomaly labeling, whereas longer windows primarily increased inference latency without improving accuracy. Feature normalization combines Min–Max scaling and Z-score standardization, depending on the dispersion, to ensure balanced gradient contributions during training. Dense autoencoders were used to detect latent redundancies, and a lightweight temporal attention mechanism prioritizes features exhibiting abrupt local deviations, improving sensitivity to transient threats without increasing model size.
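For illustration, the sketch below computes plausible realizations of these per-window features; the byte-level entropy and the IPT-based variance shown here are assumptions standing in for the exact operational definitions in Table 4.

```python
import numpy as np

def shannon_entropy(payload: bytes) -> float:
    """Byte-level Shannon entropy (bits per byte) of one window's traffic."""
    if not payload:
        return 0.0
    counts = np.bincount(np.frombuffer(payload, dtype=np.uint8), minlength=256)
    p = counts[counts > 0] / counts.sum()
    return float(-(p * np.log2(p)).sum())

def window_features(timestamps, payloads, command_count, window_s=10.0):
    """Aggregate one 10-s sliding window into the agent's observation vector."""
    ts = np.sort(np.asarray(timestamps))
    ipt = np.diff(ts) if ts.size > 1 else np.array([window_s])   # inter-packet times
    return {
        "traffic_entropy": shannon_entropy(b"".join(payloads)),
        "command_rate": command_count / window_s,   # industrial commands per second
        "mean_inter_packet_time": float(ipt.mean()),
        "flow_variance": float(ipt.var()),
    }
```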
A real-time visualization module (Figure 2) was implemented as an internal validation tool. It links feature trajectories with the agent’s inference output, enabling inspection of feature contributions under extreme or rapidly changing traffic conditions. This module is methodological rather than experimental, supporting model verification and operator interpretability.
Figure 2. System interface for real-time visualization of IIoT traffic characteristics and autonomous agent inference.
Table 4 summarizes the operational definition, expected ranges, and inference relevance of each variable. This set forms the basis for both the agent’s decision logic and the SHAP-based interpretability layer. During each inference cycle, these variables generate a contribution vector that quantifies their marginal influence on the policy output. The selection process combined domain-driven filtering, unsupervised dimensionality reduction, and information-gain metrics, retaining only variables consistently correlated with anomalous states across datasets. This alignment between inference and explanation ensures that the interpretability layer reflects the RL policy’s actual decision structure rather than an external approximation.
3.5 Training, evaluation and validation of the system
The training pipeline was structured in two stages. First, a centralized pre-training phase generated a generalized base model using consolidated historical traffic containing both normal and anomalous patterns. Second, each agent executed local refinement using live IIoT traffic, adapting its policy to device-specific dynamics. This hybrid strategy combines global robustness with localized specialization and supports subsequent federated synchronization.
3.5.1 Training environment and technology platform
The system was deployed on a distributed infrastructure comprising eight ARM-based edge nodes (Cortex-A72, four cores, 1.5 GHz, 4 GB RAM) running independent agents, and a coordination server equipped with an NVIDIA RTX 3080 GPU and 64 GB of RAM. Agents execute the DDPG model in real time, while the server performs asynchronous FedAvg aggregation and experiment logging. Actor–critic models were implemented using TensorFlow 2.11 and PyTorch 1.13, with inter-agent communication via low-latency gRPC channels. Docker containers ensured reproducible execution, versioned deployments, and persistent local state across synchronization rounds.
Figure 3 shows the physical layout of the experimental system. Agents operated directly on traffic generated by an industrial network simulator, executing inference and participating in federated rounds under realistic timing, latency, and resource constraints. This environment allowed us to validate the full RL–FL pipeline in a distributed industrial-like setting rather than purely simulated conditions.
Figure 3. Experimental environment showing the physical deployment of distributed agents, communication topology, and federated synchronization server.
From a deployment perspective, the proposed architecture can be installed directly on industrial gateways or embedded edge devices. Agents remain functional during temporary disconnections and can operate across heterogeneous communication protocols (e.g., Modbus, OPC UA). However, large-scale deployments require managing protocol diversity, guaranteeing low-latency inference during burst loads, and securing firmware-level updates.
3.5.2 Evaluation metrics
System evaluation combines algorithmic metrics and operational indicators. Algorithmic metrics include the true detection rate (TDR), false positive rate (FPR), macro F1-score, and robustness to 0-day attacks through open-world exclusion of specific classes. Operational metrics include the response latency from detection to action execution and the energy-normalized inference effort.
The macro F1-score for $C$ threat classes averages the per-class harmonic mean of precision $P_c$ and recall $R_c$, as defined in Equation 11:

$$F1_{\text{macro}} = \frac{1}{C} \sum_{c=1}^{C} \frac{2 \, P_c R_c}{P_c + R_c} \tag{11}$$
which is essential in unbalanced open-world settings, preventing dominant classes from masking poor performance on rare or unseen threats.
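Equivalently, Equation 11 corresponds to scikit-learn’s macro-averaged F1, shown here with toy labels for illustration:

```python
from sklearn.metrics import f1_score

# Macro averaging weights every threat class equally (Equation 11), so a rare
# or held-out 0-day class cannot be masked by dominant normal traffic.
y_true = ["normal", "ddos", "spoofing", "zero_day", "normal", "zero_day"]
y_pred = ["normal", "ddos", "spoofing", "normal",   "normal", "zero_day"]
print(f1_score(y_true, y_pred, average="macro"))
```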
Within the federated training protocol, a synchronization stability metric quantifies how far each local update deviates from the aggregated global model across federation rounds (Equation 12). If this deviation exceeds a predefined stability threshold for a given node, its contribution is down-weighted or excluded from the aggregation round, limiting the influence of unstable or potentially compromised agents.
To evaluate energy efficiency under homogeneous hardware conditions, we defined the normalized inference energy as formally expressed in Equation 13:

$$E_{\text{inf}} = P_{\text{avg}} \cdot t_{\text{inf}} \tag{13}$$

where $P_{\text{avg}}$ is the average power drawn during an inference cycle, estimated from the processor base frequency, the number of active cores, and the documented per-MHz power efficiency, and $t_{\text{inf}}$ is the duration of the cycle.
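In code, Equation 13 is a direct product of average power and cycle duration; the unit bookkeeping (mW × s = mJ) is the only subtlety, and any numeric inputs would be the hardware-documented values discussed in Section 4.4.

```python
def inference_energy_mj(p_avg_mw: float, t_inf_s: float) -> float:
    """Normalized inference energy (Equation 13): E = P_avg * t_inf.

    p_avg_mw: average power over the cycle (mW), estimated upstream from base
    frequency, active cores, and per-MHz efficiency; t_inf_s: cycle length (s).
    mW * s = mJ, so the product is already in millijoules.
    """
    return p_avg_mw * t_inf_s
```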
3.6 Evaluation of the autonomous response in real time
The autonomous defense system was validated through operational scenarios designed to reproduce high-impact and distribution-shifting threats in IIoT environments. The objective was to characterize the behavior of distributed agents under unexpected conditions, quantify their ability to execute mitigation actions in real time, and measure stability, synchronization, and service continuity across the entire attack lifecycle.
3.6.1 Test scenarios
Three representative scenarios were constructed to evaluate real-time autonomous decision-making under open-world conditions. The first scenario simulated a 0-day intrusion by injecting traffic distributions deliberately absent from the training phase, including encrypted bursts, previously unseen command sequences, and transient state patterns that diverged from the nominal protocol behavior. This configuration enforces generalization to structurally novel inputs rather than unobserved class labels, reflecting realistic 0-day conditions in industrial systems.
The second scenario emulated multi-stage lateral propagation. A legitimate node was forced to generate unauthorized traffic toward neighboring devices, producing evolving flow patterns, abnormal inter-node command frequencies, and multi-hop scanning dynamics. This scenario evaluates the agent’s sensitivity to temporal drift and distributed deviations across the topology.
The third scenario replicated industrial sensor spoofing. Process variables (temperature, pressure, flow) were manipulated to remain within permissible operational limits while violating temporal and cross-sensor coherence. This required the agent to detect inconsistencies that do not manifest as simple threshold breaches, reflecting failures commonly observed in compromised cyber-physical systems. All scenarios were executed both independently and in compound configurations in a distributed IIoT environment with real edge agents, asynchronous federated synchronization, and autonomous actuation. All events were logged at high temporal resolution for subsequent analysis.
3.6.2 Definition of evaluated metrics
System behavior was evaluated along three complementary dimensions: reaction time, operational stability, and service continuity. Reaction time measures the interval between the onset of anomalous behavior and execution of a mitigating action, including acquisition, temporal aggregation, inference, and actuation. This metric is critical in industrial deployments, where delays directly increase the risk of physical damage or process deviation.
Operational stability reflects the agent’s capacity to sustain continuous inference under attack, preserve performance without degradation, and contribute consistent model updates during federated aggregation. Gradient deviation criteria were applied to exclude unstable local models and to maintain global convergence.
Service continuity evaluates the operational impact of autonomous actions, including the preservation of legitimate traffic, the effectiveness of containment, the reintegration of isolated nodes without state divergence, and the maintenance of complete traceability. All metrics were predefined and collected through synchronized logging channels to ensure reproducibility and precise temporal alignment.
In addition to the primary metrics, controlled configuration variants were executed to assess the contribution of individual architectural components. These included evaluations with and without federated synchronization (local-only learning), with and without the autoencoder-based dimensionality reduction used during feature extraction, and under different temporal window sizes and overlaps. These controlled variations provide a conceptual ablation perspective, confirming that federated updates improved long-term stability, dimensionality reduction reduced inference latency on constrained hardware, and larger windows enhanced contextual detection at the cost of slower reaction times. Although these analyses do not expand the experimental scope, they clarify the functional contribution of each subsystem within the overall architecture.
3.7 Implementation of explainable AI modules
In safety-critical industrial environments, autonomous decisions must be accurate and interpretable (Suryotrisongko et al., 2022). Explainability is therefore treated as an operational requirement rather than a post-hoc audit mechanism: technicians must understand why an agent executes a mitigation action, verify that the system aligns with process logic, and maintain trust in automated responses. This is particularly relevant in IIoT infrastructures where opaque decisions can compromise safety, traceability, and regulatory compliance.
To provide transparent and traceable reasoning, an interpretability layer was integrated into each agent’s inference pipeline using an optimized SHAP-based method. SHAP was selected for its additive consistency, robustness for tabular data, and compatibility with distributed black-box models (Gaspar et al., 2024). Other alternatives, such as LIME, Integrated Gradients, or Grad-CAM, have disadvantages: LIME is unstable with correlated, high-dimensional traffic features; IG and Grad-CAM require gradient access and internal model introspection, which are infeasible in federated or resource-constrained edge deployments. SHAP, particularly KernelSHAP, can operate using only model outputs, making it suitable for private, decentralized IIoT architectures.
A lightweight variant of KernelSHAP was implemented to generate local explanations for each inference of the actor network. To reduce KernelSHAP’s computational cost, which grows steeply with the number of input features, the implementation combines directed stochastic sampling of feature coalitions with the autoencoder-based dimensionality reduction applied upstream, keeping average explanation time below 45 ms per inference (Section 4.6).
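A minimal sketch of this usage pattern with the shap library is shown below; `predict_fn` and the background sample are placeholders for the agent’s actor interface and a set of nominal traffic windows.

```python
import numpy as np
import shap  # KernelSHAP needs only model outputs, suiting black-box edge agents

def explain_window(predict_fn, background: np.ndarray, window: np.ndarray,
                   nsamples: int = 100):
    """Local SHAP attribution for a single inference.

    predict_fn: callable mapping feature windows to actor outputs (placeholder)
    background: small sample of nominal traffic windows (reference distribution)
    nsamples:   caps the sampled feature coalitions, trading attribution fidelity
                for the sub-45 ms explanation budget reported in Section 4.6
    """
    explainer = shap.KernelExplainer(predict_fn, background)
    return explainer.shap_values(window, nsamples=nsamples)
```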
The explainability module operates in parallel with inference. For each action, SHAP produces an ordered contribution vector highlighting the dominant factors influencing the decision (e.g., elevated entropy, anomalous command frequency, inconsistency with historical sensor patterns). These attributions are stored in local logs for auditability and streamed to the system interface for immediate operator inspection. This dual mechanism supports both real-time human oversight and deferred forensic analysis.
Figure 4 illustrates the integration of the XAI component. After traffic acquisition and feature processing, the agent produces both an action and a corresponding explanation vector, which is recorded and displayed without affecting inference latency. The inclusion of SHAP strengthens transparency, supports compliance with industrial cybersecurity frameworks such as ISO/IEC 27001 and ISA/IEC 62443 (Wiemas and Suroso, 2024; Tanveer et al., 2021), and aligns autonomous reasoning with operator expectations in high-risk industrial environments.
Figure 4. Integration of the SHAP module for decision interpretability in the IIoT autonomous defense system.
4 Results
The results are organized into detection performance, comparative statistical tests, autonomous response to threat scenarios, operational resource usage, training behavior, and decision interpretability. Each subsection highlights the improvements observed during the experimental phase and compares the proposed system against traditional baseline approaches.
4.1 Performance of the autonomous detection system
The proposed system was evaluated in an open environment, where specific attack classes were deliberately excluded during training to assess the model’s ability to generalize beyond previously observed patterns. The evaluation considers regular traffic, known attacks, and 0-day behaviors, providing a realistic representation of IIoT operating conditions. Beyond the class-exclusion environment, the reported metrics also reflect the model’s performance under structural distribution changes introduced during evaluation, including protocol variations, encrypted bursts, and synthetic command deviations not present in the training datasets.
To assess the influence of individual architectural components, additional controlled configurations were executed. These included evaluations of the model operating without federated synchronization (local-only learning), without autoencoder-based feature compression, and under reduced temporal window sizes. The aggregated results confirmed that federated updates stabilized long-term performance, dimensionality reduction reduced inference latency, and larger window sizes improved contextual detection, but at the cost of higher reaction time.
Table 5 reports the global metrics for the proposed system and four widely used IDS baselines: rule-based detection, two unsupervised clustering methods (k-means and DBSCAN), and an unsupervised autoencoder. These approaches were selected because they constitute the predominant strategies in real industrial deployments, particularly in low-resource settings or environments where continuous retraining is infeasible. They also represent the methodological foundations over which more advanced neural or attention-based IDS models are typically benchmarked in recent literature. The inclusion of macro-averaged Precision and Recall enhances the interpretability of the comparison and exposes the trade-offs between false alarms and missed detections in imbalanced threat distributions.
The proposed system achieves a macro F1-score of 0.92, significantly outperforming all baseline families. Precision (0.91) and Recall (0.93) confirm the ability to detect subtle anomalies while maintaining a low false-positive rate, an essential requirement for high-availability industrial infrastructures. By comparison, rule-based systems exhibit high sensitivity to noise and limited capability to generalize beyond predefined signatures, resulting in a considerably higher FPR (0.115). Unsupervised methods, while valuable when labels are scarce, struggle to capture temporal dependencies and protocol-level semantics, thereby reducing their discriminative capacity. Even the autoencoder baseline fails to match the performance of the reinforcement-driven architecture, highlighting the impact of continuous-action optimization combined with federated synchronization.
Table 6 reports the relative improvement of the proposed system over each baseline across all metrics. The results show consistent improvements ranging from 20% to 55%, demonstrating that the gains are not isolated to a single metric but rather emerge as a coherent, cross-metric advantage.
The confusion matrix further confirms generalization across compound and cross-distribution scenarios, in which attacks combined protocol-level anomalies with multi-node propagation. The model preserved stable predictions despite shifts in device behavior and traffic structure, indicating robustness to 0-day conditions beyond categorical exclusion.
Figure 5a presents the confusion matrix, highlighting the model’s capacity to maintain class separation even in the presence of previously unseen behaviors. The main inaccuracies occur between Spoofing and Zero-Day attacks, two classes that share similar transient entropy patterns, yet the system preserves high diagonal dominance. Figure 5b reinforces these findings with per-class F1 comparison, showing that the proposed architecture achieves superior detection in all categories, particularly in 0-day scenarios, where conventional methods exhibit severe degradation.
Figure 5. Performance of the autonomous detection system; (a) Confusion matrix by type of threat in open-world mode; (b) Comparison of the F1-score per class between the proposed system and traditional approaches.
The evidence confirms not only a quantitative improvement but also a qualitative shift in detection behavior. The reinforcement-driven model does not rely exclusively on predefined patterns; instead, it adapts to evolving dynamics, maintains class-wise consistency, and delivers stable inference under constrained resources—capabilities that remain unattainable for traditional baselines.
4.2 Comparative statistical tests
Paired, non-parametric comparisons (Wilcoxon signed-rank) were performed between the proposed system and each baseline for reaction time, false positive rate, precision, and recall, using the paired evaluation windows described in Section 3.2.4. The results are summarized in Table 7.
Table 7. Wilcoxon signed-rank comparisons (proposed vs. baselines) on paired windows. Medians (IQR) are reported; p-values are Holm–Bonferroni corrected.
Across all comparisons, the proposed system achieved statistically lower reaction times and false positive rates, and significantly higher precision and recall, confirming its superior performance under identical experimental conditions. Median effect sizes (Equation 8) were consistently in the large-magnitude range across metrics and baselines.
McNemar’s tests corroborated these findings, indicating significantly fewer classification errors for the proposed system than for all baseline models.
4.3 Autonomous response to threat scenarios
The ability of the system to respond autonomously, rapidly, and in a distributed manner to different types of threats is a key element of its operational functionality in critical industrial environments. To evaluate this behavior, attack scenarios were designed to measure multiple variables associated with real-time performance, coordination between agents, and impact on the network infrastructure.
Figure 6 presents the composite results of this evaluation. Figure 6a shows the inference and total reaction times recorded for four types of threats: 0-day attacks, sensor spoofing, lateral propagation, and a combined scenario. The blue line represents the average time the agent takes to perform the inference, while the green line includes the complete processing time, from data acquisition to the actual execution of the action (e.g., node isolation or channel reconfiguration). The colored bands show the minimum and maximum range observed for each threat type, allowing analysis of the system’s variability under dynamic conditions.
Figure 6. Temporal evaluation and distributed coordination of the system under threat scenarios; (a) Average times and variation ranges for inference and total reaction against different types of attack; (b) Temporal sequence of isolation executed by multiple nodes during a lateral propagation event.
The results indicate that the system maintains consistent inference times across all scenarios, with average values between 122 ms and 140 ms, without exceeding 150 ms even in mixed configurations. The total reaction time, which also includes the execution of the action policy, varies between 165 ms and 210 ms, and is highest in complex scenarios such as the mixed one, where multiple threats must be evaluated simultaneously. Notably, even under concurrent loads, the response time remains below 260 ms, meeting the latency requirements established for critical IIoT networks according to the IEC 62541 and ISA/IEC 62443 guidelines.
Figure 6b represents the response time sequence of the system during a lateral propagation attack. This type of threat involves a compromised node transmitting malicious traffic to other nodes, generating a chain of infections. In this experiment, node one is the origin of the attack, and in successive intervals, it propagates to nodes 2, 3, and 4. The figure shows the exact moment each node detects the anomalous pattern, performs its local inference, and executes the isolation action. The time between detection and isolation is consistent with the results in the previous graph, and the entire isolation sequence takes place in less than 2 s, demonstrating low individual latency and effective synchronization between independent nodes.
This distributed response capability allows breaking the propagation chain without human intervention and without requiring centralized reconfigurations, which is essential in industrial systems where manual or sequential containment processes cannot compromise operational availability. Furthermore, the federated architecture ensures that local models continue to operate even during isolation events, preserving the functionality of uncompromised nodes.
Table 9 complements this assessment by presenting operational indicators by threat type, including the number of affected nodes, the total containment time, and the impact on service availability. In 0-day attacks, containment is achieved in an average of 4.8 s, affecting only two nodes and maintaining 99.6% availability for the rest of the system. In more complex scenarios, such as the combined one (0-day + propagation), the number of compromised nodes rises to six and the containment time to 8.7 s; even so, overall network availability remains above 97.8%, demonstrating high operational tolerance in critical situations.
Spoofing, a threat that is harder to detect because its values remain within acceptable thresholds, also elicits an adequate response, with availability maintained at 99.2% and containment in under 5.2 s. In all cases, the system balances threat sensitivity and operational stability without generating unnecessary isolations or massive interruptions. The results confirm that the system not only detects threats with high precision but also responds autonomously, quickly, and in a distributed manner, without compromising service continuity. This combination of low reaction time, efficient federated coordination, and preservation of availability positions the proposed architecture as a robust solution for IIoT environments that demand real-time active security.
4.4 Operational evaluation and use of resources
In addition to its accuracy and autonomous response capability, the practical effectiveness of a defense system in distributed industrial environments critically depends on its operational behavior under real-world load. Specifically, it must be ensured that the detection, inference, and action processes do not compromise the processing resources of IIoT nodes or affect the continuity of critical services. To this end, the results were analyzed regarding CPU usage, memory, latency, relative energy consumption, and the overall impact on network operational availability.
Table 10 summarizes the indicators evaluated for five representative threat scenarios: 0-day attacks, sensor spoofing, lateral propagation, multi-vector combinations, and persistent anomalous activity. All data were collected under continuous execution conditions, without human intervention, on edge nodes with quad-core ARM architecture and 4 GB of RAM, simulating realistic industrial conditions.
Table 10. Evaluation of operational resources and containment effectiveness in different threat scenarios.
Regarding CPU usage, the results show an average consumption between 44% and 53% depending on the complexity of the scenario. The lowest value occurs in isolated 0-day events, where the attack structure can be detected from a single traffic pattern. In contrast, combined scenarios, which involve parallel inference on multiple streams and federated coordination, increase the load to a maximum of 53%. These usage levels remain within the expected operating threshold for embedded nodes and do not affect the ability to execute concurrent tasks, such as sensor monitoring or data transmission.
RAM consumption maintains a proportional trend, with values between 1240 and 1450 MB. This usage corresponds not only to the inference model and its weights but also to the input buffers, the optimized SHAP explanatory module, and the temporary registers for federated synchronization. The system has been designed to keep memory usage under control through cyclical buffer flushing and log compression, avoiding overruns or swaps that affect latency.
Regarding inference latency, consistent times between 122 ms and 140 ms were observed, measured from the reception of the last input packet to the generation of the final action. These values include the execution of dense layers, temporal attention mechanisms, and the generation of SHAP explanations per inference. Even in complex scenarios, where multiple threats were evaluated in parallel, the system did not exceed 150 ms per inference, meeting the latency requirements for industrial networks under the IEC TR 61850-90-7 and OPC UA standards for real-time edge computing.
One of the most critical aspects in energy-constrained nodes is the energy consumption per inference. Indirect estimates place this value between 3.2 mJ in simple scenarios and 4.1 mJ in blended attacks. These values were calculated from the base frequency of the processor, the number of active cores, and the average duration of each inference cycle, assuming a power efficiency of 1.8–2.2 mW/MHz, as documented for compatible hardware (Jetson Nano, Rock Pi, Raspberry Pi 4). This energy efficiency supports the viability of large-scale deployment in environments with many remote nodes or nodes powered by renewable energy.
Furthermore, the analysis of the impact on service continuity shows that the system can contain threats without significantly affecting overall operations. Service availability remained between 97.8% and 99.6%, depending on the type of attack and the number of affected nodes. The most significant impact was observed in blended attacks, where the need to isolate multiple cascaded nodes temporarily reduced coverage. However, the federated architecture and local state persistence allowed nodes to recover after event validation, without data loss or requiring a forced reboot.
The successful containment rate also remains high in all scenarios, from a minimum of 90.5% in persistent anomalous activity (scenarios where patterns are less defined and the threat evolves slowly) to a maximum of 96.2% in lateral propagation events, where the infection chain was interrupted entirely by the sequential autonomous decisions of the agents. The results confirm that the system achieves high levels of accuracy and autonomous response with a contained, efficient operational load compatible with heterogeneous edge architectures. The ability to infer, explain, and act without compromising node or network stability positions it as a robust, scalable operational solution suitable for critical industrial environments with low power, high availability, and reduced latency requirements.
In addition to the defined attack scenarios, the architecture’s robustness was examined under natural variability in network and processing conditions. The system maintained stable inference and containment performance without observable degradation in accuracy or response latency, even under fluctuating CPU load and communication delays. These results confirm that the proposed architecture sustains consistent operational behavior in dynamic industrial environments, reinforcing its reliability and robustness for real-world IIoT deployments.
4.5 Training metrics and temporal behavior of features
To understand the convergence properties and the behavior of the intelligent agents during training, a detailed evaluation was conducted based on the progression of key performance metrics and temporal feature dynamics under industrial traffic simulation. The evolution of these indicators is illustrated in Figure 7, which presents three representative graphs derived from the experimental phase: reward accumulation in federated learning rounds, reduction of classification loss during training, and dynamic response of key input variables over a representative test window.
Figure 7. Training metrics and temporal evolution of key features. (a) Federated reward accumulation across rounds, (b) model loss convergence over epochs, (c) feature dynamics in real-time evaluation.
Figure 7a shows the evolution of the average reward obtained by the agents during federated training. Each round corresponds to a full synchronization between participating edge nodes, during which updated local policies are aggregated. The average reward exhibits a consistent upward trend from round 0 to round 40, reaching a saturation point around round 38 with marginal improvements beyond that. This behavior suggests the agents progressively learn optimal response policies in their local environments. Notably, the curve includes realistic variability, as observed between rounds 22 and 26, where a temporary decrease occurs due to the injection of a batch of anomalous events into one of the participating nodes, simulating 0-day conditions. Despite the disturbance, the policy readjusts autonomously, reaffirming the system’s resilience. The final average reward converges to approximately 0.91, indicating successful coordination and learning among agents.
Figure 7b illustrates the supervised classification loss as computed during model training, using cross-entropy over labeled segments of the TON_IoT and CICIDS2019 datasets. The curve shows a rapid decline during the initial 20 epochs, stabilizing near epoch 30. Early fluctuations between epochs 5 and 10 are attributed to overfitting on imbalanced batch samples; this was mitigated through dynamic batch reweighting and dropout regularization. The final loss plateau is approximately 0.042, suggesting both generalization and stable parameter tuning. The curve illustrates the model’s ability to balance accuracy and convergence speed by integrating temporal attention and feature weighting mechanisms into the architecture.
Figure 7c presents a 100-s real-time window extracted from the test phase on an IIoT edge node. It plots the temporal evolution of three critical variables: traffic entropy (blue), command rate (orange), and inter-packet time (green). Between seconds 30 and 50, a spike in command rate is detected, increasing from a nominal five commands/sec to over 40 commands/sec. Simultaneously, traffic entropy rises to values above 4.2 bits, and inter-packet time decreases sharply to 0.2 s. This correlated pattern corresponds to a lateral propagation event combined with control spoofing. The agent detected this behavior and flagged it as high-risk within 1.2 s, confirmed by the SHAP explanation module. This visualization reinforces the system’s ability to identify real-time threats by analyzing the interaction between low-level traffic variables and contextual anomalies. Additionally, normal operational segments (e.g., 0–30s, 70–100s) exhibit stability in all metrics, with entropy oscillating around 2.1 bits, IPT near 1.1 s, and command rates below 10/sec, validating the contrast between attack and baseline states.
4.6 Decision interpretability with XAI
One of the distinctive features of the proposed autonomous defense system is its ability to offer local interpretations of each inference using an explainable AI mechanism integrated directly into the agent. This functionality enables auditing of system behavior and allows for real-time human oversight, particularly in industrial contexts where understandable rationales must support automated decision-making.
Explanation generation was implemented using an optimized version of SHAP, adapted for inference in environments with limited computational resources. The module operates locally on each agent and evaluates the relative importance of the input variables in each decision, producing a weight vector that represents each feature’s contribution to the value of the inferred action. Inference and explanation run in parallel, and the average explanation generation time remained below 45 ms even under intensive inference conditions, thanks to directed stochastic sampling and prior dimensionality reduction with autoencoders.

Figure 8 shows the SHAP-estimated average contribution of the main input variables to the system’s decisions, categorized by threat type. The graph reveals how the model dynamically adapts the relative importance of its inputs to the operational context and the nature of the observed anomalous pattern. Each threat type triggers different inference paths in the agent’s network, which is directly reflected in the weights assigned to each explanatory variable.
In 0-day attacks, the dominant variable is traffic entropy, with an average contribution of 42% to the model output. This is consistent with the evasive nature of such threats, whose communication patterns are random or encrypted and do not conform to previously learned sequences. Entropy acts as a low-structure discriminator, helpful in identifying entirely new behaviors. Command frequency and the remaining metrics contribute only marginally, indicating that the model does not require structured data to issue an alert in this case.
In contrast, sensor spoofing events show a more balanced distribution. Here, industrial command frequency slightly dominates with a 32% weight, followed by inter-packet time and process metrics (pressure, temperature, cyclic skew). This profile suggests that the model must correlate multiple dimensions to determine that, although the readings appear valid, their frequency or relationship to the system state is inconsistent. This is key in scenarios where false values are simulated without violating physical ranges, making detection difficult by traditional means.
In lateral propagation attacks, command frequency again carries the most significant weight (35%), followed closely by inter-packet time (30%). This pattern is characteristic of network infections in which repetitive or bursty commands are issued to multiple nodes. The low contribution of process metrics (10%) indicates that the system focuses on traffic dynamics rather than the internal state of the sensors.
The combined scenario, which simulates a realistic environment with multiple simultaneously active vectors, exhibits an even distribution among the three main variables, each contributing around 25%–30%. This reflects the agent’s need to integrate several partial signals to make a reliable inference, reinforcing the value of the attention mechanisms applied to temporal sequences.
The system’s ability to adapt its interpretation to the threat type validates the contextual plasticity of the architecture. Furthermore, because the SHAP module runs in each agent, explanations are generated locally, without transmitting sensitive data to a central server, thereby preserving operational privacy and system efficiency.
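A minimal approximation of this locally executed step, using the public shap library with encoder-based dimensionality reduction standing in for the optimized directed-sampling variant described above, might look as follows; all identifiers are illustrative.

```python
# Sketch of per-decision attribution with the public shap package; the
# encoder, background set, and sample budget are illustrative assumptions.
import shap

def explain_decision(policy_fn, encoder_fn, background_raw, x_raw,
                     nsamples=64):
    """policy_fn: maps encoded feature arrays to the inferred action value.
    encoder_fn: autoencoder encoder compressing raw traffic features.
    Returns the per-feature contribution weights for one inference."""
    z_background = encoder_fn(background_raw)   # reduced reference set
    explainer = shap.KernelExplainer(policy_fn, z_background)
    z_live = encoder_fn(x_raw.reshape(1, -1))   # encode the live sample
    return explainer.shap_values(z_live, nsamples=nsamples)
```

Run concurrently with inference, as described above, the resulting weight vector becomes available to the operator without delaying the agent’s response.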
Although structured human validation was not included in this study phase, cross-observations by plant technical personnel were conducted, confirming that the variables highlighted by SHAP in selected scenarios matched the indicators that preceded real-life failures or anomalous events. This preliminary alignment between automated inference and expert knowledge supports the semantic coherence of the explanatory system.
4.7 Comparative analysis
A set of controlled comparative experiments was designed to validate the differential contribution of the proposed system relative to approaches previously used for threat detection in IIoT networks. This involved implementing representative techniques from different methodological families: rule-based (signature-based) systems, unsupervised models (clustering and autoencoders), and classic supervised algorithms. The objective was to contrast the performance and functionality of the autonomous federated system with XAI against solutions widely used in industrial contexts or documented in recent technical literature.
All comparative approaches were executed under the same experimental conditions defined in the methodology. Specifically, the same datasets (TON_IoT and N-BaIoT), threat scenarios (including class exclusion for open-world evaluation), and distributed IIoT infrastructure in the controlled environment were used. Network configurations, traffic rates, time windows, and inference frequencies were constant across all tests. This methodological consistency ensures that the results obtained are directly comparable and are not affected by variations in the execution environment or workload.
Table 11 presents an extended comparative evaluation across key performance and architectural metrics, including reaction time, false positive rate (FPR), macro-averaged Precision and Recall, adaptive capacity to 0-day threats, and real-time interpretability. These metrics reflect not only raw detection performance but also the practical effectiveness and resilience of each approach in industrial IIoT environments.
Table 11. Technical comparison of the performance of the proposed system versus traditional approaches.
The proposed system achieves a competitive reaction time of 182 ms, with the lowest FPR (4.1%) among all methods, alongside high macro Precision (0.91) and Recall (0.93). This confirms its ability to minimize false positives while detecting most anomalous events, even under previously unseen conditions. The architecture’s use of distributed reinforcement learning, open-world training, and asynchronous federated synchronization supports its strong generalization capability and operational continuity.
In contrast, rule-based systems exhibit the lowest latency (90 ms) but suffer from a high false positive rate (11.5%) and no adaptability to new threats, making them unreliable in evolving environments. Clustering-based models (k-means, DBSCAN) display low Precision and Recall and lack real-time interpretability or resilience under adversarial conditions. While supervised Random Forests offer moderate detection metrics (Precision: 0.73, Recall: 0.76), they are inherently limited by their centralized inference pipeline and dependence on post hoc explanations.
The embedded SHAP module in the proposed system, executed concurrently with inference at the edge, provides real-time interpretability and supports industrial traceability requirements. Combined with federated learning and efficient local response, this positions the system as a robust and scalable solution for high-assurance IIoT security deployments.
Table 12 analyzes the approaches from an architectural and functional perspective, considering attributes that, although not directly reflected in accuracy metrics, are essential for applicability in real-world industrial environments: support for edge deployment, fault tolerance, decentralized operation, and integrated explainability.
The proposed system integrates distributed federated learning, which allows it to preserve operational privacy and continuously improve its performance without transmitting raw data. The previous sections demonstrated its compatibility with low-capacity edge nodes, and its distributed architecture provides high operational resilience, allowing it to continue operating even in the event of node or link loss. Other approaches lack this coordinated autonomous learning capability or embedded explainability mechanisms at inference time.
The results confirm that the proposed system not only outperforms existing solutions in accuracy and adaptability but also provides a comprehensive set of architectural capabilities that make it viable, reliable, and auditable in critical industrial environments, overcoming the operational, structural, and interpretability limitations of the classical and modern alternatives evaluated.
4.8 Technical comparison with existing systems: Direct empirical evaluation
To accurately establish the scope and technical limits of the proposed system, an empirical comparison was conducted with six contemporary IIoT threat-detection approaches. The comparison is limited exclusively to explicitly reported operational and functional elements, avoiding unjustified inferences. This evaluation considers the presence or absence of 0-day threat detection mechanisms, integration of explainability modules, a federated learning framework, edge deployment, and the nature of the reported metrics. Table 13 summarizes these comparisons into a verifiable framework aligned with the objectives of this research.
One key differentiator is the handling of emerging threats, particularly those absent from a system’s initial training. Verma et al. (2024) introduce a federated architecture that combines autoencoders with OCSVM and dual client-server voting; its use of class dropout during training, followed by detection, indicates that the approach can capture structural anomalies, but it offers neither adaptive updating nor fully distributed inference. Hairab et al. (2023) use convolutional networks with L1/L2 regularization, validated on TON_IoT, applying deliberate class dropout without a model that can evolve after deployment. In contrast, the architecture developed here uses DDPG agents with federated updating, trained under a rigorous open-world protocol with asynchronous synchronization, enabling it to adapt to new classes in operation without manual intervention.
Regarding the explainability of decisions, the reviewed systems show significant limitations. Sezgin and Boyacı (2023) incorporate Shapley values as a post hoc technique within a classification-oriented AutoML framework, without influencing inference or being available to the operator during decision-making. Alblehai (2025) mentions saliency maps on BiLSTM, though only as a visual aid not coupled to the inference or retraining cycle. In contrast, the proposed system uses optimized SHAP values embedded in each agent and executed in parallel with inference, enabling full traceability of each decision and supporting audit mechanisms or evolution guided by dominant variables.
The federated learning structure also represents a critical point of differentiation. Zero-Day Guardian (Verma et al., 2024) implements federated synchronization but requires subsequent central validation and a voting system, which introduces a structural dependency on the server. None of the other works implement functional federated learning. In the proposed development, agents are trained locally, synchronize their weights with a coordinating server via Federated Averaging, and continue operating even during failures or temporary disconnections, thereby conferring true distributed resilience. This property is essential in industrial environments where latency, operational stability, and decentralization are non-negotiable requirements.
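The synchronization step itself reduces to sample-weighted Federated Averaging over whichever nodes report in a given round; a minimal sketch is shown below, where the per-layer weight-array structure is an assumption for exposition.

```python
# Illustrative Federated Averaging over the nodes reachable this round;
# offline nodes simply skip the round and keep their last synced model.
def fed_avg(updates):
    """updates: list of (layer_weights, n_samples) pairs, where
    layer_weights is a list of arrays; an empty list means no
    synchronization happens this round."""
    if not updates:
        return None
    total = sum(n for _, n in updates)
    n_layers = len(updates[0][0])
    return [
        sum(w[i] * (n / total) for w, n in updates)  # sample-weighted mean
        for i in range(n_layers)
    ]
```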
Regarding edge deployment, most studies use the term generically or loosely. Sobchuk et al. (2024) run their model on the Edge-IIoT test set but do not specify actual execution on computationally constrained devices. Sezgin and Boyacı (2023) and Alblehai (2025) mention edge or multi-node simulations but do not report actual use of physical devices. In contrast, this system has been deployed on heterogeneous edge nodes, including ARM devices with limited capabilities, under traffic and processing conditions that closely resemble those of the real application environment, allowing validation not only of the inference but also of the approach’s full computational feasibility.

Finally, all the reviewed studies report conventional classification metrics such as accuracy and F1-score, yet none include analysis of operational latency, energy consumption, resource usage during inference, or performance against complex attack scenarios. This work reports these metrics in dedicated sections, based on defined scenarios and direct quantification in an experimental environment, which validates the algorithm and supports its operational sustainability, scalability, and robustness under real-world usage conditions.
5 Discussion
The proposed system demonstrates a consistent integration of autonomous inference, federated coordination, and embedded explainability for IIoT defense. Unlike previous approaches that address 0-day detection, supervised models, or AutoML components in isolation (Alkhafaji et al., 2024; Verma et al., 2024; Hairab et al., 2023), this work delivers a unified architecture capable of continuous adaptation on edge hardware while preserving low latency and operational stability. The ability to execute inference, synchronize models asynchronously, and generate interpretable decisions in real time distinguishes the system from existing efforts.
Experimentally, agents were evaluated under open-world conditions, temporal segmentation, and real industrial hardware constraints (Lu and Wang, 2024). Combining DDPG with federated synchronization enabled learning of localized behaviors without sharing raw data, reducing the false positive rate to 4.1%.
The architecture also reduces dependency on exhaustive labels and allows agents to respond autonomously to unseen behaviors. Integrated SHAP explanations support operational auditing under ISO/IEC 27001 and ISA/IEC 62443 (Gawde et al., 2024; Wiemas and Suroso, 2024), reinforcing the system’s suitability for regulated industrial environments. This combination of adaptive behavior, distributed learning, and embedded traceability represents a substantial advancement over existing centralized or single-node reinforcement learning systems.
Despite these strengths, several limitations must be acknowledged. Zero-day evaluation via class exclusion does not fully replicate adversarial or polymorphic threats, which may diverge structurally from known traffic patterns. DDPG’s stability remains sensitive to abrupt environmental shifts, and federated synchronization can introduce temporary incoherence when nodes experience gradient deviations or connectivity loss. Moreover, although SHAP was chosen over LIME and Integrated Gradients due to its consistency and suitability for tabular IIoT data (Gaspar et al., 2024), the computational overhead, mitigated by feature reduction, may challenge ultra-constrained devices.
The results were validated in controlled simulations rather than continuously operating industrial systems, where event rates, tolerance to false alarms, and operational variability differ significantly. These factors require caution when extrapolating system performance to production environments and motivate future work on noise-resilient detection, long-term deployment, and dynamic criticality-aware policies.
Finally, recent advances in smart-infrastructure defense reinforce the relevance of distributed and explainable mechanisms. Hassine et al. (2025a) discuss the growing integration of AI, blockchain, and edge computing in smart-city cybersecurity, while Hassine et al. (2025b) demonstrate that combining GANs with GraphSAGE improves the detection of advanced persistent threats. These directions align with the principles of the present architecture, suggesting applicability beyond IIoT toward broader cyber-physical ecosystems.
6 Conclusions and future work
The autonomous defense system developed in this work shows that distributed intelligent agents, coordinated through federated learning and equipped with embedded explainability, can deliver accurate, low-latency, and interpretable threat detection directly at the IIoT edge. By avoiding centralized data aggregation and operating on constrained hardware, the architecture aligns with core Industry 4.0 requirements of scalability, decentralization, and operational autonomy, representing a substantive advance over signature-driven or centrally trained IDS models.
A key contribution is the validation of a DDPG-based distributed architecture capable of sustaining high accuracy under 0-day conditions. Through open-world evaluation with deliberate class exclusion, agents maintained a false-positive rate below 5% and reaction times between 158 and 260 ms while running on ARM Cortex-A72 devices with only 4 GB of RAM. This confirms that continuous-action reinforcement learning can meet the latency and determinism constraints of industrial communication standards such as OPC UA, even under resource limitations.
A second contribution is the integration of SHAP as an embedded interpretability mechanism operating within the inference cycle. Unlike post-hoc explainers, SHAP serves as an operational component: it provides decision traceability, supports detection of unstable agent behavior, and informs gradient-penalty policies that prevent compromised nodes from influencing federated updates. This produces a transparent and auditable decision pipeline consistent with ISA/IEC 62443 and ISO/IEC 27001 requirements.
Operational validation further confirms system robustness. Edge nodes maintained CPU consumption between 44% and 53%, memory usage below 1.5 GB, and inference latencies around 182 ms. During multi-node attacks, uncompromised nodes preserved full service availability, and isolated nodes reintegrated within 9 s with complete state recovery, demonstrating resilience and controlled self-adaptation rarely achieved jointly in DRL-based or federated security architectures.
Limitations remain. Zero-day evaluation via class exclusion cannot fully emulate polymorphic or adversarial behaviors; DDPG may destabilize under abrupt environmental shifts; and federated synchronization may temporarily lose cohesion under gradient divergence or connectivity loss. Although optimized, SHAP introduces computational overhead that restricts deployment on ultra-low-power microcontrollers.
Future work will focus on incorporating temporal-memory models (e.g., GRUs or transformers) to detect persistent or evolutionary threats, and on applying dynamic participant selection to strengthen federated reinforcement learning under intermittent connectivity or partially compromised nodes. Validation in continuously operating industrial facilities will enable assessment of real process impacts and long-term convergence. Additional work will refine the explainability interface with confidence indicators, study the sensitivity of temporal segmentation parameters, and extend the architecture to address training-time vulnerabilities, including poisoning, model extraction, and backdoor insertion, via federated trust-management and gradient-validation mechanisms.
Data availability statement
The data supporting the conclusions of this article are available from the corresponding author upon reasonable request.
Author contributions
WV-C: Funding acquisition, Validation, Writing – review and editing, Formal Analysis, Conceptualization, Methodology, Supervision, Investigation, Visualization. RG: Conceptualization, Visualization, Writing – original draft, Formal Analysis, Data curation, Software. JG: Investigation, Software, Visualization, Writing – original draft, Methodology, Data curation, Formal Analysis. PP: Writing – review and editing, Investigation, Conceptualization, Supervision, Visualization, Validation, Methodology.
Funding
The author(s) declared that financial support was not received for this work and/or its publication.
Conflict of interest
The author(s) declared that this work was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Generative AI statement
The author(s) declared that generative AI was not used in the creation of this manuscript.
Any alternative text (alt text) provided alongside figures in this article has been generated by Frontiers with the support of artificial intelligence and reasonable efforts have been made to ensure accuracy, including review by the authors wherever possible. If you identify any issues, please contact us.
Publisher’s note
All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.
References
Adekunle, T. S., Alabi, O. O., Lawrence, M. O., Adeleke, T. A., Afolabi, O. S., Ebong, G. N., et al. (2024). An intrusion system for internet of things security breaches using machine learning techniques. Artif. Intell. Applications 2 (3), 165–171. doi:10.47852/bonviewaia42021780
Agbedanu, P. R., Yang, S. J., Musabe, R., Gatare, I., and Rwigema, J. (2025). A scalable approach to internet of things and industrial internet of things security: evaluating adaptive self-adjusting memory K-Nearest neighbor for zero-day attack detection. Sensors 25 (1), 216. doi:10.3390/s25010216
Al-Hawawreh, M., and Hossain, M. S. (2024). Digital twin-driven secured edge-private cloud industrial internet of things (IIoT) framework. J. Netw. Comput. Appl. 226, 103888. doi:10.1016/j.jnca.2024.103888
Alblehai, F. (2025). Artificial intelligence-driven cybersecurity system for internet of things using self-attention deep learning and metaheuristic algorithms. Sci. Rep. 15 (1), 13215. doi:10.1038/s41598-025-98056-2
Alkhafaji, N., Viana, T., and Al-Sherbaz, A. (2024). Integrated genetic algorithm and deep learning approach for effective cyber-attack detection and classification in industrial internet of things (IIoT) environments. Arab. J. Sci. Eng. 50, 12071–12095. doi:10.1007/s13369-024-09663-6
Booij, T. M., Chiscop, I., Meeuwissen, E., Moustafa, N., and Den Hartog, F. T. H. (2022). ToN_IoT: the role of heterogeneity and the need for standardization of features and attack types in IoT network intrusion data sets. IEEE Internet Things J. 9 (1), 485–496. doi:10.1109/JIOT.2021.3085194
Chaturvedi, A., Kukreti, S., Bhavani, A. D., Nizampatnam, V. N. R. K., Kaur, D., and Natrayan, L. (2024). “Adaptive defense mechanisms against zero-day attacks in wireless sensor networks,” in International conference on distributed systems, computer networks and cybersecurity (ICDSCNC 2024). doi:10.1109/ICDSCNC62492.2024.10939775
Chen, J., Li, J., Huang, R., Yue, K., Chen, Z., and Li, W. (2022). Federated transfer learning for bearing fault diagnosis with discrepancy-based weighted federated averaging. IEEE Trans. Instrum. Meas. 71, 1–11. doi:10.1109/TIM.2022.3180417
Chen, Y., Yang, Q., He, S., Shi, Z., Chen, J., and Guizani, M. (2024). FTPipeHD: a fault-tolerant pipeline-parallel distributed training approach for heterogeneous edge devices. IEEE Trans. Mobile Comput. 23 (4), 3200–3212. doi:10.1109/TMC.2023.3272567
Djebbar, F., and Nordstrom, K. (2023). A comparative analysis of industrial cybersecurity standards. IEEE Access 11, 85315–85332. doi:10.1109/ACCESS.2023.3303205
Gadallah, W. G., Ibrahim, H. M., and Omar, N. M. (2024). A deep learning technique to detect distributed denial of service attacks in software-defined networks. Computers and Security 137, 103588. doi:10.1016/j.cose.2023.103588
Gaspar, D., Silva, P., and Silva, C. (2024). Explainable AI for intrusion detection systems: LIME and SHAP applicability on multi-layer perceptron. IEEE Access 12, 30164–30175. doi:10.1109/ACCESS.2024.3368377
Gawde, S., Patil, S., Kumar, S., Kamat, P., Kotecha, K., and Alfarhood, S. (2024). Explainable predictive maintenance of rotating machines using LIME, SHAP, PDP, and ICE. IEEE Access 12, 29345–29361. doi:10.1109/ACCESS.2024.3367110
Hairab, B. I., Aslan, H. K., Elsayed, M. S., Jurcut, A. D., and Azer, M. A. (2023). Anomaly detection of zero-day attacks based on CNN and regularization techniques. Electronics (Switzerland) 12 (3), 573. doi:10.3390/electronics12030573
Hassine, L., Quadar, N., Ledmaoui, Y., Chaibi, H., Saadane, R., Chehri, A., et al. (2025a). Enhancing smart grid security in smart cities: a review of traditional approaches and emerging technologies. Applied Energy 398, 126430. doi:10.1016/j.apenergy.2025.126430
Hassine, L., Chaibi, H., Rahouti, M., Saadane, R., Chehri, A., and Mehdary, A. (2025b). GAN-driven feature selection and GraphSAGE for advanced persistent threat defense in smart grids. Arab. J. Sci. Eng. 1–18. doi:10.1007/s13369-025-10636-6
Hathout, B., Shepherd, P., Dagiuklas, T., Nagaty, K., Hamdy, A., and Rodriguez, J. (2025). Adaptive trust management for data poisoning attacks in MEC-based FL infrastructures. IEEE Open Journal of the Communications Society 6, 3140–3160. doi:10.1109/OJCOMS.2024.3523368
Hazman, C., Guezzaz, A., Benkirane, S., and Azrour, M. (2022). IDS-SIoEL: intrusion detection framework for IoT-Based smart environments security using ensemble learning. Cluster Comput. 26, 4069–4083. doi:10.1007/s10586-022-03810-0
Hazman, C., Guezzaz, A., Benkirane, S., and Azrour, M. (2024). Enhanced IDS with deep learning for IoT-Based smart cities security. Tsinghua Sci. Technol. 29 (4), 929–947. doi:10.26599/TST.2023.9010033
Hesham, E., Hamdy, A., and Nagaty, K. (2025). “A federated learning framework with self-attention and deep reinforcement learning for IoT intrusion detection,” in Proceedings of the 2024 13th international conference on software and information engineering (ICSIE ’24) (New York, NY, USA: Association for Computing Machinery), 88–94. doi:10.1145/3708635.3708649
Kim, S., Yoon, S., and Lim, H. (2021). Deep reinforcement learning-based traffic sampling for multiple traffic analyzers on software-defined networks. IEEE Access 9, 47815–47827. doi:10.1109/ACCESS.2021.3068459
Lu, T., and Wang, J. (2024). DOMR: toward deep open-world malware recognition. IEEE Trans. Inf. Forensics Secur. 19, 1455–1468. doi:10.1109/TIFS.2023.3338469
Madsen, M., Palmin, A., Stutz, A., Maurmaier, M., and Barth, M. (2023). Security Analyse des MTP Konzepts. atp Magazin 65 (8), 71–79. doi:10.17560/atp.v65i8.2673
Mallampati, S. B., and Hari, S. (2023). Fusion of feature ranking methods for an effective intrusion detection system. Computers, Materials and Continua 76 (2), 1721–1744. doi:10.32604/cmc.2023.040567
Meidan, Y., Bohadana, M., Mathov, Y., Mirsky, Y., Shabtai, A., Breitenbacher, D., et al. (2018). N-BaIoT—Network-based detection of IoT botnet attacks using deep autoencoders. IEEE Pervasive Comput. 17 (3), 12–22. doi:10.1109/MPRV.2018.03367731
Moustafa, N. (2019). “New generations of internet of things datasets for cybersecurity applications based on machine learning: ton_iot datasets,” in eResearch Australia Asia 2019, October.
Ren, K., Liu, L., Bai, H., and Wen, Y. (2024). “A dynamic reward-based deep reinforcement learning for IoT intrusion detection,” in 2024 2nd international conference on intelligent communication and networking (ICN 2024), 110–114. doi:10.1109/ICN64251.2024.10865958
Safa, M., Green, K. W., Zelbst, P. J., and Sower, V. E. (2023). Enhancing supply chain through implementation of key IIoT technologies. J. Comput. Syst. Sci. 63 (2), 410–420. doi:10.1080/08874417.2022.2067792
Saheed, Y. K., Abdulganiyu, O. H., and Tchakoucht, T. A. (2023). A novel hybrid ensemble learning for anomaly detection in industrial sensor networks and SCADA systems for smart city infrastructures. J. King Saud Univ. Comput. Inf. Sci. 35 (5), 101532. doi:10.1016/j.jksuci.2023.03.010
Saikam, J., and Koteswararao, C. (2024). EESNN: hybrid deep learning empowered spatial-temporal features for network intrusion detection system. IEEE Access 12, 15930–15945. doi:10.1109/ACCESS.2024.3350197
Serhane, A., Hamzaoui, E. M., and Ibrahimi, K. (2023). “IA applied to IIoT intrusion detection: an overview,” in Proceedings of the 10th international conference on wireless networks and Mobile communications (WINCOM 2023). doi:10.1109/WINCOM59760.2023.10323032
Sezgin, A., and Boyacı, A. (2023). AID4I: an intrusion detection framework for industrial internet of things using automated machine learning. Computers, Materials and Continua 76 (2), 2121–2143. doi:10.32604/cmc.2023.040287
Shruthi, N., and Siddesh, G. K. (2023). Trust metric-based anomaly detection via deep deterministic policy gradient reinforcement learning framework. Int. J. Comput. Netw. Commun. 15 (6), 01–25. doi:10.5121/ijcnc.2023.15601
Sobchuk, V., Pykhnivskyi, R., Barabash, O., Korotin, S., and Omarov, S. (2024). Sequential intrusion detection system for zero-trust cyber defense of IoT/IIoT networks. Advanced Information Systems 8 (3), 92–99. doi:10.20998/2522-9052.2024.3.11
Suhail, S., Iqbal, M., Hussain, R., and Jurdak, R. (2023). ENIGMA: an explainable digital twin security solution for cyber–physical systems. Comput. Ind. 151, 103961. doi:10.1016/j.compind.2023.103961
Suryotrisongko, H., Musashi, Y., Tsuneda, A., and Sugitani, K. (2022). Robust botnet DGA detection: blending XAI and OSINT for cyber threat intelligence sharing. IEEE Access 10, 34613–34624. doi:10.1109/ACCESS.2022.3162588
Tanveer, A., Sinha, R., and Kuo, M. M. Y. (2021). Secure links: secure-By-design communications in IEC 61499 industrial control applications. IEEE Trans. Industr. Inform. 17 (6), 3992–4002. doi:10.1109/TII.2020.3009133
Tareq, I., Elbagoury, B. M., El-Regaily, S. A., and El-Horbaty, E. S. M. (2024). Deep reinforcement learning approach for cyberattack detection. Int. J. Online Biomed. Eng. 20 (5), 15–30. doi:10.3991/ijoe.v20i05.48229
Verma, P., Bharot, N., Breslin, J. G., O’Shea, D., Vidyarthi, A., and Gupta, D. (2024). Zero-day guardian: a dual model enabled federated learning framework for handling zero-day attacks in 5G enabled IIoT. IEEE Trans. Consum. Electron. 70 (1), 3856–3866. doi:10.1109/TCE.2023.3335385
Wiemas, N. G. K., and Suroso, J. S. (2024). Analysis of risk management information system applications using ISO/IEC 27001:2022. Syntax Literate Jurnal Ilmiah Indonesia 7 (11), 18372–18391. doi:10.36418/syntax-literate.v7i11.15426
Yang, F., Zhang, S., Liu, C., Huang, J., Yu, T., Yu, K., et al. (2024). A hierarchical network management strategy for distributed CIIoT with imperfect CSI. IEEE Internet Things J. 11 (8), 13509–13523. doi:10.1109/JIOT.2023.3337897
Zhang, H., and Maple, C. (2023). Deep reinforcement learning-based intrusion detection in IoT system: a review. IET Conference Proceedings 2023 (14), 88–97. doi:10.1049/icp.2023.2577
Keywords: artificial intelligence, autonomous intrusion detection, explainable artificial intelligence (XAI), federated learning, zero-day threats in IIoT
Citation: Villegas-Ch W, Gutierrez R, Govea J and Palacios P (2026) Autonomous federated defense for zero-day threats in IIoT: explainable agents with real-time edge inference. Front. Commun. Netw. 6:1697204. doi: 10.3389/frcmn.2025.1697204
Received: 02 September 2025; Accepted: 22 December 2025;
Published: 28 January 2026.
Edited by:
Ali Ismail Awad, United Arab Emirates University, United Arab Emirates
Reviewed by:
Babu R. Dawadi, Tribhuvan University (Pulchowk Campus), Nepal
Jozef Papán, University of Žilina, Slovakia
Manuel J. C. S. Reis, University of Trás-os-Montes and Alto Douro, Portugal
Copyright © 2026 Villegas-Ch, Gutierrez, Govea and Palacios. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
*Correspondence: William Villegas-Ch, william.villegas@udla.edu.ec
†These authors have contributed equally to this work