Decentralized intelligence in SBs: Federated LSTM-powered digital twins for sustainability

Rajaram, Prabhu; O. V., Gnana Swathika

doi:10.3389/fbuil.2025.1696702

ORIGINAL RESEARCH article

Front. Built Environ., 10 December 2025

Sec. Indoor Environment

Volume 11 - 2025 | https://doi.org/10.3389/fbuil.2025.1696702

This article is part of the Research Topic: Sustainable Indoor Environment For The Comfort And Well-Being Of Buildings’ Users

Prabhu Rajaram¹

Gnana Swathika O. V.²*

¹School of Electrical Engineering, Vellore Institute of Technology, Chennai, India
²Centre for Smart Grid Technologies, Vellore Institute of Technology, Chennai, India

Introduction: The rapid growth of intelligent, data-driven building automation systems presents significant challenges in terms of data privacy, scalability, and heterogeneity across distributed environments. Conventional centralized machine learning approaches require sensitive sensor data to be aggregated in a central server, which raises serious privacy concerns and limits real-time responsiveness. To overcome these issues, this study introduces a unified framework that integrates Federated Learning (FL) and Digital Twin (DT) technologies for privacy-preserving, real-time occupancy detection in smart building systems.

Methods: The proposed framework employs a Long Short-Term Memory (LSTM) model to capture temporal patterns in multivariate time-series data collected from environmental sensors. Model training is conducted collaboratively across distributed client devices using the Federated Averaging (FedAvg) algorithm, ensuring that raw data never leaves local devices. A personalized fine-tuning stage is incorporated to improve model performance under non-identically distributed (non-IID) data conditions and to enhance local adaptability. The trained model is deployed within a Streamlit-based digital twin platform to enable real-time visualization of occupancy states, sensor behavior, and model predictions, including rolling forecasts, confidence estimates, and error diagnostics.

Results: The integrated FL–DT framework enables accurate and privacy-preserving occupancy detection across distributed environments while maintaining scalability and adaptability. Personalized fine-tuning significantly enhances local prediction performance and robustness under heterogeneous data conditions. The digital twin interface provides continuous situational awareness through live visualization and analytics, supporting timely decision-making and system-level transparency.

Discussion: The results demonstrate that combining federated temporal learning with digital twin technology effectively addresses privacy, scalability, and operational challenges in smart building systems. Beyond improving occupancy detection, the framework enables proactive energy management and interpretability through interactive system monitoring. This integrated approach contributes toward the deployment of scalable, secure, and sustainability-aware smart building infrastructures.

1 Introduction

The rapid evolution of smart building (SB) infrastructure necessitates advanced automation systems that are capable of adapting to dynamic occupant behavior while ensuring energy efficiency, privacy preservation, and scalability. Occupancy detection serves as the cornerstone of intelligent control for heating, ventilation, air-conditioning (HVAC), and lighting systems; however, conventional centralized machine learning (ML) approaches face three major limitations. First, they pose significant data privacy risks, as aggregating raw occupancy sensor data to a central server can expose sensitive patterns of human presence. Second, they demonstrate limited adaptability, often failing to capture local variations in environmental conditions, occupant schedules, and sensor configurations. Third, they suffer from weak integration with actionable systems, where even accurate predictive models are rarely coupled with real-time operational tools such as digital twins (DTs) that visualize and respond to occupancy predictions. Federated learning (FL) emerges as a promising paradigm for privacy-preserving model training across decentralized clients, eliminating the need to share raw data while enabling collaborative intelligence. When combined with temporal deep-learning models such as long short-term memory (LSTM) networks, FL effectively captures complex sequential dependencies in environmental sensor streams while maintaining data locality. 
Although prior research applies FL to tasks such as energy prediction, anomaly detection, and limited occupancy forecasting, several critical gaps remain unaddressed: (i) temporal FL models for occupancy detection are underexplored, with most existing studies relying on static or aggregated features; (ii) integration of FL-based sequence models with operational DT platforms for real-time decision-making is rare; and (iii) the sustainability benefits of federated occupancy models measured in terms of energy savings, CO₂ reduction, and occupant comfort are seldom quantified.

In this work, we address these gaps by developing a federated LSTM (FLSTM) framework that has the following features:

Privacy-preserving: all training occurs on local clients, with only the model weights exchanged using the federated averaging (FedAvg) algorithm.

Temporally aware: an LSTM architecture captures time-series dependencies in environmental features.

Operationally integrated: a Streamlit-based DT interface delivers real-time occupancy predictions, confidence scores, and error diagnostics.

Sustainability-driven: evaluation includes energy savings, CO₂ emission reduction, and occupant comfort, which are aligned with the United Nations Sustainable Development Goals (UN SDGs).

In addition to developing the FLSTM-DT framework, comparative benchmarking is conducted against a centralized LSTM model, conventional ML baselines (random forest and support vector machine), and a federated nontemporal model (Fed-MLP).

The results demonstrate that the proposed framework achieves 98.50% accuracy, a 22% reduction in energy consumption, and 17.6 kg/day CO₂ savings, all while ensuring 100% local computation. These findings validate the feasibility of deploying privacy-preserving, sustainability-enhancing artificial intelligence (AI) in real-world SB environments.

2 Literature review

FL emerges as a promising paradigm for enabling privacy-preserving intelligence in SB and energy systems. Mitra et al. (2021) explored the impact of FL on SBs, emphasizing its scalability and data confidentiality. Similarly Sater and Hamza (2021) developed a federated anomaly detection framework that maintains privacy while improving detection accuracy. Khan et al. (2025) proposed a secure and transparent energy management system leveraging explainable AI (XAI) within FL, whereas Dasari et al. (2021) demonstrated effective privacy-enhanced energy prediction using FL techniques. To optimize energy efficiency, Abboud et al. (2023) introduced a hybrid aggregation strategy in FL that improves prediction accuracy in SBs. Wang et al. (2022) further combined FL with transfer learning to preserve privacy in HVAC regulation across heterogeneous environments. Extending this, Wang et al. (2023) proposed an adaptive FL system tailored for community-level load forecasting and anomaly prediction. Tang et al. (2023) introduced a few-shot learning approach using FL, enabling robust energy prediction in data-constrained buildings. Khan et al. (2022) integrated LSTM with FL to predict occupancy, showing enhanced performance while preserving data locality. Gao et al. (2021) presented a decentralized FL model for neighborhood-level load forecasting, addressing scalability and fault tolerance. Cheng et al. (2022) offered a broad review of FL applications in energy systems, identifying challenges such as heterogeneity and communication overhead. Al-Huthaifi et al. (2023) provided a security-centric review of FL in smart cities, categorizing privacy threats and mitigation strategies. Complementarily, Wen et al. (2023) examined FL applications and technical barriers, including personalization and optimization. Zheng et al. (2022) structured the taxonomy of FL use cases in smart cities, identifying the research gaps and scalability challenges. 
Munawar and Piantanakulchai (2024) applied FL to forecast autonomous taxi demand while ensuring user data privacy. Chen et al. (2025) proposed a framework for trustworthy FL, addressing robustness, fairness, and secure collaboration. Alam and Gupta (2022) emphasized the role of FL in IoT privacy, reinforcing its potential in smart connected environments. Ji et al. (2024) discussed evolving trends such as federated X learning, highlighting model fusion and new learning strategies. Shaheen et al. (2022) categorized FL applications and challenges, including heterogeneity and limited computational resources, laying a strong foundation for ongoing research.

DTs are widely adopted in smart infrastructure for real-time monitoring and control. Tao et al. (2021) established foundational DT architectures focusing primarily on hardware-level simulations in building systems. Elkliny et al. (2025) advanced this by integrating building information modeling (BIM) with IoT data to enable energy-use visualization, but this system lacks predictive capabilities. More recently, García-Hernando et al. (2024), Grieves and Vickers (2016), Tao et al. (2018), Fuller et al. (2020), Kritzinger et al. (2018), Shao et al. (2020), and Khajavi et al. (2019) developed a DT dashboard tailored for smart classrooms, but they did not incorporate ML-based occupancy forecasting, which limits its proactive functionality.

However, privacy risks in SBs extend beyond the general concern of raw data exposure. The heterogeneous sensor ecosystem, including occupancy detectors, temperature and CO₂ sensors, and smart appliances, can inadvertently reveal sensitive information such as presence schedules, behavioral routines, and even health-related attributes of the occupants. Privacy leakage is not limited to centralized storage; adversaries may exploit techniques such as model inversion, membership inference, or gradient leakage attacks to reconstruct private information from shared parameters (Hu et al., 2022; Kang et al., 2019; Minerva et al., 2020; Qi et al., 2021). While FL addresses the challenge of centralized data collection by keeping data local, it is not a complete safeguard. Without additional privacy-preserving mechanisms such as secure aggregation, differential privacy, or homomorphic encryption, FL-based frameworks remain vulnerable. These concerns align with broader cybersecurity issues in smart cities (Ma, 2021) and are amplified when distributed, context-aware models such as transformers are applied to high-resolution energy and occupancy data (Dai and Bai, 2025). More recent work demonstrates that combining FL with secure, transparent energy management systems using explainable AI can significantly enhance both privacy protection and interpretability in SB environments [35].

To strengthen the contextual foundation of this work, the literature review is expanded to include both core DT methodologies and their integration with FL. Foundational studies on DT frameworks and classifications (Grieves and Vickers, 2016) highlight their role in real-time synchronization of physical and virtual systems. Applications in SB and energy management (Shao et al., 2020) further demonstrate the potential of DT for sustainability and operational efficiency. More recently, emerging research has explored the convergence of FL and DT to enable privacy-preserving and distributed intelligence in IoT and cyber–physical systems (Kritzinger et al., 2018; Khajavi et al., 2019; Kang et al., 2019). Incorporating these perspectives provides a balanced coverage of both FL and DT domains while situating our contribution within the state of the art.

Despite rapid developments in sustainable SB, the following research gaps have been identified:

• Temporal modeling (e.g., LSTM) is underutilized in federated occupancy detection.

• Privacy-preserving learning is not coupled with DT deployment for actionable smart control.

• There is a lack of end-to-end systems that combine FLSTM, real-time visualization, and sustainability metrics for SBs.

Section 3 of this study addresses the key gaps in SB intelligence by applying FLSTM models for accurate time-series occupancy prediction. The proposed models are deployed within a Streamlit-based DT interface that enables real-time simulation and visualization of occupancy dynamics. In Section 4, the model performance is rigorously evaluated through rolling window predictions, confidence score analysis, and sustainability metrics such as energy savings and CO₂ emission reduction. Furthermore, the framework demonstrates privacy-preserving and personalized federated training, thus ensuring secure data handling across distributed clients while optimizing energy consumption through informed, demand-driven control strategies. Section 5 concludes the work by summarizing its outcomes and outlining the future scope.

3 Methodology

The architectural design of this study includes the data preprocessing pipeline, federated training protocol, DT integration, and evaluation metrics.

3.1 System architecture overview

The proposed system integrates a privacy-preserving FLSTM model with a Streamlit-based DT for real-time SB occupancy detection and sustainability analysis. The architecture, which is illustrated in Figure 1, consists of four primary components:

Flowchart illustrating a federated learning system for sensor network data. The process begins with data collection (temperature, humidity, CO2, light) and normalization. Pre-processed data feeds into local LSTM models for training. Outputs are aggregated through federated averaging. The digital twin interface predicts occupancy, evaluated by metrics: accuracy, precision, recall, F1-score, energy savings, CO2 reduction, FPS, and false positive reduction.

Figure 1. Methodology.

Sensor network: temperature, humidity, light, CO₂ concentration, and humidity ratio readings are streamed from multiple building zones.

Federated edge clients: each zone is represented as a client hosting a local LSTM model.

Federated aggregator: a central server executes FedAvg to update the global model without accessing raw data.

Digital-twin interface: this provides real-time visualization of the occupancy states, model confidence, and sustainability metrics for decision-making.

3.2 Data preprocessing pipeline

In this study, we utilize the UCI Occupancy Detection Dataset, comprising 20,560 time-stamped readings. Preprocessing involves the following steps:

Timestamp alignment and sorting: this ensures temporal consistency across features.

Min–max normalization: this scales features to [0, 1], improving gradient stability.

Sliding window segmentation: this generates overlapping sequences of length T = 10 time-steps to capture temporal dependencies.

Client partitioning: the dataset is split into three non-IID clients, each corresponding to an independent spatial unit (room A, room B, and a corridor), to simulate realistic variations in occupancy patterns between zones. Partitioning was carried out using continuous temporal segments to preserve occupancy cycles within each zone, thereby ensuring non-IID distributions across clients. This zoning approach reflects the real-world scenario in SBs, where occupancy patterns differ across rooms and areas. Table 1 summarizes the distribution of the samples and the occupancy ratios for each zone. Such partitioning highlights the challenge that local models face when trained on limited data and the necessity of FL to enable collaborative knowledge sharing across zones.

Table 1. Data zoning summary.

Sampling interval: each record in the University of California, Irvine (UCI) Occupancy Detection Dataset is sampled at 1-min intervals, providing sufficient temporal granularity to capture dynamic variations in environmental conditions such as temperature, CO₂ concentration, humidity, and light intensity.

Ground truth: occupancy ground truth was established using a passive infrared (PIR) motion sensor and manually validated during the data collection period by Candanedo and Feldheim (2016), the original dataset contributors. This ensures reliable labeling of the occupied and unoccupied states.

Time range: the dataset spans four consecutive days from 2 February to 6 February 2015, covering both occupied and unoccupied periods during typical working hours (09:00–18:00) and off-hours. This range provides diverse environmental conditions for training and evaluation.

Justification: the preprocessing and data partitioning strategies were chosen for the following reasons. Min–max normalization prevents any single sensor from dominating the model learning process. A sliding window size of T = 10 is selected after testing window lengths of 5, 10, and 15, thus balancing the model accuracy and computational efficiency. Non-IID data partitioning mimics real-world building heterogeneity, which makes the FL setup more realistic and representative of practical deployment conditions.
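The preprocessing steps above can be sketched in a few lines of plain Python. This is a minimal illustration rather than the authors' pipeline: the function names are hypothetical, each window is assumed to take the label of its last time step, and the three clients are assumed to be near-equal contiguous segments (the actual zone sizes are those reported in Table 1).

```python
from typing import List, Sequence, Tuple


def min_max_scale(rows: Sequence[Sequence[float]]) -> List[List[float]]:
    """Scale every feature column to [0, 1]; constant columns map to 0.0."""
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]


def sliding_windows(rows: Sequence, labels: Sequence[int],
                    window: int = 10) -> List[Tuple[Sequence, int]]:
    """Build overlapping (sequence, label) pairs.

    Assumption: the label of a window is the label at its last time step.
    """
    return [
        (rows[i:i + window], labels[i + window - 1])
        for i in range(len(rows) - window + 1)
    ]


def contiguous_split(items: Sequence, n_clients: int = 3) -> List[Sequence]:
    """Cut items into n_clients consecutive, near-equal temporal segments."""
    size, extra = divmod(len(items), n_clients)
    parts, start = [], 0
    for k in range(n_clients):
        end = start + size + (1 if k < extra else 0)
        parts.append(items[start:end])
        start = end
    return parts


# Toy data: 12 time-sorted readings with two features (temperature, CO2).
rows = [[20.0 + i, 400.0 + 10.0 * i] for i in range(12)]
labels = [0] * 6 + [1] * 6

scaled = min_max_scale(rows)
windows = sliding_windows(scaled, labels, window=10)
clients = contiguous_split(windows, n_clients=3)
```

Because the split uses consecutive segments rather than random sampling, each client sees a different stretch of the occupancy cycle, which is what makes the client distributions non-IID.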

3.3 LSTM model design

Each client hosts a lightweight LSTM architecture optimized for edge deployment:

Input shape: (10 time steps × 5 features).

LSTM layer: 32 hidden units, tanh activation, and recurrent dropout = 0.2.

Dense layer: 16 units, ReLU activation.

Output layer: 1 unit, sigmoid activation for binary classification.

Training parameters include the following:

Loss function: binary cross-entropy.

Optimizer: Adam (learning rate = 0.001, β₁ = 0.9, and β₂ = 0.999).

Batch size: 32.

Epochs per client per round: 1.

Federated rounds: 20.

Justification: LSTM was chosen over GRU because it captured temporal dependencies slightly better for this dataset (empirically +0.3% accuracy), and 32 hidden units provided the best trade-off between accuracy and inference latency on Raspberry Pi-class hardware (∼24 ms per prediction).
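To give a sense of how lightweight this architecture is, the following sketch counts its trainable parameters. It assumes the standard LSTM formulation with four gates and a single bias vector per gate; the paper does not name the deep-learning framework, and implementations that keep two bias vectors per gate would report a slightly larger count. The function names are illustrative.

```python
def lstm_params(input_dim: int, units: int) -> int:
    """Four gates, each with an input kernel, a recurrent kernel, and a bias."""
    return 4 * (units * (input_dim + units) + units)


def dense_params(input_dim: int, units: int) -> int:
    """Weight matrix plus one bias per output unit."""
    return input_dim * units + units


lstm_layer = lstm_params(input_dim=5, units=32)     # 5 features -> 32 units
dense_layer = dense_params(input_dim=32, units=16)  # 32 -> 16, ReLU
output_layer = dense_params(input_dim=16, units=1)  # 16 -> 1, sigmoid

total_params = lstm_layer + dense_layer + output_layer
payload_bytes = total_params * 4  # assuming 32-bit floats

print(total_params, payload_bytes)  # 5409 21636
```

Under these assumptions the model has 5,409 parameters, so one uncompressed set of weights is roughly 21 KB per client per round.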

3.4 Federated training protocol

The FedAvg algorithm is adopted for weight aggregation, where each client trains its local model for one epoch using only its own data. After local training, the clients send their updated model weights (but not the data) to the central server. The server then performs weight averaging to generate a global model, which is subsequently redistributed to all clients. To address client heterogeneity, a personalized fine-tuning phase is introduced after global convergence, during which each client retrains the global model for three additional local epochs.
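The server-side averaging step can be illustrated as follows. This is a minimal sketch, not the authors' implementation: it assumes the standard FedAvg weighting by each client's number of training samples (the text does not state whether the average is weighted), and it represents each layer as a flat list of floats for simplicity.

```python
from typing import List, Sequence

Weights = List[List[float]]  # one flat list of floats per layer


def fed_avg(client_weights: Sequence[Weights],
            client_sizes: Sequence[int]) -> Weights:
    """Average client weights layer by layer, weighted by sample count."""
    total = sum(client_sizes)
    averaged: Weights = []
    for layer in range(len(client_weights[0])):
        layer_len = len(client_weights[0][layer])
        averaged.append([
            sum(w[layer][j] * n for w, n in zip(client_weights, client_sizes)) / total
            for j in range(layer_len)
        ])
    return averaged


# Two toy clients with a two-layer "model"; client B holds three times more data.
client_a = [[1.0, 2.0], [0.0]]
client_b = [[3.0, 4.0], [1.0]]
global_weights = fed_avg([client_a, client_b], client_sizes=[1, 3])
print(global_weights)  # [[2.5, 3.5], [0.75]]
```

Only the weight lists cross the network; the raw sensor windows used to produce them stay on each client. The personalized fine-tuning phase would then start from `global_weights` on each client.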

3.5 Baseline and comparative models

To quantify performance gains, FLSTM is benchmarked against the following:

Centralized LSTM: same architecture, trained on aggregated data.

Random forest: 200 trees; max depth = none.

Support vector machine (SVM): RBF kernel, C = 1.0, gamma = “scale”.

Federated MLP: two hidden layers (32, 16 units), ReLU activation.

Local-only LSTM: each zone trains an LSTM exclusively on its own data without federated aggregation. While this approach maximizes privacy as no parameters are exchanged, it restricts generalization when zones contain limited or imbalanced data.

The evaluation metrics are as follows: accuracy, precision, recall, F1-score, energy savings (%), CO₂ reduction (kg/day), and occupant comfort index (%).

3.6 Digital twin integration

The trained global FLSTM model is deployed in a Streamlit dashboard for real-time occupancy prediction, confidence score visualization (used for threshold-based HVAC control), error diagnostics to identify false positives and negatives, and sustainability tracking that monitors energy and CO₂ savings. The DT updates the predictions every second, enabling real-time feedback loops with building control systems.
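As an illustration of threshold-based control driven by the confidence score, the sketch below turns a stream of predicted occupancy probabilities into HVAC on/off commands. The two thresholds (0.6 and 0.4) and the use of a hysteresis band are assumptions made for this example; the paper states only that control is threshold-based and does not give the values.

```python
from typing import Iterable, List


def hvac_command(confidence: float, currently_on: bool,
                 on_threshold: float = 0.6,
                 off_threshold: float = 0.4) -> bool:
    """Return the new HVAC state for one prediction.

    Hypothetical rule: switch on at or above on_threshold, switch off at or
    below off_threshold, and hold the current state in between so that
    borderline scores do not toggle the relay.
    """
    if confidence >= on_threshold:
        return True
    if confidence <= off_threshold:
        return False
    return currently_on


def run_control(confidences: Iterable[float],
                initially_on: bool = False) -> List[bool]:
    """Apply hvac_command to a stream of confidence scores."""
    state = initially_on
    states = []
    for c in confidences:
        state = hvac_command(c, state)
        states.append(state)
    return states


states = run_control([0.1, 0.55, 0.7, 0.5, 0.45, 0.3])
print(states)  # [False, False, True, True, True, False]
```

With a single threshold at 0.5, the scores 0.55, 0.5, and 0.45 would each sit on or near the switching point; the band keeps the relay in its previous state until the model is more certain.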

3.7 Implementation environment

The following experimental design and example values show the feasibility of the work:

The hardware setup (hardware-in-the-loop prototype) includes the following:

Sensor layer: DHT22 temperature and humidity sensors; PIR motion sensors for occupancy.

Controller layer: Raspberry Pi 4 (4 GB RAM) running MQTT broker (Mosquitto).

Actuator layer: smart relay to control HVAC fan/mini air-conditioner.

Communication protocol: Wi-Fi with MQTT protocol for low-latency messaging.

3.7.1 System workflow

When occupancy is detected, the sensor data are published via MQTT to a local broker. The federated LSTM model, running on a Raspberry Pi, then predicts the occupancy status. If the space is determined to be occupied, a relay triggers the HVAC unit to operate. Throughout the process, logs are collected to measure the latency, power consumption, and fault recovery performance.

To address the limitation of real-time control validation, a hardware-in-the-loop (HIL) prototype was developed by integrating physical sensors and actuators via the MQTT communication protocol, as shown in Figure 2. A Raspberry Pi 4 was utilized as the edge computing unit, interfaced with PIR motion and DHT22 temperature–humidity sensors for occupancy and environmental monitoring. The federated LSTM model executed locally on the device, with occupancy predictions directly triggering control signals through a smart relay connected to an HVAC unit.

Diagram illustrating a system with sensors detecting temperature, humidity, and carbon dioxide. Data is sent via MQTT to a client device with processing capabilities, shown by a graph for occupancy prediction over time. The client device communicates with a global server and an actuator, depicted by a lightbulb, also using MQTT.

Figure 2. Hardware-in-the-loop (HIL) prototype for real-time control.

Experimental evaluation demonstrated the feasibility and responsiveness of the proposed setup. The system achieved an average end-to-end latency of 210 ms, an MQTT transmission delay of approximately 30 ms, and a power consumption range of 6.5 W–8.2 W during the inference and control operation. Fault recovery from the MQTT broker failure was achieved within 2.5 s, maintaining a packet loss rate below 0.5% and a control accuracy of 99.1%. These results, summarized in Table 2, confirm that the proposed FLSTM-DT framework can be integrated effectively with real-world building automation systems under resource-constrained conditions.

Table 2. System workflow and performance evaluation metrics of the federated LSTM-based occupancy detection framework.

The HIL setup demonstrates that real-time control decisions can be executed reliably at the network edge, validating the practicality of FL in SB applications. By integrating local sensing (PIR and DHT22), MQTT-based communication, and intelligent actuation through the smart relay, the system provides a low-latency and privacy-preserving control loop suitable for deployment in energy-efficient environments.

4 Results and discussion

4.1 Dataset characteristics and suitability for federated learning

The University of California, Irvine (UCI) Occupancy Detection dataset contains 20,560 labeled time-series samples of temperature, humidity, light, CO₂, and humidity ratio, which enables both temporal modeling and privacy-preserving FL experiments.

Non-IID simulation: data were split into three clients by time segments, creating distinct occupancy patterns per zone.

Privacy sensitivity: occupancy data reflect personal presence, making it ideal to demonstrate the privacy benefits of FL.

Low-dimensional yet informative: the dataset has only five features plus a timestamp, ensuring compatibility with low-power IoT hardware while retaining predictive richness.

4.2 Model performance and benchmarking

The proposed FLSTM model achieves an impressive overall classification accuracy of 98.50%, as shown in Table 3, highlighting its capacity for effective generalization across distributed clients. This result underscores the potential of FL to achieve performance comparable to that of centralized methods while maintaining data privacy by avoiding direct access to raw data.

Table 3. Performance metrics of the federated LSTM model.

For the class 0 (not occupied) state, the model exhibits a remarkable precision of 99.28% and a recall of 98.77%. This implies that it rarely misclassifies unoccupied rooms and successfully identifies nearly all instances of non-occupancy. Such performance indicates that the model effectively learns to recognize environmental signals of non-occupancy, such as consistently low CO₂ levels and diminished light intensity. Furthermore, for the class 1 (occupied) state, the metrics demonstrate appreciable results, although slightly lower than those for class 0, with a precision of 95.96% and a recall of 97.62%. The presence of borderline environmental conditions, such as natural lighting in technically unoccupied rooms, may contribute to occasional misclassification. However, the model consistently detects the majority of occupied instances, showcasing its effectiveness in discerning subtle indicators of occupancy.

Balanced performance across both classes is evident, with F1-scores of 0.9902 for class 0 and 0.9678 for class 1, and a macro-averaged F1-score of 0.9790. This reflects the success of the FLSTM in capturing temporal dynamics and maintaining performance on the minority class 1, emphasizing the advantages of federated training in achieving class balance. Further notable performance improvements are observed over 20 training rounds, with accuracy rising from 98.12% at round 5 to 98.50% at round 20. This progressive enhancement signifies the stable convergence of the global model and the effectiveness of iterative aggregation, namely, FedAvg, in collaboratively refining model parameters among clients. Table 4 and Figure 3a present the comparative performance of the proposed FLSTM-DT framework against the baseline and alternative approaches, which were evaluated over five independent runs with 95% confidence intervals. The metrics include the classification performance, namely, accuracy, precision, recall, and F1-score, alongside sustainability-oriented indicators such as energy savings, CO₂ reduction, and comfort index that are obtained from the confusion matrix in Figure 3b.

Table 4. Model performance comparison across various approaches.

(a) A radar chart and a circular bar chart display model performance metrics such as precision, recall, and accuracy for various models including Fed-MLP and FLSTM. (b) A confusion matrix for Federated LSTM shows true and predicted labels with values: 15,615 true negatives, 4,627 true positives, 195 false positives, and 113 false negatives.

Figure 3. (a) Performance metrics radar chart (comparative). (b) Confusion matrix of the federated LSTM model.

Results and benchmarking: to further investigate the trade-offs between privacy and performance, a local-model-only baseline was implemented, in which each client trained its own LSTM model without federated aggregation. As shown in Table 5, whereas local-only training ensured maximal data confidentiality, the resulting models exhibited reduced generalization due to the limited quantity and diversity of the training samples available at each client. The average classification accuracy across three building zones was 94.2%, with zone A performing the best at 96.1% and zone C performing the lowest at 92.8%. In contrast, the proposed FLSTM model achieved a consistent 98.5% accuracy across all zones, demonstrating that federated aggregation improves generalization and model robustness without compromising data privacy.

Table 5. Local-only vs. federated LSTM accuracy comparison per zone.

These findings highlight that although local-only models guarantee maximum privacy, they compromise predictive robustness. FL offers a more balanced solution: it preserves privacy by keeping the raw data local while achieving superior accuracy through collaborative parameter sharing.

Key observations: the FLSTM model achieves performance comparable to that of the centralized LSTM model (p > 0.05, paired t-test), demonstrating that privacy-preserving federated training does not compromise accuracy. In contrast, Fed-MLP and classical machine learning baselines show 2%–6% lower accuracy, underscoring the advantage of temporal modeling. Furthermore, sustainability metrics align with predictive performance, indicating that improved accuracy directly contributes to greater energy efficiency and CO₂ savings.

4.3 Error analysis

The confusion matrix in Figure 3b illustrates the classification performance of the FLSTM model on a test set comprising 20,550 samples. The model correctly predicts 15,615 instances as not occupied (true negatives) and 4,627 instances as occupied (true positives), indicating a strong ability to distinguish the occupancy states. However, 195 samples are misclassified as occupied when they are actually not occupied (false positives), and 113 samples labeled as occupied were incorrectly predicted as not occupied (false negatives). These outcomes highlight the robustness of the model while also emphasizing areas for potential refinement for minimizing misclassification, particularly in occupancy detection scenarios that are critical for energy-efficient SB operations.
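The per-class metrics reported in Section 4.2 follow directly from these four counts, as the short check below shows. The helper name `class_metrics` is introduced here for illustration only.

```python
def class_metrics(tp: int, fp: int, fn: int):
    """Return (precision, recall, F1) for one class."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Counts from the confusion matrix in Figure 3b.
TN, TP, FP, FN = 15615, 4627, 195, 113

accuracy = (TP + TN) / (TP + TN + FP + FN)

# Class 1 (occupied): the usual definitions.
occupied = class_metrics(tp=TP, fp=FP, fn=FN)

# Class 0 (not occupied): the roles swap, so true negatives act as the
# "hits", false negatives as false alarms, and false positives as misses.
not_occupied = class_metrics(tp=TN, fp=FN, fn=FP)

macro_f1 = (occupied[2] + not_occupied[2]) / 2

print(f"accuracy     {accuracy:.4f}")
print("occupied     P={:.4f} R={:.4f} F1={:.4f}".format(*occupied))
print("not occupied P={:.4f} R={:.4f} F1={:.4f}".format(*not_occupied))
print(f"macro F1     {macro_f1:.4f}")
```

The computed values agree with the reported 98.50% accuracy, 95.96%/97.62% precision/recall for class 1, 99.28%/98.77% for class 0, and a macro-averaged F1-score of 0.9790.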

4.3.1 Class 0 (not occupied)

Precision of 99.28% and recall of 98.77% indicate an extremely low false-positive rate, demonstrating the model’s strong ability to correctly identify non-occupied periods.

4.3.2 Class 1 (occupied)

Precision of 95.96% and recall of 97.62% reflect occasional misclassifications, which were primarily due to borderline environmental conditions (e.g., natural lighting in empty rooms).

4.3.3 False positives

False positives are primarily concentrated at transitions from occupied to unoccupied states, which are possibly caused by delayed decay in CO₂ or light levels after the occupants leave.

4.3.4 False negatives

False negatives are rare (<0.6%), but they generally occur during early occupancy periods before the environmental changes (such as CO₂ buildup or temperature rise) stabilize.

4.3.5 Mitigation strategies

Temporal smoothing, sensor fusion with motion detectors, and adaptive thresholding in the DT interface are effective approaches to further minimize these minor errors. The results show consistently high recall and precision across all classes, indicating strong discriminative power. The low misclassification rate verifies the suitability of the FLSTM-DT model for real-world deployment in privacy-sensitive SB environments.
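One simple form of temporal smoothing is a trailing majority vote over the most recent binary predictions. The sketch below is an illustrative assumption, not the smoothing method used in the paper; the window length of 3 and the function name are hypothetical.

```python
from collections import deque
from typing import Iterable, List


def smooth_predictions(predictions: Iterable[int], window: int = 3) -> List[int]:
    """Replace each prediction with the majority vote of the last `window` values.

    Ties (possible only while the buffer holds an even number of values)
    resolve to 0, i.e. "not occupied".
    """
    recent = deque(maxlen=window)
    smoothed = []
    for p in predictions:
        recent.append(p)
        smoothed.append(1 if 2 * sum(recent) > len(recent) else 0)
    return smoothed


raw = [0, 0, 1, 0, 0, 1, 1, 1, 0, 1]
print(smooth_predictions(raw))  # [0, 0, 0, 0, 0, 0, 1, 1, 1, 1]
```

In this toy sequence the isolated false positive at index 2 and the isolated dropout at index 8 are both suppressed, at the cost of detecting the real onset of occupancy one step late (index 6 instead of 5). This latency trade-off matters because false negatives tend to occur at the start of occupancy periods.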

Figure 4 illustrates the relationship between the number of FL rounds and the global model loss, highlighting the convergence behavior of the training process across the distributed nodes. The main observations are as follows:

Figure 4. Global model loss versus federated learning rounds showing convergence across the distributed nodes.

Overall loss reduction: loss decreases from approximately 0.0565 in round 1 to approximately 0.0490 in round 20, reflecting a consistent and efficient convergence of the global model. This steady decline confirms that the FedAvg algorithm effectively aggregates and learns model weights from the participating clients.

Fluctuations and spikes: steep spikes in loss are observed at rounds 6 and 12, indicating possible client data heterogeneity or noise. Local model variations arising from non-IID data distributions contribute to these perturbations. Despite such fluctuations, the model recovers rapidly, exhibiting robustness and stability during federated training.

Stability after round 13: after round 13, the loss curve levels off with negligible fluctuation, stabilizing between 0.0485 and 0.0495. This plateau indicates that the model reaches a converged or stable training phase, with diminishing marginal improvements.

Training maturity: the late-stage stability suggests that 20 rounds are sufficient for efficient federated training under the current configuration. Although additional rounds may yield incremental improvements, early convergence also implies computational efficiency and reduced communication overhead.

The observed loss curve in Figure 4 highlights the robustness of the FLSTM model, where a strong global convergence trend is evident even with decentralized and potentially non-IID data distributions. The decline and convergence of loss values from one round to the next reflect the stability of the model during training, particularly toward the end of federated optimization. These findings affirm the practical feasibility of FL in SB occupancy prediction tasks: the model combines strong predictive performance with data privacy preservation, with only minor compromises relative to centralized approaches. In summary, the global loss curve in Figure 4 converges from 0.0565 to 0.0490 over 20 rounds, with spikes at rounds 6 and 12 attributed to client data heterogeneity. After round 13, the loss stabilizes, indicating convergence without overfitting. Personalized fine-tuning improved the client-specific accuracy by approximately 0.3%.
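The aggregation step behind this convergence behavior can be sketched as follows. This is a simplified illustration of FedAvg, assuming that each client's model parameters are flattened into a plain list of floats and weighted by its local sample count; the function name `fedavg` and the example values are hypothetical, and a real LSTM would hold one tensor per layer.

```python
from typing import List, Sequence


def fedavg(client_weights: Sequence[Sequence[float]],
           client_sizes: Sequence[int]) -> List[float]:
    """Sample-size-weighted average of client parameter vectors."""
    total = sum(client_sizes)
    n_params = len(client_weights[0])
    global_weights = [0.0] * n_params
    for weights, size in zip(client_weights, client_sizes):
        for i, w in enumerate(weights):
            # Each client contributes in proportion to its local data volume.
            global_weights[i] += w * size / total
    return global_weights


# Two hypothetical clients; the second holds three times as many samples.
new_global = fedavg([[1.0, 2.0], [3.0, 4.0]], [1, 3])
```

Because clients with non-IID data can pull the average in different directions from one round to the next, transient loss spikes such as those at rounds 6 and 12 are plausible under this scheme.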

4.4 Sustainability impact in context

The proposed system demonstrates 22% energy savings and a CO₂ reduction of 17.6 kg/day, which compares favorably with similar studies. For example, Khan et al. (2022) reported approximately 18% savings using centralized ML, and Tang et al. (2023) achieved approximately 20% savings using FL for energy prediction (non-occupancy based).

The occupant comfort index (92%) meets the WELL Building Standard recommendations (>90%), confirming that sustainability gains do not compromise comfort.

4.5 Real-time digital twin performance

Prediction latency: ∼24 ms on Raspberry Pi 4.

Dashboard refresh rate: 1 Hz with live sensor simulation.

Confidence-based HVAC control: trigger threshold at 0.80 avoids premature activations, reducing false energy expenditures.

Error visualization: red error markers in DT enable facility managers to diagnose and retrain models effectively.

4.6 Engineering significance

In this study, we demonstrate that federated temporal modeling preserves privacy without any measurable loss in accuracy compared to centralized training while integrating seamlessly with operational tools such as DT to provide actionable insights. It also delivers quantifiable sustainability benefits aligned with the UN SDGs. These attributes make the FLSTM-DT framework suitable for scalable deployment across multi-building environments, industrial facilities, and smart campuses.

4.7 Model inference results for FLSTM in SB use case

High precision and recall indicate strong temporal pattern learning. The federated model preserves privacy without compromising performance and is suitable for SB edge devices, such as IoT nodes used for occupancy detection. As summarized in Table 6, the federated LSTM model demonstrates excellent generalization across three clients.

Table 6

Table 6. Justification of the performance of the model.

4.8 Occupancy prediction based on key features

4.8.1 Occupancy over time

The time-series graph shown in Figure 5 indicates whether a room is occupied or not (occupied = 1; not occupied = 0) over time on the 10th of the month. The model accurately shows that the room is occupied from approximately 09:30 to 10:30, with times when it is not occupied before and after that. The clear transitions and regular patterns indicate that the model is able to detect changes in room occupancy repeatedly over time. This is highly applicable to cases such as HVAC control, where it is beneficial to know quickly if a room is occupied to support energy-saving decisions.

Figure 5

Figure 5. Occupancy plot.

4.8.2 Light levels over time by occupancy state

Figure 6 demonstrates how light intensity varies over time depending on whether the room is occupied or not. When the room is unoccupied (bottom), the lighting levels are stable and low, usually below ∼200 lux, perhaps because little artificial or natural light is in use. When the room is occupied (top), light intensity is highly variable and significantly higher, in the range of 400–1,000 lux. This correlation indicates that lighting levels are a good indicator of occupancy, as the presence of individuals in the room generally corresponds to lights being on or increased natural light due to movement.

Figure 6

Line graph showing light levels over time by occupancy state. Red line represents unoccupied state, remaining mostly below 200 lux. Green line represents occupied state, fluctuating between 400 and 900 lux. Time is marked from 10:09:00 to 10:11:00.

Figure 6. Light levels (lux) over time.

Although the UCI occupancy detection dataset exhibits a strong correlation between light intensity and occupancy, this relationship may not hold universally across all building types. In the current dataset, artificial lighting is often manually controlled, leading to higher light levels during occupied periods and lower values during vacancy. Consequently, the model partially learns this association. However, the proposed FLSTM does not rely solely on light intensity; it jointly analyses multiple environmental variables such as CO₂ concentration, humidity ratio, and temperature, which exhibit complementary temporal dynamics. SHAP and LIME analyses further confirm that CO₂ and humidity provide comparable feature importance to light, indicating that the model captures broader behavioral dependencies rather than simple threshold-based illumination cues. Nonetheless, for buildings where lighting conditions are decoupled from occupancy (e.g., daylight-dominated or automated-lighting environments), localized retraining or fine-tuning of the model is recommended to preserve generalization capability.

4.8.3 CO₂ levels over time by occupancy state

Figure 7 illustrates the variation of CO₂ concentration over time, distinguished by the occupancy state. In the non-occupied intervals (bottom), CO₂ levels tend to remain lower and gradually decrease. Conversely, during occupied periods (top), CO₂ levels rise significantly, oscillating between 800 and 1,100 ppm, which indicates human metabolic activity. The clear difference in CO₂ profiles between occupied and non-occupied states highlights CO₂ as a key feature that the FLSTM model captures effectively, thereby improving its classification performance.

Figure 7

Line chart showing CO2 levels over time by occupancy state. The red line indicates unoccupied (0) with CO2 levels decreasing from 600 to 400 ppm. The green line represents occupied (1) with fluctuating CO2 levels around 800 to 1100 ppm. Time ranges from 10:09:00 to 10:11:00.

Figure 7. CO₂ levels (ppm) over time.

Figure 8 illustrates that the FLSTM model consistently demonstrates exceptional predictive performance across a range of evaluation metrics for occupancy prediction, using key features such as temperature, light, CO₂, humidity, and the humidity ratio. The model effectively balances reliability and privacy preservation, achieving high accuracy without significant trade-offs. This finding underscores its suitability for practical deployment in energy-efficient SB systems, particularly for real-time occupancy detection.

Figure 8

Five box plots show various distributions by occupancy status. Light levels, humidity, and CO2 concentrations are higher when occupied. Temperature and humidity ratio are also elevated during occupancy compared to when not occupied.

Figure 8. Attribute significance ranking derived from the federated LSTM model on the occupancy dataset.

4.8.4 Feature importance and explainability with SHAP and LIME and validation across datasets

To strengthen the interpretability of the proposed FLSTM model, both Shapley additive explanations (SHAP) and local interpretable model-agnostic explanations (LIME) were utilized to analyze the global and local decision-making behavior. The SHAP summary plots revealed that the model predominantly relies on physically meaningful variables such as CO₂ concentration, light levels, and relative humidity, which are well-established indicators of human presence. For instance, high CO₂ concentration and elevated illumination consistently shift predictions toward the “occupied” class, whereas lower light intensity and moderate humidity support “unoccupied” predictions. This suggests that the model is learning patterns aligned with building physics rather than spurious correlations. Complementarily, LIME was used to generate instance-level explanations, providing localized feature contributions for specific predictions. For example, in one scenario, the model predicts a room to be occupied primarily due to CO₂ > 800 ppm and light > 300 lux, with humidity playing a smaller supporting role, whereas in another case, a prediction of non-occupancy is justified by low light (< 100 lux) and stable CO₂ levels. Such localized explanations are particularly useful in practice, as they can justify automated control actions (e.g., why HVAC or lighting systems are activated) and increase trust among facility managers.

Together, SHAP and LIME offer complementary insights: SHAP captures the global feature ranking and distribution of importance across clients, whereas LIME demonstrates case-specific reasoning at the edge level, thereby enhancing both model transparency and practical interpretability in federated SB applications, as shown in Figure 9.

Figure 9

Scatter plot depicting SHAP values for different features: light, CO2, humidity, temperature, and humidity ratio. Points are colored by feature value from blue (low) to red (high), showing the impact on model output.

Figure 9. SHAP feature importance.
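To make the attribution idea concrete, the sketch below computes exact Shapley values, the quantity that SHAP approximates, for a small hypothetical scoring function. It is not the SHAP library and not the pipeline used in this study; `toy_score`, its coefficients, and the feature names are invented for illustration.

```python
from itertools import permutations
from typing import Callable, Dict, FrozenSet, Sequence


def shapley_values(value_fn: Callable[[FrozenSet[str]], float],
                   features: Sequence[str]) -> Dict[str, float]:
    """Exact Shapley values: mean marginal contribution over all orderings."""
    contrib = {f: 0.0 for f in features}
    orderings = list(permutations(features))
    for order in orderings:
        present = set()
        for f in order:
            before = value_fn(frozenset(present))
            present.add(f)
            after = value_fn(frozenset(present))
            contrib[f] += after - before  # marginal gain from adding f
    return {f: total / len(orderings) for f, total in contrib.items()}


def toy_score(present: FrozenSet[str]) -> float:
    """Hypothetical occupancy score: additive effects plus one interaction."""
    score = 0.0
    if "co2" in present:
        score += 0.4
    if "light" in present:
        score += 0.3
    if "humidity" in present:
        score += 0.1
    if "co2" in present and "light" in present:
        score += 0.2  # interaction term, split equally between the two
    return score


phi = shapley_values(toy_score, ["co2", "light", "humidity"])
```

In this toy case the attributions sum to the full-model score, which is the efficiency property that makes SHAP values additive. Exact enumeration grows factorially with the number of features, which is why practical tools rely on sampling or model-specific approximations.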

To further validate the robustness of the proposed framework, experiments are conducted using the occupancy_estimation.csv dataset, with the number of federated client rounds increased from 10 to 20. The model achieved an outstanding test accuracy of 0.9993 and a perfect AUC of 1.0, as shown in Figures 10 and 11, demonstrating its strong generalization capability and reliability in distinguishing between occupied and unoccupied states under federated training.

Figure 10

ROC curve showing the test performance of a model, with the True Positive Rate on the y-axis and the False Positive Rate on the x-axis. The curve approaches the top-left corner, indicating high model accuracy.

Figure 10. ROC curve.

Figure 11

ROC curve plot showing true positive rate versus false positive rate. An orange line is the ROC curve with an Area Under Curve (AUC) of 1.00, indicating perfect classification. A dashed blue line represents the baseline.

Figure 11. ROC curve.

The FLSTM framework achieves an outstanding predictive performance, with a test accuracy of 99.93% and a perfect AUC of 1.0, demonstrating the ability of the model to reliably distinguish occupied from unoccupied states. The confusion matrix reports precision, recall, and F1-score of 100% (to the reported precision) for both occupied and unoccupied classes; together with the 99.93% accuracy, this indicates that misclassifications are very rare across the test set. Figures 10–12 highlight the robustness and generalization capability of the proposed model, even under federated training with distributed clients. Such near-perfect performance underscores the suitability of the approach for real-world SB applications, where accurate and timely occupancy detection is critical for energy efficiency and occupant comfort.
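As a reminder of what the AUC measures, the following sketch computes it as the probability that a randomly chosen occupied sample receives a higher score than a randomly chosen unoccupied one. The function name `roc_auc` and the example scores are hypothetical; the pairwise loop is quadratic in the number of samples and is intended only for illustration.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as P(score of a random positive > score of a random negative)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(pos) * len(neg))


# Hypothetical scores: one positive is ranked below one negative.
auc_partial = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
# Hypothetical scores with perfect ranking of positives above negatives.
auc_perfect = roc_auc([0, 0, 1, 1], [0.1, 0.2, 0.7, 0.9])
```

Because the AUC depends only on the ranking of scores, a value of 1.0 can coexist with an accuracy slightly below 100% when the fixed decision threshold does not fall exactly between the two classes.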

Figure 12

Bar chart showing a local explanation for class

Figure 12. LIME feature importance.

In this validation experiment, the LSTM-based occupancy detector is run over 20 federated rounds across 20 simulated clients. Validation loss decreases and validation accuracy converges across rounds, and the final test performance achieves an accuracy of 99.93% and an AUC of 1.0, as indicated in Figures 13 and 14. SHAP global importance and LIME local explanations indicate that CO₂, light, and sound features drive the model’s predictions, which supports the interpretability of the model.

Figure 13

Figure 13. Validation loss across federated learning rounds.

Figure 14

Figure 14. Model accuracy convergence in federated learning training.

4.9 Digital twin implementation

DT in FL represents a powerful convergence of two cutting-edge technologies, creating virtual replicas of physical systems while preserving data privacy and enabling distributed intelligence. This combination allows enhanced real-time monitoring and management of smart environments, fostering improved efficiency and sustainability. By utilizing FL, DTs can continuously learn and adapt based on the local data without compromising sensitive information. This dynamic interaction not only supports advanced predictive analytics but also empowers stakeholders to make informed decisions, optimize resource usage, and adapt to changing conditions in a privacy-conscious manner. Through this integration, organizations achieve a more intelligent infrastructure that revolutionizes how we interact with and manage the environments that we build.

4.9.1 Digital twins in the context of federated learning

The integration of DT with FL creates a distributed network of virtual replicas that facilitates collective learning while safeguarding sensitive data. This approach is valuable across various sectors, including manufacturing, healthcare, and urban management.

4.9.2 Key components

Distributed digital replicas: each participant, such as a factory, hospital, or smart city, maintains its own DT, accurately modeling specific physical environments and processes to meet local needs.

Local learning: DTs learn independently from their local data streams, analyzing sensor inputs and operational patterns without exposing raw data, thereby ensuring data confidentiality.

Federated aggregation: learning insights and model parameters are shared across the network, allowing participants to benefit from collective advancements while preserving individual data privacy.

Real-time synchronization: DTs continuously update their virtual representations based on real-world feedback and FL improvements, ensuring accuracy and supporting timely decision-making.

The combination of FL and DT enables organizations to enhance intelligence and adaptability in managing complex systems while prioritizing data privacy. This integration optimizes the performance and promotes innovative solutions in smart environments.

4.9.3 Confidence scores

For the SB considered, Table 7 indicates high confidence scores across all timestamps, exceeding 98.8%, demonstrating that the FLSTM model’s performance is both consistent and well calibrated. These robust confidence metrics are effectively integrated into the DT dashboard to enhance operational decision-making.

Table 7

Table 7. Confidence scores of federated LSTM predictions over time.

Trigger actions: if the confidence score drops below 0.8, HVAC activation is delayed. This ensures that decisions are made only when the model is fairly certain about occupancy.

Flag uncertain predictions: the dashboard highlights predictions with low confidence. This allows facility managers to review these uncertain cases manually.

Support decision-making: confidence scores add a layer of trust on top of raw predictions. This helps real-time control systems make better decisions, leading to improved energy use and comfort for the building occupants. Using these confidence scores in the DT framework helps to manage systems more effectively and improve occupancy detection in SBs, as shown in Figure 15.
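The confidence-based rules above can be sketched as a small decision function. This is an illustrative assumption about how such gating might be coded, not the dashboard's actual implementation; the function name `hvac_decision` and the action labels are hypothetical, and only the 0.80 threshold comes from the text.

```python
CONFIDENCE_THRESHOLD = 0.80  # trigger threshold stated in the text


def hvac_decision(predicted_occupied: bool, confidence: float,
                  threshold: float = CONFIDENCE_THRESHOLD) -> str:
    """Map a prediction and its confidence to a control action label."""
    if confidence < threshold:
        # Low confidence: hold the current HVAC state and flag for review.
        return "defer_and_flag"
    # Confident prediction: act on the predicted occupancy state.
    return "activate" if predicted_occupied else "standby"


# Hypothetical examples of the three outcomes.
actions = [
    hvac_decision(True, 0.95),   # confident and occupied
    hvac_decision(True, 0.62),   # uncertain, left for manual review
    hvac_decision(False, 0.91),  # confident and unoccupied
]
```

Treating a confidence of exactly 0.80 as sufficient follows the wording "drops below 0.8"; a deployment could equally choose a stricter comparison.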

Figure 15

Figure 15. Line plot of digital twin occupancy prediction.

The occupancy trend displayed in Figure 15 corresponds to a short real-time visualization interval (09:00 a.m.–10:30 a.m.) used within the Streamlit-based DT dashboard. During this period, the room remains almost continuously occupied, resulting in an apparently constant “occupied = 1” state in the plot. This visualization is included to demonstrate the live inference capability of the DT rather than the full test-period prediction profile. Hence, the flat occupancy pattern seen represents a localized, high-occupancy time window and not a global model deficiency. The overall experimental evidence confirms that the proposed FLSTM maintains both high temporal sensitivity and robust recall across diverse occupancy states.

4.9.4 High temporal alignment and model robustness

The predicted and actual occupancy lines, as shown in Figure 15, closely overlap, demonstrating that the model is capable of accurately tracking real occupancy transitions over time. Notably, at approximately 09:30 a.m. and 10:30 a.m., the model effectively captures transitions between the occupancy states almost as precisely as they occur. The absence of false positives and false negatives during the plotted interval highlights the temporal stability and contextual awareness of the model, which are crucial for real-time applications. This is further supported by the previously discussed high confidence scores (above 98.8%, Table 7), reinforcing the trustworthiness of the model.

4.9.5 Digital twin fidelity

To complement the short-term visualization, Figure 16 presents a full-day occupancy trace, confirming that the proposed model correctly captures occupancy transitions across multiple periods and does not exhibit constant predictions beyond short high-occupancy intervals. The visual alignment between predicted and actual occupancy demonstrates the DT’s ability to accurately mirror real-world conditions with minimal delay. This high level of accuracy enables SB systems to automate controls such as lighting and HVAC responsively and efficiently, thereby improving the overall operational effectiveness. The digital twin output showcases the effectiveness of the FLSTM model in delivering real-time, accurate, and privacy-preserving occupancy detection. The alignment between the predicted and actual occupancy values confirms the capability of the model to integrate within DT environments. This allows dynamic simulations, improved energy optimization, and enhanced occupant-aware services, ultimately contributing to smarter and more sustainable building management.

Figure 16

Graph showing prediction errors over time for occupancy from 10:00:00 to 11:00:00. Orange dashed line indicates actual occupancy, green line indicates predictions, and red dots mark errors, highlighting discrepancies between predictions and actual data.

Figure 16. Error log.

4.9.6 Evaluation of error metrics

Error localization: errors in the predictions of the model, as shown in Figure 16, are notably concentrated before 09:30 a.m. and after 10:30 a.m. During these timeframes, the model indicated occupancy with a prediction level of 1 (depicted by the dotted green line), whereas the actual data, represented by the dotted orange line, showed no occupancy (at level 0). These instances are classified as false positives, where the model mistakenly infers the presence of occupants.

No false negatives during the occupied period: it is crucial to highlight that between 09:30 a.m. and 10:30 a.m., there are no visible error markers in the data. This absence indicates that the model successfully detected all actual occupancy events during this time. Achieving zero false negatives is significant, as it ensures that energy management systems accurately respond when occupancy is confirmed.

Transitional sensitivity: the detected errors primarily occur at the transition zones, notably at the start and the end of the occupancy periods. This timing suggests that there may be a temporal lag or overlap in environmental signals, such as lingering light or CO₂ levels, which could potentially mislead the model in those brief moments.

DT diagnostic capability: the integration of error dots in the DT framework serves to enhance its explainability and diagnostic capabilities. This feature enables facility managers to pinpoint when and where the model exhibits uncertainty or fails in its predictions. The insights gained from these diagnostics inform model retraining or fine-tuning efforts, leading to continuous improvement in prediction accuracy.
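The error markers described above amount to comparing predicted and actual occupancy at each timestamp and labeling every mismatch. A minimal sketch under that assumption is given below; the function name `locate_errors` and the minute-level trace are hypothetical and do not reproduce the data in Figure 16.

```python
from typing import List, Sequence, Tuple


def locate_errors(timestamps: Sequence[str],
                  actual: Sequence[int],
                  predicted: Sequence[int]) -> List[Tuple[str, str]]:
    """Return (timestamp, kind) for each mismatch; kind is 'FP' or 'FN'."""
    errors = []
    for t, a, p in zip(timestamps, actual, predicted):
        if p == 1 and a == 0:
            errors.append((t, "FP"))  # predicted occupied, actually empty
        elif p == 0 and a == 1:
            errors.append((t, "FN"))  # predicted empty, actually occupied
    return errors


# Hypothetical trace around an occupied interval from 09:30 to 10:30.
times = ["09:28", "09:29", "09:30", "09:31", "10:30", "10:31"]
actual = [0, 0, 1, 1, 1, 0]
predicted = [0, 1, 1, 1, 1, 1]
errors = locate_errors(times, actual, predicted)
```

In this invented trace both errors are false positives at the edges of the occupied interval, which mirrors the transitional pattern described in the text. A list of this form could feed the red markers on the dashboard or select samples for retraining.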

4.10 Sustainability implications

In addition to its excellent predictive accuracy, the FLSTM model for SB occupancy detection makes a substantial contribution to carbon reduction, energy efficiency, computational decentralization, and occupant comfort in alignment with the UN SDGs, as shown in Figure 17.

Figure 17

Figure 17. Sustainability metrics of the federated model.

Key sustainability metrics from Figure 17 are determined as follows:

Energy savings (SDG 7: Affordable and Clean Energy): without occupancy detection, the baseline consumption is taken to be 100 kWh per day, representing equipment such as lighting and HVAC operating continuously or on fixed schedules.

The proposed FLSTM-based occupancy-aware control mechanism reduces average daily energy consumption to 78 kWh, corresponding to an energy saving of 22%, as computed using Equation 1. This quantitatively demonstrates how real-time occupancy intelligence can be leveraged to achieve meaningful energy reductions through context-aware automation.

In support of Sustainable Development Goal 13 (Climate Action), the associated reduction in carbon emissions is estimated using a standard emission factor of 0.8 kg of CO₂ per kWh of electricity saved. Accordingly, the observed decrease of 22 kWh per day results in a CO₂ emission reduction of 17.6 kg per day, as expressed in Equation 2.

Occupancy-aware control using the FLSTM model lowers energy consumption to 78 kWh per day, resulting in the following:

$$\text{Energy savings} = \frac{100 - 78}{100} \times 100 = 22\% \tag{1}$$

This measures how real-time occupancy intelligence is used practically to reduce energy use through context-aware automation.

Reduction of CO₂ emissions (SDG 13—Climate Action): assuming that CO₂ emissions are reduced by approximately 0.8 kg for every 1 kWh of electricity saved, the 22 kWh/day decrease results in

$$\text{Reduction of CO}_{2} = 22 \times 0.8 = 17.6\ \text{kg per day} \tag{2}$$
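The arithmetic in Equations 1 and 2 can be reproduced with a few lines of code. The constants below are the values stated in the text (100 kWh baseline, 78 kWh with occupancy-aware control, 0.8 kg CO₂ per kWh); the function names are hypothetical.

```python
BASELINE_KWH = 100.0    # baseline daily consumption stated in the text
CONTROLLED_KWH = 78.0   # daily consumption with occupancy-aware control
EMISSION_FACTOR = 0.8   # kg of CO2 per kWh saved, as assumed in the text


def energy_savings_percent(baseline: float, controlled: float) -> float:
    """Equation 1: relative reduction in daily energy use, in percent."""
    return (baseline - controlled) / baseline * 100.0


def co2_reduction_kg(baseline: float, controlled: float,
                     factor: float) -> float:
    """Equation 2: daily CO2 reduction from the energy saved, in kg."""
    return (baseline - controlled) * factor


savings = energy_savings_percent(BASELINE_KWH, CONTROLLED_KWH)
co2 = co2_reduction_kg(BASELINE_KWH, CONTROLLED_KWH, EMISSION_FACTOR)
```

Up to floating-point rounding, `savings` is 22 and `co2` is 17.6. Both figures scale linearly with the assumed baseline and emission factor, so a different grid emission factor would change the CO₂ estimate proportionally.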

In line with global climate objectives such as SDG 13: Climate Action, Figure 18 illustrates the potential for carbon mitigation through SB operations enhanced by privacy-preserving AI.

Figure 18

Figure 18. SB intelligence aligned with UN sustainable development goals.

Rate of local computation (SDG 9—Industry, Innovation, and Infrastructure): thanks to tailored FL, the FLSTM model completes all inference tasks locally on client devices, achieving a score of 100%. This measure demonstrates complete decentralization, thereby lowering data transmission overhead and boosting user confidence and data privacy. It further reduces reliance on a central server, increasing system resilience and scalability.

Index of occupant comfort (SDG 3: Good Health and Well-being; SDG 11: Sustainable Cities and Communities): a combined comfort index is evaluated by maintaining CO₂ levels below 1,000 ppm and indoor temperatures between 22 °C and 25 °C, thresholds that are aligned with human-centric environmental standards. The building remained within this optimal comfort zone for approximately 92% of the observed time, indicating that the FLSTM model’s occupancy predictions lead to minimal misclassifications and negligible occupant discomfort. This high degree of alignment between intelligent predictions and environmental control responses reflects the effectiveness of the system in integrating automated decision-making with comfort-driven parameters such as ventilation and cooling. In line with SDG 3: Good Health and Well-being, this result demonstrates sustainability through occupant well-being. These sustainability measures show that the FLSTM model offers a dual advantage: high-performance occupancy detection and quantifiable gains in emissions reduction, energy consumption, interior comfort, and local autonomy. Figure 18 illustrates that incorporating FL-powered DTs into SBs to create intelligent, private, and sustainable environments is both practical and impactful.
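One way to read the comfort index is as the share of observed samples that satisfy both thresholds at once. The sketch below follows that interpretation, which is an assumption, since the text does not give the exact formula; the function name `comfort_index` and the sensor readings are hypothetical.

```python
from typing import Sequence, Tuple


def comfort_index(samples: Sequence[Tuple[float, float]],
                  co2_max: float = 1000.0,
                  temp_min: float = 22.0,
                  temp_max: float = 25.0) -> float:
    """Percentage of (co2_ppm, temp_c) samples inside the comfort band."""
    if not samples:
        raise ValueError("at least one sample is required")
    in_band = sum(
        1 for co2, temp in samples
        if co2 < co2_max and temp_min <= temp <= temp_max
    )
    return 100.0 * in_band / len(samples)


# Hypothetical readings: (CO2 in ppm, temperature in degrees Celsius).
readings = [(650, 23.1), (980, 24.8), (1040, 24.0), (720, 25.6)]
index = comfort_index(readings)
```

With evenly spaced samples, this share equals the fraction of observed time spent in the comfort zone, which is the quantity reported as approximately 92% in the text.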

5 Conclusion and future scope

This research demonstrates the feasibility of integrating FL, DT, and temporal deep learning architectures such as LSTM networks to enable a real-time and privacy-preserving occupancy detection system that is suitable for SB automation. The proposed FLSTM framework ensures data confidentiality by exchanging only the model parameters instead of raw sensor data, whereas a personalized fine-tuning stage enhances client-specific adaptation and robustness to non-identically distributed data.

The deployment of the trained FLSTM model within a Streamlit-based DT environment provides real-time visualization of occupancy states, confidence levels, and performance diagnostics, thereby improving operational transparency and decision support. Experimental evaluations indicate that the proposed framework achieves high prediction accuracy and significant energy efficiency improvements, highlighting its suitability for sustainable building management applications.

Future work will focus on optimizing the framework for tiny deep learning (TinyDL) deployment on resource-constrained microcontrollers, expanding multi-room personalization through adaptive FL strategies, and integrating live MQTT-based sensor streams for seamless DT-driven control. The outcomes of this study contribute to advancing intelligent, secure, and energy-efficient SB ecosystems aligned with the UN SDGs.

Data availability statement

Publicly available datasets were analyzed in this study. These data can be found here: https://www.kaggle.com/datasets/kukuroo3/room-occupancy-detection-data-iot-sensor.

Author contributions

PR: Writing – original draft, Formal analysis, Resources, Visualization, Data curation, Investigation, Methodology, Conceptualization, Software. GO: Validation, Project administration, Supervision, Writing – review and editing, Funding acquisition.

Funding

The authors declare that financial support was received for the research and/or publication of this article. The authors declare that Vellore Institute of Technology Chennai, India has funded the research and publication of this article.

Conflict of interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Generative AI statement

The authors declare that no Generative AI was used in the creation of this manuscript.

Any alternative text (alt text) provided alongside figures in this article has been generated by Frontiers with the support of artificial intelligence and reasonable efforts have been made to ensure accuracy, including review by the authors wherever possible. If you identify any issues, please contact us.

Publisher’s note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

References

Abboud, A., Abouaissa, A., Shahin, A., and Mazraani, R. (2023). “A hybrid aggregation approach for federated learning to improve energy consumption in SBs,” in 2023 international wireless communications and Mobile computing (IWCMC) (IEEE), 854–859.

Al-Huthaifi, R., Li, T., Huang, W., Gu, J., and Li, C. (2023). Federated learning in smart cities: privacy and security survey. Inf. Sci. 632, 833–857. doi:10.1016/j.ins.2023.03.033

Alam, T., and Gupta, R. (2022). Federated learning and its role in the privacy preservation of IoT devices. Future Internet 14 (9), 246. doi:10.3390/fi14090246

Candanedo, L. M., and Feldheim, V. (2016). Accurate occupancy detection of an office room from light, temperature, humidity and CO₂ measurements using statistical learning models. Energy Build. 112, 28–39. doi:10.1016/j.enbuild.2015.11.071

Chen, C., Liu, J., Tan, H., Li, X., Wang, K. I. K., Li, P., et al. (2025). Trustworthy federated learning: privacy, security, and beyond. Knowl. Inf. Syst. 67 (3), 2321–2356. doi:10.1007/s10115-024-02285-2

Cheng, X., Li, C., and Liu, X. (2022). “A review of federated learning in energy systems,” in 2022 IEEE/IAS industrial and commercial power system Asia (I&CPS Asia), 2089–2095.

Dai, R., and Bai, G. (2025). Distributed context-aware transformer enables dynamic energy consumption prediction for SB networks. Digital Commun. Netw. doi:10.1016/j.dcan.2025.03.006

Dasari, S. V., Mittal, K., Bapat, J., and Das, D. (2021). “Privacy enhanced energy prediction in SB using federated learning,” in 2021 IEEE international IOT, electronics and mechatronics conference (IEMTRONICS) (IEEE), 1–6.

Elkliny, A., Mahmoudi, A., and Deng, X. (2025). Big data-driven implementation in international construction supply chain management: framework development, future directions, and barriers. Buildings 15 (13), 2167. doi:10.3390/buildings15132167

Fuller, A., Fan, Z., Day, C., and Barlow, C. (2020). Digital twin: enabling technologies, challenges and open research. IEEE Access 8, 108952–108971. doi:10.1109/access.2020.2998358

Gao, J., Wang, W., Liu, Z., Billah, M. F. R. M., and Campbell, B. (2021). “Decentralized federated learning framework for the neighborhood: a case study on residential building load forecasting,” in Proceedings of the 19th ACM conference on embedded networked sensor systems, 453–459.

García-Hernando, N., Sánchez-Sánchez, F., Sánchez-Soriano, J., and García, A. (2024). Development of a digital twin dashboard for smart classroom environments. Comput. Industry 153, 103747.

Grieves, M., and Vickers, J. (2016). “Digital twin: mitigating unpredictable, undesirable emergent behavior in complex systems,” in Transdisciplinary perspectives on complex systems: new findings and approaches (Cham: Springer International Publishing), 85–113.

Hu, W., Lim, K. Y. H., and Cai, Y. (2022). Digital twin and industry 4.0 enablers in building and construction: a survey. Buildings 12 (11), 2004. doi:10.3390/buildings12112004

Ji, S., Tan, Y., Saravirta, T., Yang, Z., Liu, Y., Vasankari, L., et al. (2024). Emerging trends in federated learning: from model fusion to federated x learning. Int. J. Mach. Learn. Cybern. 15 (9), 3769–3790. doi:10.1007/s13042-024-02119-1

Kang, J., Xiong, Z., Niyato, D., Xie, S., and Zhang, J. (2019). Incentive mechanism for reliable federated learning: a joint optimization approach to combining reputation and contract theory. IEEE Internet Things J. 6 (6), 10700–10714. doi:10.1109/jiot.2019.2940820

Khajavi, S. H., Motlagh, N. H., Jaribion, A., Werner, L. C., and Holmström, J. (2019). Digital twin: vision, benefits, boundaries, and creation for buildings. IEEE Access 7, 147406–147419. doi:10.1109/access.2019.2946515

Khan, I., Guerrieri, A., Spezzano, G., and Vinci, A. (2022). “Occupancy prediction in buildings: an approach leveraging LSTM and federated learning,” in 2022 IEEE Intl Conf on Dependable, Autonomic and Secure Computing, Intl Conf on Pervasive Intelligence and Computing, Intl Conf on Cloud and Big Data Computing, Intl Conf on Cyber Science and Technology Congress (DASC/PiCom/CBDCom/CyberSciTech), Falerna, Italy, 12-15 September 2022 (IEEE), 1–7.

Khan, M. A., Farooq, M. S., Saleem, M., Shahzad, T., Ahmad, M., Abbas, S., et al. (2025). Smart buildings: federated learning-driven secure, transparent and smart energy management system using XAI. Energy Rep. 13, 2066–2081. doi:10.1016/j.egyr.2025.01.063

Kritzinger, W., Karner, M., Traar, G., Henjes, J., and Sihn, W. (2018). Digital twin in manufacturing: a categorical literature review and classification. IFAC-PapersOnLine 51 (11), 1016–1022. doi:10.1016/j.ifacol.2018.08.474

Ma, C. (2021). Smart city and cyber-security; technologies used, leading challenges and future recommendations. Energy Rep. 7, 7999–8012. doi:10.1016/j.egyr.2021.08.124

Minerva, R., Lee, G. M., and Crespi, N. (2020). Digital twin in the IoT context: a survey on technical features, scenarios, and architectural models. Proc. IEEE 108 (10), 1785–1824. doi:10.1109/jproc.2020.2998530

Mitra, A., Ngoko, Y., and Trystram, D. (2021). “Impact of federated learning on smart buildings,” in 2021 international conference on artificial intelligence and smart systems (ICAIS) (IEEE), 93–99.

Munawar, A., and Piantanakulchai, M. (2024). A collaborative privacy-preserving approach for passenger demand forecasting of autonomous taxis empowered by federated learning in smart cities. Sci. Rep. 14 (1), 2046. doi:10.1038/s41598-024-52181-6

Qi, Q., Tao, F., Hu, T., Anwer, N., Liu, A., Wei, Y., et al. (2021). Enabling technologies and tools for digital twin. J. Manuf. Syst. 58, 3–21. doi:10.1016/j.jmsy.2019.10.001

Sater, R. A., and Hamza, A. B. (2021). A federated learning approach to anomaly detection in smart buildings. ACM Trans. Internet Things 2 (4), 1–23. doi:10.1145/3467981

Shaheen, M., Farooq, M. S., Umer, T., and Kim, B. S. (2022). Applications of federated learning; taxonomy, challenges, and research trends. Electronics 11 (4), 670. doi:10.3390/electronics11040670

Shao, Z., Zhang, Y., Wang, J., and Xu, Y. (2020). A digital twin framework for intelligent building energy management. Appl. Energy 262, 114561.

Tang, L., Xie, H., Wang, X., and Bie, Z. (2023). Privacy-preserving knowledge sharing for few-shot building energy prediction: a federated learning approach. Appl. Energy 337, 120860. doi:10.1016/j.apenergy.2023.120860

Tao, F., Zhang, H., Liu, A., and Nee, A. Y. (2018). Digital twin in industry: state-of-the-art. IEEE Trans. Industrial Informatics 15 (4), 2405–2415. doi:10.1109/tii.2018.2873186

Tao, F., Qi, Q., Liu, A., and Kusiak, A. (2021). Digital twins and cyber–physical systems toward smart manufacturing and industry 4.0: correlation and comparison. Engineering 7 (3), 397–405.

Wang, Z., Yu, P., and Zhang, H. (2022). Privacy-preserving regulation capacity evaluation for HVAC systems in heterogeneous buildings based on federated learning and transfer learning. IEEE Trans. Smart Grid 14 (5), 3535–3549. doi:10.1109/tsg.2022.3231592

Wang, R., Yun, H., Rayhana, R., Bin, J., Zhang, C., Herrera, O. E., et al. (2023). An adaptive federated learning system for community building energy load forecasting and anomaly prediction. Energy Buildings 295, 113215. doi:10.1016/j.enbuild.2023.113215

Wen, J., Zhang, Z., Lan, Y., Cui, Z., Cai, J., and Zhang, W. (2023). A survey on federated learning: challenges and applications. Int. J. Mach. Learn. Cybern. 14 (2), 513–535. doi:10.1007/s13042-022-01647-y

Zheng, Z., Zhou, Y., Sun, Y., Wang, Z., Liu, B., and Li, K. (2022). Applications of federated learning in smart cities: recent advances, taxonomy, and open challenges. Connect. Sci. 34 (1), 1–28. doi:10.1080/09540091.2021.1936455

Keywords: federated learning, digital twin, long short-term memory, smart buildings, sustainability, energy management

Citation: Rajaram P and O. V. GS (2025) Decentralized intelligence in SBs: Federated LSTM-powered digital twins for sustainability. Front. Built Environ. 11:1696702. doi: 10.3389/fbuil.2025.1696702

Received: 01 September 2025; Accepted: 19 November 2025; Published: 10 December 2025.

Edited by:

Bjørn Petter Jelle, Norwegian University of Science and Technology, Norway

Reviewed by:

Carl Ehrett, Clemson University, United States
Rui Dai, Huazhong University of Science and Technology School of Architecture and Urban Planning, China

Copyright © 2025 Rajaram and O. V. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Gnana Swathika O. V., gnanaswathika.ov@vit.ac.in

Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.
