ORIGINAL RESEARCH article
Front. Future Transp.
Sec. Transportation Systems Modeling
Deep Heterogeneity Learning for Cross-City Transit Forecasting: A Differentially Private Federated Framework with Mixture-of-Experts and Seasonal Decomposition
Provisionally accepted
- 1 Astana IT University, Astana, Kazakhstan
- 2 Zhetysu University named after Ilyas Zhansugurov, Taldykorgan, Kazakhstan
Accurate prediction of transit flows is fundamental to optimizing intelligent transportation systems. However, deploying centralized forecasting models is frequently obstructed by the heterogeneous, non-independent and identically distributed (non-IID) nature of cross-city data and by stringent data privacy regulations. To address these challenges, this study proposes X-FedFormer, a framework that integrates Federated Learning (FL) with Differential Privacy (DP) to enable collaborative, privacy-preserving forecasting across geographically dispersed cities. The architecture combines a Mixture-of-Experts (MoE) mechanism, which adapts dynamically to diverse local data distributions, with a Seasonal-Trend Decomposition module, which explicitly disentangles multi-scale temporal dependencies. Because open multi-city transit data are scarce, the framework is evaluated on a statistically validated synthetic dataset that simulates realistic inflow and outflow patterns across ten diverse urban environments. Experimental results show that X-FedFormer outperforms state-of-the-art federated baselines, including FedProx, achieving an aggregate coefficient of determination (R²) of 0.922 and a consistent Mean Absolute Error (MAE) profile across all participating cities. Ablation studies further indicate that MoE and seasonal decomposition reduce forecasting error by approximately 11% and 16%, respectively, compared to standard architectures. The study also characterizes the privacy-utility frontier and finds that the model maintains high predictive utility even under strict differential privacy guarantees. These findings point to a scalable, robust approach for urban computing that balances algorithmic performance against data sovereignty in smart city applications.
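For orientation, two of the building blocks named in the abstract can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `moving_average_decompose` and `dp_federated_average`, the moving-average kernel size, and the clip-then-add-Gaussian-noise aggregation are all hypothetical choices made here, since the abstract does not specify how X-FedFormer realizes decomposition or differential privacy.

```python
import numpy as np

def moving_average_decompose(x, kernel=25):
    """Split a 1-D series into a trend (moving average) and a seasonal remainder.

    Assumption: a simple edge-padded moving average stands in for the
    paper's Seasonal-Trend Decomposition module.
    """
    x = np.asarray(x, dtype=float)
    pad_left = (kernel - 1) // 2
    pad_right = kernel - 1 - pad_left
    # Repeat the edge values so the trend has the same length as the input.
    padded = np.concatenate(
        [np.repeat(x[0], pad_left), x, np.repeat(x[-1], pad_right)]
    )
    trend = np.convolve(padded, np.ones(kernel) / kernel, mode="valid")
    return trend, x - trend

def dp_federated_average(updates, clip_norm=1.0, noise_multiplier=1.0, rng=None):
    """Average client updates after L2 clipping and Gaussian noise.

    Assumption: a generic clip-and-noise aggregation, shown only to
    illustrate the privacy-utility trade-off; the noise standard
    deviation is noise_multiplier * clip_norm on the summed update.
    """
    rng = np.random.default_rng() if rng is None else rng
    clipped = []
    for u in updates:
        u = np.asarray(u, dtype=float)
        norm = np.linalg.norm(u)
        # Scale the update down only if its norm exceeds clip_norm.
        clipped.append(u * min(1.0, clip_norm / (norm + 1e-12)))
    total = np.sum(clipped, axis=0)
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=total.shape)
    return (total + noise) / len(clipped)

if __name__ == "__main__":
    t = np.arange(200)
    series = 0.05 * t + np.sin(2 * np.pi * t / 24)  # synthetic flow signal
    trend, seasonal = moving_average_decompose(series, kernel=25)
    print(trend.shape, seasonal.shape)

    client_updates = [np.array([3.0, 4.0]), np.array([0.3, 0.4])]
    print(dp_federated_average(client_updates, noise_multiplier=0.5,
                               rng=np.random.default_rng(0)))
```

Raising `noise_multiplier` strengthens the privacy guarantee at the cost of a noisier aggregate, which is the trade-off the privacy-utility frontier describes.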
Keywords: Differential privacy, Federated learning, Mixture-of-experts, Seasonal-trend decomposition, Smart Cities, Traffic flow forecasting
Received: 11 Jun 2025; Accepted: 02 Feb 2026.
Copyright: © 2026 Sakhipov, Uzdenbayev, Begisbayev, Mektepbayeva, Seiitbek and Yedilkhan. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
* Correspondence: Aivar Sakhipov; Diar Begisbayev
Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.
