ORIGINAL RESEARCH article

Front. Artif. Intell.

Sec. Machine Learning and Artificial Intelligence

Volume 8 - 2025 | doi: 10.3389/frai.2025.1620019

A Hybrid Long Short-Term Memory with Generalized Additive Model and Post-hoc Explainable Artificial Intelligence with Causal Inference for Air Pollutants Prediction in Kimberley, South Africa

Provisionally accepted
  • Sol Plaatje University, Kimberley, South Africa

The final, formatted version of the article will be published soon.

The study addresses the nonlinear characteristics of common air pollutants by proposing a deep learning time-series model that integrates long short-term memory (LSTM) with a generalized additive model (GAM). The LSTM captures both nonlinear relationships and long-term temporal dependencies in time-series data, while the GAM provides insight into the statistical relationship between selected features and the target pollutant. A post-hoc eXplainable artificial intelligence (xAI) technique, local interpretable model-agnostic explanation (LIME), further explains the nonlinearity. Finally, causal inference was used to estimate the effect of the relationships among the air pollutants, offering a further layer of interpretability that deep learning models lack. Meteorological and air pollutant statistical records were obtained from the Hantam (Karoo) air monitoring station in South Africa, and synthetic data for the city of Kimberley were generated from them through a random sampling approach. The model was evaluated with the mean squared error (MSE), root mean squared error (RMSE), mean absolute error (MAE) and coefficient of determination (R²) for different time-steps. With a 10-day time-step and a 5-day time-step for multiple-pollutant NOx prediction, the proposed model, referred to as the long short-term memory generalized additive model based post-hoc eXplainable Artificial Intelligence (LSTM-GAM-xAI) model, produced an MSE of 0.990, compared with LSTM (0.340), BiLSTM (1.625), BiGRU (1.487), 1DCNN (1.402), Random Forest (1.394) and XGBoost (1.502). The corresponding R² values were LSTM-GAM-xAI (0.008), LSTM (-0.342), BiLSTM (-0.628), BiGRU (-0.490), 1DCNN (-0.405), Random Forest (-0.063) and XGBoost (-0.146). Similarly, air pollutants such as PM2.5, PM10, O3, SO2, NO and NO2 were predicted and accompanied by LIME explanations. The causal effect analysis indicates that NOx has an estimated effect of -0.00328 on the outcome variable (NO2), with a p-value of 0.88. Based on the experimental results, LSTM-GAM-xAI generated the lowest MSE values across different time-steps.
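
As an illustration of the four evaluation metrics named in the abstract, the following minimal sketch computes MSE, RMSE, MAE and R² with NumPy. It assumes the standard textbook definitions, with R² taken as 1 - SS_res/SS_tot; the study's own evaluation code is not shown here, and the function name `regression_metrics` and the toy series are hypothetical. Under this definition R² falls below zero whenever a forecast has a larger squared error than simply predicting the mean of the observations, which is one way negative R² values can arise.

```python
import numpy as np


def regression_metrics(y_true, y_pred):
    """Return MSE, RMSE, MAE and R^2 for one forecast series (hypothetical helper)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_true - y_pred

    mse = float(np.mean(err ** 2))       # mean squared error
    rmse = float(np.sqrt(mse))           # root mean squared error
    mae = float(np.mean(np.abs(err)))    # mean absolute error

    # R^2 = 1 - SS_res / SS_tot. It is 0 for a forecast that always predicts
    # the observed mean and negative for a forecast that does worse than that.
    ss_res = float(np.sum(err ** 2))
    ss_tot = float(np.sum((y_true - y_true.mean()) ** 2))
    r2 = 1.0 - ss_res / ss_tot

    return {"mse": mse, "rmse": rmse, "mae": mae, "r2": r2}


# Toy values for illustration only; these are not data from the study.
y_true = [1.0, 2.0, 3.0, 4.0]
mean_forecast = [2.5, 2.5, 2.5, 2.5]   # always predicts the observed mean
poor_forecast = [4.0, 3.0, 2.0, 1.0]   # worse than predicting the mean

baseline = regression_metrics(y_true, mean_forecast)
poor = regression_metrics(y_true, poor_forecast)

print(baseline)  # R^2 is 0 for the mean forecast
print(poor)      # R^2 is negative for the poor forecast
```

Because SS_tot equals the number of observations times their variance, R² and MSE are linked by R² = 1 - MSE/Var(y) when both are computed on the same series, so the two metrics rank models identically on a fixed test set.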

Keywords: generalized additive model, post-hoc explanation, local interpretable model-agnostic explanation, deep learning, causal inference

Received: 29 Apr 2025; Accepted: 21 Jul 2025.

Copyright: © 2025 Agbehadji and Obagbuwa. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

* Correspondence: Israel Agbehadji, Sol Plaatje University, Kimberley, South Africa

Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.