AUTHOR=Liu Zitao, Lu Zhigang, Zhu Weidong, Yuan Jiansheng, Cao Zhaoxiang, Cao Tiantian, Liu Shuai, Xu Yuelin, Zhang Xiaoshan TITLE=Comparison of machine learning methods for predicting ground-level ozone pollution in Beijing JOURNAL=Frontiers in Environmental Science VOLUME=Volume 13 - 2025 YEAR=2025 URL=https://www.frontiersin.org/journals/environmental-science/articles/10.3389/fenvs.2025.1561794 DOI=10.3389/fenvs.2025.1561794 ISSN=2296-665X ABSTRACT=High ground-level ozone (O3) concentrations severely undermine urban air quality and threaten human health, creating an urgent need for precise and effective ozone-level predictions to aid environmental monitoring and policy-making. This study incorporated the historical concentrations of ozone and nitrogen dioxide (NO2) from the past 3 hours as lagged features into a Lagged Feature Prediction Model (LFPM), evaluated using nine machine-learning algorithms (including XGBoost). Initially, XGBoost combined with SHAP identified 11 key features, boosting computational efficiency by 30% without sacrificing prediction accuracy. Then, ozone concentrations were predicted using six meteorological variables. Results showed that LSTM-based methods, especially ED-LSTM, performed best among meteorological-only models (R2 = 0.479). Yet, predictions based solely on meteorological variables had limited accuracy. Adding five pollutant variables markedly improved the predictive performance across all machine-learning methods. XGBoost achieved the highest accuracy (R2 = 0.767, RMSE = 11.35 μg/m3), a 125% relative improvement in R2 compared to meteorological-variable-only predictions.
Further application of the LFPM model enhanced prediction accuracy for all nine machine-learning methods, with XGBoost still leading (R2 = 0.873, RMSE = 8.17 μg/m3). These findings conclusively demonstrate that integrating lagged feature variables significantly enhances ozone prediction accuracy, offering stronger support for environmental monitoring and policy formulation.
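
The lagged-feature idea described in the abstract (using ozone and NO2 concentrations from the previous 3 hours as additional predictors) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `add_lagged_features`, the column names `o3` and `no2`, and the sample values are hypothetical, and the sketch assumes the rows are consecutive hourly observations already sorted by time.

```python
import pandas as pd


def add_lagged_features(df, columns=("o3", "no2"), n_lags=3):
    """Return a copy of df with <name>_lag1 .. <name>_lagN columns appended.

    Hypothetical helper: assumes one row per hour, in time order.
    """
    out = df.copy()
    for col in columns:
        for lag in range(1, n_lags + 1):
            # shift(lag) moves values down, so row t holds the value from t - lag.
            out[f"{col}_lag{lag}"] = out[col].shift(lag)
    # The first n_lags rows have no complete history, so drop them.
    return out.dropna().reset_index(drop=True)


# Synthetic hourly values for illustration only (not data from the study).
hourly = pd.DataFrame({
    "o3": [40.0, 42.0, 47.0, 55.0, 61.0],
    "no2": [30.0, 28.0, 25.0, 21.0, 18.0],
})
lagged = add_lagged_features(hourly)
```

Each remaining row of `lagged` then carries the current values plus six lagged columns, which could be passed alongside meteorological and pollutant variables to any of the regressors compared in the study.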