AUTHOR=Wei Feiran, Yang Shijun, Wang Huiying, Zhao Meng, Zhou Jinyi, Shen Xiaobing, Han Renqiang, Fei Gaoqiang
TITLE=Epidemiological association and machine learning-based prediction of lung cancer risk linked to long-term lagged satellite-derived PM2.5 in China
JOURNAL=Frontiers in Public Health
VOLUME=Volume 13 - 2025
YEAR=2025
URL=https://www.frontiersin.org/journals/public-health/articles/10.3389/fpubh.2025.1536509
DOI=10.3389/fpubh.2025.1536509
ISSN=2296-2565
ABSTRACT=Objectives: This study investigated the association between long-term PM2.5 exposure and lung cancer incidence in Jiangsu Province, China. We aimed to explore the effects of historical PM2.5 exposure at different time lags and to build a prediction model using machine learning methods. Study design: An ecological epidemiology study. Methods: Lung cancer incidence data from Jiangsu Province (2014–2018) were combined with satellite-derived annual PM2.5 concentrations for the current and preceding nine years (lag 0 to lag 9). Correlation and grey correlation analyses were performed to evaluate the lagged relationship between PM2.5 exposure and lung cancer incidence. To address multicollinearity among the lagged exposure variables, ridge regression, support vector regression, and a back-propagation artificial neural network were employed. A combined prediction model was then constructed from these models using the optimal weighting method. Results: Lung cancer incidence was significantly correlated with PM2.5 concentration at different historical time points, with the strongest correlation at lag 9. The combined prediction model, which integrates the individual prediction methods, predicted lung cancer incidence with higher accuracy and reliability than any single model. Conclusion: Long-term exposure to PM2.5, especially exposure with a long lag time, is closely related to lung cancer incidence. The integrated machine learning prediction model can be used as a reliable tool to assess the health risks of air pollution.
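The abstract names an "optimal weighting method" for combining the three models but does not give its formulation. The sketch below assumes one common variant: weights that sum to one and minimise the combined model's sum of squared in-sample errors, with negative weights allowed. The function names (`optimal_weights`, `combine`) and all numbers are hypothetical and are not taken from the paper.

```python
import numpy as np

def optimal_weights(errors):
    """Weights minimising the combined sum of squared errors, subject to sum(w) == 1.

    errors: (n_samples, n_models) array of in-sample prediction errors.
    Assumed closed form: w = E^{-1} 1 / (1^T E^{-1} 1), where E = errors^T errors.
    """
    errors = np.asarray(errors, dtype=float)
    E = errors.T @ errors              # error information matrix
    ones = np.ones(E.shape[0])
    z = np.linalg.solve(E, ones)       # E^{-1} 1 without forming the inverse
    return z / (ones @ z)

def combine(preds, weights):
    """Weighted combination of per-model predictions, shape (n_samples, n_models)."""
    return np.asarray(preds, dtype=float) @ weights

# Hypothetical illustration data: observed incidence and three models' predictions
# (standing in for ridge regression, SVR, and a BP neural network).
y = np.array([50.0, 52.0, 55.0, 53.0, 58.0])
preds = np.column_stack([
    y + np.array([1.0, -2.0, 1.5, -1.0, 2.0]),
    y + np.array([-1.5, 1.0, -0.5, 2.0, -1.0]),
    y + np.array([0.5, 0.5, -2.0, 1.0, 1.5]),
])

w = optimal_weights(preds - y[:, None])
combined = combine(preds, w)
```

Each single model corresponds to a feasible weight vector (one weight of 1, the rest 0), so under this formulation the combined in-sample error cannot exceed that of the best single model. That is an in-sample property only and says nothing about out-of-sample accuracy.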