Application and comparison of ARIMA, LSTM and ARIMA-LSTM models for predicting foodborne diseases in Liaoning Province

Du, Xiaoxiao; Yu, Haomiao; Zhang, Hao; Liu, Xiangyun; Yu, Xinling; Xie, Tao; Diao, Wenli

doi:10.3389/fdata.2025.1666962

ORIGINAL RESEARCH article

Front. Big Data

Sec. Medicine and Public Health

Provisionally accepted

Xiaoxiao Du

Haomiao Yu

Hao Zhang

Xiangyun Liu

Xinling Yu

Tao Xie^*

Wenli Diao^*

Liaoning Provincial Center for Disease Control and Prevention, Shenyang, China

The final, formatted version of the article will be published soon.

Objective: To compare the application of the ARIMA model, the Long Short-Term Memory (LSTM) model and the ARIMA-LSTM model in forecasting foodborne disease incidence.

Methods: Monthly case data of foodborne diseases in Liaoning Province from January 2015 to December 2023 were used to construct ARIMA, LSTM, and ARIMA-LSTM models. These three models were then applied to forecast the monthly incidence of foodborne diseases in 2024, and their predictions were compared with those of a baseline model. Model performance was evaluated by comparing the predicted and observed values using root mean square error (RMSE), mean absolute error (MAE), and mean absolute percentage error (MAPE), allowing identification of the optimal model. The best-performing model was subsequently employed to predict the monthly incidence for 2025.

Results: The ARIMA-LSTM model was identified as the optimal model. Specifically, the ARIMA(2,0,0)(0,1,1)₁₂ model produced RMSE = 300.03, MAE = 187.11, and MAPE = 16.38%, while the LSTM model yielded RMSE = 408.71, MAE = 226.03, and MAPE = 17.21%. In contrast, the ARIMA-LSTM model achieved RMSE = 0.44, MAE = 0.44, and MAPE = 0.08%, representing a dramatic improvement over the baseline model (RMSE = 204.17, MAE = 146.75, MAPE = 15.62%), with reductions of 99.5%, 99.7%, and 99.4% in RMSE, MAE, and MAPE, respectively. Based on the ARIMA-LSTM model, the predicted monthly cases of foodborne diseases for 2025 are: 214.62 (Jan), 260.84 (Feb), 462.92 (Mar), 590.92 (Apr), 800.88 (May), 965.11 (Jun), 2410.36 (Jul), 2651.36 (Aug), 1711.15 (Sep), 941.22 (Oct), 628.21 (Nov), and 465.05 (Dec).

Conclusion: The ARIMA-LSTM model is considered the optimal model for predicting foodborne disease incidence in Liaoning Province in 2025.
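For readers unfamiliar with the three error metrics named in the Methods, the following is a minimal sketch of their standard textbook definitions in Python. It is not the authors' code. The function names and the toy `observed` and `predicted` values are assumptions made purely for illustration and are not data from the study.

```python
import math

def rmse(observed, predicted):
    """Root mean square error: square root of the mean squared difference."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)

def mae(observed, predicted):
    """Mean absolute error: mean of the absolute differences."""
    n = len(observed)
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / n

def mape(observed, predicted):
    """Mean absolute percentage error, in percent.

    Assumes every observed value is non-zero; a month with zero
    observed cases would make this metric undefined.
    """
    n = len(observed)
    return 100.0 * sum(abs((o - p) / o) for o, p in zip(observed, predicted)) / n

# Hypothetical toy series (NOT data from the article).
observed = [100.0, 200.0, 400.0]
predicted = [110.0, 190.0, 420.0]

print(f"RMSE = {rmse(observed, predicted):.2f}")
print(f"MAE  = {mae(observed, predicted):.2f}")
print(f"MAPE = {mape(observed, predicted):.2f}%")
```

RMSE and MAE are expressed in the same unit as the series (here, monthly case counts), whereas MAPE is scale-free, which is one reason the three are often reported together.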

Keywords: foodborne, ARIMA model, LSTM model, ARIMA-LSTM model, Predicting

Received: 18 Jul 2025; Accepted: 23 Oct 2025.

Copyright: © 2025 Du, Yu, Zhang, Liu, Yu, Xie and Diao. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

* Correspondence:
Tao Xie, xietao@lncdc.com
Wenli Diao, diaodwl@163.com

Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.