Incidence of acute hemorrhagic conjunctivitis in Chongqing: a forecasting study based on mathematical models

Xian, Xiaobing; Wu, Sitian; Fu, Yandi; Fan, Xiaoli; Cheng, Yan; Zeng, Li; Hou, Zhangmei; Chen, Yinzhi

doi:10.3389/fpubh.2025.1644729

ORIGINAL RESEARCH article

Front. Public Health, 10 October 2025

Sec. Infectious Diseases: Epidemiology and Prevention

Volume 13 - 2025 | https://doi.org/10.3389/fpubh.2025.1644729

This article is part of the Research Topic: Mathematical Modelling and Data Analysis in Infectious Diseases

Xiaobing Xian¹,²

Sitian Wu³

Yandi Fu³

Xiaoli Fan⁴

Yan Cheng⁵

Li Zeng⁶

Zhangmei Hou⁵

Yinzhi Chen⁵*

¹Department of Operations Management, The Thirteenth People's Hospital of Chongqing, Chongqing, China
²Department of Operations Management, Chongqing Geriatrics Hospital, Chongqing, China
³School of Pediatric, Chongqing Medical University, Chongqing, China
⁴College of public health, Chongqing Medical University, Chongqing, China
⁵Department of Healthcare-associated Infection Control, Chongqing General hospital, Chongqing University, Chongqing, China
⁶School of Mathematics and Statistics, Chongqing Technology and Business University, Chongqing, China

Background: Acute hemorrhagic conjunctivitis (AHC) is a highly infectious eye disease. It poses a significant threat to public health given its propensity for rapid transmission in densely populated areas. Recent epidemiological data have demonstrated a distinct seasonal outbreak pattern in Chongqing. However, conventional single prediction models exhibit limitations in accurately capturing the complex spatiotemporal transmission characteristics of AHC. This study compares the performance of different mathematical models in forecasting AHC incidence in Chongqing. By investigating optimal predictive methodologies, it establishes a theoretical foundation for the relevant departments to formulate policies for preventing AHC.

Methods: The monthly incidence data of AHC in Chongqing from March 2019 to October 2024 were collected from the official website of the Chongqing Municipal Health Commission. Five predictive models (SARIMA, KNN, Prophet model as well as SARIMA-KNN and SARIMA-Prophet model) were employed to fit the incidence data. The data from March 2019 to December 2023 was designated as the training set, while the data from January 2024 to October 2024 served as the test set. Model performance was evaluated through multiple metrics, including MSE, RMSE, MAE, and MAPE. Subsequently, the Diebold-Mariano test was implemented to statistically assess the significance of predictive performance differences among the five models.

Results: From March 2019 to October 2024, the incidence rate of AHC in Chongqing showed a pronounced seasonal fluctuation pattern, with the peak period consistently occurring between June and September each year. The comparative analysis of model performance revealed that the SARIMA-KNN hybrid model achieved the lowest MSE, MAE, RMSE, and MAPE. Furthermore, the predicted curve of the SARIMA-KNN model fit the actual curve more closely than those of the other models. The Diebold-Mariano test confirmed that the predictive performance of the SARIMA-KNN model was significantly superior to that of the other models.

Conclusion: In comparison with the other four models, the SARIMA-KNN hybrid model more effectively integrates the temporal characteristics of AHC incidence. It offers technical support for the development of early warning systems and the formulation of prevention and control strategies in Chongqing. This approach holds substantial practical significance in the field of public health.

1 Introduction

Acute hemorrhagic conjunctivitis (AHC) is a highly contagious viral conjunctivitis primarily caused by enterovirus 70 (EV70) or coxsackievirus A24 (CVA24) infection (1). Its typical clinical symptoms include conjunctival hyperemia, photophobia, epiphora, and foreign body sensation (2). Owing to its short incubation period and high infectivity, AHC is prone to rapid outbreaks and epidemics (3). Since its initial identification in Ghana in 1969 (4), AHC has shown periodic epidemic patterns globally, with significant prevalence observed in Asia, Africa, and Latin America (5). In China, the first outbreak of AHC was reported in Hong Kong in 1971 (6). As one of the most prevalent ocular infectious diseases in China, AHC has been reported in numerous cities. From 2005 to 2012, Chongqing consistently ranked among the top five regions nationwide in terms of AHC incidence rates (5). Recent monitoring shows that Chongqing persists as one of China's high-incidence regions for AHC. According to monitoring reports from the China CDC Information System (CIDCIS), the national incidence rate of AHC showed periodic fluctuations from 2013 to 2020, with 2014 and 2019 being the peak years of reporting. A notable trend is that after 2020, due to the high-intensity COVID-19 prevention and control measures, the reported incidence rate decreased significantly (7). However, with the full resumption of societal activities, epidemic intensity has manifested a rising trend. This phenomenon highlights the urgency of strengthening local prevention and control and prediction research on AHC after the epidemic.

The transmission of AHC depends on multiple factors, including climatic conditions (temperature, humidity) (8), population mobility patterns, and sanitation infrastructure (9, 10). Within Chongqing's subtropical monsoon climate, virus transmission is potentially exacerbated by the hot and rainy summer conditions coupled with cold and humid winter environments (11). In addition, as a large-scale city in southwestern China, Chongqing has substantial population mobility, which significantly increases the risk of disease transmission. Recent epidemiological studies have demonstrated that the incidence of AHC in Chongqing shows cyclical fluctuations and sudden surges (12). The current infectious disease early warning and monitoring platforms mainly rely on traditional statistical models. These models demonstrate limited sensitivity in forecasting diseases with sudden and non-linear transmission characteristics. Meanwhile, existing platforms struggle to dynamically integrate heterogeneous data such as climate and population mobility (13). The accurate detection and early warning of AHC mixed infections remain challenges.

In recent years, advances in infectious disease prediction models have shifted from single statistical methodologies to multi-model integration and data-driven approaches. This transition aims to address the complex characteristics of medical data by leveraging the strengths of diverse algorithms (14). Notably, hybrid models that combine classical statistical methods with machine learning techniques have exhibited remarkable advantages in forecasting various infectious diseases, including tuberculosis (15), hepatitis B (16), and hand-foot-and-mouth disease (17). Yet research on AHC remains scarce, especially in the context of Chongqing, a city with a complex climate and a dense population, where systematic exploration has yet to be conducted.

The Seasonal Autoregressive Integrated Moving Average Model (SARIMA), an extension of the ARIMA framework, has gained widespread recognition in time series forecasting applications (18). The fundamental principle of SARIMA involves eliminating non-stationarity within the sequence through differencing operations while incorporating seasonal parameters (P, D, Q, S) to effectively capture periodic patterns. This model demonstrates superior performance in forecasting periodic data. The K-Nearest Neighbors (KNN) algorithm is generally used for basic classification and regression analysis. In regression, this algorithm relies on the dependent variable values of the k nearest neighbors to predict a given data point (19). The distance between two data points can be determined using a distance function (20). KNN is often used to capture local non-linear fluctuations and short-term trends. Because linear models often cannot produce sufficient results, non-linear structures are adopted in time series analysis (21). We also introduce Prophet, a time series forecasting method based on an additive model developed by Facebook. Its core is to perform curve fitting within the Bayesian inference framework to achieve smoothing and prediction of time series data (22). This model shows robust performance in handling missing values and accommodating trend changes, while effectively fitting complex multiple seasonal patterns.

The SARIMA model excels in handling seasonality and trends within time series (23). The KNN model demonstrates flexibility in generating accurate predictions based on local data characteristics (24). Concurrently, the Prophet model demonstrates superior performance in managing complex seasonality, trend variations, and outlier detection. To enhance the handling of intricate time series data and improve predictive accuracy, this study proposes the SARIMA-KNN hybrid model for the first time, and also introduces the SARIMA-Prophet hybrid model based on the SARIMA and Prophet models. In the predictive process, SARIMA is initially employed to extract linear components from the time series, followed by the application of KNN and Prophet models to model the residual sequences from SARIMA, thereby capturing non-linear features in the data. This multi-model integration strategy enables the simultaneous utilization of diverse algorithmic advantages, offering a more comprehensive representation of complex time series characteristics.

In summary, this study systematically compares the predictive performance of three single models and two hybrid models for forecasting the incidence of AHC in Chongqing. The research aims to identify the most effective predictive methodology and establish a theoretical foundation for early warning systems and resource allocation strategies pertaining to AHC in Chongqing.

2 Material and methods

2.1 Data

This study utilized the AHC data published by the Chongqing Municipal Health Commission (https://wsjkw.cq.gov.cn/) from March 2019 to October 2024. The Chongqing Municipal Health Commission is the official municipal health authority, and the data has been strictly reviewed, ensuring its authority and reliability. In terms of data quality control, all case diagnoses were made in accordance with the national unified “Diagnosis Criteria for Acute Hemorrhagic Conjunctivitis”, ensuring consistency in diagnostic standards. Additionally, the Chinese government attaches great importance to the monitoring of legally notifiable infectious diseases and implements a system of “local management and hierarchical responsibility”. AHC data is reported by grassroots medical institutions within 24 h of diagnosis and is successively reviewed and monitored by disease prevention and control institutions at the district (county), municipal, provincial, and national levels before being released by the Health Commission. This ensures the timeliness and accuracy of the data. The data does not involve personal information, so no professional ethical review is required.

This study employed the SARIMA model, KNN model, Prophet model, and two combined models to fit the monthly incidence rate of AHC. The incidence rate from March 2019 to December 2023 was used as the training set. The incidence rate from January 2024 to October 2024 was used as the test set to validate the predictive performance of the five models. The technical route diagram is shown in Figure 1.
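The split described above can be checked with a short sketch. The helper name `month_range` is hypothetical and only the calendar arithmetic is shown; no incidence values are involved. Under the stated date ranges, the series has 68 monthly observations, of which 58 fall in the training set and 10 in the test set.

```python
# Build the monthly index March 2019 .. October 2024 and split it as described.
def month_range(start, end):
    """Return [(year, month), ...] from start to end, inclusive."""
    (y, m), out = start, []
    while (y, m) <= end:
        out.append((y, m))
        y, m = (y + 1, 1) if m == 12 else (y, m + 1)
    return out

months = month_range((2019, 3), (2024, 10))
train = [ym for ym in months if ym <= (2023, 12)]   # training set
test = [ym for ym in months if ym >= (2024, 1)]     # test set

print(len(months), len(train), len(test))
```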

Figure 1

Flowchart depicting the process of predicting and evaluating the monthly incidence of AHC in Chongqing Municipality. It starts with actual data from March 2019 to December 2023, leading to model building with SARIMA, KNN, and Prophet models. Combined models are created, followed by forecasting from January to October 2024. Predicted results are compared with actual data using MSE, RMSE, MAE, and MAPE for effect evaluation. The best model is chosen based on these evaluations.

Figure 1. Technical route diagram of the development, prediction, and evaluation process for the AHC incidence prediction model in Chongqing.

2.2 Data analysis software

Data preprocessing and descriptive statistics were conducted using SPSS 25.0. Model fitting procedures for SARIMA, KNN, Prophet models, along with SARIMA-KNN and SARIMA-Prophet models were implemented in R 4.3.0. Throughout the study, statistical significance was determined at the conventional threshold of P < 0.05.

2.3 SARIMA model

The SARIMA model, as an extension of the ARIMA model, is specifically designed to handle time series data with seasonal components (25). The structure of a complete SARIMA model is expressed as:

\begin{array}{l} SARIMA (p, d, q) \times {(P, D, Q)}_{m} & (1) \end{array}

where p, d, and q represent the autoregressive order, the differencing order, and the moving average order of the non-seasonal part, respectively; P, D, and Q are the corresponding orders of the seasonal part; and m indicates the length of the seasonal cycle (for example, m = 12 for monthly data) (26). The general mathematical representation of the SARIMA model is as follows:

\begin{array}{l} Φ_{P} (B^{m}) φ_{p} (B) {(1 - B^{m})}^{D} {(1 - B)}^{d} y_{t} = Θ_{Q} (B^{m}) θ_{q} (B) ω_{t} & (2) \end{array}

B is the backward shift operator, y_t is a non-stationary time series, and ω_t is a Gaussian white noise process. D is the order of seasonal differencing, and D = 1 is usually sufficient to achieve stationarity. φ_p(B) is the non-seasonal autoregressive polynomial, θ_q(B) is the non-seasonal moving average polynomial, $Φ_{P} (B^{m})$ is the seasonal autoregressive polynomial, and $Θ_{Q} (B^{m})$ is the seasonal moving average polynomial. The four polynomials are expressed as follows:

\begin{array}{l} φ_{p} (B) = 1 - φ_{1} B - φ_{2} B^{2} - \dots - φ_{p} B^{p} & (3) \end{array}

\begin{array}{l} θ_{q} (B) = 1 + θ_{1} B + θ_{2} B^{2} + \dots + θ_{q} B^{q} & (4) \end{array}

\begin{array}{l} Φ_{P} (B^{m}) = 1 - Φ_{1} B^{m} - Φ_{2} B^{2 m} - \dots - Φ_{P} B^{Pm} & (5) \end{array}

\begin{array}{l} Θ_{Q} (B^{m}) = 1 + Θ_{1} B^{m} + Θ_{2} B^{2 m} + \dots + Θ_{Q} B^{Qm} & (6) \end{array}
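As a worked instance of Equations 2 to 6, consider the SARIMA(0, 1, 3) × (1, 1, 0)12 specification reported in Section 3.7. Setting p = 0, q = 3, P = 1, Q = 0, d = D = 1, and m = 12 makes the non-seasonal autoregressive polynomial and the seasonal moving average polynomial both equal to 1, so Equation 2 reduces to the form below. This is an illustrative expansion only; the estimated coefficient values are not given in the text.

\begin{array}{l} (1 - Φ_{1} B^{12}) (1 - B^{12}) (1 - B) y_{t} = (1 + θ_{1} B + θ_{2} B^{2} + θ_{3} B^{3}) ω_{t} \end{array}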

The construction of a SARIMA model mainly involves three steps: stationarity test, model selection, and parameter verification.

Firstly, conduct a stationarity test on the original sequence using the Augmented Dickey-Fuller (ADF) unit root test; if P < 0.05, the sequence is considered stationary. If not, apply differencing: typically, start with seasonal differencing and, if the sequence remains non-stationary, proceed with non-seasonal differencing. Additionally, identify the presence of seasonality and the cycle length m by plotting the sequence, examining the seasonal decomposition plot, and calculating the autocorrelation at seasonal lags.
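The differencing step can be sketched in plain Python. The toy series below (a linear trend plus a repeating 12-month pattern) is invented for illustration, the function name `difference` is hypothetical, and the ADF test itself is not implemented here. Seasonal differencing at lag 12 removes the yearly pattern, and a further first-order difference removes what remains of the trend.

```python
def difference(series, lag=1):
    """Return the lag-differenced series: y[t] - y[t - lag]."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]

# Toy monthly series: linear trend plus a repeating 12-month pattern (not study data).
season = [0, 0, 1, 1, 2, 4, 6, 5, 3, 1, 0, 0]
y = [0.5 * t + season[t % 12] for t in range(36)]

seasonal_diff = difference(y, lag=12)         # (1 - B^12) y_t
both_diff = difference(seasonal_diff, lag=1)  # (1 - B)(1 - B^12) y_t

print(len(y), len(seasonal_diff), len(both_diff))
```

Each differencing pass shortens the series by its lag, so one seasonal and one first-order difference together cost 13 observations.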

Secondly, for the stationary sequence, plot its autocorrelation function (ACF) and partial autocorrelation function (PACF) graphs. The ACF quantifies the correlation between a time series and its lagged values, while the PACF measures the correlation between the time series and its lagged values at a specific time interval, excluding the influence of intermediate lags (27). The ACF of a time series can be expressed as:

\begin{array}{l} ACF (y_{t}, y_{t - k}) = \frac{Covariance (y_{t}, y_{t - k})}{Variance (y_{t})} & (7) \end{array}

Here k is the lag, i.e., the number of time steps separating $y_{t}$ and $y_{t - k}$.
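A minimal sketch of the sample version of Equation 7 follows. The function name `sample_acf` and the toy periodic series are assumptions made for illustration; the estimator shown (lagged cross-products divided by the total sum of squares) is the usual sample autocorrelation, which may differ in detail from what a given statistical package reports.

```python
def sample_acf(series, k):
    """Sample autocorrelation at lag k: lag-k autocovariance over the variance."""
    n = len(series)
    mean = sum(series) / n
    var = sum((v - mean) ** 2 for v in series)
    cov = sum((series[t] - mean) * (series[t - k] - mean) for t in range(k, n))
    return cov / var

# Five identical "years" of monthly values (toy data, not study data).
season = [0, 0, 1, 1, 2, 4, 6, 5, 3, 1, 0, 0]
y = season * 5

print(sample_acf(y, 0))                       # lag 0 is 1 by construction
print(sample_acf(y, 12) > sample_acf(y, 6))   # seasonal lag vs. half-year lag
```

For a series with a yearly cycle, the autocorrelation at lag 12 is large and positive, which is the pattern used to identify m.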

The PACF between two observations can be expressed as follows:

\begin{array}{l} PACF (y_{t}, y_{t - 2}) = \frac{Covariance (y_{t}, y_{t - 2} | y_{t - 1})}{\sqrt{Variance (y_{t} | y_{t - 1})} \sqrt{Variance (y_{t - 2} | y_{t - 1})}} & (8) \end{array}

Based on the truncation (cut-off) and tailing-off behavior observed in the ACF/PACF plots, preliminary values of the parameters p, q, P, and Q can be identified as follows:

The order of the non-seasonal AR term p: PACF truncates after lag p.

The order of the non-seasonal MA term q: ACF truncates after lag q.

The order of the seasonal AR term P: PACF truncates at seasonal lags (such as m, 2m, ...).

The order of the seasonal MA term Q: ACF truncates at seasonal lags.

After the initial determination of the order, the model parameters are fitted using the maximum likelihood estimation (MLE) or conditional least squares estimation method. Multiple candidate models are compared using information criteria such as AIC, AICc, and BIC, and the model with the smallest value is preferred.

\begin{array}{l} AIC = - 2 log L (\hat{θ}) + 2 K & (9) \end{array}

\begin{array}{l} AICc = - 2 log L (\hat{θ}) + 2K + \frac{2K (K + 1)}{N - K - 1} & (10) \end{array}

\begin{array}{l} BIC = - 2 log L (\hat{θ}) + K log N & (11) \end{array}

$log L (\hat{θ})$ is the maximized log-likelihood, K is the total number of estimated model parameters, and N is the number of observations.
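Equations 9 to 11 translate directly into code. In the sketch below, the function name `information_criteria`, the candidate labels, the log-likelihood values, and the observation count are all hypothetical and chosen only to make the arithmetic easy to follow.

```python
import math

def information_criteria(log_lik, k, n):
    """Return (AIC, AICc, BIC) for a maximized log-likelihood, k parameters, n observations."""
    aic = -2.0 * log_lik + 2.0 * k
    aicc = aic + (2.0 * k * (k + 1)) / (n - k - 1)
    bic = -2.0 * log_lik + k * math.log(n)
    return aic, aicc, bic

# Hypothetical candidates: (label, maximized log-likelihood, number of parameters).
candidates = [("model A", -80.0, 4), ("model B", -79.5, 6)]
n_obs = 45  # illustrative number of observations

for label, ll, k in candidates:
    aic, aicc, bic = information_criteria(ll, k, n_obs)
    print(label, round(aic, 2), round(aicc, 2), round(bic, 2))
```

In this toy comparison, model B fits slightly better but pays for two extra parameters, so all three criteria prefer model A.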

Third, parameter testing: the Ljung–Box Q test is applied to the residuals. If P > 0.05, the null hypothesis that the residuals are white noise cannot be rejected, indicating that the model has adequately captured the information in the data. t-tests are then conducted on the model parameters; P < 0.05 indicates that a parameter is statistically significant, supporting the adequacy of the established model.

2.4 KNN model

The principle of KNN for time series prediction is based on “K nearest similarity.” The core methodology involves identifying the historical segments most similar to the current time window and using the subsequent values of these segments for prediction (28). The steps are as follows:

First, the original data is transformed into a structured representation suitable for supervised learning through preprocessing. In this study, the sliding window method is used to reconstruct the continuous time series, dividing the original sequence $X = (x_{1}, x_{2}, \dots, x_{T})$ of length T into several fixed-length observation windows with n = 12. For each time point t, its feature vector is defined as $S_{t} = (x_{t - n + 1}, x_{t - n + 2}, \dots, x_{t})$, with the corresponding target output being the future h = 1 step sequence segment $Y_{t} = (x_{t + 1}, x_{t + 2}, \dots, x_{t + h})$. To further enhance the feature representation ability, sliding statistics can be introduced as auxiliary features to construct a multi-dimensional feature space.
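The sliding window reconstruction can be sketched as follows. The function name `make_windows` is hypothetical, the integer sequence is a stand-in for the 58 monthly training values, and the auxiliary sliding statistics mentioned above are not included.

```python
def make_windows(series, n=12, h=1):
    """Pair each length-n window S_t with the h values that follow it (Y_t)."""
    features, targets = [], []
    for end in range(n, len(series) - h + 1):
        features.append(series[end - n:end])   # S_t = (x_{t-n+1}, ..., x_t)
        targets.append(series[end:end + h])    # Y_t = (x_{t+1}, ..., x_{t+h})
    return features, targets

toy = list(range(1, 59))   # stand-in for 58 monthly training values
X, Y = make_windows(toy, n=12, h=1)
print(len(X), X[0], Y[0])
```

With n = 12 and h = 1, a series of length T yields T - 12 supervised samples, so a 58-month training series produces 46 window and target pairs.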

Next, calculate the distance between the target sample and each individual instance within the training data. Common distance measures include the Euclidean distance, the Manhattan distance, and the Minkowski distance (29, 30).

The Minkowski distance used in this study is a generalization of the Euclidean distance and the Manhattan distance, with the formula:

d = ${(\sum_{i = 1}^{n} | x_{1 i} - x_{2 i} |^{p})}^{\frac{1}{p}}$ (When p = 2, it is the Euclidean distance; when p = 1, it is the Manhattan distance.)

Then, based on the calculated distances, select the K sample points in the training set that are closest to the test sample, and vote or average the values of these points to obtain the prediction result.
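The distance and neighbor-averaging steps can be sketched together. The function names `minkowski` and `knn_predict` and the toy training pairs are hypothetical; the sketch uses plain averaging of the K nearest targets, which is one of the two options named above.

```python
def minkowski(a, b, p=2):
    """Minkowski distance; p=2 is Euclidean, p=1 is Manhattan."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)

def knn_predict(train_X, train_y, query, k=3, p=2):
    """Average the targets of the k training windows closest to the query."""
    ranked = sorted(range(len(train_X)),
                    key=lambda i: minkowski(train_X[i], query, p))
    nearest = ranked[:k]
    return sum(train_y[i] for i in nearest) / k

# Toy windows and next-step targets (not study data).
train_X = [[1, 2], [2, 3], [3, 4], [10, 11]]
train_y = [3.0, 4.0, 5.0, 12.0]

print(minkowski([0, 0], [3, 4], p=2))   # Euclidean
print(minkowski([0, 0], [3, 4], p=1))   # Manhattan
print(knn_predict(train_X, train_y, [2, 3], k=3))
```

The distant window [10, 11] is excluded when k = 3, so its target does not pull the prediction away from the local pattern.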

Finally, make adjustments and optimizations, using 5-fold cross-validation and grid search for model training and parameter tuning.

2.5 Prophet model

The Prophet model, developed by Facebook, is a time series forecasting tool that employs a Bayesian-based curve fitting methodology to both smooth and predict time series data, thereby facilitating the rapid acquisition of forecasting outcomes. The Prophet model comprises three principal components: trend, seasonality, and holidays (22). Its fundamental equation is formulated as follows:

\begin{array}{l} y (t) = g (t) + s (t) + h (t) + ε_{t} & (12) \end{array}

Here, g(t) denotes the trend function characterizing the non-periodic variations in the time series, s(t) represents the seasonal component, h(t) signifies the influence of holidays or specific events on the time series, and $ε_{t}$ constitutes the error term (31).

Regarding the trend modeling approach, it encompasses the fitting of piecewise linear curves or non-linear saturation growth models. The growth pattern is conventionally modeled through the logistic growth model, whose fundamental formulation is as follows:

\begin{array}{l} g (t) = \frac{C}{1 + exp (- k (t - m))} & (13) \end{array}

In this context, C denotes the carrying capacity, k signifies the growth rate, and m indicates the offset parameter. Notably, both the carrying capacity and the growth rate may vary over time rather than being constants. By adjusting these parameters, the model's flexibility can be effectively modulated.
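Equation 13 can be evaluated directly. The sketch below holds C, k, and m constant for simplicity, which is a simplifying assumption relative to the time-varying parameters described above; the function name `logistic_trend` and the parameter values are illustrative and are not estimates from this study.

```python
import math

def logistic_trend(t, C, k, m):
    """g(t) = C / (1 + exp(-k (t - m))): saturating growth toward capacity C."""
    return C / (1.0 + math.exp(-k * (t - m)))

C, k, m = 10.0, 0.5, 24.0   # illustrative values only
for t in (0, 24, 60):
    print(t, round(logistic_trend(t, C, k, m), 3))
```

At t = m the trend sits at half the carrying capacity, and for large t it approaches C without exceeding it.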

2.6 SARIMA-KNN model

The SARIMA-KNN model integrates the advantages of the SARIMA algorithm and the KNN algorithm through a two-stage modeling strategy. Firstly, a SARIMA(p, d, q) × (P, D, Q)m model is constructed. The non-stationarity of the sequence is eliminated through differencing operations (d, D), and the deterministic structural features of the time series are extracted using autoregressive (p, P), moving average (q, Q), and seasonal (m) components. Then, the standardized residual sequence derived from the SARIMA model is utilized as input for the KNN model. This residual term contains the non-linear features and random components in the original sequence that were not explained by the linear model. The length L of the local modeling window is optimized through an adaptive sliding window mechanism and combined with a density-sensitive K value selection strategy; the optimal hyperparameter combination is then determined by grid search with k-fold cross-validation, so that the non-linear dynamic features in the residual sequence are captured effectively.

2.7 SARIMA-Prophet model

The SARIMA-Prophet model shares a similar modeling framework with the SARIMA-KNN hybrid model described in Section 2.6, as both employ a phased modeling approach. In the initial phase, both models utilize the SARIMA model for time series fitting and forecasting, obtaining the initial prediction results and the residual sequence. The difference is that this model inputs the standardized residual sequence into the Prophet model. The Prophet model automatically decomposes and fits the complex seasonal patterns and non-linear trend changes in the residuals through an additive model structure. The final prediction is obtained by integrating the forecast outputs from both the SARIMA model and the Prophet model's residual predictions.
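The two-stage recombination shared by both hybrid models can be sketched as below. This is an illustration under stated assumptions, not the authors' implementation: the function names `standardize` and `combine` are hypothetical, the numbers are invented, the second-stage model is replaced by a placeholder, and the de-standardization of the residual forecast before it is added back is an assumption about how the standardized residuals are returned to the original scale.

```python
from statistics import mean, pstdev

def standardize(residuals):
    """Return z-scored residuals plus the mean and standard deviation used."""
    mu, sigma = mean(residuals), pstdev(residuals)
    return [(r - mu) / sigma for r in residuals], mu, sigma

def combine(linear_forecast, z_residual_forecast, mu, sigma):
    """Final forecast = linear-stage forecast + de-standardized residual forecast."""
    return [f + (z * sigma + mu)
            for f, z in zip(linear_forecast, z_residual_forecast)]

# Hypothetical numbers standing in for stage-one output.
actual = [2.0, 3.0, 5.0, 4.0]
linear_fit = [2.5, 2.5, 4.0, 4.5]
residuals = [a - f for a, f in zip(actual, linear_fit)]  # unexplained by stage one

z, mu, sigma = standardize(residuals)
# A second-stage model (KNN or Prophet in the text) would forecast future z-values;
# this placeholder simply reuses the last standardized residual.
z_next = [z[-1]]
final = combine([4.2], z_next, mu, sigma)
print(final)
```

The only difference between the two hybrids in this framing is which model produces `z_next`.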

2.8 Model evaluation

We used root mean square error (RMSE), mean absolute error (MAE), mean absolute percentage error (MAPE), and mean square error (MSE) to evaluate the predictive performance of the five models (32). These indicators measure prediction accuracy from different perspectives:

1) RMSE: sensitive to outliers, with units consistent with the original data; suitable for scenarios where large errors need to be controlled.

2) MAE: reflects the mean of the absolute errors and is more robust to extreme values than RMSE.

3) MAPE: measures relative error in percentage form, but requires non-zero actual values.

4) MSE: a commonly used objective function for model optimization; because its units are squared, it should be interpreted together with RMSE.

According to previous studies, when the RMSE, MAE, MAPE, and MSE of a model are smaller, the model's goodness of fit is better. The following are the calculation methods:

\begin{array}{l} RMSE = \sqrt{\frac{\sum_{i = 1}^{n} {(y_{i} - {\hat{y}}_{i})}^{2}}{n}} & (14) \end{array}

\begin{array}{l} MAE = \frac{1}{n} \sum_{i = 1}^{n} | y_{i} - {\hat{y}}_{i} | & (15) \end{array}

\begin{array}{l} MAPE = \frac{1}{n} \sum_{i = 1}^{n} \left| \frac{y_{i} - {\hat{y}}_{i}}{y_{i}} \right| \times 100 \% & (16) \end{array}

\begin{array}{l} MSE = \frac{1}{n} \sum_{i = 1}^{n} {(y_{i} - {\hat{y}}_{i})}^{2} & (17) \end{array}
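Equations 14 to 17 can be written as four short functions. The actual and predicted values below are invented for illustration and are not results from this study.

```python
import math

def mse(y, yhat):
    """Mean of the squared errors."""
    return sum((a - f) ** 2 for a, f in zip(y, yhat)) / len(y)

def rmse(y, yhat):
    """Square root of the MSE, in the units of the data."""
    return math.sqrt(mse(y, yhat))

def mae(y, yhat):
    """Mean of the absolute errors."""
    return sum(abs(a - f) for a, f in zip(y, yhat)) / len(y)

def mape(y, yhat):
    """Mean absolute percentage error; undefined if any actual value is zero."""
    return 100.0 * sum(abs((a - f) / a) for a, f in zip(y, yhat)) / len(y)

actual = [2.0, 4.0, 5.0, 4.0]      # illustrative values only
predicted = [1.0, 4.0, 7.0, 5.0]

print(mse(actual, predicted), rmse(actual, predicted),
      mae(actual, predicted), mape(actual, predicted))
```

The single error of 2 contributes four of the six units of squared error, which shows why MSE and RMSE react more strongly to large misses than MAE does.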

To evaluate whether the differences in predictive performance among the five models are statistically significant, we conducted the Diebold-Mariano test, comparing SARIMA-KNN pairwise with each of the other four models.
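A minimal sketch of the Diebold-Mariano statistic follows. The text does not state the loss function or forecast horizon used, so the sketch assumes squared-error loss, one-step-ahead forecasts, and a two-sided normal approximation; the function name `diebold_mariano` and the forecast values are hypothetical. With only 10 test points the normal approximation is rough, and small-sample corrections exist that are not applied here.

```python
import math

def diebold_mariano(actual, forecast_a, forecast_b):
    """DM statistic and two-sided normal p-value (squared-error loss, horizon 1)."""
    d = [(y - fa) ** 2 - (y - fb) ** 2
         for y, fa, fb in zip(actual, forecast_a, forecast_b)]
    n = len(d)
    d_bar = sum(d) / n
    gamma0 = sum((v - d_bar) ** 2 for v in d) / n   # variance of the loss differential
    stat = d_bar / math.sqrt(gamma0 / n)
    p_value = 2.0 * (1.0 - 0.5 * (1.0 + math.erf(abs(stat) / math.sqrt(2.0))))
    return stat, p_value

# Invented values for illustration only.
actual =     [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 5.0, 4.0, 3.0, 2.0]
forecast_a = [1.1, 2.2, 2.9, 4.1, 4.8, 6.1, 5.2, 3.9, 3.1, 2.1]  # hypothetical closer model
forecast_b = [1.5, 2.9, 2.2, 4.6, 4.1, 6.8, 5.9, 3.2, 3.7, 2.6]  # hypothetical looser model

stat, p = diebold_mariano(actual, forecast_a, forecast_b)
print(round(stat, 3), round(p, 4))
```

A negative statistic means the first forecast has the lower average loss; a small p-value indicates that the difference is unlikely to be due to chance under the stated assumptions.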

3 Results

3.1 Descriptive statistical results

Table 1 shows the monthly incidence rate of acute hemorrhagic conjunctivitis in Chongqing from March 2019 to October 2024. The highest monthly incidence was recorded in July 2019, with a rate of 9.69 per 100,000, while the lowest was recorded in January 2023, with a rate of 0.53 per 100,000. Figure 2 indicates that the incidence rate of AHC in Chongqing is relatively high from June to September each year.

Table 1

Table 1. The incidence distribution of acute hemorrhagic conjunctivitis in Chongqing from March 2019 to October 2024.

Figure 2

Line graph showing monthly incidence rates per 100,000 from 2019 to 2024. Different colored lines represent each year. Notable peaks are in July 2019 and June 2022. Data shows varying trends over years, with most fluctuations occurring mid-year.

Figure 2. Seasonal distribution of AHC in Chongqing Municipality. The vertical axis represents incidence rate (per 100,000 population), and the horizontal axis indicates months. Lines of different colors correspond to each year, showing distinct seasonal fluctuations with an annual peak between June and September. The magnitude of variation differs across years.

3.2 Performance of SARIMA model

From the time series plot in Figure 3A, it can be observed that the sequence has a clear downward trend, suggesting that it is a non-stationary time series; the ADF test yields P = 0.1485 > 0.05, which is consistent with non-stationarity. As shown by the seasonal component in Figure 4, the sequence has a distinct seasonal pattern with a yearly cycle. Since the data are monthly, the cycle length is 12. After first-order differencing and first-order seasonal differencing of the original sequence, the fluctuations of the sequence become relatively stable, as depicted in Figure 3B. The ADF test then gives P = 0.01 < 0.05, so SARIMA(p, 1, q) × (P, 1, Q)12 is initially selected. Through stepwise parameter estimation, SARIMA(0, 1, 3) × (1, 1, 0)12 is ultimately determined as the best prediction model. The Ljung-Box test yields P = 0.1292 > 0.05, so the null hypothesis of white-noise residuals is not rejected, indicating that the fitted model is adequate.

Figure 3

Two line graphs labeled A and B show incidence rates per 100,000 over time. Graph A spans from 2019 to 2024, with fluctuations peaking at around 10. Graph B covers 2019 to 2024, showing smaller fluctuations between -4.5 and 4.5. Both graphs depict variable trends.

Figure 3. Time series analysis of acute hemorrhagic conjunctivitis incidence. (A) Time series plot of the incidence rate of acute hemorrhagic conjunctivitis. (B) Time series plot of the sequence after first-order difference and seasonal difference.

Figure 4

Time series decomposition graph showing four panels: observed, trend, seasonal, and random components from 2019 to 2024. The observed data displays fluctuations. The trend component shows a gradual decline, the seasonal component indicates repeating patterns, and the random component reveals irregular variations.

Figure 4. Decomposition of the incidence sequence of acute hemorrhagic conjunctivitis, revealing the underlying trend, seasonality, and random variations in incidence dynamics.

3.3 Performance of KNN Model

The KNN model was trained and tuned using 5-fold cross-validation and grid search, with RMSE as the criterion for selecting the best model. The minimal RMSE of ~1.610 was achieved when the parameter k was set to 8. Consequently, k = 8 was adopted as the optimal setting for the KNN model on the AHC dataset.

3.4 Performance of Prophet model

The Prophet model was employed to automatically fit the incidence rate of AHC in Chongqing from March 2019 to December 2023, subsequently projecting the incidence rate from January to October 2024. The predictive outcomes are presented in Table 3, while the fitting and prediction curves are illustrated in Figure 5. The findings demonstrate that the monthly incidence rate of AHC in Chongqing exhibits a distinct seasonal pattern.

Figure 5

Line graph showing incidence rates per 100,000 from 2020 to 2024 with actual and predicted data. Actual data in red fluctuates around 0 to 7.5. Predictions use KNN (green), Prophet (purple), SARIMA-KNN (brown), SARIMA-Prophet (pink), and SARIMA (dark blue) models. Predictions converge around 2024 with a shaded area indicating uncertainty.

Figure 5. The fitting situation of the actual incidence rate of acute hemorrhagic conjunctivitis from March 2019 to December 2023 and the predicted incidence rate from January 2024 to October 2024. The SARIMA-KNN predicted curve closely aligns with the actual incidence, demonstrating the model's accuracy.

3.5 Performance of the SARIMA-KNN Hybrid Model

We extracted the deterministic structural features of the sequence using the SARIMA model, and then input the residual sequence of the SARIMA model into the KNN model. We determined the optimal parameters of the hybrid model using the same steps as when modeling separately. Table 2 presents the performance evaluation metrics of the SARIMA-KNN model.

Table 2

Table 2. Evaluation indicators of the five models.

3.6 Performance of the SARIMA-Prophet Hybrid Model

The hybrid model first captures the linear trend and seasonal components in the time series through SARIMA, and then inputs the obtained residual sequence into the Prophet model for further fitting of the implicit non-linear features. The performance indicators of the model are shown in Table 2.

3.7 Performance Comparison

The performance metrics of all models are presented in Table 2. The SARIMA-KNN model shows lower MSE, MAE, RMSE, and MAPE than the other four models. The residual error accumulation observed in the SARIMA-Prophet combination further supports the rationale for employing KNN to correct non-linear residuals. Furthermore, the Diebold-Mariano test indicates that the prediction accuracy of the SARIMA-KNN combined model is significantly higher (P < 0.05) than that of the remaining models. Consequently, we conclude that the SARIMA-KNN model demonstrates superior accuracy and applicability in predicting the incidence of AHC in Chongqing Municipality.

The prediction results of the SARIMA(0, 1, 3) × (1, 1, 0)12 model, the KNN model, and the SARIMA-KNN model for January 2024 to October 2024 are shown in Table 3. (Where the predicted range includes negative values, individual point predictions should be interpreted with caution, although the overall performance of the model still meets the requirements for early warning.) Empirical analysis reveals that the SARIMA-KNN model demonstrates superior predictive accuracy, exhibiting lower prediction errors than the other two models across the majority of the predicted months. Furthermore, the comparative fitting performance of these three models is illustrated in Figure 5.

Table 3

Table 3. Comparison of prediction results of five models.

4 Discussion

Our findings demonstrate that the incidence of AHC in Chongqing manifests pronounced seasonal variations, with the peak prevalence consistently observed between June and September annually. This epidemiological phenomenon is strongly associated with the region's distinctive summer climatic conditions. Specifically, the average temperature in Chongqing during this period ranges from 25°C to 30°C, accompanied by relative humidity levels exceeding 70%. Such environmental parameters create optimal conditions for the survival and proliferation of conjunctivitis viruses, consistent with the established biological characteristic that these pathogens thrive in warm and humid environments (8). In addition, under high temperature and humidity conditions during summer, the sebum secreted by the meibomian glands of the eyes increases, creating a more suitable environment for viral propagation. The period from June to September aligns with school summer vacations, during which students engage in various group activities including training programs and summer camps. As a major tourist destination, Chongqing experiences substantial population concentration and high mobility rates. The emergence of even a single case significantly elevates the risk of rapid disease transmission within the community.

The SARIMA-KNN hybrid model developed in this study successfully identified the incidence pattern of AHC in Chongqing. The RMSE of the hybrid model was 5.9% lower than that of the single SARIMA model. To further validate the efficacy of hybrid models, we included the Prophet model and its hybrid with SARIMA as benchmarks. Although Prophet handles seasonality and holiday effects automatically, its RMSE of 1.39 in this study remained higher than that of the SARIMA-KNN hybrid model. This finding suggests that, although Prophet effectively captures prominent periodic patterns, KNN's instance-based learning and local adaptability are better suited to the complex non-linear residuals in AHC incidence rates. This outcome is consistent with the core idea of recent research on monkeypox virus prediction (33): by constructing structurally appropriate combined models that integrate the strengths of different algorithms, the limited representational capacity of individual models can be overcome, allowing the complex epidemiological characteristics of infectious diseases to be captured more comprehensively and precisely. In traditional time series forecasting, the SARIMA model extends the ARIMA model with seasonal effects and has been widely applied to infectious diseases such as influenza and hand-foot-and-mouth disease because it captures seasonal and periodic patterns effectively (17, 34). However, the SARIMA model adapts poorly to sudden events and requires differencing to stabilize non-stationary series, which may cause information loss and reduce prediction accuracy. This limitation has been noted previously in influenza prediction studies (35).
Conversely, the KNN algorithm has shown remarkable flexibility in non-linear pattern recognition, as evidenced by its applications in air quality prediction and emergency department volume forecasting (36), but it struggles with cyclical patterns when used alone (37). In this study, combining the two methods allowed the SARIMA module to model the long-term downward trend and the linear patterns of AHC's cyclical summer peak, while the KNN module used localized similarity searches to identify non-linear anomalous fluctuations potentially induced by environmental variation and social behavior, thereby further reducing prediction errors.
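The division of labor described above, in which a seasonal model supplies the baseline and KNN corrects its residuals, can be illustrated with a minimal sketch. To stay self-contained, a seasonal-naive forecast stands in for the fitted SARIMA model; this substitution, the function names, the number of residual lags, the value of k, and the synthetic series are all assumptions made for illustration and are not the configuration used in this study.

```python
import math

def seasonal_naive_fit(series, period=12):
    """Stand-in for SARIMA (assumption): predict each month by the value one season earlier."""
    return [series[t - period] for t in range(period, len(series))]

def knn_predict(history, query, k=3):
    """Average the targets of the k lag windows closest to `query` (Euclidean distance)."""
    ranked = sorted(history, key=lambda pair: math.dist(pair[0], query))
    nearest = ranked[:k]
    return sum(target for _, target in nearest) / len(nearest)

def hybrid_one_step(series, period=12, lags=3, k=3):
    """One-step-ahead forecast: baseline forecast plus a KNN-predicted residual."""
    fitted = seasonal_naive_fit(series, period)
    residuals = [obs - fit for obs, fit in zip(series[period:], fitted)]
    if len(residuals) <= lags:
        raise ValueError("not enough residuals to build lagged windows")
    # Training pairs: (previous `lags` residuals) -> next residual.
    pairs = [(residuals[i - lags:i], residuals[i]) for i in range(lags, len(residuals))]
    baseline = series[-period]  # seasonal-naive forecast for the next month
    correction = knn_predict(pairs, residuals[-lags:], k)
    return baseline, baseline + correction

# Synthetic monthly series: a fixed seasonal shape plus a steady drift (illustration only).
pattern = [1, 1, 2, 3, 5, 8, 9, 8, 5, 3, 2, 1]
series = [pattern[t % 12] + 0.1 * t for t in range(48)]
baseline, hybrid = hybrid_one_step(series)
truth = pattern[48 % 12] + 0.1 * 48
print(baseline, hybrid, truth)
```

In this toy series the baseline misses the drift, so its residuals carry a systematic signal that the KNN step recovers; with real surveillance data the residuals are noisier and the gain would be smaller.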

Furthermore, while numerous studies have integrated machine learning with traditional time series models (38, 39), the innovation of our approach lies in recognizing that SARIMA prediction residuals contain spatio-temporal heterogeneity information beyond the model's linear assumptions. The KNN algorithm performs spatial interpolation on these residuals by incorporating distance weights, thereby achieving geographical refinement of prediction outcomes. Compared with hybrid models widely adopted in recent studies, such as SARIMA-LSTM and SARIMA-XGBoost (40, 41), the SARIMA-KNN hybrid model proposed here maintains comparable prediction accuracy while offering better computational efficiency and interpretability. LSTM models can capture complex sequential dependencies, but they require substantial training data and intricate hyperparameter optimization. Similarly, despite its robust predictive capabilities, XGBoost is difficult to interpret in terms of epidemiological mechanisms because of its “black box” nature. KNN, by contrast, has an intuitive algorithmic principle, few parameters, and good adaptability to small and medium-sized datasets. These attributes make it particularly valuable for infectious diseases with limited data availability, such as AHC, and for resource-constrained regions. Furthermore, because AHC outbreaks begin abruptly and spread rapidly, they demand swift public health responses; the prediction outputs of the SARIMA-KNN hybrid model can be integrated with existing infectious disease surveillance and reporting systems to support the development of a dynamic early warning system (42).
Through comparative analysis of short-term predicted values against historical baseline levels and the establishment of multi-tiered risk-level warning thresholds, the system can automatically generate alerts to disease control authorities when predicted values surpass predefined thresholds. This automated mechanism enables prompt enhancement of pathogen monitoring and preparation of epidemic prevention materials, thereby optimizing prevention and control efficiency.
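A multi-tiered threshold rule of this kind could look like the following minimal sketch. The thresholds (historical mean plus 1, 2, and 3 standard deviations), the tier names, and the input values are hypothetical choices made for illustration; the study does not specify them.

```python
from statistics import mean, stdev

def warning_level(predicted, baseline_values, multipliers=(1.0, 2.0, 3.0)):
    """Map a predicted value to a tier from 0 to 3 using mean + m * SD thresholds."""
    centre = mean(baseline_values)
    spread = stdev(baseline_values)
    thresholds = [centre + m * spread for m in multipliers]
    # The tier is the number of thresholds that the prediction exceeds.
    return sum(predicted > t for t in thresholds)

LABELS = ["normal", "watch", "alert", "emergency"]  # hypothetical tier names

# Illustrative same-month values from earlier years (assumption, not study data).
july_history = [4.0, 5.0, 6.0, 5.0, 5.0]
for forecast in (5.2, 6.0, 7.5):
    print(forecast, LABELS[warning_level(forecast, july_history)])
```

Comparing each forecast with the same calendar month in earlier years keeps the seasonal summer peak from triggering alerts by itself; in practice the multipliers would need to be calibrated against past outbreaks.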

Although the SARIMA-KNN hybrid model in this study was developed to predict the incidence of acute hemorrhagic conjunctivitis, its methodology can be extended to other disease prediction and emergency response scenarios. Similar to the ANN-CVD study (43), which used artificial neural networks to predict cardiovascular disease mortality in Pakistan, this model combines SARIMA and KNN and is therefore also suitable for health data that require consideration of both long-term trends and short-term fluctuations. Moreover, the response to public health emergencies such as COVID-19 must take policy dynamics and time sensitivity into account (44). The SARIMA-KNN hybrid framework can be enhanced by integrating external covariates and optimizing the sliding window, providing a time-sensitive predictive tool for emergency decision-making. This direction is highly consistent with the goal of building a multi-disciplinary intelligent early warning system. Future research should introduce feature importance assessment methods (45) to clarify the contribution of external variables such as meteorology and population mobility to the prediction results, and should focus on constructing a multi-disciplinary integrated intelligent prediction system. Adaptive hybrid models (such as SARIMA-Transformer-XGBoost) could be developed to improve the response to public health emergencies. At the same time, a spatio-temporal dynamic early warning platform could be established to achieve county-level risk classification and optimal resource allocation (46). Relevant institutions should strengthen medical data sharing mechanisms and privacy protection standards and promote the inclusion of prediction models in local infectious disease prevention and control guidelines.
Ultimately, a closed-loop management system encompassing “data-driven—model prediction—decision support” should be established to provide intelligent solutions for the scientific prevention and control of climate-sensitive infectious diseases, such as acute hemorrhagic conjunctivitis.

5 Limitations

Although the SARIMA-KNN hybrid model proposed in this study demonstrated high accuracy in predicting AHC, several limitations remain. First, the relatively short time span of the data restricted the model's ability to capture longer-term epidemic trends or multi-year cyclical patterns. Second, the model's generalization ability may be limited by regional characteristics, and its predictive performance may vary in other provinces, cities, or regions. Third, the reliability of long-term predictions is constrained by the assumption that environmental and social factors remain stable. If extreme climate events or policy adjustments occur in the future, the predictions may deviate. These limitations suggest that it is necessary to continue accumulating longer time series of monitoring data and to integrate multi-source data to optimize the model structure, thereby enhancing the applicability of the predictions. Future research should also collect AHC incidence data from other geographical areas and validate this model externally to comprehensively evaluate its robustness and applicability across regions.

6 Conclusion

This study pioneers the introduction and validation of a SARIMA-KNN hybrid model, employing a residual correction strategy for the prediction of AHC in Chongqing. Comparative analyses with SARIMA, KNN, Prophet models, and the SARIMA-Prophet hybrid model demonstrate the superior performance of the SARIMA-KNN hybrid model across multiple error metrics (MSE, MAE, RMSE, and MAPE), achieving significant reductions in prediction error. The research not only identifies the seasonal characteristic that the period from June to September constitutes the peak incidence of AHC but also offers a practical tool for public health decision-making, thereby enhancing the accuracy and real-time performance of existing infectious disease early warning and monitoring platforms. Future research directions may focus on enhancing the model performance through integrating multi-source data and optimizing the deep learning architecture. Furthermore, it is recommended to connect the model output with the regional prevention and control resource scheduling system to achieve precise early intervention for infectious diseases.

Data availability statement

The original contributions presented in the study are included in the article/supplementary material; further inquiries can be directed to the corresponding author.

Ethics statement

Ethical approval was not required for the study involving humans in accordance with the local legislation and institutional requirements. Written informed consent to participate in this study was not required from the participants or the participants' legal guardians/next of kin in accordance with the national legislation and the institutional requirements.

Author contributions

XX: Resources, Writing – review & editing, Supervision, Project administration. SW: Writing – original draft, Data curation, Writing – review & editing. YF: Formal analysis, Writing – review & editing, Conceptualization. XF: Validation, Writing – review & editing, Visualization. YC: Methodology, Writing – review & editing. LZ: Formal analysis, Writing – review & editing. ZH: Writing – review & editing, Formal analysis. YC: Visualization, Writing – review & editing.

Funding

The author(s) declare that financial support was received for the research and/or publication of this article. This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.

Acknowledgments

We acknowledge the Chongqing Municipal Health Commission for providing the epidemiological data of acute hemorrhagic conjunctivitis. We would also like to express our sincere thanks to the peer experts who participated in the discussions and provided valuable suggestions for this research project.

Conflict of interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Generative AI statement

The author(s) declare that no Gen AI was used in the creation of this manuscript.

Any alternative text (alt text) provided alongside figures in this article has been generated by Frontiers with the support of artificial intelligence and reasonable efforts have been made to ensure accuracy, including review by the authors wherever possible. If you identify any issues, please contact us.

Publisher's note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

References

1. Thakur A, Sharma D, Singh MP, Chauhan P, Shah A, Angra V, et al. Clinical and molecular investigation of acute haemorrhagic conjunctivitis outbreak in North India (2023). Int Ophthalmol. (2024) 44:444. doi: 10.1007/s10792-024-03368-3

2. Qiu HF, Zeng DW, Yi J, Zhu H, Hu L, Jing D, et al. Forecasting the incidence of acute haemorrhagic conjunctivitis in Chongqing: a time series analysis. Epidemiol Infect. (2020) 148:e193. doi: 10.1017/S095026882000182X

3. Zhang SY, Hu QQ, Deng ZH, Hu SX, Liu FQ, Yu SS, et al. Transmissibility of acute haemorrhagic conjunctivitis in small-scale outbreaks in Hunan province, China. Sci Rep. (2020) 10:119. doi: 10.1038/s41598-019-56850-9

4. Chavan NA, Rani VS, Shinde P, Shinde M, Pavani S, Srinath M, et al. Identification of coxsackievirus a-24 GIV c5 strain as the cause of acute hemorrhagic conjunctivitis outbreak in Hyderabad, India in 2022. Heliyon. (2024) 10:e32254. doi: 10.1016/j.heliyon.2024.e32254

5. Jing D, Zhao H, Ou R, Zhu H, Hu L, Giri M, et al. Epidemiological characteristics and spatiotemporal analysis of acute hemorrhagic conjunctivitis from 2004 to 2018 in Chongqing, China. Sci Rep. (2020) 10:9286. doi: 10.1038/s41598-020-66467-y

6. Zhang L, Zhao N, Huang XD, Jin XM, Geng XY, Chan TC, et al. Molecular epidemiology of acute hemorrhagic conjunctivitis caused by coxsackie a type 24 variant in China, 2004–2014. Sci Rep. (2017) 7:45202. doi: 10.1038/srep45202

7. Sun W, Chen Y, Qin S, Miao Z. Epidemiology and spatiotemporal analysis of acute hemorrhagic conjunctivitis in Zhejiang province, China (2004–2023). Front Public Health. (2025) 13:1509495. doi: 10.3389/fpubh.2025.1509495

8. Zhang L, Jiang H, Wang K, Yuan Y, Fu Q, Jin X, et al. Long-term effects of weather condition and air pollution on acute hemorrhagic conjunctivitis in china: a national wide surveillance study in China. Environ Res. (2021) 201:111616. doi: 10.1016/j.envres.2021.111616

9. Kono R, Miyamura K, Yamazaki S, Sasagawa A, Kurahashi H, Tajiri E, et al. Seroepidemiologic studies of acute hemorrhagic conjunctivitis virus (enterovirus type 70) in west Africa II Studies with human sera collected in west african countries other than Ghana. Am J Epidemiol. (1981) 114:274–83. doi: 10.1093/oxfordjournals.aje.a113192

10. Reeves WC, Brenes MM, Quiroz E, Palacios J, Campos G, Centeno R. Acute hemorrhagic conjunctivitis epidemic in colon, republic of panama. Am J Epidemiol. (1986) 123:325–35. doi: 10.1093/oxfordjournals.aje.a114241

11. Qin M, Cheng L, Li Y, Tang X, Gan Y, Zhao J, et al. Disease burden contributed by dietary exposure to aflatoxins in a mountainous city in southwest china. Front Microbiol. (2023) 14:1215428. doi: 10.3389/fmicb.2023.1215428

12. Xu GC, Fan T, Zhao YZ, Wu WD, Wang YB. Predicting the epidemiological trend of acute hemorrhagic conjunctivitis in china using bayesian structural time-series model. Sci Rep. (2024) 14:17364. doi: 10.1038/s41598-024-68624-z

13. Davis SE, Walsh CG, Matheny ME. Open questions and research gaps for monitoring and updating AI-enabled tools in clinical settings. Front Digit Health. (2022) 4:958284. doi: 10.3389/fdgth.2022.958284

14. Zhang T, Li J. Understanding and predicting the spatio-temporal spread of covid-19 via integrating diffusive graph embedding and compartmental models. Trans Gis. (2021) 25:3025–47. doi: 10.1111/tgis.12803

15. Kuan MM. Applying SARIMA, ETS, and hybrid models for prediction of tuberculosis incidence rate in Taiwan. PeerJ. (2022) 10:e13117. doi: 10.7717/peerj.13117

16. Fang K, Cao L, Fu ZW, Li WX. Prediction of reported monthly incidence of hepatitis b in Hainan province of China based on SARIMA-BPNN model. Medicine. (2023) 102:e35054. doi: 10.1097/MD.0000000000035054

17. Yu GC, Feng HF, Feng S, Zhao J, Xu J. Forecasting hand-foot-and-mouth disease cases using wavelet-based SARIMA-NNAR hybrid model. PLoS ONE. (2021) 16:e0246673. doi: 10.1371/journal.pone.0246673

18. Chaturvedi S, Rajasekar E, Natarajan S, McCullen N. A comparative assessment of SARIMA, LSTM RNN and FB prophet models to forecast total and peak monthly energy demand for India. Energy Policy. (2022) 168:113097. doi: 10.1016/j.enpol.2022.113097

19. Zhao P. Nearest Neighbor Methods with Applications in Functional Estimation and Machine Learning. Ann Arbor, MI: ProQuest Dissertations & Theses (2021).

20. Cakir M, Yilmaz M, Oral MA, Kazanci HO, Oral O. Accuracy assessment of rferns, NB, SVM, and KNN machine learning classifiers in aquaculture. J King Saud Univ Sci. (2023) 35:102754. doi: 10.1016/j.jksus.2023.102754

21. Zhang L, Chen Q, Xiong S, Zhu S, Tian J, Li J, et al. Mushroom poisoning outbreaks in Guizhou province, China: a prediction study using SARIMA and prophet models. Sci Rep. (2023) 13:22517. doi: 10.1038/s41598-023-49095-0

22. Luo Z, Jia X, Bao J, Song Z, Zhu H, Liu M, et al. A combined model of sarima and prophet models in forecasting aids incidence in Henan province, China. Int J Environ Res Public Health. (2022) 19:5910. doi: 10.3390/ijerph19105910

23. Murthy KVN, Saravana R, Kumar KV. Modeling and forecasting rainfall patterns of southwest monsoons in North-East India as a SARIMA process. Meteorol Atmos Phys. (2018) 130:99–106. doi: 10.1007/s00703-017-0504-2

24. Huang M, Lin R, Huang S, Xing T. A novel approach for precipitation forecast via improved k-nearest neighbor algorithm. Adv Eng Inform. (2017) 33:89–95. doi: 10.1016/j.aei.2017.05.003

25. Mahmud TS, Ng K, Hasan MM, An CJ, Wan SY. A cross-jurisdictional comparison on residential waste collection rates during earlier waves of covid-19. Sustain Cities Soc. (2023) 96:104685. doi: 10.1016/j.scs.2023.104685

26. Mao Q, Zhang K, Yan W, Cheng C. Forecasting the incidence of tuberculosis in china using the seasonal auto-regressive integrated moving average (SARIMA) model. J Infect Public Health. (2018) 11:707–12. doi: 10.1016/j.jiph.2018.04.009

27. ArunKumar KE, Kalaga DV, Kumar C, Chilkoor G, Kawaji M, Brenza TM. Forecasting the dynamics of cumulative covid-19 cases (confirmed, recovered and deaths) for top-16 countries using statistical machine learning models: auto-regressive integrated moving average (ARIMA) and seasonal auto-regressive integrated moving average (SARIMA). Appl Soft Comput. (2021) 103:107161. doi: 10.1016/j.asoc.2021.107161

28. Lanjewar MG, Parab JS, Shaikh AY. Development of framework by combining cnn with knn to detect alzheimer's disease using MRI images. Multimed Tools Appl. (2023) 82:12699–717. doi: 10.1007/s11042-022-13935-4

29. Mishra S, Das H, Mohapatra SK, Khan SB, Alojail M, Saraee M. A hybrid fused-knn based intelligent model to access melanoma disease risk using indoor positioning system. Sci Rep. (2025) 15:7438. doi: 10.1038/s41598-024-74847-x

30. Vieira J, Duarte RP, Neto HC. KNN-stuff: KNN streaming unit for FPGAs. IEEE Access. (2019) 7:170864–77. doi: 10.1109/ACCESS.2019.2955864

31. Wang Z, Zhang J, Zhang W, Lu N, Chen Q, Wang J, et al. Development and comparison of time series models in predicting severe fever with thrombocytopenia syndrome cases - Hubei province, China, 2013–2020. China CDC Wkly. (2024) 6:962–67. doi: 10.46234/ccdcw2024.200

32. Xian XB, Wang L, Wu XH, Tang XQ, Zhai XP, Yu R, et al. Comparison of SARIMA model, holt-winters model and ets model in predicting the incidence of foodborne disease. BMC Infect Dis. (2023) 23:803. doi: 10.1186/s12879-023-08799-4

33. Iftikhar H, Daniyal M, Qureshi M, Tawiah K, Ansah RK, Afriyie JK. A hybrid forecasting technique for infection and death from the mpox virus. Digit Health. (2024) 10:20552076231204748. doi: 10.1177/20552076231204748

34. Olukanmi SO, Nelwamondo FV, Nwulu NI. Utilizing google search data with deep learning, machine learning and time series modeling to forecast influenza-like illnesses in South Africa. IEEE Access. (2021) 9:126822–36. doi: 10.1109/ACCESS.2021.3110972

35. Zhao ZY, Zhai MM, Li GH, Gao XF, Song WZ, Wang XC, et al. Study on the prediction effect of a combined model of SARIMA and LSTM based on SSA for influenza in Shanxi province, China. BMC Infect Dis. (2023) 23:71. doi: 10.1186/s12879-023-08025-1

36. Shi BB, Liu JD. Nonlinear metric learning for knn and svms through geometric transformations. Neurocomputing. (2018) 318:18–29. doi: 10.1016/j.neucom.2018.07.074

37. Halder RK, Uddin MN, Uddin MA, Aryal S, Khraisat A. Enhancing k-nearest neighbor algorithm: a comprehensive review and performance analysis of modifications. J Big Data. (2024) 11:113. doi: 10.1186/s40537-024-00973-y

38. Lv SX, Peng L, Hu HL, Wang L. Effective machine learning model combination based on selective ensemble strategy for time series forecasting. Inf Sci. (2022) 612:994–1023. doi: 10.1016/j.ins.2022.09.002

39. Wang YB, Xu CJ, Li YC, Wu WD, Gui LH, Ren JC, et al. An advanced data-driven hybrid model of SARIMA-NNNAR for tuberculosis incidence time series forecasting in Qinghai province, China. Infect Drug Resist. (2020) 13:867–80. doi: 10.2147/IDR.S232854

40. Wang ZD, Yang CX, Zhang SK, Wang YB, Xu Z, Feng ZJ. Analysis and forecasting of syphilis trends in mainland china based on hybrid time series models. Epidemiol Infect. (2024) 152:e93. doi: 10.1017/S0950268824000694

41. Man H, Huang H, Qin Z, Li Z. Analysis of a SARIMA-XGBoost model for hand, foot, and mouth disease in Xinjiang, China. Epidemiol Infect. (2023) 151:e200. doi: 10.1017/S0950268823001905

42. Li Z, Meng F, Wu B, Kong D, Geng M, Qiu X, et al. Reviewing the progress of infectious disease early warning systems and planning for the future. BMC Public Health. (2024) 24:3080. doi: 10.1186/s12889-024-20537-2

43. Qureshi M, Ishaq K, Daniyal M, Iftikhar H, Rehman MZ, Salar SAA. Forecasting cardiovascular disease mortality using artificial neural networks in Sindh, Pakistan. BMC Public Health. (2025) 25:34. doi: 10.1186/s12889-024-21187-0

44. Liao Y, Qi W, Li S, Shi X, Wu X, Chi F, et al. Analysis of onset-to-door time and its influencing factors in Chinese patients with acute ischemic stroke during the 2020 covid-19 epidemic: a preliminary, prospective, multicenter study. BMC Health Serv Res. (2024) 24:615. doi: 10.1186/s12913-024-11088-8

45. Zhang N, Wang Y, Zhang H, Fang H, Li X, Li Z, et al. Application of interpretable machine learning algorithms to predict macroangiopathy risk in Chinese patients with type 2 diabetes mellitus. Sci Rep. (2025) 15:16393. doi: 10.1038/s41598-025-01161-5

46. Xue Q, Xu DR, Cheng TC, Pan J, Yip W. The relationship between hospital ownership, in-hospital mortality, and medical expenses: an analysis of three common conditions in china. Arch Public Health. (2023) 81:19. doi: 10.1186/s13690-023-01029-y

Keywords: acute hemorrhagic conjunctivitis, SARIMA, prophet, SARIMA-KNN, SARIMA-Prophet

Citation: Xian X, Wu S, Fu Y, Fan X, Cheng Y, Zeng L, Hou Z and Chen Y (2025) Incidence of acute hemorrhagic conjunctivitis in Chongqing: a forecasting study based on mathematical models. Front. Public Health 13:1644729. doi: 10.3389/fpubh.2025.1644729

Received: 10 June 2025; Accepted: 22 September 2025;
Published: 10 October 2025.

Edited by:

Sarafa Iyaniwura, Fred Hutchinson Cancer Center, United States

Reviewed by:

Hasnain Iftikhar, Quaid-i-Azam University, Pakistan
Hugo Vega-Huerta, National University of San Marcos, Peru

Copyright © 2025 Xian, Wu, Fu, Fan, Cheng, Zeng, Hou and Chen. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Yinzhi Chen, yiozhi@foxmail.com
