- 1Department of Statistics, University of Botswana, Gaborone, Botswana
- 2Department of Mathematical and Computational Sciences, University of Venda, Thohoyandou, South Africa
- 3Department of Statistics and Population Studies, University of the Western Cape, Cape Town, South Africa
Introduction: The COVID-19 pandemic posed significant challenges for public health systems, especially in Africa, where data scarcity, inadequate healthcare infrastructure, and regional disparities hindered effective forecasting and response efforts. Conventional forecasting methods have faced challenges in adequately addressing the complexity and detail necessary for effective policy interventions at various administrative levels. This study examines the challenge of producing accurate and coherent forecasts of COVID-19 cases within the hierarchical structure of Africa, which includes the continental, regional, and national levels.
Methods: We develop a comprehensive forecasting model that uses hierarchical time series forecasting through a bottom-up reconciliation approach augmented by machine learning algorithms. We employ extreme gradient boosting (XGBoost) and random forest models, and then improve predictive accuracy via a weighted average ensemble method. We produce forecasts at the national level and then aggregate them to ensure consistency across all hierarchical levels. The models are evaluated against conventional methods such as ARIMA and exponential smoothing.
Results: Empirical findings indicate that XGBoost is the best of the single forecasting models used in this study. Combining forecasts from XGBoost with those from the random forest, and assigning more weight to XGBoost, surpasses all other models in terms of mean absolute error, root mean square error, and mean absolute scaled error. Results further revealed that Southern Africa, despite its low population density, reported the highest number of cases, indicating underlying health vulnerabilities and socioeconomic factors. In summary, the bottom-up hierarchical time series forecasting (HTSF) method, when combined with machine learning, serves as an effective tool for forecasting in environments with limited data availability.
Discussion: It is advisable to apply similar models to other infectious diseases and to expand their use to guide health interventions, resource allocation, and early warning systems in future pandemics.
1 Introduction
The uneven impact of COVID-19 observed throughout the African continent requires careful examination. A combination of underlying socio-economic shortcomings and public health issues has significantly affected the outcomes of the pandemic. In particular, widespread poverty, limited availability of healthcare services and high rates of infectious diseases such as HIV/AIDS and tuberculosis have likely exacerbated the effects of COVID-19. For example, [1] report that tuberculosis co-infection is associated with increased COVID-19 risk and severity.
Decision-making at the operational, tactical, and strategic levels is fundamental for any organisation, and the forecasts that inform such decisions possess distinct characteristics. Strategic decisions necessitate long-term forecasts at an aggregate level, whereas decisions at the dynamic operational level demand short-term and detailed forecasts [2]. These differences in time granularity influence how the forecasts are generated: analysts typically produce long-term strategic forecasts by analysing high-level unstructured information from the business environment, whereas, as [3] indicated, short-term operational forecasts are produced from structured but constrained information sources, such as historical COVID-19 data, and rely primarily on statistical methods. Moreover, within the same decision-making level, the forecasting literature indicates that varying forecast horizons necessitate distinct methods. Because these forecasts are generated through different approaches and rely on diverse information sets, they can disagree [4], and this disagreement results in misaligned decisions. For example, daily coronavirus forecasts that inform inventory and health decisions may point to a stable market, whereas strategic-level forecasts may indicate a thriving market, which carries profound implications for investment strategies and budgetary decisions [5].
For time series forecasting, machine learning (ML) often outperforms conventional techniques such as ARIMA because it can handle large, complex, non-linear datasets, incorporate multiple external factors (features), and adapt continuously to new data. This can yield higher accuracy in volatile environments, although ML models are frequently more complex and less interpretable than simpler statistical models. ML models such as LSTMs and XGBoost capture complex patterns, which makes them well suited to dynamic forecasting, whereas traditional models work well for simpler, univariate data but struggle with real-world complexity and very large datasets. Chenrui et al. [6] explore how ML technologies addressed critical challenges across healthcare and public policy, arguing that ML enables COVID-19 forecasting, diagnosis, and decision support for response and control. In a comprehensive review, [7] analyse how ML has been utilised to combat the COVID-19 pandemic. Bachan et al. [8] summarise how artificial intelligence (AI), ML, and deep learning expanded the traditional dynamics of medical science during the pandemic.
Hierarchical time series forecasting (HTSF) is essential for better planning, resource allocation, and decision-making because it guarantees coherent (consistent) forecasts across various levels (such as total sales vs. regional sales) in a business structure. It also improves accuracy by utilizing data from both aggregate and granular levels, lowering variance, and effectively managing complex, interconnected data. It resolves the issue of individual estimates not adding up, which leads to inconsistent operations and budgeting, particularly in supply chains, energy, or retail.
HTSF involves several challenges. First, the series at various hierarchical levels, reflecting different degrees of aggregation, can display significantly distinct characteristics, as [9] noted. According to [10], supply chain data exhibit intermittency, high volatility, and skewness, whereas aggregated series tend to evolve more smoothly, exhibit reduced skewness, and display more pronounced seasonality. As [2] observed, these differences influence the selection of forecasting methods applicable at various hierarchical levels. One option is to forecast aggregates by summing the forecasts of the corresponding series at lower levels, although this bottom-up approach can yield suboptimal outcomes. This leads to a second major challenge: the forecast of each aggregated series must equal the sum of the forecasts of the corresponding disaggregated series, so that decisions are coherent across hierarchical levels. The likelihood of satisfying this aggregation constraint diminishes when forecasts for each series in the hierarchy are generated independently. Third, in many cases, such as predicting daily COVID-19 cases, the hierarchy can have thousands or even millions of time series at the lowest level, so the computing power required also constrains the choice of forecasting methods.
However, one remarkable aspect of hierarchical forecasting research is the absence of probabilistic prediction, and this is a major drawback, as, in many different contexts, optimum decision-making calls for an evaluation of forecast uncertainty. Gabalawy et al. [11] suggested that to effectively manage forecasts, we need probabilistic forecasts for the total system load in areas like power supply planning, setting operational reserves, price forecasting, and electricity market trading. Although high-level administrative entities, like states or nations, or heavily inhabited regions, frequently have enough data in terms of quantity and quality, this is less prevalent at lower scales and local levels, such as counties. Clustering smaller geographic areas according to characteristics relevant to the COVID-19 spread could help solve this problem by producing clusters with stronger data for subsequent modelling. Using important demographic, mobility, meteorological, medical capacity, and health-related county-level characteristics, [12] suggest a modelling approach that groups countries with comparable epidemic patterns to examine the spread of COVID-19. Building on their work, we provide a hierarchical time series forecasting (HTSF) system meant to forecast COVID-19 cases using these county-level hierarchies while tackling the important issues described before. We think that our proposed HTSF structure, which uses grouped county-level data, can predict COVID-19 cases as accurately as or better than conventional methods for national, cluster, and county-level illness predictions. Our method is especially useful for areas with little or no data.
Machine learning (ML) models (such as Random Forest and XGBoost) excel in predictive power with large, complex, noisy data and handle high dimensions well, but they frequently sacrifice interpretability and require more tuning and data. Statistical models (such as ARIMA and ETS) offer interpretability, simplicity for smaller or linear data, and a focus on inference, but struggle with the complexity and scale of HTSF [13]. ML models, particularly deep learning, may capture complex non-linearities and relationships across hierarchical levels more effectively, frequently beating classical statistics in overall accuracy, and specialist hierarchical ML methods incorporate the hierarchical structure for better results. Table 1 gives a detailed comparison of traditional statistical methods with machine learning applications in HTSF.
We propose a reconciliation method for HTSF utilising county-level data by applying a bottom-up approach, as in the work of [14]. The HTSF method has been effectively utilised in diverse applications, including the prediction of future tourism demand at multiple levels [15], forecasting demand for accident and emergency departments [2], and various business forecasting scenarios [16]. HTSF integrates predictions from all aggregation levels to yield temporally reconciled, coherent, and robust forecasts. In hierarchical forecasting, producing “coherent” forecasts is essential: the aggregate forecasts must precisely equal the sum of the forecasts at the more detailed, disaggregated levels. This guarantees that the forecasts accurately represent the characteristics of the actual data [17]. Particularly for data-scarce areas at the national and regional levels, this work makes a significant contribution by being the first to utilise bottom-up hierarchical time series forecasting to predict COVID-19 cases in Africa. It presents a new combination of machine learning models, XGBoost and Random Forest, within the HTSF structure, and demonstrates their superiority over conventional statistical techniques such as the autoregressive integrated moving average (ARIMA) and exponential smoothing in managing complicated and nonlinear epidemic data. The study also proposes a weighted average ensemble (WAE) approach that exploits the complementary advantages of XGBoost and Random Forest. Particularly when weighted towards XGBoost, the ensemble method improves predictive performance, thereby offering a strong option for real-world public health forecasting. Furthermore, the research uses the bottom-up forecast reconciliation technique to guarantee forecast consistency across all levels of the hierarchy, from continental to regional to country-level projections.
Informed decision-making in public health depends on this consistent forecasting system, which also allows correct resource allocation. The model exposes important geographical differences, such as the high COVID-19 load in Southern Africa despite its comparatively low population density, and thus stresses the importance of focused interventions. The suggested framework is a useful tool for future epidemic planning and response throughout the African continent, as it is not only scalable and flexible for different infectious illnesses but also provides a realistic forecasting solution in low-data settings.
1.1 Research highlights
The highlights of this research are:
• A comparison of traditional hierarchical time series forecasting approaches (ARIMA and Exponential Smoothing) with machine learning approaches (XGBoost and Random Forest).
• Development of a Weighted Average Ensemble model that combines forecasts from the XGBoost and the Random Forest.
• Among the traditional methods, the ARIMA model performed better than the Exponential Smoothing model.
• Machine learning models outperformed traditional statistical methods. Combining machine learning approaches increases the performance of the models in forecasting daily COVID-19 cases.
• Despite having the lowest population density, the Southern African region had the highest COVID-19 prevalence among all the other regions in Africa.
The rest of the paper is organised as follows: Section 2 presents the models, and empirical results are discussed in Section 3. A detailed discussion of the results is presented in Section 4, and Section 5 concludes.
2 Methodology
2.1 Data
The data used in this study are daily confirmed COVID-19 cases for the period 2020-02-16 to 2023-02-18 in the five regions of Africa, which are Northern Africa (NA), Eastern Africa (EA), Southern Africa (SA), Central Africa (CA) and Western Africa (WA). The population sizes as of 2024 of each of these five regions are as follows: NA (272 131 339), EA (500 703 846), SA (73 138 701), CA (212 915 636) and WA (456 251 329), both males and females from all age groups. Figure 1 shows the five regions of Africa.
Table 2 provides a classification of African nations divided into five geographical areas: NA, EA, SA, CA and WA. All the regions are further subdivided into their constituent countries, which are numbered for easy identification.
2.2 Hierarchical structure
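A hierarchical time series comprises several time series organised in a hierarchical framework with aggregation constraints that must be met. Let $\mathbf{y}_t$ be the vector of all observations in the hierarchy at time $t$, and let $\mathbf{b}_t$ include the most disaggregated (bottom-level) series’ observations. The connection between the whole hierarchy and the base elements can be expressed as Equation 1 below:
$$\mathbf{y}_t = \mathbf{S}\,\mathbf{b}_t, \qquad t = 1, \dots, T, \tag{1}$$
where $T$ is the number of time periods, and $\mathbf{S}$ is a summing matrix of hierarchical aggregation rules. For the African hierarchy, the aggregation is given by:
$$\mathbf{y}_t = \begin{bmatrix} \mathbf{1}_{54} \\ \mathbf{R} \\ \mathbf{I}_{54} \end{bmatrix} \mathbf{b}_t, \qquad \mathbf{R} = \begin{bmatrix} \mathbf{1}_{n_1} & \mathbf{0} & \mathbf{0} & \mathbf{0} & \mathbf{0} \\ \mathbf{0} & \mathbf{1}_{n_2} & \mathbf{0} & \mathbf{0} & \mathbf{0} \\ \mathbf{0} & \mathbf{0} & \mathbf{1}_{n_3} & \mathbf{0} & \mathbf{0} \\ \mathbf{0} & \mathbf{0} & \mathbf{0} & \mathbf{1}_{n_4} & \mathbf{0} \\ \mathbf{0} & \mathbf{0} & \mathbf{0} & \mathbf{0} & \mathbf{1}_{n_5} \end{bmatrix}, \tag{2}$$
where $\mathbf{y}_t$ is the vector of all series (Africa, regions, and countries) at time $t$, $\mathbf{S}$ is the summing matrix of dimension $60 \times 54$ and $\mathbf{b}_t$ is the vector of bottom-level (country) observations at time $t$. Here $n_r$ denotes the number of countries in region $R_r$, so that $n_1 + \dots + n_5 = 54$.
In Equation 2, row 1 of the summing matrix sums all 54 countries (Africa total), rows 2–6 sum the countries in regions $R_1$ to $R_5$, and rows 7–60 form the identity matrix $\mathbf{I}_{54}$, which preserves the bottom-level observations; $\mathbf{1}$ denotes a row vector of ones and $\mathbf{0}$ a row vector of zeros.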
The hierarchical structure of the countries in the five regions of Africa is shown in Figure 2.
Figure 2. Hierarchical structure of the different countries in the five regions of Africa. The top level (Africa) is defined as Level 0. Level 1 is represented by the five regions $R_1$ (NA), …, $R_5$ (WA). The bottom level is Level 2, where each node denotes country $i$ in region $j$; for example, country 1 in region 2 is Kenya.
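The aggregation rule described in this section can be illustrated with a small sketch. The code below builds a summing matrix for a total → regions → bottom-level hierarchy and applies it to one vector of bottom-level values. The function names (`build_summing_matrix`, `aggregate`), the toy region sizes, and the numbers are assumptions made for illustration only; they are not taken from the study's data or code.

```python
def build_summing_matrix(region_sizes):
    """Return the summing matrix as a list of rows for a
    total -> regions -> bottom-level hierarchy."""
    m = sum(region_sizes)                 # number of bottom-level series
    rows = [[1] * m]                      # first row: overall total
    start = 0
    for size in region_sizes:             # one row per region
        row = [0] * m
        for j in range(start, start + size):
            row[j] = 1
        rows.append(row)
        start += size
    for j in range(m):                    # identity block keeps the bottom level
        rows.append([1 if k == j else 0 for k in range(m)])
    return rows


def aggregate(S, b):
    """Compute y = S b for a single time point."""
    return [sum(s * x for s, x in zip(row, b)) for row in S]


# Toy hierarchy (hypothetical): two regions with 2 and 3 bottom-level series.
S = build_summing_matrix([2, 3])
b_t = [10, 20, 1, 2, 3]                   # made-up bottom-level counts
y_t = aggregate(S, b_t)
print(len(S), len(S[0]))                  # 8 5
print(y_t)                                # [36, 30, 6, 10, 20, 1, 2, 3]
```

With five regions whose sizes sum to 54, the same function returns a matrix with 60 rows and 54 columns, matching the row layout described for the African hierarchy.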
2.3 Stage 1: machine learning models
This section presents the machine learning reconciliation approach that leverages the potential of decision tree models. The approach is designed to address the limitations of conventional hierarchical forecast methods, such as the ARIMA and exponential smoothing models, as discussed in Section 1. Given the non-linear nature of COVID-19 daily case data, we use the XGBoost and Random Forest (RF) models. A weighted average ensemble of XGBoost and RF is then compared with the individual models.
2.3.1 Extreme gradient boosting (XGBoost) method
Extreme Gradient Boosting (XGBoost) is a specific implementation of gradient boosting [18] that incorporates randomisation and regularisation techniques to reduce overfitting while increasing training speed. Moreover, it computes second-order gradients of the loss function, which provide more information about the gradient’s direction, making it easier to minimise the loss.
In an XGBoost framework, weak trees are successively appended to the ensemble with different weights. Each new tree must approximate the residuals from the previous prediction as closely as possible, as expressed in Equation 3.
$$\hat{y}_i = \sum_{k=1}^{K} f_k(x_i), \qquad f_k \in \mathcal{F}, \tag{3}$$
where $\hat{y}_i$ is the predicted value, $\mathcal{F}$ is the set including all regression trees, $f_k$ is one of the regression trees, and $K$ is the number of regression trees. The expectation is that the predicted value, $\hat{y}_i$, is as close as possible to the actual value $y_i$ without losing its generalisation ability. The objective function Obj is given in Equation 4.
$$\mathrm{Obj} = \sum_{i=1}^{n} l(y_i, \hat{y}_i) + \sum_{k=1}^{K} \Omega(f_k), \tag{4}$$
where $l$ is the loss function, measuring the discrepancy between the predicted and the true value. The loss function is assumed to be twice differentiable, since XGBoost uses its second-order derivatives. $\Omega$ is the regularisation term, which defines the complexity of the model. The regularisation term is defined as given in Equation 5.
$$\Omega(f) = \gamma T + \frac{1}{2}\lambda \lVert w \rVert^{2}, \tag{5}$$
where $T$ is the number of leaf nodes, $w$ is the vector of scores assigned to the leaf nodes, and $\gamma$ and $\lambda$ are penalty coefficients. The smaller the value of $\Omega$, the lower the complexity and the stronger the generalisation ability.
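To illustrate why the second-order information mentioned above helps, the following is a sketch of the standard second-order approximation of the regularised boosting objective; it is a textbook derivation, not a result specific to this study. The symbols are introduced here for illustration: $f_t$ is the tree added at boosting round $t$, $g_i$ and $h_i$ are the first and second derivatives of the loss with respect to the previous prediction $\hat{y}_i^{(t-1)}$, $I_j$ is the set of observations falling in leaf $j$, $G_j = \sum_{i \in I_j} g_i$, $H_j = \sum_{i \in I_j} h_i$, and the regulariser is taken as $\Omega(f) = \gamma T + \tfrac{1}{2}\lambda \sum_{j=1}^{T} w_j^2$ with $T$ leaves and leaf scores $w_j$. Terms that do not depend on $f_t$ are dropped.

```latex
\begin{aligned}
\mathrm{Obj}^{(t)} &\approx \sum_{i=1}^{n}\Big[\, g_i\, f_t(x_i) + \tfrac{1}{2}\, h_i\, f_t(x_i)^2 \Big]
  + \gamma T + \tfrac{1}{2}\lambda \sum_{j=1}^{T} w_j^2, \\
g_i &= \frac{\partial\, l\big(y_i, \hat{y}_i^{(t-1)}\big)}{\partial\, \hat{y}_i^{(t-1)}}, \qquad
h_i = \frac{\partial^2 l\big(y_i, \hat{y}_i^{(t-1)}\big)}{\partial \big(\hat{y}_i^{(t-1)}\big)^2}, \\
w_j^{*} &= -\frac{G_j}{H_j + \lambda}, \qquad
\mathrm{Obj}^{*} = -\frac{1}{2}\sum_{j=1}^{T} \frac{G_j^{2}}{H_j + \lambda} + \gamma T .
\end{aligned}
```

For the squared-error loss $l = \tfrac{1}{2}(y_i - \hat{y}_i)^2$, these reduce to $g_i = \hat{y}_i^{(t-1)} - y_i$ and $h_i = 1$, so each leaf score is a shrunken average of the current residuals in that leaf.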
2.4 Random forests
Random Forests (RF) is an ensemble learning approach grounded on decision trees [19], which are statistical models widely utilised for regression and classification problems. The RF methodology, developed by Breiman [20], generates numerous decision trees during training and averages their predictions to improve collective performance [21].
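In an RF model, each decision tree is trained on a randomly selected subset of the training data (bagging) and a random subset of features (random subspace method) [22]. This approach plays two important roles: it reduces overfitting [23] and enables the model to learn various patterns in the data [24]. Our research focuses particularly on the regression capability of the RF model [25]. Let $\mathbf{X} \in \mathbb{R}^{p}$ be the random vector of input features and $Y \in \mathbb{R}$ be the real-valued response variable. The goal is to estimate the regression function $m(\mathbf{x})$, which may be defined as the conditional expectation given in Equation 6 below:
$$m(\mathbf{x}) = \mathbb{E}[\,Y \mid \mathbf{X} = \mathbf{x}\,]. \tag{6}$$
Given a training dataset $\mathcal{D}_n = \{(\mathbf{X}_1, Y_1), \dots, (\mathbf{X}_n, Y_n)\}$ comprising independent and identically distributed observations, we construct an estimator $m_n$ of $m$. This estimator is considered consistent if it satisfies the condition given in Equation 7 as the sample size grows [26]:
$$\mathbb{E}\big[m_n(\mathbf{X}) - m(\mathbf{X})\big]^{2} \to 0 \quad \text{as } n \to \infty. \tag{7}$$
The RF algorithm generates an ensemble of $M$ regression trees. For a given input $\mathbf{x}$, the prediction of the $j$-th tree can be expressed as Equation 8 according to [27]:
$$m_n(\mathbf{x}; \Theta_j, \mathcal{D}_n), \tag{8}$$
where $\Theta_1, \dots, \Theta_M$ are independent random variables that govern the tree construction process [28]. These variables determine the data subsampling and feature selection at each split. The prediction of the $j$-th tree takes the form presented in Equation 9 [29]:
$$m_n(\mathbf{x}; \Theta_j, \mathcal{D}_n) = \sum_{i \in \mathcal{D}_n^{*}(\Theta_j)} \frac{\mathbb{1}\{\mathbf{X}_i \in A_n(\mathbf{x}; \Theta_j, \mathcal{D}_n)\}\, Y_i}{N_n(\mathbf{x}; \Theta_j, \mathcal{D}_n)}, \tag{9}$$
where $\mathcal{D}_n^{*}(\Theta_j)$ denotes the bootstrap sample used for constructing the $j$-th tree, $A_n(\mathbf{x}; \Theta_j, \mathcal{D}_n)$ represents the terminal leaf node containing $\mathbf{x}$ and $N_n(\mathbf{x}; \Theta_j, \mathcal{D}_n)$ counts the number of training points in $A_n(\mathbf{x}; \Theta_j, \mathcal{D}_n)$.
The final RF prediction combines the outputs of all individual trees through averaging [30] as presented in Equation 10:
$$m_{M,n}(\mathbf{x}; \Theta_1, \dots, \Theta_M, \mathcal{D}_n) = \frac{1}{M} \sum_{j=1}^{M} m_n(\mathbf{x}; \Theta_j, \mathcal{D}_n). \tag{10}$$
As the number of trees grows large, we can consider the theoretical infinite forest estimator [31], represented by Equation 11:
$$m_{\infty,n}(\mathbf{x}; \mathcal{D}_n) = \mathbb{E}_{\Theta}\big[m_n(\mathbf{x}; \Theta, \mathcal{D}_n)\big], \tag{11}$$
where the expectation is taken with respect to the randomness in tree construction, conditional on the training data. The law of large numbers guarantees that as $M \to \infty$, the finite forest prediction converges to this infinite ensemble [32], given by Equation 12:
$$\lim_{M \to \infty} m_{M,n}(\mathbf{x}; \Theta_1, \dots, \Theta_M, \mathcal{D}_n) = m_{\infty,n}(\mathbf{x}; \mathcal{D}_n) \quad \text{almost surely}. \tag{12}$$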
2.4.1 A weighted average ensemble of XGBoost and RF
Ensemble learning combines several machine learning models to achieve better predictive performance than any single model. This study considers the weighted-average ensemble of RF and XGBoost and describes its benefits. The RF and XGBoost ensemble leverages the complementary strengths of bagging and boosting and should yield more accurate and stable predictions than either model alone. A summary of the comparison of the XGBoost model versus the RF model is presented in Table 3.
The XGBoost model excels at optimising complex patterns but can overfit noisy data, while RF provides stability by averaging multiple decision trees. Combining these two models balances the bias-variance trade-off, improving generalisation.
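A weighted average ensemble of two forecast vectors can be sketched as follows. The text does not give the weighting formula or the weights used, so the convex combination below, the helper names (`weighted_average_ensemble`, `pick_weight`), the candidate weight grid, and all numbers are assumptions for illustration. In practice the weight would be chosen on held-out validation data rather than on the final test set.

```python
def weighted_average_ensemble(xgb_forecast, rf_forecast, w_xgb):
    """Combine two forecast sequences; w_xgb is the XGBoost weight in [0, 1]."""
    if not 0.0 <= w_xgb <= 1.0:
        raise ValueError("w_xgb must lie in [0, 1]")
    if len(xgb_forecast) != len(rf_forecast):
        raise ValueError("forecasts must have the same length")
    return [w_xgb * a + (1.0 - w_xgb) * b
            for a, b in zip(xgb_forecast, rf_forecast)]


def mae(actual, predicted):
    """Mean absolute error between two equal-length sequences."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def pick_weight(actual, xgb_forecast, rf_forecast, candidates):
    """Return the candidate weight with the lowest MAE on held-out data."""
    return min(
        candidates,
        key=lambda w: mae(
            actual, weighted_average_ensemble(xgb_forecast, rf_forecast, w)
        ),
    )


# Made-up numbers purely for illustration.
actual = [100.0, 120.0, 90.0]
xgb_fc = [102.0, 118.0, 93.0]
rf_fc = [90.0, 130.0, 80.0]
best_w = pick_weight(actual, xgb_fc, rf_fc, [0.5, 0.6, 0.7, 0.8, 0.9])
combined = weighted_average_ensemble(xgb_fc, rf_fc, best_w)
```

Setting `w_xgb = 0.5` gives the equal-weight ensemble, while values above 0.5 weight the combination towards XGBoost.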
2.4.2 Forecast reconciliation
Hierarchical time series forecasting involves reconciling base forecasts to meet aggregation constraints. Several forecast reconciliation methods are proposed in the literature. Popular techniques include bottom-up (BU), top-down (TD), middle-out (MO), and optimal combination (OC). Bottom-up sums lower-level forecasts to obtain the aggregates, whereas top-down disaggregates the total forecast to the lower levels. Middle-out combines the two: forecasts are produced at an intermediate level, then summed upwards and disaggregated downwards. Optimal combination methods, such as ordinary least squares, minimise reconciliation error across the hierarchy, improving accuracy and theoretical consistency. Several other methods are discussed in the literature.
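In this study, we use the BU forecast reconciliation method. Bottom-up methods of probabilistic hierarchical forecasting were introduced by Taylor et al. [33]. The BU approach is one of the simplest methods for generating coherent forecasts. It involves generating forecasts for each series at the bottom level and then summing these to produce forecasts for all the series in the structure. The BU forecasts are given by:
$$\tilde{\mathbf{y}}_h = \mathbf{S}\,\mathbf{G}\,\hat{\mathbf{y}}_h,$$
where $\hat{\mathbf{y}}_h$ is the vector of $h$-step-ahead base forecasts for all series in the hierarchy and $\tilde{\mathbf{y}}_h$ is the vector of reconciled forecasts. The BU forecasts are obtained using $\mathbf{G} = [\,\mathbf{0} \mid \mathbf{I}\,]$, where $\mathbf{0}$ is a null matrix and $\mathbf{I}$ is the identity matrix. The matrix $\mathbf{G}$ extracts only the bottom-level forecasts from $\hat{\mathbf{y}}_h$, and $\mathbf{S}$ adds them up to give the BU forecasts.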
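The bottom-up step can be sketched in a few lines: discard the base forecasts of every aggregate, keep the bottom-level forecasts, and rebuild each aggregate by summation. The function name `bottom_up`, the ordering of the forecast vector (total, then regions, then bottom level), and the numbers are assumptions chosen for this illustration.

```python
def bottom_up(base_forecasts, region_sizes):
    """Reconcile base forecasts ordered as [total, regions..., bottom level...].

    Only the bottom-level entries are used; every aggregate is rebuilt by
    summation, so the result satisfies the aggregation constraints exactly.
    """
    n_regions = len(region_sizes)
    m = sum(region_sizes)
    if len(base_forecasts) != 1 + n_regions + m:
        raise ValueError("unexpected number of base forecasts")
    bottom = list(base_forecasts[1 + n_regions:])   # extraction step
    regions, start = [], 0
    for size in region_sizes:                       # re-aggregation step
        regions.append(sum(bottom[start:start + size]))
        start += size
    return [sum(bottom)] + regions + bottom


# Hypothetical incoherent base forecasts: 2 regions with 2 and 3 bottom series.
base = [40, 28, 7, 10, 20, 1, 2, 3]
reconciled = bottom_up(base, [2, 3])
print(reconciled)   # [36, 30, 6, 10, 20, 1, 2, 3]
```

In this toy case the independently produced total (40) and regional forecasts (28 and 7) do not add up; after the bottom-up step the total equals the sum of the regions, and each region equals the sum of its bottom-level series.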
2.5 Performance metrics
The following evaluation metrics will be used in this study.
2.5.1 Mean absolute error
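The mean absolute error (MAE) measures the average absolute difference between predicted and actual values, providing a robust metric for model accuracy.
$$\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} \lvert y_i - \hat{y}_i \rvert,$$
where $y_i$ is the actual value, $\hat{y}_i$ is the predicted value and $n$ is the number of observations.
2.5.2 Root mean squared error
The root mean squared error (RMSE) penalises larger errors more severely than MAE, making it sensitive to outliers.
$$\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^{2}}.$$
2.5.3 Mean absolute scaled error
The mean absolute scaled error (MASE) compares the model’s MAE to the MAE of a naive (e.g., random walk) forecast. A value <1 indicates better performance than the baseline.
$$\mathrm{MASE} = \frac{\mathrm{MAE}}{\dfrac{1}{T-1} \sum_{t=2}^{T} \lvert y_t - y_{t-1} \rvert},$$
where the denominator is the MAE of the one-step naive forecast, conventionally computed on the training series of length $T$.
2.5.4 Bias
Bias measures the average difference between actual and predicted values, indicating systematic over- or under-prediction.
$$\mathrm{Bias} = \frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i).$$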
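The four metrics described in this section can be computed with a few short functions. This is a minimal sketch: the function names and the example numbers are assumptions, bias is taken as actual minus predicted (as described above), and the MASE scaling uses the one-step naive forecast on the training series, which is the usual convention but is not spelled out in the text.

```python
import math


def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean squared error."""
    return math.sqrt(
        sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    )


def bias(actual, predicted):
    """Mean of (actual - predicted); positive values mean under-prediction."""
    return sum(a - p for a, p in zip(actual, predicted)) / len(actual)


def mase(actual, predicted, training):
    """MAE scaled by the MAE of a one-step naive forecast on the training series."""
    naive_mae = sum(
        abs(training[t] - training[t - 1]) for t in range(1, len(training))
    ) / (len(training) - 1)
    return mae(actual, predicted) / naive_mae


# Made-up numbers purely for illustration.
training = [10, 12, 11, 15]
actual = [14, 16]
predicted = [13, 18]
scores = {
    "MAE": mae(actual, predicted),
    "RMSE": rmse(actual, predicted),
    "Bias": bias(actual, predicted),
    "MASE": mase(actual, predicted, training),
}
```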
3 Results and analysis
In this section, results from data analysis are presented and interpreted. The process starts with exploratory data analysis, which helps understand the data distributions and visualise the data patterns.
3.1 Exploratory data analysis
Table 4 shows that the Southern African region has the smallest population among African regions. In addition, this region has the lowest population density compared to all the other regions. However, the Southern African region recorded the highest daily COVID-19 case count on the African continent: more than 75% of the daily COVID-19 cases on the continent are from this region. This could be attributed to challenges related to poverty, access to healthcare, and environmental factors in the region. The Southern African region also has a high prevalence of communicable diseases, including HIV/AIDS and tuberculosis (TB), both of which result in immune deterioration, so people with such infections are at high risk of contracting COVID-19. A study by Tamuzi et al. [1] shows that TB was a risk factor for COVID-19, both in terms of severity and mortality. Although Western Africa has the highest population density, about 3 times that of Southern Africa, its number of daily cases is far lower than that of Southern Africa. One possible explanation is that the Western African region has a lower proportion of elderly people, and some studies have shown that COVID-19 is more prevalent among the elderly than among younger individuals.
In Figure 3, the time series plots for the daily COVID-19 cases from all the African regions, including the whole of Africa, are presented.
Figure 3 shows that Central Africa had the lowest number of daily COVID-19 cases, followed by Western Africa. Southern Africa accounts for a high share of the daily COVID-19 cases reported in Africa. For all regions, the COVID-19 pandemic is characterised by four waves. To examine the distribution of daily COVID-19 cases, box-and-whisker plots for each region are presented in Figure 4. The plots show a non-normal distribution of daily COVID-19 cases, with some outliers, across the African regions. Models that can handle this kind of complex pattern therefore need to be employed. Previous studies have shown that traditional statistical methods, such as ARIMA models, struggle with such complex data. Thus, machine learning models, particularly XGBoost and RF, are employed.
The distribution of COVID-19 cases by month and day of the week is presented in Figure 5. The results show that more cases were recorded in winter, from June to August, for all the regions. A possible explanation is that the coronavirus that causes COVID-19 survives longer in environments with reduced sunlight and lower temperatures. High COVID-19 case counts were also recorded around December to January, the months associated with festive seasons, when contact with infected individuals is more likely, leading to increased spread. Although the distribution of cases does not vary much by day of the week, Fridays have slightly more reported cases than other days.
3.2 Hierarchical forecasting
At this stage, the time series for each African region had 1,342 observations, which were split into training and test sets at approximately 75:25, yielding 1,000 observations for the training set and 342 for the test set. The machine learning models, XGBoost and Random Forest (RF), are fitted on the training set and validated on the test set. To improve the performance of the fitted models, we included lag7 and lag30 variables to capture weekly and monthly effects in each time series. In addition to lag7 and lag30, a cubic spline for each time series is included. These covariates have a non-zero influence on the daily time series for each region and the whole of Africa.
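The lag covariates and the train/test split described above can be sketched as follows. This is only an illustration under stated assumptions: the helper names are hypothetical, the split is assumed to be chronological (no shuffling), the cubic spline covariate is omitted, and the placeholder series is not COVID-19 data. The text does not say whether the lags were built before or after splitting; here they are built first.

```python
def make_lag_features(series, lags=(7, 30)):
    """Return (features, targets); each feature row holds the lagged values."""
    max_lag = max(lags)
    features, targets = [], []
    for t in range(max_lag, len(series)):
        features.append([series[t - lag] for lag in lags])
        targets.append(series[t])
    return features, targets


def chronological_split(rows, n_train):
    """Split without shuffling so the test set follows the training set in time."""
    return rows[:n_train], rows[n_train:]


series = list(range(100))              # placeholder data, not COVID-19 counts
X, y = make_lag_features(series)       # first usable row is day index 30
X_train, X_test = chronological_split(X, 52)
y_train, y_test = chronological_split(y, 52)
```

The first 30 observations are lost because the 30-day lag is undefined for them, which is worth keeping in mind when counting training observations.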
The hyperparameters were selected from the search ranges provided in Table 5.
The final tuned parameters for XGBoost were booster = “gbtree”, eta = 0.1, max_depth = 10, subsample = 0.8, colsample_bytree = 0.85, lambda = 1, alpha = 0, and objective = “reg:squarederror”.
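The study fitted XGBoost in R. Purely as an illustration, the same reported settings could be written as a parameter dictionary of the kind accepted by the Python `xgboost` package's native training interface (`xgboost.train`), which uses the same parameter names. The number of boosting rounds is not reported in the text, so it is deliberately left out, and no model is fitted here.

```python
# Reported final XGBoost settings, written as a Python parameter dictionary.
# Assumption: this would be passed to the Python xgboost package; the study
# itself used R, so this is a translation for illustration only.
xgb_params = {
    "booster": "gbtree",              # tree-based booster
    "eta": 0.1,                       # learning rate
    "max_depth": 10,                  # maximum tree depth
    "subsample": 0.8,                 # fraction of rows sampled per tree
    "colsample_bytree": 0.85,         # fraction of features sampled per tree
    "lambda": 1,                      # L2 regularisation on leaf scores
    "alpha": 0,                       # L1 regularisation on leaf scores
    "objective": "reg:squarederror",  # squared-error regression
}
```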
3.2.1 Extreme gradient boosting (XGBoost) and random forest (RF)
To fit the XGBoost model, the R package “xgboost” is used. Three types of parameters were specified: (1) general parameters, which in this case select the gbtree booster, (2) booster parameters, and (3) learning task parameters. To assess performance on the test set, we used the mean absolute scaled error (MASE), mean absolute error (MAE), root mean squared error (RMSE), and bias. The forecast performance of XGBoost for each region is presented in Table 6a. The RF model was fitted using the H2O package, and the corresponding results are presented in Table 6b.
Table 6. Performance comparison of XGBoost and RF models in forecasting COVID-19 spread for African regions.
Results from Tables 6a and 6b show that the MAEs and RMSEs for the XGBoost model are lower than those for the RF model for all the regions of Africa and for Africa as a whole. This shows that the XGBoost model performs better than the RF model. This finding is consistent with other studies reporting that XGBoost handles complex relationships well owing to its built-in regularisation [18]. Figure 6 presents plots of the XGBoost model’s forecasts and observed values.
The plots in Figure 6 show that the forecasts from the XGBoost model are close to the observed test set values. The XGBoost model also predicted most of the spikes in the test set. We also present, in Figure 7, the box-and-whisker plots of the residuals from the fitted XGBoost model. For the regions, the plots show the presence of outliers. We therefore apply hierarchical forecasting with a bottom-up reconciliation procedure to improve the forecast performance of the models.
3.2.2 Hierarchical forecasting using the bottom-up reconciliation approach
At this stage, hierarchical forecasting is performed using machine learning approaches, including XGBoost and RF models. Forecasts from the XGBoost and the RF are further combined using the weighted average ensemble (WAE), assigning different weights to the fitted models and selecting the model with the best performance. The bottom-up method is used for forecast reconciliation. This results in fitting a two-level hierarchical time-series forecasting model, where Level 1 represents total COVID-19 cases for each of the five African regions, and Level 2 represents daily COVID-19 cases for each country in Africa. The traditional ARIMA and Exponential smoothing are used as benchmarks for hierarchical forecasting.
Figure 8 presents the reconciled forecasts for level zero (daily COVID-19 cases in Africa), level 1 (daily COVID-19 cases in the five regions of Africa) and level 2 (daily COVID-19 cases for all countries in Africa). Even though some countries reported more daily COVID-19 cases than others, the time series plots at all levels show a similar trend.
In Table 7, performance metrics (MASE, MAE, and RMSE) for the machine learning approaches (XGBoost, RF, and WAE) and the traditional forecasting methods (ARIMA and Exponential smoothing) are presented.
Results from Table 7 show that, among the traditional approaches, the ARIMA model performs better than the exponential smoothing model. However, the machine learning approaches perform far better than the traditional methods. The XGBoost method is more robust for hierarchical time series forecasting than both the RF method and the WAE with equal weights assigned to the XGBoost and RF forecasts. The WAE with more weight assigned to the XGBoost forecasts performs better than the one with equal weights, and its error metrics are slightly smaller than, though approximately equal to, those from the XGBoost method. In addition, the RMSE, MAE, and MASE for XGBoost hierarchical forecasting with reconciliation are lower than those presented in Table 6 for XGBoost without reconciliation. The same is observed for the RF method, whose performance metrics in Table 7 are lower than those in Table 6. Thus, hierarchical forecasting using the bottom-up reconciliation approach improves forecast accuracy by accounting for the hierarchical structure of the data and forecasting at multiple levels of aggregation. The WAE method with more weight assigned to XGBoost forecasts is the best forecasting approach, followed by the XGBoost method. Plots of the forecasts from the WAE (with more weight assigned to XGBoost) and from XGBoost are presented in Figure 9 for level 1 and in Figure 10 for level 0 (the top level).
Figure 9. Top panel (a): Hierarchical forecasting from the WAE model for the five regions (level 1). Bottom panel (b): Hierarchical forecasting from the XGBoost model for the five regions (level 1).
Figure 10. Top panel (a): Forecasts from the XGBoost vs. the observed cases. Bottom panel (b): Forecasts from the WAE vs. the observed cases.
The plots in Figure 9a further confirm that forecasts from the WAE model, with more weight assigned to XGBoost than to RF, predict the test set better than the XGBoost forecasts presented in Figure 9b. The WAE also gives a better prediction of the reconciled COVID-19 cases at the top level (Africa) than the XGBoost method, as shown in Figures 10b and 10a, respectively. There has also been a notable improvement in capturing spikes in the datasets when the bottom-up reconciliation approach is used for hierarchical forecasting, as shown in the plots in Figure 6. The coverage probabilities of the XGBoost prediction intervals, estimated using the linear regression approach at the 95% confidence level, are fairly high. This further confirms the robustness of XGBoost for hierarchical forecasting.
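For readers who wish to reproduce this kind of evaluation, the accuracy metrics and the empirical coverage probability can be computed along the following lines. This is a minimal sketch rather than the code used in the study: the function names and the daily case counts are hypothetical, MASE is scaled by the in-sample one-step naive forecast (a common convention that we assume here), and the interval bounds are supplied directly rather than derived from the linear regression approach.

```python
import math


def mae(actual, forecast):
    """Mean absolute error."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)


def rmse(actual, forecast):
    """Root mean square error."""
    squared = sum((a - f) ** 2 for a, f in zip(actual, forecast))
    return math.sqrt(squared / len(actual))


def mase(actual, forecast, train):
    """MAE scaled by the in-sample MAE of the one-step naive forecast."""
    diffs = [abs(train[i] - train[i - 1]) for i in range(1, len(train))]
    naive_mae = sum(diffs) / len(diffs)
    return mae(actual, forecast) / naive_mae


def coverage(actual, lower, upper):
    """Share of observations that fall inside their prediction interval."""
    hits = sum(1 for a, lo, hi in zip(actual, lower, upper) if lo <= a <= hi)
    return hits / len(actual)


# Made-up daily case counts, for illustration only.
train = [100, 120, 90, 130, 110]
actual = [115, 125, 105, 140]
forecast = [110, 130, 100, 120]
lower = [95, 115, 85, 105]   # hypothetical lower interval bounds
upper = [125, 145, 115, 135]  # hypothetical upper interval bounds

print("MAE:", mae(actual, forecast))
print("RMSE:", rmse(actual, forecast))
print("MASE:", mase(actual, forecast, train))
print("Coverage:", coverage(actual, lower, upper))
```

A MASE below 1 indicates that the forecast beats the naive benchmark on the training scale, and an empirical coverage close to the nominal level (0.95 here) indicates well-calibrated intervals.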
4 Discussion
This study compares the performance of machine learning methods (XGBoost and random forests) and traditional methods (exponential smoothing and ARIMA) for hierarchical time-series forecasting of daily COVID-19 cases in Africa. The choice of XGBoost and Random Forest is anchored in the study by Shakhovska and others [34], which, although applied to a hierarchical classification problem, supports the selection of XGBoost and Random Forest as the best predictive methods. The study used a two-stage approach. In the first stage, we compared the performance of the XGBoost and Random Forest models in predicting COVID-19 cases across the five regions of Africa and in the aggregated time series for the whole African continent. At this stage, reconciliation methods were not used. XGBoost had lower MASE and RMSE on the validation set than Random Forest and thus outperformed it in forecasting the daily COVID-19 cases at all levels of the hierarchy. XGBoost’s fine-grained control and ability to handle imbalanced datasets make it a strong contender for advanced predictive modelling [35], unlike other single-forecast models. XGBoost also incorporates built-in regularisation techniques that help prevent overfitting and thereby improve generalisability, whereas Random Forest lacks the built-in regularisation parameters that can be beneficial in some scenarios [35]. Comparing the XGBoost results across the African regions, West Africa had the lowest MAE, followed by Central Africa and then North Africa. For the RF approach, Central Africa had the lowest MAE, followed by West Africa and then East Africa. West Africa, Central Africa, and East Africa are the regions with the fewest recorded daily COVID-19 cases, which could explain the small MAEs in these regions.
In the second stage, hierarchical time-series forecasting is performed using the bottom-up reconciliation method. The machine learning approaches are benchmarked against traditional methods, namely ARIMA and exponential smoothing. A comparison of the traditional time series approaches showed that, overall, the ARIMA model performed better than the exponential smoothing model. For the machine learning methods, we used XGBoost and Random Forest, and further combined them using the Weighted Average Ensemble (WAE). Various weights were assigned to the forecasts from XGBoost and Random Forest, and the results showed that assigning more weight to the XGBoost forecasts improved the model’s performance on the validation set. The uniqueness of this study lies in the formulation of the WAE approach. Overall, the order of performance of the fitted models can be summarised as follows: WAE (more weight on XGBoost) > XGBoost > RF > ARIMA > ETS. Although the WAE outperforms XGBoost, their performance is almost identical. This study shows that, for this problem, the WAE with more weight assigned to XGBoost is the best approach for hierarchical forecasting of COVID-19 spread across the African continent. A study by Fang and others [36] supports the flexibility of the XGBoost machine learning model for handling time series data with complex patterns like those in the COVID-19 data; nevertheless, combining it with the random forest method improves its performance. Combining forecasts improves stability and reduces sensitivity to outliers, leading to more reliable predictions than single-forecast models [37]. A comparison of forecasts from the first and second stages for the XGBoost and RF methods shows that hierarchical time-series forecasting using the bottom-up reconciliation approach improves the predictive power of the models.
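The weighting step described above can be sketched as a simple grid search over the weight placed on the XGBoost forecasts, scored on a validation set. The sketch below is illustrative only: the function names, the weight grid, and the forecast values are hypothetical and are not taken from the study, and MAE is used as the selection criterion by assumption.

```python
def weighted_average(f_xgb, f_rf, w):
    """Blend two forecast paths; w is the weight on the XGBoost path."""
    return [w * x + (1 - w) * r for x, r in zip(f_xgb, f_rf)]


def mae(actual, forecast):
    """Mean absolute error."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)


def best_weight(actual, f_xgb, f_rf, grid):
    """Return the (weight, MAE) pair with the smallest validation MAE."""
    scored = [(w, mae(actual, weighted_average(f_xgb, f_rf, w))) for w in grid]
    return min(scored, key=lambda pair: pair[1])


# Made-up validation data, for illustration only.
actual = [100, 110, 120, 130]
xgb_forecast = [102, 108, 123, 128]  # hypothetical XGBoost forecasts
rf_forecast = [90, 118, 110, 140]    # hypothetical random forest forecasts
grid = [0.0, 0.25, 0.5, 0.75, 1.0]   # candidate weights on XGBoost

w, err = best_weight(actual, xgb_forecast, rf_forecast, grid)
print("best weight on XGBoost:", w, "validation MAE:", err)
print("XGBoost alone:", mae(actual, xgb_forecast))
```

In this toy example the errors of the two models partly offset each other, so a weight below 1 on XGBoost gives a lower validation MAE than XGBoost alone. Whether this happens in practice depends on the data.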
The bottom-up approach prevents information loss during aggregation and maintains data consistency across all hierarchies [38]. In addition, the approach can capture granular information, leading to more accurate forecasting, especially when there are significant variations across levels. Although the WAE performed best in most regions of Africa and across Africa as a whole, the Central African region showed different results: there, XGBoost showed the best performance.
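The bottom-up aggregation itself can be illustrated with a short sketch in which country-level forecasts are summed to regional totals and then to the continental total, so that the forecasts are coherent by construction. The function name, the country-to-region mapping, and the forecast values below are hypothetical and serve only to show the mechanics.

```python
def bottom_up(country_forecasts, region_of):
    """Sum country-level forecast paths into regional and continental totals."""
    horizon = len(next(iter(country_forecasts.values())))
    regions = {}
    for country, path in country_forecasts.items():
        totals = regions.setdefault(region_of[country], [0.0] * horizon)
        for t, value in enumerate(path):
            totals[t] += value
    # The continental forecast is the sum of the regional forecasts.
    continent = [sum(r[t] for r in regions.values()) for t in range(horizon)]
    return regions, continent


# Made-up two-day-ahead forecasts, for illustration only.
country_forecasts = {
    "Botswana": [10, 12],
    "South Africa": [200, 210],
    "Kenya": [50, 55],
}
region_of = {
    "Botswana": "Southern Africa",
    "South Africa": "Southern Africa",
    "Kenya": "Eastern Africa",
}

regions, continent = bottom_up(country_forecasts, region_of)
print(regions)
print(continent)
```

Because every higher-level forecast is a sum of the forecasts beneath it, no separate adjustment step is needed to make the levels agree.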
5 Conclusion and recommendations
This study shows that a bottom-up approach to hierarchical forecasting can improve the accuracy and consistency of COVID-19 predictions across different levels in Africa. This method provides substantial benefits for public health planning when data availability allows, ensuring that forecasts at the continental and regional levels are coherent with those at the country level. Machine learning techniques, particularly the weighted average ensemble (WAE) of the XGBoost and RF forecasts with more weight assigned to XGBoost, demonstrated greater robustness and accuracy than traditional statistical methods such as ARIMA and exponential smoothing, especially when modelling complex, non-linear epidemiological data.
The study’s key empirical finding indicates that the Southern African region became the epicentre of COVID-19 cases in Africa, despite possessing the lowest population and population density. This unexpected outcome may be attributable to underlying comorbidities, particularly the high prevalence of HIV/AIDS and tuberculosis in the region, which are known to compromise immune response and may have intensified the impact of COVID-19. Future research should investigate the biological and epidemiological interactions among COVID-19, HIV, and TB through in vivo or in vitro studies to enhance understanding of their combined health burden. National health authorities should enhance local data collection and consider implementing ensemble machine-learning approaches, together with bottom-up hierarchical forecast reconciliation, to improve pandemic preparedness and response in Africa and other regions with limited data resources.
Data availability statement
The analytic data can be downloaded from https://github.com/.
Ethics statement
This study uses secondary data that is readily available online from Our World in Data. Data access did not require ethical approval.
Author contributions
ClS: Conceptualization, Formal analysis, Methodology, Writing – original draft. CaS: Methodology, Supervision, Writing – review & editing. KM: Methodology, Writing – review & editing.
Funding
The author(s) declared that financial support was not received for this work and/or its publication.
Acknowledgments
The authors sincerely thank the anonymous reviewers for their helpful comments and suggestions on this paper.
Conflict of interest
The author(s) declared that this work was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Generative AI statement
The author(s) declared that generative AI was not used in the creation of this manuscript.
Any alternative text (alt text) provided alongside figures in this article has been generated by Frontiers with the support of artificial intelligence and reasonable efforts have been made to ensure accuracy, including review by the authors wherever possible. If you identify any issues, please contact us.
Publisher's note
All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.
Footnote
Abbreviations: ARIMA, autoregressive integrated moving average; BU, bottom up; CA, Central Africa; EA, Eastern Africa; ETS, exponential time series; HTS, hierarchical time series; MAE, mean absolute error; MASE, mean absolute scaled error; MinT, minimization trace; MO, middle out; NA, Northern Africa; RF, random forest; RMSE, root mean square error; SA, Southern Africa; SVR, support vector regression; TD, top down; WA, Western Africa; WAE, weighted average ensemble; XGBoost, eXtreme gradient boosting.
References
1. Tamuzi JL, Ayele BT, Shumba CS, Adetokunboh OO, Uwimana-Nicol J, Haile ZT, et al. Implications of COVID-19 in high-burden countries for HIV/TB: a systematic review of evidence. BMC Infect Dis. (2020) 20:1–18. doi: 10.1186/s12879-020-05450-4
2. Athanasopoulos G, Hyndman RJ, Kourentzes N, Petropoulos F. Forecasting with temporal hierarchies. Eur J Oper Res. (2017) 262(1):60–74. doi: 10.1016/j.ejor.2017.02.046
3. Kolsarici C, Vakratsas D. Correcting for misspecification in parameter dynamics to improve forecast accuracy with adaptively estimated models. Manage Sci. (2015) 61(10):2495–513. doi: 10.1287/mnsc.2014.2027
4. Mir AA, Alghassab M, Ullah K, Khan ZA, Lu Y, Imran M. A review of electricity demand forecasting in low and middle-income countries: the demand determinants and horizons. Sustainability. (2020) 12(15):5931. doi: 10.3390/su12155931
5. Château B. Energy demand drivers. In: Hafner M, Luciani G, editors. The Palgrave Handbook of International Energy Economics. Cham: Springer International Publishing (2022). p. 511–43. doi: 10.1007/978-3-030-86884-0_26
6. Lv C, Guo W, Yin X, Liu L, Huang X, Li S, et al. Innovative applications of artificial intelligence during the COVID-19 pandemic. Infect Med. (2024) 3(1):100095. doi: 10.1016/j.imj.2024.100095
7. Khan M, Mehran MT, Haq ZU, Ullah Z, Naqvi SR, Ihsan M, et al. Applications of artificial intelligence in COVID-19 pandemic: a comprehensive review. Expert Syst With Appl. (2021) 185:115695. doi: 10.1016/j.eswa.2021.115695
8. Bachan PR, Bera UN, Kapoor P. Recent advancement of artificial intelligence in COVID-19: prediction, diagnosis, monitoring, and drug development. In: Mehta G, Wickramasinghe N, Kakkar D, editors. Innovations in VLSI, Signal Processing and Computational Technologies. WREC 2023. Singapore: Springer (2024). Lecture Notes in Electrical Engineering; Vol. 1095. doi: 10.1007/978-981-99-7077-3_28
9. Silveira Gontijo T, Azevedo Costa M. Forecasting hierarchical time series in power generation. Energies. (2020) 13(14):3722. doi: 10.3390/en13143722
10. Mircetic D, Rostami-Tabar B, Nikolicic S, Maslaric M. Forecasting hierarchical time series in supply chains: an empirical investigation. Int J Prod Res. (2022) 60(8):2514–33. doi: 10.1080/00207543.2021.1896817
11. Gabalawy M, Hosny NS, Adly AR. Probabilistic forecasting for energy time series considering uncertainties based on deep learning algorithms. Electr Power Syst Res. (2021) 196:107216. doi: 10.1016/j.epsr.2021.107216
12. Mohanty S, Shimamura A, Nicholson CD, González AD, Razzaghi T. Hierarchical time series forecasting of COVID-19 cases using county-level clustering data. In: Operations Research Forum. Cham: Springer International Publishing (2025). Vol. 6. p. 28. doi: 10.1007/s43069-025-00424-1
13. Kontopoulou VI, Panagopoulos AD, Kakkos I, Matsopoulos GK. A review of ARIMA vs. machine learning approaches for time series forecasting in data driven networks. Future Internet. (2023) 15:255. doi: 10.3390/fi15080255
14. Hyndman RJ, Athanasopoulos G. Forecasting: Principles and Practice. 3rd ed. Melbourne, Australia: OTexts (2021).
15. Wickramasuriya SL, Athanasopoulos G, Hyndman RJ. Optimal forecast reconciliation for hierarchical and grouped time series through trace minimisation. J Am Stat Assoc. (2019) 114(526):804–19. doi: 10.1080/01621459.2018.1448825
16. Hyndman R, Lee AJ, Wang E. Fast computation of reconciled forecasts for hierarchical and grouped time series. Comput Stat Data Anal. (2016) 97:16–32. doi: 10.1016/j.csda.2015.11.007
17. Makatjane K, Shoko C, Moroke ND. Probabilistic forecasting of hourly wind power load in South Africa. In: Machine Learning and Computer Vision for Renewable Energy. New York: IGI Global (2024). p. 268–85.
18. Chen T, Guestrin C. Xgboost: a scalable tree boosting system. In: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. New York: Association for Computing Machinery (2016). p. 785–94.
19. Breiman L, Friedman JH, Olshen RA, Stone CJ. Classification and Regression Trees. Florida: Chapman and Hall/CRC (1984).
22. Ho TK. The random subspace method for constructing decision forests. IEEE Trans Pattern Anal Mach Intell. (1998) 20(8):832–44.
23. Dietterich TG. Ensemble methods in machine learning. In: International Workshop on Multiple Classifier Systems. Berlin, Heidelberg: Springer (2000). p. 1–15.
26. Györfi L, Kohler M, Krzyzak A, Walk H. A Distribution-Free Theory of Nonparametric Regression. Berlin: Springer Science & Business Media (2002).
27. Scornet E, Biau G, Vert JP. Consistency of random forests. Ann Stat. (2015) 43(4):1716–41. doi: 10.1214/15-AOS1321
28. Lin Y, Jeon Y. Random forests and adaptive nearest neighbours. J Am Stat Assoc. (2006) 101(474):578–90. doi: 10.1198/016214505000001230
29. Wager S, Hastie T, Efron B. Confidence intervals for random forests: the jackknife and the infinitesimal jackknife. J Mach Learn Res. (2014) 15(1):1625–51.
30. Ishwaran H, Lu M. Standard errors and confidence intervals for variable importance in random forest regression, classification, and survival. Stat Med. (2019) 38(4):558–82. doi: 10.1002/sim.7803
31. Mentch L, Hooker G. Quantifying uncertainty in random forests via confidence intervals and hypothesis tests. J Mach Learn Res. (2016) 17(1):841–81.
33. Ben Taieb S, Taylor JW, Hyndman RJ. Coherent probabilistic forecasts for hierarchical time series. In: Precup D, Teh YW, editors. Proceedings of Machine Learning Research: Vol. 70, Proceedings of the 34th International Conference on Machine Learning. PMLR (2017). p. 3348–57.
34. Shakhovska N, Izonin I, Melnykova N. The hierarchical classifier for COVID-19 resistance evaluation. Data. (2021) 6:6. doi: 10.3390/data6010006
35. Fatima S, Hussain A, Amir SB, Ahmed SH, Aslam SMH. XGBoost and random forest algorithms: an in depth analysis. Pak J Sci Res. (2023) 3(1):26–31. doi: 10.57041/vol3iss1pp26-31
36. Fang ZG, Yang SQ, Lv CX, An SY, Wu W. Application of a data-driven XGBoost model for the prediction of COVID-19 in the USA: a time-series study. BMJ Open. (2022) 12(7):e056685. doi: 10.1136/bmjopen-2021-056685
37. Shoko C, Sigauke C, Njuho P. Short-term forecasting of confirmed daily COVID-19 cases in the Southern African development community region. Afri Health Sci. (2022) 22(4):534–50. doi: 10.4314/ahs.v22i4.60
Keywords: bottom-up reconciliation, ensemble, hierarchical time series, random forest, weighted average, XGBoost
Citation: Shoko C, Sigauke C and Makatjane K (2026) Hierarchical forecasting of COVID-19 cases in Africa using machine learning models. Front. Epidemiol. 6:1696282. doi: 10.3389/fepid.2026.1696282
Received: 31 August 2025; Revised: 21 January 2026; Accepted: 22 January 2026; Published: 11 February 2026.
Edited by:
Mohamed Ali Daw, University of Tripoli-Libya, Libya
Reviewed by:
Jayakumar Kaliappan, Vellore Institute of Technology (VIT), India
Yuanzhao Ding, University of Oxford, United Kingdom
Copyright: © 2026 Shoko, Sigauke and Makatjane. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
*Correspondence: Claris Shoko, shokoc@ub.ac.bw