- 1 Digitalization Department, State Grid Fujian Electric Power Company, Fuzhou, China
- 2 Economic and Technical Research Institute, State Grid Fujian Electric Power Company, Fuzhou, China
- 3 Department of Economics and Management, North China Electric Power University (Baoding), Baoding, China
With the rapid advancement of digital transformation, enterprises face escalating challenges in cloud resource allocation due to dynamic workloads and substantial capital investments. Existing forecasting models often overlook the impact of corporate digital maturity, leading to suboptimal investment decisions and resource inefficiencies. This study proposes an integrated framework that combines an ARIMAX forecasting model with a multi-constraint optimization approach. We incorporate a quantified Digital Transformation Index (DTI) as an exogenous variable and develop a cost-minimization investment model under constraints on resource gaps, leasing ratios, alert thresholds, and budget limits. Forecast accuracy is evaluated with the Mean Absolute Percentage Error (MAPE) and investment cost with Net Present Value (NPV), metrics chosen for their relevance to the operational and financial performance of cloud resource management. Simulation experiments using Alibaba Cloud cluster data demonstrate that the proposed model achieves a CPU load prediction MAPE of less than 5%, with a statistically significant DTI coefficient (p < 0.01). The optimal investment strategy utilized 93.67% of a $2.22 million budget, achieving a leasing ratio below 45% while maintaining a 67% resource utilization safety threshold.
1 Introduction
With the rapid advancement of digital transformation, enterprises are actively adopting cloud technologies to transform their production, operations, and management processes. The exponential growth of business data has led to escalating pressure on cloud platforms, pushing resource allocation to the point of saturation. Simultaneously, cloud infrastructure construction requires substantial capital investment. Without scientific planning, unplanned expansion may result in compromised quality and financial waste. Therefore, systematic investigation of cloud resource load forecasting techniques and optimization of investment strategies are critical for enhancing operational efficiency and achieving sustainable development.
Existing studies have explored the impact of digital transformation. For instance, Pan et al. (2025) demonstrated that digital transformation significantly enhances corporate productivity, particularly in state-owned enterprises and the central and eastern regions of China (Pan and Hu, 2025). Xing et al. (2025) revealed varying sensitivities to pricing and data service benefits among manufacturing participants in industrial internet platforms (Xing et al., 2025). Li (2025) emphasized that developing digital management systems is key to gaining competitive advantages in the era of IoT and telematics (Li, 2025). These findings underscore the profound influence of digital transformation on modern enterprises’ operational frameworks.
In resource forecasting, scholars have explored a variety of prediction techniques and models, which fall mainly into two categories: time series analysis and machine learning. Time series methods, such as the ARIMA model and exponential smoothing, excel at capturing the linear trends and periodic characteristics of cloud resource utilization, making them particularly suitable for scenarios with abundant historical data and stable operational patterns. For example, Calheiros et al. achieved 91% accuracy in predicting web server loads using ARIMA, validating its effectiveness in cloud environments (Calheiros et al., 2015). Mi et al. applied quadratic exponential smoothing to predict the number of user requests and then estimated virtual machine resource requirements (Mi et al., 2011). Machine learning approaches, such as SVM and LSTM, demonstrate significant advantages in handling large-scale, high-dimensional cloud resource data by automatically extracting complex nonlinear relationships, often yielding higher prediction accuracy. For instance, Gao designed a dynamic resource scheduling scheme based on the ant colony algorithm, optimizing load balancing and energy consumption management on cloud computing platforms (Gao, 2015). Additionally, Zhang et al. leveraged a deep belief network (DBN) for cloud resource demand prediction, improving accuracy through input-output relationship analysis (Zhang et al., 2017).
In the field of investment optimization, Zhang systematically categorized macro-asset allocation theories into five types: (1) return-risk balance, (2) return-only, (3) risk-only, (4) investor utility maximization, and (5) integration of economic cycles with subjective judgment (Zhang and Zhang, 2017). The study further analyzed the characteristics and limitations of each category. From the perspective of financial management and cost-effectiveness, Li compared four equipment allocation models—self-owned procurement, financial leasing, operating leasing, and hybrid leasing—providing actionable insights for enterprises in cost control, risk mitigation, and decision optimization (Li, 2022). Additionally, Liu et al. explored optimal choices among financing leases, operating leases, and outright purchases in fixed-asset allocation by constructing cash flow models (Liu et al., 2010).
Existing studies indicate that corporate digital maturity significantly influences cloud platform development and exhibits a strong correlation with cloud resource demand. However, current literature on cloud resource allocation has predominantly overlooked this linkage, resulting in a critical research gap in cloud infrastructure investment strategies. To bridge this gap, our study incorporates corporate digital maturity as a critical exogenous variable into a cloud resource demand forecasting model. Furthermore, by analyzing business growth trends and existing cloud resource configurations, we comprehensively investigate investment decision optimization during cloud platform upgrades, with a focus on comparing the economic viability of third-party cloud service leasing versus self-built data centers.
1.1 Contributions
The contributions of this study are threefold:
• Theoretical: We introduce a multidimensional Digital Transformation Index (DTI) as an exogenous variable in cloud resource forecasting, bridging the gap between digital maturity and IT resource planning.
• Methodological: We develop an integrated ARIMAX-predictive optimization framework that combines forecasting with multi-constraint investment decision-making.
• Practical: The model provides enterprises with a scalable, budget-aware investment strategy for cloud platform expansion, validated through real-world simulation.
2 Related works
The existing body of research on cloud resource management can be broadly categorized into two interconnected streams: (1) predictive modeling of resource demand, and (2) optimization of investment and allocation strategies. A review of recent literature (2020–2025) reveals distinct evolutionary trends and prevailing research gaps in both domains.
2.1 Advancements in cloud resource forecasting
Recent studies in cloud resource forecasting demonstrate a clear paradigm shift from traditional statistical methods towards sophisticated deep learning, hybrid optimization, and privacy-preserving computational frameworks. For instance, Wang et al. proposed a BO-LSTM model that integrates Bayesian optimization with marketing variables to enhance the accuracy of point forecasts (Wang and Chen, 2025). Similarly, Malik et al. developed a hybrid FLNN model (FLGAPSONN) that combines Genetic Algorithm and Particle Swarm Optimization, enabling concurrent prediction of multiple resource metrics (e.g., CPU, memory) and demonstrating superior performance on Google cluster traces (Malik et al., 2022). In the realm of data privacy, Stefanidis et al. designed MulticloudFL, a federated learning framework that supports accurate distributed predictions without centralizing sensitive data (Stefanidis et al., 2023). Other innovations include the use of spiking neural networks (MASNN) by Karpagam et al. to capture temporal symmetries in resource usage, and the DimAug-TimesFM approach by Yang et al., which employs data augmentation to improve the robustness of long-horizon forecasts under conditions of data scarcity (Karpagam and Kanniappan, 2025; Yang et al., 2025).
While these studies represent significant progress in model architecture, optimization algorithms, and learning paradigms, they predominantly focus on technical and data-driven factors, largely overlooking the intrinsic impact of corporate digital maturity—a critical business driver that systematically influences IT resource consumption patterns. A summary of representative forecasting studies is provided in the upper part of Table 1. This oversight establishes a salient research gap, which our study aims to address by introducing a quantified Digital Transformation Index (DTI) as an exogenous predictive variable.
2.2 Evolution of investment and resource allocation models
Parallel developments in investment optimization and resource allocation reflect a trend toward multi-objective, predictive, and synergistic decision-making frameworks. Serban and Dedu introduced a Mean-Deviation-Entropy (MDE) model for portfolio optimization, simultaneously balancing return, risk, and diversification, exemplifying the shift from single- to multi-criteria optimization (Serban and Dedu, 2025). Echoing this trend, Nalewaik (2025) emphasized the need in capital project planning to move beyond traditional cost-benefit analysis by integrating social value and resilience through Multi-Criteria Decision-Making (MCDM) methods (Nalewaik, 2025). The integration of forecasting into dynamic allocation is another key trend. Su, for example, utilized GARCH models for financial forecasting and developed dynamic weight allocation algorithms to track the efficient frontier in real time, establishing a "predict-then-optimize" methodological paradigm (Su, 2020). Furthermore, the concept of synergistic resource configuration has gained traction. Liu, in the context of state-owned capital allocation, highlighted that optimal resource deployment requires the integration of different capital forms and the coordination between incremental and existing capital, a principle that profoundly informs the hybrid "lease-or-build" model in cloud resource strategy (Liu, 2023).
Despite these theoretical and methodological advances, a significant synthesis is lacking. Specifically, there remains no unified model that seamlessly integrates demand forecasting (e.g., as in Su’s approach), multi-objective trade-offs (e.g., following Serban and Nalewaik’s frameworks), and synergistic resource configuration (e.g., informed by Liu’s concept) to systematically address the core cloud investment dilemma of “leasing versus self-building”.
3 Methods
This study proposes an integrated framework for cloud resource demand forecasting and investment optimization, comprising four main steps:
• Quantify the enterprise’s digital maturity to compute a Digital Transformation Index (DTI).
• Forecast dynamic cloud resource demand using an ARIMAX model with the DTI as an exogenous variable.
• Diagnose real-time resource utilization and trigger optimization alerts against predictive thresholds.
• Formulate and solve a constrained optimization model to determine the cost-minimal investment decision.
The overall workflow is illustrated in Figure 1.
3.1 Digital maturity quantification and exogenous variable design for ARIMAX modeling
Enterprise cloud resources exhibit a strong correlation with digital transformation progress within organizations. Critical resource fluctuations—such as in computing power, storage, and bandwidth—are closely tied to advancements in enterprise digital maturity (Zhong, 2018). Furthermore, digital transformation necessitates operational realignment of business processes and significantly reshapes cloud resource allocation strategies and usage patterns.
3.1.1 Digital maturity assessment framework
To assess enterprise digital maturity, this study introduces a four-dimensional evaluation framework, detailed in Tables 2, 3. The framework is structured around four core dimensions—technological application, data capability, business integration, and organizational adaptation—from which 12 quantifiable secondary indicators are derived. The Analytic Hierarchy Process (AHP) and expert scoring methods are applied to assign weights to these sub-indicators, enabling the calculation of a composite Digital Transformation Index (DTI). The DTI is calibrated against an industry benchmark of 100, with values above this threshold indicating superior digital maturity in enterprises.
3.1.1.1 Indicator selection justification
Each secondary indicator in the DTI framework was selected based on its established linkage to cloud resource demand, as supported by prior IT and digital transformation literature. The justifications are organized by primary dimension:
• Technical Infrastructure (B1): Indicators C1-C3 directly reflect the scale, modernization level, and architectural paradigm of the IT environment. Higher investment in cloud computing (C1), server virtualization (C2), and cloud-native applications (C3) is intrinsically linked to increased and more dynamic consumption of computing, storage, and network resources.
• Data-Driven Capability (B2): Indicators C4-C6 measure the intensity of data utilization. Broader data middle platform coverage (C4), higher real-time data processing volumes (C5), and accelerated data storage growth (C6) are key drivers demanding robust, scalable storage, memory, and computing power.
• Business Digital Integration (B3): Indicators C7-C9 quantify the digitization of core business operations. An increasing online transaction ratio (C7), remote work penetration (C8), and deployment of intelligent systems (C9) generate sustained and variable loads on cloud platforms by translating business activity directly into IT workload.
• Organizational Adaptability (B4): Indicators C10-C12, while having a more indirect influence, capture the enterprise’s capacity for continuous digital innovation. A higher ratio of digitally skilled employees (C10) and agile teams (C11), coupled with mature digital decision-making systems (C12), fosters an environment where new digital initiatives are rapidly developed and deployed, thereby driving evolving and less predictable resource demands over time.
3.1.1.2 AHP-expert scoring method
The weights for both primary and secondary indicators were determined through an integrated AHP-Expert Scoring Method to ensure a rational and consensus-driven weighting scheme. The procedure was conducted as follows:
• Expert Panel Formation: A panel of 15 experts was assembled, consisting of 5 senior IT architects, 5 digital transformation strategists, and 5 enterprise cloud solution managers, each with over 10 years of relevant industry experience.
• Pairwise Comparisons: Each expert performed pairwise comparisons for indicators at the same hierarchical level (e.g., B1 vs. B2; C1 vs. C2 under B1) using the standard Saaty’s 1–9 scale.
• Consistency Verification: The consistency ratio (CR) was computed for each expert’s judgment matrix. Matrices with a CR > 0.1 were considered inconsistent and were returned to the respective expert for reassessment, thereby ensuring the logical reliability of individual inputs.
• Weight Aggregation and Calculation: The validated individual judgment matrices were aggregated into a final group matrix using the geometric mean method. The final weights for each indicator, as shown in Tables 2, 3, were obtained by calculating the principal eigenvector of the aggregated matrix. All resulting weights exhibited a high level of consistency (CR < 0.1), confirming the reliability of the expert judgments and the overall weighting scheme; a computational sketch of this aggregation step follows this list.
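To make the aggregation step concrete, the following minimal Python sketch (illustrative only; the two judgment matrices are hypothetical, not the panel's actual assessments) aggregates expert matrices by the element-wise geometric mean, extracts the principal eigenvector as the weight vector, and checks the consistency ratio against the 0.1 threshold.

```python
import numpy as np

# Saaty random index (RI) values for matrix orders 1..9
RI = {1: 0.0, 2: 0.0, 3: 0.58, 4: 0.90, 5: 1.12, 6: 1.24, 7: 1.32, 8: 1.41, 9: 1.45}

def aggregate_geometric_mean(matrices):
    """Element-wise geometric mean of individual expert judgment matrices."""
    stacked = np.stack(matrices)                 # shape: (experts, n, n)
    return np.exp(np.log(stacked).mean(axis=0))  # geometric mean preserves reciprocity

def ahp_weights(matrix):
    """Principal-eigenvector weights and consistency ratio of a pairwise matrix."""
    eigvals, eigvecs = np.linalg.eig(matrix)
    k = np.argmax(eigvals.real)                  # index of the principal eigenvalue
    w = np.abs(eigvecs[:, k].real)
    w = w / w.sum()                              # normalize weights to sum to 1
    n = matrix.shape[0]
    ci = (eigvals[k].real - n) / (n - 1)         # consistency index
    cr = ci / RI[n] if RI[n] else 0.0            # consistency ratio (accept if < 0.1)
    return w, cr

# Hypothetical judgment matrices from two experts for the four primary dimensions B1..B4
expert_1 = np.array([[1, 2, 3, 5],
                     [1/2, 1, 2, 4],
                     [1/3, 1/2, 1, 3],
                     [1/5, 1/4, 1/3, 1]])
expert_2 = np.array([[1, 1, 2, 4],
                     [1, 1, 2, 3],
                     [1/2, 1/2, 1, 2],
                     [1/4, 1/3, 1/2, 1]])

group = aggregate_geometric_mean([expert_1, expert_2])
weights, cr = ahp_weights(group)
print("primary weights:", weights.round(3), "CR:", round(cr, 3))
```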
3.1.2 Exogenous variable generation for ARIMAX modeling
The composite score for digital transformation maturity is derived using Equation 1:

$\mathrm{DTI}_t = \sum_{j=1}^{4} \omega_j \left( \sum_{k \in B_j} w_{jk}\, C_{k,t} \right)$  (1)

where $\omega_j$ is the AHP-derived weight of primary dimension $B_j$, $w_{jk}$ is the weight of secondary indicator $C_k$ within dimension $B_j$, and $C_{k,t}$ is the normalized score of indicator $C_k$ at time $t$. For the raw score of each indicator, normalization against the industry benchmark (calibrated to 100, as described above) is applied before aggregation.
3.1.2.1 Calculation example
For instance, consider a hypothetical enterprise at a specific time t, with the following normalized secondary-indicator scores and primary-dimension weights:
B1 (weight ω1 = 0.35): C1 = 70, C2 = 60, C3 = 50.
B2 (weight ω2 = 0.30): C4 = 40, C5 = 55, C6 = 65.
B3 (weight ω3 = 0.25): C7 = 80, C8 = 30, C9 = 20.
B4 (weight ω4 = 0.10): C10 = 50, C11 = 40, C12 = 60.
The DTI score is calculated using Equation 2:
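To make the arithmetic reproducible, the short sketch below computes a DTI for the hypothetical scores above. Because the secondary-indicator weights from Tables 2, 3 are not restated in this example, equal within-dimension weights are assumed purely for illustration; with the actual AHP weights the figure would differ.

```python
# Hypothetical example from Section 3.1.2.1; within-dimension weights are assumed
# equal here for illustration (the actual weights come from the AHP step).
primary_weights = {"B1": 0.35, "B2": 0.30, "B3": 0.25, "B4": 0.10}
scores = {
    "B1": [70, 60, 50],   # C1-C3
    "B2": [40, 55, 65],   # C4-C6
    "B3": [80, 30, 20],   # C7-C9
    "B4": [50, 40, 60],   # C10-C12
}

dti = sum(
    primary_weights[dim] * (sum(vals) / len(vals))  # dimension score = (here) simple mean
    for dim, vals in scores.items()
)
print(f"DTI = {dti:.2f}")  # 0.35*60 + 0.30*53.33 + 0.25*43.33 + 0.10*50 ≈ 52.83
```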
3.1.2.2 Time-series dataset generation
The above process generates a synchronized time-series dataset, given by Equation 3:

$\mathrm{DTI} = \{\mathrm{DTI}_1,\ \mathrm{DTI}_2,\ \ldots,\ \mathrm{DTI}_T\}$  (3)
This dataset aligns temporally with cloud resource demand data to support subsequent analytical modeling.
3.2 Dynamic cloud resource demand forecasting with integrated digital maturity metrics
In the process of enterprise digital transformation, changes in cloud resource demand are jointly influenced by historical usage patterns and digital operational capabilities. This study employs an ARIMAX model, incorporating quantified enterprise digital transformation indicators as exogenous variables, to establish an integrated forecasting framework that combines technological development and business needs. Unlike the ARIMA model, which relies solely on historical data, the ARIMAX model integrates the DTI, enabling it to capture cloud resource demand fluctuations in complex scenarios (e.g., traffic peaks and resource auto-scaling) more accurately. This approach significantly enhances the capability of the model to analyze and predict dynamic cloud resource demands.
3.2.1 Basic principles of the model
The ARIMAX model extends the traditional ARIMA framework by incorporating exogenous variables. The general form of an ARIMAX (p, d, q) model is given by Equation 4:

$\phi(B)(1-B)^{d} y_t = c + \beta\, x_t + \theta(B)\,\varepsilon_t$  (4)

where $y_t$ is the cloud resource demand at time $t$; $x_t$ is the exogenous variable (here, the DTI) with regression coefficient $\beta$; $\phi(B) = 1 - \phi_1 B - \cdots - \phi_p B^{p}$ and $\theta(B) = 1 + \theta_1 B + \cdots + \theta_q B^{q}$ are the autoregressive and moving-average polynomials of orders $p$ and $q$; $d$ is the differencing order; $\varepsilon_t$ is a white-noise error term; and B denotes the backward shift operator.
3.2.2 Model construction process
3.2.2.1 Step 1 stationarity test
The ARIMA model requires the time series data to be stationary. Typically, the Augmented Dickey-Fuller (ADF) test is employed to assess the stationarity of the cloud resource demand series {yt}. The test begins with the null hypothesis that the series is non-stationary (i.e., it contains a unit root). After computing the test statistic, it is compared against standard critical values. If the null hypothesis is rejected, the series is deemed stationary; otherwise, differencing is applied iteratively until stationarity is achieved. This process determines the optimal differencing order d.
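As a concrete illustration of this step, the following sketch (variable names are illustrative) applies the ADF test from statsmodels and differences the series until the unit-root null is rejected, returning the differencing order d.

```python
import pandas as pd
from statsmodels.tsa.stattools import adfuller

def determine_d(series: pd.Series, alpha: float = 0.05, max_d: int = 2) -> int:
    """Return the smallest differencing order d at which the ADF test rejects a unit root."""
    y = series.dropna()
    for d in range(max_d + 1):
        p_value = adfuller(y, autolag="AIC")[1]   # element 1 of the result is the p-value
        if p_value < alpha:
            return d                              # series is stationary at this order
        y = y.diff().dropna()                     # difference once more and retest
    return max_d

# Example (cpu_load is a container's CPU utilization series):
# d = determine_d(cpu_load)
```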
3.2.2.2 Step 2 model order determination
For the stationary series, the Autocorrelation Function (ACF) and Partial Autocorrelation Function (PACF) plots were analyzed to tentatively identify the autoregressive order p and moving average order q. The ACF plot helps identify the influence of past values, while the PACF plot helps identify the direct effect of a specific lag. The following standard guidelines were followed:
• A tailing-off ACF and a sharp cutoff in the PACF after lag p suggest an AR(p) model.
• A sharp cutoff in the ACF after lag q and a tailing-off PACF suggest an MA(q) model.
• If both the ACF and PACF tail off, a mixed ARMA(p, q) model is indicated.
• The Akaike Information Criterion (AIC) was subsequently used to compare models with different (p, q) combinations, and the model with the lowest AIC was selected for its optimal balance of goodness-of-fit and parsimony.
The sample ACF and PACF are calculated using Equations 5, 6:

$\hat{\rho}_k = \dfrac{\sum_{t=k+1}^{T} (y_t - \bar{y})(y_{t-k} - \bar{y})}{\sum_{t=1}^{T} (y_t - \bar{y})^{2}}$  (5)

$\hat{\phi}_{kk} = \mathrm{Corr}\left(y_t,\ y_{t-k} \mid y_{t-1}, \ldots, y_{t-k+1}\right)$  (6)

where $\hat{\rho}_k$ is the autocorrelation at lag $k$ and $\hat{\phi}_{kk}$ is the partial autocorrelation at lag $k$, obtained in practice via the Durbin-Levinson recursion.
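In practice, the visual ACF/PACF inspection is complemented by an AIC grid search over candidate (p, q) pairs; a minimal sketch is shown below, with the candidate ranges and series names chosen only for illustration.

```python
import itertools
from statsmodels.tsa.statespace.sarimax import SARIMAX

def select_order(y, exog, d, max_p=3, max_q=3):
    """Fit candidate ARIMAX(p, d, q) models and return the order with the lowest AIC."""
    best_order, best_aic = None, float("inf")
    for p, q in itertools.product(range(max_p + 1), range(max_q + 1)):
        try:
            fit = SARIMAX(y, exog=exog, order=(p, d, q),
                          enforce_stationarity=False,
                          enforce_invertibility=False).fit(disp=False)
            if fit.aic < best_aic:
                best_order, best_aic = (p, d, q), fit.aic
        except Exception:
            continue  # skip candidates that fail to converge
    return best_order, best_aic

# Example: best_order, best_aic = select_order(cpu_load, dti_series, d=0)
```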
3.2.2.3 Step 3 model validation
After the initial model construction, the residual sequence is tested for white noise properties. This is assessed by examining the ACF and PACF plots of the residuals. If the majority of the autocorrelation and partial autocorrelation coefficients lie within the confidence intervals, the model is considered to have effectively captured the information in the data. Otherwise, the parameters p and q are adjusted for re-fitting. Upon successful validation of the ARIMA (p,d,q) model, the quantified enterprise digital transformation indicators are incorporated as exogenous variables, forming the ARIMAX (p,d,q) model.
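A compact residual diagnostic consistent with this step, applying the Ljung-Box test to the fitted model's residuals, could look as follows (the lag choices are illustrative).

```python
from statsmodels.stats.diagnostic import acorr_ljungbox

def residuals_are_white_noise(fitted_model, lags=(10, 20), alpha=0.05) -> bool:
    """Ljung-Box test on model residuals: True if no significant autocorrelation remains."""
    lb = acorr_ljungbox(fitted_model.resid, lags=list(lags), return_df=True)
    return bool((lb["lb_pvalue"] > alpha).all())  # all p-values above alpha -> white noise

# Example: if not residuals_are_white_noise(fit), re-tune (p, q) and refit.
```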
3.2.2.4 Step 4 series forecasting
Once a valid ARIMA (p,d,q) model was established, the DTI time series was incorporated as an exogenous variable, and the resulting ARIMAX model was used to generate multi-step-ahead forecasts of cloud resource demand over the planning horizon.
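A minimal end-to-end sketch of this step with statsmodels is given below; the toy series stand in for the real CPU-load and DTI data, and future DTI values are assumed known over the forecast horizon.

```python
import numpy as np
import pandas as pd
from statsmodels.tsa.statespace.sarimax import SARIMAX

# Toy stand-ins for the real data (cpu_load: historical CPU demand; dti: aligned DTI values).
rng = np.random.default_rng(0)
cpu_load = pd.Series(1000 + np.cumsum(rng.normal(0, 5, 200)))
dti = pd.Series(np.linspace(95, 120, 200))          # exogenous Digital Transformation Index
dti_future = pd.Series(np.linspace(120, 125, 30))   # DTI assumed known over the forecast horizon

model = SARIMAX(cpu_load, exog=dti, order=(1, 0, 1))
fit = model.fit(disp=False)

forecast = fit.forecast(steps=len(dti_future), exog=dti_future)
peak_demand = forecast.max()   # sizing input for the capacity-expansion step
print(round(peak_demand, 1))
```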
3.3 Real-time cloud resource utilization diagnosis and optimization triggering
3.3.1 Business-critical threshold specification
The safety threshold for resource utilization is determined using Equation 7, based on business criticality:

$U_{\mathrm{safe}} = \dfrac{1}{1 + k}$  (7)

where $k$ is the safety buffer coefficient, chosen according to the criticality of the workload. For critical services (e.g., real-time transaction processing, customer support), $k$ is set toward the upper end of its recommended range (e.g., $k = 0.5$, giving $U_{\mathrm{safe}} \approx 0.67$), reserving a larger capacity margin; smaller values may be used for non-critical workloads.
This threshold $U_{\mathrm{safe}}$ defines the maximum fraction of provisioned capacity that may be consumed before an expansion alert is raised.
3.3.2 Predictive threshold activation mechanism
Utilizing the demand forecasts produced by the ARIMAX model, the alert condition for resource $i$ over the forecast horizon is given by Equation 8:

$\dfrac{\hat{y}_{i,t+h}}{C_i} > U_{\mathrm{safe}}$  (8)

where $\hat{y}_{i,t+h}$ is the forecast demand for resource $i$ at horizon $h$, $C_i$ is its currently provisioned capacity, and $U_{\mathrm{safe}}$ is the safety alert threshold from Equation 7.
If the condition is met, meaning the predicted demand ratio exceeds the safety alert threshold, resource $i$ is judged to face a risk of supply shortage in the future period, triggering an expansion alarm or an architecture optimization requirement. If the condition is not met, the water level of resource $i$ is safe and immediate expansion is not required.
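The activation logic of Equations 7, 8 reduces to a single comparison per resource, as in the sketch below (the capacity and forecast figures in the example call are hypothetical).

```python
def expansion_alert(peak_forecast: float, capacity: float, k: float = 0.5) -> bool:
    """True if the predicted demand ratio exceeds the water level alert threshold 1/(1+k)."""
    u_safe = 1.0 / (1.0 + k)           # e.g. k = 0.5 -> threshold ~0.67
    return peak_forecast / capacity > u_safe

# Hypothetical example: peak forecast of 2,600 cores against 2,954 provisioned cores
print(expansion_alert(peak_forecast=2600, capacity=2954))
```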
3.4 Predictive-driven investment optimization under operational constraints
3.4.1 Objective function
The objective function minimizes the total cost $Z$ of the capacity expansion, expressed in net present value terms, as given by Equation 9:

$\min\ Z = \sum_{i} \left( c_i^{\mathrm{buy}}\, x_i + \sum_{t=1}^{T} \dfrac{c_i^{\mathrm{lease}}\, z_i}{(1 + r)^{t}} \right)$  (9)

where, for each resource $i$, $x_i$ is the self-built (purchased) capacity and $z_i$ the leased capacity to be added, $c_i^{\mathrm{buy}}$ and $c_i^{\mathrm{lease}}$ are the corresponding unit purchase and annual leasing costs, $r$ is the discount rate, and $T$ is the planning horizon.
The objective is to minimize the total cost by optimizing the proportion of leased and purchased resources.
3.4.2 Unified constraint framework
To ensure rational and feasible resource allocation, the following constraints are defined:
Constraint 1 Resource Demand Forecast Constraint. As shown in Equation 10:

$C_i + x_i + z_i \geq \hat{y}_i^{\max}$  (10)

where $C_i$ is the existing capacity of resource $i$ and $\hat{y}_i^{\max}$ is its forecast peak demand.
This constraint ensures that resource allocation meets the minimum demand requirement, preventing service disruptions due to underestimation.
Constraint 2 Resource Supply Constraint (Water Level Alert Line). As shown in Equation 11:

$\dfrac{\hat{y}_i^{\max}}{C_i + x_i + z_i} \leq U_{\mathrm{safe}}$  (11)

This constraint enforces the safety margin defined by the water level alert threshold, so that utilization after expansion remains below the alert line even at peak demand.
Constraint 3 Natural Number Constraint. As shown in Equation 12:

$x_i,\ z_i \in \mathbb{N}$  (12)

This ensures non-negative integer resource allocation, aligning with practical procurement and leasing requirements.
Constraint 4 Cloud Leasing Cap Constraint. As shown in Equation 13:

$z_i \leq \lambda\,(x_i + z_i)$  (13)

where $\lambda$ is the maximum permitted leasing ratio (set to 50% in the simulation).
This prevents over-reliance on leasing, which could inflate long-term costs.
Constraint 5 Budget Constraint. As shown in Equation 14:

$Z \leq B$  (14)

where $B$ is the total investment budget available for the expansion.
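Using the notation of Equations 9–14, a candidate expansion plan can be screened for feasibility as sketched below; all unit costs, capacities, and limits are placeholders rather than the parameter values used in Section 4.

```python
from dataclasses import dataclass

@dataclass
class Resource:
    capacity: float         # existing capacity C_i
    peak_forecast: float    # forecast peak demand
    buy_unit_cost: float    # NPV cost per self-built capacity unit
    lease_unit_cost: float  # NPV cost per leased capacity unit over the horizon

def is_feasible(plan, resources, k=0.5, lease_cap=0.5, budget=2.22e6):
    """Check Constraints 1-5 for a plan {name: (x_selfbuilt, z_leased)} in capacity units."""
    u_safe = 1.0 / (1.0 + k)
    total_cost = 0.0
    for name, r in resources.items():
        x, z = plan[name]
        if x < 0 or z < 0 or x != int(x) or z != int(z):   # Constraint 3: non-negative integers
            return False
        new_cap = r.capacity + x + z
        if new_cap < r.peak_forecast:                       # Constraint 1: meet forecast demand
            return False
        if r.peak_forecast / new_cap > u_safe:              # Constraint 2: water level alert line
            return False
        if (x + z) > 0 and z > lease_cap * (x + z):         # Constraint 4: leasing cap
            return False
        total_cost += r.buy_unit_cost * x + r.lease_unit_cost * z
    return total_cost <= budget                             # Constraint 5: budget ceiling
```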
4 Simulation experiments
4.1 Experimental setup
To evaluate the effectiveness of the proposed model (Section 3), we conducted simulation experiments using the publicly available cluster dataset cluster_trace_v2018 from Alibaba Cloud. The detailed cluster parameters in this dataset are summarized in Table 4.
The dataset encompasses operational data from 4,000 servers, including both online application containers and offline computational tasks. To simplify the experimental process, we used CPU utilization data from two randomly selected containers on one machine to simulate the daily average operational states of containers a and b in Company A’s current infrastructure, resulting in 1,922 data points (Xie and Dong, 2025).
Container a had 2,954 CPU cores, whereas Container b featured 14,378 CPU cores. The CPU usage trends for containers are shown in Figure 2 (Container a) and Figure 3 (Container b).
As shown in Figures 2, 3, both Containers a and b currently operate in a high-utilization mode. Since they are critical for the enterprise’s real-time business operations, the enterprise must allocate sufficient resource reserves. The safety buffer coefficient is set to k = 0.5, resulting in a water level alert threshold of 0.67 for both containers. This indicates a clear need for capacity expansion. To ensure the continuity of daily operations and mitigate risks from extreme events, simulation-driven investment analysis is conducted to expand the capacity of Containers a and b.
4.2 Implementation framework
4.2.1 Data partitioning and environment
The dataset was partitioned such that the first 80% of the data was used as the training set, and the remaining 20% was allocated to the test set (Li et al., 2025; Liang et al., 2023). We conducted simulation experiments using Python 3.13. The hardware configuration details for this experiment are summarized in Table 5.
4.2.2 Digital maturity quantification
The Digital Transformation Index developed in this study is designed to capture the dynamic evolutionary nature of corporate digitalization processes. Since the cluster dataset used in this experiment covers only the period up to early 2018, we extended the evaluation of digital transformation levels from 2018 to 2020 by incorporating enterprises’ historical development paths and external environmental analyses. This method ensures both logical consistency in the research design and realism in the decision-making context. The resulting assessments not only supply exogenous variables for the subsequent ARIMAX model but also offer a contextual foundation that more accurately reflects actual transformation phases for investment decision simulations.
Building on the methods in Section 3.1, Table 6 summarizes the quantitative results of digital transformation maturity and contextual insights for the enterprise.
4.2.3 Cloud demand forecasting in digital transformation
4.2.3.1 Model order identification and validation
To determine the optimal parameter combination for the ARIMAX model, we first conducted stationarity tests and correlation analysis on the CPU utilization time series for Containers a and b. Figures 4, 5 present the Autocorrelation Function (ACF) and Partial Autocorrelation Function (PACF) for Container a, respectively, while Figures 6, 7 display the corresponding functions for Container b.
From the ACF and PACF plots, both containers’ sequences exhibit significant autocorrelation structures. Based on these characteristics and the Akaike Information Criterion (AIC), we identified the optimal ARIMAX model orders as:
Container a: p = 1, d = 0, q = 1.
Container b: p = 1, d = 0, q = 1.
4.2.3.2 Model diagnostic tests
To verify the adequacy of the ARIMAX (1,0,1) model specification, we performed Ljung-Box white noise tests, presented in Tables 7, 8.
Additionally, the residual statistics for both containers are provided in Table 9.
All Ljung-Box test p-values exceed the 0.05 significance level, indicating that the residual sequences of both models are white noise, thus confirming model adequacy. Although Container a’s residuals show slightly elevated kurtosis (5.23), it remains within acceptable limits.
4.2.3.3 Exogenous variable significance
In the forecasting model for Container a, the exogenous variable DTI exhibited a statistically significant coefficient of 943.1211 (p = 0.001), indicating a strong positive impact on cloud resource demand. A comparable trend was observed in the model for Container b.
4.2.3.4 Forecasting implementation
We employed the ARIMAX model to train and forecast the load time series for Container a and Container b individually. The results of the training are illustrated in Figure 8 (Container a) and Figure 9 (Container b).
To guarantee resource adequacy, the maximum values from the operational load predictions were selected after applying the model. The five largest predicted values for Containers a and b, in descending order, are presented in Table 10.
4.2.4 Forecasting performance comparison: ARIMAX vs. ARIMA
To validate the predictive accuracy enhancement achieved by the proposed ARIMAX model, a comparative analysis was conducted against the traditional ARIMA model. Utilizing the same training and testing datasets for both containers, we evaluated the forecast performance through error distribution analysis and statistical metrics.
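For reproducibility, the evaluation metrics can be computed as in the following minimal sketch (array names are illustrative; the mean-error helper uses the absolute form, which is an assumption about how the reported improvement was measured).

```python
import numpy as np

def mape(actual, predicted) -> float:
    """Mean Absolute Percentage Error, in percent."""
    actual, predicted = np.asarray(actual, float), np.asarray(predicted, float)
    return float(np.mean(np.abs((actual - predicted) / actual)) * 100)

def mean_abs_error(actual, predicted) -> float:
    """Mean absolute forecast error, used here for the ARIMAX vs. ARIMA comparison."""
    return float(np.mean(np.abs(np.asarray(actual, float) - np.asarray(predicted, float))))

# Example: mape(y_test, arimax_forecast), mean_abs_error(y_test, arima_forecast)
```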
4.2.4.1 Visual error distribution analysis
The forecast error distributions for both models are visually compared in Figures 10–13, providing insights into the prediction accuracy across different modeling approaches.
Container a Performance:
Figure 10 illustrates the error distribution of the ARIMAX model for Container a, showing errors concentrated around zero with minimal dispersion.
Figure 11 displays the corresponding error distribution for the ARIMA model, revealing more scattered errors with greater variance.
Container b Performance:
Figure 12 presents the ARIMAX model’s error distribution for Container b, demonstrating similar concentration around zero.
Figure 13 shows the ARIMA model’s error distribution for Container b, exhibiting comparable dispersion patterns to Container a.
4.2.4.2 Statistical significance
The consistent reduction in mean error across both containers (approximately 54%–55% improvement) provides strong evidence for the superior predictive capability of the ARIMAX framework. This enhancement can be attributed to the incorporation of the Digital Transformation Index as an exogenous variable, which captures the systematic influence of enterprise digital maturity on cloud resource demand patterns.
The concentrated error distribution around zero in the ARIMAX models (Figures 10, 12) indicates reduced bias and variance, confirming the model’s ability to more accurately track actual resource utilization trends compared to the traditional ARIMA approach (Figures 11, 13).
4.2.5 Resource expansion constraints
Given that both Containers a and b support real-time business-critical services (e.g., transaction processing), we set the safety buffer coefficient k = 0.5, consistent with the upper range recommended in Section 3.3.1 for high-criticality workloads. This sets the water level alert threshold at 67% of total capacity. The minimum expansion requirement is derived from the safety threshold condition using Equation 15:

$\Delta C_i \geq \dfrac{\hat{y}_i^{\max}}{U_{\mathrm{safe}}} - C_i = (1 + k)\,\hat{y}_i^{\max} - C_i$  (15)
For Container a, the minimum scaling requirement is calculated by substituting the relevant parameters into Equation 16:
For Container b, the minimum scaling requirement is calculated by substituting the relevant parameters into Equation 17:
The expansion is subject to a total budget of $2.22 million and a constraint that leased resources constitute no more than 50% of the total added capacity. This leasing cap balances operational flexibility with long-term cost control. For the economic evaluation, purchased equipment is amortized over a 10-year service life, consistent with the typical useful life of enterprise server hardware, and future costs are discounted at an 8% rate, reflecting the standard cost of capital in the Chinese IT sector (Zhang et al., 2024). The market-quoted CPU leasing prices for Containers a and b are provided in Table 11.
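The lease-versus-build comparison rests on discounting recurring costs at the stated 8% rate over the 10-year service life; a minimal NPV helper is sketched below, with the price figures serving only as placeholders.

```python
def npv_of_annual_cost(annual_cost: float, rate: float = 0.08, years: int = 10) -> float:
    """Net present value of a recurring annual cost over the equipment service life."""
    return sum(annual_cost / (1 + rate) ** t for t in range(1, years + 1))

# Hypothetical comparison: an upfront server purchase versus leasing equivalent capacity
purchase_price = 120_000.0     # placeholder upfront cost of a self-built server
annual_lease_fee = 20_000.0    # placeholder annual leasing fee for equivalent capacity
print("NPV of leasing over 10 years:", round(npv_of_annual_cost(annual_lease_fee), 2))
print("Upfront purchase cost:", purchase_price)
```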
When expanding Containers a and b via self-built servers, refer to the detailed specifications listed in Table 12 (Container a) and Table 13 (Container b).
4.2.6 Genetic algorithm optimization implementation
This study employs a Genetic Algorithm (GA) to solve the mixed-integer nonlinear programming problem, primarily for the following reasons (Huang, 2024; Xu et al., 2023): all decision variables are integers, which matches GA's ability to handle discrete variables; the constraints include nonlinear relationships, making traditional linear programming methods difficult to apply directly; a global search capability is required to avoid converging to local optima; and the problem scale is moderate, which suits population-based intelligent optimization methods.
4.2.6.1 Parameter configuration
To ensure algorithm convergence and solution quality, the genetic algorithm parameters were configured as shown in Table 14 after multiple experimental trials and debugging:
4.2.6.2 Constraint handling mechanism
For the nonlinear constraints in the problem, the algorithm employs a penalty function method. The degree of constraint violation is transformed into penalty terms added to the objective function, ensuring the search process consistently moves toward the feasible region. Specific constraints include seven inequality constraints related to production capacity, efficiency, proportional allocation, and budget.
Through the aforementioned parameter configuration and algorithm design, the genetic algorithm can effectively search for the global optimal solution within the complex feasible solution space, providing a reliable theoretical basis and numerical results for subsequent decision analysis.
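A compact, self-contained version of the penalty-based genetic algorithm described above is sketched below. The population size, operator rates, and the cost and violation callbacks are placeholders rather than the exact configuration of Table 14, and elitism is omitted for brevity.

```python
import numpy as np

rng = np.random.default_rng(42)

def penalized_cost(x, cost_fn, violation_fn, penalty=1e9):
    """Objective value plus a large penalty proportional to the total constraint violation."""
    return cost_fn(x) + penalty * violation_fn(x)

def genetic_algorithm(cost_fn, violation_fn, lower, upper,
                      pop_size=100, generations=300, cx_rate=0.8, mut_rate=0.1):
    """Minimize an integer decision vector with tournament selection, uniform crossover,
    and random-reset mutation; pop_size is assumed even."""
    dim = len(lower)
    lower, upper = np.asarray(lower), np.asarray(upper)
    pop = rng.integers(lower, upper + 1, size=(pop_size, dim))
    for _ in range(generations):
        fitness = np.array([penalized_cost(ind, cost_fn, violation_fn) for ind in pop])
        # Tournament selection: keep the better of two random individuals
        idx = rng.integers(0, pop_size, size=(pop_size, 2))
        parents = pop[np.where(fitness[idx[:, 0]] < fitness[idx[:, 1]], idx[:, 0], idx[:, 1])]
        # Uniform crossover between consecutive parent pairs
        mask = rng.random((pop_size // 2, dim)) < cx_rate
        a, b = parents[0::2].copy(), parents[1::2].copy()
        a[mask], b[mask] = parents[1::2][mask], parents[0::2][mask]
        children = np.vstack([a, b])
        # Random-reset mutation within the variable bounds
        mut = rng.random(children.shape) < mut_rate
        children[mut] = rng.integers(lower, upper + 1, size=children.shape)[mut]
        pop = children
    fitness = np.array([penalized_cost(ind, cost_fn, violation_fn) for ind in pop])
    best = pop[np.argmin(fitness)]
    return best, cost_fn(best)

# cost_fn should return the NPV cost of a candidate plan; violation_fn should return the
# summed violation of the seven inequality constraints (zero when all are satisfied).
```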
5 Results
5.1 Optimal resource configuration
The computational results are tabulated in Table 15.
5.2 Container-specific validation
The validation results for each container’s optimal configuration are detailed in Table 16, confirming compliance with all operational constraints.
The validation confirms that both container configurations comply with all operational constraints. The hybrid model for Container a optimally balances cost-efficiency with operational flexibility, whereas the exclusive self-build strategy for Container b is justified by its stable, high-demand profile, favoring long-term cost savings.
5.3 Feasibility assessment
The final feasibility of the proposed investment strategy is summarized as follows:
• Total NPV cost: $2.07 million
• Budget constraint: $2.22 million
• Utilization rate: 93.67%
The genetic algorithm solution achieves a 93.67% budget utilization ($2.07M of the $2.22M ceiling), which demonstrates high cost-efficiency in practice. This near-optimal expenditure indicates that the model successfully identified a configuration that maximizes resource acquisition within the financial limit, while the slight underspend (6.33%) provides a valuable financial buffer for unforeseen contingencies or future scaling needs.
6 Discussion
This study developed an integrated framework for cloud resource forecasting and investment decision-making by incorporating corporate digital maturity, yielding the following core findings:
• Enhanced Forecasting Accuracy: The ARIMAX model, enriched with the DTI as an exogenous variable, demonstrated a significant improvement in predicting cloud resource demand. The comprehensive DTI evaluation system, built upon four primary dimensions and twelve secondary indicators, successfully quantifies digital maturity. Simulation results confirmed that the DTI-embedded model achieves a CPU load peak prediction error of less than 5%, with the DTI coefficient being statistically significant (p < 0.01). This robustly validates a strong correlation between an enterprise’s digital transformation level and its cloud resource consumption.
• Optimized Investment Strategy: The proposed investment model effectively addresses the core “lease-or-build” dilemma under budgetary constraints (Lu, 2024; Wang, 2025). By evaluating the trade-offs between self-built data centers and third-party cloud leasing, the model achieves dual objectives: optimal resource allocation and stringent cost control. The simulation experiment, conducted within a $2.22 million budget ceiling, yielded an optimal configuration of 10 self-built servers and 231 leased units for Container a, and 9 self-built servers for Container b, with a total cost of $2.07 million.
• Comprehensive Framework Integration: The study presents a holistic framework that seamlessly integrates the DTI, predictive analytics, and multi-constraint optimization. The DTI systematically links strategic digital initiatives with IT resource planning. The optimization model, incorporating constraints such as budget, leasing ratios, and safety thresholds, enables a synergistic resource configuration that aligns long-term strategic goals with dynamic operational needs. This integration provides a theoretically sound and operationally viable solution for cloud resource planning in the context of digital transformation.
6.1 Limitations and future work
While this study provides a novel framework, it is subject to several limitations. Firstly, the model was validated using a dataset from a single cloud provider (Alibaba Cloud). The generalizability of the findings to other cloud ecosystems (e.g., AWS, Azure) or enterprise-specific IT environments with unique workload patterns may be limited and warrants further investigation. Secondly, the DTI, though comprehensive, may not capture all facets of digital transformation equally across different industries. Future research should aim to test and calibrate this model with multi-source datasets and explore industry-specific DTI adaptations. Furthermore, the model assumes static pricing and technology, whereas incorporating real-time spot market prices and evolving hardware specifications could enhance its practical utility.
7 Conclusion
This study successfully developed an integrated framework that bridges corporate digital maturity with cloud resource management. By introducing a quantified Digital Transformation Index (DTI) as a key exogenous variable into an ARIMAX forecasting model, we achieved high-precision prediction of cloud resource demand (MAPE <5%). The subsequent optimization model, solving for a cost-minimal investment strategy under multiple operational and financial constraints, efficiently allocated resources, utilizing 93.67% of a $2.22 million budget.
The primary theoretical contribution lies in establishing and validating the critical link between digital maturity and IT resource demand. Methodologically, the seamless “predict-then-optimize” framework demonstrates significant practical utility. It provides enterprises with a scalable, economically viable, and decision-support tool for navigating cloud investment choices during digital transformation. Future work will focus on enhancing the model’s dynamism by incorporating real-time market fluctuations and supply chain factors, thereby increasing its adaptability and robustness for real-world applications.
Data availability statement
The original contributions presented in the study are included in the article/Supplementary Material, further inquiries can be directed to the corresponding author.
Author contributions
ZC: Conceptualization, Project administration, Writing – review and editing. YT: Supervision, Writing – review and editing. FW: Methodology, Writing – review and editing. XW: Formal Analysis, Validation, Visualization, Writing – original draft. MX: Data curation, Formal Analysis, Writing – original draft. ZW: Investigation, Writing – review and editing.
Funding
The author(s) declared that financial support was received for this work and/or its publication. This research was funded by Economic and Technology Research Institute of State Grid Fujian Electric Power Company, grant number 52130N24000U.
Conflict of interest
Authors ZC and FW were employed by Digitalization Department, State Grid Fujian Electric Power Company. Author YT was employed by Economic and Technical Research Institute, State Grid Fujian Electric Power Company.
The remaining author(s) declared that this work was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
The authors declared that this work received funding from Economic and Technology Research Institute of State Grid Fujian Electric Power Company. The funder had the following involvement in the study: Supervision, Writing-review and editing.
Generative AI statement
The author(s) declared that generative AI was not used in the creation of this manuscript.
Any alternative text (alt text) provided alongside figures in this article has been generated by Frontiers with the support of artificial intelligence and reasonable efforts have been made to ensure accuracy, including review by the authors wherever possible. If you identify any issues, please contact us.
Publisher’s note
All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.
Supplementary material
The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/frcmn.2025.1732098/full#supplementary-material
References
Calheiros, R. N., Masoumi, E., Ranjan, R., and Buyya, R. (2015). Workload prediction using ARIMA model and its impact on cloud applications' QoS. IEEE Trans. Cloud Comput. 3 (04), 449–458. doi:10.1109/TCC.2014.2350475
Gao, C. (2015). Research and implementation of load balancing method in cloud stack platform. [Harbin (China)]: Harbin Institute of Technology. [M.S. thesis].
Huang, W. (2024). Water resource allocation optimization method based on GM-VMD water demand prediction. Water Resour. Technol. Superv. 10, 165–170.
Karpagam, T., and Kanniappan, J. (2025). Symmetry-aware multi-dimensional attention spiking neural network with optimization techniques for accurate workload and resource time series prediction in cloud computing systems. Symmetry 17 (3), 383. doi:10.3390/sym17030383
Li, Q. (2022). Research on equipment leasing mode selection in engineering management. Port. Eng. Technol. 59 (04), 97–100. doi:10.16403/j.cnki.ggjs20220422
Li, J. (2025). Industrial digitalization guiding comprehensive enterprise digital construction framework. China Mark. 12, 191–194. doi:10.13939/j.cnki.zgsc.2025.12.047
Li, S., Yu, K., and Chen, Y. (2025). Research on resource usage prediction in high-performance computing platforms based on ARIMA and LSTM. Comput. Sci. 52 (09), 1–11.
Liang, R., Xie, X., Zhai, Q., and Zhang, Q. (2023). Research on container cloud load prediction based on improved stacking ensemble model. Comput. Appl. Softw. 40 (12), 48–55+100.
Liu, W. (2023). Research on the optimization path of capital allocation for state-owned capital investment and operation companies. Bus. News (11), 151–154.
Liu, H., Jing, S., and Liu, T. (2010). Financial decision-making methods for leasing or purchasing fixed assets. Acc. Mon. 16, 15–16. doi:10.19641/j.cnki.42-1290/f.2010.16.007
Lu, Y. (2024). Investment strategies and asset allocation analysis in financial markets. Bus. Inf. 10, 83–86.
Malik, S., Tahir, M., Sardaraz, M., and Alourani, A. (2022). A resource utilization prediction model for cloud data centers using evolutionary algorithms and machine learning techniques. Appl. Sci. 12 (4), 2160. doi:10.3390/app12042160
Mi, H., Wang, H., Yin, G., Shi, D., Zhou, Y., and Yuan, L. (2011). A resource on-demand reconfiguration method for virtualized data centers. Softw. 22 (9), 2193–2205. doi:10.3724/sp.j.1001.2011.04056
Nalewaik, A. (2025). “A hybrid approach to benefits planning for capital projects,” in Proceedings of the 2025 IEEE European Technology and Engineering Management Summit (E-TEMS) (IEEE), 86–91. doi:10.1109/E-TEMS64751.2025.11239339
Pan, H., and Hu, G. (2025). Can enterprises generate new productive forces through digital transformation? An empirical study from the perspective of technological innovation. Technol. Econ. 44 (02), 31–42.
Serban, F., and Dedu, S. (2025). A scalarized entropy-based model for portfolio optimization: balancing return, risk and diversification. Mathematics 13 (20), 3311. doi:10.3390/math13203311
Stefanidis, V.-A., Verginadis, Y., and Mentzas, G. (2023). MulticloudFL: adaptive federated learning for improving forecasting accuracy in multi-cloud environments. Information 14 (12), 662. doi:10.3390/info14120662
Su, J. (2020). The implementation of asset allocation approaches: theory and evidence. Sustainability 12 (17), 7162. doi:10.3390/su12177162
Wang, Z. (2025). Research on factors influencing investment asset allocation efficiency of Chinese insurance companies. Mark. Wkly. 38 (02), 42–45.
Wang, Y., and Chen, T. (2025). BO-LSTM-Based cloud resource consumption prediction model. Comput. Eng. Des. 46 (5), 1418–1423. doi:10.16208/j.issn1000-7024.2025.05.024
Xie, X., and Dong, Y. (2025). A container cloud resource prediction model based on secondary decomposition and broad learning system. Lab. Res. Explor. 44 (03), 94–100.
Xing, Q., Wu, P., and Deng, F. (2025). Dependency analysis between industrial internet platforms and manufacturing participants under digital transformation. Manag. Eng. 39 (05), 1–18.
Xu, Y., Hong, Y., He, L., Hong, F., Zhang, Y., Hou, F., et al. (2023). Equity and resource prediction of child health human resources in maternal and child health institutions in Guizhou province. Chin. Health Resour. 26 (05), 582–588. doi:10.13688/j.cnki.chr.2023.230147
Yang, X., Zheng, Q., Zhu, X., Luo, M., Hou, Z., Zhang, J., et al. (2025). DimAug-TimesFM: dimension augmentation for long-term cloud demand forecasting in few-shot scenarios. Appl. Sci. 15 (7), 3450. doi:10.3390/app15073450
Zhang, X., and Zhang, L. (2017). A review of theoretical research on asset allocation. Econ. Dyn. 2, 137–147.
Zhang, W., Duan, P., Yang, L. T., Xia, F., Li, Z., Lu, Q., et al. (2017). Resource requests prediction in the cloud computing environment with a deep belief network. Softw. Pract. Exper. 47 (03), 472–488. doi:10.1002/spe.2426
Zhang, J., Ouyang, S., Wu, H., Xin, X., and Huang, W. (2024). Optimal configuration of grid-side energy storage considering distribution network reliability and operational economy. Power Syst. Autom. Equip. 44 (07), 62–68+85. doi:10.16081/j.epae.202312044
Keywords: ARIMAX model, cloud resource forecasting, digital transformation, decision support systems, optimization techniques
Citation: Chen Z, Tang Y, Wu F, Wang X, Xia M and Wang Z (2026) Optimization of cloud resource demand forecasting and investment decisions in the context of digital transformation. Front. Commun. Netw. 6:1732098. doi: 10.3389/frcmn.2025.1732098
Received: 25 October 2025; Accepted: 11 December 2025;
Published: 07 January 2026.
Edited by:
Lukman Adewale Ajao, Federal University of Technology Minna, Nigeria
Reviewed by:
Farhan Nisar, Qurtuba University of Science and Information Technology, Pakistan
Sai Bharath Sannareddy, Abbott, United States
Copyright © 2026 Chen, Tang, Wu, Wang, Xia and Wang. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
*Correspondence: Minghui Xia, 2549621915@qq.com