Research on Optimal Scheduling of VPP Based on Latin Hypercube Sampling and K-Means Clustering

A VPP economic dispatch model based on a classical scenario set is proposed that takes into account the output uncertainty of distributed power sources. The basic model of the VPP is analyzed first, and an operation strategy is then proposed on that basis that accounts for the impact of the time-of-use electricity price on the economics of the VPP. Latin hypercube sampling combined with K-means clustering is used to generate the classical scenario set, and the model is solved with an improved particle swarm optimization (PSO) algorithm that incorporates a genetic mechanism. Finally, the model is verified with a calculation example designed around two cases, the classical scenario set and a general scenario, and the optimal configuration results are compared and analyzed. The comparison confirms that the optimal configuration of the VPP under the classical scenario set can improve the net income of the VPP.


INTRODUCTION
The rapid development of technologies that rely on natural energy sources has led to massive consumption of fossil energy. Many solutions have been proposed to mitigate the resulting problems, such as air pollution and global warming, and the concept of the virtual power plant (VPP) is one of them. The VPP provides an effective means of managing distributed generation (DG) at a time when distributed energy is growing rapidly. At present, there is no authoritative definition of the VPP. The most widely accepted view is that it uses advanced control, communication, and computing technologies to aggregate different types of distributed energy sources in a distributed network and to redistribute their energy so that these DGs operate as a whole, which also effectively mitigates the instability of distributed energy sources.
The VPP has been studied in great depth in the literature. The concept of the responsive VPP was first introduced in the literature (Department of Energy, 2006) during the theoretical exploration and proof stage; depending on how the response is achieved, two working models of the VPP were proposed, namely incentive-based power plant models and lump sum-based power plant models, and these were verified through calculation examples. A comprehensive account of the VPP is presented in the literature (Xia and Liu, 2016; Lin, 2017; Gong, 2018; Li et al., 2021). Hong et al. (2017), Wang (2017), Yuan (2017), Wen and Guoen (2019), Liu et al. (2020), and Yang et al. (2020) investigate the management of renewable energy sources in the grid and consider the VPP able to integrate distributed energy sources optimally. In the literature (Pudjianto et al., 2007; Zhou and Lin, 2019), distributed energy sources, such as CHP units, are aggregated to form a VPP that participates in market trading, realizing a VPP model for distributed energy sources such as energy storage and demand response. You et al. (2009) consider the uncertainty of power market prices and of new energy generation, such as wind and PV, consider both controllable and uncontrollable distributed power sources, and propose a day-ahead bidding strategy for the VPP with the goal of maximizing economic efficiency. Yuan et al. (2016) investigate the economic efficiency of the VPP based on the particle swarm optimization (PSO) algorithm, taking into account time-of-use electricity prices. The strategies proposed in the literature (Soltani et al., 2012) consider the effect of reliability and determine the optimal hourly operating strategy for distributed energy resources (DERs) by Monte Carlo simulation. However, these scheduling strategies are proposed based on deterministic market prices.
As the share of the VPP in the power system increases, the operation and scheduling of the VPP becomes a problem that must be faced. The abovementioned literature summarizes the VPP and even proposes a day-ahead bidding strategy for it, but the VPP contains a large number of distributed power sources whose power output is uncertain. In current power systems, this uncertainty is usually offset by reserving spinning reserve, as carried out by Yi and Li (2018); the so-called spinning reserve method compensates the prediction error at a certain confidence level with a certain spinning reserve capacity. Robust optimization methods are also used to deal with uncertainty in the model; although they are effective, their results are often conservative and extreme. Stochastic optimization methods based on scenario sets deal with uncertainty by discretizing a continuous problem into a finite number of scenarios with certain probabilities and then finding the optimal expectation; this usually involves generating multiple scenarios and then reducing them to derive a classical scenario set, such as those obtained by Zhinong et al. (2018). In the literature (Liu, 2018), a Markov chain-based PV ultrashort-term prediction model was constructed from historical PV data; the relationship between the descriptive characteristics of the PV curve and the collection granularity was investigated; the main evaluation indexes reflecting the continuous fluctuation characteristics of PV power were extracted according to the PV power change state; and a multiobjective optimization-based PV power collection granularity calibration model was established.
In the literature, a two-level collaborative optimal scheduling model for the VPP considering carbon neutral benefits was proposed. By considering carbon emissions and carbon neutrality in a multitime-scale optimal scheduling model for the VPP, it further enhances the adaptability and accuracy of the model under a carbon neutral layout and fills the gap of VPP optimal scheduling in the field of carbon emissions. In the literature (Wei et al., 2015; Zhao and Fan, 2019a; Gao, 2019), a double-layer optimal dispatching model of the VPP based on time-of-use electricity prices was proposed, and the study showed that a VPP based on time-of-use electricity prices can maximize revenue, enhance the level of new energy consumption, and ensure the balance of supply and demand in the region. Lin et al. (2021) used stochastic optimization and adaptive robust optimization methods to model the uncertainty of electricity price, wind power output, and demand response and then linearized the model formulation based on the engineering game idea to establish a two-stage, three-level day-ahead dispatch model. The optimal solution is obtained by the PSO algorithm, and it is verified that this model can effectively improve the economy and safety of VPP operation.
The VPP is proposed to integrate various distributed power sources, controllable loads, energy storage devices, etc., and to gather them into a whole through advanced communication technology so that they participate in the operation and dispatch of the grid in a unified manner. The control of the VPP is divided into two types: decentralized control and centralized control. In addition, a new distributed coordination controller proposed in the literature (Sun et al., 2015), combined with a multi-agent consensus algorithm, is applied to the distributed generators of the energy Internet and can maintain the consistency of electrical angles and amplitudes between the energy Internet and the microgrid (MG). In the second part of that article, a control architecture based on a multi-agent system is proposed to describe the information exchange between different parts.
Previous studies of the optimal scheduling of the VPP all gave priority to traditional power generation methods with lower generation costs in order to maximize benefits, while ignoring the consumption of new energy sources, such as wind power and photovoltaics. This wastes new energy generation and actually undermines the ultimate benefits of the VPP. This study establishes an optimal configuration model of the VPP. First, the basic mathematical model of the VPP is analyzed, and the uncertainty factors in the VPP are then handled using Latin hypercube sampling and K-means clustering to form a classical scenario set. Then, with the maximization of the net revenue of the VPP as the objective function and subject to the relevant constraints, the optimal configuration model of a VPP consisting of wind power, photovoltaic power generation, an energy storage system, and a gas turbine is established. Finally, an improved PSO algorithm that incorporates a genetic mechanism is used to solve the model. In addition, a VPP operation strategy is proposed that takes into account the economy of the VPP according to the time-of-use electricity price. The rationality and effectiveness of this strategy are verified by examples.

GENERATION OF CLASSIC SCENE SETS
Latin Hypercube Sampling to Form Multiple Scenes
Latin hypercube sampling is a method proposed by M. D. McKay, R. J. Beckman, and W. J. Conover in 1979 that makes effective use of the probability distribution of the sampled random variables (McKay et al., 2012). It is a typical stratified sampling method: the whole sampling region can be covered with a small number of non-repeating samples. The sampling is performed by the following steps:
1) Divide the cumulative probability scale from 0 to 1 into equal intervals.
Let the random variable $A$ be the object of study and let its probability distribution function be
$$Y = F(A). \tag{1}$$
Let $B$ be the number of samples. The vertical axis of $Y = F(A)$ is divided into $B$ equal intervals that do not overlap, each of width $1/B$, so that the $i$th interval is
$$\left[\frac{i-1}{B}, \frac{i}{B}\right], \quad i = 1, 2, \ldots, B. \tag{2}$$
2) Generate a random number in each interval.
Let $u$ be a random variable uniformly distributed on the interval (0, 1). A random number $d_i$ in the $i$th interval of Eq. 2 can then be expressed as
$$d_i = \frac{u}{B} + \frac{i-1}{B}. \tag{3}$$
3) Apply the inverse transformation to generate the sampled values.
The sampled values are calculated by the inverse function as follows:
$$A_i = F^{-1}(d_i). \tag{4}$$
The specified $B$ sample values can be obtained by the abovementioned steps. Yang et al. (2013) studied the distributed wind power output using a two-parameter Weibull distribution, and Zhou et al. (2016) studied the distributed PV output using a β distribution.
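The three sampling steps can be sketched in a few lines of Python. This is a minimal illustration rather than the implementation used in the study: the function names, the sample count of 100, and the Weibull shape and scale values are hypothetical, and a closed-form Weibull inverse distribution function stands in for $F^{-1}$.

```python
import numpy as np


def latin_hypercube_1d(n_samples, inverse_cdf, rng):
    """Latin hypercube sampling of one random variable (steps 1 to 3)."""
    i = np.arange(1, n_samples + 1)             # interval index i = 1..B
    u = rng.uniform(0.0, 1.0, size=n_samples)   # one uniform draw per interval
    d = u / n_samples + (i - 1) / n_samples     # point inside the i-th interval
    return inverse_cdf(d), d                    # inverse transformation


def weibull_inverse_cdf(d, shape=2.0, scale=8.0):
    """Inverse CDF of a two-parameter Weibull distribution.

    shape and scale are hypothetical values chosen for illustration only.
    """
    return scale * (-np.log1p(-d)) ** (1.0 / shape)


rng = np.random.default_rng(0)
wind_speed, d = latin_hypercube_1d(100, weibull_inverse_cdf, rng)
```

Each of the 100 equal-width probability intervals contributes exactly one value of `d`, which is what distinguishes this scheme from plain random sampling of the same size.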

K-Means Clustering for Scenario Reduction
K-means clustering minimizes the sum of squares of the Euclidean distance between each sampled point and its nearest cluster center. K-means first selects the initial cluster centers randomly or manually, then divides the data set into clusters (a data point belongs to the cluster whose center is closest to it) and takes the mean of each cluster as its new center. These two steps are repeated until convergence. The main distance metrics are the Manhattan distance, Euclidean distance, Mahalanobis distance, and Chebyshev distance, among others.
Given the set of samples $X = (x_1, x_2, x_3, \ldots, x_n)$ and the number of categories $k$, the sum of squared point-to-center distances is chosen as the objective function:
$$\min J = \sum_{i=1}^{k} \sum_{j=1}^{n} d_{ij} \lVert x_j - w_i \rVert^2. \tag{5}$$
In Eq. 5,
$$d_{ij} = \begin{cases} 1, & x_j \in U_i \\ 0, & x_j \notin U_i, \end{cases}$$
$w_i$ represents the cluster center of the $i$th class, and $U_i$ represents the set of samples of the $i$th class after clustering.
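The clustering loop can be illustrated with a short NumPy sketch. The function names and the random test data are hypothetical, and taking the probability of each reduced scenario as the fraction of samples assigned to its cluster is an assumption (a common convention) rather than something stated in the text.

```python
import numpy as np


def assign(samples, centers):
    """Index of the nearest center (squared Euclidean distance) per sample."""
    dist2 = ((samples[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return dist2.argmin(axis=1)


def kmeans_reduce(samples, k, rng, n_iter=100):
    """Reduce the rows of `samples` to k representative scenarios."""
    n = samples.shape[0]
    # Random initial centers picked from the samples themselves.
    centers = samples[rng.choice(n, size=k, replace=False)].copy()
    for _ in range(n_iter):
        labels = assign(samples, centers)
        new_centers = centers.copy()
        for i in range(k):
            members = samples[labels == i]
            if len(members) > 0:        # an empty cluster keeps its old center
                new_centers[i] = members.mean(axis=0)
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    labels = assign(samples, centers)
    # Assumption: scenario probability = share of samples in the cluster.
    probs = np.bincount(labels, minlength=k) / n
    return centers, probs, labels


rng = np.random.default_rng(1)
samples = rng.uniform(0.0, 1000.0, size=(200, 24))   # 200 daily curves, 24 h
centers, probs, labels = kmeans_reduce(samples, 4, rng)
```

Each row of `centers` plays the role of one classical scenario, and `probs` sums to one by construction.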

VIRTUAL POWER PLANT ECONOMIC MODEL
Objective Function
The objective function is to maximize the net income of the VPP. The income consists of four parts: the income from wind power and photovoltaic power generation, which supply the load as distributed power sources; the income from the difference between discharging and charging of the energy storage batteries; the income generated by gas turbine auxiliary power generation; and the difference between the income earned by the VPP from the sale of electricity to the grid and the cost of electricity purchased by the VPP from the grid. The three main sources of costs are operation and management costs, energy consumption costs, and penalty costs.
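The objective function is
$$\max F = \sum_{s=1}^{N} \alpha(s)\,(P_s - C_s). \tag{6}$$
In Eq. 6, $F$ represents the net income of the VPP, $N$ represents the number of scenarios in the classical scenario set, $\alpha(s)$ represents the probability of occurrence of scenario $s$, $P_s$ represents the income of the VPP under scenario $s$, and $C_s$ represents the cost of the VPP under scenario $s$. $P_s$ can, in turn, be expressed as in Eq. 7, where $t$ represents the time period; $A_{1,t}$ represents the price of electricity sold during $t$; and $P^s_{PW,t}$, $P^s_{PV,t}$, $P^s_{GT,t}$, $P^s_{Discharge,t}$, $P^s_{Charge,t}$, $P^s_{Sell,t}$, and $P^s_{Buy,t}$ represent the power of wind power generation, photovoltaic power generation, the gas turbine, energy storage discharging, energy storage charging, electricity sold to the grid, and electricity purchased from the grid, respectively, in time period $t$ under scenario $s$. The cost is
$$C_s = \sum_{t} \left( C^s_{om,t} + C^s_{ec,t} + C^s_{pu,t} \right).$$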
In the formula, $C^s_{om,t}$, $C^s_{ec,t}$, and $C^s_{pu,t}$ represent the operation and management cost, energy consumption cost, and penalty cost, respectively; $E_{om,PW}$, $E_{om,PV}$, $E_{om,GT}$, and $E_{om,Battery}$ are the operation and management cost factors of wind power, photovoltaic power, the gas turbine, and the battery; $A_{2,t}$ represents the electricity purchase price in period $t$; and $P_{GT}$ represents the fuel cost of the gas turbine per unit of power generated, that is, the unit power generation cost of the gas turbine (Cui et al., 2010). Eq. 12 expresses this cost in terms of the price of natural gas, the power generation efficiency $\eta$, and the low calorific value of natural gas $L_{NG}$.
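As a rough illustration of how the unit generation cost of the gas turbine depends on these three quantities, the sketch below assumes the usual relation: gas price divided by the product of efficiency and calorific value. Whether Eq. 12 has exactly this form is an assumption, and the function name and numerical values are hypothetical.

```python
def gas_turbine_unit_cost(gas_price, efficiency, low_calorific_value):
    """Fuel cost per kWh of electricity generated.

    Assumed relation: gas_price / (efficiency * low_calorific_value), with
    gas_price per cubic metre of gas and low_calorific_value in kWh per
    cubic metre, so the result is a price per kWh.
    """
    if not 0.0 < efficiency <= 1.0:
        raise ValueError("efficiency must lie in (0, 1]")
    if low_calorific_value <= 0.0:
        raise ValueError("low_calorific_value must be positive")
    return gas_price / (efficiency * low_calorific_value)


# Hypothetical inputs, for illustration only.
unit_cost = gas_turbine_unit_cost(gas_price=2.5, efficiency=0.3,
                                  low_calorific_value=9.7)
```

A lower efficiency or a higher gas price raises the unit cost, which is the quantity later compared with the grid price when deciding whether the gas turbine should run.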
The VPP declares its planned contribution to the grid as in Eq. 13, where $D_{PW}$ and $D_{PV}$ represent the planned output of wind power generation and photovoltaic power generation in period $t$, respectively, and $P_{GT,t}^{\max}$ represents the maximum output of the gas turbine.
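The scenario-weighted objective can be evaluated directly once the income and cost of each scenario are known. The sketch below assumes that Eq. 6 is the probability-weighted sum of income minus cost, as its symbol definitions suggest; the function name and the three-scenario numbers are hypothetical.

```python
def expected_net_income(probabilities, incomes, costs):
    """Probability-weighted net income over all scenarios."""
    if not (len(probabilities) == len(incomes) == len(costs)):
        raise ValueError("all inputs must have one entry per scenario")
    return sum(a * (p - c) for a, p, c in zip(probabilities, incomes, costs))


# Hypothetical three-scenario example, for illustration only.
alpha = [0.5, 0.3, 0.2]
income = [1000.0, 1200.0, 800.0]
cost = [400.0, 500.0, 300.0]
net_income = expected_net_income(alpha, income, cost)
```

In the optimization itself, the incomes and costs are functions of the dispatch decisions, so this function would sit inside the fitness evaluation of the solver.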

1) Power balance constraint
In Eq. 14, $D_t$ represents the power exchanged between the VPP and the power grid during period $t$.
2) Gas turbine constraints
$$P_{GT,\min} \le P^s_{GT,t} \le P_{GT,\max}. \tag{15}$$
In Eq. 15, $P_{GT,\max}$ and $P_{GT,\min}$ represent the upper and lower output limits of the gas turbine during normal operation, respectively.
3) Gas turbine ramp rate constraint
In Eq. 16, $R^A_g$ and $R^B_g$ represent the upward and downward ramp rates of the gas turbine, respectively.

4) Battery capacity constraints of energy storage systems
The energy storage system must comply with the law of conservation of energy in the process of dispatching; that is, the electrical energy stored in the current period is equal to the energy stored in the previous period plus the energy charged and minus the energy discharged between the two periods. This balance is expressed by Eq. 17, and the capacity limit in Eq. 18 must be satisfied at the same time. In Eqs 17 and 18, $S^s_{ca,t}$ and $S^s_{ca,t-1}$ represent the capacity of the energy storage battery in period $t$ and period $t-1$, respectively; $\Delta t$ represents the time interval; $\eta_1$ and $\eta_2$ represent the charging efficiency and discharging efficiency, respectively; $E_{bat}$ represents the installed capacity of the energy storage system; and $S_{ca,\max}$ and $S_{ca,\min}$ represent the upper and lower limits of the energy storage capacity, respectively.

5) Charging and discharging constraints of energy storage batteries
In the abovementioned formulas, $P_{Charge,\max}$ and $P_{Charge,\min}$ represent the upper and lower limits of the charging power of the energy storage system, respectively; $P_{Discharge,\max}$ and $P_{Discharge,\min}$ represent the upper and lower limits of the discharging power of the energy storage system, respectively; $P^s_{c,t}$ denotes the power (charging or discharging) of the energy storage system in time period $t$ under scenario $s$; and $M^s_{Charge,t}$ and $M^s_{Discharge,t}$ are the state variables of charging and discharging of the energy storage system in time period $t$ under scenario $s$, each taking the value 0 or 1.
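The gas turbine and storage constraints above can be turned into simple feasibility checks. The sketch below is illustrative only. The output limits follow Eq. 15, but the ramp check, the form of the storage balance (state expressed as a fraction of installed capacity), and the rule that charging and discharging cannot be active at the same time are assumptions about Eqs 16 to 21. All function names and all numbers other than the 400 kW and 1600 kWh ratings are hypothetical.

```python
def gas_turbine_feasible(p_gt, p_min, p_max, ramp_up, ramp_down):
    """Check output limits and assumed ramp limits for a list of outputs."""
    for t, p in enumerate(p_gt):
        if not p_min <= p <= p_max:
            return False
        if t > 0:
            step = p - p_gt[t - 1]
            if step > ramp_up or step < -ramp_down:
                return False
    return True


def next_storage_state(s_prev, p_charge, p_discharge, dt,
                       eta_charge, eta_discharge, e_bat):
    """Assumed energy balance, with the state as a fraction of e_bat."""
    energy = (eta_charge * p_charge - p_discharge / eta_discharge) * dt
    return s_prev + energy / e_bat


def storage_feasible(s, s_min, s_max, m_charge, m_discharge):
    """Capacity limits plus no simultaneous charging and discharging."""
    return s_min <= s <= s_max and m_charge + m_discharge <= 1


# 400 kW turbine and 1600 kWh storage; the remaining values are made up.
gt_ok = gas_turbine_feasible([0.0, 100.0, 200.0], 0.0, 400.0, 150.0, 150.0)
s1 = next_storage_state(0.5, 160.0, 0.0, 1.0, 0.9, 0.9, 1600.0)
storage_ok = storage_feasible(s1, 0.2, 0.9, 1, 0)
```

In a PSO-based solver, checks of this kind are typically used to repair or penalize candidate dispatch schedules that violate the constraints.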

1) The form of interaction between the VPP and grid
Given the time-of-use electricity price, the interaction between the VPP and the grid takes three main forms: 1) If the electricity price is in the peak period, the energy storage and gas turbine in the VPP sell all excess power to the grid to earn income, provided that the load demand is met; 2) If the electricity price is in the usual period, the distributed power sources supply the load first, followed by the energy storage system, and the gas turbine decides whether to generate by comparing its cost of power generation with the grid price; 3) If the electricity price is in the valley period, the grid electricity price is lower than the cost of gas turbine power generation, so the gas turbine does not generate power, and the VPP purchases electricity from the grid; part of it supplies the load, and the rest is stored in the energy storage system to be sold at the right time to earn the price difference.

2) VPP operation strategy
The operation strategy of the VPP mainly considers the power output of distributed power sources, such as wind power and photovoltaic power, the power output of energy storage systems, and the power output of gas turbines.
Distributed power output: Wind power and photovoltaic power are new energy sources, and the VPP gives priority to using them within their capacity. When the output of the distributed power sources is greater than the load demand, all the load demand is met by the distributed power sources, and the remaining power is stored in the energy storage system or sold to the grid according to the time-of-use electricity price; when the output is less than the load demand, all of it is used to supply the load. The output of the energy storage system: The VPP gives priority to the power output from distributed power sources. If the distributed power sources cannot meet the load demand, the energy storage system is discharged; if their supply exceeds the demand, the energy storage system is charged and then discharged at the right time.
Gas turbine output: The gas turbine plays an auxiliary role in the VPP as a controllable unit, generating power to supply the load, the storage, and the grid when the distributed power sources, energy storage, etc. cannot. Whether the gas turbine generates power depends on the load demand and on its cost of power generation compared with the grid electricity price.
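The qualitative strategy above can be summarized as a small rule table. The sketch below is a deliberately simplified reading of the text, not the dispatch logic of the study: it returns action labels instead of power values, and the function name, the period labels, and the example prices are hypothetical.

```python
def plan_actions(period, renewable_kw, load_kw, gt_unit_cost, grid_price):
    """Return qualitative actions for one time period.

    period is one of "valley", "usual", or "peak" (hypothetical labels).
    """
    surplus = renewable_kw - load_kw
    actions = {"gas_turbine_on": False, "storage": "idle", "grid": "none"}
    if period == "valley":
        # Cheap grid power: buy for the load and charge the storage.
        actions["storage"] = "charge"
        actions["grid"] = "buy"
    elif period == "peak":
        # Sell excess power; the turbine runs only if it is cheaper than the grid.
        actions["gas_turbine_on"] = gt_unit_cost < grid_price
        actions["storage"] = "discharge"
        actions["grid"] = "sell"
    elif period == "usual":
        if surplus < 0:
            # Renewables first, then storage, then the turbine if economical.
            actions["storage"] = "discharge"
            actions["gas_turbine_on"] = gt_unit_cost < grid_price
        elif surplus > 0:
            actions["storage"] = "charge"
    else:
        raise ValueError("unknown period: " + str(period))
    return actions


valley = plan_actions("valley", 100.0, 300.0, 0.8, 0.3)
peak = plan_actions("peak", 100.0, 300.0, 0.8, 1.2)
usual = plan_actions("usual", 100.0, 300.0, 0.8, 0.6)
```

With the made-up prices used here, the turbine runs only in the peak period, which matches the behaviour described for the simulation results.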

MODEL SOLVING ALGORITHM
In view of shortcomings such as the tendency of the basic particle swarm to fall into a local optimum, a new improved PSO algorithm is proposed based on the basic PSO algorithm. The following improvements are made relative to the basic PSO algorithm. The adaptive weight calculation formula is given in Eq. 22. In the formula, $n$ is the current iteration number; $f$ is the real-time objective function value of the particle; $\omega_{\max}$ and $\omega_{\min}$ represent the maximum and minimum values of the inertia weight, respectively; and $f_{\min}$ and $f_{avg}$ represent the minimum and average objective values of all current particles, respectively.
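One widely used adaptive inertia weight rule that involves exactly these quantities is sketched below. It is shown as an assumption about the general shape of such a rule, not as the formula of Eq. 22. It is written for a minimization problem, so for the net income objective $f$ would be the negative of the net income. The function name and the default values 0.4 and 0.9 are hypothetical.

```python
def adaptive_inertia_weight(f, f_min, f_avg, w_min=0.4, w_max=0.9):
    """Inertia weight of one particle from its objective value f.

    Particles no worse than the swarm average get a weight between w_min
    and w_max (finer local search near the best particle); all other
    particles get w_max (broader global search).
    """
    if f <= f_avg and f_avg > f_min:
        return w_min + (w_max - w_min) * (f - f_min) / (f_avg - f_min)
    # Worse than average, or a degenerate swarm with f_avg == f_min.
    return w_max


w_best = adaptive_inertia_weight(f=1.0, f_min=1.0, f_avg=3.0)
w_mid = adaptive_inertia_weight(f=2.0, f_min=1.0, f_avg=3.0)
w_poor = adaptive_inertia_weight(f=5.0, f_min=1.0, f_avg=3.0)
```

The best particle moves with the smallest weight, while particles worse than average keep the largest weight and continue exploring.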
To enhance the global search capability of the algorithm, the concept of hybridization from genetic algorithms is added to the PSO algorithm (Lu et al., 2020). A certain number of particles are placed in a hybridization pool and allowed to hybridize with one another, which generates the same number of new particles as there were original particles in the pool. The original particles are then replaced with the newly generated ones, which enhances the global search ability of the algorithm.
The positions and velocities of the new particles are obtained by crossing the positions and velocities of the original particles:
$$nx = i \cdot mx(1) + (1 - i) \cdot mx(2),$$
$$nv = \frac{mv(1) + mv(2)}{\lvert mv(1) + mv(2) \rvert}\,\lvert mv \rvert.$$
In the abovementioned formulas, $nx$ indicates the position of the new particle, $mx$ indicates the position of the original particle, $i$ is a uniform random number between 0 and 1, $nv$ represents the velocity of the new particle, and $mv$ represents the velocity of the original particle.
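The hybridization step for one pair of particles can be sketched as follows. Producing two children per pair (with the weights $i$ and $1 - i$ swapped for the second child) and falling back to the parent velocities when the velocity sum is zero are assumptions made for this illustration, and the function name and test vectors are hypothetical.

```python
import numpy as np


def hybridize(mx1, mx2, mv1, mv2, rng):
    """Cross two parent particles into two child particles."""
    i = rng.uniform(0.0, 1.0)             # uniform random number in [0, 1)
    nx1 = i * mx1 + (1.0 - i) * mx2       # child positions
    nx2 = (1.0 - i) * mx1 + i * mx2
    v_sum = mv1 + mv2
    norm = np.linalg.norm(v_sum)
    if norm == 0.0:
        # Direction undefined: keep the parent velocities (assumption).
        nv1, nv2 = mv1.copy(), mv2.copy()
    else:
        direction = v_sum / norm          # common direction of both children
        nv1 = direction * np.linalg.norm(mv1)
        nv2 = direction * np.linalg.norm(mv2)
    return (nx1, nv1), (nx2, nv2)


rng = np.random.default_rng(2)
mx1, mx2 = np.array([0.0, 10.0]), np.array([4.0, 2.0])
mv1, mv2 = np.array([1.0, 0.0]), np.array([0.0, 2.0])
(nx1, nv1), (nx2, nv2) = hybridize(mx1, mx2, mv1, mv2, rng)
```

Each child lies on the line segment between its parents and keeps the speed of one parent while moving in the direction of the summed parent velocities.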
The flow chart of the improved PSO is shown in Figure 1.

SIMULATION EXAMPLE
Calculation Example Settings
In order to verify the feasibility of the abovementioned optimal configuration of the VPP energy storage system and of the algorithm, a VPP containing 1000 kW of wind power, 1000 kW of photovoltaic power, a 400 kW gas turbine, and an energy storage system with a rated capacity of 1600 kWh was selected. All the data and final results in this study were obtained on the Matlab simulation platform. Specific parameter information is shown in Tables 1-4. The time-of-use electricity price is that for non-summer industrial, commercial, and other electricity consumption in Shanghai, and the operation and management cost factor of each distributed energy source is taken from the reference (Zhao and Fan, 2019b). A typical daily load curve of a location is selected as the load forecast, and the results are shown in Figure 2.
Based on the historical data, Latin hypercube sampling as described above is used to generate scenarios, and K-means clustering is used to reduce them. In this study, the number of classical scenarios of wind power output and of PV power output is predetermined to be four each, so there are 4 × 4 = 16 scenarios in the classical scenario set. The results are shown in Figures 3 and 4, and the probability of each scenario is shown in Tables 5 and 6.
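Combining four wind scenarios with four PV scenarios into 16 joint scenarios can be sketched as below. Taking the joint probability as the product of the two marginal probabilities assumes that wind and PV output are independent, which the text does not state. The function name and the probability values are hypothetical and are not those of Tables 5 and 6.

```python
from itertools import product


def combine_scenarios(wind_probs, pv_probs):
    """Joint probability of every (wind, PV) scenario pair.

    Assumes independence between wind and PV scenarios.
    """
    return {
        (w, p): p_wind * p_pv
        for (w, p_wind), (p, p_pv) in product(enumerate(wind_probs),
                                              enumerate(pv_probs))
    }


# Hypothetical marginal probabilities, for illustration only.
wind_probs = [0.4, 0.3, 0.2, 0.1]
pv_probs = [0.25, 0.25, 0.25, 0.25]
joint = combine_scenarios(wind_probs, pv_probs)
```

The keys of `joint` index the 16 combined scenarios, and their probabilities sum to one whenever both marginal lists do.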

Simulation Results and Analysis
Solving the developed model with the improved PSO algorithm gives the net benefits of the VPP under the different scenarios, as shown in Table 7.
For Scenario 1, the model also gives the net revenue and the output of each component of the virtual power plant under the influence of the time-of-use electricity price, as shown in Figure 5.
It can be seen from Figure 6 that during 22:00-6:00 of the following day, the VPP keeps purchasing electricity from the grid, mainly because the electricity price is in the valley hours. During 6:00-7:00, the electricity price is in the usual period, and the VPP does not exchange power with the grid. During 8:00-11:00, when the electricity price is in the peak hours, the VPP makes a profit by selling electricity to the grid while ensuring the demand of the load. During 11:00-18:00, the electricity price is in the usual period, and the VPP does not exchange power with the grid. During 18:00-22:00, when the electricity price is in the peak hours, the VPP sells electricity to the grid to earn the price difference but purchases electricity from the grid around 20:00, because PV power generation almost stops at this time and the distributed power and gas turbine output still cannot guarantee the demand of the load, so power needs to be purchased from the grid. At 22:00 the valley hours begin, and the VPP again purchases electricity from the grid. From Figure 7, we can see that during 22:00-6:00 of the next day, the gas turbine does not generate electricity because the price at which the VPP purchases electricity from the grid is lower than the generation cost of the gas turbine. During peak hours, the grid electricity price is higher than the generation cost of the gas turbine, so the gas turbine generates electricity at almost full capacity and sells the electricity to the grid after satisfying the load demand, thus gaining benefits. During 11:00-18:00, although the electricity price is in the usual period, the distributed power supply and energy storage system can meet the load demand, and the gas turbine does not generate electricity.
From Figure 8, we can see that during 22:00-6:00 of the next day, the energy storage system is in the charging state, and during 8:00-11:00, it is in the discharging state, because the energy storage system sells the electricity stored in the valley hours to the grid, on the premise of ensuring normal operation of the load, and thus earns the price difference. During 12:00-22:00, the electricity price is in the usual or peak period, and on the premise of satisfying the load demand, the energy storage sells the excess power to the grid to earn the price difference. When the valley hours begin, the VPP purchases power from the grid and stores it in the energy storage system.

CONCLUSION
A VPP optimization model based on the classical scenario set is simulated and analyzed with a calculation example. Two different scenarios are set up for the optimal configuration, and the net benefits and the output of each component are obtained for each case. The effectiveness of the improved PSO algorithm incorporating a genetic mechanism is verified by using it to solve the developed model. Finally, by comparing the net returns under the different scenarios, it is verified that the optimal configuration of the VPP under the classical scenario set can significantly improve the net returns of the VPP.
Outlook: In recent years, with the popularity of electric vehicles, more uncertainties have been added to the VPP. Electric vehicles can be seen as a mobile power source, and their charging and discharging are highly random and depend strongly on human factors. At present, countries around the world are reducing carbon emissions. In the future, considering electric vehicle access to the VPP while taking into account both the economy and the environmental protection (carbon emissions) of the VPP will be the focus of our research.

DATA AVAILABILITY STATEMENT
The raw data supporting the conclusion of this article will be made available by the authors, without undue reservation.

AUTHOR CONTRIBUTIONS
CW and WS completed the model building of the VPP and the simulation and reduction of wind power and photovoltaic output scenarios. WS completed the debugging of the algorithm. MC completed the drawing of the graph. XM read the manuscript and corrected the grammatical errors.