A Machine Learning-Based Model Predictive Control Method for Pumped Storage Systems

Integrated systems required for renewable energy use are under development. These systems impose more stringent control requirements. It is quite challenging to control a pumped storage system (PSS), which is a key component of such power systems. Because of the S-characteristic area of the PSS pump turbine, traditional proportional-integral-derivative (PID) control induces considerable speed oscillation under medium and low water heads. PSSs are difficult to model because of their nonlinear characteristics. Therefore, we propose a machine learning (ML)-based model predictive control (MPC) method. The ML algorithm is based on Koopman theory and experimental data that includes PSS state variables, and is used to establish linear relationships between the variables in high-dimensional space. Subsequently, a simple, accurate mathematical PSS model is obtained. This mathematical model is used via the MPC method to obtain the predicted control quantity value quickly and accurately. The feasibility and effectiveness of this method are simulated and tested under various operating conditions. The results demonstrate that the proposed MPC method is feasible. The MPC method can reduce the speed oscillation amplitude and improve the system response speed more effectively than PID control.

As the scale of PSSs and the extent to which they are utilized in the development of sustainable power systems increase, their precise control has gradually emerged as an important area of research. Existing PSS control methods can be divided into two main categories. The methods within the first category use proportional-integral-derivative (PID) control. These methods are based mainly on parametric design of each link within the PID controller. Some representative studies follow. Shi et al. used the d-q axis vector of a generator to design PI parameters with the aim of achieving highly efficient peak regulation and frequency suppression within a power grid (Yifeng Shi et al., 2020). Zhao et al. used the power priority control and speed priority control strategies to design PID parameters, in addition to simulating and verifying the superior control effect of the power control strategy (Yves Pannatier et al., 2010;Guopeng Zhao and Ren, 2021). Several other advanced algorithms and theories have been applied as well. Xu et al. designed a fuzzy PID controller based on the F fractional-order integrator and differentiator (FOPID) (Podlubny, 1999) by using a search algorithm. This controller optimizes the FOPID method parameters, increases the convergence speed, and yields superior results (Xu et al., 2016;Xu et al., 2018). Gao et al. used Hopf bifurcation theory to determine the algebraic stability criterion of a PSS in order to obtain ideal PID controller parameters that correspond to a stable system state (Chunyang Gao et al., 2021a).
The control methods under the second category use optimization algorithms to determine the best approach to coping with the nonlinear behaviors of hydraulic systems. Schmidt et al. used a nonlinear optimization algorithm to determine the best stationary operating point of a PSS in order to minimize loss across the entire system and consider operating constraints systematically (Schmidt et al., 2017). Hou et al. used a multiobjective optimization algorithm to design a control strategy for starting a PSS in addition to establishing a simulation model based on a real system to verify the feasibility of the optimized control strategy (Hou et al., 2018;Hou et al., 2019). Pan et al. modeled a new stability criterion by using the particle swarm algorithm to optimize the parameters of a PSS controller; this criterion enhanced the independent response of the PSS significantly (Pan et al., 2021).
Among the various optimization control methods, the model predictive control (MPC) method has a robust theoretical foundation and substantial promise for practical applications. The most famous explanation is called "Open Loop Optimal Feedback" (Alamir and Allgöwer, 2008;Diego and Carrasco, 2011;Forbes MG et al., 2015). It richly illustrates the four layers of modeling, control, optimization, and logistics in the MPC control method and their relationship (Mayne, 2016;Shiliang Zhang et al., 2017;Sopasakis and Sarimveis, 2017;Ye et al., 2017). In recent years, the application of MPC to PSSs has gradually attracted the attention of scholars. Li et al. used the generalized nonlinear predictive control method to design a PSS controller and simulated start-up and speed disturbance processes under no-load conditions to verify controller robustness and efficiency (Li et al., 2017b). Liang et al. proposed an MPC strategy based on enumeration to determine the optimal switching time of a PSS with the objectives of increasing operational flexibility and promoting frequency regulation (Liang et al., 2019). Feng et al. proposed a new adaptive MPC strategy and conducted a simulation experiment to verify the superiority of this strategy in terms of adjusting the voltage and load and suppressing frequency oscillation (Chen Feng et al., 2021). Although MPC has the above advantages, it still has certain disadvantages when compared to traditional control methods. For instance, MPC modeling requires more expert knowledge and is timeconsuming to perform using traditional methods (De Souza et al., 2010;Max Schwenzer et al., 2021. Fortunately, the continuous development of hardware equipment and control algorithms has decreased the MPC control time. If the control FIGURE 1 | The overall power storage system model predictive control structure.
Frontiers in Energy Research | www.frontiersin.org November 2021 | Volume 9 | Article 757507 effect is particularly important, MPC is much better than traditional control methods.
Modeling is a key component of MPC but almost all systems used in relevant contexts are nonlinear; for example, the water pump turbine of a PSS has an S-characteristic area (Yin et al., 2014;Li et al., 2017a;Anilkumar et al., 2017;Zeng et al., 2017;Zhang et al., 2017). This increases the difficulty of modeling. A variety of modeling methods have been developed in response.
Huang et al. proposed a method of predicting the complete characteristics of a Francis pump turbine. Based on Euler's equation and the velocity triangle at the runner, their method derives a mathematical model that completely describes the characteristics of the Francis pump turbine. The results of a simulation experiment indicated that their method was suitable for performing a priori simulations before measuring device characteristics (Huang et al., 2018). Gao  In recent years, with the continual development of artificial intelligence technology, a few machine learning (ML) algorithms have yielded outstanding nonlinear system modeling results (R. Martin et al., 2015;J. Sánchez Oro et al., 2016;Li et al., 2016;Idowu et al., 2016;Robinson et al., 2017;Yong Ping Zhao et al., 2018;Kiely et al., 2019;Ayush Thada et al., 2021). The application of ML algorithms to PSS has emerged as an innovative idea. In this regard, Feng et al. established a PSS model by optimizing the initial learning parameters via prior knowledge learning and subsequently adopting a stepped control strategy and an artificial-sheep-algorithmbased rolling optimization mechanism. This model can replace the existing differential geometry model (Chen Feng et al., 2021). Zhao et al. introduced an improved Suter transformation-backpropagation (BP) neural network interpolation model and used it to establish a PSS model. This model can be divided into three parts: characteristic curve processing, data prediction, and interpolation. It can correct the characteristic curve of a pump turbine effectively (Zhigao Zhao et al., 2019). These modeling methods are innovative and represent meaningful research directions. However, the model machines and BP neural networks used in the aforementioned modeling methods have extremely complex structures and use large numbers of parameters. This increases the complexity of PSS control.
For these reasons, we propose using a MPC method based on an ML algorithm to model PSS control. This method can convert a nonlinear model into a linear model in high-dimensional space. The form of the obtained model is simple and few parameters are required. Thus, it is suitable for fast, accurate control processes. Our paper makes the following contributions to PSS development: 1) The pump-turbine partition model that we establish effectively solves the common problem of difficult modeling in the S-region and provides a feasible direction for research on pump turbines in the s-region. 2) The machinelearning algorithm that we propose makes full use of the data generated in the PSS; it is an effective way of combining experimental data with the actual system and improves the accuracy of the actual system model. 3) We use the proposed model within the MPC method; this solves the problem of large control errors in the PSS at low water heads. 4) Compared to traditional control methods, our proposed method can reduce the shock of the pumped storage system and reduce the control process adjustment time.

PSS MODELING
The overall MPC structure is illustrated in Figure 1. It comprises three parts: a controlled system (a PSS), a data collection component, and a control component. The PSS is composed of a pump turbine, servo mechanism, penstock, and generator. The function of the data acquisition component is to collect the data in the PSS and provide it to the control system. It includes a flow sensor, a torque sensor, a speed sensor, an opening sensor, and a pressure gauge, and is responsible for collecting five variables: the flow, torque, speed, guide vane opening, and water head. The control system includes model generators for pumps, turbines, servo systems, pipelines, and generators. Their task is to generate PSS component models that the controller can use. The control system also includes a predictor that generates a predictive model. In addition, the controller generates control instructions according to the optimized algorithm in order to manipulate the PSS control variables.
Five important state variables are used in the PSS: a is the vane opening, n 11 is the unit rotational speed, Q 11 is the unit flow, M 11 is the unit torque, and h is the water head magnitude. The modeling process aims to establish the relationships between these state variables. The real values of these state variables at each moment are collected via digital equipment and used in the control process. The controller determines the optimal amount of control via the model prediction process and uses the optimizer to control the PSS.

ML Algorithm
To establish a simple, accurate PSS model, we propose an ML algorithm that can determine the linear relationships among state variables in high-dimensional space using the given data.

Related Principles
1) Assume that the set of all state variables s (i) is given in Eq. 1. To study the relationship between the state variables s (u) and s (v) , we assume that they satisfy the relationship given in Eq. 2, the discrete form of which is given in Eq. 3. All of these relationships are nonlinear. where s (i) is the state variable in system; N s is the number of state variables.
zs (u) where zs (u) , zs (v) is the amount of change in s (u) and s (v) respectively; g(s (u) ) is a certain relational expression of s (u) .
where s (u) j , s (u) j+1 is the value of s (u) at time jand j + 1 respectively; G(s (u) j ) is a certain discrete function of s (u) at time j. For nonlinear relationships, Koopman proposed the Koopman operator (KO), expressed in Eq. 4, in 1931. The KO can be used to represent the complete nonlinear relationship using a linear function. Notably, no universal method is available for selecting the observation function; it is usually selected based on the characteristics of the system and the analyst's experience. where is the measure function acting on the state variables; + represent a compound operation of functions.
2) For each value of s (v) , there exists a corresponding dataset s (u) , which can be expressed as a matrix, as in Eq. 5.
where DM is data matrix composed of N sets of values of state variables. We apply the KO to matrix DM to obtain the highdimensional data matrix K, as shown in Eq. 6. Then, we take columns 1 through N-1 of K as K a and columns 2 through N as K b , as indicated in Eqs 7, 8, respectively.
where K is data matrix in measurement space that apply the measure function O(s) to DM.
where K a , K b is data matrix consisting of N − 1 columns of K.
According to the Koopman principle, under a suitable observation function set, K b and K a share a linear relationship that can be expressed as Eq. 9. Therefore, the estimated value of K N can be calculated using Eq. 10.
where A is a steady matrix.
wherek N is the estimated value of k N ; k 0 is the initial value of matrix K. The error is the deviation between the real value k N and the estimated valuek N , as shown in Eq. 11. If the selected measurement function can ensure that this error is smaller than a threshold that we set, K a and K b can be considered to have a linear relationship.
where error is the error between the true value k N and the estimated valuek N .
3) When K a and K b share a linear relationship under a certain observation function set, the relationship between s (u) and s (v) can expressed using Eq. 12.
where f i is an undetermined coefficient.

ML Algorithm Flow
The ML algorithm flow is depicted in Figure 2. It consists of the following parts: data preprocessing, KO, establishment of the objective equation, estimation of the state variable value, error calculation, parameter determination, and determination of the fit model. The details follow:

Data Preprocessing
Experimental data are generated using the experimental platform. All s (u) versus s (v) experimental data are collected and sorted into a data matrix D, as expressed in Eq. 5.

KO
The observation function set is selected according to theoretical principles and practical experience and applied to matrix D to obtain the high-dimensional spatial data matrix K. Then, we select the matrixes K a and K b , each of which has N-1 columns, where N is a parameter that must be determined.

Establishing the Objective Equation
We establish the objective equation given in Eq. 9 and transform it into the form presented in Eq. 13.
where K + a is the pseudo-inverse of K a .

Calculation of the Estimated Value
We perform singular value decomposition on K a to obtain the decomposed form, as shown in Eq. 14.
where U ∈ C n×n and V ∈ C m×m are unitary matrices, and Σ ∈ C n×m is a diagonal matrix. Next, matrix A can be expressed using Eq. 15. Its eigenvalue decomposition form is given in Eq. 16.
where (U) p is the conjugate matrix of U.
where W is the eigenvector matrix of A and Λ is the eigenvalue matrix. By substituting Eq. 15 into Eq. 16, we can obtain the linear model given in Eq. 17. The value of K N can estimated in this manner, as expressed in Eq. 18.
where k 0 Φ + · k 1 is the initial value of k N .

Error Calculation and Parameter Determination
The error between the real and estimated values is expressed in the form given in Eq. 19. This expression is related to the parameter N. We gradually increase N until the error exceeds the threshold. The minimum value of N for which the error exceeds the threshold is set as the parameter value.

Fit Model Determination
As previously mentioned, the form of the fitted model is given in Eq. 12. We use the least squares method to determine the coefficient f i in this equation.

PSS Models
Pump Turbine Modeling Using an ML Algorithm The pump-turbine model contains s-shaped regions and is a complex, nonlinear model. We use machine learning algorithms to model the pump-turbine by partition. The process is shown in the Figure 3. A detailed description of the modeling process is provided below. The pump turbine model is based on two relational expressions. That is, aand n 11 are used to express Q 11 and November 2021 | Volume 9 | Article 757507 10 M 11 in the forms given in Eqs 20, 21, respectively. Notably, these are two nonlinear relationships and "S" regions are present. That is, multiple values of Q 11 and M 11 correspond to the samen 11 . For this reason, it is quite difficult to establish an accurate analytical model.
where f Q 11 , f M 11 represent some kind of function that the independent variables are a and n 11 , and the dependent variables are Q 11 and M 11 respectively.
We use the preceding ML algorithm to establish the relationship between Q 11 − a, Q 11 − n 11 , M 11 − a, and M 11 − n 11 and use the established relationship as a basis for model construction, as shown in Eqs 20, 21. The details are as follows.
The Relationship Between Q 11 , a, and n 11 Q 11 -a.
To analyze the relationship between Q 11 and a, we use the observation function set given in Eq. 22.
The error obtained using the proposed ML algorithm is depicted in Figure 4 (1-A). In all ranges, the vane opening satisfies the requirements outlined in this observation function set.
We use the observation function set given in Eq. 23 to analyze the relationship between Q 11 and n 11 based on the relationship between the flow rate and the pump turbine speed.
The ML parameter N is determined using the algorithmic flow shown in Figure 2. In actual operation, we continue to increase the value of N and calculate the error as shown in Eq. 19. When this error increases significantly, the prediction model does not match reality. The error at this point is the threshold. We use this N as the boundary of the segmented model. The technical details are as follows: As n 11 changes, the linear relationship obtained using the proposed ML algorithm in high-dimensional space is as follows. • The error corresponding to n 11 ≥ 0 is depicted in Figure 4 (1-B1). The ML parameter N (which is n 11 ) is set to 73. We define the first region of Q 11 − n 11 (P1 of Q 11 ) as 0 ≤ n 11 ≤ 73. • When n 11 ≥ 73, we introduce an intermediate variable IVQ and use it to find the linear relationship between Q 11 and n 11 . The expression for IVQ is given in Eq. 24. The error obtained using ML is depicted in Figure 4 (1-B2). The ML parameter N (i.e., IVQ) is set to 79. We define the second region of Q 11 -n 11 (P2 of Q 11 ) as the region where n 11 > 73 and IVQ ≤ 79.
• When IVQ > 79, we use M 11 0 as the dividing line because M 11 < 0 causes the control to run away. We define the third region of Q 11 -n 11 (P3 of Q 11 ) as IVQ > 79 and M 11 > 0.
Based on the aforementioned three scenarios, the ML algorithm divides the relationship between Q 11 , a, and n 11 into three areas within the available range. The partition map is illustrated in Figure 4A. The Relationship Between M 11 , a, and n 11 M 11 -a.
Much like Q 11 -a, the observation function set given in Eq. 25 can reduce the error of the proposed ML algorithm. The resulting error is depicted in Figure 4 (2-A). The vane openings corresponding to all ranges fulfill the requirements under this observation function set.
Based on the relationship between the torque and speed of a pump turbine, the observation function set given in Eq. 26 is used. Much like the process used in step 1, we use the ML process in Figure 2 to determine the segment boundary. The technical details are as follows: • The error corresponding to n 11 ≥ 0 is depicted in Figure 4 (2-B1). The parameter N (which is n 11 ) is set to 73 in the ML algorithm. We define the first region of M 11 -n 11 (P1 of M 11 ) as 0 ≤ n 11 ≤ 73. • When n 11 ≥ 73, the intermediate variable IVM is introduced; its expression is given in Eq. 27. The error obtained using the ML algorithm is illustrated in Figure 4 (2-B2). The parameter N (which is IVM) in the ML algorithm is set to 67. We define the second region of M 11 -n 11 (P2 of M 11 ) as n 11 > 73and IVM ≤ 67.
where M 11r is the rated value of M 11 and has a magnitude of 220.7347.
The ML algorithm divides the relationship between M 11 , a, and n 11 into three areas within the available range. The partition map is depicted in Figure 5B.

Fit Model
After obtaining the partitions, the principles described in Related Principles are used to establish polynomial models of Q 11 and M 11 for each partition as fit models. The polynomial term is the function included in the observation function set. The three-part fit model of Q 11 is given in Eqs 28-30, and the fitting coefficient values are listed in Table 1. Similarly, the three-part fit model of M 11 is given in Eqs 31-33, and the fitting coefficients are listed in Table 2.

Comprehensive Area Model
Notably, the partition model is chosen because the device model includes multiple expressions, each of which is divided into multiple segments. Since these segments are related directly to differences and coverage, we divide them into multiple regions so that they can accurately represent the device model in a detailed manner. Herein, the partitions M 11 and Q 11 are staggered overlapping partitions. Furthermore, we combine the partitions M 11 and Q 11 to obtain the comprehensive area, as illustrated in Figure 6. The boundaries in the figure are the parameters in the relationship between the variables determined by the ML algorithm.

Other Models
In the description of the pump turbine model, it is necessary to point out the relationship between unit and non-unit quantities, which is expressed using Eqs 34-36.
where D is the nominal diameter of the pump turbine.

Servo Mechanism Model
The servo mechanism model is nonlinear; it considers the speed limit, position limit, position saturation limit, and suppression governor, as depicted in Figure 7. In this model, δ is the valve stroke, T aB is the auxiliary servomotor response time constant, T a is the servomotor response time constant, δ max is the valve stroke upper limit, δ min is the valve stroke lower limit, a max is the servomotor stroke upper limit, and a min is the servomotor stroke lower limit.

Penstock System Model
The approximate elastic water hammer model is used to model the penstock system. Its expression is given in Eq. 37.
where h(s) L h} { and Q(s) L Q} { represent the Laplace transforms of h and Q, respectively. T w is the water inertia time constant, T r is the water hammer pressure reflection time, and h f is the head loss coefficient.

Motion Equation Model
When the generator is connected to the pump turbine, the torque of the turbine and the load torque generated by the generator must satisfy the motion equation (Eq. 38).

Model Prediction
After obtaining the model by using the method introduced in the preceding section, we can predict the value of the state variable. The process is depicted in Figure 8. The details of each module are introduced as follows.
1. Calculate the model calculation value s(k) at time k using the equipment model. 2. Subtract the model-calculated value s(k) from the real value s real (k) at time k to obtain the deviation value Δs(k). The true value comes from the data acquisition component. 3. Calculate the model calculation value s(k + 1) at time k+1 using the equipment model. 4. Add the deviation value Δs(k) at time k and the calculated value s(k + 1) of the model to obtain the predicted valuê s(k + 1) at time k+1. 5. Calculate the error error(k + 1) between the predicted valuê s(k + 1) and the real value s real (k + 1) at k+1. 6. Judge whether the error is lower than the limit. If the error is lower than the limit, the prediction level meets the requirements and the prediction model is obtained directly. If the error exceeds the limit, return to step 2 and continue the model prediction process.

Optimizer
In this study, we use optimization algorithms as the MPC strategy. The corresponding set of state variables is given in Eq. 46.
The goal of PSS control is to regulate the vane opening such that the turbine operates at the required speed and the system achieves the highest efficiency. Therefore, the objective function is the reciprocal of the turbine efficiency, as shown in Eq. 47.
The optimization constraints include equality and inequality constraints. The equality constraints are represented by the model predictions established in Eqs 39-45. The inequality constraints include the upper and lower limits of each state variable, as described in Eqs 48-52. a min (k + 1) ≤â(k + 1) ≤â max (k + 1), (48) n 11 min (k + 1) ≤n 11 (k + 1) ≤n 11 max (k + 1), Q 11 min (k + 1) ≤Q 11 (k + 1) ≤Q 11 max (k + 1), M 11 min (k + 1) ≤M 11 (k + 1) ≤M 11 max (k + 1), The optimization process uses the interior point algorithm, which works well in practice. It has been proven that in some cases, this method can solve a problem to a specified accuracy by performing operations on various polynomials that do not exceed the dimension of the problem.

NUMERICAL EXPERIMENTS AND ANALYSIS
To verify the effectiveness of MPC, we measured and collected data from a PSS in China and conducted simulation experiments using the MATLAB software environment (Mathworks, 2017). The experiment proceeded in three parts, each of which corresponded to no-load start up, frequency disturbance, or speed disturbance. The experiment was performed under medium and low water heads because the PSS easily enters the S-characteristic area under these head levels. We set the water head h to 413 and 418 m and the no-load opening to 0.5. In addition, we used a PSS PID control scheme that was proposed in a previous study for comparison.

No-Load Start-Up Condition
In this part of the experiment, the PSS was in the no-load condition and the pump turbine was started from a standstill. MPC and PID control were implemented when the startup speed reached 90% of the rated speed. The experiment was performed for 120 cycles, each of which ran for approximately 0.2 s. The speed results are depicted in Figure 9. The pump turbine state variable results are depicted in Figure 10. The results indicate that MPC performs better than PID control of PSSs because MPC provides a smaller overshoot and faster response speed.

Frequency Disturbance Condition
In this part of the experiment, the PSS was operated under no load. For the system frequency to be disturbed at the 10th cycle, the PSS speed needed to be increased to 1.02 times the rated speed. The experimental results obtained under MPC and PID control are depicted in Figure 11 and Figure 12. Both control methods can fulfill the frequency disturbance requirements. Under PID control, the speed overshoots the threshold and the response time is longer. In contrast, the speed overshoot is smaller and the new speed requirements are reached quickly under MPC control.

Speed Disturbance Condition
In this part of the experiment, the load generator was activated at the 10th cycle and the load torque was increased in steps. At this point, the speed of the pump turbine was disturbed such that it decreased abruptly. The experimental results obtained under MPC and PID control are depicted in Figure 13 and Figure 14. Under PID control, the speed is disturbed in the S-characteristic region. This produces strong oscillation and the speed takes substantial time to stabilize. In contrast, the speed oscillation amplitude is smaller and the speed stabilizes within a short time under MPC control.

Advantages of MPC
The results described in the preceding section indicate that the MPC and PID control methods can fulfill PSS S-characteristic area control requirements that are otherwise difficult to meet. However, the MPC method offers the following advantages: • The MPC method can achieve effective control under the conditions of no-load start-up, frequency disturbance, and speed disturbance. • The model built using ML can accurately represent the PSS.
The control program calculation time can be reduced because the established model is simple. • Since the MPC method uses a global optimization algorithm, the optimal solution within the partition can be calculated directly. Therefore, PSS oscillation in the S feature area can be reduced effectively. • Under the MPC method, the change of state variable is gentle. Thus, it is substantially conducive to reducing PSS equipment loss.

CONCLUSION
We proposed an MPC method for PSSs. A ML algorithm based on the Koopman theory was proposed for PSS modeling. A partition model of the pump turbine used in PSSs was established. This allowed us to express the complex nonlinearity of the S-characteristic region using a simple polynomial model. The experimental results indicated that the MPC method quickly reached the control target under conditions of noload start-up, frequency disturbance, and speed disturbance. This method can more effectively suppress oscillation of the turbine rotation speed in the S-characteristic region than existing control methods and has a faster response speed. The proposed method has practical promise because it learns device characteristics from experimental data and is not limited by device type. In the future, we will improve the accuracy of ML to improve the effect of MPC and study the performance of the proposed method under other working conditions.

DATA AVAILABILITY STATEMENT
The datasets presented in this article are not readily available because Data set for pumped storage system. Requests to access the datasets should be directed to oddnewlife@126.com.

AUTHOR CONTRIBUTIONS
QC and XL; methodology, QC; software, QC and SY; validation, XL; data curation, PZ; writing-original draft preparation, SY; writing-review and editing, QC; visualization, CG; supervision, XL; project administration, XL; All authors have read and agreed to the published version of the manuscript.