Research on the flow characteristics identification of steam turbine valve based on FCM-LSSVM

Due to aging and deformation of the through-flow path and system modifications, the flow characteristics of the turbine inlet valve often deviate from the design values, which affects the load control accuracy and operational stability of the unit. To obtain the actual valve flow characteristics of the turbine and thus improve the frequency modulation performance, an FCM-LSSVM model is proposed in this paper to identify the valve flow characteristics. First, fuzzy C-means (FCM) clustering is used to classify the historical operating data of the plant, which cover a wide range of variable operating conditions. Then, using the least squares support vector machine (LSSVM), the relationship between the turbine input and output variables is modeled in each condition cluster, with the integrated valve position command, speed, and real power as input variables and the actual steam inlet flow as the output variable. Using a 330 MW turbine unit as an application example, the established FCM-LSSVM model was validated for the valve flow characteristics of the turbine. The results show that the model can obtain accurate valve flow characteristics without conducting tests on the turbine. The method saves the labor and material resources required by a characteristic test, and a comparison shows that it identifies the flow characteristics more accurately than existing neural network identification methods. It can therefore provide technical support for improving the frequency regulation characteristics of the unit and the accuracy of valve operation.


Introduction
As the Chinese government advances its strategy of carbon peaking and carbon neutrality, an increasing number of new energy units are joining the power grid, placing considerable frequency modulation pressure on the grid. At present, conventional thermal power units are still the main frequency modulation resources, and improving their frequency modulation performance is vital for the safe and stable operation of the power system (Kotowicz et al., 2019; Jinshan et al., 2021; Chen and Cheng, 2023). The frequency modulation performance of a steam turbine unit is closely related to the flow characteristics of the steam turbine valve. During long-term use of the steam turbine, the switching of the regulating valve group exhibits violent jitter because of equipment installation deviation, steam impact, equipment aging, and other reasons. This phenomenon leads to dramatic fluctuations in the main steam flow and real power before and after switching under the same load command. The degree of agreement between the actual flow characteristics and the preset values deteriorates, thereby affecting the frequency modulation capability of the unit (Wang et al., 2018; Tan et al., 2022). Therefore, valve function correction is conducive to improving the frequency modulation capability of thermal power units after the unit has been modified and installed or operated for a long time (Zhu et al., 2019; Xu et al., 2020).
At present, research on improving the flow characteristics of steam turbine valves mainly focuses on flow characteristic test methods. Wallat et al. (2018) studied the lateral and axial movement of the valve plug to design a flow characteristic test model. Xiao et al. (2016) used CFD numerical simulation to find the flow coefficient of the buffer structure for the valve flow test. Zanazzi et al. (2012) tested the operation of turbine valves based on aerodynamic principles to improve valve regulation. Wang and Hai (2010) regulated the load by studying the stress characteristics of the blades under different control methods to ensure stable valve operation. Characteristic tests can improve control stability. However, they require considerable manpower and material resources to obtain the test data, and the process may affect the normal operation of a unit.
Some studies use operating data to identify the valve flow characteristics and overcome the limitations of the valve characteristic test. Salahshoor et al. (2010) obtained the valve flow characteristic curve by using the SVM data mining algorithm and comprehensively considering the characteristics of the characteristic flow area. Li et al. (2019), based on DEH historical data, corrected the valve flow characteristic curve by calculating the actual equivalent flow in single valve mode. However, the solution process of the above algorithms is cumbersome, and the objective data relationship is easily ignored. The least squares support vector machine (LSSVM) is one of the commonly used methods for modeling power station operation data. The LSSVM method transforms the traditional quadratic programming problem into the solution of a set of linear equations, thereby reducing the difficulty of the solution process and making the modeling process convenient (Lv et al., 2013, 2020; Zhao and Zhang, 2022). Chen (2022) collected the actual running data of steam turbines and used LSSVM theory to establish models relating the steam turbine valve position instruction, main steam pressure, and actual inlet steam flow, and to simulate characteristic tests. Compared with working condition tests, the workload was greatly reduced, and the modeling efficiency improved considerably.
Although the LSSVM model can accurately describe the turbine system, a single model can adapt only to a certain local working condition, and its fit of the valve flow characteristics of a turbine unit operating over a wide range of variable working conditions is not ideal (Liao et al., 2018). Thus, the identification accuracy of the valve flow characteristics is affected. Han et al. (2020) applied a clustering algorithm to the operation of a steam turbine under multiple working conditions and established separate valve flow characteristic models for the different working conditions. Compared with a single model, the models established exhibited considerably improved fitting effects. The clustering algorithm can effectively improve the fitting accuracy by dividing the working conditions (Zhang et al., 2021).
The main contribution of this paper is to establish an FCM-LSSVM model to identify the flow characteristics of turbine valves, which avoids the time-consuming and tedious valve flow characteristic test. Turbines now often operate under a wide range of variable conditions, and it is difficult to fit all the conditions using one mathematical model. To address the insufficient identification accuracy of the currently used models, this paper proposes a clustering plus modeling method: the operating conditions of the unit are first clustered, and each cluster is then modeled separately, so that the turbine operating conditions are fitted as fully as possible. The unit data were first screened, and the FCM algorithm was used to classify the working conditions. Then, the LSSVM sub model was trained to obtain the mathematical relationship between the comprehensive valve position command, real power, speed, and the actual inlet steam flow under different working conditions. The main parameter variables were changed to simulate the test condition. Moreover, the trained model was used to predict the actual inlet steam flow, obtain the relationship between the integrated valve position command and the actual inlet steam flow, identify the valve flow characteristics, and provide support for improving the frequency modulation ability of the turbine. The rest of this work is structured as follows: Section 2 describes the theoretical basis of the FCM-LSSVM model. The system modeling process is explained in Section 3, the application instance and results are presented and discussed in Section 4, and Section 5 summarizes the main conclusions of the present work.
Turbine valve flow characteristics identification process

Selection of input variables
The selection of input variables considerably impacts the prediction results. Thus, reasonable input parameters must be determined to ensure the prediction accuracy of the valve flow characteristics. The parameters with a high influence on the valve flow characteristics are selected, as shown in Table 1. The Pearson correlation coefficient method was used to analyze the data in the table; a high coefficient indicates a strong correlation between two variables. Pearson's correlation coefficient is defined as:

$$K = \frac{\sum_{i=1}^{m}\left(X_i-\bar{X}\right)\left(Y_i-\bar{Y}\right)}{\sqrt{\sum_{i=1}^{m}\left(X_i-\bar{X}\right)^{2}}\,\sqrt{\sum_{i=1}^{m}\left(Y_i-\bar{Y}\right)^{2}}}$$

where $X$ and $Y$ are the two groups of data whose correlation degree is to be determined, each having $m$ elements; $\bar{X}$ and $\bar{Y}$ are the average values of the two groups of data, respectively; and $K$ is the Pearson correlation coefficient, whose value lies in $[-1, 1]$. The correlation degree between the data is reflected by the absolute value of $K$: the larger the absolute value, the higher the correlation degree. Generally, a correlation is considered strong when the absolute value of the coefficient is >0.65. After calculation, the actual power, comprehensive valve position instruction, and speed were finally selected as the set of input variables. The correlation coefficients between them and the actual main steam flow were 0.96, 0.91, and 0.84, respectively, and the actual main steam flow was set as the output variable of the model.
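As an illustration of the screening step, the following minimal sketch computes the Pearson coefficient for one candidate variable and applies the 0.65 threshold. The function name `pearson` and the sample values are hypothetical and are not data from the unit.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient K of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Numerator: sum of products of deviations from the means.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Denominator: square root of the product of the summed squared deviations.
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical illustrative samples (real power in MW, flow in percent).
power_mw = [150.0, 180.0, 210.0, 240.0, 270.0, 300.0]
flow_pct = [48.0, 57.5, 68.0, 77.0, 88.5, 97.0]

k = pearson(power_mw, flow_pct)
print(f"K = {k:.3f}, strong correlation: {abs(k) > 0.65}")
```

A constant input column would give a zero denominator, so such columns would need to be removed before screening.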
The actual main steam flow of the steam turbine cannot be directly measured; thus, based on the historical data of the unit, the improved Flügel (Furuger) formula is adopted for calculation (Li et al., 2022), as follows:

$$G_1 = \frac{P_a}{P}\cdot\frac{P_e}{P_{a,e}}\times 100\%$$

where $G_1$ is the equivalent value of the actual main steam flow. Here, the pressure ratio is used to represent the equivalent steam flow, that is, the ratio of the equivalent value to the rated value. $P_a$ is the regulating stage pressure, $P$ is the main steam pressure, $P_e$ is the rated main steam pressure, and $P_{a,e}$ is the rated regulating stage pressure.
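The flow calculation can be sketched as a small helper. This sketch assumes the pressure-ratio form $G_1 = (P_a/P)\cdot(P_e/P_{a,e})\times 100\%$, which is inferred from the four pressures defined above rather than quoted from the original equation; the function name and the pressure values are hypothetical.

```python
def equivalent_flow_pct(p_stage, p_main, p_stage_rated, p_main_rated):
    """Equivalent main steam flow G1 in percent.

    Assumed pressure-ratio form: (Pa / P) * (Pe / Pa,e) * 100.
    All pressures must use the same unit (for example MPa).
    """
    if min(p_stage, p_main, p_stage_rated, p_main_rated) <= 0:
        raise ValueError("pressures must be positive")
    return (p_stage / p_main) * (p_main_rated / p_stage_rated) * 100.0

# Hypothetical rated values, chosen only for illustration.
P_STAGE_RATED = 12.0   # rated regulating stage pressure, MPa
P_MAIN_RATED = 16.7    # rated main steam pressure, MPa

# One hypothetical operating sample from the historical data.
g1 = equivalent_flow_pct(p_stage=8.4, p_main=16.2,
                         p_stage_rated=P_STAGE_RATED,
                         p_main_rated=P_MAIN_RATED)
print(f"equivalent flow: {g1:.1f} %")
```

Under this assumed form, a sample at rated pressures maps to 100%, which gives a quick sanity check on the units of the historical data.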

Theoretical basis of model
The main distribution range of the power generation load is 150-300 MW. A single model cannot accurately express the valve flow characteristics under different working conditions. Thus, sub models under different working conditions are established.
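The FCM algorithm divides the $n$ operating samples into $c$ classes by minimizing the following objective function:

$$J(U,V)=\sum_{i=1}^{c}\sum_{j=1}^{n}u_{ij}^{m}\left\lVert x_j-v_i\right\rVert^{2},\qquad \sum_{i=1}^{c}u_{ij}=1 \tag{1}$$

where $c$ is the number of cluster categories, $m$ ($m > 1$) is the fuzzy index, $u_{ij}$ represents the membership degree of sample $j$ belonging to class $i$, $v_i$ is the clustering center of class $i$, and $\lVert x_j - v_i\rVert$ is the Euclidean distance from sample $x_j$ to cluster center $v_i$. Equation (1) can be solved through the following iterative process:

$$u_{ij}=\frac{1}{\sum_{k=1}^{c}\left(\dfrac{\lVert x_j-v_i\rVert}{\lVert x_j-v_k\rVert}\right)^{\frac{2}{m-1}}} \tag{2}$$

$$v_i=\frac{\sum_{j=1}^{n}u_{ij}^{m}x_j}{\sum_{j=1}^{n}u_{ij}^{m}} \tag{3}$$

where $i = 1, 2, \cdots, c$ and $j = 1, 2, \cdots, n$. Figure 1 shows the steps of the FCM algorithm to determine the clustering centers and the membership matrix. The stop iteration threshold and the maximum number of iterations are set, and the minimization of objective function (1) is achieved through continuous iteration of the membership matrix.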
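The iteration between the membership matrix and the clustering centers can be sketched with NumPy as follows. This is a minimal sketch under stated assumptions: the function name `fcm`, the default fuzzy index of 2, the tolerance, the iteration limit, and the random initialization are all illustrative choices, and the data are synthetic rather than unit data.

```python
import numpy as np

def fcm(X, c, m=2.0, tol=1e-6, max_iter=300, seed=0):
    """Fuzzy C-means on X of shape (n, p).

    Returns the cluster centers V (c, p) and the membership matrix U (c, n).
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    # Random initial memberships, normalized so each column sums to 1.
    U = rng.random((c, n))
    U /= U.sum(axis=0, keepdims=True)
    for _ in range(max_iter):
        # Center update: membership-weighted mean of the samples.
        Um = U ** m
        V = (Um @ X) / Um.sum(axis=1, keepdims=True)
        # Euclidean distances d[i, j] = ||x_j - v_i||, floored to avoid 0-division.
        d = np.linalg.norm(X[None, :, :] - V[:, None, :], axis=2)
        d = np.fmax(d, 1e-12)
        # Membership update, written in the equivalent normalized form.
        inv = d ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=0, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return V, U

# Synthetic example: two well-separated groups in a 3-D feature space.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 0.1, (50, 3)), rng.normal(5.0, 0.1, (50, 3))])
V, U = fcm(X, c=2)
labels = U.argmax(axis=0)  # hard assignment: class with the largest membership
print("centers:\n", V)
print("samples per class:", np.bincount(labels))
```

The stopping rule here compares successive membership matrices against the threshold; comparing successive values of the objective function would be an equally reasonable choice.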
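LSSVM identification principle
LSSVM solves a set of linear equations to reduce the computational complexity of traditional support vector machines. Given a set of training samples $D$,

$$D=\left\{(x_j,y_j)\mid j=1,2,\ldots,n\right\},\qquad x_j\in\mathbb{R}^{p},\ y_j\in\mathbb{R}$$

where $x_j$ is the input of sample $j$, $y_j$ is the output of sample $j$, $p$ is the dimension of the input vector, and $n$ is the number of training samples. By mapping the inputs through a non-linear function, the optimization problem of LSSVM can be written as:

$$\min_{w,b,\xi}\ J(w,\xi)=\frac{1}{2}w^{T}w+\frac{\gamma}{2}\sum_{j=1}^{n}\xi_j^{2}\qquad \text{s.t.}\quad y_j=w^{T}\phi(x_j)+b+\xi_j,\quad j=1,2,\ldots,n$$

where $w$ is the weight vector, $\gamma$ is the penalty factor, and $\xi_j$ is the relaxation variable.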
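The Lagrange function of the above problem can be expressed as

$$L(w,b,\xi,\alpha)=\frac{1}{2}w^{T}w+\frac{\gamma}{2}\sum_{j=1}^{n}\xi_j^{2}-\sum_{j=1}^{n}\alpha_j\left(w^{T}\phi(x_j)+b+\xi_j-y_j\right) \tag{8}$$

where $\phi$ represents the nonlinear mapping function, $b$ is the bias quantity, $\alpha_j$ is the introduced Lagrangian multiplier, and $j = 1, 2, \ldots, n$.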
The final model output function can be obtained from the Karush-Kuhn-Tucker conditions:

$$f(x)=\sum_{j=1}^{n}\alpha_j K(x,x_j)+b$$

where $K(x, x_j)$ is the kernel function. The commonly used radial basis function is adopted, which can obtain a smooth model estimation (Hong and Wen, 2021).
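To make the "linear equations instead of quadratic programming" point concrete, the sketch below trains an LSSVM by solving the standard bordered linear system for $b$ and $\alpha$. It is a minimal illustration, not the authors' implementation: the Gaussian form of the radial basis kernel, its width `sigma`, the value of `gamma`, the function names, and the synthetic data are all assumptions.

```python
import numpy as np

def rbf_kernel(A, B, sigma):
    """Gaussian RBF kernel matrix between the rows of A (n, p) and B (k, p)."""
    sq_dist = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-sq_dist / (2.0 * sigma ** 2))

def lssvm_fit(X, y, gamma=1000.0, sigma=1.0):
    """Solve the LSSVM linear system; returns the bias b and multipliers alpha.

    System:  [ 0   1^T         ] [ b     ]   [ 0 ]
             [ 1   K + I/gamma ] [ alpha ] = [ y ]
    """
    n = X.shape[0]
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = 1.0
    A[1:, 0] = 1.0
    A[1:, 1:] = rbf_kernel(X, X, sigma) + np.eye(n) / gamma
    rhs = np.concatenate(([0.0], y))
    solution = np.linalg.solve(A, rhs)
    return solution[0], solution[1:]

def lssvm_predict(X_train, b, alpha, X_new, sigma=1.0):
    """Model output f(x) = sum_j alpha_j K(x, x_j) + b for each row of X_new."""
    return rbf_kernel(X_new, X_train, sigma) @ alpha + b

# Synthetic 1-D example (not unit data): fit a smooth curve.
X = np.linspace(0.0, 3.0, 30).reshape(-1, 1)
y = np.sin(X).ravel()
b, alpha = lssvm_fit(X, y, gamma=1000.0, sigma=1.0)
y_hat = lssvm_predict(X, b, alpha, X, sigma=1.0)
print("max training error:", np.abs(y_hat - y).max())
```

In this formulation the training residual of each sample equals $\alpha_j/\gamma$, so a larger penalty factor forces a closer fit to the training data at the cost of less smoothing.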
The membership degree aggregation strategy is introduced to integrate the sub models. The final FCM-LSSVM model is established as follows:

$$h(x)=\sum_{i=1}^{c}u_{ij}\,f_i(x)$$

where $h(x)$ is the aggregated output of all sub model predictions, $f_i(x)$ is the output of sub model $i$ obtained by Formula (7), and $u_{ij}$ is the membership value of the sample in class $i$.
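The aggregation step can be sketched as a membership-weighted sum of the sub model outputs. The text does not state how the membership of a new sample is obtained, so this sketch assumes it is computed from the stored clustering centers with the FCM membership formula; the function names, the centers, and the stand-in sub models are hypothetical.

```python
import numpy as np

def memberships(x, centers, m=2.0):
    """FCM membership of one sample x (p,) in each cluster, given centers (c, p)."""
    d = np.fmax(np.linalg.norm(centers - x, axis=1), 1e-12)
    inv = d ** (-2.0 / (m - 1.0))
    return inv / inv.sum()

def aggregate(x, centers, sub_models, m=2.0):
    """Aggregated output h(x): membership-weighted sum of sub model outputs."""
    u = memberships(x, centers, m)
    preds = np.array([f(x) for f in sub_models])
    return float(u @ preds)

# Hypothetical centers and stand-in sub models (simple callables).
centers = np.array([[0.0, 0.0], [2.0, 0.0]])
sub_models = [lambda x: 10.0 + x[0], lambda x: 20.0 - x[0]]

x_new = np.array([0.5, 0.0])
print("memberships:", memberships(x_new, centers))
print("h(x):", aggregate(x_new, centers, sub_models))
```

A sample close to one center is dominated by that cluster's sub model, while a sample between working conditions blends neighboring sub models, which avoids jumps at cluster boundaries.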

Establishment of the valve flow characteristic model
After the unit's historical operating data within the load distribution range are preprocessed, the FCM algorithm is adopted, and the membership principle is introduced. The three parameters of integrated valve position instruction $R_f$, speed $n$, and real power $W$ are selected as the characteristic variables. According to Equations (1-3), $c$ clusters of data are obtained, and the output is set as the actual inlet steam flow of the turbine. According to Formulas (5-7), the LSSVM sub model is trained for each cluster.

Application instance
A typical 330 MW domestic subcritical unit is taken as an application case. The steam turbine has two high-pressure main valves and six high-pressure regulating valves, which are controlled by the CIS valve. The operating conditions are distributed between 150 and 300 MW. The distributed control system collects the historical operation data of the unit, including the parameters shown in Table 1. The sampling interval is 10 s, and a total of 17,567 groups of operation data were collected. After eliminating the operation fault data and downtime data, 16,343 groups of historical operation data of the unit were obtained. These data cover both the stable operation periods of the unit and the load raising and lowering periods. The actual inlet steam flow was then calculated according to Equation (9). The first 11,284 groups of historical operation data were taken as the training set, and the last 5,059 groups were taken as the test set. The data distribution is shown in Figure 3.
Z-score standardization was used to process the unit operation data so that each variable has a mean value of 0 and a standard deviation of 1. The data processing function is

$$x'=\frac{x-\mu}{\sigma}$$

where $x'$ is the normalized data, $\mu$ is the sample mean, and $\sigma$ is the sample standard deviation.
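The standardization can be sketched with the Python standard library. The helper names and the sample column are hypothetical. Fitting the mean and standard deviation on the training portion and reusing them for the test portion is an assumption of this sketch, since the text does not say which samples were used to compute them.

```python
import statistics

def zscore_fit(column):
    """Return the sample mean and sample standard deviation of one variable."""
    mu = statistics.fmean(column)
    sigma = statistics.stdev(column)  # sample standard deviation (n - 1)
    return mu, sigma

def zscore_apply(column, mu, sigma):
    """Standardize a column with a previously fitted mean and deviation."""
    return [(x - mu) / sigma for x in column]

# Hypothetical real power samples in MW.
train_power = [150.0, 200.0, 250.0, 300.0]
test_power = [180.0, 275.0]

mu, sigma = zscore_fit(train_power)
train_scaled = zscore_apply(train_power, mu, sigma)
test_scaled = zscore_apply(test_power, mu, sigma)  # reuse training statistics
print(train_scaled)
print(test_scaled)
```

Standardizing each input separately keeps variables with large numeric ranges, such as speed, from dominating the Euclidean distances used in clustering and in the kernel.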
FCM clustering was performed on the normalized historical operation data of the unit. The final cluster classification of the data is shown in Figure 4. The results show that all the data samples were divided into five subsets after clustering. Moreover, the working conditions in each subset were similar and concentrated around powers of 165, 209, 240, 275, and 300 MW. The data similarity within each subset was maximized, which supports the accuracy of the subsequently trained LSSVM models.
The input vector $X = [R_f, n, W]$ and the output variable $Y = [G_1]$ were used to train an LSSVM model for each data subset. Finally, the five sub models were integrated into one FCM-LSSVM model, which could simulate the test conditions of the turbine valve flow characteristics.
This FCM-LSSVM model was used to predict the actual inlet steam flow of the test set data. The results are shown in Figure 5. The data curves of the actual calculated value and the model identification value were nearly identical, indicating a high prediction accuracy. Figure 6 shows the model identification error. The difference between the actual and predicted inlet flow of the turbine was <0.08%. This finding proves that the model can accurately predict the actual inlet flow.
For comparison, a BP neural network model was trained to predict the actual inlet steam flow of the turbine, and the results are shown in Figure 7. The identification error of the BP model is shown in Figure 8 and is around 20%. The prediction accuracy of the FCM-LSSVM model is thus significantly higher than that of the BP neural network, which makes the FCM-LSSVM model more suitable for the identification of turbine valve flow characteristics.
Average relative error (ARE), root mean square error (RMSE), and normalized RMSE (NRMSE) were used as indicators for measuring the identification effect of the model. In the definitions of these indicators, $G_i$ is the actual inlet steam flow, $\hat{G}_i$ is the predicted value, $\bar{G}$ is the average value of the actual flow, $G_{max}$ and $G_{min}$ are the maximum and minimum values of the actual inlet steam flow, and $n$ is the number of samples in the test set. Table 2 shows that the prediction error of the actual inlet steam flow using a single sub model is large. Each sub model fits the data well in only one working condition and cannot accurately fit the data over the overall operating range.
With the integrated valve position instruction as the horizontal coordinate and the actual inlet steam flow percentage predicted by the model as the vertical coordinate, the relationship diagram between the two variables was constructed, and the curve was fitted, as shown in Figure 9.
The simulator was used to test the valve flow characteristics of the subcritical turbine unit. The AGC of the unit was removed, the primary frequency modulation was withdrawn, and the CIS valve control method was adopted for the test. The test curve of the valve flow characteristics of the turbine was finally obtained, as shown in Figure 10. A comparison of Figures 9 and 10 shows that the valve flow characteristic curve predicted by the FCM-LSSVM model is similar to the curve obtained from the flow characteristic test. This finding indicates that the method proposed in this study can replace the flow characteristic test and identify the valve flow characteristics accurately.
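The three error indicators reported in Table 2 can be sketched as follows. The exact definitions are not recoverable from the text, so this sketch assumes common forms: ARE as the mean absolute relative error in percent, RMSE in the unit of the flow, and NRMSE as RMSE divided by the range $G_{max} - G_{min}$ in percent. The function name and the sample values are hypothetical.

```python
import math

def error_metrics(actual, predicted):
    """Return (ARE in %, RMSE, NRMSE in %) under the assumed definitions."""
    n = len(actual)
    if n == 0 or n != len(predicted):
        raise ValueError("actual and predicted must be non-empty and equal in length")
    # Assumed ARE: mean of |predicted - actual| / |actual|, in percent.
    are = 100.0 * sum(abs(p - a) / abs(a) for a, p in zip(actual, predicted)) / n
    # RMSE: square root of the mean squared error.
    rmse = math.sqrt(sum((p - a) ** 2 for a, p in zip(actual, predicted)) / n)
    # Assumed NRMSE: RMSE normalized by the range of the actual values, in percent.
    nrmse = 100.0 * rmse / (max(actual) - min(actual))
    return are, rmse, nrmse

# Hypothetical flow values in percent, for illustration only.
actual_flow = [50.0, 62.0, 75.0, 88.0, 100.0]
predicted_flow = [50.4, 61.5, 75.3, 87.6, 99.8]

are, rmse, nrmse = error_metrics(actual_flow, predicted_flow)
print(f"ARE = {are:.2f} %, RMSE = {rmse:.3f}, NRMSE = {nrmse:.2f} %")
```

If the original NRMSE is normalized by the mean flow $\bar{G}$ rather than by the range, only the denominator in the last indicator would change.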
The simulation results are the real valve flow characteristics of the turbine identified by the model, reflecting the current actual control condition of the unit. With the use of the turbine and the aging of the equipment, the actual valve flow characteristics will change, and this change will also lead to inaccurate valve operation during the primary and secondary regulation of the unit, which will affect the frequency regulation capability. By using the model to identify the valve flow characteristics and replacing the valve flow characteristics curve with a model, the valve operation can be made more accurate, thus improving the frequency regulation performance.

Conclusions
In this paper, an identification model of steam turbine valve flow characteristics based on FCM-LSSVM theory was proposed. Because the unit operates over a wide range of variable operating conditions, the samples were grouped into multiple classes based on the comprehensive valve position command, real power, and speed. An LSSVM sub model was then established for each load condition, and the sub models were integrated into the final model. The flow characteristic identification model was established using the proposed method, taking a 330 MW steam turbine as an example. The results show that the difference between the identification value and the actual value of the steam inlet flow is within 0.08%. The mean relative error, RMSE, and NRMSE of the FCM-LSSVM model are 0.66, 0.63, and 0.93%, respectively. The identification results of the steam turbine valve flow characteristics are accurate and have guiding significance for improving the frequency modulation performance of thermal power units.

Data availability statement
The original contributions presented in the study are included in the article/supplementary material; further inquiries can be directed to the corresponding author.

Author contributions
XH, FJ, BW, QZ, HS, and CW contributed to conception and design of the study. XH and FJ processed data. BW and QZ performed the statistical analysis. CW and HS wrote the first draft of the manuscript. CW was involved in the writing of the first manuscript of this paper and provided many suggestions throughout the process. All authors contributed to manuscript revision, read, and approved the submitted version.

Conflict of interest
XH, FJ, and BW were employed by State Grid Hebei Energy Technology Service Co., Ltd.
The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.