UHVDC transmission line diagnosis method for integrated community energy system based on wavelet analysis

With the large-scale development of renewable energy power, China faces the challenges of the reverse regional distribution of wind and solar resources and power load, as well as the intermittency and randomness of renewable energy power. Therefore, China is vigorously developing ultra-high voltage direct current (UHVDC) transmission technology to solve the problem of insufficient flexibility caused by the uncertainty of renewable energy and the fluctuation of multi-energy loads in integrated community energy systems. UHVDC plays an increasingly pivotal role in the west-east transmission system of China's power system due to its high transmission capacity and long transmission distance. Once a fault occurs in a UHVDC transmission line, quick and accurate fault location identification is of great significance. Hence, this paper proposes a UHVDC transmission line diagnosis method based on wavelet analysis for integrated community energy systems. Wavelet transform (WT) is used to decompose the transient signal at multiple scales, and power systems computer-aided design (PSCAD) software is then used for simulation to obtain the singular spectrum entropy of each layer, with wavelet transformations for signal denoising facilitated by tools such as MATLAB. The prediction results can distinguish faults outside the rectification side, faults within the rectification side, and faults outside the inverter side with an accuracy of 100%. A large number of simulations demonstrate that combining singular spectrum entropy with support vector machines (SVM) is a robust technique for integrated community energy systems, suggesting its potential as a standard method in UHVDC transmission line diagnosis. This study is of significant reference for realizing the


Introduction
As global energy demand rises and the public's awareness of environmental protection increases, there is a growing demand for clean energy sources. China's energy resource centers and load centers are distributed in reverse, and the energy flow follows the pattern of "sending coal from the west to the east and from the north to the south" as well as "sending electricity from the west to the east and from the north to the south." China's wind and solar energy resources have great potential for development, with the most abundant distribution in the northwest, Xinjiang, and Inner Mongolia regions. Although these regions have abundant wind and solar resources, their load is low due to factors such as productivity and population, which is exactly the opposite of the central and eastern regions, leading to serious wind and solar curtailment in the northwest region (Zhou, 2019). In the future, China's energy production centers are expected to move west and north, while the center of gravity of demand may remain in the central and eastern regions for a long time, and the scale and distance of energy flow will further increase. For future large-scale access to new energy under a large-capacity "west to east" scenario, the appropriate transmission mode needs to be planned ahead (Pan et al., 2016; Ding et al., 2021), which poses higher requirements for electric energy transmission technology. Among various energy transmission methods, UHVDC technology has played an increasingly important role in power transmission and has effectively addressed the unbalanced distribution between energy centers and load centers owing to its advantages of large transmission power, low line cost, and good control performance (Wang et al., 2007; Muzzammel, 2020). From an economic point of view, UHVDC is more suitable for long-distance power supply; the longer the distance, the greater the economic advantage. Among the existing UHV lines in China, UHVDC transmission is generally chosen for ultra-long-distance power transmission. The main way to fully implement the new national energy development plan is to transmit wind and photovoltaic power from large energy bases in the "Three Northern Regions" to the eastern and central load centers through the UHVDC transmission system.
Numerous studies have been dedicated to efficiently accommodating excess renewable generation, reducing renewable curtailment, and improving overall energy efficiency. According to the development research of the State Grid, clean energy, such as wind power located in the energy base of North China (including western Inner Mongolia), can be transmitted through UHVDC systems. Similarly, clean energy from energy bases in the northeast and northwest (including Xinjiang) regions can only be transported through UHVDC systems (Zhen et al., 2008; Shu et al., 2012). Ma et al. (2018) and Wang L. et al. (2018) established a planning model to minimize investment, construction, and operating costs. Zeng et al. (2018) studied methods for handling multiple uncertainties in the programming models. Zhou et al. (2020) studied a planning method based on maximum utilization hours, established to minimize construction and operating costs throughout the entire cycle. However, the above studies have overlooked the complex geographical environment along the UHVDC transmission line. The fault rate of the transmission line is high during extreme events, making it difficult to ensure continuous energy supply to important users and rapid recovery from faults, which compromises the safety and reliability of direct current transmission systems (Meghwani et al., 2017). Therefore, the protection of UHV transmission lines becomes paramount, highlighting the importance of fast and accurate fault diagnosis for the safe and stable operation of UHVDC transmission systems. Based on this, Niazy and Sadeh (2013), Singh Brains et al. (2017), Wang Y. et al. (2018), and Shu et al. (2020) conducted related research. At present, transient protection that exploits the attenuation characteristics of the line boundary for high-frequency quantities is the development direction of UHVDC transmission line protection. Studies by Yong et al. (2009), Yan et al. (2017), Sheng et al. (2019), and Muzzammel (2020) have made significant progress in this realm. However, these studies overlook the complex regional environment along UHVDC transmission lines. In the case of extreme events, the high transmission line failure rate seriously affects the safety and reliability of the HVDC transmission system, making it difficult to ensure a continuous energy supply to important users and rapid recovery from faults.
The support vector machine (SVM) is a machine learning method based on statistical learning theory developed in the 1990s (Bauer et al., 2011), which classifies samples by finding an optimal classification hyperplane between them (Xue et al., 2015). Because SVM is a statistical learning method well suited to recognizing patterns in small-sample data, fault types can be distinguished by training and testing on sample sets (Shu et al., 2010; Liu and Chen, 2017; Wang C. et al., 2018; Zhen et al., 2019). Zhu et al. (2011) introduced an identification method combining SVM and multi-resolution singular spectral entropy to address the problem of classifying islanding and grid interference. Considering the practical difficulty of obtaining a large number of typical samples of UHVDC line boundary and fault transient signals, the combination of multi-resolution singular spectrum entropy and SVM is applied to the problem of identifying internal and external faults in UHVDC line transient protection. This approach is effective in classifying fault categories with minimal sample data, yielding the desired outcomes (Adly et al., 2020; Chen et al., 2020). Hence, this study proposes a transient protection method for UHVDC lines based on SVM and multi-resolution singular spectrum entropy. Section 2 provides a brief overview of the methodology designed in the study. Section 3 discusses the modeling used for verification. Section 4 elaborates on the analysis of the frequency-domain waveforms and the singular spectral entropy used to classify signals at three different positions. Finally, Section 5 concludes with the key insights derived from this study.

Singular spectral entropy
Singular spectrum analysis is an effective time-domain analysis method that transforms the embedding space into an equivalent orthogonal coordinate system. This transformation yields signal trajectories in the subspace with the minimum embedding dimension, thereby eliminating linear dependencies and artificial symmetry between delay coordinates. As a result, it enhances the signal-to-noise ratio and sharpens signal singularity. However, as a time-domain analysis method, traditional singular spectrum analysis is not well suited to multi-scale monitoring of signal singularity and fault localization. Thus, the multi-resolution singular spectrum entropy combines the ideas of multi-resolution analysis and information entropy to characterize the singular state of the signal energy distribution and the probability of the geometric feature distribution at different scales (Wu et al., 2012).

Wavelet singular spectral entropy extraction
1) If the discrete signal f(k) represents the approximate value at scale j = 0, denoted as c_0(k) = f(k), then the discrete dyadic wavelet transform can be expressed as in Eq. 1.
where H and G are the low-pass and high-pass filters, respectively. In addition, c_j(k) and d_j(k) represent the approximate and detailed parts of the signal at scale j, respectively. The discrete signal f(k) is decomposed at scales 1, 2, ..., J into d_1, d_2, ..., d_J, c_J, which contain information in different frequency bands from high frequency to low frequency.
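For orientation, a standard form of the dyadic decomposition that is consistent with the definitions of H, G, c_j(k), and d_j(k) is sketched below. It follows the usual Mallat recursion and is stated as an assumption, not as a transcription of Eq. 1:

```latex
\begin{aligned}
c_{j+1}(k) &= H\,c_j(k), \\
d_{j+1}(k) &= G\,c_j(k), \qquad j = 0, 1, \ldots, J-1, \qquad c_0(k) = f(k).
\end{aligned}
```

Under this form, each level splits the current approximation into a coarser approximation and a detail part, so the detail layers d_1, ..., d_J run from the highest to the lowest frequency band.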
2) Perform wavelet reconstruction on each decomposed layer of the signal. If only the wavelet coefficients of a certain frequency band are retained and the wavelet coefficients of the other frequency bands are set to 0, the reconstructed time-domain signal of that band is obtained, and singular spectrum analysis is performed on it. The reconstruction formula is in Eq. 2.
where H* and G* are respectively the dual operators of H and G.
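The decomposition and single-band reconstruction in steps 1) and 2) can be sketched as follows. This is a minimal illustration that assumes the Haar wavelet so that the filters fit in a few lines of standard-library Python; the simulation in this paper uses db4, and all function names are hypothetical. The inverse step plays the role of the dual operators H* and G*, under the usual assumption that the reconstruction has the form c_j = H* c_{j+1} + G* d_{j+1}.

```python
import math

SQRT2 = math.sqrt(2.0)

def haar_step(c):
    """One analysis step: split c into a coarser approximation and a detail part."""
    half = len(c) // 2
    approx = [(c[2 * i] + c[2 * i + 1]) / SQRT2 for i in range(half)]
    detail = [(c[2 * i] - c[2 * i + 1]) / SQRT2 for i in range(half)]
    return approx, detail

def haar_inverse_step(approx, detail):
    """One synthesis step: rebuild the finer approximation from approx and detail."""
    out = []
    for a, d in zip(approx, detail):
        out.extend([(a + d) / SQRT2, (a - d) / SQRT2])
    return out

def decompose(signal, levels):
    """Return [d_1, ..., d_J, c_J]; len(signal) must be divisible by 2**levels."""
    coeffs, c = [], list(signal)
    for _ in range(levels):
        c, d = haar_step(c)
        coeffs.append(d)
    coeffs.append(c)
    return coeffs

def reconstruct_band(coeffs, keep):
    """Time-domain signal obtained by keeping coeffs[keep] and zeroing all other bands."""
    kept = [band if i == keep else [0.0] * len(band) for i, band in enumerate(coeffs)]
    c = kept[-1]                                 # start from c_J
    for j in range(len(coeffs) - 2, -1, -1):     # apply d_J, ..., d_1 in turn
        c = haar_inverse_step(c, kept[j])
    return c
```

Because the transform is linear and invertible, the single-band reconstructions add up to the original signal, which is a convenient sanity check before computing any entropy values.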
3) On layer j, let the discrete wavelet reconstructed signal from the multi-resolution analysis be D_j = {d_j(k)}, k = 1, 2, ..., N, and map the reconstructed signal D_j to the embedding space. For the N-point sequence D_j(1), D_j(2), ..., D_j(N), the reconstruction state at discrete time i is defined with reconstruction delay J and embedding dimension m, and the corresponding reconstruction trajectory is in Eq. 3.
Here, the elements of A_j ∈ R^(N×N) are given by the correlation function in Eq. 4.
4) Perform singular value decomposition on the matrix A_j of each layer, and let the singular values be δ_ji. Then, the δ_ji form the singular spectrum of the signal on that layer. If k is the number of non-zero singular values, then k reflects the number of different patterns in the columns of the feature matrix A_j. The magnitude of the singular value δ_ji reflects the proportion of the corresponding pattern in the total pattern. Therefore, the singular values δ_ji constitute a partition of the vibration signal in the time-frequency domain of the wavelet signal. According to information entropy theory, the singular spectral entropy of the reconstructed signal at this level can be calculated by Eq. 5.
where p_ji

Wavelet singular spectrum entropy eigenvector extraction
Specific steps of feature vector extraction: 1) Perform WT on the selected voltage signal, use phase space reconstruction to map the embedding space into an equivalent orthogonal coordinate system, and construct the (N − n + 1) × n matrix A_j in Eq. 6.
2) Perform singular value decomposition of the matrix A_j for each layer. 3) Compute the singular spectral entropy from the singular values of each layer, and combine the entropy values of all layers to obtain the eigenvector T in Eq. 7.
T = [e_1, e_2, ..., e_j], where e_1, e_2, ..., e_j represent the entropy value of each layer, and the corresponding frequency bands run from high frequency to low frequency.
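The three extraction steps can be sketched in NumPy as follows. This is a hedged illustration under stated assumptions: the (N − n + 1) × n matrix is built from unit-delay sliding windows, the singular values are normalized by their sum to give proportions p, and the entropy uses the natural logarithm. The function names and the choice of log base are assumptions, not taken from this paper.

```python
import numpy as np

def trajectory_matrix(x, n):
    """Stack length-n sliding windows of x into an (N - n + 1) x n matrix."""
    x = np.asarray(x, dtype=float)
    rows = len(x) - n + 1
    return np.array([x[i:i + n] for i in range(rows)])

def singular_spectral_entropy(x, n):
    """Shannon entropy of the singular values of the trajectory matrix, normalized to sum to 1."""
    s = np.linalg.svd(trajectory_matrix(x, n), compute_uv=False)
    total = s.sum()
    if total == 0.0:
        return 0.0                  # all-zero band: no energy, no uncertainty
    p = s / total                   # assumed normalization: proportion of each singular value
    p = p[p > 0]                    # drop exact zeros so that log is defined
    return float(-(p * np.log(p)).sum())

def feature_vector(band_signals, n):
    """T = [e_1, ..., e_j]: one entropy value per reconstructed band, high to low frequency."""
    return [singular_spectral_entropy(band, n) for band in band_signals]
```

As a usage note, a pure sinusoid yields a trajectory matrix of rank 2 and therefore a small entropy, while white noise spreads its energy over all n singular values and approaches the maximum value ln(n).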

Support vector machines (SVM)
The singular spectral entropy value of the transient signal at the initial moment of a fault in a DC transmission line can reflect information about the fault in the UHVDC transmission line. However, the values are all less than 1. Therefore, given the presence of abnormal data, the limited number of faults, and the small number of training samples, the singular spectral entropy is used as the input of the SVM classification model to determine the type of fault.
The learning strategy of SVM is to find the optimal hyperplane by maximizing the margin between the classes, which is transformed into a convex quadratic programming problem. A nonlinear mapping is used to map the sample space to a high-dimensional feature space, and linear learning machines are applied in the feature space to solve nonlinear classification and regression problems in the sample space (Zhang et al., 2020). Figure 1 shows the network structure of the SVM, with the detailed definitions of variables and functions provided in the following text.

Principles of SVM
For a non-linearly separable sample set, the relaxation variable ξ_i ≥ 0 and the penalty factor C are introduced, and the objective function is given in Eq. 8.

FIGURE 2 The external grounding fault and line short circuit on the rectifier side in the frequency domain.

Frontiers in Energy Research
FIGURE 3 The internal grounding fault and line short circuit in the frequency domain.

where φ(W, ξ) is the objective function and W denotes the normal vector of the separating hyperplane. The decision function of the optimal hyperplane is in Eq. 9.
where α_i* and b* are the parameters that determine the optimal separating hyperplane, x and x_i represent points in the sample space, with x_i being the support vectors, y_i denotes the category label, and sgn() is the sign function.
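For readers who want the two expressions in explicit form, the textbook soft-margin formulation that matches the symbols used here is sketched below. It is an assumption based on the standard SVM derivation, with n denoting the number of training samples, b the bias, and K the kernel function, none of which are defined in the text above:

```latex
\begin{aligned}
\min_{W,\,b,\,\xi}\quad & \varphi(W, \xi) = \tfrac{1}{2}\lVert W \rVert^{2} + C \sum_{i=1}^{n} \xi_i \\
\text{s.t.}\quad & y_i\,(W \cdot x_i + b) \ge 1 - \xi_i, \qquad \xi_i \ge 0, \qquad i = 1, \ldots, n, \\[4pt]
& f(x) = \operatorname{sgn}\!\Big( \sum_{i=1}^{n} \alpha_i^{*}\, y_i\, K(x_i, x) + b^{*} \Big).
\end{aligned}
```

A larger C penalizes margin violations more heavily, which is why C has to be tuned together with the kernel width.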

Selection of kernel function and parameter optimization methods
This paper uses the Gaussian kernel function, whose formula is in Eq. 10, where ‖X − Y‖ represents the distance between two vectors and δ denotes the width of the kernel function.
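A Gaussian kernel of this kind can be sketched as below. The parameterization exp(−‖X − Y‖² / (2δ²)) is one common convention and is an assumption here; other references omit the factor 2, so the numerical meaning of δ depends on the convention in use. The function name is hypothetical.

```python
import math

def gaussian_kernel(x, y, delta):
    """Gaussian kernel exp(-||x - y||^2 / (2 * delta^2)).

    The factor 2 in the denominator is an assumed convention.
    """
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * delta ** 2))
```

The kernel equals 1 for identical vectors and decays toward 0 as the distance grows; a larger δ makes the decay slower and the resulting decision boundary smoother.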
The bilinear search method and the grid search method are both used to determine the parameters of the SVM classifier. Unlike the bilinear search method, the grid search method allows the SVMs for different (C, δ) pairs to be trained in parallel, because the pairs are independent of each other. Therefore, this paper adopts the grid search method, taking M values of C and N values of δ to form M × N combinations (C, δ), which are trained separately. The learning accuracy of each combination is then estimated, and the combination with the highest learning accuracy is taken as the optimal parameter.

(Verrax et al., 2020; Mohammadi et al., 2021). Subsequently, phase space reconstruction is performed on the high-frequency coefficients of each layer, followed by singular value decomposition on the phase space of each layer to obtain singular values of different sizes (Song et al., 2011; Metidji et al., 2013; Xiang et al., 2018).
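The grid search over (C, δ) combinations described in this section can be sketched as follows. The sketch is deliberately generic: `evaluate` is a hypothetical callback that would train an SVM with the given pair and return its accuracy, and `toy_accuracy` is a purely illustrative stand-in that does not reproduce any result from this paper. The power-of-two grid is also an assumption.

```python
from itertools import product

def grid_search(c_values, delta_values, evaluate):
    """Score every (C, delta) pair independently and return the best one.

    Returns (best_pair, best_score, all_scores). Each pair is evaluated on its
    own, so the loop could be distributed across workers without changes.
    """
    scores = {(c, d): evaluate(c, d) for c, d in product(c_values, delta_values)}
    best = max(scores, key=scores.get)
    return best, scores[best], scores

def toy_accuracy(c, delta):
    # Hypothetical placeholder for "train an SVM with (C, delta), return accuracy".
    return 1.0 / (1.0 + abs(c - 4.0) + abs(delta - 0.5))

c_grid = [2.0 ** k for k in range(-2, 5)]        # M = 7 candidate values of C
delta_grid = [2.0 ** k for k in range(-2, 5)]    # N = 7 candidate values of delta
best, best_score, scores = grid_search(c_grid, delta_grid, toy_accuracy)
```

In practice the callback would typically return a cross-validated accuracy on the training set, so that the test set is not used for parameter selection.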

Failure diagnosis method
Since the magnitude of the singular values reflects the differences between various fault states, singular spectral entropy can quantitatively describe the degree of change (Li C. et al., 2018; Li Y. et al., 2018; Li and Chen, 2019). Singular spectral entropy reflects the uncertainty of the signal energy. The simpler the signal components, the smaller the singular spectral entropy value, indicating more concentrated signal energy. Conversely, the more complex the signal components, the larger the singular spectral entropy value, indicating that the energy is more dispersed and more evenly distributed across the frequency components (Moreno et al., 2014; Wang and Zheng, 2014).
Therefore, the singular spectral entropy of the fault signal at different decomposition levels is calculated, and the fault signal is then diagnosed through the singular spectral entropy (Xing et al., 2016). Singular spectral entropy can reflect the different changes in fault signals at three different locations, namely outside the rectification side, within the rectification side, and outside the inverter side, after attenuation by the boundary and the line.

Simulation verification
The simulation model is established based on the actual parameters of the Yunguang ±800 kV UHVDC transmission system.For training, 3 external grounding faults are considered on the rectifier side, 3 external grounding faults on the inverter side, and 9 internal grounding faults.The training set is shown in Table 1.For testing, 3 external grounding faults on the rectifier side, 3 external grounding faults on the inverter side, and 9 internal grounding faults are used.The testing set is shown in Table 2, and the identification results of the grounding fault are shown in Table 3.
As can be seen in Table 6, selecting appropriate C and δ can achieve the best testing accuracy.
When C = 2 and δ = 2, the testing accuracy of the external grounding fault on the rectifier side, the internal grounding fault, and the external grounding fault on the inverter side is 100%.This paper uses PSCAD to establish a model, with a sampling frequency of 40 kHz, a sampling time of 0.05 s, and a total of 2000 sampling points.The WT uses the db4 wavelet with a decomposition level of 6.
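The sampling figures above can be checked with a short calculation, which also gives the nominal frequency band of each detail layer for a 6-level dyadic decomposition. The bands assume ideal half-band filters, with layer j covering fs / 2^(j+1) to fs / 2^j; real db4 filters overlap at the band edges, so the values are approximate. The function names are hypothetical.

```python
def sample_count(fs_hz, duration_s):
    """Number of samples recorded at rate fs_hz over duration_s seconds."""
    return round(fs_hz * duration_s)

def detail_bands(fs_hz, levels):
    """Nominal (low, high) band in Hz of detail layer d_j for j = 1..levels."""
    return [(fs_hz / 2 ** (j + 1), fs_hz / 2 ** j) for j in range(1, levels + 1)]

n_samples = sample_count(40_000, 0.05)   # 40 kHz for 0.05 s
bands = detail_bands(40_000, 6)          # d_1 is the highest band, d_6 the lowest
for j, (low, high) in enumerate(bands, start=1):
    print(f"d_{j}: {low:g} Hz to {high:g} Hz")
```

Under these assumptions, d_1 spans 10 kHz to 20 kHz and d_6 spans 312.5 Hz to 625 Hz, with the final approximation layer covering everything below 312.5 Hz.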
The reconstructed waveform of the high-frequency coefficients obtained from the WT reflecting ground faults at three different locations, outside the rectification side, within the rectification side, and outside the inverter side, is shown in Figures 2-4.
The waveforms show that the fault information at different positions changes noticeably in the frequency domain, and the amplitude in the high-frequency range also varies.
The y in Table 4 denotes the category, and the meaning of y in the following tables is the same as that in Table 4.
According to the data in Table 4, in the low-frequency band e_3, there is little difference in the singular spectral entropy values of the external grounding fault on the rectifier side, the internal grounding fault, and the external grounding fault on the inverter side. This is because the attenuation of the low-frequency signal by the line and the line boundary is approximately zero. Consequently, the distribution of energy in this frequency band is relatively concentrated, leading to relatively low uncertainty in the energy distribution. In the high-frequency range, the singular spectral entropy values e_1 and e_2 are relatively large and differ significantly, increasing in the order of the internal fault, the external fault on the rectifier side, and the external fault on the inverter side. This is because when there is an external grounding fault on the inverter side, the high-frequency signal must pass through the double attenuation of the line and the line boundary to reach the installation site of the rectifier-side protection. This severe high-frequency signal attenuation and the relatively uniform energy distribution in this frequency band increase the uncertainty of the energy distribution.
In addition, Table 4 demonstrates the relative stability of the wavelet singular spectral entropy for fault information at the same location. Therefore, this study uses the singular spectral entropy of faults outside the rectification side, within the rectification side, and outside the inverter side as input variables for the SVM to classify faults. The SVM algorithm process is shown in Figure 5. The nine samples in Table 4 are employed as a training set, and the parameters C and δ of the SVM classifier are obtained using the grid search method. The data in Table 6 indicate that an ideal accuracy can be achieved for small-sample training and learning by selecting appropriate kernel parameters. Then, three groups of feature vectors are selected to form a testing set and input to the SVM for testing. The test set samples are shown in Table 5.

SVM fault recognition algorithm
The testing set in Table 5 is tested, and the following results are obtained: 1) When C = 1024 and δ = 1024, the correct rate is very low, at 13.33%. Only the external grounding fault on the inverter side is correctly classified; the external grounding fault on the rectifier side and the internal grounding fault are not. 2) When C = 512 and δ = 512, the correct rate is 33.33%, and two samples of the external grounding fault on the rectifier side are not correctly classified. 3) When C = 128 and δ = 128, the correct rate is 83.33%, and one sample of the external grounding fault on the rectifier side is not correctly classified. 4) When C = 2 and δ = 2, the correct rate is 100%, and all samples are correctly classified. The classification results are shown in Table 6.
When C = 2 and δ = 2, the SVM classifies the external grounding fault on the rectifier side, the internal grounding fault, and the external grounding fault on the inverter side, yielding the classification diagram shown in Figure 6.
As can be seen from the SVM diagnosis results in Table 6 and the SVM classification diagram in Figure 6, the trained SVM classifier can correctly classify the grounding fault signals at the three different positions.

Conclusion
This study presented a fault diagnosis method for UHVDC transmission lines in a regional integrated energy system, establishing a mathematical model and performing a multi-scale analysis of fault signals using wavelet analysis. By analyzing the attenuation effect of the UHVDC transmission line and its boundary on the high-frequency components of fault transient signals, a method for identifying faults inside and outside the UHVDC line area using multi-resolution singular spectrum entropy and SVM was proposed. For this purpose, MATLAB software was used to facilitate wavelet transformations for signal denoising and various machine learning techniques for predictive modeling. Compared with the traditional artificial neural network, the proposed model offers the advantages of fewer training samples, shorter training time, less overfitting, and higher recognition accuracy.
The fault identification problem of transient protection for UHVDC transmission lines inside and outside the region was transformed into a pattern classification problem. As a result, the sample identification accuracy improved from 13.3% to 100% by establishing a mathematical model and selecting suitable SVM classifier parameters. Thus, the external fault on the rectifier side, the internal fault, and the external fault on the inverter side could be distinguished at the same time, and the whole classification process could be automated. Given the challenges of acquiring a large number of typical samples of UHVDC line boundary and fault transient signals, the combination of multi-resolution singular spectrum entropy and SVM was applied to the problem of identifying faults inside and outside the UHVDC line in transient protection. With only small-sample data, the fault categories could be effectively classified, and the expected results could be achieved. The accuracy reached 100%, indicating the correctness and effectiveness of the method. This study not only contributes to the fault diagnosis of UHVDC transmission lines in regional integrated energy systems but also lays a theoretical foundation for the fault diagnosis of other regional integrated energy systems in the process of energy signal transmission.

FIGURE 1 Network structure of SVM.

FIGURE 4 The external grounding fault and line short circuit on the inverter side in the frequency domain.

Tang et al. (2013), Wang et al. (2015), and Abdullah (2017) analyzed the attenuation characteristics of UHVDC transmission lines and their boundaries for high-frequency quantities of fault transient signals. According to the attenuation characteristics of the fault transient signal caused by the lines and their boundaries, wavelet decomposition is applied to the fault signals at three different positions, namely outside the rectification side, within the rectification side, and outside the inverter side, to obtain different reconstructed high-frequency coefficient amplitudes.

FIGURE 5 Flow chart of SVM algorithm.

the singular spectral entropy of the wavelet reconstruction coefficients of the fault signals at three different positions to obtain the feature vectors; 3) Utilize a portion of the feature vectors as the training set and apply the grid search method to determine the SVM classifier parameters C and δ; 4) Input the remaining feature vectors as the test set into the SVM classifier for testing to obtain the classification results.

TABLE 1
Training set of line short circuit.

TABLE 2
Testing set of line short circuit.

TABLE 3
Identification results.
4.1 Frequency domain waveform analysis

TABLE 6
SVM diagnosis of three different locations.