Sequential Detection of Microgrid Bad Data via a Data-Driven Approach Combining Online Machine Learning With Statistical Analysis

Bad data must be detected and removed from the microgrid data stream because it misleads the decision-making of the Energy Management System (EMS) and puts the microgrid at risk of instability. In this paper, the authors propose a sequential detection method that combines three data mining algorithms, namely the Online Sequential Extreme Learning Machine (OSELM), statistical analysis within a sliding time window, and Density-Based Spatial Clustering of Applications with Noise (DBSCAN). After sequential data training, OSELM is used to construct an online updated error-filtering map to extract the electrical feature of the microgrid data sequence. Meanwhile, the statistical features, i.e. the surge of the variance and the corresponding correlation coefficients under a sliding time window, are proposed for the first time as two complementary feature dimensions. The three-dimensional features are finally analyzed by DBSCAN to discriminate the bad data. The detection performance of this approach is verified with the data sequence collected from a four-terminal ring-shaped DC microgrid prototype. Compared with bad data detection using a single electrical feature or only statistical features, this approach shows the best performance. Moreover, it can be further applied to the online detection of microgrid bad data in the future.


INTRODUCTION
microgrid is in a steady state. It also impacts the emergency decision on system security when the microgrid is under large disturbances. These factors further affect the economy of microgrid operation and can even cause disastrous consequences such as system collapse (Shahnia et al., 2010). Therefore, it is particularly necessary to detect and eliminate these bad data. With the future integration of the information system and the physical infrastructure as well as the high penetration of power electronic devices, the 4V (volume, velocity, variety, and veracity) features of the microgrid data are becoming more and more obvious. As a result, the types and contents of bad data in the microgrid are more complicated than those of the utility grid (Qiu et al., 2017), which calls for a more rapid and effective bad data detection method.
Power system bad data detection has been researched for over 40 years. Most of the research aims at the utility grid. Through the survey of much-related literature, bad data detection is divided into two steps, features extraction, and features analysis. The first task is to obtain the quantitative features containing the differences between the normal data and the bad data from single or multiple dimensions. The next step is to analyze the features and approximate the dividing boundary between normal data and bad data. Traditional bad data detection methods use electrical features to identify bad data, based on the idea that bad data stems from various uncertainties and does not comply with the electrical mechanism of the power system. The electrical features can be obtained from either the power system model or the vast historical data. According to these two feature-acquiring means, the research methods of bad data detection are divided into the traditional method based on the power system analytical model and the modern method based on the data-driven model. The traditional power system analytical model based on the bad data detection method relies on either the estimation or prediction of the operating state of the power system. According to the features differences, it is mainly divided into the residual method and surge method. The residual method uses the state estimator to estimate the real-time power flow of the power system and extracts the residual (the difference between the measured value and the estimation of the true value) as the feature. Next, based on the probability distribution of the residuals, the outliers located outside of certain confidence intervals are detected as bad data (Bretas and Bretas, 2015;Zhao et al., 2017). This method is limited by the huge computational cost because the state estimation process should be repeated many times to avoid the residual pollution and residual flooding effect (Liu et al., 2011). 
The surge method (Huang and Lin, 2004;Do Coutto Filho and de Souza, 2009) treats the power system as a dynamic model and takes the surge (difference between the present measured value and the predicted value at the previous sampling time) as the feature. Next, the bad data is detected based on the statistical hypothesis test of the surge. This method overcomes the formerly mentioned disadvantage of huge computational cost. But it assumes that the topology and the parameters of the utility grid are not changed during the adjacent sampling time, which restricts its application.
Traditional bad data detection methods have a long way to go before being applied in the microgrid. Due to the high penetration of Distributed Energy Resources (DERs) in the microgrid, the operating modes and operating states are more complicated than that of the utility grid (Hu et al., 2011). At the same time, as a hybrid AC-DC multi-converter system, the static operating point of the microgrid often migrates. It is difficult to establish a dynamic analytical model for the microgrid (Xia et al., 2016), while the analytical model is the basis of the traditional bad data detection methods. Reference (Gu et al., 2017) proposed a state estimation method based on a dynamic large-signal model of the microgrid to realize the distributed control of microgrid voltage. However, the influence of the inter-converter coordinated control scheme on the model parameters is not considered. Authors in (Beg et al., 2017) proposed a bad data detection method based on the hybrid numeral and physical simulation model of the microgrid. The main idea is the use of a microgrid dynamic simulation model to verify whether the data conforms with the electrical laws. This method is very instructive, but the traditional power system analytical model is not used.
On the contrary, the modern bad data detection methods based on data-driven models do not need to analyze the power system model (Wu et al., 2013;Huang et al., 2016). They use the machine learning method to extract the electrical features out of the vast historical data, which are used for the prediction of the measurement error. Next, clustering analysis is used to automatically assort normal data and bad data in different clusters (Shyh-Jier and Jeu-Min, 2002;Cramer et al., 2015;Yang et al., 2017). In our previous work (Huang et al., 2018), the machine learning algorithm ELM is used to extract the electrical feature, and the feature is analyzed by the clustering algorithm DBSCAN to realize the fast and effective detection of the bad data in the microgrid. To the best of our knowledge, this method is the first application of bad data detection in microgrids based on the data-driven model. The combining of ELM and DBSCAN can achieve faster and more accurate detection than the previous methods (Shyh-Jier and Jeu-Min, 2002;Cramer et al., 2015;Yang et al., 2017). However, there are still some drawbacks. The research adopts the idea of offline training, the prediction model is only trained once, and its accuracy depends on the completeness of the information contained in the historical data. Inspired by the sequential detection idea in reference , we introduce the OSELM algorithm to improve our previous work. Using the method of online training to update the prediction model sequentially is more conducive to the realization of the online detection of bad data in the future. However, there is still a problem in the sequential learning of OSELM. The accuracy and generalization ability of such supervised machine learning models still heavily depend on prior knowledge. They are not sensitive enough to some unfamiliar operating modes or states. 
Therefore, it is necessary to introduce some other dimensions of features together with a new unsupervised detection method to complement the shortcomings of the single electrical feature extracted by the supervised OSELM algorithm.
Frontiers in Energy Research | www.frontiersin.org May 2022 | Volume 10 | Article 861563
Recently, bad data detection methods based on statistical analysis have been widely used in the field of network security (Bosman et al., 2017;Ren et al., 2017). Their main idea is to use the statistical property of the continuous data stream to determine whether an observation value is beyond the statistical range of normal data (Almalawi et al., 2014;Mohammadpourfard et al., 2017). Bad data appears externally as an outlier that is too large or too small. It therefore has a statistically significant surge feature and a lower correlation with other normal data. Consequently, the surge of the variance and the correlation coefficient of the measurement data sequence within a sliding time window [inspired by (Araya et al., 2017)] can be used as two feature dimensions to distinguish the bad data. As mentioned earlier, the operating conditions of the microgrid are more complex than those of the utility grid. Due to the lack of prior knowledge of the intrinsic electrical relationship between the data, the statistical features of microgrid measurement data can be flooded by the noise of the data itself. Therefore, a statistical method alone is not sufficient for microgrid bad data detection. In contrast, the electrical features of microgrid measurement data use prior knowledge of the microgrid electrical laws. Combining the supervised and unsupervised methods, i.e. using both electrical features and statistical features, can achieve a better detection performance of bad data. Guided by the above idea, this paper presents a sequential detection method of microgrid bad data based on machine learning and statistical analysis. Based on our previous research work, this paper takes the microgrid simulation data as the prior knowledge and builds the error-filtering map in the training process of the OSELM algorithm, which has the sequential learning ability.
The online updated error-filtering map is used to obtain the electrical feature of the microgrid measurement. Meanwhile, the statistical analysis method is used to obtain the surge of the variance and the correlation coefficient of the microgrid measurement data sequence in a sliding time window. Finally, we use the clustering algorithm DBSCAN to analyze the features in the above three dimensions and identify the bad data. The contributions of this paper are as follows.
1) On the basis of our previous bad data detection method ELM + DBSCAN, an online training and sequential detection method for microgrid bad data via the combination of OSELM and DBSCAN is proposed for the first time.
2) A statistical method that uses the surge of the variance and the correlation coefficient of the data sequence in a sliding time window is first proposed and applied in microgrids for bad data detection.
3) The above two types of methods are combined by using electrical features and statistical features at the same time. This hybrid method can not only avoid being flooded by system noise but also recognize the sudden change of the microgrid operating states. The detection performance is better than that of the OSELM + DBSCAN method using the single electrical feature or the statistical method using only statistical features. More importantly, it can realize the sequential detection of bad data (both point anomaly and contextual anomaly), while the existing methods can only achieve the detection of point anomaly.
The rest of this paper is organized as follows. The basic theory and our new idea of microgrid bad data detection are introduced in section 2. In section 3, the sequential detection method combining the OSELM, statistical analysis, and DBSCAN algorithm is proposed. And the detection performance of the method is verified by the data sequence from a real microgrid prototype in section 4. Section 5 concludes the full text. Figure 1 shows the entire path of data from measuring to transmission to processing in the microgrid which adopts the commonly used hierarchical control structure. Data in the microgrid are mainly divided into two categories, the upward system status information, and downward control commands. The status information includes voltage, current, active and reactive power, switch status, port status, protection action instructions, etc. Among them, the electrical measurements, i.e. the voltage, current, and power are the objects for bad data detection in this paper.

BASIC THEORY AND NEW IDEA
The electrical data on each Distributed Energy Resource (DER) port, grid port, and load port of the microgrid are collected by the sensors and finally enter the local controller and the host computer via the communication network. These electrical data guide the host computer in issuing control commands, including the operating mode of each port, the input control command value of the converter, and the switch on/off command, to realize energy dispatching and system control of the microgrid. However, due to the uncertainties of the data acquisition and communication systems, this electrical data is inevitably mixed with noise and even gross error. In order to improve the reliability of the data, state estimation is needed to reduce noise. At the same time, a bad data detection method is required to clear out the gross error.

Bad Data in Microgrid
"An outlier is an observation that deviates so much from other observations as to arouse suspicion that it was generated by a different mechanism" (Douglas M. Hawkins, 1980). The appearance of abnormal data can be seen as a random, sporadic phenomenon relative to the large amount of normal data present. Abnormal data deviates largely from normal data and comes from different mechanisms. Therefore, abnormal data often does not have a strong correlation with normal data. This correlation is reflected in two aspects: one is the relevance of the data attribute, and the other is the correlation of the data structure. Normal data generally comes from the same mechanism, and its structure is relatively compact, often showing a spherical or band-like structure. Abnormal data does not conform to the intrinsic structure of normal data, and its structural correlation is weak.
According to these characteristics of bad data and normal data, there are two premises in detecting bad data.
Premise 1: Normal data instances occur in dense neighborhoods, while anomalies occur far from their closest neighbors.
Premise 2: Normal data is the majority and is relevant because it arises from the expected mechanism. While abnormal data is generated by sporadic mechanisms and is therefore partially or completely uncorrelated.
Based on the above two premises, the data attributes in the microgrid are shown in Figure 2.
We can see from Figure 2 that there are four kinds of attributes of microgrid data, i.e. spatial attribute, graph attribute, sequential attribute, and profile attribute.
1) Spatial attribute refers to different electrical features of the discrete electrical data points formed by the readings of the sensors. As a single data point in the network topology, the data itself contains noise.
2) Graph attribute refers to the neighbor relationship of the sensor in space. At each time interval, the electrical measurements in each column conform to the microgrid measurement equation $z = h(x) + v$. The electrical data is spatially correlated.
3) Sequential attribute refers to the neighbor relationship of the data sequence in time series. When the topology and parameters of the microgrid remain unchanged, the microgrid is a dynamic time-invariant system. The electrical data sequence $z$ conforms to the state transition equation $z_{k+1} = f(x_k) + q_k$, and the data is also time-dependent.
4) Profile attribute is the scene feature of anomaly defined at the system level in the dimensions of time, space, etc. In the space-time dimension, because different operating scenarios often change periodically, there is a similarity between data sequences.
Based on these four kinds of attributes, three kinds of anomaly are classified.
1) Point anomaly refers to bad data points in the spatial dimension.
2) Contextual anomaly refers to bad data points or sequences in the time dimension.
3) Collective anomaly refers to abnormal states or patterns in the space-time dimension.
The microgrid bad data discussed in this paper belong to both point anomaly and contextual anomaly. As a result, both spatial correlation and timing correlation can be used to distinguish good data from bad data.
1) Bad data in spatial correlation.
The electrical measurements in the microgrid can be expressed as a linear combination of the true values and the measurement errors, as shown in Eq. 1. Note that all the variables are in matrix form.
$$z = h(x) + v \tag{1}$$
where $z$ is the measurement, $h(x)$ is the true value, $x$ is the state variable which ensures the observability of the microgrid, $h(\cdot)$ is the equation expression of the microgrid model, and $v$ is the measurement error. Rewrite Eq. 1 in time series form:
$$z(t) = h(x(t)) + v \tag{2}$$
where $t$ is the time stamp, and $z(t)$, $x(t)$, and $v$ are all in vector form. When $z(t)$ only contains normal data, the corresponding $v$ stands for Gaussian white noise:
$$v \sim N(0, \sigma^2) \tag{3}$$
where $\sigma$ is the standard deviation of the measurement error $v$. When $z(t)$ contains bad data $z_b(t)$,
$$z_b(t) = h(x(t)) + v_b \tag{4}$$
where $v_b$ is the gross error, which deviates beyond a certain statistical confidence interval of the normal measurement errors. The confidence interval is usually determined as ±(6-7)σ in industrial applications (Clewer Bernard, 1986).
2) Bad data in timing correlation.
The microgrid electrical data sequence in time series can also be expressed as
$$z_{k+1} = f(x_k) + q_k \tag{5}$$
where $k$ is the time stamp, $f(\cdot)$ is the equation expression of the microgrid model mapping in time series, and $q_k$ is the surge of the electrical measurement. When $q_k$, i.e. the difference between the present measured value and the value predicted at the previous sampling time, is very large, an outlier appears. This outlier may represent bad data. It can also be caused by a sudden change of the microgrid operating state. A further timing correlation method is needed to distinguish the two situations, which will be described later in section 3.
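As a small illustration of the two views above, the following Python sketch simulates a scalar measurement sequence with Gaussian noise, injects one gross error, flags the points whose error leaves a ±kσ interval, and computes a one-step surge. It is only a sketch under stated assumptions: the constant true value, the choice k = 6 (the lower end of the ±(6-7)σ range quoted above), the use of the previous sample as the prediction, and all function names are hypothetical and not taken from the paper.

```python
import random

def simulate(n=200, true_value=1.0, sigma=0.01, bad_index=120, gross=8.0, seed=0):
    """Measurements z(t) = h(x(t)) + v with one gross error v_b = gross * sigma."""
    rng = random.Random(seed)
    z = [true_value + rng.gauss(0.0, sigma) for _ in range(n)]
    z[bad_index] = true_value + gross * sigma  # injected bad data point
    return z

def flag_gross_errors(z, true_value, sigma, k=6.0):
    """Indices whose error |z - h(x)| lies outside the +/- k*sigma interval."""
    return [t for t, zt in enumerate(z) if abs(zt - true_value) > k * sigma]

def surges(z):
    """One-step surge q_k, using the previous sample as a naive prediction (assumption)."""
    return [z[t] - z[t - 1] for t in range(1, len(z))]

z = simulate()
bad = flag_gross_errors(z, true_value=1.0, sigma=0.01)
q = surges(z)
print("flagged indices:", bad)
```

In practice the true value h(x) is unknown, which is why the paper estimates it (by state estimation or by a learned error-filtering map) before forming the error feature.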
The key to bad data detection lies in the processing of data features. The process is divided into two main steps.
1) Features extraction: the procedure of estimating or predicting the features of bad data. The features include both electrical features and statistical features. The electrical features represent the distance between the measurement $z$ and the true value $h(x)$. The statistical features represent the distance between the measurement $z_{k+1}$ at sampling time $k+1$ and the true value $f(x_k)$ at sampling time $k$.
2) Features analysis: using mathematical statistics, data clustering, or other unsupervised methods to approach the interface between normal data and bad data in certain feature dimensions.
The traditional bad data detection method is based on State Estimation.
The diagram of bad data detection based on State Estimation is shown in Figure 3.
When the model of the power system can be analytically resolved as the expression $h(\cdot)$, the estimated values $\hat{x}$ of $x$ can be obtained through a State Estimator according to the measurement vector $z$ and the prior knowledge $\sigma$ from the data acquisition system. The electrical features, i.e., the residuals $r = z - h(\hat{x})$, are obtained. Then, the probability distribution $f(rr^T)$ of the residuals $r$ is used in a hypothesis test to detect bad data. The existence of bad data $z_b$ can affect the results of the State Estimation, i.e., the residuals $r$, causing $f(rr^T)$ to change. Therefore, in the vicinity of the boundary where the threshold $\gamma$ is located, a false or missed detection may occur.
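To make the residual idea concrete, the sketch below uses a linear measurement model z = Hx + v and the largest normalized residual test, a common textbook variant of residual-based detection. It is not the paper's estimator: the matrix `H`, the equal-variance assumption, the threshold of 3, and the 20σ gross error are all illustrative assumptions.

```python
import numpy as np

def normalized_residuals(H, z, sigma):
    """Least-squares state estimate and normalized residuals for z = H x + v."""
    G_inv = np.linalg.inv(H.T @ H)            # inverse gain matrix (equal variances)
    x_hat = G_inv @ H.T @ z                   # estimated state
    r = z - H @ x_hat                         # residuals r = z - h(x_hat)
    S = np.eye(len(z)) - H @ G_inv @ H.T      # residual sensitivity matrix
    r_norm = np.abs(r) / (sigma * np.sqrt(np.diag(S)))
    return x_hat, r_norm

# Hypothetical system: 2 states observed through 6 redundant measurements.
H = np.array([[1, 0], [0, 1], [1, 1], [1, -1], [2, 1], [1, 2]], dtype=float)
x_true = np.array([1.0, 0.5])
sigma = 0.01
rng = np.random.default_rng(0)
z = H @ x_true + rng.normal(0.0, sigma, size=6)
z[2] += 20 * sigma                            # gross error on measurement index 2

x_hat, r_norm = normalized_residuals(H, z, sigma)
suspect = int(np.argmax(r_norm))              # measurement with the largest normalized residual
is_bad = r_norm[suspect] > 3.0                # assumed detection threshold
print(suspect, is_bad)
```

Note that the bad measurement also shifts the residuals of the healthy measurements, which is the residual pollution effect that forces repeated estimation in practice.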

Bad Data Detection Based on Online Sequential Machine Learning
The electrical features of microgrid measurements have the most abundant prior knowledge for bad data detection. The analytical model $h(\cdot)$ of the microgrid can be approximately fitted as $\hat{h}(\cdot)$ by supervised online sequential machine learning, as long as plenty of microgrid historical measurements and simulation data (e.g. data from the SIMULINK model of the microgrid in (Beg et al., 2017)) are given. The schematic of the online sequential machine learning based bad data detection is shown in Figure 4.
Learning from historical data, the online sequential machine learning based bad data detection method first constructs an online updating error-filtering map between the historical electrical data sequence and the historical power flow. The updated map then predicts the true value $\hat{h}(x)$ from the present measurement $z$. Subsequently, the electrical feature $z - \hat{h}(x)$ can be obtained.
Since it is hard to discover the statistical properties of the prediction of the machine learning method, the analysis of the feature $z - \hat{h}(x)$ cannot be carried out by statistical hypothesis testing. Thus, an unsupervised machine learning method, clustering, is used instead. Based on the similarity of the data itself, clustering analysis can automatically sort normal data and bad data into different clusters without any prior knowledge. Due to the uncertainty of the boundary between normal data and bad data, a false or missed detection can occur in the vicinity of the interface.

Bad Data Detection Based on Statistical Analysis in Time Series
The microgrid is a strongly nonlinear time-varying system. The measurements of the microgrid are not independent of each other but are constrained by the electrical mechanism. Therefore, the statistical features of the data in time series indirectly reflect the electrical features and can be used for bad data detection. The schematic of the bad data detection based on the statistical analysis in time series is illustrated in Figure 5.
When an outlier (data point p+1) occurs in the microgrid electrical data series, the variance and correlation coefficient of the data sequence in a sliding window with enough width N show different degrees of surge. Note that the surge of the variance is ΔD(p + 1) and the surge of the correlation coefficient is Δρ(p + 1). The ΔD(p + 1) reflects the continuity of a single-dimensional data sequence, which can be used to recognize outliers. However, outliers can be caused not only by bad data but also by a sudden change of the microgrid operating state. When there is a sudden change of the microgrid operating state, the ΔD(p + 1) is also significant. In such a situation, the Δρ(p + 1), reflecting the correlation between the multi-dimensional data sequences, can be used to further distinguish the bad data based on the correlation between different electrical measurements.
Since the operating state of the microgrid changes very often, the threshold of ΔD(p + 1) and Δρ(p + 1) cannot be determined by a fixed statistical hypothesis testing. So, clustering is used to analyze the two statistical features and sort normal data, outliers, and bad data in different clusters. A false or missed detection can also happen near the boundary of different clusters.

OSELM Algorithm
A combination of two machine learning algorithms, the supervised ELM and the unsupervised DBSCAN, was used for bad data detection in our previous work. Compared to other machine learning algorithms, ELM is well known for its fast training speed and great potential for algorithm evolution. Detailed information on ELM and DBSCAN can be found in (Huang et al., 2018). The OSELM (Liu et al., 2015) used in this paper is briefly introduced on the basis of ELM. The network structure and parameters of ELM are shown in Figure A1 in the appendix. Through sequential learning, OSELM can update the machine learning model online, which makes the model more adaptive in the application of time-series data.
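The sequential update can be sketched with the standard recursive least-squares form of OS-ELM: random sigmoid hidden nodes, an initial batch solve, then a chunk-by-chunk update of the output weights. This is a minimal NumPy sketch and not the authors' code. The class name, the uniform weight initialization, and the small `ridge` term added for numerical stability are assumptions.

```python
import numpy as np

class OSELM:
    """Minimal OS-ELM sketch: fixed random hidden layer, recursive output weights."""

    def __init__(self, n_inputs, n_hidden, n_outputs, seed=0):
        rng = np.random.default_rng(seed)
        self.W = rng.uniform(-1.0, 1.0, (n_inputs, n_hidden))  # fixed input weights
        self.b = rng.uniform(-1.0, 1.0, n_hidden)              # fixed hidden biases
        self.beta = np.zeros((n_hidden, n_outputs))            # output weights
        self.P = None                                          # inverse correlation matrix

    def _hidden(self, X):
        # Sigmoid activation of the hidden layer.
        return 1.0 / (1.0 + np.exp(-(X @ self.W + self.b)))

    def fit_initial(self, X0, T0, ridge=1e-6):
        # Initial batch: beta_0 = P_0 H_0^T T_0 with P_0 = (H_0^T H_0 + ridge*I)^-1.
        H = self._hidden(X0)
        self.P = np.linalg.inv(H.T @ H + ridge * np.eye(H.shape[1]))
        self.beta = self.P @ H.T @ T0

    def partial_fit(self, X, T):
        # Recursive update for a new chunk (X, T); no earlier data is revisited.
        H = self._hidden(X)
        K = np.linalg.inv(np.eye(H.shape[0]) + H @ self.P @ H.T)
        self.P = self.P - self.P @ H.T @ K @ H @ self.P
        self.beta = self.beta + self.P @ H.T @ (T - H @ self.beta)

    def predict(self, X):
        return self._hidden(X) @ self.beta

# Toy usage: learn a smooth map from sequential chunks of 50 samples.
rng = np.random.default_rng(1)
X = rng.uniform(-1.0, 1.0, (200, 3))
T = np.sin(X.sum(axis=1, keepdims=True))
model = OSELM(n_inputs=3, n_hidden=10, n_outputs=1)
model.fit_initial(X[:50], T[:50], ridge=1e-3)
for start in range(50, 200, 50):
    model.partial_fit(X[start:start + 50], T[start:start + 50])
feature = np.abs(T - model.predict(X))   # analogue of the electrical feature |z - h_hat(x)|
```

With this recursion the output weights after the last chunk equal the regularized batch solution over all chunks seen so far, which is what makes the online update attractive for streaming data.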

DATA-DRIVEN APPROACH
According to section 2, OSELM is a brand-new online sequential machine learning algorithm that can quickly approximate and update the error-filtering map between the measurements and the true values by recursive linear regression. DBSCAN is very suitable for distinguishing outliers with non-Gaussian distributions, e.g. bad data. Therefore, the combination of OSELM and DBSCAN can quickly realize the sequential detection of the micro-grid bad data. But such a supervised machine learning method, relying on data training, is not sensitive enough to some unfamiliar operating modes or states. The unsupervised statistical analysis in the time series method, which uses the statistical features (the surge of the variance and the correlation coefficient) in a sliding time window, is proposed to recognize the sudden change of the microgrid operating states.
On this basis, a sequential bad data detection method is proposed by using both the electrical features and the statistical features. The proposed sequential bad data detection method is described as follows. The application details of the proposed statistical analysis in the time series method are explained later.

Sequential Bad Data Detection
Guided by the previously mentioned two detection ideas, a sequential detection method using a data-driven approach is proposed. It combines the OSELM, the statistical analysis in time series, and the DBSCAN. The flow chart of this method is illustrated in Figure 6. The process of the method is mainly divided into the following steps.

1) Data acquisition and preprocessing.
Collect, screen, and normalize the measurement data of the microgrid prototype and the simulation data of the corresponding microgrid simulation model to form an electrical data series. Next, this processed data series is split into the sequential training data chunks $X_i$ and $T_i$ ($i = 1, 2, \ldots$) and the testing data series $z$. The input matrices $X_i$ come from the old measurement data, and the target matrices $T_i$ come from the simulation data corresponding to $X_i$. Meanwhile, the testing data series $z$ is acquired from the current measurement data.

2) Features extraction.
Based on the recursive training method of the OSELM algorithm, the OSELM model is trained by $X_i$ and $T_i$ ($i = 1, 2, \ldots$) to build an online updating error-filtering map. Using the updated error-filtering map to predict the testing data series $z$, the output matrix $\hat{h}(x)$ is obtained, and then $|z - \hat{h}(x)|$ is extracted as the electrical feature, i.e. the error (including the gross error) in $z$.
At the same time, the statistical analysis method is developed to calculate the variance surge matrix $\Delta Q(p+1)$ of the testing data series $z$ in a sliding time window, where $p+1$ is the time stamp. The surge of the variance $\Delta q(p+1)$ and the surge of the pseudo-correlation coefficient $\Delta\rho(p+1)$ are extracted as the statistical features sequentially.
The aforementioned three features are clustered by DBSCAN to obtain normal clusters and outliers. Outliers with large $|z - \hat{h}(x)|$, large $\Delta q(p+1)$, and small $\Delta\rho(p+1)$ are identified as bad data.
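The clustering step can be sketched in plain Python. The function below is a compact textbook DBSCAN that labels each three-dimensional feature point with a cluster id or -1 for noise. The feature values in the example are invented for illustration, and only Eps = 0.005 and MinPts = 4 follow the parameter settings reported later in the paper. Everything else, including the function name, is an assumption.

```python
import math

def dbscan(points, eps, min_pts):
    """Return one label per point: cluster id 0, 1, ... or -1 for noise."""
    n = len(points)
    labels = [None] * n

    def neighbors(i):
        # Brute-force eps-neighborhood (includes the point itself).
        return [j for j in range(n) if math.dist(points[i], points[j]) <= eps]

    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = -1                  # provisional noise
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster         # noise becomes a border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nb = neighbors(j)
            if len(nb) >= min_pts:          # core point: keep expanding
                queue.extend(nb)
    return labels

# Hypothetical features (|z - h_hat(x)|, variance surge, |pseudo-correlation surge|).
normal = [(0.001 * (k % 5), 0.001 * (k // 5), 0.0) for k in range(20)]
outlier = [(0.08, 0.05, 0.0)]
labels = dbscan(normal + outlier, eps=0.005, min_pts=4)
print(labels)
```

The brute-force neighbor search is quadratic in the number of points, so a spatial index would be needed for long data sequences.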

1) Statistical property of data sequence.
Take the data sequence $z_{M\times N}(p)$ out of the microgrid measurement matrix $z$ within the fixed time window width $N$ at the $p$th sampling time. According to Eq. 2, we can see where $t$ is the time stamp, $M$ is the dimension of the electrical measurements in $z$.
Calculate the covariance matrix $Q(p)$ of the data sequence $z_{M\times N}(p)$.
where $E$ is the expectation function. The entries of matrix $Q(p)$ are where $z_i(t)$ is the $i$th electrical measurement at sampling time $t$, $\bar{z}_i(p)$ is the average of the $i$th electrical measurement sequence over the time window of width $N$, the diagonal entry $q_{ii}(p)$ is the variance of the $i$th electrical measurement in the data sequence at the sampling time $p$, $q_{ij}(p)$ is the covariance of the $i$th electrical measurement and the $j$th electrical measurement, and $q_{ij}(p) = q_{ji}(p)$.
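For orientation, one standard way to write the windowed mean and covariance that matches the symbol definitions above is given below. It is a sketch only: the $1/N$ normalization and the window limits $t = p-N+1, \ldots, p$ are assumptions and may differ from the paper's own equations.

```latex
\bar{z}_i(p) = \frac{1}{N}\sum_{t=p-N+1}^{p} z_i(t), \qquad
q_{ij}(p) = \frac{1}{N}\sum_{t=p-N+1}^{p}
  \bigl(z_i(t)-\bar{z}_i(p)\bigr)\bigl(z_j(t)-\bar{z}_j(p)\bigr),
\qquad i, j = 1, \ldots, M
```

With this form, $q_{ii}(p)$ is the window variance of the $i$th measurement and the symmetry $q_{ij}(p) = q_{ji}(p)$ follows directly.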
2) Statistical features in sliding time window.
Slide the time window with fixed width $N$ forward by one data point. During this process, the surge of the variance matrix $Q$ is The entries of $\Delta Q(p+1)$, i.e. the surge of the variance $\Delta q_{ii}(p+1)$ and of the covariance $\Delta q_{ij}(p+1)$ at the $(p+1)$th sampling time, are derived as Eq. 10 under the approximation that $\bar{z}_i(p+1) \approx \bar{z}_i(p)$ (marked as $\bar{z}_i$) when $N$ is large enough. At the same time, we assume that the surges before the sampling time $(p+1)$ have already been detected and eliminated. That is to say, $z_i(p-N+1) \approx \bar{z}_i$. Under such conditions, the following equation can be derived.
According to Eq. 10, if the newly arrived electrical measurement $z_i(p+1)$ is normal data, it is in the vicinity of $\bar{z}_i$, and then $\Delta q_{ii}(p+1)$ will be quite small. On the contrary, if $z_i(p+1)$ is an outlier, $\Delta q_{ii}(p+1)$ is large enough to be defined as a surge. So, $\Delta q_{ii}(p+1)$ can be used for outlier detection. The outlier may be caused by bad data, but it can also be caused by a sudden change of the microgrid operating state. Therefore, the single statistical feature $\Delta q_{ii}(p+1)$ is not enough for bad data detection.
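The reasoning behind this approximation can be sketched as follows, assuming a $1/N$-normalized window covariance (an assumption, as is the exact form of the result). Sliding the window by one point adds the sample at $p+1$ and drops the sample at $p-N+1$, so with a nearly unchanged window mean only those two terms survive:

```latex
\Delta q_{ij}(p+1) = q_{ij}(p+1) - q_{ij}(p)
\approx \frac{1}{N}\Bigl[\bigl(z_i(p+1)-\bar{z}_i\bigr)\bigl(z_j(p+1)-\bar{z}_j\bigr)
      - \bigl(z_i(p-N+1)-\bar{z}_i\bigr)\bigl(z_j(p-N+1)-\bar{z}_j\bigr)\Bigr]
\approx \frac{1}{N}\bigl(z_i(p+1)-\bar{z}_i\bigr)\bigl(z_j(p+1)-\bar{z}_j\bigr)
```

The last step uses $z_i(p-N+1) \approx \bar{z}_i$. Setting $j = i$ gives $\Delta q_{ii}(p+1) \approx \frac{1}{N}\bigl(z_i(p+1)-\bar{z}_i\bigr)^2$, which is small for a normal sample near $\bar{z}_i$ and large for an outlier, in line with the discussion of Eq. 10.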
When the outlier $z_i(p+1)$ is bad data, the other statistical feature $\Delta q_{ij}(p+1)$ is also noticeable enough to be defined as a surge, but it is much smaller than $\Delta q_{ii}(p+1)$ because $z_j(p+1)$ is very close to $\bar{z}_j$. However, when the outlier $z_i(p+1)$ is caused by a sudden change of the microgrid operating state, the change of $z_i(p+1)$ will result in changes of other electrical measurements $z_j(p+1)$ because they have strong electrical relationships with each other. Thus, some of the $z_j(p+1)$ are much larger than $\bar{z}_j$, which makes these $\Delta q_{ij}(p+1)$ very close to $\Delta q_{ii}(p+1)$.
FIGURE 9 | Partial components display of the DC microgrid prototype.
Concerning the concept of the correlation coefficient $\rho_{ij}(p+1)$, we define $\Delta\rho_{ij}(p+1)$, the surge of the pseudo-correlation coefficient, as Eq. 11.
For the application of $|\Delta\rho_{ij}(p+1)|$, we average it in the dimension $j$ as follows.
So, $|\Delta\rho_{ij}(p+1)|$ can be used as another statistical feature to further distinguish bad data from outliers caused by the sudden change of microgrid operating states.
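The two statistical features can be illustrated with a short plain-Python sketch that slides a window by one sample and compares the window covariances before and after. The exact definition of the pseudo-correlation surge is not recoverable here, so the ratio |Δq_ij / Δq_ii| averaged over j is used purely as a stand-in assumption that reproduces the behavior described in the text (small for isolated bad data, close to one for a coordinated state change). The 1/N normalization and the function names are also assumptions.

```python
def window_cov(window, i, j):
    """Covariance of channels i and j over a window (1/N normalization assumed)."""
    n = len(window)
    mean_i = sum(row[i] for row in window) / n
    mean_j = sum(row[j] for row in window) / n
    return sum((row[i] - mean_i) * (row[j] - mean_j) for row in window) / n

def surge_features(z, p, N, i):
    """Variance surge of channel i and averaged pseudo-correlation surge at p + 1.

    z is a list of samples, each a list of M measurements; the old window
    covers samples p-N+1..p and the new window covers p-N+2..p+1.
    """
    old = z[p - N + 1 : p + 1]
    new = z[p - N + 2 : p + 2]
    dq_ii = window_cov(new, i, i) - window_cov(old, i, i)
    ratios = []
    for j in range(len(z[0])):
        if j == i:
            continue
        dq_ij = window_cov(new, i, j) - window_cov(old, i, j)
        # Stand-in pseudo-correlation surge (assumption): |dq_ij / dq_ii|.
        ratios.append(abs(dq_ij / dq_ii) if dq_ii != 0 else 0.0)
    return dq_ii, sum(ratios) / len(ratios)

steady = [[1.0, 1.0, 1.0] for _ in range(20)]
bad_case = steady + [[2.0, 1.0, 1.0]]      # only channel 0 jumps: bad data
change_case = steady + [[2.0, 2.0, 2.0]]   # all channels jump: state change

dq_bad, rho_bad = surge_features(bad_case, p=19, N=20, i=0)
dq_chg, rho_chg = surge_features(change_case, p=19, N=20, i=0)
print(dq_bad, rho_bad, dq_chg, rho_chg)
```

Both cases show the same variance surge, so only the second feature separates them, which is the motivation for using the two features together.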

Acquisition and Preprocessing of Data
The data sequence of the microgrid is obtained from a four-terminal ring-shaped DC microgrid prototype and its simulation model. The topology, control structure, and partial components of the prototype are illustrated in Figures 7-9, respectively.
According to its control strategy, the microgrid has four operation modes, which are shown in Table 1.
There are 24 kinds of electrical measurements collected from the microgrid prototype, namely: the terminal voltage and terminal current of the four terminals $[U_{p1}, U_{p2}, U_{p3}, U_{p4}, I_{p1}, I_{p2}, I_{p3}, I_{p4}]$, the voltages of the four DC buses $[U_{dc1}, U_{dc2}, U_{dc3}, U_{dc4}]$, the currents flowing through the four positive DC buses $[I_{dc1}, I_{dc2}, I_{dc3}, I_{dc4}]$, the power output of the four terminals $[P_1, P_2, P_3, P_4]$, the active power and reactive power of the grid side $[P_{grid}, Q_{grid}]$, and the active power and reactive power of the load side $[P_{load}, Q_{load}]$. The microgrid prototype can be switched between the four operating modes in Table 1 by issuing control commands from the host computer. The data sequence is obtained from the microgrid prototype and its SIMULINK simulation program under the above four operating modes over one month of operation. Six sets of testing data were randomly selected. The sampling frequency was 10 Hz, and the sampling time was 13 min 20 s. The transient processes between different operating modes are removed, for the following reasons. First, the physical mechanism of the transient process is clear, rather than being caused by uncertainty or unfamiliar mechanisms. Second, the start of the transient process can be identified from the microgrid operation mode switching control signal, and its end can be identified from the end of the wide fluctuation of the data. Therefore, it is not the target of the point anomaly detection and contextual anomaly detection in this paper. Each row of the testing data matrix is sorted in the electrical quantities order $[P_1, P_2, P_3, P_4, P_{grid}, P_{load}, Q_{grid}, \ldots]$. All testing data inputs and outputs are scaled, taking the reference values P = 6 kW, Q = 0.5 kVar, U = 550 V, I = 10 A.
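The scaling and the size of each testing set follow from simple arithmetic, sketched below. The reference values and the sampling settings come from the text above. The dictionary layout, the convention of choosing the base from the first letter of the measurement name, and the function name are assumptions made for illustration.

```python
# Reference (base) values from the text, in W, var, V, and A.
BASE = {"P": 6.0e3, "Q": 0.5e3, "U": 550.0, "I": 10.0}

def per_unit(name, value):
    """Scale a measurement by the base of its quantity type (first letter of the name)."""
    return value / BASE[name[0]]

# 10 Hz sampling over 13 min 20 s gives the number of samples per testing set.
sampling_frequency_hz = 10
duration_s = 13 * 60 + 20
samples_per_set = sampling_frequency_hz * duration_s

print(per_unit("U_dc1", 550.0), per_unit("P_1", 3000.0), samples_per_set)
```

Each testing set therefore contains 8000 samples per measurement channel.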

Simulation Cases Design
Parameters Design.
1) The number k of hidden layer nodes in OSELM is set to 80, and the activation function g(·) is the sigmoid function.

2) The neighborhood radius Eps and the density threshold MinPts of the neighborhood in DBSCAN are set to 0.005 and 4, respectively.
Simulation Environment.
1) The simulation software is MATLAB R2018b.
2) The computer used for simulation has a Core i5 processor with a 2.4 GHz clock frequency and a DDRⅢ memory bank with 8 Gbps memory
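To make the DBSCAN settings listed under Parameters Design concrete, the following is a minimal pure-Python sketch of the clustering step, not the MATLAB implementation used in the simulations. It assumes Euclidean distance on three-dimensional feature points and treats points labeled as noise as bad data candidates; the function names and the sample feature points are hypothetical.

```python
import math

EPS = 0.005    # neighborhood radius Eps from the parameter list
MIN_PTS = 4    # density threshold MinPts from the parameter list
NOISE = -1     # label used here for points that belong to no cluster


def region_query(points, i, eps):
    """Indices of all points within eps of points[i], including i itself."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]


def dbscan(points, eps=EPS, min_pts=MIN_PTS):
    """Return one label per point: a cluster id (0, 1, ...) or NOISE."""
    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE      # may later be relabeled as a border point
            continue
        cluster += 1               # points[i] is a core point: start a cluster
        labels[i] = cluster
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster    # border point, not expanded further
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)   # expand only from core points
    return labels


# Hypothetical feature points: five close together and one isolated point.
features = [(0.001 * k, 0.0, 0.0) for k in range(5)] + [(0.5, 0.5, 0.5)]
labels = dbscan(features)
bad_data_idx = [i for i, lab in enumerate(labels) if lab == NOISE]
```

In this toy input the five closely spaced points form one cluster, and the isolated point is labeled as noise and therefore flagged.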

Simulation Cases
Following the normal distribution characteristics of the measurement error, bad data with a gross error of 7-10 times the standard deviation of the measurement error were randomly preset in the six sets of testing data, with a bad data content of 5%. The measurement accuracies of the voltage Hall sensors (Type: VSM500D) and the current Hall sensors (Type: LA150-P) used in the microgrid prototype are 0.008 and 0.01, respectively. The formula for calculating the standard deviation of the errors can be found in (Huang et al., 2018). The preset bad data include amplitude jumps (point anomalies) as well as amplitude deviations and amplitude shifts (contextual anomalies) (Xu et al., 2021).
The simulation cases verify the effectiveness of the proposed method by comparing the detection performance of three methods: the OSELM + DBSCAN method, the ST (statistical analysis) + DBSCAN method, and the OSELM + ST + DBSCAN method. For point anomalies, the detection performance indicators are the right detection rate and the calculation time. The right detection rate $R_r$ is calculated from the number of correct detections $N_r$, false detections $N_f$, and missed detections $N_m$ as $R_r = N_r / (N_r + N_f + N_m)$. $N_r$, $N_f$, and $N_m$ are determined by comparing the detected bad data with the preset locations of the bad data. For contextual anomalies, the detection performance is quantified by the confusion matrix in Table 2 (Hu et al., 2020; Li et al., 2021a; Li et al., 2021b; Hu et al., 2021; Jung, 2022).
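For the point anomaly case, the presetting of bad data and the right detection rate can be sketched in Python as follows. This is an illustration under stated assumptions, not the procedure used to produce the reported results: the random sign of the gross error, the value of the standard deviation `sigma`, the flat test series, and all function names are hypothetical.

```python
import random


def inject_point_anomalies(series, sigma, content=0.05, seed=0):
    """Return a corrupted copy of `series` and the preset bad-data indices.

    Each selected sample receives a gross error of 7-10 times sigma.
    The random sign of the error is an assumption of this sketch.
    """
    rng = random.Random(seed)
    n_bad = round(content * len(series))
    bad_idx = sorted(rng.sample(range(len(series)), n_bad))
    corrupted = list(series)
    for i in bad_idx:
        corrupted[i] += rng.choice((-1, 1)) * rng.uniform(7, 10) * sigma
    return corrupted, bad_idx


def right_detection_rate(preset, detected):
    """R_r = N_r / (N_r + N_f + N_m), from preset and detected index sets."""
    preset, detected = set(preset), set(detected)
    n_r = len(preset & detected)    # correct detections
    n_f = len(detected - preset)    # false detections
    n_m = len(preset - detected)    # missed detections
    total = n_r + n_f + n_m
    # Convention assumed here: nothing preset and nothing detected counts as 1.0.
    return n_r / total if total else 1.0


# Hypothetical flat series of 200 samples with a hypothetical sigma of 0.01.
series = [1.0] * 200
corrupted, bad_idx = inject_point_anomalies(series, sigma=0.01)

# Three correct detections, one false detection, one missed detection.
example_rate = right_detection_rate([1, 2, 3, 4], [2, 3, 4, 9])
```

With three correct detections, one false detection, and one missed detection, the rate in the last line evaluates to 3 / 5.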
In Table 2, TP, FN, FP, and TN denote true positive, false negative, false positive, and true negative events, respectively, and NN denotes all events. Based on these events, Recall (R), Precision (P), Accuracy (Acc), and Error (Err) are chosen to evaluate the detection performance. They are defined in terms of card(·), the counting function. Large values of R, P, and Acc together with a small value of Err indicate good detection performance.
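Assuming the standard confusion-matrix definitions (an assumption, since only the symbols TP, FN, FP, TN, NN, and card(·) are given here), the four indicators can be written as:

```latex
\begin{align*}
R   &= \frac{\mathrm{card}(TP)}{\mathrm{card}(TP) + \mathrm{card}(FN)}, &
P   &= \frac{\mathrm{card}(TP)}{\mathrm{card}(TP) + \mathrm{card}(FP)}, \\
Acc &= \frac{\mathrm{card}(TP) + \mathrm{card}(TN)}{\mathrm{card}(NN)}, &
Err &= \frac{\mathrm{card}(FP) + \mathrm{card}(FN)}{\mathrm{card}(NN)}.
\end{align*}
```

Under these definitions, card(NN) = card(TP) + card(FN) + card(FP) + card(TN), so Acc + Err = 1.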
The OSELM + DBSCAN, ST + DBSCAN, and OSELM + ST + DBSCAN methods are denoted as Methods A, B, and C, respectively. In each case, bad data detection is carried out with Methods A, B, and C. Each simulation case is repeated 10 times, and the average detection performance is calculated.
Case 1 considers amplitude jumps, i.e. discrete data points that deviate far from the normal data. The detection performances are shown in Table 3.
As can be seen from Table 3, all three methods achieve good detection results for point anomalies ($R_r$ is between 66% and 89%). Except for Dataset 5, the detection accuracy of Method C is better than that of Methods A and B, at the cost of a longer calculation time. In relative terms, Method B has the worst detection performance.
The electrical measurement $U_{dc2}$ is randomly selected from Dataset 4 in Case 1 to visualize the detection results of the three methods, as shown in Figure 10.
Figure 10 shows that all three methods detect the point anomalies quite well, with only a few false detections and missed detections. Method C shows the best performance.
Case 2 considers amplitude deviations, i.e. data sequences that deviate far from the normal data series in a stepwise way. The detection performances are shown in Table 4.
In Table 4, the indicators corresponding to the best detection performance in each dataset are shown in bold. Method C shows the best performance in detecting amplitude deviations, except for R in Dataset 1 and Dataset 4.
The electrical measurement $P_4$ is randomly selected from Dataset 1 in Case 2 to visualize the detection results of the three methods, as shown in Figure 11.
Figure 11 shows that all three methods detect the amplitude deviation quite well, with only a few false detections and missed detections. Method C shows the best performance.
Case 3 considers amplitude shifts, i.e. data sequences that shift slowly and deviate continuously from the normal data series. The detection performances are shown in Table 5.
Table 5 shows that Method C has the best performance in detecting amplitude shifts, except for R and P in Dataset 4.
The electrical measurement $I_{line1}$ is randomly selected from Dataset 5 in Case 3 to visualize the detection results of the three methods, as shown in Figure 12.
Figure 12 shows that all three methods detect the amplitude shift quite well, with only a few false detections and missed detections. Method C shows the best performance.

CONCLUSION
In this paper, the statistical surge feature (ST) is used for the first time for bad data detection, covering both point anomaly detection and contextual anomaly detection. On this basis, a sequential detection method that combines OSELM, ST, and DBSCAN is proposed for microgrid bad data detection. The performance of this method is verified on a four-terminal ring-shaped DC microgrid prototype. A comparison with the existing OSELM + DBSCAN method and the ST + DBSCAN method demonstrates that the proposed OSELM + ST + DBSCAN method has the best detection performance. More specifically:
1) The OSELM + ST + DBSCAN method can detect both point anomalies and contextual anomalies, such as amplitude jumps, amplitude deviations, and amplitude shifts.
2) The OSELM + ST + DBSCAN method achieves the best bad data detection accuracy at the cost of a small increase in computation.

DATA AVAILABILITY STATEMENT
The original contributions presented in the study are included in the article/Supplementary Material; further inquiries can be directed to the corresponding author.