An Advanced Accurate Intrusion Detection System for Smart Grid Cybersecurity Based on Evolving Machine Learning

Yu, Tong; Da, Kai; Wang, Zhiwen; Ling, Ying; Li, Xin; Bin, Dongmei; Yang, Chunyan

doi:10.3389/fenrg.2022.903370

ORIGINAL RESEARCH article

Front. Energy Res., 30 May 2022

Sec. Smart Grids

Volume 10 - 2022 | https://doi.org/10.3389/fenrg.2022.903370

This article is part of the Research Topic: Transition Toward Sustainable Buildings

Tong Yu¹*

Kai Da¹

Zhiwen Wang²

Ying Ling¹

Xin Li¹

Dongmei Bin¹

Chunyan Yang¹

¹Guangxi Power Grid Co.,Ltd., Electric Power Research Institute, NanNing, China
²Guangxi Power Grid Co.,Ltd., Hechi Power Supply Bureau, Hechi, China

Smart grids, the next generation of electricity systems, are intended to be intelligent and aware of physical and cyber activity in the control area. As a cyber-embedded infrastructure, a smart grid must be capable of detecting cyberattacks and responding to them in a timely and effective manner. This article introduces an intrusion detection model capable of classifying cyberattacks and electrical network incidents in smart grids as binary-class, trinary-class, and multiple-class problems. It uses the gray wolf algorithm (GWA) for evolving training of artificial neural networks (ANNs) as a machine learning model for intrusion detection. The weight vectors of the intrusion detection model are initialized and adjusted using the GWA in order to reach the smallest possible mean square error. The suggested evolving machine learning model addresses cyberattack detection, failure forecasting, and failure diagnosis in the smart grid energy sector. The proposed model is demonstrated on a real dataset from Mississippi State University in the United States, and the experimental results are explained. The proposed model is compared to some of the most widely used classifiers in the area. The results show that the suggested intrusion detection model outperforms other well-known models in this field.

1 Introduction

A smart grid (SG) is a complex system that integrates processing, modern communication, and sensing technologies into the existing electrical grid. The intelligent control applications used in an SG require high-quality, error-free data as well as fast and dependable performance (Mohamed et al., 2021a). Because SGs are still under development and are both vital infrastructure and cyber–physical systems, they are at risk of misoperation when intruders inject malicious or false data (Alnowibet et al., 2021). In recent years, with the large-scale implementation of SGs, cybersecurity has become an important concern for power system operators. High-speed networks and critical cyber–physical devices are increasingly used in power grids, making them vulnerable to attacks (Ma et al., 2021). Large-scale energy systems generate high-volume, high-speed data that are difficult for conventional attack detection systems to process. Cybersecurity systems therefore need to be efficient and resilient in order to deal with such new threats and detect malicious data on the network effectively (Chen et al., 2021). Cyberattacks (CAs) that target the electric grid are typically data intrusion attacks, of which there are three main forms: denial-of-service (DoS), load redistribution (LR), and false data injection (FDI) attacks. Such attacks allow adversaries to manipulate the data with which the power grid manages and controls its operations, disrupting the safe operation of the power system, gaining financial benefit, and even destroying it physically (Meng et al., 2021). Modern intrusion detection systems (IDSs) rely heavily on the ability to detect and separate abnormal data from normal data. IDSs maintain data availability while protecting the integrity and confidentiality of networks against unauthorized access (Nazir and Khan, 2021). Technically, an IDS is built on an intrusion detection model.
Users and utilities may suffer significant losses as a result of unreliable or inadequate intrusion detection (Xue, 2021). Intrusion detection models address engineering problems that are nonlinear, ill-defined, and noisy. Resolving such problems requires an intrusion detection model that is robust, reliable, and cost-efficient. The remainder of this section reviews recent work on improving intrusion detection models for electrical grids.

Several procedures and countermeasures for malicious data attacks on control centers are defined in Zhang et al. (2021), who proposed the minimum residue energy heuristic to find small yet extremely damaging attacks. Non-identifiable but detectable attacks are examined in Varmaziari and Dehghani (2017) and Cheng et al. (2019). In Pan et al. (2018), the power flow layout is integrated with the supervisory control and data acquisition (SCADA) communication framework layout, and algorithms for improving attack detection and system security are examined. Countermeasures against unobservable attacks are suggested in Dehghani et al. (2020), where phasor measurement units (PMUs) are assumed to be sufficiently secure to resist attacks. Because CAs can look like naturally occurring events in the process, it can be hard to distinguish malicious from non-malicious data in communication systems. Chattopadhyay et al. (2017) studied how to distinguish CAs from disturbances, since a disturbance can appear to be a CA and vice versa. Such confusion leads to incorrect classification, inappropriate actions, and other problems for power systems (Liu et al., 2021). Power system disturbances can be categorized and grouped by a number of data-mining techniques. Resilient cooperative event-triggered control and scheduling in the presence of DoS attacks have also been considered (Cong et al., 2021).

The size of the feature set can considerably affect the precision and speed of intrusion detection. More features do not guarantee better efficiency, since they need more memory, take longer to process, and may have higher noise-to-signal ratios. Feature selection has been shown to be critical for faster intrusion detection in networks with heavy information traffic (Panthi, 2021). The intrusion detection model in Liang et al. (2020), which also uses feature selection, provided good detection accuracy across varied features; compared to other feature selection techniques, that model had a high true-positive (TP) rate and a low false-positive (FP) rate. Artificial neural networks (ANNs) have been widely applied in deep learning for their efficiency and simplicity, including for intrusion detection in electrical grids (Reddy et al., 2021). Training an ANN remains difficult, however, because conventional training algorithms suffer from slow convergence and local optima. One recent trend is to train ANNs with heuristic optimization methods based on physical or biological principles in order to determine the most effective weights and biases (Cui et al., 2020).

The present study proposes training an ANN with the gray wolf algorithm (GWA) to create an intrusion detection model, referred to as GWA–ANN. In general, ANN structures are classified by the arrangement of their neurons as feed-forward networks, feedback networks, or self-organizing maps. The multilayer perceptron (MLP), a type of feed-forward neural network (FFNN), uses a hidden layer to transform inputs into outputs. Conventionally, such a network is trained with the back-propagation algorithm as a supervised learning model. The GWA, a powerful swarm-based intelligent search approach (Qiao et al., 2021), is applied here to identify attacks while overcoming the slow convergence and “local minima” traps associated with ANN training. The GWA is known for its ability to locate the neighborhood of the global optimum and is considered accurate and efficient for solving optimization problems. It is used as the trainer of the FFNN because it is a flexible, gradient-free method that can avoid local optima, addresses many optimization problems, and has been reported to outperform other current optimization methods for training MLPs. Based on these reported outcomes, this study uses the GWA to train an ANN to detect cyber intrusions in SGs. In our model, the GWA minimizes the mean square error (MSE) to find the best weights for the ANN. The effectiveness of the GWA–ANN model is evaluated using several statistical measures: accuracy, precision, recall, and $F_{1}$ score. The suggested GWA–ANN model is also compared with other intrusion detection models on the databases of CAs in electrical grids held at Mississippi State University (MSU).
On these large-scale power system intrusion datasets, the GWA–ANN is shown to produce better predictions in most cases and to distinguish among diverse categories of unknown events.

The rest of this study is organized as follows: The GWA–ANN-based intrusion detection model is presented in Section 2. The power system structure and the datasets used in this study are described in Section 3. Experimental results are presented in Section 4 to show the algorithm’s effectiveness. Section 5 presents conclusions and future work.

2 Intrusion Detection System Based on the Gray Wolf Algorithm–Artificial Neural Network

2.1 Artificial Neural Network

ANNs are well-known classification methods that simulate the activity of biological neurons in the brain (Lan et al., 2021). ANNs differ from conventional classification techniques in that they learn relationships dynamically from training inputs instead of relying on predefined relationships (Zou et al., 2021). The ANN classification method includes training and testing phases (Kumar et al., 2018). The weighted sum of the inputs at hidden node $j$ is determined as follows:

S_{j} = \sum_{i = 1}^{m} I_{i} w_{i j} + β_{j} . (1)

Here, $I_{i}$ is the input variable, $w_{i j}$ is the connection weight between input node $i$ and hidden node $j$ , and $β_{j}$ is the bias of hidden node $j$ . The output of every hidden-layer node is determined using the sigmoid activation function:

f_{j} = \frac{1}{1 + e^{- S_{j}}} . (2)

The final output of every node $k$ in the network’s output layer is determined as follows:

{\hat{O}}_{k} = \sum_{j = 1}^{h} f_{j} w_{j k} + β_{k} . (3)

Here, $w_{j k}$ is the connection weight between hidden node $j$ and output node $k$ , and $β_{k}$ is the bias of output node $k$ .
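Eqs. 1–3 can be traced in a few lines of code. The following is a minimal sketch, not the authors' implementation: the function name `mlp_forward`, the list-of-lists weight layout, and the toy numbers are all assumptions made for illustration.

```python
import math

def mlp_forward(inputs, w_hidden, b_hidden, w_out, b_out):
    """One-hidden-layer MLP forward pass for a single input vector (Eqs. 1-3).

    w_hidden[i][j] links input node i to hidden node j;
    w_out[j][k] links hidden node j to output node k.
    """
    hidden = []
    for j in range(len(b_hidden)):
        # Eq. 1: weighted input sum plus the hidden node's bias
        s_j = sum(x * w_hidden[i][j] for i, x in enumerate(inputs)) + b_hidden[j]
        # Eq. 2: sigmoid activation
        hidden.append(1.0 / (1.0 + math.exp(-s_j)))
    # Eq. 3: linear combination of hidden outputs plus the output node's bias
    return [
        sum(f * w_out[j][k] for j, f in enumerate(hidden)) + b_out[k]
        for k in range(len(b_out))
    ]

# Toy network (assumed sizes): m = 2 inputs, h = 2 hidden nodes, k = 1 output.
# All-zero hidden weights give S_j = 0, so every f_j = 0.5.
outputs = mlp_forward(
    inputs=[1.0, 2.0],
    w_hidden=[[0.0, 0.0], [0.0, 0.0]],
    b_hidden=[0.0, 0.0],
    w_out=[[1.0], [1.0]],
    b_out=[0.25],
)
print(outputs)  # [1.25]
```

With zero hidden weights the sigmoid returns 0.5 for both hidden nodes, so the output is 0.5 + 0.5 + 0.25, which makes the three equations easy to check by hand.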

2.2 Gray Wolf Optimizer

The gray wolf optimizer (GWO), referred to as the GWA elsewhere in this article, is a swarm-based optimization algorithm that mimics the hunting behavior and leadership hierarchy of gray wolves. Its mathematical formulation is described in Mirjalili et al. (2020).

2.2.1 Gray Wolf Optimizer Inspiration

Their hunting behavior makes gray wolves one of the top predators in the food chain. Figure 1 shows the four subgroups of gray wolves ranked by dominance, namely, alpha ( $α$ ), beta ( $β$ ), delta ( $δ$ ), and omega ( $ω$ ). The alpha wolf occupies the top position in the pack hierarchy because of its experience in choosing habitats and hunting prey for the pack. Beta wolves form the second level; they help the alpha wolf manage the pack and perform other functions. Delta wolves, third in the hierarchy, serve mainly to protect the pack against dangers and to help weaker members. Omega wolves are the remaining wolves, at the bottom of the hierarchy. This social hierarchy is the basis of the pack because it governs how the pack is managed and maintained. It also enables the pack to hunt systematically: once prey is found, the alpha leads the pack in tracking and encircling it and commands the beta and delta wolves to attack. If the prey tries to escape, the omega wolves assist the beta and delta wolves in catching it.

FIGURE 1. Hierarchy levels of a gray wolf pack.

2.2.2 Gray Wolf Optimization Method

Following the hunting strategy of gray wolves, the GWO algorithm searches by encircling and attacking prey. Like other meta-heuristic methods, GWO begins by selecting a random set of solutions (wolves). Every solution is a wolf position vector $X$ in the search space, and the length of $X$ equals the dimension of the problem. For PSPSH, the length of $X$ is the number of SAs, $m$ , and its entries are the starting times of the SAs, so $X^{z} = (X_{1}^{z}, X_{2}^{z}, \dots, X_{m}^{z})$ , in which $X_{i}^{z}$ is the value for SA $i$ at the $z^{\mathrm{th}}$ iteration. In every iteration, the alpha wolf is the best solution, while the beta and delta wolves are the second and third best, respectively. The rest of the solutions are assigned as omega wolves. The omega wolves assist in the hunt by encircling the prey under the guidance of the alpha, beta, and delta wolves, according to the following formulas:

d = | c \cdot X_{p, z} - X_{z} |, (4)

X_{z + 1} = X_{p, z} - μ \cdot d, (5)

μ = 2 \cdot b \cdot r_{1} - b, (6)

c = 2 \cdot r_{2} . (7)

Here, $X_{p, z}$ is the prey position at the $z^{\mathrm{th}}$ iteration, $X_{z}$ is the wolf position at the $z^{\mathrm{th}}$ iteration, $X_{z + 1}$ is the wolf position at the ${(z + 1)}^{\mathrm{th}}$ iteration, $μ$ and $c$ are two coefficient vectors, $b$ decreases linearly from 2 to 0 over the course of the iterations, and $r_{1}$ and $r_{2}$ are two random vectors with entries in (0, 1). The GWO formulations have been revised to be more reasonable and realistic and to avoid conflicting with the PSPSH formulation presented in Gosain and Sachdeva (2020).

All omega wolf solutions are updated in every iteration based on the three best solutions (viz., the alpha, beta, and delta solutions) using the following formulations:

d_{α} = | c_{α} \cdot X_{α} - X |, (8)

d_{β} = | c_{β} \cdot X_{β} - X |, (9)

d_{δ} = | c_{δ} \cdot X_{δ} - X |, (10)

X_{1}^{'} = X_{α} - μ_{α} \cdot d_{α}, (11)

X_{2}^{'} = X_{β} - μ_{β} \cdot d_{β}, (12)

X_{3}^{'} = X_{δ} - μ_{δ} \cdot d_{δ}, (13)

X_{z + 1} = \frac{X_{1}^{'} + X_{2}^{'} + X_{3}^{'}}{3} . (14)

In GWO, exploration and exploitation are balanced, and stagnation in local optima is avoided, by means of $μ$ . GWO explores the search space when $| μ | > 1$ and exploits it when $| μ | < 1$ . Avoidance of local optima also relies on the value of $c$ , which changes randomly throughout the iterations.
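The position update in Eqs. 6–14 can be sketched for a single wolf as follows. This is an illustrative sketch only: the function name `gwo_update` and the toy leader positions are hypothetical, and $r_{1}$ and $r_{2}$ are drawn independently per dimension and per leader, which is one common reading of the equations rather than something the text specifies.

```python
import random

def gwo_update(wolf, alpha, beta, delta, b, rng):
    """Move one wolf toward the three leaders (Eqs. 6-14)."""
    new_position = []
    for d in range(len(wolf)):
        candidates = []
        for leader in (alpha, beta, delta):
            mu = 2.0 * b * rng.random() - b            # Eq. 6
            c = 2.0 * rng.random()                     # Eq. 7
            dist = abs(c * leader[d] - wolf[d])        # Eqs. 8-10
            candidates.append(leader[d] - mu * dist)   # Eqs. 11-13
        new_position.append(sum(candidates) / 3.0)     # Eq. 14
    return new_position

rng = random.Random(0)
alpha, beta, delta = [0.0, 0.0], [3.0, 3.0], [6.0, 6.0]

# With b = 0 every mu is 0, so the wolf lands on the mean of the leaders.
print(gwo_update([9.0, -4.0], alpha, beta, delta, b=0.0, rng=rng))  # [3.0, 3.0]

# With b = 2 (first iteration) mu can reach magnitude 2, so the step is
# randomized and may carry the wolf well past the leaders (exploration).
print(gwo_update([9.0, -4.0], alpha, beta, delta, b=2.0, rng=rng))
```

The first call shows the end-of-run behavior, where the pack collapses onto the leaders; the second shows why early iterations with large $| μ |$ spread the search.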

2.2.3 Gray Wolf Optimizer on the Basis of the Local Search Algorithm Process

Recent optimization studies have presented hybrid optimization methods for improving the efficiency of base methods and enhancing their outcomes (Al-Ghussain et al., 2021a). This part presents GWO–MCA, a hybrid algorithm that combines GWO with a local search algorithm (MCA). GWO–MCA is intended to address the shortcomings of GWO that degrade its best solution, such as low accuracy and slow convergence speed (Zhou and Lei, 2021). MCA is used because its search process is easy and quick and requires no equations. MCA is also one of the most widely used methods for dealing with CSPs like PSPSH.

The $A_{l}$ parameter is used to invoke MCA in the exploitation phase of GWO. In GWO, the $A_{l}$ variable behaves like the $μ$ parameter: if $| A_{l} | > 1$ , GWO explores the search space, and if $| A_{l} | < 1$ , GWO exploits it. $A_{l}$ is determined in the following way:

A_{l} = 2 \times a_{l} \times r_{1} - a_{l} . (15)

Here, $r_{1}$ is chosen randomly from (0, 1], and $a_{l}$ decreases linearly from 2 to 0 over the iterations according to Eq. 16.

a_{l} = 2 - \left( 2 \times \frac{\mathit{itr}}{I} \right) . (16)

Here, $\mathit{itr}$ indicates the current iteration and $I$ represents the maximum number of iterations. The first step in the GWO–MCA process is to initialize the CSP and GWO variables. Then, fitness values are calculated for every solution, and the three best solutions are assigned to $X_{α}$ , $X_{β}$ , and $X_{δ}$ , respectively. The parameters $r_{1}$ , $a_{l}$ , and $A_{l}$ are calculated in the next step. When $| A_{l} | < 1$ , MCA selects one of the three best solutions (i.e., $X_{α}$ , $X_{β}$ , or $X_{δ}$ ) at random and tries to minimize the number of conflicts among its variables. MCA improves one of the three best solutions because of their influence on the other solutions. The fitness of the new solution is then computed and the solutions are ranked again. Then, GWO updates $X_{α}$ , $X_{β}$ , $X_{δ}$ , and the rest of the solutions to find better solutions. Fitness values are calculated and solutions are improved until the stop criterion is reached.
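The switch between exploration and exploitation can be made concrete by sampling $A_{l}$ at a few iterations. The sketch below is illustrative: the function names are hypothetical, and Python's `random()` draws from [0, 1) rather than the (0, 1] interval stated in the text, a difference that should not matter for this purpose.

```python
import random

def a_l(itr, max_iter):
    """Eq. 16: control parameter falling linearly from 2 to 0."""
    return 2.0 - 2.0 * itr / max_iter

def coefficient_a_l(itr, max_iter, rng):
    """Eq. 15: coefficient A_l, with r_1 drawn uniformly at random."""
    a = a_l(itr, max_iter)
    return 2.0 * a * rng.random() - a

rng = random.Random(1)
max_iter = 100
for itr in (0, 50, 100):
    samples = [coefficient_a_l(itr, max_iter, rng) for _ in range(1000)]
    # Share of draws with |A_l| < 1, i.e. draws that would trigger the
    # local search (exploitation) step.
    exploit_share = sum(abs(s) < 1.0 for s in samples) / len(samples)
    print(itr, a_l(itr, max_iter), round(exploit_share, 2))
```

Because $| A_{l} |$ can never exceed $a_{l}$ , exploitation is only sometimes selected while $a_{l} > 1$ (the first half of the run) and is essentially always selected once $a_{l} \le 1$ .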

2.3 The Artificial Neural Network Architecture

This study implements two MLP networks. An MLP has three layers: the input, hidden, and output layers. Figure 2 shows the typical MLP architecture. The number of input nodes is $m$ , the number of hidden nodes is $h$ , and the number of output nodes is $k$ . These numbers differ for the binary, trinary-class, and multiple-class problems. Figure 3 shows the training method of the MLP. Inputs are taken and outputs are generated according to the current weights and biases. A loss function compares the output of the feed-forward pass with the target. In the conventional MLP, the Levenberg–Marquardt back-propagation method (Kaveh et al., 2020) then updates the weights and biases for the next iteration. In the suggested MLP, the GWA updates the weights and biases for the next iteration.

FIGURE 2. Generic structure of the MLP.

FIGURE 3. Training method of the MLP.

2.4 The Proposed Gray Wolf Algorithm–Artificial Neural Network

Here, the GWA is used to train an ANN to distinguish CAs from normal events in energy systems. The GWA–ANN first initializes the search agents, each of which encodes a candidate neural network (NN). An MLP network has vectors of weights and biases describing the connections between the input and hidden layers and between the hidden and output layers (Qiao et al., 2021). Equation 17 gives the total number of weight and bias parameters, $V$ , of the MLP network to be optimized using the GWA, where $q$ is the number of input nodes and $p$ is the number of neurons in the hidden layer. For a single output node, this total comprises $p q$ input-to-hidden weights, $p$ hidden biases, $p$ hidden-to-output weights, and one output bias.

V = p q + 2 p + 1. (17)
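As a quick check of Eq. 17, the sketch below counts the parameters of a $q$-$p$-1 network and splits a flat search-agent vector into its parts. The function names and the ordering of parameters inside the vector are assumptions made for illustration; the article does not state how the vector is laid out.

```python
def count_parameters(q, p):
    """Eq. 17 for a q-p-1 MLP: p*q input weights, p hidden biases,
    p output weights, and 1 output bias."""
    return p * q + 2 * p + 1

def unpack(vector, q, p):
    """Split one flat search-agent vector into MLP weights and biases.

    The ordering used here is an assumption made for illustration.
    """
    if len(vector) != count_parameters(q, p):
        raise ValueError("vector length does not match Eq. 17")
    w_hidden = [vector[i * p:(i + 1) * p] for i in range(q)]  # q rows of p
    offset = q * p
    b_hidden = vector[offset:offset + p]
    w_out = vector[offset + p:offset + 2 * p]
    b_out = vector[-1]
    return w_hidden, b_hidden, w_out, b_out

# Architectures reported in Section 4 of the article.
print(count_parameters(76, 20))  # 1561
print(count_parameters(92, 20))  # 1881
```

Each wolf in the search therefore carries a position vector of 1,561 entries for the 76-20-1 network and 1,881 entries for the 92-20-1 networks.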

By using the MSE of the MLP as the fitness function, the search agents (wolves) can measure the difference between the predicted and actual classes. Equation 18 defines the MSE, in which $O_{i}$ represents the actual output for input instance $i$ , ${\hat{O}}_{i}$ is the estimated output for input instance $i$ , and $n$ is the number of instances.

\mathrm{MSE} = \frac{\sum_{i = 1}^{n} {(O_{i} - {\hat{O}}_{i})}^{2}}{n} . (18)
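A minimal sketch of the fitness computation in Eq. 18 follows. The function name and the example labels and outputs are hypothetical and are not taken from the datasets.

```python
def mean_squared_error(actual, predicted):
    """Eq. 18: average squared gap between actual and predicted outputs."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((o - o_hat) ** 2 for o, o_hat in zip(actual, predicted)) / len(actual)

# Hypothetical labels (1 = attack, 0 = normal) and network outputs.
actual = [1, 0, 1, 1]
predicted = [0.5, 0.0, 1.0, 0.5]
print(mean_squared_error(actual, predicted))  # 0.125
```

Here two of the four outputs miss by 0.5, contributing 0.25 each, so the MSE is 0.5 / 4 = 0.125; a search agent with a lower value would be ranked as fitter.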

MATLAB R2018a was used to implement the GWA trainer for the experiments. Normalization is crucial for an MLP when the dataset attributes have different ranges (Qiao et al., 2021). The min–max normalization is shown in Eq. 19.

u^{'} = \frac{u - u_{\min}}{u_{\max} - u_{\min}} . (19)

Here, $u^{'}$ is the normalized value of $u$ , which lies in $[0, 1]$ when $u$ lies in $[u_{\min}, u_{\max}]$ .
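Eq. 19 applied to one feature column can be sketched as follows. The function name is hypothetical, and the handling of a constant column (where the denominator is zero) is an assumption added here; the article does not discuss that case.

```python
def min_max_normalize(column):
    """Eq. 19 applied to one feature column; results fall in [0, 1]."""
    u_min, u_max = min(column), max(column)
    if u_max == u_min:
        # Constant column: the ratio is undefined, so map everything to 0
        # (a handling choice assumed here, not taken from the article).
        return [0.0 for _ in column]
    return [(u - u_min) / (u_max - u_min) for u in column]

print(min_max_normalize([10.0, 15.0, 20.0]))  # [0.0, 0.5, 1.0]
```

In practice the minimum and maximum would typically be taken from the training subset and reused for the validation and testing subsets, so that no information leaks from the held-out data.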

A flowchart of the GWA–ANN training method is shown in Figure 4. After the data are imported, data cleansing is applied as preprocessing. Before the GWA–ANN classifies the data, the data are normalized and feature selection is performed to determine the number of input features. Next, the datasets are divided at random (using a Gaussian random distribution) into subsets: 20% for testing and 80% for training and validation (64% for training and 16% for validation), which follows the most common practice in ANN research. GWA–ANN classification is thus combined with feature selection (dimension reduction). To determine the accuracy of the model, the testing data are fed to the ANN classifier with the optimal weights and biases obtained in the training step.
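The 64/16/20 partition can be sketched as below. This is an assumption-laden illustration: the function name and seed are hypothetical, and a uniform shuffle is used in place of whatever random procedure the authors applied.

```python
import random

def split_indices(n_rows, seed=0):
    """Shuffle row indices and split them 64/16/20 into train/validation/test."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)  # uniform shuffle (assumed)
    n_test = round(0.20 * n_rows)
    n_val = round(0.16 * n_rows)
    test = indices[:n_test]
    val = indices[n_test:n_test + n_val]
    train = indices[n_test + n_val:]
    return train, val, test

train, val, test = split_indices(5000)
print(len(train), len(val), len(test))  # 3200 800 1000
```

The three index lists are disjoint and together cover every row, so no sample used for testing is seen during training or validation.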

FIGURE 4. GWA–ANN training scheme.

GWA–ANN can be effective at avoiding local optima. For the suggested model, this makes it easier to find the MLP weights and biases that yield high accuracy and good performance (Qiao et al., 2021).
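The overall idea of training an MLP with the GWA can be sketched end to end on toy data. This is not the authors' MATLAB implementation: it is a compact Python illustration in which the function names, the parameter ordering, the pack size, the iteration count, the initialization range, and the synthetic dataset are all assumptions. Following the description in Section 2.2.2, only the omega wolves are moved in each iteration, and the three leaders are carried over unchanged.

```python
import math
import random

def predict(vector, row, q, p):
    """Output of a q-p-1 MLP whose weights and biases are packed in `vector`."""
    out = vector[-1]                                   # output bias
    for j in range(p):
        s = vector[q * p + j]                          # bias of hidden node j
        s += sum(row[i] * vector[i * p + j] for i in range(q))
        s = max(-60.0, min(60.0, s))                   # guard exp() against overflow
        out += vector[q * p + p + j] / (1.0 + math.exp(-s))
    return out

def fitness(vector, rows, labels, q, p):
    """MSE between labels and network outputs (the quantity being minimized)."""
    errors = [(y - predict(vector, r, q, p)) ** 2 for r, y in zip(rows, labels)]
    return sum(errors) / len(errors)

def train_gwa_ann(rows, labels, q, p, n_wolves=20, n_iter=60, seed=0):
    rng = random.Random(seed)
    dim = p * q + 2 * p + 1                            # Eq. 17
    pack = [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(n_wolves)]

    def score(wolf):
        return fitness(wolf, rows, labels, q, p)

    history = []
    for z in range(n_iter):
        pack.sort(key=score)
        leaders = pack[:3]                             # alpha, beta, delta
        history.append(score(leaders[0]))
        b = 2.0 - 2.0 * z / n_iter                     # decreases from 2 toward 0
        moved = []
        for wolf in pack[3:]:                          # omega wolves follow the leaders
            new = []
            for d in range(dim):
                total = 0.0
                for leader in leaders:
                    mu = 2.0 * b * rng.random() - b
                    c = 2.0 * rng.random()
                    total += leader[d] - mu * abs(c * leader[d] - wolf[d])
                new.append(total / 3.0)
            moved.append(new)
        pack = leaders + moved
    pack.sort(key=score)
    history.append(score(pack[0]))
    return pack[0], history

# Toy data (hypothetical): label is 1 when the two features sum to more than 1.
data_rng = random.Random(42)
rows = [[data_rng.random(), data_rng.random()] for _ in range(40)]
labels = [1.0 if r[0] + r[1] > 1.0 else 0.0 for r in rows]
best, history = train_gwa_ann(rows, labels, q=2, p=3)
print(history[0], history[-1])  # initial and final best MSE
```

Because the leaders are retained from one iteration to the next, the best MSE in `history` can never increase; whether it decreases far enough to be useful depends on the data and on the assumed settings.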

3 Power Grid Structure and Explanations of the Dataset

3.1 Description of the Power Grid

The power system structure used in this study is illustrated in Figure 5. The system includes two generators, $G_{1}$ and $G_{2}$ ; three bus bars, $B_{1}$ through $B_{3}$ ; two transmission lines, $L_{1}$ and $L_{2}$ ; and four circuit breakers, $CB_{1}$ through $CB_{4}$ , which are controlled by four relays, $R_{1}$ through $R_{4}$ . A substation switch and a router connect the relays to the SCADA system. The relays use distance protection schemes to trip the breakers whenever a fault is detected, whether or not the fault is real, because they have no internal validation to tell the difference. These intelligent relays can also be operated manually, so that operators can trip the breakers through the relays (Wang et al., 2021; Zeng et al., 2021). The attack scenarios assume that attackers have already gained access to the substation network and can issue commands from the substation switch, as illustrated in the figure. Electricity is distributed to various equipment by means of the power distribution center (PDC). Several smart electronic tools, such as the control panel, Syslog, and Snort, shown at the bottom of the figure, monitor the whole grid.

FIGURE 5. Power grid structure.

3.2 Datasets and Attack Case Studies

The GWA–ANN is evaluated using the datasets of CAs in SGs from Oak Ridge National Laboratory (ORNL) and MSU (Morrison et al., 2021). The problem types and the groups of case studies are shown in Table 1. A total of 45 distinct datasets are available: 15 binary, 15 trinary-class, and 15 multiple-class datasets. No two datasets are identical. Every dataset contains over 5,000 samples, each corresponding to one of 37 event scenarios. As an example, one trinary-class dataset contains 5,236 observations, consisting of 292 no-event, 3,713 attack, and 1,212 natural observations (Mohamed et al., 2021b). The scenarios are similar to those employed in Kumar et al. (2018) and include short circuits, line maintenance, remote command injection, relay setting changes, and FDIs. Of the 37 event scenarios, the binary datasets treat 28 as CA case studies and nine as normal operation case studies. The trinary-class datasets contain 28 CA case studies, eight natural event case studies, and one no-event case study. In a multiple-class dataset, each of the 37 case studies is a class of its own.

TABLE 1. Problem types and case studies.

A comprehensive list of the 37 case studies in the MSU/ORNL dataset (eight natural events, 28 CAs, and one no event) is presented in Tables 2–4.

TABLE 2. Natural event case studies in the MSU/ORNL data.

TABLE 3. CA event case studies in the MSU/ORNL data.

TABLE 4. No-event case study in the MSU/ORNL data.

Each dataset has 129 columns: 128 feature columns and one class label column. The short names of the features are shown in Table 5. Of the 128 features, 116 are measured by four PMUs. PMUs, or synchrophasors, measure electrical waves on an electrical network using a common time source for synchronization. The first four columns of Table 5 list the measurements of the four PMUs, each of which measures 29 relay features. The last column lists twelve additional features from the control panel, Snort, and relay logs.

TABLE 5. Features in the datasets.

Table 6 lists the symbols used in the feature names. For instance, (R2-PM2:V) (in column Relay 2 and row #4) denotes the Phase B voltage magnitude measured by PMU R2, while (R3-PA:ZH) (in column Relay 3 and row #28) denotes the impedance angle measured by PMU R3.

TABLE 6. Symbols used in the feature names.

Each of the 45 datasets contains around 650,000 data points (5,000 rows by 129 columns), and the 45 datasets together contain about $29 \times 10^{6}$ data points.
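As a rough check of these totals, taking exactly 5,000 rows per dataset (an assumption; the text states only that each dataset has over 5,000 samples):

5{,}000 \times 129 = 645{,}000 \approx 650{,}000, \qquad 45 \times 645{,}000 = 29{,}025{,}000 \approx 29 \times 10^{6} .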

4 Explanations and Outcomes of Experiments

In our experiments, the maximum number of iterations is set to 100 and the number of search agents to 50. These parameter values are typical for the GWA and work well in most cases. Preprocessing and feature selection lead to the selection of 76, 92, and 92 of the 128 features for the binary, trinary-class, and multiple-class problems, respectively. In other words, the MLP network architectures for these three kinds of problems are 76-20-1, 92-20-1, and 92-20-1, respectively. Section 4.1 describes model training and validation using the Dataset 15 binary classification problem. Section 4.2 then presents the results for all 45 datasets and problem types, obtained with the procedure described in Section 4.1.

4.1 Model Training and Validation

The efficiency of the suggested GWA–ANN model is measured by accuracy, precision, recall, and $F_{1}$ score. The binary classification confusion matrix of the suggested model is presented in Table 7, with actual classes in rows and predicted classes in columns. A true positive (TP) is a real CA event that is classified as a CA; a true negative (TN) is a normal event that is classified as normal; a false positive (FP) is a normal event that is classified as a CA; and a false negative (FN) is a real CA that is classified as a normal event.

TABLE 7. Confusion matrix.

Eqs. 20–23 define the accuracy, precision, recall, and $F_{1}$ score in terms of the entries of Table 7. Accuracy (Eq. 20) measures how often the classifier is right overall (Al-Ghussain et al., 2021b; Al-Ghussain et al., 2022). Precision (Eq. 21) measures how often the classifier is right when it predicts a CA. Recall (Eq. 22) measures how often a CA that really happens is identified correctly. The $F_{1}$ score (Eq. 23) combines precision and recall as their harmonic mean.

\mathrm{Accuracy} = \frac{TP + TN}{TP + FP + FN + TN}, (20)

\mathrm{Precision} = \frac{TP}{TP + FP}, (21)

\mathrm{Recall} = \frac{TP}{TP + FN}, (22)

F_{1}\ \mathrm{Score} = 2 \times \frac{\mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}} . (23)
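Eqs. 20–23 translate directly into code. In the sketch below, the function name is hypothetical, the confusion-matrix counts are made up for illustration and are not results from the article, and returning 0 when a denominator is zero is a convention assumed here.

```python
def classification_metrics(tp, tn, fp, fn):
    """Accuracy, precision, recall, and F1 score from Eqs. 20-23."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return accuracy, precision, recall, f1

# Hypothetical counts, not results from the article.
print(classification_metrics(tp=6, tn=10, fp=2, fn=2))  # (0.8, 0.75, 0.75, 0.75)
```

With these counts, 16 of 20 events are classified correctly (accuracy 0.8), 6 of the 8 predicted attacks are real (precision 0.75), and 6 of the 8 real attacks are caught (recall 0.75), so the $F_{1}$ score is also 0.75.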

Figure 6 shows the convergence curve of the accuracy while the GWA tunes the ANN on the Dataset 15 binary classification problem. According to the figure, the accuracy of the model rises slowly as the number of iterations increases. From iteration 64 onward, the accuracy jumps and rapidly stabilizes at approximately $99\%$ .

FIGURE 6. Convergence curve of the ANN accuracy during the GWA-tuning process.

Figure 7 shows the error histogram, with 20 bins, of the training, validation, and testing errors for the Dataset 15 binary classification problem. This figure illustrates how well the trained model fits the dataset. The majority of errors are concentrated in a narrow region near zero, with $0.02592$ being the most frequent error value.

FIGURE 7. Error histogram for training, validation, and testing with the GWA–ANN.

4.2 Test Results for the Mississippi State University/ORNL Datasets

The classification results in this subsection are based on all 45 MSU/ORNL datasets. Figures 8–11 show the accuracy, precision, recall, and $F_{1}$ score of the suggested model for the binary, trinary-class, and multiple-class problems.

FIGURE 8. Accuracy across 15 multiple-class, 15 trinary-class, and 15 binary datasets.

FIGURE 9. Precision across 15 multiple-class, 15 trinary-class, and 15 binary datasets.

FIGURE 10. Recall across 15 multiple-class, 15 trinary-class, and 15 binary datasets.

FIGURE 11. $F_{1}$ score across 15 binary, 15 trinary-class, and 15 multiple-class datasets.

Figures 12–15 compare the mean accuracy, precision, recall, and $F_{1}$ score over the 45 datasets with those of classifiers commonly used in the literature, namely OneR, JRip, AdaBoost + JRip, SVM (Panthi, 2021), and an NN without the GWA. According to the figures, GWA–ANN performs better than the other algorithms in most cases.

FIGURE 12. Mean accuracy of the classifiers over the 45 datasets.

FIGURE 13. Mean precision of the classifiers over the 45 datasets.

FIGURE 14. Mean recall of the classifiers over the 45 datasets.

FIGURE 15. Mean $F_{1}$ score of the classifiers over the 45 datasets.

5 Conclusion

Detecting suspicious or anomalous events with very high speed and accuracy is essential for reliable SG operation and management. Because power systems depend heavily on cyber infrastructure, cybersecurity is a significant problem. This infrastructure is needed to distribute and process the huge amounts of real-time data produced during system operation. This study addresses several weaknesses of conventional ANN training algorithms, such as trapping in local minima. It uses the GWA–ANN model to classify CAs and detect failures in the electrical grid, applying the MSU/ORNL datasets at several difficulty levels (binary, trinary-class, and multiple-class). The GWA trains the ANN to obtain the weights and biases that minimize the MSE in the classification task. The efficiency of the suggested GWA–ANN is evaluated using standard metrics: accuracy, precision, recall, and $F_{1}$ score. The experiments demonstrated that the suggested method can detect CA data in electrical systems efficiently. The GWA–ANN outperforms other classification methods, such as OneR, JRip, AdaBoost + JRip, SVM, and NN (without the GWA), which is attributed to its strong capability to explore the search space and avoid local optima. The suggested model can be updated periodically: if an unidentified event is later confirmed as a CA by human operators, it should be given the confirmed label and added to the training library so that it can be used to detect potential CAs in the future.

Data Availability Statement

The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.

Author Contributions

All authors listed have made a substantial, direct, and intellectual contribution to the work and approved it for publication.

Conflict of Interest

TY, KD, ZW, YL, XL, DB, and CY were employed by the company Guangxi Power Grid Co., Ltd.

This work was supported by the Guangxi Power Grid Co., Ltd., Research on vulnerability defense technology of power monitoring system based on interdependent network (047000KK52210031), Research and application of security attack monitoring technology for power intranet terminal equipment based on counter deep learning (047000KK52200012).

Publisher’s Note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors, and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

References

Al-Ghussain, L., Abubaker, A. M., and Ahmad, A. D. (2021). Superposition of Renewable-Energy Supply from Multiple Sites Maximizes Demand-Matching: Towards 100% Renewable Grids in 2050. Appl. Energ. 284, 116402. doi:10.1016/j.apenergy.2020.116402

Al-Ghussain, L., Ahmad, A. D., Abubaker, A. M., Abujubbeh, M., Almalaq, A., and Mohamed, M. A. (2021). A Demand-Supply Matching-Based Approach for Mapping Renewable Resources towards 100% Renewable Grids in 2050. IEEE Access 9, 58634–58651. doi:10.1109/ACCESS.2021.3072969

Al-Ghussain, L., Subaih, M. A., and Annuk, A. (2022). Evaluation of the Accuracy of Different PV Estimation Models and the Effect of Dust Cleaning: Case Study a 103 MW PV Plant in Jordan. Sustainability 14 (2), 982. doi:10.3390/su14020982

Alnowibet, K., Annuk, A., Dampage, U., and Mohamed, M. A. (2021). Effective Energy Management via False Data Detection Scheme for the Interconnected Smart Energy Hub-Microgrid System under Stochastic Framework. Sustainability 13 (21), 11836. doi:10.3390/su132111836

Chattopadhyay, A., Ukil, A., Jap, D., and Bhasin, S. (2017). Toward Threat of Implementation Attacks on Substation Security: Case Study on Fault Detection and Isolation. IEEE Trans. Ind. Inform. 14 (6), 2442–2451. doi:10.1109/TII.2017.2770096

Chen, J., Mohamed, M. A., Dampage, U., Rezaei, M., Salmen, S. H., Obaid, S. A., et al. (2021). A Multi-Layer Security Scheme for Mitigating Smart Grid Vulnerability against Faults and Cyber-Attacks. Appl. Sci. 11 (21), 9972. doi:10.3390/app11219972

Cheng, G., Song, S., Lin, Y., Huang, Q., Lin, X., and Wang, F. (2019). Enhanced State Estimation and Bad Data Identification in Active Power Distribution Networks Using Photovoltaic Power Forecasting. Electric Power Syst. Res. 177, 105974. doi:10.1016/j.epsr.2019.105974

Cong, M., Mu, X., and Hu, Z. (2021). Sampled-data-based Event-Triggered Secure Bipartite Tracking Consensus of Linear Multi-Agent Systems under DoS Attacks. J. Franklin Inst. 358 (13), 6798–6817. doi:10.1016/j.jfranklin.2021.07.012

Cui, H., Dong, X., Deng, H., Dehghani, M., Alsubhi, K., and Aljahdali, H. M. (2020). Cyber Attack Detection Process in Sensor of DC Micro-grids under Electric Vehicle Based on Hilbert-Huang Transform and Deep Learning. IEEE Sensors J. 21, 15885–15894. doi:10.1109/jsen.2020.3027778

Dehghani, M., Kavousi‐Fard, A., Dabbaghjamanesh, M., and Avatefipour, O. (2020). Deep Learning Based Method for False Data Injection Attack Detection in AC Smart Islands. IET Generation, Transm. & Distribution 14 (24), 5756–5765. doi:10.1049/iet-gtd.2020.0391

Gosain, A., and Sachdeva, K. (2020). “Random Walk Grey Wolf Optimizer Algorithm for Materialized View Selection (RWGWOMVS),” in Novel Approaches Inf. Syst. Des. (PA, United States: IGI Global), 101–122. doi:10.4018/978-1-7998-2975-1.ch005

Kaveh, K., Kaveh, H., Bui, M. D., and Rutschmann, P. (2020). Long Short-Term Memory for Predicting Daily Suspended Sediment Concentration. Eng. Comput. 8, 1–5. doi:10.1007/s00366-019-00921-y

Kumar, M. N., Koushik, K. V., and Sundar, K. J. (2018). Data Mining and Machine Learning Techniques for Cyber Security Intrusion Detection. Int. J. Scientific Res. Comput. Sci. Eng. Inf. Technol. 3 (3), 162–167.

Lan, T., Jermsittiparsert, K., Alrashood, S. T., Rezaei, M., Al-Ghussain, L., and Mohamed, M. A. (2021). An Advanced Machine Learning Based Energy Management of Renewable Microgrids Considering Hybrid Electric Vehicles' Charging Demand. Energies 14 (3), 569. doi:10.3390/en14030569

Liang, C., Shanmugam, B., Azam, S., Karim, A., Islam, A., Zamani, M., et al. (2020). Intrusion Detection System for the Internet of Things Based on Blockchain and Multi-Agent Systems. Electronics 9 (7), 1120. doi:10.3390/electronics9071120

Liu, Y., Jin, T., Mohamed, M. A., and Wang, Q. (2021). A Novel Three-step Classification Approach Based on Time-dependent Spectral Features for Complex Power Quality Disturbances. IEEE Trans. Instrum. Meas. 70, 1–14. doi:10.1109/tim.2021.3050187

Ma, F., Wang, B., Zhou, J., Jia, R., Luo, P., Wang, H., et al. (2021). An Effective Risk Identification Method for Power Fence Operation Based on Neighborhood Correlation Network and Vector Calculation. Energ. Rep. 7, 6995–7003. doi:10.1016/j.egyr.2021.10.061

Meng, F., Zou, Q., Zhang, Z., Wang, B., Ma, H., Abdullah, H. M., Almalaq, A., and Mohamed, M. A. (2021). An Intelligent Hybrid Wavelet-Adversarial Deep Model for Accurate Prediction of Solar Power Generation. Energ. Rep. 7, 2155–2164. doi:10.1016/j.egyr.2021.04.019

Mirjalili, S., Aljarah, I., Mafarja, M., Heidari, A. A., and Faris, H. (2020). Grey Wolf Optimizer: Theory, Literature Review, and Application in Computational Fluid Dynamics Problems. Nature-Inspired Optimizers, 87–105. doi:10.1007/978-3-030-12127-3_6

Mohamed, M. A., Almalaq, A., Abdullah, H. M., Alnowibet, K. A., Alrasheedi, A. F., and Zaindin, M. S. A. (2021). A Distributed Stochastic Energy Management Framework Based-Fuzzy-PDMM for Smart Grids Considering Wind Park and Energy Storage Systems. IEEE Access 9, 46674–46685. doi:10.1109/access.2021.3067501

Mohamed, M. A., Mirjalili, S., Dampage, U., Salmen, S. H., Obaid, S. A., and Annuk, A. (2021). A Cost-Efficient-Based Cooperative Allocation of Mining Devices and Renewable Resources Enhancing Blockchain Architecture. Sustainability 13 (18), 10382. doi:10.3390/su131810382

Morrison, R., Liu, X., and Lin, Z. (2021). Anomaly Detection in Wind Turbine SCADA Data for Power Curve Cleaning. Renew. Energ. 184, 473–486. doi:10.1016/j.renene.2021.11.118

Nazir, A., and Khan, R. A. (2021). A Novel Combinatorial Optimization Based Feature Selection Method for Network Intrusion Detection. Comput. Security 102, 102164. doi:10.1016/j.cose.2020.102164

Pan, K., Teixeira, A., Cvetkovic, M., and Palensky, P. (2018). Cyber Risk Analysis of Combined Data Attacks against Power System State Estimation. IEEE Trans. Smart Grid 10 (3), 3044–3056. doi:10.1109/TSG.2018.2817387

Panthi, M. (2021). Identification of Disturbances in Power System and DDoS Attacks Using Machine Learning. IOP Conf. Ser. Mater. Sci. Eng. 1022 (1), 012096. doi:10.1088/1757-899x/1022/1/012096

Qiao, W., Khishe, M., and Ravakhah, S. (2021). Underwater Targets Classification Using Local Wavelet Acoustic Pattern and Multi-Layer Perceptron Neural Network Optimized by Modified Whale Optimization Algorithm. Ocean Eng. 219, 108415. doi:10.1016/j.oceaneng.2020.108415

Reddy, D. K., Behera, H. S., Nayak, J., Vijayakumar, P., Naik, B., and Singh, P. K. (2021). Deep Neural Network Based Anomaly Detection in Internet of Things Network Traffic Tracking for the Applications of Future Smart Cities. Trans. Emerging Telecommunications Tech. 32 (7), e4121. doi:10.1002/ett.4121

Varmaziari, H., and Dehghani, M. (2017). “Cyber-attack Detection System of Large-Scale Power Systems Using Decentralized Unknown Input Observer,” in 2017 Iranian Conference on Electrical Engineering (ICEE), May 2, 2017 (IEEE), 621–626. doi:10.1109/iraniancee.2017.7985114

Wang, Q., Jin, T., Mohamed, M. A., and Deb, D. (2021). A Novel Linear Optimization Method for Section Location of Single-phase Ground Faults in Neutral Noneffectively Grounded Systems. IEEE Trans. Instrum. Meas. 70, 1–10. doi:10.1109/tim.2021.3066468

Xue, P. (2021). Impact of Large-Scale Mobile Electric Vehicle Charging in Smart Grids: A Reliability Perspective. Front. Energ. Res. 9, 688034. doi:10.3389/fenrg.2021.688034

Zeng, L., Xia, T., Elsayed, S. K., Ahmed, M., Rezaei, M., Jermsittiparsert, K., Dampage, U., and Mohamed, M. A. (2021). A Novel Machine Learning-Based Framework for Optimal and Secure Operation of Static VAR Compensators in EAFs. Sustainability 13 (11), 5777. doi:10.3390/su13115777

Zhang, Z., Deng, R., Yau, D. K. Y., and Cheng, P. (2021). Zero-Parameter-Information Data Integrity Attacks and Countermeasures in IoT-Based Smart Grid. IEEE Internet Things J. 8 (8), 6608–6623. doi:10.1109/jiot.2021.3049818

Zhou, B., and Lei, Y. (2021). Bi-objective Grey Wolf Optimization Algorithm Combined Levy Flight Mechanism for the FMC Green Scheduling Problem. Appl. Soft Comput. 111, 107717. doi:10.1016/j.asoc.2021.107717

Zou, H., Tao, J., Elsayed, S. K., Elattar, E. E., Almalaq, A., and Mohamed, M. A. (2021). Stochastic Multi-Carrier Energy Management in the Smart Islands Using Reinforcement Learning and Unscented Transform. Int. J. Electr. Power Energ. Syst. 130, 106988. doi:10.1016/j.ijepes.2021.106988

Keywords: smart grid, cyberattack, intrusion detection system, advanced machine learning, smart city

Citation: Yu T, Da K, Wang Z, Ling Y, Li X, Bin D and Yang C (2022) An Advanced Accurate Intrusion Detection System for Smart Grid Cybersecurity Based on Evolving Machine Learning. Front. Energy Res. 10:903370. doi: 10.3389/fenrg.2022.903370

Received: 24 March 2022; Accepted: 11 April 2022; Published: 30 May 2022.

Edited by:

Loiy Al-Ghussain, University of Kentucky, United States

Reviewed by:

Mohamed A. Mohamed, Minia University, Egypt
Mostafa Rezaei, Griffith University, Australia

Copyright © 2022 Yu, Da, Wang, Ling, Li, Bin and Yang. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Tong Yu, Yutongpgco@gmail.com
