LSTM-RNN-FNN Model for Load Forecasting Based on Deleuze’s Assemblage Perspective

Reliable load forecasting is essential for electricity generation and even for people’s lives. However, existing load forecasting theories cannot meet the requirements of complex systems (e.g., smart grids). Deleuze’s metaphysical complexity theory is regarded as a theoretical foundation for comprehending complex systems, and thus a new perspective based on Deleuze’s assemblage concept is proposed. According to the assemblage perspective, the electrical load is a quantitative representation of the mutual becoming of people and electricity, and load forecasting is an attempt to control this continuous process of deterritorialization and reterritorialization. We built an LSTM-RNN-FNN model for load forecasting based on the assemblage perspective, and the assessment results demonstrated that the model has high prediction performance. We also tested the effect of adding temperature as an input to the network; the correlation between temperature and load was not strong enough, so temperature may not be a suitable input for load prediction. The assemblage perspective has significant implications for future load forecasting and potentially smart grid research.


Background
Electricity is an enthralling subject to investigate, and it is an infrastructure in many ways. Since the end of the 19th century, electricity has been the foundation of the experience of modernity. Today, electricity powers a plethora of equipment and services on which modern lifestyles rely. Worldwide electricity usage is estimated to account for about 20% of total energy consumption in 2021 (Zhang and Li, 2021).
With the increasing reliance of human existence on power, electricity production planning is becoming increasingly vital. Because electricity is difficult to store, it is usually required to consume it shortly after it is generated. As a result, dependable load forecasting is the essential foundation for directing power generation. Unreliable projections, on the other hand, might have catastrophic implications. Overestimation of future electricity use will result in waste of primary energy and power producing facilities, whereas underestimation would result in power shortages, which will have a direct impact on people's daily lives.
Currently, a variety of methods have been developed to predict the load of the power grid, such as the time series method, regression method, exponential smoothing, random forest, and others (Kamel and Baharudin, 2007; Singh et al., 2012; Dodamani et al., 2015; Lahouar and Slama, 2015). However, each of these traditional methods has its own defects. For example, the time series method does not sufficiently consider the factors that drive load changes, including uncertain factors such as weather and holidays. When the weather changes greatly or a holiday occurs, the prediction error of the model is large. As a result, different neural networks (NNs), such as the convolutional neural network (CNN) and the recurrent neural network (RNN), have been used to forecast the load value with better results, and they have also been applied to smart grids (Zhang et al., 2010; Zheng et al., 2017; Kuo and Huang, 2018; Mohammad and Kim, 2020).
However, because future electricity consumption is a reflection of human social activities, it will be influenced by a variety of factors, including political conditions, the economy (Lin and Liu, 2016), human activities, population behavior (Hussain et al., 2016), climate factors (Hernández, 2013), and other external factors influencing electrical consumption forecasting accuracy. Forecasting electricity is complex and fraught with uncertainty.

Literature Review and Motivation
At present, there are several ways for predicting load, which may be broadly classified as the Markov chain or autoregressive techniques (Dodamani et al., 2015;Kamel and Baharudin, 2007;Li and Niu, 2009;Baharudin and Kamel, 2008). However, contemporary load forecasting research is still centered on cybernetics and views the power grid as an independent technology and facility that can no longer match the needs of complex systems (e.g., smart grids) (Xiao et al., 2021;Sun et al., 2021b,c,d). User demand, from the perspective of smart grids, is simply another managed resource that will aid in balancing supply and demand and ensuring system dependability. Furthermore, the availability of intermittent clean energy sources such as wind, solar, and water energy adds to the issues of smart grid stability (Sun et al., 2021a;Yang et al., 2022). As a result, the smart grid should be considered as part of a wider, unpredictable system that incorporates the natural environment and a huge number of users. Therefore, we require a new theory to guide power forecasting research in the context of smart grids.
Deleuze's metaphysical complexity theory provides a theoretical foundation for comprehending complex systems. A significant number of anthropologists have incorporated electricity's material-social entanglements into concepts of modernity and progress. Winther (2008) investigates the influence of electricity reaching rural populations in Africa based on ethnographic fieldwork in Zanzibar at various moments in time. Howe and Boyer (2015) discovered that the material politics of electricity flow through state power in their research on wind energy and the energy transition in Mexico. Akhil Gupta's research (Gupta, 2015) on electricity and class examines the link between the global South's developing middle class and rapidly growing power demand. Anusas and Ingold (2015) pondered whether we must think more expansively about electricity as a phenomenon of matter and life.
Despite the fact that electricity prediction is impacted by various human, political, economic, and social elements, no one has yet integrated the material-social entanglements of electricity into the field of electrical load forecasting study.

Contribution
This research analyzes and models electricity forecasting from a novel perspective in order to bridge the gap identified in the literature review and motivation above. The major contents of this article are separated into three parts: first, we describe how to comprehend electricity forecasting using Deleuze's metaphysical complexity theory; second, we design a neural network model using this theory; and finally, we assess the model. The article's primary contributions are as follows: 1) Based on Deleuze's theory, a new approach to understanding electric demand and load forecasting is offered. According to this line of thought, the electrical load is a quantitative description of the mutual becoming of people and electricity, whereas load forecasting is an attempt to control this constant deterritorialization and reterritorialization process. 2) A neural network model guided by Deleuze's theory is created. In this article, an LSTM-RNN-FNN model is used to predict the load for the next 5 h based on the load values of the last 24 h and the current temperature. The results indicate that the current architecture performs this task very well. 3) We assessed the model. The performance of the networks with different hyperparameters and datasets is also tested as a reference for subsequent researchers when selecting parameters.
The rest of this article is organized as follows: Section 2 introduces Deleuze's concept of assemblage and the understanding of load forecasting from the assemblage perspective, after a critique of three typical perspectives (the separate, embedded, and articulation perspectives). Section 3 proposes the LSTM-RNN-FNN model based on the assemblage perspective, and Section 4 evaluates the model using different hyperparameters and datasets. Section 5 discusses the assessment results and potential future study directions. Section 6 concludes this article.

DELEUZE-STYLE THINKING PARADIGM ABOUT THE ELECTRICAL LOAD
Deleuze was one of the first philosophers to attempt to conceptualize in terms of the new electronic information environment, criticizing cybernetics' systematicity and advocating for "open systems" of operation, code, power, and flow (Boyer, 2015). We suggest a new way of thinking about electricity forecasting based on Deleuze's concept of "assemblage" (Wise, 2013). In this section, we discuss and critique three typical perspectives on understanding electrical grids or, more broadly, human-technology relations, and then present Deleuze's concept of "assemblage" and how this concept should be used to comprehend the grid and load forecasts.

Separate Perspective
It is very common to think of people and technology as separate entities. In this view, humans and technology are assumed to be distinct things that merely interact. People may be surrounded by numerous forms of technology; however, technologies are completely external to individuals and are only seen as tools. This perspective is very prevalent in the study of electricity. Therefore, when the smart grid enters the field of study, in other words, when the grid as a technology and humans begin to converge, those who share this perspective become rather nervous. This perspective has prompted a never-ending debate between technological and social determinism.

Embedded Perspective
The second point of view says that technology and people cannot be isolated from their surroundings and that we should look at the interaction between people and technology in context rather than thinking about a technology in isolation. This research route offers two benefits: first, it is intuitive; second, humans and technology are both constrained. This approach views technological or social determinism as a property of the environment in which people and technology are embedded, rather than as universal (Howard, 2004). This perspective is also prevalent in the study of electricity, and terms such as electric ecology/ecosystem are rooted in it. However, it must be recognized that "embedded" implies that something can be "disembedded"; thus, this approach continues to treat technology and humans as different and independent entities.

Articulation Perspective
The third perspective is based on the concept of articulation, which maintains that distinct pieces may be connected (articulated) or separated to generate completeness and identities. Historically, any articulation is unintentional. Articulation must be created, sustained, modified, and destroyed in specific practices. From this perspective, we may pose the following questions about the grid: how does the grid articulate to specific functions and uses, people, ideas and discourses, and practices? What articulations does the grid itself consist of? Many anthropological studies have used this approach to better understand topics such as relationships between electrical and social power, human daily life (Winther and Wilhite, 2015), culture, and economics (Özden-Schilling, 2015).

Deleuze's Assemblage Concept
Assemblage, as defined by Deleuze, is a concept that deals with contingency and the roles of structure, organization, and change. Assemblage is a process of arranging, structuring, and fitting together, not a static term. An assemblage is also not a random collection of objects. It is a whole that reflects a certain identity and announces the scope of territorialization. An assemblage is some kind of "becoming"¹ that brings the elements together. The assemblage may be a more complicated mode of articulation, yet there are several distinctions between the assemblage and articulation. First, the assemblage is dynamic, stressing the "process," but articulation highlights the complexity of the relationship between the static elements. Second, the assemblage's dynamics imply that rather than merely items, practices, and symbols articulated into a structure, extra aspects that characterize the process (e.g., speed) are brought into the territorialization of the assemblage. Third, an assemblage is territorialization that extracts something from the environment and draws it into the relation with other environments, rather than merely an environment, a clump of space-time, or an articulation of elements. Assemblages also disperse (deterritorialization), with the elements moving into different relations and configurations (reterritorialization). Then, the assemblage is assembled again, and the elements are reterritorialized at the same moment but in a different way.

¹ "Becoming" is one of Deleuze's most important concepts. Deleuze believes that the world is nothing more than a stream of becoming and that all existence is nothing more than a relatively stable moment in the stream of "becoming-life." With the help of the concept of becoming, he rejects the concept of human beings as the basic existence, affirming that all kinds of existences in the world have multiple existence values and meanings. He supports the dynamic view of becoming and vitality theory's multiple perspectives.

Rethinking Load Forecasting From the Assemblage Perspective
Using Deleuze's concept of "assemblage," we may gain a fresh understanding of people's relationship with electricity. Everyone's power use may be regarded as the becoming of an appliance-grid-generator (AGG) assemblage. On the route to inventing this assemblage, people are becoming electricity, and electricity is becoming people. This pair of mutual becoming relationships is quantitatively described as the electrical load. The load at each moment indicates territorialization, and the change of load at each subsequent moment signifies deterritorialization of this moment and reterritorialization of the following moment.
Understanding electrical load forecasting from the assemblage perspective has several advantages. First, this perspective challenges the previous belief that electricity is a separate entity, combining electricity and people. Second, the assemblage perspective emphasizes dynamics and changes, which may allow researchers to make more accurate predictions. Furthermore, the assemblage's continual deterritorialization and reterritorialization means that it is constantly being created, opening up many possibilities for load forecasting research. Researchers can develop more specific models to deal with various situations by emphasizing or removing one or more elements.
However, Deleuze reminds us that focusing solely on the usage of electricity by individuals or organizations prevents such specific assemblages from exhibiting a larger set of functions or principles. These functions or principles are referred to as the "abstract machine" by Deleuze and Guattari (1988). Deleuze (2017) envisions a new abstract machine, the "society of control," in his last articles: "We're moving toward control societies that no longer operate by confining people but through continuous control and instant communication." Electricity has been demonstrated as a control in the work of McDonald (2012) and Von Schnitzler (2013), and as a communication, the change in electrical load precisely describes the flow of control, and the forecast of the electric load should be regarded as our attempt to control this flow. We used to rely on negative feedback regulation of the load, which means that the regulation that keeps the flow under the control of the grid will always be slower than the rate of change of the load. Now (in this research), we rely on the assemblage called neural networks, or more specifically, a multilayered data-perceptron assemblage consisting of a large amount of data and multiple perceptrons.

ARCHITECTURE OF NEURAL NETWORKS
The electrical load reflects the becoming flow of the AGG assemblage, and attempts to predict the electrical load are based on the history of this becoming flow. Therefore, it is necessary to extract the relationship between load values at different points in the time series. In this article, a neural network is developed to predict the grid load for the next 5 h based on the grid load of the previous 24 h and the current temperature. For this purpose, the long short-term memory recurrent neural network (LSTM-RNN) architecture, which is considered well suited to time-series prediction, is adopted.
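As an illustration of the forecasting task described above, the following minimal sketch cuts a load series into samples of 24 past values, the temperature at the current hour, and the 5 following load values. It assumes hourly samples and aligned load and temperature lists; the function name `make_windows` and its parameters are hypothetical and are not taken from the authors' implementation.

```python
def make_windows(load, temperature, n_in=24, n_out=5):
    """Build (past_load, current_temperature, future_load) samples.

    Assumption: `load` and `temperature` are equally long lists sampled
    once per hour. `n_in` past hours are the input window and `n_out`
    future hours are the prediction target.
    """
    if len(load) != len(temperature):
        raise ValueError("load and temperature must have the same length")
    samples = []
    # t is the index of the current hour, i.e. the last hour of the input window.
    for t in range(n_in - 1, len(load) - n_out):
        past = load[t - n_in + 1 : t + 1]       # the last 24 h, including hour t
        future = load[t + 1 : t + 1 + n_out]    # the next 5 h
        samples.append((past, temperature[t], future))
    return samples
```

A series of N hourly values yields N - 24 - 5 + 1 samples with the default window sizes.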

Recurrent Neural Networks
With the development of machine learning technology, many kinds of architecture have been invented and developed in order to adapt to the needs of different types of tasks (Cho et al., 2014;Girshick, 2015;Schwing and Urtasun, 2015;Sindagi and  Sherstinsky, 2020; Kattenborn et al., 2021). Recurrent neural networks (RNNs) are a classical architecture that is often used for tasks involving data sequences, such as speech recognition and sales forecasting.
The characteristics of the RNN and an ordinary neural network are compared in Figures 1A, B. The output of the RNN at each moment depends not only on the input at that moment but also on the system state; the main difference is that the RNN relies on the accumulated state of the whole sequence, whereas an ordinary network does not.
The classical architecture of the RNN is shown in Figure 2. It can be observed that the output value at a given moment is not only controlled by the input value $x_i^t$ at that moment but is also influenced by the output value $S_{i-1}^t$ at the previous moment. During the training process, with the continual update of the weight matrix, the error between the prediction and the target data decreases, which means that the neural network gradually fits the intrinsic connections within the dataset.
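The dependence of each state on both the current input and the previous state can be sketched as follows. This is a minimal illustration of a standard (Elman-type) recurrent step with a tanh activation, which is an assumption here; the function and weight names are hypothetical, and the exact form used in Figure 2 may differ.

```python
import math


def rnn_step(x, s_prev, W_x, W_s, b):
    """One recurrent step: new_state = tanh(W_x x + W_s s_prev + b).

    x and s_prev are lists of floats; W_x and W_s are lists of rows.
    """
    s_new = []
    for j in range(len(b)):
        z = b[j]
        z += sum(W_x[j][k] * x[k] for k in range(len(x)))            # current input
        z += sum(W_s[j][k] * s_prev[k] for k in range(len(s_prev)))  # previous state
        s_new.append(math.tanh(z))
    return s_new


def rnn_forward(xs, W_x, W_s, b):
    """Run the recurrent step over a whole sequence, starting from a zero state."""
    s = [0.0] * len(b)
    states = []
    for x in xs:
        s = rnn_step(x, s, W_x, W_s, b)
        states.append(s)
    return states
```

With a second input of zero, the second state is still nonzero whenever the first state is nonzero, which is exactly the memory effect that distinguishes the RNN from an ordinary feed-forward network.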

Long Short-Term Memory
Even though the RNN solves the problem of how neural networks can learn from time-series data, it still has certain defects. Over time, the RNN does take into account the effect of the previous moment's output. However, in most cases, data from more distant moments have little effect on the present moment; as a result, introducing the effect of moments that are too far in the past may degrade the performance of the network.
In order to overcome the negative effects of points in time that are too distant, the long short-term memory (LSTM) network was developed and is now generally accepted by researchers. LSTMs are a special kind of RNN that can learn long-term dependencies through the addition of a gate mechanism. The LSTM-RNN usually contains forget, input, and output gates and introduces the concept of a cell state. These mechanisms allow the LSTM-RNN to switch between remembering recent information and information from a long time ago, letting the data decide which information to keep and which to forget. The LSTM captures temporal correlation more strongly, allowing the neural network to obtain the relationship between the parameters from previous data, which is very helpful for data prediction. A diagram of the LSTM architecture is provided in Figure 3 (Yu et al., 2019); each LSTM block contains a forget gate that reduces the weight of values at distant time points.
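The gate mechanism can be made concrete with a minimal sketch of one step of a standard LSTM cell. The equations below are the commonly used textbook formulation, not necessarily the exact variant in Figure 3, and the parameter names (`W_f`, `b_f`, and so on) are hypothetical.

```python
import math


def sigmoid(z):
    """Logistic function used by the three gates."""
    return 1.0 / (1.0 + math.exp(-z))


def affine(W, b, v):
    """Compute W v + b for a weight matrix W stored as a list of rows."""
    return [sum(w * a for w, a in zip(row, v)) + bias for row, bias in zip(W, b)]


def lstm_step(x, h_prev, c_prev, p):
    """One LSTM step; returns the new hidden state h and cell state c.

    p is a dict of weights and biases for the forget (f), input (i) and
    output (o) gates and for the candidate cell values (g).
    """
    v = h_prev + x  # concatenate previous hidden state and current input
    f = [sigmoid(z) for z in affine(p["W_f"], p["b_f"], v)]    # how much old memory to keep
    i = [sigmoid(z) for z in affine(p["W_i"], p["b_i"], v)]    # how much new content to write
    o = [sigmoid(z) for z in affine(p["W_o"], p["b_o"], v)]    # how much memory to expose
    g = [math.tanh(z) for z in affine(p["W_g"], p["b_g"], v)]  # candidate new content
    c = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c_prev, i, g)]
    h = [oj * math.tanh(cj) for oj, cj in zip(o, c)]
    return h, c
```

When the forget gate is close to 1 and the input gate is close to 0, the cell state is carried over almost unchanged; when the forget gate is close to 0, the old cell state is discarded. This is how the network can down-weight values from distant time points.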

Architecture
The full architecture of the hybrid deep neural network used in this article is shown in Figure 4. The inputs are the load values of the past few hours and the current temperature, and the outputs are the predicted future load values. The network contains two major parts: the LSTM-RNN and the fully connected neural network (FNN). When preparing the datasets, null values are checked, and the load data are split into training and test datasets, which contain 70% and 30% of the data points in the whole file, respectively. The LSTM-RNN is used to extract the relationship between load values at different points in the time series, and the FNN part is used to perform feature fusion, that is, the fusion of the time-series features with the temperature feature. The output of the LSTM-RNN is combined with the temperature, sent to the FNN layers, and then used to calculate the final predicted values.
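The data preparation step described above can be sketched as follows. The article does not state how the 70%/30% split is drawn; this sketch assumes a chronological split (earliest values for training), which is a common choice for time series. The helper names `has_missing` and `chronological_split` are hypothetical.

```python
import math


def has_missing(values):
    """Return True if any value is None or NaN (the null check)."""
    return any(
        v is None or (isinstance(v, float) and math.isnan(v)) for v in values
    )


def chronological_split(series, train_fraction=0.7):
    """Split a series into training and test parts without shuffling.

    Assumption: the first `train_fraction` of the points is used for
    training and the remainder for testing.
    """
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must be between 0 and 1")
    cut = round(len(series) * train_fraction)
    return series[:cut], series[cut:]
```

Keeping the test set strictly later in time than the training set avoids evaluating the model on periods that overlap with its training windows.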

Loss Function
Frontiers in Energy Research | www.frontiersin.org
In this article, two loss functions are used to train and test the performance of the network: the mean square error (MSE) and the mean absolute percentage error (MAPE). The error measures are defined as follows:

$$\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}\left(\hat{y}_i - y_i\right)^2, \qquad \mathrm{MAPE} = \frac{100\%}{n}\sum_{i=1}^{n}\left|\frac{\hat{y}_i - y_i}{y_i}\right|$$

where $\hat{y}$ is the predicted load value, $y$ is the real load value over the following several hours, and $n$ is the number of predicted values. In order to improve the performance, the Huber loss function is also used as the training loss function, which is defined as follows:

$$L_\delta(a) = \begin{cases} \frac{1}{2}a^2, & |a| \le \delta \\ \delta\left(|a| - \frac{1}{2}\delta\right), & |a| > \delta \end{cases} \qquad a = \hat{y} - y$$

where $\delta$ is a hyperparameter that can be chosen by researchers. This function is quadratic for small values of $a$ and linear for large values, with equal values and slopes of the two sections at the two points where $|\hat{y} - y| = \delta$. Compared with the MSE loss function, the Huber loss function is more robust to outliers and often performs better than the general loss function.
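The three error measures named in this section can be written in a few lines. This is a minimal sketch using the standard textbook definitions of MSE, MAPE, and Huber loss; the function names are illustrative and the default `delta=1.0` is an assumption, since the article does not report the value used.

```python
def mse(y_true, y_pred):
    """Mean square error."""
    return sum((p - t) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent. Requires nonzero y_true."""
    return 100.0 * sum(abs((p - t) / t) for t, p in zip(y_true, y_pred)) / len(y_true)


def huber(y_true, y_pred, delta=1.0):
    """Mean Huber loss: quadratic for |error| <= delta, linear beyond it."""
    total = 0.0
    for t, p in zip(y_true, y_pred):
        a = abs(p - t)
        if a <= delta:
            total += 0.5 * a * a
        else:
            total += delta * (a - 0.5 * delta)
    return total / len(y_true)
```

For an error of 3 with `delta=1.0`, the Huber term is 2.5 whereas the squared error is 9, which shows why a single outlier influences the Huber loss much less than the MSE.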

METRICS
The network is built to predict real grid loads in the real world. In this section, the performance of the network is tested and reported.

Datasets Description
In this article, the electric load dataset of Tetouan city in Morocco and the GEF2017 dataset were used. The Tetouan dataset provides the load data of Tetouan city for the whole year of 2017, and the GEF2017 dataset provides load data from 2003 to 2009. An example of the training datasets is shown in Figure 5.

Test Performance Without Temperature
The typical performance of the current network is shown in Figure 6, where it can be observed that both datasets can be fitted by the current neural network. The average test MAPE is about 0.64% on the Tetouan dataset and 4.32% on the GEF2017 dataset. The difference between the two datasets is probably caused by their lengths. For the larger dataset, it seems more difficult for the LSTM network to capture the intrinsic pattern of the load value.
In order to find networks with better performance, different hyperparameters (the number of hidden layers and the hidden size) were tested on both datasets, and the resulting MAPE is shown in Figure 7. Figure 7 indicates that increasing the hidden size helps to reduce the error, but this effect diminishes once the hidden size exceeds 64, and the training and computation time also grow with the hidden size; as a result, 64 might be the best choice for balancing time consumption with accuracy. As for the hidden layers, networks with only one LSTM layer perform better than the others, which means that more complex networks are less effective. The probable reason is that when the complexity of the network increases, the risk of overfitting increases accordingly, which is particularly serious for large datasets. Nevertheless, in order to preserve the fitting ability of the neural network, we adopted two LSTM layers as the best choice at present.
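To see why training cost grows with the hidden size and the number of layers, it can help to count the trainable LSTM weights. The sketch below uses the standard count of four gate blocks per layer with one bias vector each; some frameworks store two bias vectors per gate, which changes the totals slightly. The input size of 1 (a univariate load series) is an assumption, and the function name is hypothetical.

```python
def lstm_parameter_count(input_size, hidden_size, num_layers):
    """Number of trainable weights in a stacked LSTM.

    Each layer has 4 blocks (forget, input, output gates and candidate),
    each with a weight matrix over [input, hidden] plus one bias vector.
    """
    total = 0
    layer_input = input_size
    for _ in range(num_layers):
        total += 4 * (hidden_size * (layer_input + hidden_size) + hidden_size)
        layer_input = hidden_size  # deeper layers read the previous layer's output
    return total
```

Under these assumptions, one layer of size 64 has 16,896 weights, two layers of size 64 have 49,920, and one layer of size 128 has 66,560, so doubling the hidden size roughly quadruples the cost of a layer.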

Adding Temperature Into Consideration
In order to test the influence of temperature on the load value, the temperature parameter is added to the neural network by combining it with the output of the LSTM part, as shown in Figure 4. However, because the model becomes more complex, the MAPE increases if the number of epochs is not increased. As a result, the number of epochs is also changed in order to obtain higher accuracy. The MAPE for different numbers of epochs on the Tetouan dataset is shown in Figure 8. Figure 8 indicates that after adding the temperature parameter to the network, more epochs are needed to reach the best performance, while too many epochs also increase the risk of overfitting; this causes the increase of MAPE at epoch = 200 in both datasets. In addition, compared with the original network, the forecast performance declines slightly rather than improving after temperature is added, which means that adding temperature has a negative effect on the performance of the networks. This could mean that the relationship between weather and electricity load is too uncertain for the networks to exploit.
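The feature fusion step, in which temperature is combined with the output of the LSTM part, can be sketched as a concatenation followed by fully connected layers. This is an illustrative sketch only: the single hidden layer, the ReLU activation, and all names below are assumptions, since the article does not specify the FNN layout.

```python
def dense(W, b, v, relu=False):
    """Fully connected layer: W v + b, optionally followed by ReLU."""
    out = [sum(w * a for w, a in zip(row, v)) + bias for row, bias in zip(W, b)]
    return [max(0.0, z) for z in out] if relu else out


def fuse_and_predict(lstm_output, temperature, W_hidden, b_hidden, W_out, b_out):
    """Concatenate LSTM features with temperature and map them to forecasts.

    The number of rows in W_out sets the forecast horizon (5 values in
    the article's setting).
    """
    fused = list(lstm_output) + [temperature]  # feature fusion by concatenation
    hidden = dense(W_hidden, b_hidden, fused, relu=True)
    return dense(W_out, b_out, hidden)
```

Because temperature enters as one extra input alongside many LSTM features, the FNN has to learn its weight from data, which is consistent with the observation that more epochs are needed once it is added.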

DISCUSSION
An assemblage takes materials from the environment and arranges them in a unique way. "An assemblage, in this sense, is a veritable invention" (Deleuze and Guattari, 1988). Therefore, a larger assemblage (which must be part of a smart grid) can be invented, consisting of a combination of the AGG assemblage and neural networks. In this assemblage, people's historical electricity usage is becoming load forecasts. This assemblage also explains the performance degradation caused by adding temperature as a variable to the neural network. Temperature is a factor that affects people's electrical behaviors (Valor et al., 2001), so its effect is already reflected in the historical load. Thus, adding temperature may be considered double counting.
In fact, the relationship between the load value and temperature is not stable and can even be chaotic under some conditions. For example, the temperature-load relationship differs between summer and winter, or between weekdays and weekends. A neural network that tries to capture this relationship may sometimes fail to do so, causing a decrease in performance.
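A small numerical illustration of this instability: if load rises with temperature in summer (cooling) and falls with temperature in winter (heating), the correlation within each season can be strong while the pooled correlation is close to zero. The values below are invented toy numbers chosen only to make this point; they are not taken from the Tetouan or GEF2017 datasets, and the seasonal mechanism is an assumption.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Invented toy data (temperature in degrees Celsius, load in arbitrary units).
summer_temp = [24.0, 26.0, 28.0, 30.0, 32.0]
summer_load = [50.0, 54.0, 58.0, 62.0, 66.0]  # load rises with temperature
winter_temp = [0.0, 2.0, 4.0, 6.0, 8.0]
winter_load = [66.0, 62.0, 58.0, 54.0, 50.0]  # load falls with temperature

r_summer = pearson(summer_temp, summer_load)
r_winter = pearson(winter_temp, winter_load)
r_pooled = pearson(summer_temp + winter_temp, summer_load + winter_load)
```

In this toy case the two seasonal correlations have opposite signs and cancel in the pooled data, so a single temperature input carries little usable signal unless the model also knows the season.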
Meanwhile, in the larger assemblage, the load forecasting results are utilized to "optimize" the grid, which in turn affects human behaviors. This mutual becoming constitutes a deconstruction of the time series. The forecast of the future is becoming the present, and it all points to the past. Attempting to control the flow of the control instead causes it to control itself.
Furthermore, we are constantly entangled and constructed by multiple assemblages. Therefore, Deleuze cautions us not to oversimplify the assemblage. The distinction between conventional and renewable energy sources, for example, is a more vivid description of the power-producing assemblage in Howe and Boyer's study (Howe and Boyer, 2015). Similarly, future studies on electric load forecasting can be more specific and accurate by combining different assemblages or by the constant becoming and destruction of the assemblages.

CONCLUSION
Unlike previous research on electric load forecasting, we presented a novel pattern of thinking based on Deleuze's assemblage concept in this article. From the perspective of the assemblage, the electrical load is regarded as a quantitative description of the mutual becoming of people and electricity (i.e., a constant deterritorialization and reterritorialization process), whereas load forecasting is an attempt to control this process. Based on the assemblage perspective, an LSTM-RNN-FNN model is used to predict the load for the next 5 h based on the load values of the last 24 h and the current temperature. The evaluation with multiple hyperparameters and datasets shows that the current architecture performs the forecasting task quite effectively. We believe that the proposed LSTM-RNN-FNN model will aid in electric load forecasting and that the assemblage perspective on which we rely will give new inspiration for future electric load forecasting and potentially smart grid research. By adding or removing elements in the assemblage or constructing new assemblages, future research could delve further into the impact of the smart grid, the natural environment, and human and societal factors on the accuracy of electric load forecasting.

DATA AVAILABILITY STATEMENT
The original contributions presented in the study are included in the article/Supplementary Material; further inquiries can be directed to the corresponding authors.

AUTHOR CONTRIBUTIONS
Conceptualization: JX, ZW, and WN; writing-original draft preparation: JX and ZW; writing-review and editing: JX, ZW, and YD; visualization: ZW; supervision: WN; funding acquisition: WN. All authors have read and agreed to the published version of the manuscript.