Cooperated Online Learning and Optimization Operation of the Low Voltage Distribution System Considering User Electrical Characteristics

With the acceleration of energy reform, photovoltaics, energy storage, electric vehicles, and other new loads have developed rapidly in low-voltage distribution networks. However, distribution networks with distributed generation suffer from imprecise power flow modeling and from difficult coordination between the various energy sources and loads, which makes online optimization of the distribution network load curve challenging. In this study, a multi-time scale online load optimization scheme for the distribution network is proposed using the Bayesian online learning method. The scheme transforms the online power optimization of the distribution network into a Markov decision process. Because different energy sources deliver power at different times, the load is optimized according to the load characteristics of different users. The scheme can track the state of the distribution network in real time and produce the multi-energy output optimization scheme online. Finally, an example is given to verify the effectiveness of the proposed method, which has theoretical significance for promoting the diversified development of low-voltage user-side loads.


INTRODUCTION
With the rapid development of the energy sector, large numbers of photovoltaic (PV) systems, electric vehicles (EVs), and energy storage systems are appearing in the low-voltage distribution system. PV cells act as the main power supply and the energy storage system as the backup power supply: when PV output is strong, surplus electricity is stored, and when PV generation is insufficient, the energy storage discharges to supply loads such as electric vehicles. The energy storage system can alleviate the influence of the randomness and fluctuation of PV generation on the operation of the whole system. PV and energy storage systems are used preferentially to charge EV loads, which avoids the strong impact on the power system caused by connecting EV loads directly to the large power grid. This not only increases the consumption of new energy but also allows the energy storage system to shave peak load and fill valleys, saving the cost of power distribution and capacity expansion. The green operation mode of multi-load coordinated support in a low-voltage distribution system therefore has great development potential. Li et al. (2022) and Xiong et al. (2020) addressed the modeling and stability issues of voltage-source converter-dominated power systems. Cao et al. (2017) proposed a charging station construction scheme that combines PV and energy storage, namely, an integrated PV-storage-charging station, and introduced the main components of the system and the management of its power supply side. Zhou et al. (2016) and Wu et al. (2021) constructed the topology of a single-bus DC microgrid containing distributed PV, hybrid energy storage, and EV loads at the micro level and proposed a hierarchical line-voltage control strategy for coordinating the operation of the microgrid. On the basis of meeting the charging needs of EV users, Lu et al. (2014) took residential charging stations as the research object and established an optimization model that minimizes the peak-valley difference of the load; through a demand response model, the optimized peak-flat-valley electricity price and the corresponding user responsiveness are solved. Zhao et al. (2015) proposed an optimal configuration of a grid-connected PV-storage microgrid considering demand-side response. For a microgrid with distributed generation and energy storage on the supply side and conventional and EV loads on the user side, Yu et al. (2017) proposed a multi-agent-based optimal control model in automatic demand response mode. Zhang et al. (2021) proposed using a hydrogen production system to realize optimal coordinated control of energy. Ren et al. (2018) proposed a multi-time scale coordinated active and reactive power dispatching method for active distribution networks based on model predictive control. However, the aforementioned methods have several problems, such as heavy computation, heavy reliance on prediction data, and difficulty of online control. Therefore, it is necessary to explore the collaborative optimization of PV, storage, and charging in the low-voltage distribution system and to design a reasonable optimized operation and scheduling scheme that meets the needs of the power grid, adapts to dynamic load changes, and achieves the maximum benefit and efficiency.
Studies have shown that learning algorithms can optimize partially observable systems without relying on predicted data, and that a reward feedback mechanism enables online optimization of the system. The low-voltage distribution system is an important part of electricity consumption. Because low-voltage consumers are numerous and widely distributed, the data of low-voltage station areas need to be standardized. User characteristics have been studied extensively. In the low-voltage distribution system, customers have power characteristics similar to those of adjacent customers. Zhao et al. (2020), in work on topology identification of low-voltage station areas, showed that users in the same station area have similar electrical characteristics. Luo et al. (2016) provided effective data support for demand response measures such as peak-time electricity price formulation, staggered peak management, and load regulation. Therefore, from the perspective of power consumption characteristics, this study investigates the coordinated control of new loads in the low-voltage distribution system. Using the online Bayesian learning method under the time-of-use electricity price, an optimization model aiming at peak clipping and valley filling is established. The scheme tracks the distribution network load online in real time and keeps the distribution network optimized under dynamic load changes, thereby realizing dynamic coordinated control of the low-voltage distribution system. The effectiveness of the proposed strategy is demonstrated by applying it to a typical low-voltage station area.

Photovoltaic Module
The generation power of the photovoltaic module at time t is closely related to the illumination intensity. Its output power is expressed in terms of the following quantities: P_scc is the maximum output power of the photovoltaic module, kW; R_AC is the actual irradiance, kW/m²; R_STC is the irradiance under standard test conditions; γ_T is the power temperature coefficient; T_ct is the actual operating temperature of the PV module; and T_stc is the temperature of the PV module under standard test conditions (25°C).
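A widely used relation that is consistent with the quantities listed above scales the rated power linearly with irradiance and applies a linear temperature correction, P_PV(t) = P_scc · (R_AC / R_STC) · [1 + γ_T (T_ct − T_stc)]. The Python sketch below assumes this form. The function name, the default temperature coefficient of -0.004 per °C, and the default standard irradiance of 1 kW/m² are illustrative assumptions, not values taken from this study.

```python
def pv_output_kw(p_scc_kw, r_ac, r_stc=1.0, gamma_t=-0.004,
                 t_ct=25.0, t_stc=25.0):
    """Estimate PV output power in kW (assumed linear irradiance/temperature model).

    p_scc_kw : maximum output power of the module, kW
    r_ac     : actual irradiance, kW/m^2
    r_stc    : irradiance under standard test conditions (assumed 1 kW/m^2)
    gamma_t  : power temperature coefficient per degree C (assumed typical value)
    t_ct     : actual operating temperature of the module, degrees C
    t_stc    : module temperature under standard test conditions, degrees C
    """
    if r_ac <= 0.0:
        # No irradiance means no generation.
        return 0.0
    power = p_scc_kw * (r_ac / r_stc) * (1.0 + gamma_t * (t_ct - t_stc))
    # Output power cannot be negative.
    return max(power, 0.0)
```

Under these assumptions, a 5 kW module at half of the standard irradiance and an operating temperature of 45°C yields roughly 2.3 kW.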

Energy Storage Battery
The state of charge (SOC) of the battery is the ratio of the remaining energy of the battery to its total capacity. The mathematical model of the charge and discharge state involves the following quantities: S_soc(t + 1) and S_soc(t) are the states of charge of the battery at the end of period (t + 1) and period (t), respectively; α is the self-discharge coefficient of the battery; P_c and P_d represent the charging power and discharging power, respectively; Δt represents the time interval; and C_bat represents the battery capacity.
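One common way to combine these quantities is the recursion S_soc(t + 1) = (1 − α) S_soc(t) + (η_c P_c − P_d / η_d) Δt / C_bat. The Python sketch below assumes this form. The efficiencies eta_c and eta_d default to 1.0, and the clipping of the result to the range [0, 1] is an added safeguard; both are assumptions of this sketch rather than details stated in the text.

```python
def soc_next(soc, p_c_kw, p_d_kw, dt_h, c_bat_kwh,
             alpha=0.0, eta_c=1.0, eta_d=1.0):
    """Advance the battery state of charge by one time interval (assumed model).

    soc       : state of charge at the end of period t, between 0 and 1
    p_c_kw    : charging power during the interval, kW
    p_d_kw    : discharging power during the interval, kW
    dt_h      : length of the time interval, hours
    c_bat_kwh : battery capacity, kWh
    alpha     : self-discharge coefficient per interval
    eta_c, eta_d : charging / discharging efficiencies (assumed 1.0 by default)
    """
    energy_in = eta_c * p_c_kw * dt_h    # energy stored while charging
    energy_out = p_d_kw * dt_h / eta_d   # energy drawn while discharging
    new_soc = (1.0 - alpha) * soc + (energy_in - energy_out) / c_bat_kwh
    # Keep the result within physical bounds (assumption of this sketch).
    return min(max(new_soc, 0.0), 1.0)
```

For instance, charging a 10 kWh battery at 2 kW for one hour from a state of charge of 0.5 moves it to about 0.7 when losses are neglected.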

Electric Vehicles
EVs play a very important role in environmental protection and in reducing energy consumption. They can be used not only as a charging load in the microgrid but also as a reserve power source that releases electric energy during peak load to maintain the energy balance and power supply reliability of the grid. Social survey data on family EV travel show that the relationship between the driving distance and time of family EV travel basically follows a lognormal distribution. The corresponding probability density function has two parameters: μ is the mean value, μ = 0.32, and δ is the standard deviation, δ = 0.88. During peak load, EVs can release power according to their range, and the initial state of charge of the power battery is set to the lowest state of charge of the power battery. Assuming that the EV is charged at night, the state of charge of the power battery at the end of the trip is calculated from the quantities below; in the orderly charging and discharging mode, charging is generally scheduled in the off-peak load period. E_i represents the energy consumed per kilometer by the EV, η_i represents the discharge efficiency of each vehicle, C_EV,i represents the power battery capacity of each vehicle, and f_i(x) represents the daily mileage of each vehicle.
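The lognormal density with parameters μ and δ is f(x) = exp(−(ln x − μ)² / (2δ²)) / (x δ √(2π)) for x > 0. The Python sketch below evaluates this density with the parameter values quoted above and then estimates the state of charge after a trip as the initial state of charge minus E_i · d / (η_i · C_EV,i), where d is the distance driven. That state-of-charge expression, the function names, and the default efficiency of 0.9 are assumptions made for illustration, not formulas confirmed by the text.

```python
import math
import random


def lognormal_pdf(x, mu=0.32, delta=0.88):
    """Lognormal probability density of the daily driving distance x."""
    if x <= 0.0:
        return 0.0
    exponent = -((math.log(x) - mu) ** 2) / (2.0 * delta ** 2)
    return math.exp(exponent) / (x * delta * math.sqrt(2.0 * math.pi))


def soc_after_trip(distance_km, e_per_km_kwh, c_ev_kwh,
                   eta=0.9, soc_start=1.0):
    """Estimate the EV state of charge at the end of a trip (assumed form).

    distance_km  : distance driven, km
    e_per_km_kwh : energy consumed per kilometer, kWh/km
    c_ev_kwh     : power battery capacity, kWh
    eta          : discharge efficiency (hypothetical default)
    soc_start    : state of charge before the trip
    """
    used_fraction = e_per_km_kwh * distance_km / (eta * c_ev_kwh)
    return max(soc_start - used_fraction, 0.0)


# Draw one daily mileage sample from the same lognormal distribution.
rng = random.Random(0)
sample_distance = rng.lognormvariate(0.32, 0.88)
sample_soc = soc_after_trip(sample_distance, e_per_km_kwh=0.15, c_ev_kwh=40.0)
```

Sampling mileage this way gives each simulated vehicle its own end-of-trip state of charge, which can then serve as the starting point for a charging schedule.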

LOAD OPTIMIZATION MODEL BASED ON THE ONLINE BAYESIAN LEARNING METHOD

Description of the Project
In a distribution network with multiple new loads, the output scheme of the new loads is optimized with the goal of minimizing the electricity cost of users under the time-of-use price. In addition, the distribution network status can be tracked in real time and the optimization scheme updated when the new loads connected to the system change rapidly over time. Because load output and electricity price have different time characteristics, this study designs a multi-time scale online optimization scheme. The day is divided into several periods according to different time intervals, and the load output is dynamically optimized according to the time-of-use price in each period, so as to realize dynamic cooperative control of the low-voltage distribution system.

Online Bayesian Learning
Online learning is a model training method in machine learning. It adjusts the model quickly, in real time, according to changes in the online data, so that the model reflects those changes promptly and the accuracy of online prediction improves. The goal of online learning is to minimize the overall loss. Online learning does not need the training data set to be determined in advance; the training data arrive one by one during training. Each time a training sample arrives, the model is updated according to the loss function value, objective function value, and gradient generated by that sample. The main flow of online learning is as follows: the prediction results of the model are presented to the user, the feedback data of the user are collected, and these data are used to train the model, forming a closed-loop system, as shown in Figure 1.
In a Bayesian neural network, the Bayesian formula relates the prior probability to the posterior probability. With the prior initialized as a standard normal distribution, the posterior probability changes with the independent variable, and finding it requires integrating over the value space of the independent variable. To simplify the process, Markov chain Monte Carlo (MCMC) sampling is used here to approximate the integral in the denominator, so that the posterior probability given the training set can be calculated and the neural network model obtained.
The Bayesian method leads naturally to a training procedure for online learning: given a prior over the parameters, calculate the posterior probability according to the feedback, take it as the prior probability for the next prediction, and then calculate the posterior probability again according to the new feedback.
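The posterior-becomes-prior loop can be illustrated with a much simpler model than a Bayesian neural network. The Python sketch below swaps in a conjugate Gaussian model, in which an unknown mean is estimated from noisy observations with known noise variance, so that each update has a closed form and no MCMC sampling is needed. The function names and the noise variance are hypothetical; only the standard normal initial prior follows the text.

```python
def gaussian_update(prior_mean, prior_var, observation, noise_var):
    """One Bayesian update of a Gaussian belief about an unknown mean."""
    gain = prior_var / (prior_var + noise_var)
    post_mean = prior_mean + gain * (observation - prior_mean)
    post_var = (1.0 - gain) * prior_var
    return post_mean, post_var


def online_learn(observations, prior_mean=0.0, prior_var=1.0, noise_var=1.0):
    """Process observations one by one; each posterior is the next prior."""
    mean, var = prior_mean, prior_var  # standard normal initial prior
    history = []
    for y in observations:
        mean, var = gaussian_update(mean, var, y, noise_var)
        history.append((mean, var))
    return history


# Feedback samples arriving one at a time (illustrative values).
beliefs = online_learn([2.0, 2.0, 2.0])
```

Each new sample pulls the mean toward the data and shrinks the variance, which is the same mechanism the online scheme relies on to track a changing network state without retraining from scratch.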

Objective Function
When the system operates in islanded mode, an excessively high peak in the daily load curve increases the line load. Given the randomness of PV output and the maximum discharge constraints of the energy storage, the system may be unable to meet the demand of all loads stably and reliably, in which case operation is suspended. When the peak-valley load difference is large, the operating cost of the PV system and the energy storage system in the low-voltage distribution system increases. Therefore, to avoid an excessively high peak value and an excessively large peak-valley difference, this section takes peak clipping and valley filling as the objective function. The optimization horizon is the 24 h of one day, and the decision variable is the change of electricity price in each period. Note that the objective function is calculated from the load curve after collaborative optimization, so it is simplified here. The modules involved in collaborative control are restricted by the relevant constraints. The objective function is expressed in terms of E_T, the load after the time-of-use electricity price response. The first objective function represents the minimum peak value of the system load, and the second objective function represents the minimum peak-valley difference of the intra-day load. The two goals have the same dimension, so this section uses the weight coefficient method to transform them into a single-objective problem. In the transformed objective function, λ_1 and λ_2 are the weight coefficients. Both goals reflect the user's impact on the peak clipping and valley filling of the system, so both weights can be set to 0.5.
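Read literally, the description above suggests a weighted sum of the form F = λ_1 · max_t E_T(t) + λ_2 · (max_t E_T(t) − min_t E_T(t)). The Python sketch below evaluates that expression for a given load curve. The exact form, including the absence of any normalization, is an assumption inferred from the text, and the function name is hypothetical.

```python
def peak_valley_objective(load_curve, lambda_1=0.5, lambda_2=0.5):
    """Weighted sum of peak load and peak-valley difference (assumed form).

    load_curve : load values E_T(t) after the price response, one per period
    lambda_1   : weight on the peak value
    lambda_2   : weight on the peak-valley difference
    """
    if not load_curve:
        raise ValueError("load_curve must contain at least one period")
    peak = max(load_curve)
    valley = min(load_curve)
    return lambda_1 * peak + lambda_2 * (peak - valley)


# Two illustrative curves with the same total energy.
spiky = [2.0, 4.0, 6.0]
flat = [4.0, 4.0, 4.0]
spiky_score = peak_valley_objective(spiky)
flat_score = peak_valley_objective(flat)
```

With equal weights the flatter curve scores lower, which matches the intent of peak clipping and valley filling.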

1) Customer satisfaction with electricity:
The customer satisfaction constraint involves the following quantities: ΔE(t) is the change in electricity consumption at moment t, E_0(t) is the electricity consumption at moment t before implementing the time-of-use electricity price, and A is the minimum satisfaction value of the electricity consumption mode.
2) Continuity constraint of energy storage state of charge: η_c and η_d are the charging and discharging efficiencies of the BESS, respectively.
3) Charge/discharge state constraints: B_dis(i) and B_ch(i) are 0-1 state variables, where 1 represents the state of charge and 0 represents the state of discharge, and they must satisfy the state constraint.
4) Energy storage charge/discharge constraints: During the operation of the BESS, the power of each charge/discharge should not exceed its rated value, and the total discharge power should not exceed the rated power capacity of the energy storage.

5) Limit the total amount of battery charges and discharges
The life of the energy storage battery is mainly affected by the charge and discharge state transition, that is, the charge and discharge times of energy storage. According to relevant studies, energy storage life is closely related to the total amount of charge and discharge in a day. Therefore, in order to reduce the number of charge and discharge times of household energy storage, the total amount of charge and discharge in a day is constrained.
Here, Q_max is the upper limit on the total amount of charge and discharge in a day.
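The storage constraints 3) to 5) can be read as a feasibility check on a candidate charge/discharge schedule. The Python sketch below is one possible interpretation: it assumes that b_ch(i) = 1 marks a charging period, that b_dis(i) = 1 marks a discharging period, that the two cannot both be 1 in the same period, and that the daily limit Q_max applies to the total charged plus discharged energy. These conventions and all names are assumptions of the sketch.

```python
def bess_schedule_feasible(p_ch, p_dis, b_ch, b_dis,
                           p_rated_kw, q_max_kwh, dt_h=1.0):
    """Check a storage schedule against the assumed constraints 3) to 5).

    p_ch, p_dis : charging / discharging power per period, kW
    b_ch, b_dis : 0-1 state flags per period (assumed convention)
    p_rated_kw  : rated charge/discharge power, kW
    q_max_kwh   : daily limit on total charged plus discharged energy, kWh
    dt_h        : length of each period, hours
    """
    total_energy = 0.0
    for pc, pd, bc, bd in zip(p_ch, p_dis, b_ch, b_dis):
        # State constraint: binary flags, never charging and discharging at once.
        if bc not in (0, 1) or bd not in (0, 1) or bc + bd > 1:
            return False
        # Power constraints: nonzero power only in the active state, within rating.
        if not 0.0 <= pc <= bc * p_rated_kw:
            return False
        if not 0.0 <= pd <= bd * p_rated_kw:
            return False
        total_energy += (pc + pd) * dt_h
    # Daily throughput limit that protects battery life.
    return total_energy <= q_max_kwh
```

A check of this kind is usually placed inside the optimizer as hard constraints; as a standalone function it is mainly useful for validating the schedule that the solver returns.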

Parameter Settings
In this study, based on the finding that users in the same low-voltage station area have similar electrical characteristics, users with similar electrical characteristics in one low-voltage station area are selected directly as the typical scenario for this case. Moreover, users under different time-of-use prices have different power consumption characteristics, so their load curves also differ. Here, industrial users are taken as an example. The PV curve of the area is shown in Figure 2, and the time-of-use electricity price table is shown in Table 1. It is assumed that the actual price of electricity before the implementation of the time-of-use electricity price is 0.56 yuan/kWh. Since this study is only for research and analysis, the value of the elasticity coefficient matrix of electricity price is taken directly from Yu et al. (2017). The demand response charging electricity price is set as follows: the user's peak electricity price is 0.35 yuan/kWh and the normal electricity price is 0.85 yuan/kWh. The parameters required by the energy storage system are shown in Table 1.
The genetic algorithm was used to solve the model. The population size was set to 100, the maximum number of generations to 500, the crossover rate to 0.8, and the mutation rate to 0.05. The proportional coefficient was set to 0.25. The model was solved multiple times, and the run with the minimum objective function value was taken as the final result.
The model established in this study is a mixed-integer programming model, which is solved with the commercial solver CPLEX 12.8; MATLAB (Yong, 2016) is used for plotting and data analysis.

Operation Result
The optimization results are shown in Figure 3. The calculation results show that the user's load on multiple time scales changes as the new loads change, and that the load curve and the peak-valley difference change accordingly when no other constraints are binding. Comparing the load curves obtained with online learning and with offline learning shows that, without online learning, the load curve obtained under the objective function changes little, whereas with online learning the load curve is relatively smooth and the effect of peak clipping and valley filling is more obvious. This indicates that the model in this study can coordinate various new loads, achieve a better optimization effect, and effectively improve the economic efficiency of power grid operation.

CONCLUSION
In the new multi-load low-voltage distribution system, the goal of peak clipping and valley filling is realized through the cooperative management and control of user load on multiple time scales. At the same time, online Bayesian learning can track the distribution network in real time, adapt to changes in the new loads, dynamically adjust the load distribution, and realize the collaborative optimization strategy. The aforementioned example verifies the effectiveness of the strategy.

DATA AVAILABILITY STATEMENT
The original contributions presented in the study are included in the article/Supplementary Material; further inquiries can be directed to the corresponding author.