Accurate Solar Cell Modeling via Genetic Neural Network-Based Meta-Heuristic Algorithms

Accurate solar cell modeling is essential for reliable performance evaluation and prediction, real-time control, and maximum power harvest of photovoltaic (PV) systems. Nevertheless, conventional optimization strategies cannot always achieve satisfactory modeling performance because of the highly nonlinear characteristics of the model. Moreover, inadequate measured output current-voltage (I-V) data make it difficult for conventional meta-heuristic algorithms to obtain a high-quality optimum for solar cell modeling, as no reliable fitness function is available. To address these problems, a novel genetic neural network (GNN)-based parameter estimation strategy for solar cells is proposed. Based on measured I-V data, the GNN first trains the neural network via a genetic algorithm. It then predicts more virtual I-V data, so that a reliable fitness function can be constructed using the extended I-V data. Meta-heuristic algorithms can therefore implement an efficient search based on this reliable fitness function. Finally, two different cell models, i.e., a single diode model (SDM) and a double diode model (DDM), are employed to validate the feasibility of the GNN. Case studies verify that GNN-based meta-heuristic algorithms can efficiently improve modeling reliability and convergence rate compared with meta-heuristic algorithms using only the original measured I-V data.


INTRODUCTION
In recent years, due to rapid fossil fuel depletion (Peng et al., 2020), booming global energy demand (Shangguan et al., 2020a), and a series of severe eco-environmental problems (Yang et al., 2015), the concepts of sustainable development and an environmentally friendly society have received increasingly widespread attention (Shangguan et al., 2020b). Hence, environmental protection (Sun et al., 2019) and energy structure transition (Sun and Yang, 2020) are becoming global development strategies via the application of renewable energies, e.g., solar (Zhang et al., 2019a; Murty and Kumar, 2020) and wind (Liu et al., 2020). Note that solar energy is one of the most efficient alternatives among the available candidates owing to the remarkable advantages of photovoltaic (PV) systems (Gao et al., 2021; Liu et al., 2021), e.g., wide distribution, abundance, and lack of pollution (Huang et al., 2020).
In order to carry out precise performance analysis (Jordehi, 2016), optimal design (Zhang et al., 2019b), and power generation efficiency enhancement (Yang et al., 2021) of PV systems, many solar cell modeling approaches have been devised to investigate their dynamic physical behaviors and output characteristics under various operation conditions (Jordehi, 2016). Generally speaking, two PV cell models are the most widely used, i.e., the single diode model (SDM) (Rodriguez et al., 2017) and the double diode model (DDM) (Abbassi et al., 2018; Qais et al., 2019a). In particular, precise and reliable estimation of their unknown parameters is the first and foremost step for PV cell mathematical modeling. However, these parameters do not remain constant and are tested under standard test conditions (STC) (Xiong et al., 2018). In addition, the values of these parameters also change with degradation and faults over time.
Over the decades, numerous methods have been developed to overcome such obstacles, which can be generally categorized into three types, i.e., analytical methods (Wolf and Benda, 2013; Torabi et al., 2017), deterministic approaches, and meta-heuristic algorithms. Analytical methods utilize the data sheet information provided by manufacturers to undertake mathematical calculations; they are easy to implement but lack stable accuracy, as they mainly depend on a group of selected points on a current-voltage (I-V) curve. Meanwhile, deterministic techniques, including Lambert W-functions (Gao et al., 2016) and iterative curve fitting (Villalva et al., 2009), can obtain more accurate results but easily fall into a local optimum when solving highly multimodal problems. Thus, the limitations of the two aforementioned methods prevent them from maintaining a stable and satisfactory performance on PV cell parameter extraction. In contrast, meta-heuristic algorithms can compensate for these shortcomings, as they display high applicability (Nesmachnow, 2014), strong reliability (Roeva and Fidanova, 2018), and a high computation rate (Pillai and Rajasekar, 2018; Figueroa et al., 2020). Up to now, numerous meta-heuristic algorithms have been utilized for the parameter estimation of PV cells (Yu et al., 2018; Guchhait and Banerjee, 2020; Yang et al., 2020), e.g., genetic algorithm (GA) (Jervase et al., 2001), differential evolution (DE), particle swarm optimization (PSO) (Ye et al., 2009), artificial bee colony (ABC) (Oliva et al., 2014), water cycle algorithm (WCA) (Kler et al., 2017), bacterial foraging algorithm (BFA) (Awadallah, 2016), and imperialist competitive algorithm (ICA) (Fathy and Rezk, 2017), together with many hybrids (Chin et al., 2015; Allam et al., 2016; Nayak et al., 2019).
Besides, as all modeling heavily depends on the volume and accuracy of measured data from a data sheet, it is of great significance to undertake reasonable optimization on data samples rather than only focusing on algorithm improvement. Note that the measured I-V data offered by the manufacturer are always insufficient, which might result in a loss of sample information that finally decreases simulation accuracy. Hence, it is critical to adopt effective data processing methods to enrich data samples before parameter estimation. In the past few decades, the artificial neural network (ANN) (Mittal et al., 2018) has shown great effectiveness in data analysis and prediction. To obtain optimal parameters of the ANN, various methods are employed to train networks, such as the Newton-Raphson method (Soloway and Haley, 1996) and the gradient descent method (Noriega and Wang, 1998). However, these methods essentially belong to gradient-based optimization, which easily results in a low-quality optimum or a complex computation (Song et al., 2007), as their performance highly depends on the neural network structure, the complexity of the cost function, and so on. Compared with gradient-based optimization, evolutionary algorithms, e.g., genetic algorithms (GAs), which have a superior global searching ability and high application flexibility, are more appropriate to train an ANN.
Therefore, this paper develops novel genetic neural network (GNN)-based meta-heuristic algorithms for accurate solar cell modeling, with the following contributions:
• The GNN is utilized to generate more virtual I-V data based on inadequate measured I-V data, such that it can provide a more reliable fitness function with adequate I-V data to meta-heuristic algorithms;
• GNN-based meta-heuristic algorithms can implement an efficient search for PV cell parameter estimation, which can acquire a higher-quality optimum than conventional meta-heuristic algorithms with only inadequate measured I-V data;
• Practical performance is effectively verified via an SDM and a DDM, respectively. Experiment results illustrate that the proposed optimization strategy displays higher optimization accuracy and convergence stability on PV cell modeling.
The rest of this paper is organized as follows: PV cell modeling and objective function are presented in Photovoltaic Cell Modeling and Problem Formulation. The proposed GNN-based meta-heuristic algorithms are elaborated on in Methodologies. Case studies and detailed experimental results are shown in Case Studies. Lastly, conclusions are provided in Conclusion.

PHOTOVOLTAIC CELL MODELING AND PROBLEM FORMULATION
In general, the most commonly utilized equivalent circuit models are SDMs and DDMs. Their mathematical models and corresponding objective functions are introduced in this section.

Mathematical Modeling
Appropriate PV cell modeling is the first step in studying the characteristics of PV cells or in developing a more accurate prediction of PV system operation (Guchhait and Banerjee, 2020). PV cell parameters can then be reliably extracted to depict the output characteristics more accurately for better performance analysis. The most commonly applied equivalent circuit models are the SDM and DDM (Ram et al., 2018).

Single Diode Model
The structure of an SDM is shown in Figure 1, which contains an ideal constant current source $I_{ph}$, a series resistance $R_s$, a shunt resistance $R_{sh}$, and a diode $D$ (Ye et al., 2009; Murty and Kumar, 2020). An SDM is characterized by high simplicity and decent accuracy (Humada et al., 2016).
The output current of an SDM, denoted $I_L$, can be described by (Nayak et al., 2019; Guchhait and Banerjee, 2020)

$$I_L = I_{ph} - I_d - I_{sh} \tag{1}$$

where $I_{sh}$ denotes the current through the shunt resistance $R_{sh}$, and $I_d$ represents the diode current, which can be further calculated by (Jordehi, 2016)

$$I_d = I_{sd}\left[\exp\left(\frac{V_L + I_L R_s}{a V_t}\right) - 1\right] \tag{2}$$

where $I_{sd}$ represents the diode's reverse saturation current; $V_L$ denotes the output voltage; $a$ means the diode's ideality factor; and $V_t$ means the junction thermal voltage, as follows:

$$V_t = \frac{K T}{q} \tag{3}$$

where $T$ denotes the temperature of the PV cell; $K = 1.38 \times 10^{-23}$ J/K means the Boltzmann constant; and $q = 1.6 \times 10^{-19}$ C represents the electron charge. Combining Eqs. 1-3, the output I-V relationship of the SDM is described by

$$I_L = I_{ph} - I_{sd}\left[\exp\left(\frac{V_L + I_L R_s}{a V_t}\right) - 1\right] - \frac{V_L + I_L R_s}{R_{sh}} \tag{4}$$

Hence, five parameters need to be identified for the SDM, i.e., $I_{ph}$, $I_{sd}$, $R_s$, $R_{sh}$, and $a$.
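Because the output current appears on both sides of the SDM I-V relationship, evaluating the model at a given voltage requires a numerical root search. The following is a minimal sketch of one way to do this, assuming a simple bisection on the residual (which decreases monotonically with the current). The function names and the parameter values are hypothetical and purely illustrative; they are not results of this study.

```python
import math

K_BOLTZMANN = 1.38e-23  # J/K, as quoted in the text
Q_ELECTRON = 1.6e-19    # C, as quoted in the text


def thermal_voltage(temp_kelvin):
    """Junction thermal voltage V_t = K*T/q."""
    return K_BOLTZMANN * temp_kelvin / Q_ELECTRON


def sdm_residual(i_l, v_l, params, temp_kelvin):
    """Model current minus I_L; zero when (V_L, I_L) satisfies the SDM."""
    i_ph, i_sd, r_s, r_sh, a = params
    v_d = v_l + i_l * r_s  # voltage across the diode and shunt branch
    v_t = thermal_voltage(temp_kelvin)
    diode = i_sd * (math.exp(v_d / (a * v_t)) - 1.0)
    return i_ph - diode - v_d / r_sh - i_l


def sdm_current(v_l, params, temp_kelvin, tol=1e-12):
    """Solve the implicit SDM equation for I_L by bisection."""
    lo, hi = -1.0, params[0] + 1.0
    # Widen the bracket until the residual changes sign across it.
    while sdm_residual(lo, v_l, params, temp_kelvin) < 0.0:
        lo *= 2.0
    while sdm_residual(hi, v_l, params, temp_kelvin) > 0.0:
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if sdm_residual(mid, v_l, params, temp_kelvin) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Illustrative values only (assumed): I_ph, I_sd, R_s, R_sh, a
PARAMS = (0.76, 3.0e-7, 0.036, 54.0, 1.48)
T_CELL = 33.0 + 273.15  # cell temperature in kelvin

for v in (0.0, 0.3, 0.5):
    print(f"V_L = {v:.1f} V -> I_L = {sdm_current(v, PARAMS, T_CELL):.4f} A")
```

Bisection is chosen here only for robustness and brevity; a Newton iteration or a Lambert W formulation would be equally valid choices.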

Double Diode Model
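As demonstrated in Figure 2, the only difference between an SDM and a DDM is that a DDM has one more diode in parallel, so that recombination losses in the depletion layer are considered in the DDM (Zhang et al., 2019b).
Under such a circumstance, the DDM displays higher accuracy than the SDM, while the increase of unknown parameters also brings an extra computation burden.
Note that the output current of the DDM is described by

$$I_L = I_{ph} - I_{d1} - I_{d2} - I_{sh} \tag{5}$$

where the currents $I_{d1}$ and $I_{d2}$ flowing through diodes $D_1$ and $D_2$ are written as

$$I_{d1} = I_{sd1}\left[\exp\left(\frac{V_L + I_L R_s}{a_1 V_t}\right) - 1\right] \tag{6}$$

$$I_{d2} = I_{sd2}\left[\exp\left(\frac{V_L + I_L R_s}{a_2 V_t}\right) - 1\right] \tag{7}$$

Hence, the output I-V relationship of the DDM can be calculated by

$$I_L = I_{ph} - I_{sd1}\left[\exp\left(\frac{V_L + I_L R_s}{a_1 V_t}\right) - 1\right] - I_{sd2}\left[\exp\left(\frac{V_L + I_L R_s}{a_2 V_t}\right) - 1\right] - \frac{V_L + I_L R_s}{R_{sh}} \tag{8}$$

Note that the variables in Eq. 8 are listed in the nomenclature.
Thus, seven parameters need to be identified for the DDM, i.e., $I_{ph}$, $I_{sd1}$, $I_{sd2}$, $R_s$, $R_{sh}$, $a_1$, and $a_2$.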
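As a quick illustration of how the seven DDM parameters enter the model, the sketch below evaluates the right-hand side of the DDM I-V relationship for a given voltage-current pair and checks that setting the second saturation current to zero recovers the single-diode form. The function name and all numerical values are assumptions made for illustration only.

```python
import math

K_BOLTZMANN = 1.38e-23  # J/K, as quoted in the text
Q_ELECTRON = 1.6e-19    # C, as quoted in the text


def ddm_model_current(v_l, i_l, params, temp_kelvin):
    """Right-hand side of the DDM I-V relationship for a given (V_L, I_L)."""
    i_ph, i_sd1, i_sd2, r_s, r_sh, a1, a2 = params
    v_t = K_BOLTZMANN * temp_kelvin / Q_ELECTRON
    v_d = v_l + i_l * r_s  # voltage across both diodes and the shunt branch
    i_d1 = i_sd1 * (math.exp(v_d / (a1 * v_t)) - 1.0)  # first diode current
    i_d2 = i_sd2 * (math.exp(v_d / (a2 * v_t)) - 1.0)  # second diode current
    return i_ph - i_d1 - i_d2 - v_d / r_sh


T_CELL = 33.0 + 273.15  # cell temperature in kelvin
# Illustrative values only (assumed): I_ph, I_sd1, I_sd2, R_s, R_sh, a1, a2
DDM_PARAMS = (0.76, 2.0e-7, 5.0e-7, 0.036, 54.0, 1.45, 2.0)
SDM_LIKE = (0.76, 2.0e-7, 0.0, 0.036, 54.0, 1.45, 2.0)  # second diode removed

with_second = ddm_model_current(0.5, 0.5, DDM_PARAMS, T_CELL)
without_second = ddm_model_current(0.5, 0.5, SDM_LIKE, T_CELL)
print(with_second, without_second)
```

With a positive diode voltage, the second diode can only draw additional current, so the first printed value is lower than the second.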

Objective Function
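Root mean square error (RMSE) is chosen as the objective function, as follows (Khanna et al., 2015):

$$\mathrm{RMSE}(x) = \sqrt{\frac{1}{N}\sum_{k=1}^{N} f\left(V_L^k, I_L^k, x\right)^2}$$

where $x$ denotes the solution vector; $N$ means the amount of experimental data; and $f$ is the error function, i.e., the difference between the current given by the model and the measured current at the kth data point $(V_L^k, I_L^k)$.
For the SDM, the output I-V relationship gives

$$f_{SDM}(V_L, I_L, x) = I_{ph} - I_{sd}\left[\exp\left(\frac{V_L + I_L R_s}{a V_t}\right) - 1\right] - \frac{V_L + I_L R_s}{R_{sh}} - I_L, \quad x = \{I_{ph}, I_{sd}, R_s, R_{sh}, a\}$$

For the DDM, it gives

$$f_{DDM}(V_L, I_L, x) = I_{ph} - I_{sd1}\left[\exp\left(\frac{V_L + I_L R_s}{a_1 V_t}\right) - 1\right] - I_{sd2}\left[\exp\left(\frac{V_L + I_L R_s}{a_2 V_t}\right) - 1\right] - \frac{V_L + I_L R_s}{R_{sh}} - I_L, \quad x = \{I_{ph}, I_{sd1}, I_{sd2}, R_s, R_{sh}, a_1, a_2\}$$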
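The RMSE objective can be sketched as follows for the SDM, assuming the error at each point is the model current minus the measured current. The data used here are synthetic points generated from the model itself (parametrized by the diode voltage so that no root search is needed); the parameter values and helper names are hypothetical.

```python
import math

V_T = 1.38e-23 * (33.0 + 273.15) / 1.6e-19  # thermal voltage at 33 degC


def sdm_error(v_l, i_l, x):
    """SDM error: model current minus the given current I_L."""
    i_ph, i_sd, r_s, r_sh, a = x
    v_d = v_l + i_l * r_s
    return i_ph - i_sd * (math.exp(v_d / (a * V_T)) - 1.0) - v_d / r_sh - i_l


def rmse(error_fn, data, x):
    """Root mean square of error_fn over all (V_L, I_L) pairs in data."""
    return math.sqrt(sum(error_fn(v, i, x) ** 2 for v, i in data) / len(data))


def synthetic_point(v_d, x):
    """Exact (V_L, I_L) pair on the SDM curve for a chosen diode voltage."""
    i_ph, i_sd, r_s, r_sh, a = x
    i_l = i_ph - i_sd * (math.exp(v_d / (a * V_T)) - 1.0) - v_d / r_sh
    return v_d - i_l * r_s, i_l


# Illustrative values only (assumed): I_ph, I_sd, R_s, R_sh, a
TRUE_X = (0.76, 3.0e-7, 0.036, 54.0, 1.48)
DATA = [synthetic_point(0.05 * k, TRUE_X) for k in range(13)]

# A candidate whose photocurrent is off by 0.01 A shifts every error by 0.01.
PERTURBED_X = (0.77, 3.0e-7, 0.036, 54.0, 1.48)
print(rmse(sdm_error, DATA, TRUE_X), rmse(sdm_error, DATA, PERTURBED_X))
```

A meta-heuristic algorithm would call `rmse` repeatedly with different candidate vectors `x` and keep the one with the smallest value.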

METHODOLOGIES
In this section, the developed GNN and its combination with meta-heuristic algorithms are illustrated.

Genetic Neural Network
The detailed network structure of the GNN is elaborated in this section.

Principle of Artificial Neural Network
ANNs have achieved widespread attention and applications due to their high accuracy and short computational time in predictions (Zhou et al., 2020). In particular, the jth neuron is connected with $k$ inputs ($x_1, x_2, x_3, \ldots, x_k$) and one bias input $b_j$, while its output $y_j$ can be calculated by (Chang, 2011)

$$y_j = F\left(\sum_{i=1}^{k} w_{ij} x_i + b_j\right)$$

where $w_{ij}$ denotes the ith weight of the jth neuron; $x_i$ means the ith input of the neuron; $b_j$ denotes the bias of the jth neuron; $k$ represents the number of inputs; and $F(\cdot)$ represents the transfer function, which acts on the function variable $z$ described by

$$z = \sum_{i=1}^{k} w_{ij} x_i + b_j$$

Weights and biases among each layer can be expressed as

$$W^l = \begin{bmatrix} w^l_{11} & \cdots & w^l_{1n} \\ \vdots & \ddots & \vdots \\ w^l_{m1} & \cdots & w^l_{mn} \end{bmatrix}, \qquad B^l = \begin{bmatrix} b^l_1 & b^l_2 & \cdots & b^l_n \end{bmatrix}^T$$

where $W^l$ and $B^l$ denote the weights and biases between neurons in layer $l$ and layer $(l+1)$, while $m$ and $n$ represent the number of neurons in layer $l$ and layer $(l+1)$, respectively.

To generate more virtual I-V data of the PV cell, an ANN should accomplish a training process on the basis of training data. In general, it attempts to minimize a cost function $J$ via optimizing the network parameters (i.e., weights and biases) as follows:

$$\min_{W, B} \; J(W, B) \tag{16}$$

$$\text{s.t.} \quad W_{lb} \le W \le W_{ub}, \quad B_{lb} \le B \le B_{ub} \tag{17}$$

where $W$ denotes the weight vector; $B$ represents the bias vector; the cost $J$ measures the error between $I_L^h$, the measured output current of the PV cell for the hth training sample, and $\hat{I}_L^h$, the output current generated by the ANN for the hth training sample; $W_{lb}$ and $W_{ub}$ denote the lower and upper bounds of the weights, respectively; and $B_{lb}$ and $B_{ub}$ mean the lower and upper bounds of the biases, respectively.
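The neuron and layer equations above can be sketched as a small feed-forward pass. The logistic sigmoid is used here purely as an assumed transfer function for illustration, and the function names are hypothetical; the weight layout follows the convention that entry (i, j) links neuron i of one layer to neuron j of the next.

```python
import math


def sigmoid(z):
    """Logistic transfer function (an assumed choice for this sketch)."""
    return 1.0 / (1.0 + math.exp(-z))


def layer_forward(inputs, weights, biases, transfer=sigmoid):
    """Outputs of one layer; weights[i][j] links input i to neuron j."""
    return [
        transfer(sum(row[j] * x for row, x in zip(weights, inputs)) + biases[j])
        for j in range(len(biases))
    ]


def network_forward(inputs, layers, transfer=sigmoid):
    """Propagate inputs through a list of (weights, biases) pairs."""
    activations = list(inputs)
    for weights, biases in layers:
        activations = layer_forward(activations, weights, biases, transfer)
    return activations


# Tiny hypothetical network: 2 inputs -> 1 neuron, weighted sum equals zero.
single = layer_forward([1.0, 2.0], [[0.5], [-0.25]], [0.0])

# Two stacked layers with all-zero parameters: every neuron outputs 0.5.
zero_layers = [([[0.0, 0.0]], [0.0, 0.0]), ([[0.0], [0.0]], [0.0])]
stacked = network_forward([0.3], zero_layers)
print(single, stacked)
```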

Artificial Neural Network Training by Genetic Algorithms
A GA mainly contains three critical operators, i.e., selection, crossover, and mutation (Khani et al., 2019). To find optimal parameters for an ANN, a GA can be directly used to handle the training model in Eqs. 16, 17, and the detailed steps are given in Figure 3.
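The three operators can be sketched with a generic real-coded GA that minimizes a bounded cost function. This is a textbook-style illustration under stated assumptions (binary tournament selection, arithmetic crossover, Gaussian mutation, and elitism), not the specific procedure of Figure 3. In the GNN setting, the candidate vector would hold the flattened weights and biases and the cost would be the training error; a simple quadratic stands in for it here.

```python
import random


def genetic_minimize(cost, dim, lower, upper, pop_size=40, generations=200,
                     crossover_rate=0.9, mutation_rate=0.2, sigma=0.1, seed=0):
    """Minimize cost over [lower, upper]^dim with a simple real-coded GA."""
    rng = random.Random(seed)
    pop = [[rng.uniform(lower, upper) for _ in range(dim)]
           for _ in range(pop_size)]
    history = []  # best cost at the start of each generation
    for _ in range(generations):
        elite = min(pop, key=cost)
        history.append(cost(elite))
        next_pop = [elite[:]]  # elitism: the best individual always survives
        while len(next_pop) < pop_size:
            # Selection: two binary tournaments pick the parents.
            p1 = min(rng.sample(pop, 2), key=cost)
            p2 = min(rng.sample(pop, 2), key=cost)
            # Crossover: arithmetic blend of the two parents.
            if rng.random() < crossover_rate:
                alpha = rng.random()
                child = [alpha * a + (1.0 - alpha) * b for a, b in zip(p1, p2)]
            else:
                child = p1[:]
            # Mutation: Gaussian perturbation, then clipping to the bounds.
            for d in range(dim):
                if rng.random() < mutation_rate:
                    child[d] += rng.gauss(0.0, sigma)
                child[d] = min(upper, max(lower, child[d]))
            next_pop.append(child)
        pop = next_pop
    best = min(pop, key=cost)
    history.append(cost(best))
    return best, history


def sphere(vector):
    """Hypothetical stand-in for the ANN training cost."""
    return sum(g * g for g in vector)


best, history = genetic_minimize(sphere, dim=3, lower=-1.0, upper=1.0)
print(best, history[-1])
```

Because the elite individual is copied unchanged into each new generation, the best cost recorded in `history` can never increase.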

Genetic Neural Network-Based Meta-Heuristic Algorithms
The detailed optimization structure of GNN-based meta-heuristic algorithms for PV cell parameter extraction is illustrated in this section.

Genetic Neural Network-Based Modified Fitness Function
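Based on the trained GNN, more virtual I-V data of the PV cell can be generated to modify the fitness function for meta-heuristic algorithms. Since all optimization variables have lower and upper bounds during optimization, the RMSE can be regarded as the fitness function by taking prediction data into account, as follows:

$$\mathrm{RMSE}(x) = \sqrt{\frac{1}{N + N_p}\sum_{k=1}^{N + N_p} f\left(V_L^k, I_L^k, x\right)^2}$$

where $f$ is the error between the model output current and the kth (measured or predicted) data point; $N$ is the amount of measured data; and $N_p$ denotes the amount of prediction data.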
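The modified fitness function amounts to pooling the measured and predicted samples before taking the RMSE. The sketch below illustrates this pooling with a deliberately simple, hypothetical linear error function standing in for the SDM or DDM error, and with 13 measured plus 37 predicted points, matching the 50% case described in the case studies.

```python
import math


def extended_rmse(error_fn, measured, predicted, x):
    """RMSE over the N measured and N_p predicted (V_L, I_L) pairs."""
    samples = list(measured) + list(predicted)
    total = sum(error_fn(v, i, x) ** 2 for v, i in samples)
    return math.sqrt(total / len(samples))


def toy_error(v, i, x):
    """Hypothetical stand-in for the cell model error: slope*v + offset - i."""
    slope, offset = x
    return slope * v + offset - i


# 13 'measured' and 37 'predicted' points on the same toy curve i = 2v + 1.
MEASURED = [(0.04 * k, 2.0 * (0.04 * k) + 1.0) for k in range(13)]
PREDICTED = [(0.01 * k, 2.0 * (0.01 * k) + 1.0) for k in range(37)]

print(extended_rmse(toy_error, MEASURED, PREDICTED, (2.0, 1.0)))
print(extended_rmse(toy_error, MEASURED, PREDICTED, (2.0, 1.5)))
```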

General Execution Procedure
The overall operation framework of GNN-based meta-heuristic algorithms for solar cell modeling mainly consists of three parts, as illustrated in Figure 4. The main differences between the various algorithms lie in the roles of their individuals and in their searching mechanisms for exploration and exploitation. Firstly, measured output I-V data of the PV cell are utilized for GNN training. Secondly, more virtual I-V data are generated by the GNN, so that a more reliable fitness function can be established to guide the algorithm search. Finally, meta-heuristic algorithms implement exploration and exploitation at different stages to find optimal PV cell parameters.
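To make the final search step concrete, the sketch below shows a generic global-best PSO, one of the meta-heuristic algorithms considered in this work, minimizing a bounded fitness function. It is a textbook variant under stated assumptions (the inertia and acceleration coefficients are assumed values), not the implementation used in the experiments; the swarm size and iteration count follow the SDM settings given in the case studies. A toy quadratic stands in for the RMSE fitness.

```python
import random


def pso_minimize(fitness, bounds, swarm_size=30, iterations=300,
                 inertia=0.7, c1=1.5, c2=1.5, seed=0):
    """Global-best PSO over box bounds given as a list of (low, high)."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds]
           for _ in range(swarm_size)]
    vel = [[0.0] * dim for _ in range(swarm_size)]
    pbest = [p[:] for p in pos]
    pbest_val = [fitness(p) for p in pos]
    g = min(range(swarm_size), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]
    for _ in range(iterations):
        for i in range(swarm_size):
            for d, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                # Inertia keeps exploring; the two pulls exploit known optima.
                vel[i][d] = (inertia * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Clamp the new position to the variable bounds.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = fitness(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val


def toy_fitness(x):
    """Hypothetical stand-in for the RMSE fitness, minimum at (0.3, -0.2)."""
    return (x[0] - 0.3) ** 2 + (x[1] + 0.2) ** 2


BOUNDS = [(-1.0, 1.0), (-1.0, 1.0)]
best_x, best_val = pso_minimize(toy_fitness, BOUNDS)
print(best_x, best_val)
```

In the GNN-based setting, only the fitness function changes: it would be evaluated on the measured data extended with the virtual I-V data.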

Parameter Setting of the Genetic Neural Network
The main structure of the GNN is designed to be a five-layer network, including one input layer, three hidden layers with 11 neurons in total (i.e., five neurons in the first hidden layer and three neurons in each of the other two layers), and one output layer with one neuron for one output. Figure 5 shows the convergence of the GA for the cost function of the GNN under different training datasets, together with the framework design of the GNN, which indicates that the cost function value obtained by the GA is very small. In the case of a 50% training dataset, weights and biases of the GNN are as follows:

CASE STUDIES
Seven meta-heuristic algorithms are adopted to achieve accurate modeling of the two PV cell models. In particular, 26 sets of measured I-V data are collected from a 57 mm diameter R.T.C. France solar cell under the environmental condition of G = 1000 W/m² and T = 33°C (where T is the cell temperature). Because the benchmark I-V dataset used for the case studies is only available under these conditions, there is only a single fitted I-V curve. To validate the practical applicability of meta-heuristic algorithms based on inadequate data, six datasets are randomly chosen from the 26 pairs of measured data, comprising 50, 60, 70, 80, 90, and 100% of the measured data. To provide a reliable fitness function to the meta-heuristic algorithms, the total number of data points in each dataset plus its prediction data is set at 50, e.g., 37 pieces of prediction data for a 50% dataset. In addition, each meta-heuristic algorithm is evaluated under two circumstances, that is, without data prediction (i.e., with only the selected measured data) and with data prediction.
Note that the maximum iteration number and population size of all meta-heuristic algorithms under each PV model are designed to be the same. In particular, the maximum iteration number is 300, and all methods are independently operated in 80 runs. Besides, the population size of each algorithm is 30 and 50 for the SDM and DDM, respectively. Table 1 shows the simulation results of the average RMSE acquired via the seven methods under various measured datasets, which demonstrates that the average RMSE obtained by each GNN-based meta-heuristic algorithm is significantly smaller than that with only measured data, especially under 50% inadequate measured data. For instance, the average RMSE obtained by GNN-PSO is 56.25% smaller than that of the original PSO without data prediction under 50% measured data, which validates that GNN-based I-V data prediction can effectively improve optimization accuracy and stability.
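The split between measured and predicted data can be tabulated with a few lines of arithmetic. The sketch below assumes that the number of selected measured points is the stated percentage of 26 rounded to the nearest integer; only the 50% case (13 measured, 37 predicted) is confirmed by the text, and the rounding rule for the other cases is an assumption.

```python
N_MEASURED = 26  # measured I-V pairs in the benchmark dataset
N_TOTAL = 50     # measured plus predicted points used in the fitness function


def split_sizes(percent):
    """Return (selected measured points, predicted points) for a percentage."""
    # Nearest-integer rounding in integer arithmetic (assumed rounding rule).
    n_selected = (N_MEASURED * percent + 50) // 100
    return n_selected, N_TOTAL - n_selected


for percent in (50, 60, 70, 80, 90, 100):
    n_selected, n_predicted = split_sizes(percent)
    print(f"{percent:3d}% dataset: {n_selected} measured + {n_predicted} predicted")
```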

Results of the Single Diode Model
Besides, Figure 6 shows the convergence of the seven approaches without and with the GNN under various datasets. It can be seen that WOA is prone to a low-quality optimum and GWO easily falls into a local optimum at the initial stage under 100% data without the GNN, while GNN-based training can help both of them achieve a more stable convergence and a higher-quality optimum. Besides, most algorithms can hardly achieve stable and efficient convergence when the data are inadequate. In contrast, an increase of training data helps them to gradually find high-quality solutions in a more stable way.
Moreover, boxplots of RMSE for the SDM under 50% data are depicted in Figure 7, which explicitly shows the distribution of the results of the seven different algorithms in 80 runs. Figure 7 clearly indicates that the distribution range and upper/lower bounds of GNN-based meta-heuristic algorithms are smaller than those without a GNN. This verifies that the increase of data based on the GNN can simultaneously enhance convergence stability and searching ability. Besides, outliers of the RMSE of some algorithms, such as WCA and ABC, can also be efficiently reduced by GNN-based data prediction. This indicates that each algorithm can find the global optimum more easily via GNN-based experimental data prediction, upon which optimal values of the unknown parameters can be determined in a more accurate and stable way. Figure 8 depicts the I-V and P-V curves acquired via the best algorithm (i.e., the algorithm that obtains the minimum RMSE) under 50% training data and 100% training data, respectively. It can be observed that the output curves acquired by the data prediction-based GNN-WCA are highly consistent with actual data, which demonstrates its strong performance for PV modeling.

Results of the Double Diode Model
For the DDM, the average RMSE achieved via the seven different algorithms with different measured datasets in 80 runs is illustrated in Table 2, which indicates that the additional prediction data generated by the GNN can effectively improve calculation accuracy and stability. For example, the average RMSE obtained by GNN-BSA is 52.37% lower than that of the BSA without GNN-based data prediction under 50% measured data. This illustrates that a GNN-based meta-heuristic algorithm can simultaneously realize high estimation precision and strong stability, and thus it can output desirable results when both accuracy and reliability are considered in the DDM. Moreover, Figure 9 shows the convergence of all algorithms under different training data without and with a GNN, which illustrates that the convergence speed of GWO is low and the ABC is prone to a local optimum under 50% training data without a GNN. In contrast, the GNN can effectively increase their optimum searching efficiency and quality with a higher convergence stability. Besides, a large number of extended data samples can effectively improve the searching efficiency for the global optimum. Boxplots of the different algorithms under 50% data are depicted in Figure 10, from which one can easily find that all GNN-based meta-heuristic algorithms have a smaller distribution range and upper/lower bounds in comparison to those without the GNN. Each algorithm can thus search for a high-quality solution more easily when experimental data are expanded by GNN-based data prediction, upon which optimal values of the unknown parameters can be determined in a more accurate and stable way. This indicates that the increase of data is able to effectively improve optimization quality and stabilize global searching ability in PV cell parameter estimation. Figure 11 depicts the output curves of the best data prediction-based meta-heuristic algorithm (GNN-WCA) and actual data under 50% and 100% training data, respectively.
The GNN-WCA shows high fitting accuracy with actual data under both training datasets.

Statistical Results and Analysis
Radar charts of the average RMSE achieved via each meta-heuristic algorithm with six groups of data at different scales are provided in Figures 12, 13, which show that the average RMSE acquired via all algorithms with GNN-based data prediction is smaller than that obtained without data prediction at different scales of data, especially under 50% data. In particular, the ranking of the various methods is based on a comprehensive and systematic comparison of their performance in PV cell parameter extraction, e.g., extraction accuracy, convergence speed, and convergence stability. This effectively verifies the outstanding reliability of GNN-based meta-heuristic algorithms for accurate PV cell modeling.

CONCLUSION
This paper develops a novel hybrid of a GNN and advanced meta-heuristic algorithms for accurate modeling of different solar cells, and its main contributions are as follows:
• A GNN is applied to generate more virtual I-V data based on inadequate measured data, in which a GA is adopted to find optimal network parameters of an ANN, which can improve the accuracy of the generated virtual I-V data. As a result, the GNN can significantly enrich the dataset and provide a more reliable fitness function to the meta-heuristic algorithms for PV cell modeling;
• Two widely used PV cell models, i.e., the SDM and DDM, are adopted to verify the practical performance of the proposed strategy. For instance, the average RMSE obtained by the ABC, BSA, GWO, MFO, PSO, WCA, and WOA based on GNN prediction for the SDM is 72.29, 97.04, 79.72, 39.48, 43.74, 86.82, and 100.96% of that without GNN data prediction under 50% measured data, respectively;
• Case studies show that GNN-based meta-heuristic algorithms can comprehensively enhance optimization precision and convergence stability compared with original meta-heuristic algorithms utilizing only the original measured I-V data.
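The ratios quoted above can be converted into relative RMSE reductions by subtracting each from 100%. The short sketch below does this for the SDM figures under 50% measured data, assuming the seven values are listed in the same order as the algorithm names; a negative result would mean that the average RMSE increased slightly with GNN prediction.

```python
# Average SDM RMSE with GNN prediction, as a percentage of the RMSE without
# it (50% measured data); pairing follows the order given in the text.
RATIO_PERCENT = {
    "ABC": 72.29, "BSA": 97.04, "GWO": 79.72, "MFO": 39.48,
    "PSO": 43.74, "WCA": 86.82, "WOA": 100.96,
}

# Relative reduction in percent: 100 - ratio.
reduction = {name: round(100.0 - ratio, 2)
             for name, ratio in RATIO_PERCENT.items()}

for name, value in reduction.items():
    print(f"{name}: {value:+.2f}% reduction in average RMSE")
```

For PSO this gives a reduction of about 56.26%, close to the 56.25% quoted in the case studies; the small gap is plausibly due to rounding.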
Future studies should focus on the improvement and optimization of the structure of the proposed GNN, e.g., data training process and relevant network parameters tuning can be further simplified to reduce the optimization burden. Besides, as all the experiments were carried out in a simulation environment which is different from real operation conditions, experiments combined with hardware platforms under practical working scenarios are imperative for future engineering applications.

DATA AVAILABILITY STATEMENT
The original contributions presented in the study are included in the article/Supplementary Material; further inquiries can be directed to the corresponding author.