Electrical Characteristics Estimation of Photovoltaic Modules via Cuckoo Search—Relevant Vector Machine Probabilistic Model

This work presents an optimized probabilistic modeling methodology that facilitates the modeling of photovoltaic (PV) modules from measured data over a range of environmental conditions. The method applies cuckoo search to optimize kernel parameters, followed by electrical characteristics estimation via a relevance vector machine. Unlike analytical modeling techniques, the proposed cuckoo search-relevance vector machine (CS-RVM) requires no knowledge of internal PV parameters, while offering more accurate estimation with less computational effort. A comparative study was carried out among the electrical characteristics predicted by a back-propagation neural network (BPNN), a radial basis function neural network (RBFNN), a support vector machine (SVM), Villalva's model, a relevance vector machine (RVM), and the CS-RVM. Experimental results show that the proposed CS-RVM provides the best prediction in most scenarios.


INTRODUCTION
Recently, increasing interest in the application of photovoltaic (PV) generation, together with the related problems of power optimization, environmental impact, and grid stability, has accelerated research in this field (Di Piazza and Vitale, 2013). PV technology is widely applied in areas such as solar still systems (Muthu Manokar et al., 2018a; Pounraj et al., 2018; Praveen Kumar et al., 2018; Kabeel et al., 2019; Balachandran et al., 2020; Sasikumar et al., 2020), self-powered rooms (Karthick et al., 2020), and others.
Since the first PV cell was created, a variety of semiconductor materials have been used to develop PV cells. To address design issues such as the prediction of power generation, the optimal choice of PV modules, and the design of power converters, a general performance estimation tool, known as a PV model, is needed to estimate the electrical characteristics of these cells before installation.
Numerous modeling methods have been proposed to estimate the current-voltage (I-V) characteristics as well as the maximum power of PV modules. Mellit et al. (2013) classified these models into two types: explicit I = f(V) and implicit I = f(I, V) models. The former is basically a simple analytical expression predicting I in terms of V, and usually requires less computational effort. The latter introduces more model parameters (e.g., series resistance R_s and shunt resistance R_p) and applies transcendental functions to express the I-V relations. However, the parameters involved in implicit models vary with the PV material and are normally not provided by PV manufacturers. Although they can be obtained empirically, system designers often find the model difficult to use (Massi Pavan et al., 2014).
Recently, a good number of artificial intelligence algorithms have been investigated for PV modeling. AbdulHadi et al. (2004) first introduced neuro-fuzzy models to predict solar cell short-circuit current and open-circuit voltage, followed by coordinate translation of a measured I-V response. The simulation results matched measured data more accurately than a four-parameter single-diode model, and required an order of magnitude fewer data to train than other neural-network models. Mellit et al. (2013) developed artificial neural network (ANN) models for estimating the power produced. A comparative study showed that the ANN models performed better than polynomial regression, multiple linear regression, and analytical five-parameter single-diode models. In Mellit and Kalogirou (2011) and Bi et al. (2016), the authors used an adaptive neuro-fuzzy inference scheme (ANFIS) in an expert configuration PV power supply system. The results showed that the ANFIS-based modeling method gave a good prediction accuracy of 98% and performed better than its ANN counterpart.
Celik (2011) applied a generalized regression neural network to predict the operating current of PV modules. The operating current was predicted by both the neural network and a five-parameter analytical model. The results showed that the ANN model provides a better prediction of the current than the analytical model.
A radial basis function neural network (RBFNN) based model of a PV module was developed by Bonanno et al. (2012) to improve the accuracy of the estimated output I-V and P-V (power-voltage) curves at different environmental conditions. The computed I-V and P-V characteristics closely match those obtained from the experimental data.
The main objective of this paper is to develop an accurate PV model, which offers the possibility of developing a new expert configuration of PV systems, by implementing a cuckoo search-relevance vector machine (CS-RVM) in Matlab on a PC with a data acquisition system. The advantage of the proposed model is that it requires neither additional model parameters nor complicated calculations. A cuckoo search algorithm is applied to optimize the kernel parameter involved in the relevance vector machine (RVM).
The rest of the paper is organized as follows. In the next section, a general description of RVM is introduced. The proposed CS-RVM model for characterizing the behavior of PV modules, working at arbitrary environmental conditions, is presented in section 3. Section 4 presents the simulation results and discussion. Finally, conclusions are drawn in section 5.

RELEVANCE VECTOR MACHINE (RVM)
Relevance vector machine (RVM) is a probabilistic model representing a Bayesian interpretation of a generalized linear model of identical functional form to support vector machine (Yuan et al., 2007).
As in supervised learning, given a set of measurements t = {t_i}, i = 1, ..., N, at training points x = {x_i}, i = 1, ..., N, the function y(x, w) that needs to be predicted at an arbitrary point x_i can be expressed as Equation (1).
where ε is the additive noise involved in the target t. The function y(x, w) can be formulated as Equation (2).
where w = [w_1, w_2, ..., w_N] is the weight vector and w_0 is a bias. In practice, w_0 is usually incorporated into w, so y can be rewritten as Equation (3).
Here K(x, x_i) represents a kernel function, and the design matrix is obtained by evaluating the kernel function at the training points {x_i}, i = 1, ..., N, namely Equation (4).
In this paper, the Gaussian data-centered basis function is used as the kernel. If the noise is zero-mean Gaussian with variance σ², the likelihood function of the whole sample set is expressed as Equation (5).
Maximum likelihood estimation of w and σ² in Equation (5) often leads to overfitting (Caesarendra et al., 2010). To constrain complexity, Tipping (2001) recommended imposing prior constraints on the weights w. Typically, an explicit zero-mean Gaussian prior probability distribution over the weights w is defined as in Equation (6), where α is an N × 1 vector of hyperparameters describing the inverse variance of each w_i. On the basis of Bayes' rule, the posterior over all unknowns (w, σ², α) can be computed via Equation (7).
In order to evaluate Σ and m (the covariance and mean of the posterior over the weights), α and σ² should be determined by maximizing the second factor of Equation (7), P(α, σ²|t), which is expressed as Equation (9).
Since uniform hyperpriors are assumed, P(α) and P(σ²) are ignored. The problem is then to maximize the following equation (Equation 10).
The values of the hyperparameters can be adjusted iteratively to maximize the weight of the posterior distribution. The predictive distribution over the target t for a new input x* is given in Equation (17).
A new estimate of the target value t for a new input x* is given by Equation (18).
The confidence in the prediction is the sum of the variances associated with the noise process and with the uncertainty of the weight estimates, as given in Equation (19).
In this optimization process, many weights are driven to zero, and the vectors from the training set that are associated with the remaining non-zero weights are called relevance vectors (RVs) (Yuan et al., 2007). The inference procedure of RVM is summarized as pseudocode in Algorithm 1.
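The inference procedure described in this section can be illustrated with a minimal Python sketch. It is not the authors' Matlab implementation: it follows the standard re-estimation rules of Tipping (2001), and the exact kernel form K(x, z) = exp(-||x - z||² / Γ²), the initial values, the pruning threshold `alpha_max`, and all function names are assumptions made for illustration.

```python
import numpy as np

def gaussian_kernel(X, Z, gamma):
    # Assumed kernel form: K(x, z) = exp(-||x - z||^2 / gamma^2).
    d2 = ((X[:, None, :] - Z[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / gamma ** 2)

def design_matrix(X, centers, gamma):
    # The leading column of ones absorbs the bias w_0 into the weight vector.
    return np.hstack([np.ones((X.shape[0], 1)), gaussian_kernel(X, centers, gamma)])

def posterior(Phi, t, alpha, sigma2):
    # Gaussian posterior over the weights: covariance Sigma and mean m.
    Sigma = np.linalg.inv(Phi.T @ Phi / sigma2 + np.diag(alpha))
    m = Sigma @ Phi.T @ t / sigma2
    return Sigma, m

def rvm_fit(X, t, gamma, n_iter=50, alpha_max=1e6):
    N = X.shape[0]
    Phi = design_matrix(X, X, gamma)
    keep = np.arange(Phi.shape[1])          # basis functions still active
    alpha = np.ones(Phi.shape[1])           # assumed initial hyperparameters
    sigma2 = 0.1 * np.var(t) + 1e-6         # assumed initial noise variance
    for _ in range(n_iter):
        Sigma, m = posterior(Phi[:, keep], t, alpha[keep], sigma2)
        # g_i measures how well the data determine weight i (between 0 and 1).
        g = np.clip(1.0 - alpha[keep] * np.diag(Sigma), 0.0, 1.0)
        # Weights with g_i = 0 are set to alpha_max so that they are pruned.
        alpha[keep] = np.where(g > 0, g / (m ** 2 + 1e-12), alpha_max)
        resid = t - Phi[:, keep] @ m
        sigma2 = max(resid @ resid / max(N - g.sum(), 1.0), 1e-8)
        active = keep[alpha[keep] < alpha_max]   # prune weights driven to zero
        if active.size == 0:
            break
        keep = active
    Sigma, m = posterior(Phi[:, keep], t, alpha[keep], sigma2)
    return {"keep": keep, "m": m, "Sigma": Sigma, "sigma2": sigma2,
            "X": X, "gamma": gamma}

def rvm_predict(model, X_new):
    Phi = design_matrix(X_new, model["X"], model["gamma"])[:, model["keep"]]
    mean = Phi @ model["m"]
    # Predictive variance: noise variance plus weight-uncertainty term.
    var = model["sigma2"] + np.einsum("ij,jk,ik->i", Phi, model["Sigma"], Phi)
    return mean, var
```

Calling `rvm_fit(X, t, gamma)` on an N × d array of inputs and a length-N target vector returns the retained basis indices in `keep`; the nonzero entries correspond to relevance vectors (index 0 is the bias column). `rvm_predict` returns the predictive mean and variance for new inputs.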

CS-RVM MODEL DEVELOPMENT
The electrical characteristics of PV modules are directly correlated to environmental factors. The RVM can be considered simply as a nonlinear input-output mapping as seen in Figure 1.
It is able to find the desired relationships among the parameters, namely solar irradiance G, module temperature T, and terminal voltage V by approximating the function I = f (x), where x includes three dimensions: V, G, T.
Since the behavior of an RVM depends on the parameter Γ of the Gaussian kernel, the RVM model can in practice be improved by an optimization algorithm. The root mean square error (RMSE) of the RVM model is a suitable fitness function for the optimization, as expressed by Equation (20).
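As a rough illustration of how the RMSE can serve as a fitness function of the kernel parameter, the following Python sketch wraps a model into a function of Γ. The names `make_fitness`, `fit_predict`, and `mean_predictor` are hypothetical, the placeholder model merely stands in for an RVM, and scoring on a held-out validation set is an assumption, since the text does not state which data the RMSE is computed on.

```python
import numpy as np

def rmse(i_measured, i_predicted):
    """Root mean square error between measured and predicted currents."""
    i_measured = np.asarray(i_measured, dtype=float)
    i_predicted = np.asarray(i_predicted, dtype=float)
    return float(np.sqrt(np.mean((i_measured - i_predicted) ** 2)))

def make_fitness(fit_predict, x_train, i_train, x_val, i_val):
    """Build fitness(gamma): RMSE of a model trained with kernel parameter gamma.

    fit_predict is a hypothetical callable
    (x_train, i_train, x_val, gamma) -> predicted currents at x_val.
    """
    def fitness(gamma):
        i_hat = fit_predict(x_train, i_train, x_val, gamma)
        return rmse(i_val, i_hat)
    return fitness

def mean_predictor(x_train, i_train, x_val, gamma):
    """Placeholder model that ignores gamma; it only stands in for an RVM."""
    return np.full(len(x_val), np.mean(i_train))
```

Any optimizer that minimizes a scalar function can then be applied to the returned `fitness` callable, with each row of `x_train` holding one (V, G, T) operating point.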
where I and Î denote the measured and predicted current values of the PV module, respectively. In this paper, cuckoo search (CS) (Yang and Deb, 2009, 2010) is used to determine the optimum kernel parameter that minimizes the RMSE. CS is an optimization algorithm inspired by the obligate brood parasitic behavior of cuckoo species (Ma et al., 2013). In CS, Lévy flights are used instead of simple random walks (i.e., the random numbers used to generate new candidate solutions follow a Lévy distribution). CS is a population-based algorithm. The ith nest at generation t, x_i^(t), moves to x_i^(t+1) by a Lévy flight, as given in Equation (21).
where α is the step size, which is related to the scale of the problem (it is distinct from the RVM hyperparameter vector α), and the product ⊕ denotes entry-wise multiplication. The consecutive random steps of a Lévy flight follow a power-law step-length distribution with a heavy tail, given by Equation (22).
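A heavy-tailed Lévy step of this kind is often drawn in practice with Mantegna's algorithm. The Python sketch below assumes that approach; the text does not say how the steps were generated, and the exponent `beta = 1.5`, the clipping to bounds, and the function names are illustrative assumptions.

```python
import math
import numpy as np

def levy_step(shape, beta=1.5, rng=None):
    """Draw heavy-tailed steps with Mantegna's algorithm (assumed here)."""
    rng = np.random.default_rng() if rng is None else rng
    sigma_u = (math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
               / (math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2))
               ) ** (1 / beta)
    u = rng.normal(0.0, sigma_u, size=shape)
    v = rng.normal(0.0, 1.0, size=shape)
    return u / np.abs(v) ** (1 / beta)

def levy_move(x, step_size, lower, upper, rng=None):
    """Move nests x by a scaled Levy step and clip them to [lower, upper]."""
    x = np.asarray(x, dtype=float)
    return np.clip(x + step_size * levy_step(x.shape, rng=rng), lower, upper)
```

Most draws are small, which supports local search, while occasional very large steps help the population escape local optima.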
This random walk around the best solution obtained so far speeds up the local search. To prevent the search from being trapped in a local optimum, CS replaces the worst solutions with new ones generated by random walk, with probability P_a. The CS algorithm can be summarized in the following steps.
Step 1 Define the stop criterion and P_a for the CS algorithm;
Step 2 Generate a random population of K host nests by Lévy flights;
Step 3 Calculate the fitness function via Equation (20);
Step 4 Select a nest randomly among the host nests, and replace it with a new solution;
Step 5 Replace a fraction P_a of the worst nests by generating new solutions using Lévy flights;
Step 6 Keep the current optimum nest, and go to Step 2 if the stop criterion is not fulfilled;
Step 7 Find the optimum Γ.
Figure 2 shows the block diagram of the complete optimized RVM PV model, which combines the CS algorithm with the RVM model. The optimized RVM basically includes three procedures: experimental measurements, a tuning phase, and an estimation phase. The goal of tuning is to identify a good value of Γ, the only parameter involved in the Gaussian kernel, which lies in the range [0, 1], so that the RVM model can accurately predict the I-V relations. On the basis of experimental measurements on the applied PV modules, the parameter Γ can be easily tuned via cuckoo search. The hyperparameters of the RVM (α, γ, and σ²) are then determined by using Equations (14), (15), and (16), and the PV hyper-surface in the I-V-G-T working space is produced. The last stage is the estimation of the terminal current, which is done simply by substituting a set of measured data into Equation (18).
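The steps listed above can be sketched in Python as a one-dimensional search for Γ in [0, 1]. This is a simplified illustration and not the authors' implementation: the default values (10 nests, P_a = 0.25, step size 0.05), the use of Mantegna's algorithm for the Lévy steps, the uniform initial population, and the function name `cuckoo_search` are all assumptions.

```python
import math
import numpy as np

def cuckoo_search(fitness, lower=0.0, upper=1.0, n_nests=10, p_a=0.25,
                  step_size=0.05, beta=1.5, n_iter=100, seed=0):
    """Minimize a scalar fitness(gamma) over [lower, upper] (basic sketch)."""
    rng = np.random.default_rng(seed)
    sigma_u = (math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
               / (math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2))
               ) ** (1 / beta)

    def levy(n):
        # Heavy-tailed steps via Mantegna's algorithm (an assumption here).
        u = rng.normal(0.0, sigma_u, n)
        v = rng.normal(0.0, 1.0, n)
        return u / np.abs(v) ** (1 / beta)

    def evaluate(points):
        return np.array([fitness(float(p)) for p in points])

    nests = rng.uniform(lower, upper, n_nests)   # initial host nests
    scores = evaluate(nests)                     # fitness of each nest
    n_drop = min(int(round(p_a * n_nests)), n_nests - 1)
    for _ in range(n_iter):                      # fixed iteration budget as stop criterion
        # New candidate solutions by Levy flights from the current nests.
        cand = np.clip(nests + step_size * levy(n_nests), lower, upper)
        cand_scores = evaluate(cand)
        # Each candidate is compared with a randomly chosen nest and replaces it if better.
        target = rng.permutation(n_nests)
        better = cand_scores < scores[target]
        nests[target[better]] = cand[better]
        scores[target[better]] = cand_scores[better]
        # Abandon a fraction p_a of the worst nests; the best nest is never dropped.
        if n_drop > 0:
            worst = np.argsort(scores)[-n_drop:]
            nests[worst] = np.clip(nests[worst] + step_size * levy(n_drop), lower, upper)
            scores[worst] = evaluate(nests[worst])
    best = int(np.argmin(scores))
    return float(nests[best]), float(scores[best])
```

In the tuning phase, `fitness` would be the RMSE of an RVM trained with the candidate Γ; the function returns the best Γ found together with its fitness value.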

RESULTS AND DISCUSSIONS
To evaluate the performance of the proposed method objectively, the mean absolute percentage error (MAPE) and the coefficient of determination (R²) were applied in addition to the RMSE. These two statistical indicators are expressed mathematically in Equations (23) and (24).
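For reference, the two indicators can be computed as in the Python sketch below. It uses the standard textbook definitions, which are assumed to match Equations (23) and (24); skipping points with zero measured current in the MAPE is an additional assumption made to avoid division by zero.

```python
import numpy as np

def mape(i_measured, i_predicted, eps=1e-12):
    """Mean absolute percentage error in percent.

    Points whose measured current is (numerically) zero are skipped,
    which is an assumption of this sketch.
    """
    i_measured = np.asarray(i_measured, dtype=float)
    i_predicted = np.asarray(i_predicted, dtype=float)
    mask = np.abs(i_measured) > eps
    rel = (i_measured[mask] - i_predicted[mask]) / i_measured[mask]
    return float(100.0 * np.mean(np.abs(rel)))

def r2(i_measured, i_predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    i_measured = np.asarray(i_measured, dtype=float)
    i_predicted = np.asarray(i_predicted, dtype=float)
    ss_res = np.sum((i_measured - i_predicted) ** 2)
    ss_tot = np.sum((i_measured - np.mean(i_measured)) ** 2)
    return float(1.0 - ss_res / ss_tot)
```

For example, measured currents of 100 and 200 with predictions of 110 and 180 give a MAPE of 10%, since both relative errors are 10%.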
The formula for panel electrical efficiency is given in Equation (25).
where P_MAX is the maximum power generated by the PV panel at the solar irradiance G, and a is the area of the solar panel in square meters. Among the above statistical measures, RMSE was used as the objective function in this study. It is a frequently used measure of the differences between the values predicted by a model and the experimental data. MAPE makes use of all observations and has the smallest variability from sample to sample (Swanson et al., 2011). R² indicates how well the data fit a statistical model (an R² of 1 means that the statistical model fits the experimental data perfectly, while an R² of 0 means that the statistical model does not fit the data at all).
Frontiers in Energy Research | www.frontiersin.org
As shown in Figure 3, a Gaobo GSMT-H-3A100 solar module tester was used to measure the I-V experimental data of three different PV modules. These modules include the multicrystalline "KYOCERA-Solar KC200GT" 200 W PV module, the monocrystalline "SUNTECH STP 265S20" 265 W PV module, and the thin-film "TSMC-Solar TS-150C1" 150 W PV module. The module characteristics provided by the manufacturers are listed in Table 1.
As discussed in section 3, the proposed CS-RVM approach is capable of optimizing Γ, the only parameter involved in the kernel function of the RVM model. Figure 4 depicts the median RMSE against the number of iterations over 50 runs of the CS algorithm with 10 host nests. The CS algorithm converges within seven iterations, with an observed steady-state error below 1.0E-4, which is relatively low and confirms the convergence performance.
Once the optimal Γ was determined, the CS-RVM was used to obtain the I-V curves of the modules under various environmental conditions. The experimental data were collected by the solar module tester at solar irradiance levels between 200 and 1,000 W/m², in steps of 200 W/m². The test temperature was varied from 25 to 50 °C in steps of 5 °C, controlled by an additional temperature controller. At each environmental condition, the voltage samples were distributed evenly over each module's valid voltage range. A comparative study was carried out among the I-V curves predicted by the back-propagation neural network (BPNN) (Gao, 2012), radial basis function neural network (RBFNN), support vector machine (SVM) (Shi et al., 2012), Villalva's model (Villalva et al., 2009), RVM, and the proposed CS-RVM model (see Figure 5) with all tested PV experimental data. The BPNN and RBFNN are the most widely applied neural networks. SVM is a supervised learning model with associated learning algorithms that are used for regression and classification analysis. Villalva's model is one of the most successful analytical PV models, balancing simplicity and accuracy, and was therefore used for comparison.
Figure 5 illustrates the graphical output of this analysis. The model outputs Î are plotted against the measured data I as open circles. The perfect fit (I = Î) is indicated by a solid line. Many simulated current values deviate from the perfect-fit line for Villalva's model, indicating large errors in the estimation of the I-V characteristics. The BPNN, RBFNN, and RVM show a better fit, yet several open circles still lie away from the perfect-fit line. It is difficult to distinguish the predicted values of SVM and CS-RVM from the perfect-fit line, suggesting the best estimation performance. The accuracy of these models can be quantified by the RMSE, MAPE, and R² values. The CS-RVM model obtains the minimum RMSE and MAPE. Compared to RVM, the CS-RVM model reduces the MAPE by about 99.66%.
Figure 6 superimposes the experimental I-V curves and the values predicted by CS-RVM and Villalva's model for the "KYOCERA-Solar KC200GT," "SUNTECH STP 265S20," and "TSMC-Solar TS-150C1" PV modules at a temperature of 25 °C and different solar irradiances. As can be seen, the values predicted by CS-RVM are relatively close to the measured ones for the three modules, which are produced by different processes. The accuracy of Villalva's model is slightly lower than that of the CS-RVM model. To assess the performance of the designed CS-RVM model, the RMSE, MAPE, and correlation coefficient between the experimental data and the predicted values are reported in Table 2. The CS-RVM obtained lower RMSE and MAPE values for the "KYOCERA-Solar KC200GT," "SUNTECH STP 265S20," and "TSMC-Solar TS-150C1" PV modules, implying more accurate estimation performance.

CONCLUSIONS
A simple and accurate CS-RVM probabilistic approach has been developed for modeling the electrical characteristics of various PV modules. Unlike PV circuit models, which require complicated calculations and parameters that are not readily available, the proposed methodology estimates the I-V characteristic curves directly from the environmental conditions. The capability of the CS-RVM has been verified with statistical indicators such as RMSE.
It has been demonstrated that the errors of RVM can be efficiently minimized by the CS algorithm, and the CS-RVM model has been shown to be more accurate than traditional explicit and implicit models. In the comparison of results, the CS-RVM achieved a significant improvement over past works, with an RMSE of 1.84 × 10^-5, whereas the other methods have an RMSE of approximately 1.0. We conclude that the proposed CS-RVM is a very efficient method for estimating the electrical characteristics of PV modules. Consequently, the CS-RVM could be applied in the design stage, before PV system installation, to provide a suitable performance estimate of PV modules, so that abundant solar energy can be harvested in the long run.

DATA AVAILABILITY STATEMENT
The original contributions presented in the study are included in the article/supplementary material; further inquiries can be directed to the corresponding author/s.

AUTHOR CONTRIBUTIONS
JB wrote the manuscript. XP and ZB implemented the experiments. MG checked the manuscript. All authors contributed to the article and approved the submitted version.