Statistical analysis and degradation pathway modeling of photovoltaic minimodules with varied packaging strategies

Degradation pathway models constructed using network structural equation modeling (netSEM) are used to study degradation modes and pathways active in photovoltaic (PV) system variants in exposure conditions of high humidity and temperature. This data-driven modeling technique enables the exploration of simultaneous pairwise and multiple regression relationships between variables in which several degradation modes are active in specific variants and exposure conditions. Durable and degrading variants are identified from the netSEM degradation mechanisms and pathways, along with potential ways to mitigate these pathways. A combination of domain knowledge and netSEM modeling shows that corrosion is the primary cause of the power loss in these glass/backsheet PV minimodules. We show successful implementation of netSEM to elucidate the relationships between variables in PV systems and predict a specific service lifetime. The results from pairwise relationships and multiple regression show consistency. This work presents a greater opportunity to be expanded to other materials systems.

Introduction
With each passing year, the field of photovoltaics (PV) is rapidly expanding. The field's business value of hundreds of billions of dollars and global capacity progressing to terawatts present a great need for creating long-lasting PV modules to minimize the levelized cost of electricity (LCOE) Jäger-Waldau (2022); Masson and Kaizuka (2020); Cole et al. (2017). Minimizing LCOE involves a careful selection of polymers, cell and module designs, and consideration of potential degradation modes that can arise through the interactions of components under the influence of the external environment. Individual material components in PV systems are affected by several environmental stressors (such as heat and moisture) that can lead to degradation events such as corrosion Hihara et al. (2013), cracking, and discoloration, which lead to decreased system performance. Ethylene vinyl acetate (EVA) is the most popular encapsulant in the PV industry. There are emerging polymeric systems that are being designed to overcome the issues with acetic acid formation in EVA in the presence of humidity along with high temperature and/or UV radiation Kempe et al. (2007); de Oliveira et al. (2018). Ultimately, undesirable changes leading to thermal/oxidative/hydrolytic/photo degradation will decrease the overall performance of the system Odegard and Bandyopadhyay (2011); Brebu (2020). Various models have been used to study degradation in PV modules, as evident from prior literature Radouane et al. (2014); Lindig et al. (2018); Escobar and Meeker (2006); Bala Subramaniyan et al. (2018). Most of these degradation models deal with degradation rates and/or isolated degradation modes to interpret power loss. Studying the degradation rate alone is insufficient in identifying the root cause of degradation.
As the models have been constructed for PV modules under specific exposure conditions, they rely on fitting parameters and choosing non-linear terms that would best explain the trend. Such approaches are based on simplified assumptions, do not allow for generalization, and do not correlate to real-world exposure conditions, in which multiple degradation modes occur simultaneously Bruckman et al. (2013b); Nalin Venkat (2021).
Owing to the complex nature of degradation, it is essential to design an elaborate study protocol in which multiple degradation modes of PV variants under exposure can be explored and generalized models can be constructed to gain insights into the overall system performance. In this regard, data-driven modeling techniques are extremely useful in providing valuable insights into degradation behavior Lindig et al. (2018).
Network structural equation modeling (netSEM) is a generalized data-driven approach that allows for a systematic study of linear and non-linear relationships between variables, along with the strength of those relationships, by using a stressor, mechanistic variables, and a response. netSEM was developed based on the foundational concepts of structural equation modeling (SEM) Ullman and Bentler (2012); prior applications of SEM have been demonstrated in psychology, sociology, and the life sciences. netSEM is primarily used to analyze systems that are experiencing degradation of some performance characteristic under exposure to a particular stressor, which is considered as an exogenous variable Bruckman et al. (2013b); Yang et al. (2019); Gok et al. (2019b,a).
There are two principles governing the netSEM analysis: Principle 1 (Markovian model) and Principle 2 (multiple regression model). In the Markovian model, variables are exclusively considered in a pairwise relationship and each pathway is described using a linear or non-linear model as well as statistical metrics (not to be confused with the netSEM model that consists of multiple best model pathways between variables). The multiple regression model, on the other hand, considers the multiple relationships among variables when variables change in a simultaneous fashion Yang et al. (2019). The resulting equations can have multiple linear and non-linear terms among several variables (see Section 3.6).
As there are multiple variants being analyzed in this study, it is also possible to obtain statistical insights into degradation and durability by utilizing confidence intervals (CIs). A CI gives a range of values that is expected to contain the true population value at the stated confidence level Barde and Barde (2012). 95% CIs, which are based on the 1-sample t-test, can determine if a variant is durable or degrading at the end of exposure. 83.4% CIs, which are based on the 2-sample t-test, are indicative of a difference between two means Knol et al. (2011); Nalin Venkat (2021). In order to determine if two samples behave similarly or differently (based on the degree of CI overlap) at the end of exposure, the inference by eye method of Cumming and Finch (2005) has been utilized in this study Nalin Venkat (2021).
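The choice of 83.4% can be motivated with a short calculation. The sketch below is illustrative only and is not taken from the study: it assumes two independent means with equal standard errors and a large-sample (normal) approximation, and the function name overlap_ci_level is hypothetical. Under those assumptions, two intervals that just touch correspond to a two-sample test at α = 0.05 when each interval uses the multiplier z/√2.

```python
from math import sqrt
from statistics import NormalDist


def overlap_ci_level(alpha: float = 0.05) -> float:
    """Confidence level at which 'the two CIs just touch' matches a
    two-sample test at significance level alpha.

    Assumptions (not from the paper): independent samples, equal standard
    errors, normal approximation instead of the t distribution.
    """
    std_normal = NormalDist()
    # Critical value of the two-sided two-sample test: |d| > z * sqrt(2) * SE
    z_test = std_normal.inv_cdf(1 - alpha / 2)
    # Two CIs with multiplier c stop overlapping when |d| > 2 * c * SE,
    # so matching the test requires c = z / sqrt(2).
    z_ci = z_test / sqrt(2)
    # Two-sided coverage of an interval built with multiplier c
    return 2 * std_normal.cdf(z_ci) - 1


level = overlap_ci_level(0.05)
print(f"{level:.3f}")
```

With small samples the t distribution would shift the exact value slightly, so this is a rationale for the level rather than a replacement for the t-based intervals used in the study.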
In this study, we show the statistical analysis using inference by eye and application of netSEM in the context of 4-cell PV modules (referred to as minimodules) in accelerated exposure conditions. The purpose of this study is to compare PV minimodules that differ in packaging strategies. The goal is to analyze which types of PV minimodules undergo substantial power loss at the end of the exposure cycle and also gain insights into active degradation modes Nalin Venkat (2021). From this study, we observe that corrosion is the primary cause of degradation in the glass/backsheet PV minimodules. The results from analyzing pairwise relationships and multiple regression are consistent. To the broader research community, netSEM can be coupled with statistical analysis methods to gain insights into real-world degradation in different materials systems.

Study protocol: Experimental and analytical methods
An extensive study protocol, consisting of fabrication, exposures, evaluation, and analysis, was designed to systematically identify causes of degradation. Sixteen PV minimodules were fabricated and exposed in two types of indoor accelerated conditions. Stepwise electrical evaluation was performed to monitor changes in minimodules using current-voltage (I-V) and Suns-V oc measurements. The obtained dataset was used to compare degradation patterns in PV minimodule variants by statistical analysis and network structural equation models. The various components of the study protocol are detailed in the subsequent sections.

Fabrication of 4-cell PV minimodule variants
Each PV minimodule was fabricated using four multicrystalline monofacial passivated emitter and rear cells (PERC), provided by Canadian Solar Inc. (CSI). The four cells were soldered in series in different fabrication facilities, depending on the manufacturer. The front and rear sides of different 4-cell minimodules are shown in Figure 1. The minimodules differ on the basis of the module architecture and encapsulant material, which we will refer to as PV minimodule variants. There are two manufacturers in this study, named A and B, but the focus will be on the minimodules manufactured by B. Some of the minimodule variants manufactured by B show greater power loss, and this can help us identify the potential cause of degradation.  The two types of module architectures used were double glass (DG) and glass/backsheet (GB). In each DG minimodule, 2.5 mm heat-strengthened front and rear glass were utilized. In each GB minimodule, 3.2 mm tempered front glass was used and the backsheet was KPf, which is composed of polyvinylidene fluoride (PVDF)/polyethylene terephthalate (PET)/fluoropolymer layers.
The encapsulants were ethylene vinyl acetate (EVA) and polyolefin elastomer (POE). In each minimodule, transparent encapsulant was the front encapsulant layer and the rear encapsulant was of the UV-cutoff type. The encapsulant and backsheet materials were supplied by Cybrid Technologies Inc. At the end of fabrication of minimodules, five junction boxes were fixed on each minimodule to enable cell-level measurements and module-level measurements Nalin Venkat (2021). In this work, cell-level measurements were used for statistical and netSEM analysis.
In total, four minimodule variants (DG/GB, EVA/POE) were fabricated in which each variant had two minimodules (4 × 2 = 8 minimodules of all variants). There are two indoor accelerated conditions, namely, modified damp heat (mDH) and modified damp heat with full spectrum light (mDH + FSL). Eight minimodules were exposed in mDH and eight minimodules were exposed in mDH + FSL. In total, sixteen minimodules are considered in this study. The specifications of minimodules are summarized in Table 1. The details of exposure conditions are highlighted in Section 2.2.

Indoor accelerated exposure conditions
Damp heat (DH) is frequently used as a standard accelerated test in which the minimodules are exposed to 85°C at 85% relative humidity. Accelerated tests are frequently used as opposed to outdoor exposure, as it takes about 20-25 years for natural aging to occur and degradation to manifest in PV modules Omazic et al. (2019). DH is also a qualification test (pass/fail) on PV modules; however, it does not provide additional insights into the module service lifetime or long-term durability Koehl et al. (2017); Wohlgemuth and Kempe (2014).
DH exposure can induce hydrolytic degradation in the PET core layer, which is a crucial component in KPf backsheets. At temperatures above the glass transition temperature (T g ) of PET (T g = 85°C), there is an increased mobility of the polymer chain backbone which enhances the rate of hydrolysis and leads to loss of properties in PET. At exposure conditions below T g , the hydrolytic degradation was found to be minor despite humidity as high as 95% Kanuga (2012); Omazic et al. (2019). Hence, taking these concepts into account, the exposure temperature was reduced by 5°C in this study (i.e., the temperature used in the study was 80°C).
The two types of indoor accelerated exposures are modified damp heat (mDH) and modified damp heat with full spectrum light (mDH + FSL). Eight minimodules were exposed in mDH (80°C and 85% relative humidity) and the other eight minimodules were exposed in mDH + FSL (with a light intensity of 420 Wm −2 ), as per Table 1. The environmental chamber used in the study was a Cincinnati Sub-Zero SPHS-100. The full spectrum light was generated using Class C solar simulator high-intensity discharge (HID) lamps from Iwasaki Electric (Eye Lighting).
Both the exposures had a total duration of 2520 h (about 3.5 months) and were divided into five exposure steps of 504 h (equivalent to 21 days) each. This enabled us to perform stepwise evaluation on minimodules (discussed in Section 2.3). While the minimodules in mDH conditions had 504 h at each exposure step, the ones in mDH + FSL had 336 h (14 days) of mDH exposure, subsequently followed by 168 h (7 days) of full spectrum light.
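The exposure schedule described above can be checked with a few lines of arithmetic. This is a minimal sketch; the constant and variable names are chosen here for illustration and do not come from the study.

```python
# Exposure schedule as stated in the text
STEP_HOURS = 504                               # duration of one exposure step
N_STEPS = 5                                    # number of exposure steps
MDH_FSL_SPLIT = {"mDH": 336, "FSL": 168}       # hours per step in mDH + FSL

# Cumulative exposure at each evaluation point (step 0 is the baseline)
cumulative_hours = [STEP_HOURS * k for k in range(N_STEPS + 1)]
total_hours = cumulative_hours[-1]
step_days = STEP_HOURS / 24
split_days = {name: hours / 24 for name, hours in MDH_FSL_SPLIT.items()}

# The two phases of an mDH + FSL step must add up to one full step
assert sum(MDH_FSL_SPLIT.values()) == STEP_HOURS

print(cumulative_hours)        # [0, 504, 1008, 1512, 2016, 2520]
print(total_hours, step_days)  # 2520 21.0
print(split_days)              # {'mDH': 14.0, 'FSL': 7.0}
```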

Stepwise electrical evaluation
At the end of every exposure step, stepwise cell-level electrical measurements (current-voltage (I-V) and Suns-V oc ) were collected for the minimodules in both exposure conditions. A Spi-Sun Simulator 4600SLP was used for taking I-V measurements. The Suns-V oc instrument used in the study was manufactured by Sinton Instruments.
I-V curves are useful in understanding the current and voltage at which the PV modules can be operated at fixed irradiance

Data processing
The ddiv R package (version 0.1.1) Huang et al. (2021) was used to extract the electrical features from I-V and Suns-V oc data. Because of the inconsistent quality of the junction boxes, not all the cell-level measurements could be obtained. To avoid bias in the analysis, missing observations were handled by mean imputation Jakobsen et al. (2017). Mean imputation involves substituting the mean of the cell measurements in place of the missing data point. When the imputed values were found to be identical for two missing cell measurements, normally distributed values based on the mean and standard deviation were instead obtained using the qnorm() function in R Nalin Venkat (2021). Due to the control in manufacturing and consistency of input materials in PERC cell fabrication, we assume that there is no major distinction among as-fabricated solar cells.
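The imputation step can be sketched as follows. This is an illustrative Python version rather than the R code used in the study: NormalDist.inv_cdf plays the role of qnorm(), the function name impute_cells is hypothetical, and the choice of evenly spaced quantile probabilities is an assumption, since the text does not state which probabilities were passed to qnorm().

```python
from statistics import NormalDist, mean, stdev


def impute_cells(values):
    """Fill missing (None) cell measurements for one minimodule at one step.

    One missing value is replaced by the mean of the observed cells.
    Several missing values would all receive the same mean, so they are
    spread over normal quantiles instead (assumed probabilities k/(m+1)).
    """
    observed = [v for v in values if v is not None]
    n_missing = len(values) - len(observed)
    if n_missing == 0:
        return list(values)
    if not observed:
        raise ValueError("no observed measurements to impute from")

    mu = mean(observed)
    sigma = stdev(observed) if len(observed) > 1 else 0.0
    if n_missing == 1 or sigma == 0.0:
        fills = [mu] * n_missing
    else:
        dist = NormalDist(mu, sigma)  # inv_cdf(p) is the analogue of qnorm(p, mu, sigma)
        fills = [dist.inv_cdf((k + 1) / (n_missing + 1)) for k in range(n_missing)]

    fill_iter = iter(fills)
    return [v if v is not None else next(fill_iter) for v in values]


# Hypothetical cell-level power values (W) with missing measurements
one_missing = impute_cells([3.9, None, 4.1, 4.0])
two_missing = impute_cells([3.9, None, 4.1, None])
print(one_missing)
print(two_missing)
```

Spreading the imputed values over symmetric quantiles keeps their average at the observed mean while avoiding duplicated data points.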
Once the values were imputed, the electrical features used in the study were normalized (i.e., cell-level electrical measurements were divided by their respective baseline values at exposure step 0) in order to reduce the noise in the data. The variables were chosen based on domain knowledge, which involves selecting electrical features that track possible degradation modes. A detailed overview of variable selection is included in Section 3.1. After variable selection, the netSEM R package (version 0.7.0) was used to construct models and obtain equations for pairwise relations Huang et al. (2018).
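Normalization to the step-0 baseline amounts to a single division per measurement. The sketch below uses hypothetical power values for one cell; the function name normalize_to_baseline is introduced here for illustration.

```python
def normalize_to_baseline(series):
    """Divide each measurement by the value at exposure step 0."""
    baseline = series[0]
    if baseline == 0:
        raise ValueError("baseline measurement must be non-zero")
    return [value / baseline for value in series]


# Hypothetical maximum power (W) of one cell at exposure steps 0..5
p_mp = [4.00, 3.96, 3.92, 3.88, 3.84, 3.80]
n_p_mp = normalize_to_baseline(p_mp)
print([round(v, 3) for v in n_p_mp])
```

A normalized value of 1 then means no change from baseline, and 0.95 corresponds to a 5% loss.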

Network structural equation modeling (netSEM)
netSEM is a statistical approach to perform pathway network analysis in a system composed of continuous variables Bruckman et al. (2013b). Prior applications of netSEM have been successfully demonstrated in polymer studies Bruckman et al.; Yang et al. (2019). In this work, netSEM is applied to PV minimodules. netSEM allows the incorporation of non-linear relationships between the variables, as opposed to SEM, which allows only linear relationships. Seven functions are available in the netSEM package: simple linear, quadratic, simple quadratic, change point, exponential, logarithmic and non-linearizable exponential (shown in Table 2) Nalin Venkat (2021).

TABLE 2
Columns: Functional form; Equation. First row: Simple linear (SL).

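The idea of selecting a best functional form between two variables can be sketched in a few lines. This is an illustrative Python version, not the netSEM implementation: it covers only the forms that are linear in their coefficients (SL, Quad, SQuad, Exp, Log), omits the change point and non-linearizable exponential forms, and the exact parameterizations shown are assumptions. The function names and the data are hypothetical.

```python
import math


def solve(a, b):
    """Solve a small linear system by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    beta = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * beta[j] for j in range(i + 1, n))
        beta[i] = (m[i][n] - tail) / m[i][i]
    return beta


def adj_r2(design, y):
    """Least-squares fit on the design columns; returns adjusted R^2."""
    n, k = len(y), len(design)  # k includes the intercept column
    xtx = [[sum(p * q for p, q in zip(ci, cj)) for cj in design] for ci in design]
    xty = [sum(p * q for p, q in zip(ci, y)) for ci in design]
    beta = solve(xtx, xty)
    fitted = [sum(beta[j] * design[j][i] for j in range(k)) for i in range(n)]
    y_bar = sum(y) / n
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    return 1 - (ss_res / ss_tot) * (n - 1) / (n - k)


def best_model(x, y):
    """Score each candidate form and return the one with the highest adjusted R^2."""
    ones = [1.0] * len(x)
    forms = {
        "SL": [ones, x],                          # y = b0 + b1*x
        "Quad": [ones, x, [v * v for v in x]],    # y = b0 + b1*x + b2*x^2
        "SQuad": [ones, [v * v for v in x]],      # y = b0 + b1*x^2
        "Exp": [ones, [math.exp(v) for v in x]],  # y = b0 + b1*exp(x)
    }
    if min(x) > 0:
        forms["Log"] = [ones, [math.log(v) for v in x]]  # y = b0 + b1*log(x)
    scores = {name: adj_r2(cols, y) for name, cols in forms.items()}
    return max(scores, key=scores.get), scores


# Hypothetical data: exposure time in decimal years vs. normalized power
x = [0.0, 0.0575, 0.1151, 0.1726, 0.2301, 0.2877]
noise = [0.0005, -0.0005, 0.0005, -0.0005, 0.0005, -0.0005]
y = [1 - 0.7 * xi ** 2 + e for xi, e in zip(x, noise)]

name, scores = best_model(x, y)
print(name, {k: round(v, 4) for k, v in scores.items()})
```

Using adjusted rather than plain R² penalizes the extra coefficient of the quadratic form, so a more flexible form is not selected automatically.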
In this study, the steps involved in obtaining netSEM results are illustrated in Figure 2. Feature selection (commonly referred to as variable selection) is performed by using concepts from domain knowledge; even before collecting the data, knowledge of the variables obtained from various measurement techniques is crucial. This is discussed in Section 3.1. The data are acquired from two electrical measurement techniques, namely, I-V and Suns-V oc , as discussed in Section 2.3. After that, mean imputation is done to handle missing observations (highlighted in Section 2.4). Once the data are processed, the netSEM R package is used to obtain pairwise relationships and multiple regression equations.
In netSEM, the selection of the best models and the statistical significance of relationships can be retrieved using p-values and R 2 adj . Statistical testing is performed in netSEM, and p-values are calculated and compared against the selected significance level (α = 0.05) to test the null hypothesis.
In netSEM, the Markovian model and the multiple regression model are utilized for variable selection and to rank the contribution of variables to the response. The Markovian model Faraway (2004) considers only a pair of variables while the others are kept constant. The multiple regression model utilizes multiple regression to consider the simultaneous impact of variables on each other Bruckman et al. (2013b). Both of the models are used to obtain results (refer to Section 3).

FIGURE 2
Flowchart showing the steps involved in obtaining netSEM results.

Results
This section presents an overview of the netSEM results for the minimodule variants used in this work. Before obtaining netSEM results, variable selection was done (Section 3.1), followed by the construction of 83.4% and 95% confidence intervals using data from the end of the exposure cycle (shown in Section 3.2). The selected stressor (S), mechanistic variables (M i ), and response (R) from the variable selection process were used in building netSEM <Stressor|Mechanism|Response> as well as pairwise <Stressor|Response>, <Stressor|Mechanism| and |Mechanism|Response> models using the Markovian model. This is covered in Section 3.3 and Section 3.4. The factors contributing to power loss were inspected; one of the minimodule variants was chosen as an example for demonstration (Section 3.5). Using the multiple regression model, the <Stressor|Mechanism|Response> models were constructed and a service lifetime prediction comparison was made for a minimodule variant fabricated by two different manufacturers. As an example, one of the <Stressor|Mechanism|Response> models obtained using the multiple regression model is shown in the article. In addition, the power loss due to mechanistic variables is demonstrated (Section 3.6).

Variable selection
Before constructing the netSEM models, variable selection was performed based on domain knowledge of PV module degradation. Exposure time was converted into decimal year and used as the stressor (S); here, decimal year means that the numerical value is not an integer but rather has a decimal value (for instance, 0.2 is a decimal year).
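Converting exposure hours to decimal years is a single division. The sketch below assumes 8760 hours per year (365 days), which is an assumption made here; the text does not state the conversion factor that was used.

```python
HOURS_PER_YEAR = 365 * 24  # assumption: 8760 h per year


def to_decimal_year(hours: float) -> float:
    """Express an exposure duration in hours as a fraction of a year."""
    return hours / HOURS_PER_YEAR


# Cumulative exposure at steps 0..5 (504 h per step, as stated in the text)
dy = [round(to_decimal_year(h), 4) for h in range(0, 2521, 504)]
print(dy)
```

Under this assumption, the full 2520 h exposure corresponds to a stressor value of roughly 0.29 decimal years.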
The mechanistic variables were selected in a way that they can track degradation mechanisms occurring in the minimodules. From prior studies, a decrease in short circuit current (I sc ) from I-V measurements has been attributed to changes in optical transmittance of encapsulants/glass, p-n junction degradation, and/or soiling Ahmad et al. (2019); Luo et al. (2019). Since there is no possibility of soiling/accumulation of dust in environmental chambers and because all the cells from the same batch were made at the same time, it is assumed that the decrease in I sc is most likely related to optical transmission loss. An increase in series resistance (R s ) from I-V measurements is known to be associated with degradation of solder joints, interconnects, junction box connections, emitter/base regions of the cell, and/or cell metallization, and is indicative of increased corrosion van Dyk et al. (2005); van Dyk and ; Meyer and van Dyk (2004). Suns-V oc features provide information about the recombination losses and presence of shunts Kerr et al. (2001); Hossain et al. (2019). For this reason, voltage at maximum power (V mp ) obtained from Suns-V oc has been considered to track both recombination and shunting in minimodules.
In this work, the mechanistic variables and response have been normalized to reduce noise. The normalized mechanistic variables (M i ) are n I sc,IV (short-circuit current), n R s,IV (series resistance), and n V mp,PIV (voltage at maximum power). Maximum power from I-V measurements ( n P mp,IV ) is used as the response (R). The superscript "n" denotes normalized values for the variable and the subscripts, "IV" and "PIV", refer to whether the variable is extracted from I-V or Suns-V oc , respectively.

Confidence intervals at the end of exposure cycle
Confidence intervals of 83.4% and 95% were obtained for minimodule variants by manufacturer B at the end of exposure cycle (i.e., at exposure step 5). The minimodule variants underwent exposure for 2520 h in either mDH or mDH + FSL, marking the end of exposure cycle. For constructing each of the CIs, 8 cell measurements from two minimodules for each variant were used; relative to a single measurement, this reduces the standard error by a factor of √8 and improves the precision of the estimates. Figure 3 shows the confidence intervals at the end of exposure for GB minimodules fabricated by manufacturer B.
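The role of the eight measurements can be written out with the standard t-based interval. The symbols below are introduced here for illustration and are assumptions about notation: x̄ is the sample mean of the normalized variable, s its sample standard deviation, n the number of cell measurements, and t the quantile of the t distribution with n − 1 degrees of freedom.

```latex
\mathrm{SE} = \frac{s}{\sqrt{n}}, \qquad
\mathrm{CI}_{1-\alpha} = \bar{x} \pm t_{1-\alpha/2,\; n-1}\,\frac{s}{\sqrt{n}}, \qquad
n = 8 \;\Rightarrow\; \mathrm{SE} = \frac{s}{\sqrt{8}} \approx 0.354\, s
```

Setting 1 − α to 0.95 or 0.834 gives the two interval types used in this section.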
The CIs are categorized on the basis of architecture (GB/DG) and exposure type (mDH/mDH + FSL) for easier comparison. A normalized value of one means that there is no change. As per inference by eye conditions, overlapping CIs indicate that the variants behave similarly without significant differences Cumming and Finch (2005). As introduced in Section 1, 83.4% CIs help identify if two minimodule variants are similar/different in behavior and 95% CIs are useful in determining if a minimodule variant is durable/degrading.
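The two uses of the intervals can be expressed as simple checks. The sketch below is a simplified reading of the inference by eye rules, not the exact procedure of the study: it treats a variant as degrading when the whole 95% CI of normalized power lies below 1, and treats two variants as similar when their 83.4% CIs overlap. The function names and all interval values are hypothetical.

```python
def intervals_overlap(ci_a, ci_b):
    """True if two (lower, upper) intervals share at least one point."""
    return ci_a[0] <= ci_b[1] and ci_b[0] <= ci_a[1]


def classify_response(ci95, no_change=1.0):
    """Label a normalized-power 95% CI relative to the no-change value of 1."""
    lower, upper = ci95
    return "degrading" if upper < no_change else "durable"


# Hypothetical intervals for normalized maximum power at exposure step 5
gb_eva_95 = (0.930, 0.955)
dg_poe_95 = (0.985, 1.004)
gb_eva_834 = (0.934, 0.951)
gb_poe_834 = (0.940, 0.958)

print(classify_response(gb_eva_95))               # degrading
print(classify_response(dg_poe_95))               # durable
print(intervals_overlap(gb_eva_834, gb_poe_834))  # True
```

For a variable that increases with degradation, such as normalized series resistance, the comparison against 1 would be reversed.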
From Figure 3A with n P mp,IV at the end of exposure, it can be seen that there is no significant impact of encapsulant type (i.e., 83.4% CIs overlap in each of the architecture/exposure type categories). However, between categories, it can be seen that EVA-based minimodules (GB in mDH and mDH + FSL exposures) are significantly different from DG in mDH + FSL exposure. In the case of other minimodule categories, they seem to be similar to each other due to overlapping 83.4% CIs. In the case of 95% CIs, GB minimodules are seen to be experiencing greater power loss. This observation is made based on their interquartile ranges and estimated means. The estimated means for the GB minimodules indicate that the power loss is, on average, about 5%-6%. DG minimodules in mDH exposure seem to be exhibiting different trends with encapsulant type, as the ones with EVA have more power loss than those with the POE type. However, the CIs for DG minimodules in mDH exposure are relatively wider, making it less certain to quantify degradation.
Figure 3B shows an increase in n R s,IV from the baseline normalized value of 1 at the end of the exposure cycle. From 83.4% CIs, there is significant overlap between different encapsulant types in each exposure. However, from 95% CIs, we see that most GB-based minimodules have increased n R s,IV (with the exception of GB with POE in mDH + FSL exposure). Most DG minimodules seem to be experiencing less corrosion in comparison to the GB counterparts (with the exception of DG with EVA in mDH exposure). Please note that the exceptions are highlighted to indicate that there are relatively wider CIs that affect certainty of the results.
We have inspected other mechanistic variables, namely, n I sc,IV and n V mp,PIV . Since these normalized variables vary within a small range of 0.98-1, we think that they do not contribute to power loss as much as n R s,IV does. The results of other mechanistic variables are included in the supplementary information.

<Stressor|Mechanism|Response> modeling of PV minimodule variants using Markovian model
Figure 4 and Figure 5 show the <Stressor|Mechanism|Response> (<S|M|R>) models generated by considering pairwise relationships between variables. These two specific cases were chosen to represent a variant that experiences degradation and another variant that is relatively stable. Each variable is color-coded: stressor (dark blue), mechanistic variables (yellow), and response (purple). The corresponding short-hand descriptions of degradation modes tracked by mechanistic variables are included in the light blue boxes. Each pairwise relationship (referred to as <S|M| and |M|R>) between variables is described by the 'best model' that fits two variables and the corresponding R 2 adj . In netSEM, the best model refers to the functional form between two variables that has the highest R 2 adj . The figures show that both linear and non-linear best models are present. A higher R 2 adj signifies that there is a strong correlation between two variables and the model describes the trend well. In addition, the p-values obtained for the best models from netSEM pairwise relationships were significantly smaller than the significance level of 0.05, indicating that the relationships are statistically significant.
Figure 4 shows the <S|M|R> model for GB with EVA fabricated by manufacturer B in mDH exposure. Between dy and n P mp,IV , the best model is SQuad with an R 2 adj of 0.37. The <S|M| paths connecting dy and n I sc,IV , n R s,IV , and n V mp,PIV show that the R 2 adj (and corresponding best models) are 0.47 (SL), 0.40 (SQuad) and 0.36 (Quad). There is moderate dependence of the response and mechanistic variables on dy. The |M|R> paths connecting n I sc,IV , n R s,IV , and n V mp,PIV to n P mp,IV indicate that the R 2 adj (and corresponding best models) are 0.31 (Log), 0.96 (CP) and 0.2 (SQuad). This shows that the change in n P mp,IV is strongly impacted by n R s,IV , which is explained by a change point (CP) model. n V mp,PIV does not impact n P mp,IV as much, since its R 2 adj is the lowest among the three mechanistic variables.

FIGURE 4
<S|M|R> model constructed using Markovian model for the variant GB with EVA encapsulation made by manufacturer B and exposed in mDH condition. dy is exposure time (stressor), n P mp,IV is maximum power, which is the response. n I sc,IV indicates short-circuit current, n R s,IV indicates series resistance and n V mp,PIV indicates voltage at maximum power (IV means that the measurement is from current-voltage data whereas PIV means it is a Suns-V oc measurement). The blue boxes indicate the degradation mode that the variable tracks: n I sc,IV tracks optical transmission loss, n R s,IV monitors corrosion and n V mp,PIV tracks recombination and shunting.

FIGURE 5
<S|M|R> model constructed using Markovian model for the variant DG with POE encapsulation made by manufacturer B and exposed in mDH + FSL conditions. dy is exposure time (stressor), n P mp,IV is maximum power, which is the response. n I sc,IV indicates short-circuit current, n R s,IV indicates series resistance and n V mp,PIV indicates voltage at maximum power (IV means that the measurement is from current-voltage data whereas PIV means it is a Suns-V oc measurement). The blue boxes indicate the degradation mode that the variable tracks: n I sc,IV tracks optical transmission loss, n R s,IV monitors corrosion and n V mp,PIV tracks recombination and shunting.

Frontiers in Energy
Research 07 frontiersin.org

FIGURE 6
Variation of n P mp,IV with dy for minimodules manufactured by B. The best model equation line and name in text, data points and 83.4% CIs (orange) at the end of exposure cycle are shown.

Figure 5 shows the <Stressor|Mechanism|Response> model for DG with POE by manufacturer B in mDH + FSL exposure. In the <S|R> pathway, the R 2 adj is very low (SQuad: 0.036), which means that there is no significant impact of dy on n P mp,IV . This means that power is not affected by exposure time, indicating stability of the minimodule variant. The <S|M| paths connecting dy and n I sc,IV , n R s,IV , and n V mp,PIV show that the R 2 adj (and best models) are 0.07 (Quad), 0.04 (SQuad) and 0.7 (Quad). In <Stressor|Mechanism|Response> models, we look at the direct pathway (connecting dy and n P mp,IV ) and note the R 2 adj . Then, we look at the remaining paths and see if two variables are strongly/weakly correlated to each other for a particular best model (based on R 2 adj ) and compare with the direct path. n V mp,PIV is highly correlated to dy. Only n R s,IV has a direct impact on n P mp,IV with SL as the best model and R 2 adj of 0.92. The rest of the variables have an R 2 adj value of < 0.03. With the exception of n V mp,PIV , the rest of the mechanistic variables are weakly correlated with dy.
Between these two <S|M|R> models, it is apparent that GB with EVA in mDH shows a stronger correlation in the <S|R> pathway compared to DG with POE in mDH + FSL. There is a strong dependence between n R s,IV and n P mp,IV in both the cases. It is to be noted that the power is affected due to a particular mechanistic variable if the R 2 adj is significant in <S|M| and |M|R>. Keeping this point in mind, GB with EVA in mDH is affected by n R s,IV and hence, experiences substantial power loss. Each of these individual pathways can be studied in further detail to understand how the variables are related to each other.
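The requirement that both the <S|M| and |M|R> pathways be significant can be illustrated with the R 2 adj values reported above. The sketch below is not part of the netSEM package: scoring each mechanistic variable by the smaller of its two R 2 adj values (a "weakest link" score) is a heuristic introduced here for illustration only. For DG with POE, the two |M|R> values reported only as < 0.03 are entered as the upper bound 0.03.

```python
# (R2adj of <S|M|, R2adj of |M|R>) for each mechanistic variable, from the text
r2_adj = {
    "GB/EVA/mDH": {
        "nIsc": (0.47, 0.31),
        "nRs": (0.40, 0.96),
        "nVmp": (0.36, 0.20),
    },
    "DG/POE/mDH+FSL": {
        "nIsc": (0.07, 0.03),  # |M|R> reported as < 0.03; upper bound used
        "nRs": (0.04, 0.92),
        "nVmp": (0.70, 0.03),  # |M|R> reported as < 0.03; upper bound used
    },
}


def weakest_link(paths):
    """Heuristic score (an assumption): a mechanism only drives time-dependent
    power loss if BOTH of its pathways are strong, so keep the smaller R2adj."""
    return {mech: min(s_m, m_r) for mech, (s_m, m_r) in paths.items()}


for variant, paths in r2_adj.items():
    scores = weakest_link(paths)
    top = max(scores, key=scores.get)
    print(variant, scores, "strongest:", top)
```

Under this heuristic, series resistance scores highest for GB with EVA in mDH, while every mechanism scores near zero for DG with POE in mDH + FSL because the strong |M|R> path for series resistance is not matched by a strong <S|M| path.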

<Stressor|Response>, <Stressor|Mechanism| and |Mechanism|Response> models using Markovian model
The pairwise relations showing best fitting models from <S|M|R> models in Figure 4 and Figure 5 are mathematical equations. The different variants and exposure types are shown in the form of a facet plot wherein individual panels represent a particular subset of data (divided in terms of module architecture, encapsulant, and exposure type). In each of the facet plot grids, there are data points, along with the best model equation line and name, 83.4% CI at the end of exposure cycle and the corresponding estimated mean. Using facet plots, we can further investigate <S|R>, <S|M| and |M|R> in greater detail and gain a stronger quantitative perspective beyond <S|M|R> models. Figure 6 shows the <S|R> best model equation line, data points, 83.4% CIs at the last exposure step, and the name of the best model that fits the data in the best possible manner in text. It can be observed that the GB minimodules experience a power loss of about 5%-6% on average, as highlighted in Section 3.2 (each minimodule can generate a power of about 16 W). The best model equation also shows a considerable drop in power for GB minimodule variants in both the exposure types. Most DG minimodule variants seem stable, the exception being DG with EVA in mDH exposure. DG with EVA in mDH + FSL is the most stable as there is no best model equation that exists between n P mp,IV and dy due to R 2 adj being less than 0.01. In the netSEM package, any pathway with R 2 adj less than 0.01 does not have a best model.

Power loss due to mechanisms: GB with EVA by manufacturer B in mDH. Each of the lines corresponds to the power loss caused by each mechanistic variable compared to the total power loss. The equations were obtained using Markovian model.

The causes of power loss in GB minimodules can be better understood by investigating <S|M| and |M|R> results. Considering corrosion as the mechanism, tracked by n R s,IV , it can be seen that there is a substantial increase in n R s,IV with increasing dy in all the GB variants from Figure 7A. It can also be observed that there is high scatter in the data points; this is because, before normalization, series resistance values are small and prone to variations Nalin Venkat (2021). With increasing n R s,IV , there is a strong decrease in n P mp,IV , as shown in Figure 7B. For n R s,IV to cause time-dependent power loss, there needs to be a significant relationship between dy and n R s,IV as well as n R s,IV and n P mp,IV , which is highlighted in Section 3.3. Even though DG minimodules have a strong |M|R> trend (where the mechanistic variable is n R s,IV ), the <S|M| trend is not as strong compared to the GB variants. From Figure 7 and results from other mechanistic variables (included in the supplementary information), power loss in GB minimodules is seen to be driven primarily by corrosion.

Power loss due to mechanisms using Markovian model
In the netSEM package, it is possible to obtain mathematical equations of the best model fit between two variables (bestModel), as well as other statistical measures such as adjusted R² and p-values, using the netSEMp1() function. From this section onwards (excluding the <S|M|R> model generated using multiple regression), we use the inverse of n R s,IV , called n C s,IV (series conductance), to ensure that the range is 0-1 (instead of 1-∞, as in the case of n R s,IV ). Using n C s,IV also makes it convenient to compare mechanisms that potentially cause power loss. We have primarily used n R s,IV , as it is the variable that has been used for tracking corrosion in prior literature; having an understanding of original mechanistic variables will aid in understanding degradation.
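The change of variable from series resistance to series conductance is a simple reciprocal. The sketch below uses a hypothetical helper name and made-up values, and shows how normalized resistances of 1 and above map into the interval from 0 to 1.

```python
def series_conductance(n_rs):
    """Return the normalized series conductance, 1 / n_rs."""
    if n_rs <= 0:
        raise ValueError("normalized series resistance must be positive")
    return 1.0 / n_rs


# Made-up normalized series resistances (1.0 is the unexposed baseline).
n_rs_values = [1.0, 1.25, 2.0, 4.0]
n_cs_values = [series_conductance(v) for v in n_rs_values]
print(n_cs_values)
```

Because a rising n R s,IV becomes a falling n C s,IV , the conductance decreases over exposure time like the other normalized variables, which is what makes the mechanisms directly comparable.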
Considering the minimodule variant GB with EVA in mDH exposure fabricated by manufacturer B, we get the following set of equations by substituting <S|M| in |M|R>; the list of equations is included in Eq. 1. Note that the mechanistic variables include both n I sc,IV and n V mp,PIV , along with n C s,IV .
Here, n P tot mp,IV refers to the total power whereas n P M i mp,IV refers to power loss due to individual mechanistic variables (M i is the normalized mechanistic variable and the subscripts IV and PIV have been dropped for convenience).
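Substituting <S|M| into |M|R> amounts to composing two fitted functions. The sketch below illustrates only that idea: the functional forms and coefficients are hypothetical placeholders and are not the fitted equations of Eq. 1.

```python
import math


def compose(stressor_to_mechanism, mechanism_to_response):
    """Build <S|M|R> by feeding the <S|M| output into |M|R>."""
    return lambda dy: mechanism_to_response(stressor_to_mechanism(dy))


# Hypothetical <S|M| fit: series conductance decays with exposure time.
def conductance_vs_time(dy):
    return math.exp(-0.5 * dy)


# Hypothetical |M|R> fit: power is linear in series conductance.
def power_vs_conductance(n_cs):
    return 0.8 + 0.2 * n_cs


power_via_corrosion = compose(conductance_vs_time, power_vs_conductance)

for dy in (0.0, 0.1, 0.2, 0.3):
    print(dy, round(power_via_corrosion(dy), 4))
```

Repeating the composition for each mechanistic variable gives one power-versus-time curve per mechanism, which can then be compared with the direct <S|R> curve.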

FIGURE 9
<S|M|R> model of GB with EVA by manufacturer B exposed in mDH conditions generated using multiple regression model. dy is exposure time (stressor), n P mp,IV is maximum power, which is the response. n I sc,IV indicates short-circuit current, n R s,IV indicates series resistance and n V mp,PIV indicates voltage at maximum power (IV means that the measurement is from current-voltage data whereas PIV means it is a Suns-V oc measurement). The blue boxes indicate the degradation mode that the variable tracks: n I sc,IV tracks optical transmission loss, n R s,IV monitors corrosion and n V mp,PIV tracks recombination and shunting.

FIGURE 10
Service lifetime prediction plot comparing GB with EVA in mDH manufactured by A and B.

Figure 8 shows the power loss due to individual mechanistic variables (i.e., <S|M i |R>) and the total power (i.e., <S|R>). The curvature of <S| n C s,IV |R> matches that of <S|R> in the exposure time range of 0-0.3 decimal year. Neither <S| n I sc,IV |R> nor <S| n V mp,PIV |R> reproduces the curvature of <S|R>: the decrease of <S| n I sc,IV |R> is linear, and <S| n V mp,PIV |R> has a curvature that stabilizes after about 0.2 decimal year.

Multiple regression results
FIGURE 11
Contributions of mechanistic variables compared against total power loss with year (dy) for GB with EVA in mDH by manufacturer B using multiple regression model.

In the previous sections, we explored the trends between variables in a pairwise manner while keeping the rest of the variables constant. This approach, however, does not capture the complexity of degradation. In the real world, PV module degradation is a phenomenon in which multiple stressors and degradation modes act simultaneously. The multiple regression model has the ability to perform multiple regressions by considering several predictors. Each variable is regressed on the remaining variables except the response, and using stepAIC(), the most parsimonious model equation is selected on the basis of Principle 1 best models Huang et al. (2018). For example, this means that dy and M i simultaneously impact n P mp,IV . Figure 9 shows the <S|M|R> model obtained. For the variant GB with EVA by manufacturer B exposed in mDH conditions, n P mp,IV is a function of dy, n I sc,IV , and n R s,IV . There is no direct relationship between n P mp,IV and n V mp,PIV . The equation with n P mp,IV as the dependent variable, obtained using the netSEM function netSEMp2(), is given by Eq. 2.
$$ nP_{mp,IV} = 1.26 + 0.11\,dy^{2} + 1.68\log\left(nI_{sc,IV}\right) - 0.26\,nR_{s,IV} - 0.11\,\left(nR_{s,IV} - 1\right)_{c} \qquad (2) $$
The adjusted R² of the model is 0.97, which is much higher than that of the pairwise relationships from the Markovian model. The subscript "c" indicates a change-point/segmented term. A change-point/segmented term is simply the breaking point between two linear equations of differing slopes. Furthermore, we can use multiple regression to predict how response and mechanistic variables change over time.
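As a worked check of Eq. 2, the sketch below evaluates the fitted equation at a few points. Two readings are assumptions rather than statements from the text: the logarithm is taken as the natural logarithm, and the change-point term is read as a hinge, max(0, n R s,IV − 1), that is active only above the change point of 1. The function name is hypothetical.

```python
import math


def predicted_power(dy, n_isc, n_rs, change_point=1.0):
    """Evaluate Eq. 2 for normalized inputs.

    Assumptions: natural logarithm, and the subscript-c term treated
    as the hinge max(0, n_rs - change_point).
    """
    hinge = max(0.0, n_rs - change_point)
    return (1.26
            + 0.11 * dy ** 2
            + 1.68 * math.log(n_isc)
            - 0.26 * n_rs
            - 0.11 * hinge)


# Unexposed baseline: all normalized variables equal 1 at dy = 0.
print(round(predicted_power(0.0, 1.0, 1.0), 6))

# Made-up later state: small current loss, 50% rise in series resistance.
print(round(predicted_power(0.3, 0.99, 1.5), 4))
```

At the baseline the constant 1.26 and the resistance term −0.26 cancel to give a normalized power of 1, which is consistent with the normalization used throughout, whatever the base of the logarithm.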
The importance of the multiple regression model lies in its capability for service lifetime prediction (SLP) by considering the influence of multiple mechanistic variables and the stressor. Using multiple regression, we obtain equations including n P mp,IV as a function of M i and dy, as well as equations for each M i as a function of the remaining M i and dy. Most often, these multivariable equations are implicit, and we use Newton/Broyden's method to find numerical solutions, determining how n P mp,IV and the M i change over dy. In this part, we have performed the SLP for a single variant fabricated by two different manufacturers: A and B. Figure 10 shows the service lifetime prediction plot in which the minimodule variant, GB with EVA, in mDH is compared on the basis of manufacturer (A versus B). The variant by manufacturer B undergoes greater power loss than that by manufacturer A. In addition, Figure 11 shows that the highest contribution is from n C s,IV , as it closely follows the n P mp,IV plot.
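The numerical step can be sketched as follows, assuming two mechanistic variables whose equations depend on each other. The sketch uses plain Newton iteration with a finite-difference Jacobian (not Broyden's update). The coupled equations in `residuals` are hypothetical placeholders, not the fitted netSEM equations. The power equation is Eq. 2, with the change-point term read as a hinge and the logarithm as natural, both of which are assumptions.

```python
import math


def residuals(m, dy):
    """Hypothetical implicit equations for (n_isc, n_rs) at time dy."""
    n_isc, n_rs = m
    r1 = n_isc - (1.0 - 0.05 * dy - 0.02 * math.log(n_rs))
    r2 = n_rs - (1.0 + 0.8 * dy - 0.5 * (n_isc - 1.0))
    return [r1, r2]


def newton_2d(func, x0, dy, tol=1e-10, max_iter=50, h=1e-7):
    """Newton iteration for two unknowns with a forward-difference Jacobian."""
    x = list(x0)
    for _ in range(max_iter):
        f = func(x, dy)
        if max(abs(v) for v in f) < tol:
            return x
        jac = [[0.0, 0.0], [0.0, 0.0]]
        for j in range(2):
            xp = list(x)
            xp[j] += h
            fp = func(xp, dy)
            for i in range(2):
                jac[i][j] = (fp[i] - f[i]) / h
        det = jac[0][0] * jac[1][1] - jac[0][1] * jac[1][0]
        if abs(det) < 1e-14:
            raise ZeroDivisionError("singular Jacobian")
        # Solve jac * step = -f with Cramer's rule.
        step0 = (-f[0] * jac[1][1] + f[1] * jac[0][1]) / det
        step1 = (-f[1] * jac[0][0] + f[0] * jac[1][0]) / det
        x[0] += step0
        x[1] += step1
    raise RuntimeError("Newton iteration did not converge")


def power(dy, n_isc, n_rs):
    """Eq. 2, with the change-point term assumed to be max(0, n_rs - 1)."""
    return (1.26 + 0.11 * dy ** 2 + 1.68 * math.log(n_isc)
            - 0.26 * n_rs - 0.11 * max(0.0, n_rs - 1.0))


dy_grid = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5]
trajectory = []
guess = [1.0, 1.0]
for dy in dy_grid:
    guess = newton_2d(residuals, guess, dy)  # warm start from previous step
    trajectory.append((dy, guess[0], guess[1], power(dy, guess[0], guess[1])))

for row in trajectory:
    print(" ".join(f"{value:.4f}" for value in row))
```

Starting each solve from the previous time step's solution keeps the iteration close to the root, which matters more as the number of coupled mechanistic variables grows.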
Eq. 3 provides the general form of multiple regression for degradation in GB with EVA in mDH fabricated by both manufacturers A and B. Table 3 shows the corresponding change points and coefficient values. By comparing each coefficient in Eq. 3 between the minimodules by the two manufacturers, as listed in Table 3, we can quantify differences in the degradation behavior. The γ dy,1 , γ dy,2 , and δ V mp of minimodules by manufacturer B are all zero. This indicates that the major degradation mechanisms for manufacturer B are optical loss and corrosion, corresponding to nonzero δ I sc and δ R s , respectively. The degradation of minimodules by manufacturer A is more complicated, as there is additional recombination loss and other unknown time-dependent losses. Moreover, the change points indicate different degradation patterns in the same minimodule variant fabricated by manufacturers A and B.
$$ nP_{mp,IV} = \eta + \gamma_{dy,1}\,dy + \gamma_{dy,2}\,dy^{2} + \delta_{I_{sc}}\,\left(nI_{sc,IV}\right) + \cdots \qquad (3) $$

Discussion
This section highlights the advantages of netSEM as a generalized data-driven analysis tool and its importance in developing a study protocol. Observations from this study are also compared with those of prior literature. In addition, the Markovian model and multiple regression model results are compared for concurrence, and the role of corrosion in minimodule degradation is briefly discussed.

Comparison of this work with other degradation models
With an extensive study protocol, statistical analysis using CIs coupled with data-driven netSEM modeling gives insights into the degradation behavior of minimodule variants. This is a novel approach to thoroughly explore the pairwise relationships between variables using the Markovian model as well as the simultaneous impact of multiple variables.
Studies of PV module degradation in prior literature have assumed a degradation rate model that is defined based on specific module or exposure conditions. For instance, in the study by Theristis et al. (2022), degradation rates for fielded modules were defined on the basis of nameplate ratings and from post-stabilization flash tests. In other cases, degradation rate models have been proposed for specific degradation mechanisms or processes in which model parameters are variable Lindig et al. (2018). In either of these cases, the proposed degradation rate models are applicable only to particular cases and are restricted by their assumptions. PV reliability models have been constructed to understand the trend of power degradation; it has been commonly observed that power loss is non-linear. To capture the non-linearity of power loss, many regression-based models have been proposed; however, they are only applicable to specific sample types and external conditions, and cannot be generalized Kaaya et al. (2021); Lindig et al. (2018).
netSEM overcomes issues that are present in these contemporary models. No inherent assumptions are made in netSEM data-driven modeling, allowing it to be applied to any system with a well-defined stressor, mechanistic variables, and response framework. Another advantage of netSEM is that we can compare parallel and/or competing degradation pathways as well as their impact on power loss, which is yet to be demonstrated by traditional PV modeling techniques. Because netSEM is coupled with statistical analysis using CIs, the results are not just observations but are statistically significant at the 5% level.
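The link between 83.4% CIs and the 5% significance level is not spelled out here. A common rationale, sketched below under the assumptions of normally distributed means, independent groups, and equal standard errors, is that two 83.4% CIs stop overlapping at almost exactly the separation at which a two-sample test becomes significant at 5%. The helper name is hypothetical.

```python
import math
from statistics import NormalDist


def z_for_ci(level):
    """Two-sided standard normal critical value for a given CI level."""
    return NormalDist().inv_cdf(0.5 + level / 2.0)


z_834 = z_for_ci(0.834)
z_95 = z_for_ci(0.95)

# Separation of two means, in units of the common standard error,
# at which their 83.4% CIs just touch.
gap_touching = 2.0 * z_834

# Separation at which the difference of the two means is significant
# at the 5% level; the standard error of a difference is sqrt(2) * se.
gap_significant = z_95 * math.sqrt(2.0)

print(round(gap_touching, 3), round(gap_significant, 3))
```

Both separations come out near 2.77 standard errors, so non-overlapping 83.4% CIs can be read as a difference at roughly the 5% level. With unequal standard errors or small samples the correspondence is only approximate.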

Current findings and relation to prior literature
A subtle but important observation emerges from the results discussed in Section 3; the encapsulants do not seem to experience drastic degradation in either mDH or mDH + FSL, indicating that there is no over-acceleration of degradation modes Nalin Venkat (2021).
Many studies in the literature implement DH as the predominant indoor exposure condition. In a study by Park et al. (2021), a DH test (for 5500 h) was performed on p-PERC GB modules with EVA; an increase in fill factor and series resistance was observed due to corrosion of metal electrodes by moisture ingress. In the same study, DH with temperature cycling (DH5000/TC600) revealed that POE showed better durability than EVA. In both cases, the power loss followed a change-point trend, and the series resistance increased faster for the EVA module than for the POE one Park et al. (2021). In another work by Oreski et al. (2020), upon 3000 h of DH exposure, only EVA-based modules were seen to have corrosion at the silver grid as well as above the ribbons; modules with POE displayed no corrosive effects. DH exposure has been debated as an aggressive exposure condition in other studies, with reported effects ranging from over-acceleration of the PET layer to 2x higher degradation level in outdoor conditions Kanuga (2012); Yang et al. (2019); Hülsmann and Weiss (2015); Kempe and Wohlgemuth (2013). Using mDH with/without FSL does not lead to extreme degradation, as evidenced in our study. Cross-correlation of degradation in minimodules exposed in indoor accelerated conditions (mDH with/without FSL) and outdoor conditions will be part of our future work.
The differences in degradation among minimodule variants are primarily due to the module architecture. Although module architectures do not play an active role in power generation, they can lead to issues in long-term performance Aghaei et al. (2022).
In this study, on average, GB minimodules undergo greater power loss in comparison to DG minimodules. GB minimodules were observed to experience a greater power loss primarily due to corrosion. Most of the DG variants were observed to be stable in both mDH and mDH + FSL exposures. In a study by Karas et al. (2020) involving packaged silicon heterojunction cells, corrosion in GB modules with EVA encapsulation was found to be higher than in DG counterparts in DH exposure due to higher moisture penetration; GB modules with POE were found to undergo lower degradation in comparison Sinha et al. (2021). Even though our study does not consider packaged c-Si cells in particular, moisture ingress could be a possible explanation for why GB minimodules experience more power loss in our study. Moisture ingress is known to initiate from the edges of modules Poulek et al. (2021); Park et al. (2021). In another independent study by Kumar et al. (2022), GB minimodules with EVA were seen to have increased series resistance at high humidity levels. In a netSEM analysis done for EVA-based minimodules under DH conditions, hydrolysis of EVA was seen to be a dominant degradation pathway Wheeler (2017); Yang et al. (2019). Further investigation needs to be performed to validate the aforementioned claim in our study. In the scientific community, there is an ongoing discourse about which module architecture is better Aghaei et al. (2022); therefore, we cannot make a generalized claim that DG is better than GB. Furthermore, since POE is a relatively new material in the PV industry, its performance in long-term exposure has not yet been explored, especially in real-world conditions. Considering the conditions, duration of the exposure, and the obtained results, we cannot make firm conclusions that one type of module architecture/encapsulant is better than the other.

Comparison between results from Markovian model and multiple regression model
The results from the Markovian model and the multiple regression model are observed to be fairly consistent. Higher power loss in GB minimodules due to corrosion is supported by both the Markovian model and the multiple regression model results (we considered series conductance for ease of comparison between different mechanistic variables).
From multiple regression analysis, we observed that the degradation is heavily impacted by the difference in manufacturer for the same minimodule variant, highlighting the importance of quality control in the experimental and fabrication process. The lamination process plays an important role in the quality, reliability, and longevity of PV modules, as validated by previous studies Davis et al. (2016); Schneller et al. (2016); Aghaei et al. (2022). To increase the lifetime and the overall performance of PV modules, it is essential to control the fabrication process (including but not limited to soldering and lamination) so that PV modules of high quality are manufactured.

Conclusion
In this work, the application of the netSEM R package and statistical analysis has been demonstrated using stepwise measurement data of indoor-exposed minimodule variants. A comprehensive overview of the study protocol, comprising fabrication, exposure types, and characterization techniques, as well as the steps involved in obtaining data to eventually use the netSEM R package, has been provided. Using domain knowledge regarding PV module degradation and statistics, CIs and netSEM models were constructed. By utilizing the Markovian model and multiple regression, durable/degrading variants were identified.
As part of our future work, we are developing an automated analysis pipeline for analyzing minimodule variants using multiple regression and rank-ordering them on the basis of their degradation behavior.

Data availability statement
The original contributions presented in the study are publicly available. This data can be found on the OSF open data platform: https://osf.io/FYG6E/.