Characterization of Groundnut (Arachis hypogaea L.) Test Locations Using Representative Testing Environments With Farmer-Preferred Traits

In this study, the differential rankings of 36 groundnut genotypes under varying environmental conditions were studied at various levels of phenotype. Locations that are generally accepted by the crop- and soil-based research community to represent the entire Guinea and Sudan Savanna agro-ecological zones in Ghana were characterized, this time using a crop. The characterization was done based on four farmer-preferred traits (early and late leaf spot disease ratings, and haulm and pod yields) using three models (i.e., AMMI, GGE, and Finlay–Wilkinson regression). These models were used to capture specific levels of phenotype, namely, genotype-by-environment interaction (GE), genotype main effect plus GE (G+GE), and environment and genotype main effects plus GE (E+G+GE), respectively. The effect of three major environmental covariables was also determined using factorial regression. Location main effect was found to be highly significant (p < 0.001), confirming its importance in cultivar placement. However, unlike genotypes where the best is usually adjudged through statistical ranking, locations are judged against a benchmark, particularly when phenotyping for disease severity. It was also found that the locations represent one complex mega-environment, justifying the need to test new technologies, including genotypes in all of them before they can be approved for adoption nationally. Again, depending on the phenotypic level considered, genotypic rankings may change, causing environmental groupings to change. For instance, all locations clustered to form one group in 2017 for early and late leaf spot diseases and pod yield when GE was considered, but the groupings changed when G+GE was considered for the same traits in the same year. As a result, assessing genotypic performance at the various levels to arrive at a consensus decision is suggested. Genotypes ICGV-IS 141120 and ICGV-IS 13937 were found to be the best performing.

Keywords: groundnut, AMMI, GGE, model diagnostics, early leaf spot, late leaf spot, multi-environment trial

INTRODUCTION
Groundnut (Arachis hypogaea L.) is a leguminous oilseed crop grown in the semi-arid and subtropical regions across 40° north and south of the equator (Ajeigbe et al., 2015). In Ghana, it is the most important grain legume and is largely cultivated under rain-fed conditions (Oteng-Frimpong et al., 2017). The pods and haulms are important sources of income for smallholder farmers (Ajeigbe et al., 2015; Oteng-Frimpong et al., 2017). Although Africa holds the largest global area under groundnut cultivation (i.e., 11.7 million ha, representing 47.56% of the total area cultivated), its yields are very low (929 kg ha⁻¹) compared with those of the Americas (3,632 kg ha⁻¹) (Ajeigbe et al., 2015) due to a myriad of biotic and abiotic stresses (Oteng-Frimpong et al., 2015). This has resulted in a series of interventions aimed at improving pod yields on farmers' fields.
In an attempt to identify genotypes that combine inherent tolerance to biotic and abiotic stresses with high pod and haulm yields across target environments, breeders conduct multi-environment trials (METs). This has resulted in the identification of superior genotypes that perform better either in specific locations (specifically adapted, unstable genotypes) or across locations in a range of crops including groundnut (Padi, 2008; Asibuo et al., 2018). Most studies, however, focus solely on the genetic component of the breeders' equation and its interaction with the environment, with little or at times no consideration for the non-genetic component.
Nonetheless, the major challenge of increasing food production by about 50% by 2050 in a context of shrinking and degraded arable land, nutrient deficiencies, increased water scarcity, uncertainty due to predicted climatic changes (Vadez et al., 2012), and contrasting environments requires that test locations are well-delineated for the various crops to optimize crop improvement programs. For instance, a survey of groundnut fields in southern Africa found pod borers (Elaterids, Tenebrionids, doryline ants, and millipedes) to be present but rarely at sufficient densities to warrant concern (Wightman and Wightman, 1994), while doryline ants, white grubs, termites, and millipedes were a major concern on CSIR-SARI research fields in Ghana. Also, location is reported to be the key cause of variation for pod yield (t ha⁻¹; plot⁻¹), number of pods (plot⁻¹), hundred pod weight, and number of seeds (plot⁻¹) in groundnut (Asibuo et al., 2018). Environmental main effect has been found to contribute the largest variation (68%) in sugar content in groundnut as compared with genotype main effect and genotype-by-environment interaction (GE) (Isleib et al., 2008). In comparing the effectiveness of the Eberhart and Russell joint regression method and GGE biplot analysis for GE detection, environment main effects were found to account for over 80% of all variation, compared with <20% for genotype and GE in maize hybrids (Alwala et al., 2010). This means that knowing and matching the non-genetic component of the breeders' equation (nurture) with the genetic component (nature) are key in attaining crop potential yields.
In Ghana, selection of test locations for assessing genotype performance is largely based on agro-ecological zones. The Ghana Meteorological Agency has grouped the country into four agro-ecological zones based on climate, namely, Coast, Forest, Transition, and North (Amekudzi et al., 2015). In terms of vegetation, however, there are five zones, viz. tropical rain forest, semi-deciduous forest, forest-savannah transition, Guinea savannah, and Sudan savannah. Within each ecological zone is a wide range of soil types and fertility status, cropping systems and history, diseases and pests, rainfall amount, duration and distribution, temperature, and humidity (Siaw, 2001). The interaction between these two classification scenarios (climate and vegetation) and spatial factors creates a very complex system of environments that confronts groundnut cultivar development.
As a result, it is not enough to generalize and choose test locations for all crops based on climate, vegetation, or any economic consideration as has been happening in Ghana. Rather, these should be determined using the crop in question's response in all possible locations within the target environment. Representative locations can then be selected to constitute test locations for successive years, although re-characterization will be necessary after some years due to the dynamic nature of environmental conditions. Thus, the objectives of this study were to (i) categorize the target environment of CSIR-SARI groundnut breeding program into test locations using models that capture various components of the overall phenotype and (ii) determine the effect of GE on trait associations at the various levels of the phenotype.

Plant Materials and the Design of Experiments
Field experiments were conducted in two seasons (2017 and 2018) under rain-fed conditions. In the 2017 season, the experiments were conducted at Manga, Damongo, Nyankpala, and Silbele, while in the 2018 season, Tanina was added as an additional location (Supplementary File 1 Sheet 1). The locations used represent testing sites for CSIR-SARI and have been used in other studies (Marfo and Padi, 1999; Padi, 2008). The experiments comprised 36 genotypes (Supplementary File 1 Sheet 2) arranged in a lattice design and replicated twice, with six blocks per replicate. Each block contained six plots, with each plot covering an area of 6.2 m². Plants were spaced at 0.4 m between rows and 0.1 m within rows.
In each experimental location, fields were prepared by plowing followed by harrowing. Alligator 400 EC (pendimethalin, 400 g a.i. L⁻¹) was applied immediately after planting to suppress weeds, followed by one hand weeding before pegging. Phosphorus in the form of triple super phosphate was applied as a basal fertilizer at the rate of 125 kg ha⁻¹ (100 g plot⁻¹) just after seedling emergence. The plants were further supplemented with calcium by the application of 400 kg ha⁻¹ Omya Calciprill® (38% Ca and 0.6% Mg) between 20 and 25 days after sowing. All other recommended agronomic practices in groundnut production were adhered to. Data were collected on severity of early leaf spot (ELS) and late leaf spot (LLS) diseases as well as pod yield (PY, kg ha⁻¹) in 2017. In 2018, data on above ground dry matter (ADM, kg ha⁻¹) were also recorded in addition to all the datasets recorded in 2017. The ELS and LLS severity were scored on a scale of 1-9, with 1 representing complete resistance and 9 representing complete defoliation of plants (Subrahmanyam et al., 1995). These scores were converted to quantitative variables using area under disease progress curve (AUDPC) based on the formula below:

\[ \mathrm{AUDPC} = \sum_{i=1}^{n-1} \frac{Y_i + Y_{i+1}}{2}\,\left(t_{i+1} - t_i\right) \]

where $Y_i$ = disease level at time $t_i$, $t_{i+1} - t_i$ = time in days between two sequential disease scores, and $n$ = number of disease scores. A high AUDPC value represents high disease susceptibility.
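The AUDPC conversion can be illustrated with a minimal Python sketch of the trapezoidal rule. The function name, the assessment days, and the severity scores below are hypothetical and are not taken from the trial data; the study itself used R.

```python
def audpc(scores, days):
    """Area under the disease progress curve by the trapezoidal rule.

    scores: disease levels Y_i (here 1-9 severity scores) in scoring order
    days:   assessment times t_i in days, strictly increasing
    """
    if len(scores) != len(days) or len(scores) < 2:
        raise ValueError("need at least two scores with matching assessment days")
    total = 0.0
    for i in range(len(scores) - 1):
        interval = days[i + 1] - days[i]
        if interval <= 0:
            raise ValueError("assessment days must be strictly increasing")
        # Mean severity over the interval times the interval length
        total += (scores[i] + scores[i + 1]) / 2 * interval
    return total


# Hypothetical plot scored four times at 10-day intervals (made-up values)
example_days = [60, 70, 80, 90]
example_scores = [2, 3, 5, 6]
example_audpc = audpc(example_scores, example_days)  # 25 + 40 + 55 = 120.0
```

Because the score is capped at 9, the largest attainable AUDPC in such a sketch is 9 multiplied by the number of days between the first and last assessment.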

Statistical Analysis
Normality of data was checked using the Shapiro-Wilk test. Where the data did not follow a Gaussian distribution, they were transformed using the Box-Cox procedure (Box and Cox, 1964) with the MASS package (Ripley et al., 2019) of R statistical software (version 3.6.2) (R Core Team, 2020). Data analyses were done using R statistical software based on a two-stage strategy (Malosetti et al., 2013). In the first stage, a linear mixed-effects model was fitted to the location-specific data by restricted maximum likelihood (REML) using the lme4 package of R, including all terms for design features. All model terms were regarded as random except the genotype (Equation 1):

\[ y_{ijk} = \mu + r_i + b_{j(i)} + g_k + \varepsilon_{ijk} \tag{1} \]

where $y_{ijk}$ is the performance of genotype k in block j nested within replicate i, $\mu$ is the overall mean, $r_i$ is the effect of replicate i, $b_{j(i)}$ is the effect of block j nested within replicate i, $g_k$ is the effect of genotype k, and $\varepsilon_{ijk}$ is the residual. Significance of the fixed variables and effect estimates was tested and computed using the Wald test with the car (Fox et al., 2018) and lmerTest (Kuznetsova et al., 2019) packages, respectively. Estimated marginal means were computed, and multiple comparisons with Tukey's honestly significant difference (HSD) test at 0.05 probability were computed using the emmeans package (Lenth et al., 2019). The degrees of freedom for the marginal means were computed based on Kenward-Roger's method, and confidence intervals were computed at 0.95.
To determine the AMMI and GGE family of models required in the second stage, model diagnosis was done using replicated data from the various locations based on signal-noise estimation (Gauch, 2013). The model diagnosis was done with an in-house algorithm designed to work in the R statistical software (Supplementary Files 2 and 3) based on the fundamental equations developed with descriptors from Gauch (2013), as shown below:

\[ \mathrm{GE}_N = \mathrm{GE}_{df} \times \mathrm{Error}_{MS}, \qquad \mathrm{GE}_S = \mathrm{GE}_{SS} - \mathrm{GE}_N \]
\[ \mathrm{GGE}_N = \mathrm{GGE}_{df} \times \mathrm{Error}_{MS}, \qquad \mathrm{GGE}_S = \mathrm{GGE}_{SS} - \mathrm{GGE}_N \]

where $\mathrm{GE}_N$ is the GE noise; $\mathrm{GE}_{df}$, GE degrees of freedom; $\mathrm{GE}_S$, GE signal; $\mathrm{GE}_{SS}$, GE sum of squares; $\mathrm{GGE}_N$, G+GE noise; $\mathrm{GGE}_{df}$, G+GE degrees of freedom; $\mathrm{GGE}_S$, G+GE signal; $\mathrm{GGE}_{SS}$, G+GE sum of squares; and $\mathrm{Error}_{MS}$, error mean square. In addition to the traditional model diagnosis, the functions make the F-test component of the AMMI and GGE more robust by separating the Pure Error from the Error [referred to as Experimental Design in Gauch (2013)]. As a result, if blocks within environment are not statistically significant (p > 0.05), Error is used; otherwise, Pure Error is used in F-tests.
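The signal-noise arithmetic can be sketched in Python as follows. This is an illustrative sketch and not the authors' in-house R algorithm (Supplementary Files 2 and 3). The function names and the example numbers are hypothetical. The selection rule in `largest_member_within_signal` (retain leading PCs only while their cumulative sum of squares stays within the estimated signal) is an assumed reading of the conservative criterion described in the text, not a confirmed reproduction of it.

```python
def signal_noise(effect_ss, effect_df, error_ms):
    """Split a GE (or G+GE) sum of squares into noise and signal.

    noise  = effect_df * error_ms
    signal = effect_ss - noise
    """
    noise = effect_df * error_ms
    signal = effect_ss - noise
    return {
        "noise": noise,
        "signal": signal,
        "signal_pct": 100.0 * signal / effect_ss,
    }


def largest_member_within_signal(pc_ss, signal):
    """Count leading PCs whose cumulative SS does not exceed the signal.

    ASSUMPTION: this stopping rule is one conservative interpretation.
    Returns (number of PCs retained, SS captured by those PCs);
    0 retained corresponds to AMMI0 or GGE0.
    """
    retained, captured = 0, 0.0
    for ss in pc_ss:
        if captured + ss > signal:
            break
        captured += ss
        retained += 1
    return retained, captured


# Hypothetical ANOVA values: 36 genotypes x 4 locations gives
# GE df = (36 - 1) * (4 - 1) = 105; the SS and MS values are made up.
estimate = signal_noise(effect_ss=1000.0, effect_df=105, error_ms=4.0)
n_pcs, captured_ss = largest_member_within_signal(
    [500.0, 300.0, 200.0], estimate["signal"]
)
captured_pct = 100.0 * captured_ss / estimate["signal"]
```

With these made-up inputs the noise is 420, the signal is 580 (58% of the GE sum of squares), and one PC is retained. The share of signal captured by the retained PCs (`captured_pct`) is the quantity to inspect when checking for underfitting.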
Relationship among traits at the various levels of phenotype was determined based on Pearson's correlation. The correlation analysis was done and visualized using the agricolae and corrplot (Wei et al., 2017) packages of R.
In the second stage of the analysis, the genotype-by-environment table of means was subjected to the various analytical procedures, viz. AMMI, GGE, and Finlay-Wilkinson (FW) regression. AMMI and FW regression were done using the agricolae (Mendiburu, 2019) and FW (Lian, 2018) packages, respectively. GGE analysis was carried out by modifying the AMMI function (Supplementary File 3) to capture G+GE. Location-specific genotype winners from the adjusted means were used in environmental characterization (Gauch, 2013; Gauch and Moran, 2019). A factorial regression model was also fitted using the base functions in R, with mean daily precipitation, temperature, and relative humidity (RH) per growing season used as the explicit environmental covariables. The relevance of the genotype and location main effects as against the genotype-by-environment effect was determined using a linear model with genotype-by-environment mean squares. The best-performing genotypes were selected based on results from the linear model.
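As an illustration of what Finlay-Wilkinson regression does with a genotype-by-environment table of means, the following minimal Python sketch regresses each genotype's means on an environmental index (environment mean minus grand mean) by ordinary least squares. It uses the classical two-step form and is not the implementation in the FW package used in the study. The function name, genotype labels, and values are hypothetical.

```python
def finlay_wilkinson(means):
    """Classical Finlay-Wilkinson regression on a table of means.

    means: dict mapping genotype -> list of means, one per environment,
           with environments in the same order for every genotype.
    Returns dict genotype -> (genotype mean, slope on the environmental index).
    """
    genotypes = list(means)
    n_env = len(next(iter(means.values())))
    # Environmental index: environment mean minus grand mean
    env_mean = [
        sum(means[g][j] for g in genotypes) / len(genotypes) for j in range(n_env)
    ]
    grand_mean = sum(env_mean) / n_env
    index = [m - grand_mean for m in env_mean]
    sxx = sum(x * x for x in index)
    if sxx == 0:
        raise ValueError("environments do not differ; slopes are undefined")
    result = {}
    for g in genotypes:
        y = means[g]
        g_mean = sum(y) / n_env
        # Least-squares slope of genotype performance on the index
        slope = sum(x * (yj - g_mean) for x, yj in zip(index, y)) / sxx
        result[g] = (g_mean, slope)
    return result


# Hypothetical table: three genotypes in three environments (made-up values)
example_table = {
    "G1": [1.0, 2.0, 3.0],  # tracks the environment mean: slope 1
    "G2": [2.0, 2.0, 2.0],  # unresponsive to environment: slope 0
    "G3": [0.0, 2.0, 4.0],  # highly responsive: slope 2
}
fw_fit = finlay_wilkinson(example_table)
```

In this formulation the slopes average to 1 across genotypes. A slope above 1 indicates above-average responsiveness to better environments, and a slope below 1 indicates below-average responsiveness.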

Relevance of the Environment and Genotype Main Effects Over the Genotype-by-Environment Interaction
The main effect for location was significant (p < 0.001) for all traits studied in 2017 and 2018 (Table 1), when compared with the GE. Manga was the location with the highest (82.9) ELS AUDPC in 2017, followed by Nyankpala (65.10), with Silbele being the lowest (37.10) (Table 2). In 2018, Manga recorded the highest AUDPC for ELS (86.90), but there was no significant difference between this location and Damongo (85.60) or Nyankpala (85.60). In contrast, Silbele and Tanina had lower AUDPC, with no significant difference in this variable between these two locations. Also, Manga had the highest LLS in 2017 and 2018 (93.80 and 65.20, respectively). There was, however, no significant difference between Silbele and Tanina in 2018 (Table 2). For haulm yield, Silbele had the highest in 2017 and 2018 (3,733.00 and 4,700.52, respectively), and this was significantly (p < 0.001) different from the other locations. Also, the haulm yield obtained in Nyankpala differed from that of Manga in both years, with Manga being among the lowest in both years. Damongo had the highest (2,272.00) pod yield in 2017, while Silbele had the lowest (Table 2). In 2018, Silbele had the highest pod yield, while Tanina had the lowest (466.00). There was no significant difference in this variable between Damongo and Nyankpala.

AMMI Model Diagnosis Based on Genotype-by-Environment Interaction Signal-Noise Estimation
GGE model diagnosis does not exist in any of the statistical software currently available, while the AMMI model diagnosis only exists in the AMMISOFT software (Gauch and Moran, 2019). However, a rearrangement of data is necessary if AMMISOFT is to be used. An algorithm was therefore written to capture the signal in the respective multiplicative terms of each model from the total multiplicative terms, leaving the noise. The environment and genotype main effects as well as the GE were statistically significant (p < 0.01) for all the studied traits in 2017 (Table 4). However, the AMMI model diagnosis based on Gollob's test showed AMMI3 as the appropriate model for ELS disease, while AMMI2 was the appropriate model for haulm yield and LLS disease, with AMMI1 being appropriate for pod yield. On the other hand, model diagnosis based on signal-noise estimation revealed AMMI1 as the appropriate model for ELS disease and haulm yield, with AMMI0 being the appropriate model for LLS disease and pod yield. The GE signal present in the overall GE for ELS and LLS diseases and haulm and pod yields data was 72.16, 53.80, 62.71, and 45.03%, respectively, with the rest being noise (Table 4). The signal (GE_S) captured by the interactive principal components (PCs) for ELS disease and haulm yield was 71.15 and 95.79%, respectively.
In 2018, the environment main effect was significant for all traits (Table 4). Also, that of the genotype main effect was significant for all traits, except ELS. Similarly, the GE was not significant (p > 0.05) for ELS, although the AMMI model diagnosis based on Gollob's test suggested AMMI1 as the appropriate model for this trait. On the other hand, AMMI3, 1, and 3 were suggested as appropriate models for haulm yield, LLS, and pod yield, respectively. However, signal-noise estimation showed AMMI0 as the appropriate model for ELS and LLS diseases, with AMMI1 being appropriate for pod and haulm yields (Table 4). Also, the total GE signal present was 15.13, 29.60, 80.63, and 54.70% for ELS and LLS diseases and haulm and pod

GGE Model Diagnosis Based on Genotype Main Effect Plus Genotype-by-Environment Interaction Signal-Noise Estimation
The environment main effect of the GGE model was significant (p < 0.01) for all traits studied in 2017 (Table 5). Also, the genotype main effect plus GE (G+GE) was significant for all traits. Gollob's test showed GGE3 as the appropriate model for ELS and LLS diseases, whereas GGE2 and GGE1 were shown as appropriate models for haulm and pod yields, respectively. On the other hand, signal-noise estimation suggested GGE2 for ELS disease, GGE1 for haulm yield and LLS disease, and GGE0 for pod yield. The GGE_S (Table 5).
In 2018, the G+GE was not significant (p ≥ 0.05) for ELS disease but was significant for haulm yield, LLS disease, and pod yield (Table 5). Gollob's F-test suggested GGE2 for ELS and LLS diseases, GGE4 for haulm yield, and GGE3 for pod yield, whereas signal-noise estimation showed GGE0 as the appropriate model for ELS and LLS diseases and GGE1 for haulm and pod yields. The overall GGE_S was 19.37, 79.20, 33.37, and 57.77% for ELS disease, haulm yield, LLS disease, and pod yield, respectively, with the signal captured by the PCs for haulm and pod yields being 90.0 and 65.47%.
The environment main effect and the G+GE were significant for all traits (p < 0.001) when the combined data from 2017 and 2018 were considered (Table 5). The first four, six, three, and four PCs of the interactive component were significant (p < 0.05) for ELS, haulm yield, LLS, and pod yield, respectively, from the F-test. Hence, based on Gollob's test, GGE4, GGE6, GGE3, and GGE4 were appropriate for these trait analyses, respectively. However, signal-noise estimation diagnosed GGE1 for all the traits except haulm yield, for which GGE2 was diagnosed. The GGE_S was 41.28, 75.34, 40.27, and 59.09%, with the captured signal being 60.79, 92.68, 73.13, and 99.30% for ELS disease, haulm yield, LLS disease, and pod yield, respectively (Table 5).

Relationship Among Test Locations and the Interaction With Genotypes Based on the Genotype-by-Environment Interaction
Damongo, Manga, and Silbele locations grouped to form a cluster of related locations, with Nyankpala standing alone for ELS when GE and the location means were considered in 2017 (Tables 6, 7). However, when haulm yield was considered, Manga and Nyankpala had a common genotype winner for both GE and location means. All the locations clustered into a single group when LLS and pod yield were considered (Tables 6, 7). Genotypes ICGV-IS 09926 and ICGV-IS 13937 had the highest ELS disease severity scores (winners) in the various locations in 2017 when GE was considered, with 12CS-042 having the highest for LLS in all locations (Table 6). However, when the location means were considered, genotype CHINESE, which is always used as the susceptible check, had the highest disease severity scores in three of the four locations for ELS and in all locations for LLS (Table 7). Also, genotypes ICGV 86124 and ICGV-IS 13851 were the winners for haulm yield, while 12CS-042 was the sole winner for pod yield in all locations when GE was considered (Table 6). On the other hand, when location means were considered, ICGV-IS 131090 and ICGV-IS 13871 had the highest haulm yield, whereas ICGV-IS 141120 won in all locations for pod yield.
In 2018, all the locations clustered into a single group when GE and overall location means were considered for both ELS and LLS severity scores (Tables 6, 7). However, location Nyankpala stood out when haulm yield was considered. Damongo and Manga separated from the other locations for the GE of pod yield with Manga clustering with them when the overall location means were considered (Tables 6, 7). Genotype 12CS-042 had the highest ELS and LLS disease severity in all locations in 2018 when GE alone was considered. However, CHINESE and ICGV-IS 13834 were the most susceptible for ELS and LLS diseases, respectively, in all locations when the location means were considered. Also, genotype ICGV-IS 13989 had the highest haulm yield in four of the five locations when GE was considered, while ICGV-IS 14849 had the highest pod yield in three of the five locations (Table 6). When the location means were considered, genotypes ICGV-IS 14877, ICGV-IS 07947, and ICGV-IS 14943 had the highest haulm yield in the locations, with ICGV-IS 141120 and ICGV-IS 13937 having the highest pod yield.
For the combined analysis of the 2017 and 2018 data, repeatable patterns were observed when the location means were considered for ELS disease severity, with CHINESE being the most susceptible in Damongo and Nyankpala in both years (Table 7).
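The idea of grouping locations by a shared genotype winner, once on the location means and once on the GE component, can be sketched in Python as below. This is a simplified illustration under stated assumptions: it uses the full, unreduced double-centred interaction effects rather than the reduced-rank AMMI fits used in the study, and all function names, labels, and values are hypothetical.

```python
from collections import defaultdict


def interaction_effects(table):
    """Double-centre a genotype-by-location table of means.

    GE_ij = y_ij - genotype mean_i - location mean_j + grand mean
    table: dict genotype -> list of means, one per location.
    """
    genotypes = list(table)
    n_loc = len(next(iter(table.values())))
    g_mean = {g: sum(table[g]) / n_loc for g in genotypes}
    l_mean = [sum(table[g][j] for g in genotypes) / len(genotypes) for j in range(n_loc)]
    grand = sum(g_mean.values()) / len(genotypes)
    return {
        g: [table[g][j] - g_mean[g] - l_mean[j] + grand for j in range(n_loc)]
        for g in genotypes
    }


def group_by_winner(table, locations):
    """Group locations that share the same top-ranked genotype."""
    groups = defaultdict(list)
    for j, loc in enumerate(locations):
        winner = max(table, key=lambda g: table[g][j])
        groups[winner].append(loc)
    return dict(groups)


# Hypothetical means for three genotypes at three locations (made-up values)
locations = ["Loc1", "Loc2", "Loc3"]
means = {
    "G1": [10.0, 12.0, 20.0],
    "G2": [11.0, 11.0, 11.0],
    "G3": [9.0, 10.0, 14.0],
}
groups_from_means = group_by_winner(means, locations)
groups_from_ge = group_by_winner(interaction_effects(means), locations)
```

With these made-up values, the location means place Loc2 and Loc3 together under G1, whereas the GE component places Loc1 and Loc2 together under G2. This mirrors how groupings can change with the level of phenotype considered. For disease severity, the "winner" is the genotype with the highest score, so the same `max` rule applies.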

Relationship Among Test Locations and the Interaction With Genotypes Based on the Genotype Main Effect Plus Genotype-by-Environment Interaction
Locations Manga and Nyankpala formed a group of common environments, with genotype ICGV-IS 13842 being the most susceptible in these locations when ELS severity was considered in 2017 (Table 8). On the other hand, Nyankpala and Silbele had a common winner for haulm yield, with Damongo, Nyankpala, and Silbele also forming a cluster of similar locations when LLS disease was considered. When pod yield was considered, all locations formed a single cluster, with 12CS-042 being the winner (Table 8).
with genotype ICGV-IS 14943 being the winner, while ICGV-IS 13989 was the winner in the other locations. Also, Nyankpala and Tanina clustered to form a common environment, with ICGV-IS 14849 as the winner for pod yield. ICGV-IS 07947 was the winner for the other three locations (Table 8).
For combined analysis of 2017 and 2018 years' data, all environments, except Silbele in 2017 and Tanina in 2018, clustered, with ICGV-IS 13834 being the common susceptible genotype (Table 8).

Relationship Among Test Locations and the Interaction With Genotypes Based on the Genotype and Environment Main Effects Plus Genotype-by-Environment Interaction
Locations Damongo and Silbele clustered to form a common environment, with genotype ICGV-IS 13848 being the most susceptible when ELS disease was considered in 2017 (Table 9). Again, these locations formed a cluster, with genotype YENYAWOSO as the most susceptible when LLS was considered.
In both scenarios, locations Manga and Nyankpala stood out, with genotype-specific susceptibilities (Table 9). For haulm yield, Manga and Nyankpala clustered, whereas all the locations evaluated in 2017 clustered for pod yield and had genotype ICGV-IS 141120 as the winner.
In 2018, locations Damongo, Manga, and Nyankpala had a common most susceptible genotype (ICGV-IS 13834), with CHINESE being the most susceptible for Silbele and Tanina for ELS disease (Table 9). On the other hand, Manga, Silbele, and Tanina all had genotype ICGV-IS 13834 as the most susceptible for LLS disease. With regard to haulm yield, genotype ICGV-IS 13848 was the common winner for Damongo and Tanina, while genotype ICGV-IS 13937 won in four of the five locations (Damongo, Manga, Nyankpala, and Silbele) for pod yield (Table 9).
When 2017 and 2018 data were jointly considered, the most consistent environmental grouping was observed for pod yield, with genotype ICGV-IS 141120 winning in eight of the nine environments (Table 9).

Influence of Environmental Covariables on Genotypic Performance and the Relationship Among Traits at Various Levels of Phenotype
Based on the factorial regression model, the environmental (locations and years combined) main effect was significant (p <  File 4). However, the genotype main effect was only significant for pod yield. ELS and LLS diseases had a significant (p < 0.05) positive correlation at all levels of phenotype (i.e., GE, G+GE, and E+G+GE) in all years except for E+G+GE in 2018 (Figure 1). Also, haulm yield in 2017 correlated positively with haulm yield in 2018 at the G+GE and E+G+GE levels of phenotype. However, pod yield in 2017 correlated negatively with pod yield in 2018 at the E+G+GE level but positively at the G+GE level of phenotype (Figure 1).

DISCUSSION
The present study sought to delineate locations using a crop (groundnut) with farmer-preferred traits. In this work, the high statistical significance of the location main effect for all traits in both years shows the important contribution of the non-genetic component of phenotype to overall crop performance. It also confirms the most common scenario in multi-environment trials, where the location main effect captures the largest share of the total variation (Gauch, 2006). Although this component is unimportant to breeders (Gauch, 2013), it highlights the significance of putting a genotype in an optimal environment (cultivar placement) for enhanced performance. Generally, locations that ranked first in ELS and LLS disease severity (highest disease pressure) also ranked first in pod and/or haulm yields. This suggests that the disease condition was not severe enough to affect yield. For instance, per the scale used in scoring, the highest disease severity in AUDPC that can be scored for ELS and LLS is 180 and 135, respectively. Therefore, unlike genotypes, where ranking of means is required to select the best material for advancement, locations should be judged against a benchmark. This will deepen our understanding of the amount of investment required in a location to enable a genotype to perform to its maximum. Also, the genotype main effect was only significant for pod yield in both years. However, this is not surprising, because pod yield is the end product of all the conditions and physiological processes the groundnut crop underwent during growth. It therefore highlights the importance of developing a cultivar with high genetic potential for pod yield under the set of conditions in which it will grow. Genotypes ICGV-IS 141120 and ICGV-IS 13937 were the best in terms of pod yield in 2017 and 2018, with average yields of 1,798 and 1,470 across years, respectively.
Traditionally, the visualization of AMMI and GGE is done using the first and second PCs (PC1 and PC2), i.e., AMMI2 and GGE2, respectively (Gauch, 2006). With the implementation of a triplot function in the agricolae package of R statistical software (Mendiburu, 2019; R Core Team, 2020), researchers now have the option of visualizing AMMI plots with three PCs on a 2D plane. However, relying on p-values from Gollob's F-test to select a model family of AMMI and GGE can be misleading (Gauch, 2013; Gauch and Moran, 2019). For instance, the GE and G+GE of ELS disease in 2018 were not significant. This suggests that AMMI and GGE analyses were not justified. Meanwhile, the PC1 of the AMMI model, and PC1 and PC2 of the GGE model were highly significant. Proper model diagnosis is therefore necessary in order to capture the real patterns in the data. The F_R-test (Piepho, 1995) and signal-noise comparison (Gauch, 2013) are by far the most robust procedures currently available for diagnosing the AMMI models. However, the F_R-test, which relies on Gollob's procedure for estimating degrees of freedom (df) (Piepho, 1995; Gauch, 2013), underestimates GGE df, although its effect may be offset by declaring significance at 1 or 0.1% instead of the usual 5%. AMMI and GGE models, which were used to model phenotype at GE and G+GE levels, respectively, in the present study focused on signal-noise estimation and comparison as a means of model diagnosis, because it is highly conservative and under no circumstance allows any noise to be captured. The downside is that it can lead to model underfitting when attention is not paid to the signal captured, as seen when the suggested model captured 65.47 and 60.79% of G+GE signal in pod yield and ELS disease in 2018 and across years, respectively. However, unlike other diagnostic procedures that solely depend on statistical significance, its sensitivity is not dependent on the size of the dataset.
The highest member of the AMMI model family required for any trait in this study was AMMI2, while that of GGE model was GGE2.
The use of different models (AMMI, GGE, FW regression, and factorial regression) in this study was not to determine their robustness in revealing patterns in a MET dataset. That has been discussed extensively in the literature already (Yan et al., 2000, 2001; Yan and Rajcan, 2002; Gauch, 2006, 2013; Malosetti et al., 2013; Yan, 2014). The different models were used to understand the relationship among test locations and genotypic performance at various levels of phenotype (i.e., GE, G+GE, and E+G+GE) as well as the role of explicit environmental covariables (rainfall, temperature, and RH). When the GE component was considered, patterns (similar location groupings) were repeatable across traits and across years for the same trait, although a complex scenario largely existed across environments (location and year combined). For instance, the location groupings were the same for LLS disease and pod yield in 2017 and ELS and LLS diseases in 2018 (Tables 6, 7). The G+GE component also showed repeatable patterns across traits and across years for the same trait. In fact, across environments, GE and G+GE components of the phenotype gave similar environmental patterns and genotype winners for ELS disease severity. When the main and interactive effects (E+G+GE) were jointly considered, location groupings were again similar between ELS and LLS disease severities in 2017 and between ELS disease and pod yield in 2018, although the genotype winners were entirely different. The clustering of almost all locations (within years) and environments (across years) to form a single mega-environment among consistent patterns suggests that the locations represent one mega-environment. However, the mega-environment is a complex one (complex mega-environment) since such consistency did not cut across all traits.
As a result, although these locations are testing sites and together represent two agro-ecological zones (Guinea and Sudan Savanna) (Marfo and Padi, 1999), testing at these sites approximates the entire target region. Although these locations were historically selected for trials not necessarily based on any crop's performance, their use over the years has been justified, particularly for groundnut. On the other hand, recharacterization of the entire target region may be necessary to ascertain whether the current locations will continue to be the representative environments for the respective mega-environments. However, considering the financial investment required, it would be prudent to use the on-farm approach.
Environmental components that contribute to genotype-by-environment interaction can generally be categorized into two groups, i.e., those that result in predictable GE and those that result in unpredictable GE. Soil and weather parameters are examples of the predictable and unpredictable components, respectively. Precipitation is the main source of water for groundnut cultivation in Sub-Saharan Africa, while humidity and temperature affect leaf spot disease development (Oteng-Frimpong and Dakora, 2019) and growth stages (Rao et al., 1992; Oteng-Frimpong et al., 2019), respectively. Hence, the lack of significance for the interactive components of the factorial regression model suggests that the predictable components of the environments were driving the GE. It is, therefore, not surprising that clusters of location groupings were repeatable across traits and years. The fact that this study yielded findings that agree with the historical characterization, which was carried out primarily based on vegetation type and indices other than crop performance, also supports this assertion. However, when the data from 2017 and 2018 were combined and analyzed, environments from the same location (e.g., Damongo 2017 and Damongo 2018) largely had different genotype winners, so they fell into different environmental groups. This scenario was observed at all levels of phenotype and indicates the presence of a latent seasonal variation that could not be detected statistically through factorial regression modeling. Since this tends to complicate the breeder's ability to identify useful environmental patterns, it is suggested that when data from multi-year or multi-season experiments are available, the analysis should be done year-wise, unless the confounding effect of the yearly variation can be properly accounted for.
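The way environments from the same location can fall into different groups across years can be sketched with a deliberately simplified grouping rule: each environment is assigned to the genotype that ranks first in it. This is only an approximation of the idea, since the groupings in this study came from the fitted models rather than from raw means. The environment labels, genotype labels, and values below are hypothetical and are not data from this study.

```python
from collections import defaultdict


def group_environments_by_winner(table, lower_is_better=False):
    """Group environments by their top-ranked genotype.

    table maps environment name -> {genotype name: trait value}.
    Set lower_is_better=True for traits such as disease severity ratings.
    """
    pick = min if lower_is_better else max
    groups = defaultdict(list)
    for env, scores in table.items():
        winner = pick(scores, key=scores.get)  # best genotype in this environment
        groups[winner].append(env)
    return dict(groups)


# Hypothetical pod yield means (made-up numbers) for two genotypes.
pod_yield = {
    "LocA_2017": {"G1": 2.0, "G2": 1.5},
    "LocA_2018": {"G1": 1.0, "G2": 1.8},
    "LocB_2017": {"G1": 2.2, "G2": 2.0},
}
print(group_environments_by_winner(pod_yield))
```

In this made-up table, the two years of the same location end up with different winners and therefore in different groups, which is the pattern that motivates the year-wise analysis suggested above.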
Relationships among traits were consistent across the phenotypic levels of GE and G+GE for the disease-related traits. However, when the "E" component was added, the patterns changed. Since breeders are more interested in the genotype, such associations should be estimated devoid of the "E." On the other hand, the changes in trait associations when "E" was added (i.e., E+G+GE) mean that breeders ought not to ignore the environment in which cultivars are to be placed. Therefore, collaborative cultivar development in which specialists such as soil scientists and agronomists are actively involved in the breeding process is encouraged.
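For the E+G+GE level, where the environment main effect is retained, the core of Finlay-Wilkinson regression can be sketched as follows: each genotype's means are regressed on an environmental index, taken here as the environment mean minus the grand mean. This is a minimal least-squares sketch with a hypothetical function name and made-up input, not the fitting procedure used in this study.

```python
import numpy as np


def finlay_wilkinson_slopes(means):
    """Return one regression slope per genotype on the environmental index.

    means is a genotypes x environments array of cell means.
    """
    means = np.asarray(means, dtype=float)
    env_index = means.mean(axis=0) - means.mean()          # environmental index, sums to zero
    centered = means - means.mean(axis=1, keepdims=True)   # remove each genotype's own mean
    return centered @ env_index / np.sum(env_index**2)     # least-squares slope per genotype


# Made-up table: 2 genotypes x 3 environments.
example = [[1.0, 2.0, 3.0],
           [2.0, 4.0, 6.0]]
print(finlay_wilkinson_slopes(example))
```

By construction the slopes average to one, so a slope above one marks a genotype that responds more strongly than average to better environments, and a slope below one marks a less responsive genotype.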

CONCLUSION
This study used a comprehensive approach to test location characterization and genotype performance assessment, examining four farmer-preferred traits with model diagnosis and at various levels of phenotype. The study justified the use of the current test locations (Damongo, Manga, Nyankpala, Silbele, and Tanina) to represent the entire Guinea and Sudan Savanna zones in Ghana during METs. GE was driven by the predictable environmental components, although a latent yearly/seasonal effect exists.
Depending on the phenotypic level (i.e., GE, G+GE, or E+G+GE) at which genotype performance is assessed, the best-performing genotype may change. Although different statistical models are generally used to approximate these phenotypic levels, comprehensively assessing genotypic performance across the levels and arriving at a decision using this approach will likely result in the identification of the most superior cultivar among a set of genotypes evaluated. Considering the MET data from the 2 years independently and at the various levels of phenotype, genotypes ICGV-IS 141120 and ICGV-IS 13937 were selected and recommended for on-farm study due to their superior and consistent performance across the traits considered. Genotype CHINESE will continue to be used as a susceptible check for ELS and LLS disease studies, while ICGV-IS 13834 will be added as a new susceptible check.

DATA AVAILABILITY STATEMENT
The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.

AUTHOR CONTRIBUTIONS
RO-F conceived the study. RO-F, DP, JA-D, and FK designed the experiments and developed an outline. YK, JN, MR, A-RI, and FK implemented the experiments at the different locations. YK conducted the statistical analysis and produced the initial draft. RO-F and DP edited and improved the draft. All authors read and approved the final manuscript.

FUNDING
This work was done with financial support from the BMGF-funded Tropical Legumes III Project (Grant ID OPP1114827).