A Systematic Narration of Some Key Concepts and Procedures in Plant Breeding

The goal of a plant breeding program is to develop new cultivars of a crop kind with improved yield and quality for a target region and end-use. Improved yield across locations and years means better adaptation to the climatic, soil, and management conditions in the target region. Improved or maintained quality renders and adds value to the improved yield. Both yield and quality must be considered simultaneously, which constitutes the greatest challenge to successful cultivar development. Cultivar development consists of two stages: the development of a promising breeding population and the selection of the best genotypes out of it. A complete breeder's equation was presented to cover both stages, which consists of three key parameters for a trait of interest: the population mean (μ), the population variability (σG), and the achieved heritability (h2 or H), under the multi-location, multi-year framework. Population development is to maximize μσG and progeny selection is to improve H. Approaches to improve H include identifying and utilizing repeatable genotype by environment interaction (GE) through mega-environment analysis, accommodating unrepeatable GE through adequate testing, and reducing experimental error via replication and spatial analysis. Related concepts and procedures were critically reviewed, including GGE (genotypic main effect plus genotype by environment interaction) biplot analysis, GGE + GGL (genotypic main effect plus genotype by location interaction) biplot analysis, LG (location-grouping) biplot analysis, stability analysis, spatial analysis, adequate testing, and optimum replication. Selection on multiple traits includes independent culling and index selection, for the latter GYT (genotype by yield*trait) biplot analysis was recommended. Genomic selection may provide an alternative and potentially more effective approach in all these aspects. Efforts were made to organize and comment on these concepts and procedures in a systematic manner.


INTRODUCTION
Plant breeding plays a key role in meeting the human needs for more food, nutrition, and fiber under a changing climate. The goal of a plant breeding program is to develop new cultivars of a crop kind with improved yield and quality for its target region and end-use. All theories, concepts, processes, procedures, and analyses related to plant breeding are developed and implemented around this goal. A target region is the target population of environments, which is the sum of soil, climatic, biotic, and abiotic conditions plus common management practices that are likely to be encountered in the region. Improved yield means improved adaptation to the target region, which is reflected in improved mean performance and stability of performance across locations and years in the target region. Improved quality means improved adaptation to the end-uses that bring value and income to the growers in the target region. Both the target environments and the target end-uses may change over time, in a predictable or unpredictable manner. Yield is the result from integrating numerous traits including various yield components, agronomic traits, disease resistances, and tolerance to various abiotic stresses characteristic of the target region. Consequently, yield in different regions may mean different ways of packaging these traits and underlying gene alleles. Likewise, quality is a collective term of many parameters for a specific end-use. Thus, dealing with many traits simultaneously is an essential task of cultivar development, although most breedingrelated publications deal with only a single trait, typically yield. The relation between yield and other traits is analogous to that between the skin and the hair of a fur or that between the trunk and the branches of a tree; other traits gain importance only when attached to (i.e., combined with) high yield (Yan and Frégeau-Reid, 2008;Yan et al., 2019a). Plant breeding is a mature discipline of applied sciences, with well-developed concepts and procedures. Nevertheless, a systematic combing and narration of the numerous, sometimes confusing, concepts and procedures should help both new and experienced breeders in their work toward developing superior cultivars. The concepts and procedures in plant breeding are indeed much easier to tackle for a single trait. So, much of the discussion will be on a single trait while keep in mind that multi-trait selection is essential to cultivar development, which is discussed in the last section. In addition, genomic selection (Goddard and Hayes, 2007;Heffner et al., 2009;Jannink et al., 2010) has become a growing point or integral part in most plant breeding programs. Its role will be briefly mentioned when the various concepts and procedures are discussed.

THE COMPLETE BREEDER'S EQUATION
The cultivar development process includes two stages: the development of a promising breeding population and the identification of the best progeny out of it. Breeding success can be measured by the following equation, referred to as the Complete Breeder's Equation (modified from Yan et al., 2019b), in comparison with the well-known Breeder's Equation of Eberhart (1970), Here B stands for breeding success per unit time and cost and G stands for selection gain over the population mean per unit time, for a trait of interest (typically yield). µ is the mean of the breeding population, σ G is the square root of the genotypic variance of the population, i is the selection intensity in the unit of σ G , h is the square root of achieved heritability (h 2 or H), Y is the length of the breeding cycle in years, and C is the operation cost per year. µ , σ G and h are to be estimated from environments representing the target region. A target region may consist of multiple mega-environments, as will be discussed later. For the time being the target region is assumed to be a single mega-environment. A mega-environment is defined as a group of environments that share the same best cultivar(s) (Gauch and Zobel, 1997;Yan et al., 2000).
Relative to Equation 2, Equation 1 emphasizes the importance of population mean in cultivar development and serves as a reminder that any selection progress is on the basis of the population mean. The inclusion of C emphasizes that cultivar development is an enterprise that must consider the cost for the achieved genetic gain.
Cultivar development consists of two stages: population development and progeny selection. Practical breeders would agree that developing a promising breeding population, i.e., making a promising cross or crosses, is the crucial first step toward cultivar development. A promising breeding population is the basis for any meaningful selection effort. This point may be implied in Eberhart (1970) and by later researchers (e.g., Cobb et al., 2019;Rutkoski, 2019) when discussing the Breeder's Equation but its importance to cultivar development can never be overemphasized, thus implicitly indicated in Equation 1. The potential of a breeding population for cultivar development, shorted as population potential (P), depends on both the population mean (µ) and the population variability (σ G ): Apparently, if there is no genetic variability, there would be no selection progress; if the population mean is low, it is unlikely to lead to any superior cultivars regardless of selection strategies. Practical plant breeders are well aware of the importance of the population mean. They cross best with best and look for recombinants better than both parents (Duvick, 1996). A high µ is usually achieved by using currently the most popular, usually the highest yielding, cultivars as parent(s), while a high level of σ G is achieved by choosing parents that are different and complementary in yield components, agronomic traits, disease resistances, and quality traits, and by use of a large enough breeding population. Crossing an adapted local cultivar with a geographically distant cultivar with desired traits led to some of the most important wheat cultivars in China (Zhao et al., 1981).
In the era of genomic selection, µ and σ G and therefore P can be predicted for any pair or set of potential parents for a trait of interest if reliable genomic models are available (Wang et al., 2018). Usually, the genetic variability in a breeding population is created by crossing different parents, but it can also be created through induced mutations by treating a superior cultivar with γ radiation, chemical mutagen treatment, transposons, genetic transformation, or gene editing (e.g., van Harten, 1998;Kharkwal et al., 2004;Shu et al., 2012).
To maximize µσ G may suggest that µ and σ G are equally important. In cultivar development, however, µ may be more important than σ G although both are essential. The use of backcross, recurrent selection, and crosses between closely related breeding lines (e.g., Rasmusson and Phillips, 1997) are examples to ensure a high µ at the expense of σ G . On the contrary, wide crosses (e.g., Baum et al., 1992) can bring much variability to the population at the expense of reduced population mean. Wide crosses are essential to introduce novel genes and traits from wild species (e.g., Ma et al., 2018; numerous research done worldwide for various crops), which are crucial to longterm crop improvement; however, they are unlikely to directly lead to superior cultivars. It may be more accurately referred as the breeder's equation for progeny selection. Here σ G is fixed for a given breeding population, i is a parameter artificially set, and h is the square root of achieved heritability. In fact, while i is the artificially set selection intensity, ih is the realized selection intensity. Cobb et al. (2019) discussed approaches to improving breeding efficiency in the framework of Equation 2, with the emphasis on reducing Y. Rutkoski (2019) reviewed the basis and approaches to achieve genetic gain.
Equation 1 can be better understood from Figure 1. Assume that the breeding population is normally distributed with a mean µ and a variability σ G . The X-axis is the range of the phenotypic values and the Y-axis is the frequency density. The area under the curve is unity (1 or 100%). With a selection intensity ih, genotypes to be culled lie on the left side of the vertical line defined by x = µ + ihσ G , and genotypes to be retained lie on the right side of the line. The area α is the proportion of the population to be retained and 1 − α is proportion to be culled. α is also the probability for ihσ G < 0, while 1 − α is the probability for ihσ G > 0. In other words, α is the probability for a genotype with a phenotypic value of µ + ihσ G to be no better than the population mean. µ is the mean of the breeding population; σ G is the square root of the genotypic variance of the population; i is the artificially set selection intensity in the unit of σ G ; h is the square root of achieved heritability (h 2 or H); α is the portion of the population to be selected; it is also the probability that the best genotypes are not included in the selected portion.
An extended interpretation is that α is the risk that the best genotype in the population is not retained at the selection intensity ih. Apparently, the risk is reduced as ih is increased while h is the only objective variable. If h = 0, then ih = 0, and α = 50%. As h approaches unity, α approaches 0. This provides a clue for the choice of i. According to the normal distribution table, if α is set to 0.0001, then z = ih = 3.7. Therefore, it is rational to set i = 3.7 at α = 0.0001. The relationships between heritability, ih, and α at i = 3.7 for some selected heritability values are listed in Table 1, assuming a population size of n = 10, 000.
If α is interpreted as the percentage of the population that must be retained to ensure that the best genotype(s) is included, then the number of genotypes must be selected, N, will be: where n is the effective population size, i.e., the number of unique genotypes in the breeding population. The inverse of N may be defined as the rate of selection success (Yan et al., 2019b): For example, for n = 10, 000, H = 0.9, and i = 3.7, we have α = 0.02% and N = 2 ( Table 1). That is, for a population of 10,000 unique genotypes, an achieved heritability of 0.9 would guarantee that the best genotype is between the top two. A smaller N means less time (in years) and cost that are needed to single out the best genotype. In the extreme case, if a selection method (genomic prediction or any other approach) is accurate enough to identify the best genotype (i.e., N = 1) out of a breeding population, then all the time and cost associated with subsequent testing would be saved. In contrast, in the Ottawa oat breeding program, it takes about seven years of visual selection 1 | The realized selection intensity (z = ih), the proportion of the population to be retained (α), and the number of genotypes to be retained (N) at different levels of heritability (H or h 2 ) assuming a population of n = 10, 000 and a selection intensity of i=3.7.
Each year ∼10,000 F 2 derived breeding lines are planted in a hill nursery and 1,000 are visually selected in the field and the seed lab. Assuming that the best genotype is included in these selected lines, the rate of selection success for this stage (the "Hill Nursery" stage or Stage 3.1) is ∼1/1000, roughly corresponding to assuming an H = 0.1 ( Table 1). The 1,000 selected lines are then planted in yield plots and ∼200 lines are visually selected (the "Observation Plot" stage or Stage 3.2). The accumulative rate of selection success for these two years of visual selection is, therefore, ∼1/200, corresponding to H = 0.3 ( Table 1). It takes four to five additional years of multi-location test to single out the best few genotypes as potential new cultivars (Stages 4.1 to 4.5 in Table 1). Experience indicates that the top genotypes at the Stage 4.3 are usually the ones to be released as cultivars; this corresponds to H = 0.7 (Table 1). Trials in Stages 4.4 and 4.5 (years 2 and 3 of the Registration Test) are conducted mainly to verify the results and to obtain data required for official variety registration. Genomic selection applied at the Hill Nursery stage (Stage 3.1) is expected to dramatically improve the rate of selection success so as to reduce the number of years spent in visual selection and yield trials (Y, Equations 1 and 2). A minimum requirement for a viable genomic selection procedure is to improve the selection efficiency to an extent that covers the extra cost spent in genotyping, phenotyping, and model development. Alternatively, genomic selection is justified if it can identify the best genotypes that may be discarded by the breeder's eye.
The parameter i should be set according to the population size such that α = 1/n. This reflects the idea that a larger population allows a higher selection intensity at the same level of heritability and that the top genotype is the best genotype (N = 1) when h = 1. According to the normal distribution table, i should be set at 2.05, 2.33, 3.00, and 3.71 when n = 50, 100, 1,000, and 10,000, respectively. The relationship among heritability, selection intensity, and probability of false culling (α) is displayed in Figure 2. Incidentally, Singh and FIGURE 2 | The relationship between heritability (H or h 2 ) and probability of false culling (α) at four levels of selection intensity (i). It is suggested that the probability of false culling that can be tolerated be set according to the population size (n) such that α = 1/n; i can then be determined by α by consulting the normal distribution table. The vertical line of H = 0.75 indicates the target heritability to be achieved for reliable selection.
Alternatively, the achieved heritability may be used as the culling rate when the population is small; the number of genotypes that must be retained can then be roughly estimated by: When h 2 = 0, no genotype would be discarded because the selection is completely unreliable; and when h 2 = 1, all but the top performing genotype can be discarded because any observed difference is genetic and heritable. For example, if n = 40 and h 2 = 0.95, then 95% or 38 of the 40 entries can be discarded and the top two performing genotypes can be selected or recommended.
To summarize, for a given breeding population and a given target environment, the allowable culling rate, the allowable selection intensity, the achievable rate of selection success, and the expected selection gain are all determined solely by the achieved heritability, in a curvilinear fashion (Figure 2). Therefore, heritability is the single most important concept in progeny selection.

HERITABILITY UNDER THE MULTI-LOCATION, MULTI-YEAR FRAMEWORK
Cultivars are developed to adapt to a specific target region, i.e., to the environments that may be encountered across locations and years in a target region. Therefore, the heritability discussed so far must be defined in the multi-location, multi-year framework (Comstock and Moll, 1963;DeLacy et al., 1996;Atlin et al., 2000). According to the general linear model, a phenotype, i.e., an observed value, is a mixed effect of environmental main effect (E), genotypic main effect (G), genotype by environment interaction (GE), and experimental error (ε), where E is the sum of location main effect (L), year main effect (Y), and their interaction (LY). Assuming orthogonal experimental design i.e., the same set of genotypes are tested at the same set of locations each year with the same number of replicates, the phenotypic variance is σ 2 lyr . The entry-mean heritability, i.e., the proportion of phenotypic variance that can be explained by the genetic variance at the entry mean level, is estimated by (Fehr, 1991;DeLacy et al., 1996): where H rly stands for heritability across l locations in y years with r replicates; σ 2 GL , σ 2 GY , and σ 2 GLY are the variances for the genotype by location interaction (GL), genotype by year interaction (GY), and genotype by location by year interaction (GLY), respectively; and σ 2 ε is the variance for experimental error. When trials are not conducted orthogonally regarding genotypes, location, years, or replicates, which is usually the case, each trial (location-year combination) may be considered as an environment, and the heritability can be estimated by where σ 2 GE is the variance for genotype by environment interaction. For convenience, Equation 7 will be used in further discussions. Restricted maximum likelihood (REML) is the preferred method for estimating the various variances, particularly when the data are unbalanced (e.g., Gilmour et al., 1995). REML is implemented in all software packages with a mixed model procedure.
Heritability for a single trial can be estimated by However, H r can be used to assess the data quality of a trial but not for making final selections. For making section decisions, Equation 7a below should be used instead: That is, although the interaction terms cannot be estimated from a single trial, they must be factored in when making selection decisions. It can be seen that H r is an inflated estimation of H rly for a trial because the denominator in Equation 7a should be much larger than that in Equation 9. Likewise, a heritability can be estimated for multi-location trials conducted in a year, but it is not to be used to make final selection decisions. Instead, equation 7b should be used, H rl is an inflated estimation of H rly for a single-year test. The definition of heritability in the form of Equation 7 is the only valid definition to be used in Equation 1; Figure 1, and Table 1, with h = H rly , even though H rly cannot be directly estimated in some stages of the breeding cycle. It should be noted that the definition of heritability is in line with the concept of mixed models. It consists of variances for G, GE (= GL + GY + GLY), and experimental error but excludes that for E (= L + Y + LY), implying a mixed model. It implies that G, GE, and experimental error are considered as random effects but E as fixed effects (DeLacy et al., 1996). Researchers are often puzzled on which effects should be treated as random and which fixed when analyzing multi-environment trials data using mixed models (Piepho et al., 2003); for the purpose of genotype evaluation, this is clear from the definition of heritability. The definition of heritability is also consistent with the concept of GGE biplot analysis, which excludes E and focuses on G and GE for cultivar and test environment evaluation (Yan et al., 2000;Yan and Kang, 2002;Yan and Tinker, 2006;Yan, 2014). All efforts made to improve selection efficiency are also efforts to improve the heritability as defined in Equation 7 or Equation 8. Put it differently, all possible approaches to improve selection efficiency reside in the definition of heritability. These include approaches to deal with GE and approaches to minimize experimental error. Dealing with GE include two steps: (1) identifying and utilizing repeatable GE, a process often referred to as mega-environment analysis (Yan, 2014(Yan, , 2015(Yan, , 2016(Yan, , 2019, and (2) accommodating unrepeatable GE through adequate testing Yan, 2016Yan, , 2021. Dealing with experimental error includes adequate replication Yan, 2021) and spatial variation adjustment (Cullis and Gleeson, 1991;Gilmour et al., 1997;Cullis et al., 1998;Burgueño et al., 2000;Qiao et al., 2000;Yang et al., 2004;Yan, 2014).

Repeatable GE vs. Unrepeatable GE
Mega-environment analysis is analysis of the G+GE patterns aiming at dividing a target region into meaningful subregions or mega-environments (subregions and mega-environments are used interchangeably in this article). Among the components of GE, GY and GLY are obviously unrepeatable because it is impossible to predict the environments of next year. It is possible, though, that some of the GL is repeatable as the soil and daylength at a location are fixed. Some management factors such as irrigation, fertilizer application, and fungicide application may also lead to repeatable GE (Cooper et al., 2021), which are lumped as "common management practices in the target region or mageenvironment" for simplicity. Assuming that the test locations can be divided into two or more groups (subregions), the variance for GL will be divided into variance for genotype by subregion interaction (σ 2 GS ), which is the repeatable part, and genotype by location interaction within subregions (σ 2 GL(s) ), which is the unrepeatable part, of GL (Atlin et al., 2000;Yan, 2016): and the number of test locations l will also be divided among the subregions: where l k is the number of test locations within subregion k. Subdivision of the target region into subregions will improve the overall heritability if the genotype by subregion interaction is sufficiently large, because the genotype by subregion interaction is converted into genotypic main effect within subregions when genotype evaluation is conducted by subregion: where H ′ rly is the entry-mean heritability when genotype evaluation is conducted by subregion. On the other hand, dividing a region into subregions may lead to reduced heritability within a subregion due to the smaller number of test locations (Equation 12). Thus, Atlin et al. (2000Atlin et al. ( , 2011 warned that subdivision of a target region should be avoided if genotype by subregion interaction is small relative to G. They used the genetic correlation between divided subregions and the undivided whole region (r G ) as a measure to decide whether the target region should be divided, which is defined as: They suggested that subdivision should be avoided if r G is high, although an explicit criterion was not given. In fact, the correlation between candidate subregions should be a more meaningful measure. Nevertheless, if a subregion is found to be distinct from other subregions, it should be treated as such; if a subregion is economically important, it is justifiable to increase the number of test locations within it to achieve a sufficiently high heritability or selection reliability. The merit of dividing a target region into meaningful subregions is to allow selection and deployment of subregion-specific cultivars to achieve a higher genetic gain within each subregion and thereby the whole region. Annicchiarico (2021) presented a recent example that selection for mega-environment specific cultivars increased genetic gains, in addition to a good review on the subject matter. An essential condition for dividing a target region into subregions is the presence of substantial crossover genotype by subregion interactions (discussed below).

How to Reveal Repeatable GE
To investigate whether heritability can be improved by dividing a target region into subregions, the prerequisite is a good hypothesis on how to divide the target region. Various approaches have been used in dividing a jurisdictional region into agroclimatic regions as reviewed in Yan et al. (2011). A poor hypothesis will lead to the false conclusion that the target region cannot be divided and thereby miss the opportunity to utilize the repeatable GE. For example, Atlin et al. (2000) hypothesized that western Canada (including Alberta, Saskatchewan, and Manitoba) and eastern Canada (including Ontario, Quebec, and Maritime provinces) were two barley mega-environments and rejected the hypothesis. Based on the same dataset, however, Yan and Tinker (2005) showed two clear mega-environments, with locations in Alberta and Saskatchewan as one mega-environment and locations in Manitoba and the eastern Canadian provinces as the other. For another example, in analyzing the data of a set of maize hybrids tested at 24 sites in six African countries in 2009, Atlin et al. (2011) hypothesized that each country was a mega-environment and concluded that there was no megaenvironment differentiation. However, a country is a political entity rather than an ecoclimatic region, so the hypothesis per se is questionable. A good hypothesis on mega-environment differentiation must be based on the G+GE patterns. Two methods have been developed to reveal repeatable GE patterns: GGE + GGL biplot analysis (Yan, 2014(Yan, , 2015(Yan, , 2016 and LG (location-grouping) biplot analysis (Yan, 2019;Yan et al., 2019bYan et al., , 2021.

GGE + GGL Biplot Analysis
As the definition of heritability in Equation 7 or Equation 8 suggested, data from multi-location, multi-year trials are required to conduct GGE + GGL biplot analysis, where GGE stands for G + GE (meaning fitting G + GE by principal components), and GGL for G + GL. In variety trials, usually the same set of genotypes are tested at all locations in a year but different sets of genotypes are tested in different years, because poor genotypes are dropped and new genotypes added each year. Consequently, multi-location, multi-year data are typically unbalanced. Nevertheless, usually a sizable number of common genotypes are tested in two or more consecutive years; this allows missing values in the genotype by environment two-way table to be imputed based on certain procedures (e.g., Yan, 2013). Data from such trials can be investigated using a GGE + GGL biplot (Yan, 2014(Yan, , 2015(Yan, , 2016, as shown in the example below. The yield data from the 2013 to 2019 Quebec Oat Registration and Recommendation trials are used here as an example (data available from the author upon request). Each year 41 to 46 registered oat cultivars or breeding lines were tested at eight to 10 locations. The locations represent three ecoclimatic zones of Quebec Yan, 2015). Zone 1 was represented by NDHY1 (St Hyacinthus) and STHU1 (St. Huber), Zone 2 by PRIN2 (Princeville), PINT2 (Pintendre), STAU2 (St. Augusta), and STET2 (St. Etienne), and Zone 3 by NORM3 (Normandin), HEBE3 (Hebertville), CAUS3 (Causapscal), and LAPO3 (La Pocatière); the number at the end of each location code indicates the zone it belongs. In addition, the trials were also conducted at OTT (Ottawa in Ontario), which is geographically close to Zone 1 of Quebec. A total of 118 genotypes and 67 trials (location by year combinations) were involved in these seven years, forming a 118-genotype by 67-trial two-way table, with 63% missing values. The first step of the analysis was to generate a GGE biplot containing the 118 genotypes and the 67 trials (Figure 3). The GGE biplot was constructed by the first two principal components from subjecting the trial-standardized genotype by trial two-way table to singular value decomposition, after proper singular value partition (Yan, 2002). The most obvious message from this fairly crowed biplot is that the trials placed on the upper portion of the biplot and those on the lower portion were negatively correlated, as indicated by the obtuse angles between them. This indicates existence of strong GE. The second step is to summarize the trials conducted at a location by a location marker, the placement of which is determined by the mean coordination of the trials (Figure 4). For example, the placement of the location LAPO3 (in red) was determined by the seven trials conducted at LAPO3, namely LAPO3_13, LAPO3_14, LAPO3_15, LAPO3_16, LAPO3_17, LAPO3_18, and LAPO3_19 (in black). The genotypes are represented by "+" for clarity. The biplot in Figure 4 is both a GGE biplot and a GGL biplot, thus the term GGE+GGL biplot.
In Figure 4 the 10 locations are clearly separated into two groups: group 1 include locations NORM3, HEBE3, CAUS3, PRIN2, PINT2, and STAU2 on the upper quadrant, and group 2 include locations NDHY1, STHU1, LAPO3, and OTT on the lower quadrant. Thereby the GE is divided into repeatable GE and unrepeatable GE. The genotype by location group interaction, i.e., the difference in the placement between the two location groups, is the repeatable GE; the genotype by trial interaction within groups, i.e., the variation in the placement among the trials within each of the two location groups, is the unrepeatable GE. The two location groups suggests two different mega-environments. All locations in mega-environment 1 (ME1) belong to Zone 2 or Zone 3 of Quebec; locations in megaenvironment 2 (ME2) consists of two Zone 1 locations, a Zone 3 location, and OTT. Thus, the mega-environment differentiation was largely, but not completely, consistent with the agroclimatic zones.
LG Biplot Analysis Presented in Figure 5 is the LG biplot based on the same dataset that was used to generate the GGE + GGL biplot (Figure 4). The steps to construct the LG biplot follows. First, a genetic correlation matrix among locations was calculated for each year.  Figure 3 to show two groups of locations or oat mega-environments (ME) in Quebec. Mega-environment 1 (ME1) consists of Zone 2 and Zone 3 locations PINT2, PRIN2, STAU2, STET2, CAUS3, HEBE3, and NORM3, and mega-environment 2 (ME2) includes locations NDHY1, STHU1, LAPO3, and OTT. The trials conducted at each location are presented as a cluster of trials, with the location name placed at the center and the individual trials, indicated by the last two digits of the year, placed around it, and the trials are connected to the center with straight lines. Two locations are regarded as belonging to the same mega-environment if their clusters overlap; they are regarded as belonging to different mega-environments otherwise. The variation in the placement of the locations between mega-environments represents repeatable GE and the variation in the placement of the trials within mega-environments represents unrepeatable GE. The genotypes are displayed as "+" in blue for clarity.

FIGURE 5 |
LG biplot to show two oat mega-environments in Quebec. PC1 and PC2 are the first two principal components from singular value decomposition of the location by trial two-way table of correlations, without centering ("Centering = 0") or scaling ("Scaling = 0"). The LG biplot is a location by trial biplot, with the locations presented in blue and the trials in red. The trials conducted at each location are presented as a cluster of trials, with the location name placed at the center and the individual trials, indicated by the last two digits of the year, placed around it. The trials are connected to the location with straight lines. Two locations are regarded as belonging to the same mega-environment (ME) if their clusters overlap; they are regarded as belonging to different mega-environments otherwise. The same two MEs (ME1 and ME2) shown in Figure 4 are shown in this LG biplot. The variation in the placement of the locations between mega-environments represents repeatable GE and the variation in the placement of the trials within mega-environments represents unrepeatable GE.
Second, the yearly correlation matrices were stacked to form a location by trial two-way table of correlations, each trial being a location-year combination. Third, the location by trial table was submitted to singular value decomposition, without entering or scaling ("centering = 0, scaling = 0"). Fourth, the resulting first two principal components were used to construct a location by trial biplot. Fifth, as in the GGE + GGL biplot, the trials conducted at a location in different years were summarized by a location marker defined by the mean coordination of the trials. Finally, the trials at a location are displayed as a cluster, with the location marker as the center and the trials in different years as members; the trials and the location are connected with straight lines. If the clusters of two locations overlap, they are regarded as belonging to the same mega-environment; if they do not overlap, they are considered as belonging to different mega-environments. In the LG biplot, the variation among trials and locations within a mega-environment, i.e., the variation among trials at overlapping locations, represents unrepeatable GE; the variation between two mega-environments represents repeatable GE. It can be seen that the same two mega-environments shown in the GGE+GGL biplot (Figure 4) are clearly separated in the LG biplot ( Figure 5); thus, the two approaches are functionally equivalent or similar. Importantly, the LG biplot has the advantage that it does not require any common genotypes to be tested in different years. Therefore, it can be used to reveal repeatable GE and delineating megaenvironments using multi-year trial data in which completely different sets of genotypes are tested in different years .
A general comment on the use of biplots follows. A 2-D biplot is usually used for data visualization for convenience and on the understanding that the most important patterns in the data are captured by the first two principal components. However, there may be cases where some important patterns exist in higher order principal components. This is usually indicated by the presence of vectors that are obviously shorter than others. When this is the case, variation not displayed in the biplot can be explored by biplots displaying a subset of the data. A recent example can be found in Yan et al. (2021).

Utilization of Repeatable GE by Selecting Mega-Environment Specific Cultivars
The approach to utilizing repeatable GE is to select separately for each mega-environment, preferably using the mean vs. stability view of the GGE biplot (Figure 6). The red line with a single arrow passes through the biplot origin and the mean environment (which has mean coordination of all environments) and is referred to as the average environment axis (AEA) or GGE-Mean axis; the arrow points to higher mean yield. The blue line with arrows at both ends points to greater instability in either direction; it can be referred as the GGE-stability axis (Yan, 2001;Yan and Kang, 2002;Yan and Tinker, 2006). This is an extended application of the inner-product property of a biplot (Gabriel, 1971). Thus, the three highest yielding cultivars in ME1 across 2013-2019 were Akina > Nicolas > Nice (Figure 6A), and those for ME2 were Nicolas > Akina > Richmond ( Figure 6B). Therefore, the repeatable GE can be utilized by recommending different sets of cultivars in ME1 and ME2.
The similarity/dissimilarity in cultivar ranking between ME1 and ME2, along with that in the undivided whole region, are further presented in the which-won-where view of the GGE biplot in Figure 7. The polygon or which-won-where view of the GGE biplot (Yan et al., 2000) is also an extended application of the inner-product property of a biplot (Gabriel, 1971). The polygon was formed by connecting the genotypes that are placed away from the biplot origin in all directions. For each polygon side a line perpendicular to it was drawn from the biplot origin. These lines dissect the biplot into sectors. For each sector, the genotype at the vertex is the nominal highest yielder for the environments or mega-environments fell in it. In this case, Akina  Figure 4 and associated text for the definitions of ME1 and ME2. PC1 and PC2 are the first two principal components from singular value decomposition of trial-standardized yield data ("Centering = 2," "Scaling = 1"). The singular values were entirely partitioned to the genotypic scores ("SVP = 1") for proper genotype evaluation. The trials are represented by "o" for clarity. The red line with a single arrow is the average environment axis (AEA), the arrow pointing to higher mean yield. The blue line with arrows on both ends is the instability axis, the arrows pointing to greater instability in either direction.
was the highest yielder in ME1 while Nicolas was the highest yielder in ME2 and "ALL, " indicating crossover genotype by subregion interaction. On the other hand, ME1 was placed close to the radiate line labeled "1, " which separates ME1 from ME2; this means that Akina had higher yield than Nicolas in ME1 but not by much. The two mega-environments were moderately correlated (r = 0.652; Figure 7) and shared Akina and Nicolas as the top two yielding cultivars, though in a reversed order. Thus, the two oat mega-environments in Quebec were classified as sub mega-environments within one of the three major oat mega-environments in Canada .

ACCOMMODATION OF UNREPEATABLE GE THROUGH ADEQUATE TESTING
The solution to accommodating unrepeatable GE is to test adequately within a target mega-environment, i.e., to test at a sufficiently large number of locations in a sufficiently large number of years with sufficiently large number of replicates so as to sufficiently sample the environments and to achieve a sufficiently high heritability as defined in Equation 7 or Equation 8. It is obvious that more replicates, more locations, and more years will lead to a higher heritability. The solution to identify widely adapted cultivars (within a meg-environment) is to "test widely" (Troyer, 1996). However, each additional replicate, location, or year involves considerable cost. As a compromise between selection reliability and test cost, the concept "adequate test" was proposed and defined Yan, 2016).
The terms "adequate testing, " "optimum testing, " and "minimum testing" are used interchangeably in this article to indicate that a minimum level of testing in terms of years, locations, and replicates must be conducted to achieve sufficiently reliable selection. When tested inadequately, the selection intensity must be lowered according to the achieved heritability, to prevent superior genotypes from being mistakenly discarded. The "optimum" level of replicates, locations, or years was defined as one to achieve a heritability of 0.75, based on examining a heritability response curve  Figure 2 this article). However, Cobb et al. (2019) suggested that a heritability of 0.5 was sufficient for reliable selection of the best 10 individuals to be used to start the next breeding cycle.

Optimum Number of Years
Based on Equation 7 and assuming neglectable GL and experimental error or unlimited number of locations and replicates, the minimum number of years required to achieve a heritability of 0.75 can be estimated by For example, based on the yield data from three-year spans of Quebec provincial oat tests, the estimated minimum number of years to achieve a heritability of 0.75 was from 1.2 to 6.3 and averaged 3.2 ( Table 2), while the officially required number of years to register a cultivar is three. So, the requirement for FIGURE 7 | The which-won-where view of the GGE biplot to show the relative yield of 13 oat cultivars in mega-environment 1 (ME1), mega-environment 2 (ME2), and the undivided Quebec oat growing regions (ALL). The polygon was formed by connecting the genotypes that are placed away from the biplot origin in all directions. For each polygon side a line perpendicular to it was drawn from the biplot origin. These lines dissect the biplot into sectors. For each sector, the genotype at the vertex is the nominal highest yielder for the environments or mega-environments fell in it. In this case, Akina was the highest yielder in ME1 while Nicolas was the highest yielder in both ME2 and "ALL." three years of testing was adequate and appropriate in general. More years of testing were required for the 2016-2018 and the 2018-2020 ( Table 2) spans due to reduced genetic variability and therefore achieved heritability.

Optimum Number of Locations
Yearly multi-location trials are usually balanced as the same set of genotypes are tested at all locations. Therefore, it is convenient to use yearly data to estimate the number of locations required for adequate testing. Assuming an infinite number of replicates or negligible experimental error, the heritability within a year (Equation 10) can be reduced to where H rl,max is the maximum achievable within-year heritability (Yan, 2021). Based on this equation, the minimum number of locations required to achieve a heritability of 0.75 can be estimated by Yan, 2021) The minimum number of locations so estimated is expected to differ with the year. Therefore, it should be estimated for a number of years to achieve a good understanding on the required number of test locations for a target mega-environment . Presented in Table 2 are the estimated yearly minimum number of locations based on the 2013-2019 Quebec provincial oat trial data for the two mega-environments as well as for the undivided Quebec oat growing region. When estimated for the undivided region, the mean number was 8.4, in comparison to the actual number of locations of 9.6. Thus, the number of locations used was more than adequate in most years. Interestingly, when estimated for each mega-environment, the estimated minimum number was ∼one location more than that actually used (7.3 vs. 5.9 for ME1 and 4.6 vs. 3.7 for ME2). Thus, even though there is a clear mega-environment differentiation, the trials in one mega-environment still provided useful information to selection for the other, because the two mega-environments were positively correlated (Figure 7). In contrast, the southern vs. northern oat mega-environments in eastern Canada were uncorrelated, and as a result, the total required number of locations was smaller when estimated separately for each mega-environment than that when estimated for the undivided region . In a Canada-wide study, the southern vs. northern mega-environments in eastern  Figure 4 or Figure 5 for the definition of mega-environment 1 (ME1) and mega-environment 2 (ME2); b The estimated number for ME1 + ME2 is the sum of the estimated number for ME1 and that for ME2.
Canada were designated as ME1 and ME2, respectively, while the two Quebec mage-environments in Figure 4 or Figure 5 were designated as ME2a and ME2b . Given the results in Table 3, cultivar recommendation for each of the two Quebec mega-environments should consider performance both within the mega-environment and across the whole region, as shown in Figure 7.

Optimum Number of Replicates
Several classic studies investigated the optimum numbers of years, seasons, test locations, and replicates for the allocation of a fixed number of plots or fund according to the relative magnitudes of various variance components (Sprague and Federer, 1951;Hanson and Brim, 1963;Wricke and Weber, 1986;Swallow and Wehner, 1989). Conclusions from this "resource allocation" approach inevitably led to the suggestion to maximize the number of locations and/or years and to minimize the number of replicates (i.e., to use a single replicate) (McCann et al., 2012;Schmidt et al., 2018). However, this conclusion applies only when it is possible to increase the number of locations and/or years. For a breeding program or a regional crop recommendation committee, yield trials are conducted every year at a more or less fixed number of locations. Researchers need to know the minimum or optimum number of replicates under this scenario. To answer this question, an equation was derived from the definition of heritability on the single trial basis, in the form of Equations 15 and 17 . More recently, another equation was developed for estimating the optimum number of replicates in a multi-location context (Yan, 2021): where r l is the optimum number of replicates given the number of locations l, and H rl, max is as defined in Equation 16. Equation  18 shows that the required number of replicates is determined by the relative magnitude of experimental error variance, is modified by the number of locations in a non-linear manner, because an increase in the number of locations also improves H rl, max (Equation 16). Applying this equation to the yield data of some oat trials conducted across Canada, it was determined that two replicates would suffice to identify the highest yielding oat cultivars (Yan, 2021). Applying this equation to the 2015-2019 yield data of barley, oat, spring wheat, and winter wheat trials conducted in Ontario also led to the conclusion that two replicates would suffice (Yan et al., 2000). It is recommended that similar analysis be conducted for other crops and regions. Regional variety trials are usually conducted with three or four replicates. Reducing the number of replicates to two, if supported, can substantially reduce the evaluation cost or allow more genotypes to be evaluated with the same cost. Importantly, reliable estimation of the various variances are a prerequisite to accurate estimation of the optimum number of years, optimum number of locations, and optimum number of replicates for adequate testing (Arief et al., 2015).

Adjust for Spatial Variation
The discussions on optimum testing and optimum replication above assumed that the field and management are uniform within each trial. However, spatial variation within trials has been recognized as a major source of experimental error. Traditionally it is controlled by blocking, i.e., dividing a replicate into blocks, such as the so-called incomplete blocks design (R.A. Fisher, from Street, 1990). This is referred to "dealing with spatial variation by design." In the last three decades, spatial analysis and adjustment becomes an increasingly popular research subject and a routine practice in the analysis of crop variety trials. The use of spatial analysis makes experimental design more flexible.
In a variety evaluation trial, g genotypes are usually allocated into a rectangular field of b rows (blocks) and c column (plots). The observed value in a plot, Y ij , is, therefore, combined effects of the row, the column, the genotype, and the experimental error: µ being the mean of the trial. The effects of rows and columns can be modeled by various spatial analysis techniques (Cullis and Gleeson, 1991;Gilmour et al., 1997;Cullis et al., 1998;Burgueño et al., 2000;Qiao et al., 2000;Yang et al., 2004). Spatial analysis is a within-trial analysis so it is also referred to "local analysis" (Kempton et al., 1994;Grondona et al., 1996). A straightforward and intuitive approach is to use a polynomial regression to model any trend across the plots within each block (Yan, 2014), which is routinely used in the Ottawa oat breeding program. The order of the polynomial regression can be set according to the block size of the block. An iterative procedure can be used to adjust the Y ij values so as to minimize the experimental error. Adjusted genotypic values are then calculated from the adjusted plot values at the final iteration. Spatial adjustment based on polynomial regression usually leads to reduced trial coefficient of variation and increased trial heritability (Yan, 2014). The plot values, and thus the genotypic values, will not be altered if no spatial trend is found. An example of spatial trend adjustment for a block of 36 plots in an oat trial conducted in Ontario in 2019 is presented in Figure 8. This procedure can also be used to fill missing plot values.

Genomic Selection: To Replace Multi-Environment Evaluation With Multi-Model Prediction
Some researchers believe that genomic selection will eventually replace breeders' visual selection and even alter the role of yield trials in making selection and recommendation decisions (Heffner et al., 2009;Jannink et al., 2010). Indeed, encouraging results of genomic selection have started to emerge as advanced genotyping, bioinformatics, and genomic modeling procedures have become available (e.g., Tinker et al., 2016;Bekele et al., 2018). The confidence on genomic selection comes from two aspects. First, genome-wide markers can sufficiently capture the genotypic variability of a relevant breeding population tested in a relevant environment (i.e., a trial at a location in a year). That is, the genetic variability of the breeding population observed in the trial can be accurately captured by a genomic model. Second, genomic models can be developed for a large number of trials that sufficiently to fully represent the target mega-environment. Assuming m models have been developed from m trials covering multiple locations and years, then predicting the performance of a breeding population using m models would be equivalent to testing the breeding population in m trials. This represents a great advantage of genomic selection over conventional selection because in a practical breeding program, it is impossible to test a large breeding population in replicated trials, let alone at multiple locations in multiple years. Assuming a genotype by model twoway table of predictions for a breeding population, the achievable heritability with genomic selection, H m , can be estimated by where σ 2 GM is the variance for genotype by model interaction. As m increases, the achievable heritability with genomic selection and hence selection reliability can become much higher than what is achievable by conventional selection (Yan et al., 2019b). From this viewpoint, genomic selection is potentially a much more effective approach to dealing with unrepeatable GE.
On the other hand, yield trials aiming at genomic model development are large and expensive; it remains a question as to when model development can be considered complete and such trials are no longer needed (Yan et al., 2019b). If the year factor is completely random, then genomic model development may be considered complete at some point when sufficient data (years and locations) have been obtained. However, if there is a trend in climatic change, data from recent years would be more relevant for predicting future-year performances, and the trials must continue. It is also a question whether routinely conducted yield trials with a limited number of entries can be used to replace the large-scale trials for genomic model development or refinement, although it was so suggested (Heffner et al., 2009). Cost efficiency will continue to be a determinant factor to the application of genomic selection in plant breeding.

SELECTING FOR MEAN PERFORMANCE AND SELECTING AGAINST INSTABILITY
Superior cultivars must demonstrate high and stable yield cross the target mega-environment. Cultivars yielded well in some environments but poorly in others, relative to other cultivars, are said to be unstable and undesirable as they can cause unbearable losses to growers. Various stability indices have been developed in order to quantify instability. Lin et al. (1986) reviewed nine stability or instability parameters and classified them into four groups. More parameters were proposed after that (e.g., Huehn, 1990). The initial idea of stability analysis was to select against unstable genotypes rather than to select for stable genotypes. This idea was somehow twisted to treating a stability index as a positive trait, which caused confusion among researchers. It would be less confusing to call these stability indices as instability indices. The large number of indices are also confusing to practical breeders. The purpose of this section is to reinstall the original idea of stability analysis and to clear up the confusions.
First, a stability index should reflect a genotype's susceptibility to GE, because it is GE that caused its unstable performance across environments. The numerous stability parameters may be classified according to its composition in terms of G, E, and GE. In the classification of Lin et al. (1986), stability indices in Group B involve GE only; they are suitable instability parameters. Indices in group A involve both E and GE; they are not suitable parameters because GE is confounded with E. The linear regression coefficient b against E in Eberhart and Russell (1966) (Group C) is a genotype's response to E; its usefulness depends on how well the linear regression fits the data (Lin et al., 1986), which is usually poor . When the fit is sufficiently good, b = 1 means stable, b > 1 means good performance in high-yielding environments, and b < 1 means good performance in low-yielding environments (Ceccarelli, 1989). Deviation from the linear regression (Group D) is merely a measure of the goodness of fit of the linear regression and is not a useful measure of stability.
Second, stability analysis is a concept of selection within a mega-environment. So, it should be conducted within megaenvironments rather than across mega-environments, unless the mega-environments are highly correlated. In such cases, the mega-environments should be merged and treated as a single mega-environment.
Third, a stability index representing GE must be used in combination with the mean performance (G), thus the term G+GE or GGE (Yan et al., 2000(Yan et al., , 2007. High stability (less GE) is desirable only when combined with high mean performance. High stability is least desirable when combined with low mean yield because it means consistently low yielding (Yan et al., 2007). Parameters or procedures combining both G and GE include the superiority index of Binns (1988, 1994) and the stability index of Kang (1993). In addition, several graphical methods also combines G and GE. These include the AMMI1 biplot  and the AEA view of the GGE biplot (Figure 6; Yan, 2001). Zobel et al. (1988) showed that AMMI analysis was superior to the joint regression of Eberhart and Russell (1966) and Alwala et al. (2010) concluded GGE biplot analysis to be a better platform than the joint regression. There is some debate on whether AMMI or GGE is a better approach in analyzing yield trial data (Gauch, 2006;Yan et al., 2007;Gauch et al., 2008;Yan, 2011). Many studies compared AMMI and GGE biplot analysis and concluded that GGE biplot analysis was superior (e.g., Badu-Apraku et al., 2012;Amira et al., 2013;Hoyos-Villegas et al., 2016;Oliveira et al., 2019). AMMI analysis was advocated as a means to separate "signal" (true GE) from "noise" (error) in GE and a means to use the GE-signal to adjust the genotypic means Gauch, 2013). AMMI1 or AMMI2 (i.e., main effects plus the first one or two principal components of GE) is often found the best AMMI model, and the AMMI1 biplot is often used as a visual tool for genotype evaluation (Gauch, 2006). Unfortunately, the AMMI1 biplot is not an effective graphical presentation of G + GE because G is often masked by the much larger E in it, because its G and GE axes are in different units, and because it does not have the inner-product property of a true biplot (Yan, 2011). Similar to AMMI analysis, GGE biplot analysis can also be viewed as a means to separate signal from noise. In a two-dimensional GGE biplot such as that in Figure 6, the first two principal components are considered as signals and the higher dimensions as noise. This GGE biplot displays the amount of G+GE in between that of AMMI1 and AMMI2; so, it should be close to the best model in most cases. In the AEA view of the GGE biplot (Figure 6), the GGEmean axis represents the GE-adjusted genotypic means, while the GGE-stability axis represents the genotypes' susceptibility to unrepeatable GE (instability). Both axes pass through the biplot origin and are perpendicular to each other, meaning that they are independent parameters. Thus, the AEA view of the GGE biplot is a convenient tool for visual analysis of genotype-by-environment data and for visual selection for mean performance and against instability.
Finally, "test adequately" is much more important than any stability analysis. When tested adequately, genotypes with high mean performance should also be genotypes that are relatively stable, because it is not possible for a highly unstable genotype to achieve very high mean performance. However, when not tested adequately, as indicated by a low heritability across locations and years, neither the estimated mean nor the estimated stability is reliable, and a low selection intensity or culling rate must be applied. In such cases, selection should be mainly on mean performance, rather than on stability. Instead, effort should be made to understand the causes of the instability for a highyielding genotype. For example, severe lodging may be the reason for its low yield in a severely lodged trial. If severe lodging rarely occurs in the target region, then the genotype is expected to show good mean yield and stability when tested adequately; if lodging is a common yield-limiting factor in the target region, then the genotype is expected to have low mean yield when tested adequately.

SELECTION FOR MULTIPLE TRAITS
While geneticists can focus on a single trait and ignore others, breeders must deal with multiple traits. In addition to high yield, which is always the most important breeding objective, a cultivar must meet a minimum requirement for each and every trait that is important to the relevant growers, processors, and end-users. In fact, the greatest challenge in plant breeding is to combine all desirable traits in a single genotype, because key breeding objectives are often adversely associated, due to either genetic linkage or pleiotropy (e.g., Tanksley, 1983;Yan and Wallace, 1995;Asins, 2002;Cooper et al., 2009;Hao et al., 2014;Crespo-Herrera et al., 2016). Strategies for multi-trait selection include independent culling and index selection (Simmonds and Smartt, 1999;Yan and Frégeau-Reid, 2008;Yan, 2014). Independent culling is to cull all genotypes that fail to meet the minimum requirement for any breeding objective, because such genotypes will not be accepted as cultivars. Index selection is to rank genotypes based on an index that is composed to reflect the perceived economic values of the genotypes. Independent culling can be implemented at all stages in the breeding cycle but it is more important in the early breeding stages when multi-location yield trials are not possible. Index selection is mainly implemented in the yield trial stage, at which all important traits can be determined. For cultivar developmentoriented genomic selection, both independent culling and index selection should be conducted.

Independent Culling
Independent culling is important to ensure that selected high yielding genotypes will be accepted by growers and end-users; it is also an effective approach to reduce the breeding population size safely and speedily. Assume that t is the number of independently inherited breeding objectives, each with a heritability h 2 k , with k = 1, . . . t. If h 2 k is used as the culling rate for trait k, then the joint culling rate would be: and the number of genotypes must be retained to ensure that the best genotype is selected is: N/n being the proportion of the breeding population that must be retained. For example, assume the culling rate for each of five traits is 0.3, then, according to Equation 21, the joint culling rate would be 0.83, and the retaining rate would be 0.17 or 17%. Therefore, a large proportion of the population can be safely culled by independent culling if multiple traits are considered, even though the heritability or culling rate is low for each trait. This explains the effectiveness of visual selection (culling) by an experienced breeder, who can visualize and select on many traits simultaneously. Genomic selection for oat yield in eastern Canada proves effective (Bekele, Tinker, and Yan, unpublished results); it should also be effective for other traits that are more simply inherited than yield. Therefore, independent culling based on genomic models is expected to be more accurate than visual selection by even the most experienced breeder. If the traits under consideration are positively correlated, the overall culling rate would be lower than when they are independent; the overall culling rate would be higher if the traits are negatively correlated, which is often the case. The overall culling rate can be much higher if some of the target traits are simply inherited and less affected by GE and experimental error, for example, oil content in oat (Hizbai et al., 2012;Yan et al., 2016).

GYT (Genotype by Yield * trait) Analysis
A large portion of the genotypes that survived independent culling should be qualified as a cultivar if they are sufficiently high yielding. Therefore, the focus of selection following independent culling should be on yield although other target traits should also be considered. Selection based on a selection index is the common method for selection on multiple breeding objectives (note but: not any traits). Here, the GYT (genotype by yield * trait) analysis (Yan and Frégeau-Reid, 2018;Yan et al., 2019a) is recommended over the traditional index selection.
In traditional index selection, the superiority of genotype i, P i , is calculated as where t is the number of breeding objectives that are to be selected in addition to yield, y i is the standardized yield for genotype i, w 0 is the weight assigned for yield, w j is the weight assigned for trait j, and x ij is the standardized value of genotype i for trait j.
In the GYT approach, the superiority of a genotype may be presented as The selection index for a genotype is usually presented as the standardized value of P i . The difference between the traditional selection index (Equation 23) and the GYT approach (Equation 24) follows. In traditional index selection, the weight for a trait other than yield is a fixed value for all genotypes, while in the GYT approach it varies with the yield level of each genotype. In traditional index selection the emphasis is on the levels of the traits; in the GYT approach it is on the levels of yieldtrait combinations. The GYT concept is better in reflecting the economic value of a trait. For example, superior lodging resistance (or high protein) has little value in a low yielding genotype but it is highly valuable in a high yielding genotype. Consequently, based on the traditional selection index, a low yielding genotype may be ranked the highest due to its superior levels in other traits; such genotypes will not be accepted as cultivars by growers, however (Yan et al., 2019a). This problem can be prevented with the GYT approach, as the genotypes ranked highest will always have high yield levels.
Another advantage of the GYT approach is that the superiority and the trait profiles of the genotypes can be visually investigated in a biplot, referred to as GYT biplot (Yan and Frégeau-Reid, 2018;Yan et al., 2019a,b). As an example, the mean values of the 13 oat cultivars for eight important traits from the 2013-2019 Quebec oat trials are presented in Table 4, ranked by their GYT index. The steps to construct a GYT biplot follow. First, standardize the genotype by trait table for each trait. Second, multiply yield with each trait to form a genotype by yield-trait combination two-way table. Third, subject the weighted genotype by yield-trait two-way able to singular value decomposition to obtain the principal components (PC). Fourth, multiply each of the yield-trait combination PC scores with the assigned weight. And finally, construct a biplot using the genotypic and trait combination scores of the first two principal components based on the yield-trait combination preserving singular value partition. Note that for milling oat a higher value is more desirable for all the traits listed in Table 4 except lodging and oil content. For these two traits a smaller value is more desirable; they were therefore given a weight of "−1." The information contained in Table 4 can be visualized in a GYT biplot (Figure 9). The biplot clearly shows the rank of the genotypes in their GYT index, i.e., Nicolas > Akina > Kara > Richmond >. . . > Avatar. Thus, Nicolas and Akina | GYT biplot to display the yield-trait combinations of 13 oat cultivars tested in the 2013-2019 Quebec provincial oat trials. The biplot was based on singular value decomposition of yield-trait combination standardized data ("Centering = 2, Scaling = 1"). The red line with a single arrow is the average yield-trait axis, the arrow pointing to higher GYT index. The blue line with arrows on both ends indicate contrasting trait profiles of the genotypes. For example, it showed Richmond to be strong yield-oil combination but weak in yield-β-glucan combination, while Kara had the opposite trait profile to that of Richmond.
should be selected and recommended to the Quebec oat growers without hesitation; they are in fact the most important two cultivars in Quebec. The biplot also shows the trait profiles of the genotypes. For example, it shows that Richmond is superior in having a low oil content but inferior in having a low βglucan content. In fact, all cultivars placed above the red line (the GYT index axis) have relatively low oil and low β-glucan whereas the opposite is true for cultivars placed below the GYT index axis. "Y * Oil(−1)" and "Y * Lodging(−1)" indicate that oil content and lodging score were given a weight of "−1" because high oil content and high lodging are undesirable for milling oat (Figure 9). The GYT biplot approach has been adopted in multi-trait selection for various crops (Boureima and Abdoua, 2019;de Oliveira et al., 2019;Hamid et al., 2019;Mohammadi, 2019;Gouveia et al., 2020;Mahmoud et al., 2020;Merrick et al., 2020;Badu-Apraku et al., 2021;Sofi et al., 2021;Tsenov et al., 2021;Xu et al., 2021).

CONCLUSIONS
Plant breeding plays a key role in meeting the increasing need for food, fiber, health, and comfort and in combating the adverse impacts of the changing climate. Plant breeding consists of two stages: breeding population development and progeny selection. For cultivar development, population development is more important than progeny selection but has largely been neglected in the literature. Hence, a "complete breeder's equation" was presented, which contains three key parameters: the population mean, the population variability, and the achieved heritability under the multi-location, multi-year framework. The value of a breeding population is measured by both the population mean and the population variability. For progeny selection, the key is to improve the heritability, i.e., selection reliability. Three aspects were identified to improve heritability: utilizing repeatable GE through mega-environment analysis, accommodating unrepeatable GE by adequate testing, and adequate replication and adjusting for spatial variation. Procedures for mega-environment analysis include GGE + GGL biplot analysis and LG biplot analysis. Adequate testing includes estimation and use of an optimum number of years, locations, and replicates. Cultivar evaluation within a mega-environment should select for mean performance and select against instability, with GGE biplot analysis being a preferred graphical method. A stability index is meaningful only when combined with high mean yield. Adequate testing is more important than any stability analysis. Last but not least, cultivar development must consider multiple traits; both independent culling and index selection are essential. GYT biplot analysis is a preferred method for index selection. In addition, genomic selection is an alternative and potentially more effective approach in all stages and aspects of cultivar development if reliable models are developed and if it can be done cost-efficiently.

DATA AVAILABILITY STATEMENT
The original contributions presented in the study are included in the article/supplementary materials, further inquiries can be directed to the corresponding author.

AUTHOR CONTRIBUTIONS
The author confirms being the sole contributor of this work and has approved it for publication.