Extending the breeder’s equation to take aim at the target population of environments

A major focus for genomic prediction has been on improving trait prediction accuracy using combinations of algorithms and the training data sets available from plant breeding multi-environment trials (METs). Any improvements in prediction accuracy are viewed as pathways to improve traits in the reference population of genotypes and product performance in the target population of environments (TPE). To realize these breeding outcomes there must be a positive MET-TPE relationship: the trait variation expressed within the MET data sets used to train the genome-to-phenome (G2P) model for genomic prediction must be consistent with the realized trait and performance differences in the TPE for the genotypes that are the prediction targets. The strength of this MET-TPE relationship is usually assumed to be high; however, it is rarely quantified. To date, investigations of genomic prediction methods have focused on improving prediction accuracy within MET training data sets, with less attention to quantifying the structure of the TPE and the MET-TPE relationship and their potential impact on training the G2P model for applications of genomic prediction to accelerate breeding outcomes for the on-farm TPE. We extend the breeder’s equation and use an example to demonstrate the importance of the MET-TPE relationship as a key component for the design of genomic prediction methods to realize improved rates of genetic gain for the target yield, quality, stress tolerance and yield stability traits in the on-farm TPE.


Introduction
Plant breeding is grounded in prediction (Goldman, 2000; Duvick, 2001; Cooper et al., 2014a; Voss-Fels et al., 2019; Kholová et al., 2021). Plant breeding programs are the operational implementation of coordinated sequences of prediction methods, organized to continuously create, evaluate, and select new genotypes over multiple breeding program cycles (Duvick et al., 2004; Moose and Mumm, 2008; Cobb et al., 2019; Technow et al., 2021). The cycles are designed to iteratively improve on the outcomes from previous cycles. Breeding objectives are framed to develop product outcomes (varieties, hybrids, clones, populations; Fehr, 1987a; Fehr, 1987b). These products are to be used by farmers within the Genotype-by-Environment-by-Management (GxExM) context of agricultural systems of the target population of environments (TPE), which includes the biophysical environment and the agronomic management practices adopted by farmers (Ceccarelli, 1989; Ceccarelli, 1994; Duvick et al., 2004; Chenu et al., 2011; Persley and Anthony, 2017; van Etten et al., 2019; Ceccarelli and Grando, 2020; Cooper et al., 2020; Cooper et al., 2021; Kholová et al., 2021; Ronanki et al., 2022; Zhao et al., 2022). Through successful adoption and use of the improved products by farmers, together with appropriate agronomic management practices, breeding programs can improve food productivity and so contribute to enhanced global food security. However, there are many persistent gaps documented between the current levels of crop productivity in agricultural systems and the targets required to achieve food security (van Ittersum et al., 2013; van Ittersum et al., 2016; Kholová et al., 2021). Thus, there is continued interest in improving the design of breeding programs to target the creation of new products to help close yield gaps (van Etten et al., 2019; Ceccarelli and Grando, 2020; Cooper et al., 2020; Kholová et al., 2021; Messina et al., 2022a).
Application of genomic prediction technologies has emerged as a major theme of breeding program design in the 21st century (Meuwissen et al., 2001; Bernardo and Yu, 2007; Heffner et al., 2009; Cooper et al., 2014a; Voss-Fels et al., 2019; Rogers et al., 2021; Varshney et al., 2021). Here we discuss and extend the "breeder's equation" as a framework to help evaluate opportunities to enhance genomic breeding outcomes through enhanced design of METs to provide the relevant training data sets with the required MET-TPE alignment (Cooper et al., 2014a; Cooper et al., 2014b; Gaffney et al., 2015; González-Barrios et al., 2019; Rogers et al., 2021; Smith et al., 2021a; Smith et al., 2021b). Attention to improving the MET-TPE alignment, as a criterion for the design of MET training data sets, provides the foundation for effective use of environmental covariates, crop models and high-throughput phenotyping in combination with genome-to-phenome (G2P) modelling algorithms to predict GxExM interactions and enhance application of genomic prediction for the TPE (Cooper et al., 2014a; Cooper et al., 2014b; Gaffney et al., 2015; Messina et al., 2018; Diepenbrock et al., 2021; Messina et al., 2022a).

Theoretical development
2.1 Breeder's equation
The basic form of the "breeder's equation" provides a framework to predict the response to selection ($\Delta G$) from one cycle ($L$) of a breeding program, following application of a selection strategy (Moose and Mumm, 2008; Cobb et al., 2019). Here we consider selection strategies that incorporate applications of genomic prediction (Meuwissen et al., 2001; Bernardo and Yu, 2007; Heffner et al., 2009; Cooper et al., 2014a; Voss-Fels et al., 2019). Selection pressure is implemented by applying truncation selection to the distributions of observed or predicted values for one or more traits within the reference population of genotypes (RPG) of a breeding program; for example, selection to increase crop yield, improve grain quality and improve abiotic and biotic stress tolerances to reduce the extent of yield losses due to the occurrence of the frequent stresses in the TPE (Chenu et al., 2011; Kholová et al., 2013; Hajjarpoor et al., 2021; Messina et al., 2022a). The structure of the breeder's equation has a long history in animal and plant breeding (Lush, 1937; Hallauer and Miranda, 1988; Nyquist and Baker, 1991; Comstock, 1996; Moose and Mumm, 2008) and is frequently used as a quantitative framework for the design and optimization of crop breeding programs (Araus and Cairns, 2014; Araus et al., 2018; Cobb et al., 2019; Voss-Fels et al., 2019; Kholová et al., 2021). For applications of genomic prediction, a common form of the breeder's equation is given as:

$$\Delta G = \frac{i \, r_a \, \sigma_a}{L} \qquad (1)$$

where $i$ represents the selection differential applied to the selection units, based on the trait variation within the RPG, $r_a$ represents the prediction accuracy for breeding values for the selection units within the RPG, and $\sigma_a$ represents the additive genetic variation among the selection units within the RPG for the traits that are targeted for improvement by selection. For genomic breeding, the quantification of prediction accuracy $r_a$ is based on G2P models for traits that are constructed using suitable training data sets.
These G2P models are created algorithmically using the genetic marker fingerprints and trait phenotypes for the genotypes included in breeding multi-environment trials (METs) used as training data sets (Meuwissen et al., 2001; Crossa et al., 2017; Messina et al., 2018; Diepenbrock et al., 2021). The foundation of the MET training data sets is typically based on data collected from the relevant stages of the breeding program (Cooper et al., 2014a; Voss-Fels et al., 2019; Smith et al., 2021a). Environmental covariates and model-based characterizations of the sample of environments present in the MET can be used to create environmental predictors to be included in the G2P model. These environmental predictors provide a basis to adjust genomic predictions of genotype breeding value and performance for different environments to account for effects of GxE interactions (Jarquín et al., 2014; Crossa et al., 2017; Messina et al., 2018; de los Campos et al., 2020; Diepenbrock et al., 2021). Importantly, the samples of environments included in the METs are considered to represent the environmental composition of the TPE (Comstock and Moll, 1963; Nyquist and Baker, 1991; Cooper and DeLacy, 1994; Chenu et al., 2011). The environmental composition of the METs can be augmented in many ways using specifically designed field-based and controlled-environment experiments (Cooper et al., 1995; Cooper et al., 1997; Campos et al., 2004; Cooper et al., 2014a; Cooper et al., 2014b; Rebetzke et al., 2013; van Eeuwijk et al., 2019; Langstroff et al., 2022). Many assumptions are made when applying the breeder's equation, as represented by equation (1). We consider some of these assumptions in more detail as they relate to the prediction of response to selection for improved on-farm performance within the TPE.
We focus on the influence of the MET-TPE relationship in the presence of GxE interactions within the TPE of the breeding program and use this as the basis for deriving the extended breeder's equation introduced below.
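As a minimal numerical sketch of equation (1), the snippet below assumes the common form in which the response per cycle is the product of the selection differential i, the prediction accuracy r_a and the additive genetic standard deviation sigma_a, divided by the cycle length L. The function name `genetic_gain` and all input values are hypothetical and chosen only for illustration; they are not taken from any data set discussed here.

```python
def genetic_gain(i, r_a, sigma_a, cycle_length=1.0):
    """Predicted response to selection, assuming the form i * r_a * sigma_a / L.

    i            -- standardized selection differential
    r_a          -- prediction accuracy for breeding values, a correlation in [-1, 1]
    sigma_a      -- additive genetic standard deviation (trait units)
    cycle_length -- length L of one breeding cycle (e.g. years)
    """
    if cycle_length <= 0:
        raise ValueError("cycle_length must be positive")
    if not -1.0 <= r_a <= 1.0:
        raise ValueError("r_a is a correlation and must lie in [-1, 1]")
    return i * r_a * sigma_a / cycle_length


# Hypothetical inputs: i = 1.755 (roughly the top 10% under truncation
# selection on a normal distribution), accuracy 0.6, sigma_a = 0.5 t/ha.
gain_one_year_cycle = genetic_gain(1.755, 0.6, 0.5, cycle_length=1.0)
gain_two_year_cycle = genetic_gain(1.755, 0.6, 0.5, cycle_length=2.0)

print(f"1-year cycle: {gain_one_year_cycle:.4f} t/ha per year")
print(f"2-year cycle: {gain_two_year_cycle:.4f} t/ha per year")
```

Halving the cycle length doubles the predicted gain per unit time, which is one reason genomic prediction is often used to shorten cycles. Nothing in this form says whether the gain predicted from MET data is realized in the TPE.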
2.2 Extending the breeder's equation to take aim at the TPE
The breeder's equation, as represented in equation (1), quantifies the per cycle rate of change of the trait mean value for the RPG (Nyquist and Baker, 1991; Moose and Mumm, 2008; Araus et al., 2018; Cobb et al., 2019; Voss-Fels et al., 2019). However, this form of the breeder's equation does not explicitly quantify the directionality of the changes in trait values, which are based on the results and predictions from METs, relative to their requirements for improved performance in the TPE. Instead, it relies on the assumption that the environmental composition of the MET is a good representation of the environmental composition of the TPE, i.e., that there is good MET-TPE alignment (Comstock and Moll, 1963; Nyquist and Baker, 1991). To enable efficient design of a breeding program, targeted on creation of new products to close on-farm yield gaps within the TPE, it is desirable to have a form of the breeder's equation that includes both the rate and the directionality components of genetic gain for the TPE. One approach is to explicitly include a term in the breeder's equation that quantifies the influence of the MET-TPE alignment on the predicted rate of change within the TPE. Applying correlated response selection theory (Falconer, 1952; Cooper and DeLacy, 1994; Rogers et al., 2021), we provide an extended form of the breeder's equation that combines both the rate and directionality components of trait change under the influence of selection, explicitly accounting for the influence of the MET-TPE alignment on the directionality of the change relative to the requirements for the TPE.
Considering the environmental composition of the MET to be a sample of the environmental composition of the TPE ($MET \in TPE$), an equation for trait genetic gain within the TPE, based on selection decisions made using predictions from G2P trait information obtained from METs ($\Delta G_{(MET,TPE)}$), can be given as:

$$\Delta G_{(MET,TPE)} = \frac{i_{MET} \, r_{a(MET)} \, r_{a(MET,TPE)} \, \sigma_{a(TPE)}}{L} \qquad (2)$$

Two of the terms in equation (2) are equivalent to terms in equation (1): $i_{MET}$ is the selection differential applied to phenotypic and G2P prediction information obtained from analyses of the MET training data sets, as for $i$ in equation (1), and $r_{a(MET)}$ is the prediction accuracy for the selection units based on applications of the training data available from the MET, as for $r_a$ in equation (1). In equation (2) the $\sigma_a$ term of equation (1) is replaced by the product of two terms, $r_{a(MET,TPE)}$ and $\sigma_{a(TPE)}$. The term $r_{a(MET,TPE)}$ is the genetic correlation between the additive genetic effects estimated by applying G2P models developed using the MET training data sets, and the additive genetic effects for the trait targets required for realized trait performance in the TPE. The term $\sigma_{a(TPE)}$ represents the relevant target additive genetic variation for the traits within the TPE. Thus, the $r_{a(MET,TPE)}$ term of the extended breeder's equation provides a quantitative measure of the impact of the MET-TPE alignment for the prediction of additive genetic variation for traits in the TPE, and thus for predicting their contributions to genotype performance in the TPE. The $r_{a(MET,TPE)}$ can range from +1, with good MET-TPE alignment, to -1, with poor MET-TPE alignment. Additional forms of equation (2) can be given, for example for prediction at the level of total genotypic trait performance. Equally, equation (2) can be further extended to examine the contributions of quantitative trait loci (QTL) and combinations of haplotypes and specific QTL to the additive or total genotypic variance for multiple traits in the RPG for the TPE.
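One way to see where the two extra terms come from is the standard correlated-response result of quantitative genetics. The sketch below is a reading aid rather than the authors' own derivation. It assumes that the MET plays the role of the selection environment X and the TPE the role of the target environment Y, that the prediction accuracy for the MET takes the place of the square root of heritability h_X used for phenotypic selection, and that the response is expressed per cycle of length L.

```latex
\begin{align*}
CR_Y &= i \, h_X \, h_Y \, r_A \, \sigma_{P_Y}
  && \text{correlated response in } Y \text{ to selection on } X \\
     &= i \, h_X \, r_A \, \sigma_{A_Y}
  && \text{because } h_Y \, \sigma_{P_Y} = \sigma_{A_Y} \\
\Delta G_{(MET,TPE)} &= \frac{i_{MET} \, r_{a(MET)} \, r_{a(MET,TPE)} \, \sigma_{a(TPE)}}{L}
  && \text{with } X = MET,\; Y = TPE,\; h_X \to r_{a(MET)},\; r_A \to r_{a(MET,TPE)}
\end{align*}
```

Under these assumptions, setting the MET-TPE genetic correlation to +1 and the TPE additive genetic standard deviation equal to that of the MET recovers the familiar form of equation (1).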
Applying the extended form of the breeder's equation given in equation (2), statements can be made regarding the design of genomic prediction strategies based on applications of equation (1).
• Firstly, if the environmental composition of the MET is an accurate sample of the environmental composition of the TPE then it can be expected that $r_{a(MET,TPE)} \to +1$ and equations (1) and (2) will converge to the same form of the breeder's equation, as given in equation (1); in this case the $\sigma_a$ of equation (1) converges to the $\sigma_{a(TPE)}$ of equation (2). However, if there is GxE interaction and divergence in environmental composition between the MET and the TPE, $r_{a(MET,TPE)} < +1$ can occur, diminishing prediction accuracy for the TPE. Under such circumstances it can be expected that realized genetic gain in the TPE will be lower than predicted when based on studies confined to pursuing G2P modelling algorithms for improved prediction accuracy within the bounds of the MET training data sets; in this case the $\sigma_a$ of equation (1) can diverge from the $\sigma_{a(TPE)}$ of equation (2). Whenever there is historical evidence that realized genetic gains in the on-farm TPE are lower than the predicted gains, the magnitude of $r_{a(MET,TPE)}$ should be investigated to quantify its potential impact on the expected realized prediction accuracy that can be achieved in the TPE based on prediction accuracy derived from the training data available through the MET.
• Secondly, whenever there is evidence of GxE interactions within the TPE, including GxExM interactions, and there is the potential for divergence between the environmental composition and trait data obtained from current METs and those expected for the future TPE, as is often projected for the influences of climate change (Chapman et al., 2012; Ceccarelli and Grando, 2020; Cooper et al., 2021), the extended form of the breeder's equation (2) provides a more appropriate framework than equation (1) for quantifying the impact of such changes on the design and optimization of prediction-based breeding strategies.
• Thirdly, for long-term breeding programs, consideration should be given to characterization of the TPE and the design of MET experiments to obtain empirical estimates of the genetic correlation $r_{a(MET,TPE)}$ and determination of the genetic and environmental factors contributing to $r_{a(MET,TPE)} < +1$. The effects of climate change on the environmental composition of the TPE and associated changes in trait contributions to yield and GxE interactions for current and future cropping systems represent one clear area for urgent consideration in the design of METs to address the MET-TPE alignment (Braun et al., 2010; Chapman et al., 2012; Lobell et al., 2015; Ceccarelli and Grando, 2020; IPCC, 2021; Bustos-Korts et al., 2021; Cooper et al., 2021; Resende et al., 2021; Snowdon et al., 2021).
To demonstrate the implications of GxE interactions on realized genetic gain in the on-farm TPE we consider two examples of the application of the extended form of the breeder's equation to investigate the MET-TPE alignment and its potential impact on the $r_{a(MET,TPE)}$ component of equation (2). The first considers a familiar theoretical example from the study of crossover GxE interactions (Haldane, 1947; Ceccarelli, 1989; Ceccarelli, 1994; Cooper and DeLacy, 1994; van Eeuwijk et al., 2001). The second considers an empirical example based on a previously published MET-TPE data set for wheat in Australia (Cooper et al., 1995; Cooper et al., 1997; Cooper et al., 2001). The wheat example was previously used to investigate the implications of GxE interactions for grain yield in the TPE, and also the MET-TPE relationship for the design of METs to accelerate genetic gain for yield from wheat breeding in a TPE where complex GxE interactions for grain yield are ubiquitous (Brennan et al., 1981; Cooper and DeLacy, 1994; Cooper et al., 1995; Cooper et al., 1997; Basford and Cooper, 1998; Cooper et al., 2001; Chenu et al., 2011; Lobell et al., 2015; Bustos-Korts et al., 2021).
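The three statements above can also be illustrated numerically. The sketch below assumes that equation (2) has the multiplicative form i_MET × r_a(MET) × r_a(MET,TPE) × sigma_a(TPE) / L, and that equation (1) is the same expression without the MET-TPE correlation term. The function names, scenario labels and all input values are hypothetical.

```python
def gain_predicted_from_met(i, r_a, sigma_a, cycle_length=1.0):
    """Equation (1), assumed form: gain predicted within the MET."""
    return i * r_a * sigma_a / cycle_length


def gain_realized_in_tpe(i_met, r_a_met, r_a_met_tpe, sigma_a_tpe, cycle_length=1.0):
    """Equation (2), assumed form: gain realized in the TPE."""
    return i_met * r_a_met * r_a_met_tpe * sigma_a_tpe / cycle_length


# Hypothetical inputs shared by all scenarios.
i_met, r_a_met, sigma_a = 1.755, 0.6, 0.5
predicted = gain_predicted_from_met(i_met, r_a_met, sigma_a)

# Hypothetical MET-TPE genetic correlations spanning good to opposed alignment.
scenarios = {
    "good alignment": 1.0,
    "partial alignment": 0.7,
    "no alignment": 0.0,
    "opposed alignment": -0.5,
}

for label, r_met_tpe in scenarios.items():
    realized = gain_realized_in_tpe(i_met, r_a_met, r_met_tpe, sigma_a)
    print(f"{label:18s} r(MET,TPE) = {r_met_tpe:+.1f}  "
          f"predicted = {predicted:.4f}  realized = {realized:+.4f}")
```

With the additive genetic standard deviation held equal in both equations, the ratio of realized to predicted gain is simply the MET-TPE genetic correlation. A high accuracy within the MET therefore cannot compensate for a low or negative alignment term.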

Investigating the MET-TPE alignment: theoretical example
Theoretical and empirical considerations of the influences of GxE interactions for breeding have consistently emphasized the importance of crossover GxE interactions (Figure 1A; Haldane, 1947; Ceccarelli, 1989; Ceccarelli, 1994; Cooper and DeLacy, 1994; Cooper et al., 2021; Rogers et al., 2021; Smith et al., 2021a; Smith et al., 2021b). Examples of such crossover interactions in breeding METs have been demonstrated at the genotypic (Cooper et al., 1995; Cooper et al., 1997; van Eeuwijk et al., 2001; Xiong et al., 2021; Smith et al., 2021b) and QTL levels (Boer et al., 2007). For the theoretical example of crossover GxE interactions shown in Figure 1A, the yield performance responses for two genotypes (G2 and G8) in two environments (Env_1 and Env_2) are considered. The potential impact of the crossover interactions depicted in Figure 1A on selection decisions can be examined using equation (2) by considering the influence of changes in the frequency of occurrence of the two environments within both the MET and TPE on the genetic correlation $r_{a(MET,TPE)}$ term from equation (2). Here we consider the genotypic correlation $r_{g(MET,TPE)}$ between weighted average yield of the two genotypes between the MET and the TPE, where the weights are based on the frequencies of occurrence of the two environments in the MET and the TPE (Podlich et al., 1999). This provides a simulated scan of the range of possible MET-TPE alignment scenarios based on the potential range in frequency of occurrence of the two environments within the MET and the TPE.
FIGURE 1
Two examples of the potential influences of Genotype by Environment (GxE) interactions for grain yield on the expected genetic correlation between the average genotype performance in a multi-environment trial (MET) and the target population of environments (TPE), $r_{g(MET,TPE)}$, as the frequencies of environment types change between the sample of environments obtained in the MET and their presence in the TPE: (A) Schematic yield reaction-norms for two wheat genotypes (G3, G8) in two environments (Env_1, Env_2) demonstrating crossover GxE interaction; (B) Response surface of the expected genotypic covariance $\sigma_{g(MET,TPE)}$ between average genotype yield performance in a MET and in the TPE as the frequencies of the two environments (Env_1, Env_2) change within the MET and TPE; (C) Scatter plot of the average grain yield for 15 wheat genotypes based on two independent sets of environments representing both the MET and the TPE; (D) Response surface of the expected genotypic correlation, $r_{g(MET,TPE)}$ from equation (2), between average genotype yield performance in a MET and in the TPE as the frequencies of two environment-types (E1 = Mild water deficit, E2 = Severe water-deficit) change within the MET and TPE data sets. The filled symbol on the response surface indicates the position of the empirical estimate of $r_{g(MET,TPE)}$ for the grain yield data shown in sub-figure (C) (MET f(E1) = 0.41, TPE f(E1) = 0.31, $r_{g(MET,TPE)}$ = 0.70). Data for grain yield estimates were obtained from the study reported by Cooper et al. (1997).

In Figure 1B the genotypic covariance $\sigma_{g(MET,TPE)}$ of the average performance of the two genotypes in the MET and the TPE is plotted against the frequency of Env_1 in the MET and the TPE. The genotypic covariance is the numerator of the genetic correlation $r_{g(MET,TPE)}$ term of equation (2) and is used here in place of $r_{g(MET,TPE)}$ to smooth out the response surface for illustration purposes. The response surface for the genotypic covariance (Figure 1B) ranges from negative to positive values depending on the frequency of occurrence of both environments in the MET and the TPE. Two aspects are noted.
• Firstly, when the frequencies of both environments are close to 0.5 in the MET or TPE the genetic covariance, and thus the genetic correlation $r_{g(MET,TPE)}$, approaches 0 (Figure 1B). In such situations selection decisions will require direct investigation of the GxE interactions and consideration of how to target breeding for both environments instead of selection for average performance in the MET to improve average performance in the TPE, as simulated here (Figure 1A).
• Secondly, as the frequencies of the environments within the MET and the TPE deviate from 0.5 towards 1.0 for Env_1 and towards 0.0 for Env_2, or towards 0.0 for Env_1 and towards 1.0 for Env_2, then the influence of the MET-TPE alignment becomes increasingly important. When there is good MET-TPE alignment of the environment frequencies the genotypic covariance is positive and the crossover GxE interaction is less problematic for selection decisions (Figure 1B). However, if there is poor MET-TPE alignment of the environment frequencies, for example a high frequency of Env_1 in the MET when Env_1 has a low frequency in the TPE, then the genotypic covariance can become negative (Figure 1B). In this situation selection based on the information obtained from the MET will result in poor selection decisions that are not aligned with the needs of the TPE, even if a high prediction accuracy, based on the value of $r_a$ from equation (1) and of $r_{a(MET)}$ from equation (2), is demonstrated for any prediction method within the confines of the MET training data set.
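The frequency scan described for the theoretical example can be sketched in a few lines. The yields below are hypothetical values chosen only to produce a symmetric crossover between two genotypes; they are not the values behind Figure 1A. The helper names are likewise illustrative, and the covariance is computed as a simple population covariance of the weighted genotype means.

```python
import numpy as np

# Hypothetical yields (t/ha): rows are genotypes (G2, G8),
# columns are environments (Env_1, Env_2). G2 wins in Env_1, G8 in Env_2.
yields = np.array([[4.0, 2.0],
                   [3.0, 3.0]])

# Frequency of Env_1 from 0.0 to 1.0 in steps of 0.1.
freqs = np.linspace(0.0, 1.0, 11)


def weighted_mean_yield(yields, f_env1):
    """Average yield per genotype, weighting Env_1 by f_env1 and Env_2 by 1 - f_env1."""
    weights = np.array([f_env1, 1.0 - f_env1])
    return yields @ weights


def genotypic_covariance(x, y):
    """Population covariance between two vectors of genotype means."""
    return float(np.mean((x - x.mean()) * (y - y.mean())))


# surface[a, b]: covariance for MET frequency freqs[a] and TPE frequency freqs[b].
surface = np.array([
    [genotypic_covariance(weighted_mean_yield(yields, f_met),
                          weighted_mean_yield(yields, f_tpe))
     for f_tpe in freqs]
    for f_met in freqs
])

print(np.round(surface, 3))
```

For these inputs the surface is zero wherever either frequency equals 0.5, positive when the MET and TPE are dominated by the same environment, and negative when they are dominated by different environments. With only two genotypes the correlation itself can only be +1, -1 or undefined, which is consistent with plotting the covariance to obtain a smooth surface.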

Investigating the MET-TPE alignment: empirical example
Building on the theoretical example (Figures 1A, B), we apply the extended breeder's equation to quantify the impact of the MET-TPE alignment for an empirical example by estimating the genotypic correlation $r_{g(MET,TPE)}$ term of equation (2) for a range of wheat MET-TPE alignment scenarios for north-eastern Australia (Figures 1C, D). We utilize grain yield data available from a previously published wheat data set (Cooper et al., 1995; Cooper et al., 1997; Cooper et al., 2001). The example provides grain yield data for 15 genotypes and 53 environments. Importantly, for current considerations, the 53 environments were previously organized to represent a breeding MET (27 environments) and the TPE (26 environments) for the north-eastern region of the Australian wheat belt (Brennan et al., 1981; Cooper et al., 1995; Cooper et al., 1997; Chenu et al., 2011). The MET was specifically designed to represent the current understanding of GxE interactions and MET-TPE alignment scenarios for the wheat breeding program at that time. The set of 15 genotypes was chosen to represent groupings of key germplasm from the reference population of genotypes for the wheat breeding program (Cooper and DeLacy, 1994; Cooper et al., 1995; Cooper et al., 1997; Cooper et al., 2001). Further, we identify that the data for the two genotypes (G2 and G8), used to illustrate crossover GxE interactions in the theoretical example (Figure 1A), were chosen from the larger set of 15 genotypes included in the empirical example (Figure 1C). Also, the two environments (Env_1 and Env_2) used in the theoretical example were taken from the empirical example.
Thus, the numerical values for the example of crossover GxE interaction for grain yield (Figure 1A) used for the theoretical investigations of MET-TPE alignment (Figure 1B) were representative of important crossover GxE interactions under consideration within the target breeding program, as considered in the empirical example (Figures 1C, D; Brennan et al., 1981; Cooper and DeLacy, 1994; Basford and Cooper, 1998; Cooper et al., 2001).
Improving grain yield stability for the TPE of the north-eastern region of the Australian wheat belt was a primary objective of the wheat breeding program at that time (Brennan et al., 1981). A weighted selection strategy, combined with field-based managed environments, was developed to account for GxE interactions in the TPE (Cooper et al., 1995; Cooper et al., 1997; Cooper et al., 2001; Podlich et al., 1999). Spatial and temporal variability for water availability was identified as the primary driver of grain yield variation within the TPE, and drought was a major source of crossover GxE interactions for grain yield. Thus, the environments included in the MET were managed to sample a gradient of water availability scenarios, ranging from severe drought to water-sufficient environments, by managing combinations of irrigation and nitrogen inputs at a restricted number of locations. The TPE set of environments was designed by sampling a range of water availability scenarios from a wider range of locations and years within the north-eastern region of Australia. The objective was to design a MET for the stages of the wheat breeding program that could be consistently managed at a few locations to provide a stratified sample of the range of water availability environments expected within the TPE (Brennan et al., 1981; Cooper et al., 1995; Cooper et al., 1997; Cooper et al., 2001).
Grain yield GxE interactions were previously identified for both the MET and TPE data sets (Cooper et al., 1995; Cooper et al., 1997; Cooper et al., 2001). Crossover GxE interactions were frequent (Figure 1A; Cooper and DeLacy, 1994). For the purposes of demonstrating an application of equation (2) to the empirical wheat example, the prior envirotyping was used to identify two groups of environment-types for both the MET and TPE sets: environment-type 1 (E1), characterized by mild water-deficits, and environment-type 2 (E2), characterized by severe water-deficits. There were GxE interactions between the two environment-types within the MET and TPE sets (Figure 2; Cooper et al., 1995; Cooper et al., 1997; Cooper et al., 2001). There was a moderate to weak positive genotypic correlation for grain yield variation among the 15 genotypes between environment-types E1 and E2 for both the MET (Figure 2A) and the TPE (Figure 2B). Importantly, for interpretation of the genotypic correlation $r_{g(MET,TPE)}$ between the MET and TPE (Figure 1D), the genotypic correlation for grain yield variation in the mild stress environment-type E1 was positive and strong between the MET and the TPE (Figure 2C). However, there was no relationship for grain yield variation in the severe drought stress environment-type E2 between the MET and the TPE (Figure 2D). The lack of relationship for environment-type E2 is discussed in detail elsewhere (Cooper et al., 1995; Cooper et al., 1997). In summary, the MET was designed to focus on the expected water availability gradient in the absence of other abiotic and biotic stresses that could also occur within the TPE. Occurrences of these other abiotic and biotic stresses within the TPE set were interpreted to be contributing factors to the low relationship observed for severe drought stress environment-type E2 between the MET and TPE (Figure 2D).
In the absence of the drought stress for environment-type E1 these other abiotic and biotic stresses were less influential on the genotypic correlation for grain yield (Figure 2C).
For purposes of demonstrating an application of the extended breeder's equation to the wheat MET-TPE data set (Figures 1C, D) it is sufficient to note that there was GxE interaction for grain yield between environment-types E1 and E2 in both the MET (Figure 2A) and the TPE (Figure 2B) data sets and that there was positive predictability between the MET and TPE sets for environment-type E1 (Figure 2C), but no predictability for environment-type E2 (Figure 2D). Using this level of envirotyping we can simulate the influence of changes in the MET-TPE alignment on $r_{g(MET,TPE)}$ and prediction of average grain yield in the TPE based on average grain yield estimated from the MET (Figure 1D). Following the same procedures applied to the theoretical example (Figures 1A, B), the potential range of MET-TPE alignment scenarios was simulated in three steps: changing the frequencies of environment-types E1 and E2 within the MET and the TPE in steps of 0.1 from 0.0 to 1.0; calculating the weighted average grain yield of the 15 genotypes for both the MET and TPE, taking into consideration the frequencies of both environment-types; and calculating the genotypic correlation $r_{g(MET,TPE)}$ between the estimates of weighted average grain yield for the 15 genotypes between the MET and TPE for all MET-TPE alignment combinations. We then plotted the $r_{g(MET,TPE)}$ against the frequency of environment-type E1 in the MET and TPE to generate a simulated $r_{g(MET,TPE)}$ genotypic correlation response surface for all MET-TPE alignment configurations (Figure 1D). The genotypic correlation $r_{g(MET,TPE)}$ between the simulated MET and TPE alignments ranged from a high value of 0.90 to a low value of -0.07 (Figure 1D). The $r_{g(MET,TPE)}$ response surface for the wheat example has interesting features. Firstly, there is a relatively broad plateau of high $r_{g(MET,TPE)}$ values for many of the MET-TPE alignment scenarios.
This plateau of high $r_{g(MET,TPE)}$ values occurred for scenarios where the frequency of the water-sufficient environment-type E1 was higher than 0.5 in both the MET and TPE (Figure 1D), taking advantage of the high predictability between environment-type E1 in the MET and TPE (Figure 2C). Secondly, when the frequency of environment-type E1 falls below 0.5 in the MET or TPE, and therefore the frequency of the water-limited environment-type E2 increases above 0.5, the $r_{g(MET,TPE)}$ is degraded from the high levels of the plateau (Figure 1D), reflecting the increased influence of the poor predictability between the MET and TPE for the water-limited environment-type E2 (Figure 2D). This impact of the MET-TPE alignment on predictability for performance in the TPE using MET results will apply to all levels of prediction, including genomic prediction, phenotypic prediction, and combined prediction approaches.
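The simulation procedure described above (frequency grid, weighted averages, correlation for every MET-TPE combination) can be sketched as follows. The yields are synthetic and randomly generated, not the published wheat data: they are constructed so that environment-type E1 shares a genotype signal between the MET and the TPE while E2 does not, mimicking the qualitative pattern reported for Figure 2. A plain Pearson correlation of the weighted genotype means stands in for the genotypic correlation, which is a simplifying assumption, and all names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n_genotypes = 15

# Synthetic yields (t/ha) per genotype, for illustration only.
# E1: MET and TPE share a common genotype signal (predictable).
shared_e1 = rng.normal(0.0, 0.5, n_genotypes)
met_e1 = 4.0 + shared_e1 + rng.normal(0.0, 0.15, n_genotypes)
tpe_e1 = 4.0 + shared_e1 + rng.normal(0.0, 0.15, n_genotypes)
# E2: MET and TPE are generated independently (no predictability by design).
met_e2 = 2.0 + rng.normal(0.0, 0.4, n_genotypes)
tpe_e2 = 2.0 + rng.normal(0.0, 0.4, n_genotypes)

# Frequency of E1 from 0.0 to 1.0 in steps of 0.1; E2 has frequency 1 - f.
freqs = np.linspace(0.0, 1.0, 11)


def weighted_yield(e1, e2, f_e1):
    """Weighted average yield per genotype for a given frequency of E1."""
    return f_e1 * e1 + (1.0 - f_e1) * e2


def correlation_surface(met_e1, met_e2, tpe_e1, tpe_e2, freqs):
    """surface[a, b]: correlation for MET f(E1) = freqs[a], TPE f(E1) = freqs[b]."""
    surface = np.empty((len(freqs), len(freqs)))
    for a, f_met in enumerate(freqs):
        met_mean = weighted_yield(met_e1, met_e2, f_met)
        for b, f_tpe in enumerate(freqs):
            tpe_mean = weighted_yield(tpe_e1, tpe_e2, f_tpe)
            surface[a, b] = np.corrcoef(met_mean, tpe_mean)[0, 1]
    return surface


surface = correlation_surface(met_e1, met_e2, tpe_e1, tpe_e2, freqs)
print(np.round(surface, 2))
```

The corner of the surface where both frequencies equal 1.0 reproduces the E1 correlation between MET and TPE, and the corner where both equal 0.0 reproduces the E2 correlation. The cells in between show how the alignment term changes as the environment mix shifts. With real data, the same grid could be filled using genotype means or predictions estimated per environment-type.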
For the specific environment-type configuration realized for the empirical example (Figure 2), the estimate of $r_{g(MET,TPE)}$ for prediction of average grain yield for the TPE based on average grain yield obtained for the MET was intermediate (Figure 1C); $r_{g(MET,TPE)}$ = 0.70 for MET f(E1) = 0.41, f(E2) = 0.59 and for TPE f(E1) = 0.31, f(E2) = 0.69. Thus, the MET-TPE alignment for the empirical example was located on the $r_{g(MET,TPE)}$ response surface (Figure 1D) slightly off the plateau of higher $r_{g(MET,TPE)}$ levels, but still above the precipice where the $r_{g(MET,TPE)}$ value is severely degraded. This empirical realization of MET-TPE alignment is just one of the many possible scenarios that can occur as the frequencies of environment-types change between the MET and the TPE (Figure 1D).
The empirical wheat example (Figures 1, 2) was used to demonstrate the utility of the extended form of the breeder's equation for applications in prediction-based breeding. Here we have emphasized the use of the extended breeder's equation as a useful framework to guide the design of MET data sets for training G2P models for applications of genomic prediction and genomic selection at different stages of a breeding program to take aim at the TPE (Cooper et al., 2014a; Cooper et al., 2014b; Gaffney et al., 2015; Messina et al., 2022a). Many other possible prediction scenarios can also be investigated, and these will be the subject of future research.

FIGURE 2
Scatter diagrams comparing average grain yield predicted for 15 wheat genotypes for two environment-types (E1 = Mild water deficit, E2 = Severe water-deficit) obtained from independent data sets representing a multi-environment trial (MET) and the target population of environments (TPE): (A) Comparison between grain yield predicted for environment-types E1 and E2 in the MET data set, $r_{g(E1,E2|MET)}$; (B) Comparison between grain yield predicted for environment-types E1 and E2 in the TPE data set, $r_{g(E1,E2|TPE)}$; (C) Comparison of grain yield predicted for environment-type E1 between the MET and the TPE data sets, $r_{g(MET,TPE|E1)}$; (D) Comparison of grain yield predicted for environment-type E2 between the MET and the TPE data sets, $r_{g(MET,TPE|E2)}$. Data for grain yield predictions were obtained from the study reported by Cooper et al. (1997).

Discussion
Design of breeding programs, and crop improvement strategies in general, to take aim at the crop productivity requirements of the TPE is critical to both accelerate and achieve realized genetic gain on-farm that contributes to closing yield gaps (Messina et al., 2022a), improving global food security (Kholová et al., 2021;Rogers et al., 2021), and the many other requirements for sustainable agricultural systems (Ceccarelli, 1989;Ceccarelli, 1994;Persley and Anthony, 2017;van Etten et al., 2019;Messina et al., 2022b). However, in most considerations of breeding program design and optimization there is no direct connection between the optimization considerations that use the framework of the breeder's equation, as in equation (1), and the understanding of the TPE. Thus, there is often a disconnect between the attention to rate of genetic gain, and the directionality of the breeding program through its MET-TPE alignment with the requirements of the on-farm TPE. In the presence of GxE interactions and low r a(MET,TPE) this MET-TPE alignment disconnect can result in low realized genetic gain under the on-farm conditions of the TPE, even when high prediction accuracy, based on r a in equation (1) or more explicitly r a(MET) in equation (2), is demonstrated for genomic prediction methods evaluated within the confines of the MET. The extended form of the breeder's equation, introduced here as equation (2), provides a framework to remove this disconnect and to support design of prediction-based breeding strategies that take aim at the TPE by emphasizing the influence of the MET-TPE alignment on realized genetic gain for the on-farm TPE (Cooper et al., 2014a;Cooper et al., 2014b;Gaffney et al., 2015;Messina et al., 2022a). Here we demonstrated such application of the extended breeder's equation framework through investigation of r a(MET,TPE), rather than assuming r a(MET,TPE) = +1, as is the case for the traditional form of the breeder's equation.
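The consequence of low r a(MET,TPE) for realized gain can be illustrated numerically. The sketch below assumes a multiplicative form in which expected gain per unit time is the product of selection intensity, prediction accuracy within the MET, the MET-TPE genetic correlation, and the additive genetic standard deviation, divided by cycle length; setting the MET-TPE correlation to +1 recovers the traditional form. Equations (1) and (2) are not reproduced in this section, so the function name `realized_gain`, its parameter names, and all numerical inputs are hypothetical.

```python
def realized_gain(i, r_a_met, sigma_a, cycle_length, r_a_met_tpe=1.0):
    """Expected genetic gain per unit time in the TPE (illustrative form).

    i:            standardized selection intensity
    r_a_met:      prediction accuracy within the MET
    sigma_a:      additive genetic standard deviation
    cycle_length: breeding cycle length (e.g., years)
    r_a_met_tpe:  MET-TPE genetic correlation; 1.0 gives the traditional form
    """
    if cycle_length <= 0:
        raise ValueError("cycle_length must be positive")
    return i * r_a_met * r_a_met_tpe * sigma_a / cycle_length

# Hypothetical inputs shared by all scenarios
i, sigma_a, cycle_length = 1.76, 0.5, 4.0

# Traditional form: MET-TPE correlation implicitly assumed to be +1
traditional = realized_gain(i, 0.8, sigma_a, cycle_length)
# High accuracy within the MET, but poor MET-TPE alignment
accurate_misaligned = realized_gain(i, 0.8, sigma_a, cycle_length, r_a_met_tpe=0.3)
# Lower accuracy within the MET, but good MET-TPE alignment
modest_aligned = realized_gain(i, 0.5, sigma_a, cycle_length, r_a_met_tpe=0.9)

print(traditional, accurate_misaligned, modest_aligned)
```

With these hypothetical values, the scenario with lower accuracy but better alignment delivers more gain in the TPE than the high-accuracy, poorly aligned scenario, which is the disconnect that accuracy evaluated only within the MET cannot reveal.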
We have introduced and demonstrated the utility of the extended form of the breeder's equation through applications to a theoretical and an empirical example. In summary, the following key points were presented.
Theoretical considerations: We extended the breeder's equation, introducing the genetic correlation r a(MET,TPE) to explicitly incorporate and quantify the relationship between a MET and the TPE, as a framework for designing METs to take aim at the TPE. Three further considerations are important: (1) the traditional form of the breeder's equation assumes that the genetic correlation r a(MET,TPE) = +1; (2) in the presence of GxE interactions the genetic correlation r a(MET,TPE) can be decomposed to take into account the genetic variance-covariance structure among the environment-types within the TPE (Cooper and DeLacy, 1994;van Eeuwijk et al., 2001;Smith et al., 2005;Smith et al., 2021a;Smith et al., 2021b;Rogers et al., 2021); and (3) the genetic correlation r a(MET,TPE) can be applied to the continuum of selection units of interest to breeders, extending from the level of sequence information, accounting for QTL and chromosomal haplotypes, to total multi-trait, multi-QTL predicted genotypic performance or breeding value obtained for any G2P model that is derived from relevant training data sets that can be generated from METs together with augmented data sources from specialized phenotyping facilities (Cooper et al., 2014a;Cooper et al., 2014b;Gaffney et al., 2015;Diepenbrock et al., 2021).
Taking aim at specific target environment-types, for example specific biotic or abiotic stresses, is not uncommon in plant breeding (Blum, 1988;Millet et al., 2019). However, taking aim at the TPE as a mixture of environment-types (Podlich et al., 1999;Duvick et al., 2004;Cooper et al., 2014a;Cooper et al., 2014b;Gaffney et al., 2015;Rogers et al., 2021;Smith et al., 2021a;Smith et al., 2021b;Messina et al., 2022a;Messina et al., 2022b) is much less common than taking aim at specific environment-types. Taking aim at the TPE requires detailed consideration of the mixture of target environment-types within the TPE (Chapman et al., 2000;Chenu et al., 2011;Chapman et al., 2012;Kholová et al., 2013;Cooper et al., 2014a;Cooper et al., 2014b;Lobell et al., 2015;Hajjarpoor et al., 2021;Resende et al., 2021), the extent of GxE interactions between environment-types (Figure 2) and the details of the genetic variance-covariance structure among the environment-types, and appropriate attention to weighting the sources of G2P information for traits, that is available from the environment-types sampled in the MET training data sets, by their frequencies of occurrence and relative importance in the TPE (Podlich et al., 1999;Cooper et al., 2014a;Cooper et al., 2014b;Gaffney et al., 2015;Messina et al., 2018;Smith et al., 2021b).
Empirical considerations: We demonstrated the application of the extended form of the breeder's equation by applying it to a grain yield data set designed for a wheat breeding program, where the environments had previously been grouped into MET and TPE sets with a characterization of the different environment-types in both the MET and TPE sets (Figures 1, 2;Cooper et al., 1995;Cooper et al., 1997;Cooper et al., 2001). This prior characterization of environment-types and the MET-TPE alignment was conducted prior to the more comprehensive characterization of the wheat TPE for north-eastern Australia (Chenu et al., 2011;Bustos-Korts et al., 2021) and so we provided some additional interpretation of GxE interactions for yield related to water availability and the incidence of drought and their influences on the genetic correlation r g(MET,TPE) in terms of the more recent TPE characterization (Figures 1, 2).
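The weighting of environment-type information by frequency of occurrence in the TPE, noted above, can be sketched as a simple weighted selection criterion. The example uses the MET and TPE environment-type frequencies reported for the empirical example, but the genotype labels and predicted yields are entirely hypothetical and chosen only to show crossover GxE interaction; the function names are likewise illustrative.

```python
def weighted_yield(yields_by_env, freqs):
    """Frequency-weighted mean of environment-type predictions for one genotype."""
    return sum(freqs[env] * y for env, y in yields_by_env.items())

def rank_genotypes(predictions, freqs):
    """Genotype names ordered from highest to lowest weighted predicted yield."""
    return sorted(
        predictions,
        key=lambda g: weighted_yield(predictions[g], freqs),
        reverse=True,
    )

# Hypothetical predicted grain yield per environment-type (arbitrary units)
predictions = {
    "G1": {"E1": 5.0, "E2": 2.0},  # strong under mild water deficit
    "G2": {"E1": 4.0, "E2": 2.6},  # strong under severe water deficit
    "G3": {"E1": 4.5, "E2": 2.2},
}

# Environment-type frequencies reported for the empirical example
met_freqs = {"E1": 0.41, "E2": 0.59}
tpe_freqs = {"E1": 0.31, "E2": 0.69}

met_ranking = rank_genotypes(predictions, met_freqs)
tpe_ranking = rank_genotypes(predictions, tpe_freqs)

print("Ranking with MET frequencies:", met_ranking)
print("Ranking with TPE frequencies:", tpe_ranking)
```

For these hypothetical yields the top-ranked genotype differs between the two weightings, so selecting on the unweighted MET average would favor a different genotype than selecting on a criterion weighted by the TPE frequencies.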
Future research: The extended form of the breeder's equation is particularly relevant as a framework for the design of breeding strategies to target climate resiliency to address the impacts of climate change on the environmental composition of the short, medium, and long-term future diverse geographical TPEs expected for our global agricultural systems (Chapman et al., 2012;van Etten et al., 2019;Ceccarelli and Grando, 2020;IPCC, 2021;Langridge et al., 2021). Future work will explore developments and other applications of the extended breeder's equation to assist design of prediction-based breeding programs to tackle the effects of climate change, where it is expected that frequencies of environment-types within the TPE will change with time (Chapman et al., 2012;Lobell et al., 2015;Hammer et al., 2020;Snowdon et al., 2021;Cooper et al., 2021;IPCC, 2021;Bustos-Korts et al., 2021).

Data availability statement
The data analyzed in this study is subject to the following licenses/restrictions: The dataset utilized in the examples was obtained from previous studies, as cited within the article. The dataset can be obtained from the corresponding author. Requests to access these datasets should be directed to mark.cooper@uq.edu.au.

Author contributions
MC conceived and wrote the manuscript. Ideas that contributed to the manuscript came from collaborative research conducted by MC, CG, CM, TT, OP. All authors contributed to the article and approved the submitted version.