METHODS article

Front. Physiol., 01 June 2012

Sec. Plant Physiology

Volume 3 - 2012 | https://doi.org/10.3389/fphys.2012.00156

I.4 Screening Experimental Designs for Quantitative Trait Loci, Association Mapping, Genotype-by-Environment Interaction, and Other Investigations

  • Walter T. Federer 1

  • José Crossa 2*

  • 1. Division of Rare and Manuscript Collections, Cornell University, Ithaca NY, USA

  • 2. Biometrics and Statistics Unit, International Maize and Wheat Improvement Center (CIMMYT) Mexico DF, Mexico

Abstract

Crop breeding programs using conventional approaches, as well as new biotechnological tools, rely heavily on data resulting from the evaluation of genotypes in different environmental conditions (agronomic practices, locations, and years). Statistical methods used for designing field and laboratory trials and for analyzing the data originating from those trials need to be accurate and efficient. The statistical analysis of multi-environment trials (MET) is useful for assessing genotype × environment interaction (GEI), mapping quantitative trait loci (QTLs), and studying QTL × environment interaction (QEI). Large populations are required for scientific study of QEI, and for determining the association between molecular markers and quantitative trait variability. Therefore, appropriate control of local variability through efficient experimental design is of key importance. In this chapter we present and explain several classes of augmented designs useful for achieving control of variability and assessing genotype effects in a practical and efficient manner. A popular procedure for unreplicated designs is the one known as “systematically spaced checks.” Augmented designs contain “c” check or standard treatments replicated “r” times, and “n” new treatments or genotypes included once (usually) in the experiment.

Introduction

Conventional breeding will continue to make significant contributions to efforts to maintain the rate of crop improvement for food production and nutrition in order to meet the increase in human population growth. However, biotechnological methods, such as linkage analysis for detecting quantitative trait loci (QTLs), marker-assisted selection (MAS), association mapping, genomic selection, etc., will also be required. It is of paramount importance that the statistical methods used for designing field and laboratory trials and for analysing the data originating from those trials be accurate and efficient.

Crop breeding programs using conventional approaches, as well as new biotechnological tools, rely heavily on data resulting from the evaluation of genotypes in different environmental conditions (agronomic practices, locations, and years). The incidence of genotype-by-environment interaction (GEI) is a consequence of QTL-by-environment interaction (QEI) and marker effect-by-environment interaction, and this affects conventional breeding as well as MAS and genomic selection breeding strategies. The series of field trials known as multi-environment trials (METs) are vital for: (i) studying the incidence of GEI and assessing the stability of quantitative traits; (ii) mapping QTL and QEI; and (iii) finding associations among molecular markers and quantitative trait variation based on linkage disequilibrium analysis. To detect and quantify the presence of QEI is of vital importance for understanding the genetic architecture of quantitative traits.

All biotechnological methods are based on molecular marker data and phenotypic data. Phenotypic data are vitally important for assessment of the within-environment error structure for each of the trials that will be used later in the MET analysis. The MET statistical analysis is useful for assessing GEI, mapping QTLs, and studying QEI. Large populations are required for scientific study of QEI, and for determining the association between molecular markers and quantitative trait variability. Therefore, appropriate control of local variability through efficient experimental design is of key importance.

Spatial variability in the field is a universal phenomenon that affects the detection of differences among treatments in agricultural experiments by inflating the estimated experimental error variance. Researchers wishing to conduct field trials are faced with this dilemma. They tackle the problem by using an appropriate statistical design and layout for the experiment, and by using suitable methods for statistical analysis. A priori control of local variability in each testing environment is usually determined from the experimental design used to accommodate the genotypes to the experimental units. However, a posteriori control of the residual effect based on a model that provides a good fit to the data can effectively complement the control of local variability provided by the experimental design (see e.g., Federer, 2003a). Recently, efficient experimental designs (both unreplicated and replicated) have been developed, assuming that observations are not independent in that contiguous plots in the field may be spatially correlated (Martin et al., 2004; Cullis et al., 2006).

Commonly, field trials used for linkage analyses or association mapping analyses comprise 200 or more genotypes. These may consist of individuals from segregating F2 and F3 populations, recombinant inbred lines (RILs), accessions from a genebank, advanced breeding cultivars, or individuals from any segregating population. Usually, QTL mapping is done on large numbers of genotypes (500 or more) in as many locations or conditions as possible, for estimating QEI and examining the stable or unstable part of the chromosome that influences the trait under study. Thus, seed availability and land and labor costs are crucial factors to be considered when establishing METs for QTL and QEI analyses, and association mapping.

The class of augmented designs is especially useful for achieving control of variability and assessing genotype effects in a practical and efficient manner. In the early stages of a breeding program, a plant breeder is faced with evaluating the performance of large numbers of genotypes. Frequently, the seed supply is limited, but even if it is not, the large number of genotypes can necessitate using a single experimental unit per genotype.

A popular procedure for unreplicated designs is the one known as “systematically spaced checks.” In this procedure, a standard check genotype is systematically spaced every certain number of experimental units. Several statistical procedures have been devised over the years to compare the yield of a new genotype with the standard variety. This procedure can require an inordinate amount of space, labor, and other resources devoted to check plots of a single standard genotype. Yates (1936) has shown that the number of check plots should be of the order of the square root of the number of (new genotype) test plots. In conducting METs, Sprague and Federer (1951) have shown that a cost-efficient procedure for maximizing genetic advancement involves using two replicates at each location for single crosses of maize, three replicates for top crosses, and four replicates for double crosses.
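As a rough illustration of the square-root guideline attributed to Yates (1936) above, the following sketch computes a suggested number of check plots for a given number of test plots. The function name and the choice to round to the nearest integer are assumptions made here for illustration; the guideline itself only gives an order of magnitude.

```python
import math

def suggested_check_plots(n_test_plots: int) -> int:
    """Order-of-magnitude guide: check plots ~ sqrt(number of test plots).

    Rounding to the nearest integer is an assumption of this sketch.
    """
    if n_test_plots < 0:
        raise ValueError("n_test_plots must be non-negative")
    return round(math.sqrt(n_test_plots))

# Trial sizes of the kind mentioned in the text
for n in (200, 900, 30000):
    print(n, suggested_check_plots(n))
```

For 900 test plots this gives about 30 check plots, far fewer than a scheme that places a check every few plots.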

A third class of procedure used in the screening of genotypes for yield and other characteristics is that of “augmented experimental designs.” These designs contain c check or standard treatments replicated r times, and n new treatments or genotypes included once (usually) in the experiment. Some of the c checks could be promising new genotypes (treatments) in the final stages of testing. Any standard experimental design may be used for the check treatments and then the block sizes or the number of rows and columns are increased to accommodate the new treatments. This class of design has several desirable qualities, including the following:

  • 1. The checks can be of any kind and of any number c.

  • 2. The number of new entries can be any number n.

  • 3. The new treatments can be considered as random or as fixed effects.

  • 4. Survivors in the final stages of screening may be used as checks along with some standard checks. The dual use of these genotypes as checks and as their final evaluation is an efficient use of resources.

  • 5. Some of the designs in this class allow for screening when other factors are present, thereby revealing genotype-by-factor interactions.

  • 6. Non-contenders can be discarded prior to harvest, since they do not affect computation of blocking effects and variances.

Various augmented experimental designs are discussed in the following sections. These are augmented block (Federer, 1956, 1961), augmented row–column (Federer and Raghavarao, 1975; Federer et al., 1975), augmented resolvable row–column (Federer, 2002), augmented split plot (Federer, 2005b), and augmented split block (Federer, 2005a).

When the field layout is in a row–column formation, either for the entire experiment or within each complete block, an experimental design can be developed that controls variability in two directions for any number of genotypes and replicates. The row–column experimental designs have two block components, i.e., blocks in rows and blocks in columns. When the entire experiment is laid out in a row–column arrangement, the “latinised” designs assure that entries do not occur more than once in a row or a column of the experiment. Also, neighbor restricted designs restrict randomization of entries in such a way that certain groups of entries do not occur together, so that genotypic interference due to different maturity or plant height can be avoided.

Analysis of designed, spatially laid out experiments needs to take account of the design restrictions encountered. The actual spatial variation that occurs during the course of conducting field experiments may not be taken into account in the experimental design or in the standard statistical analysis selected before the experiment was conducted. Hence, to achieve appropriate statistical analysis for the data obtained from the experiment, it is necessary to determine the type and nature of the spatial variation present in the experiment. This often means selecting from a family of plausible statistical analyses. Federer (2003a) presented a number of methods useful for “exploratory model selection,” to account for the variation that is present in the results of an experiment rather than what the variation pattern was expected to be. He used various forms of trend analysis on a variety of examples to determine the model that explained the variation present in each experiment. Several publications have been written using various forms of trend analysis for a variety of situations (Wolfinger et al., 1997; Federer, 2002, 2003a,b; Federer and Wolfinger, 2003).

Augmented Block Experimental Designs

Augmented block experimental designs fall into two categories, complete blocks and incomplete blocks for the check genotypes or treatments. A randomized complete block design (RCBD), with r replicates or blocks, is used for the c check genotypes to start the construction of an augmented randomized block. Then, the r blocks are expanded to include the c checks plus n/r new genotypes in each block. If n is not a multiple of r, then fewer or more new genotypes would appear in some of the blocks. The c checks and n/r new genotypes are randomly allotted to the experimental units (plots) in each block. Genotype numbers are randomly assigned to the new genotypes, but this is not necessary in the early stages of screening since each new genotype is a random event in itself. To illustrate an augmented RCBD, let c = 3 checks, r = 4 blocks, and n = 13 new genotypes. A plan is:

Block 1:  A  1  4  B  C  9
Block 2:  C  5  B  6  13  A
Block 3:  12  B  A  2  3  C
Block 4:  7  A  8  C  B  10  11

A partitioning of the degrees of freedom in an analysis of variance (ANOVA) table for this design is:

Source of variation      Degrees of freedom
Total                    25
Correction for mean      1
Block, B                 3
Genotype                 15
    Check                2
    New                  12
    Check versus new     1
B × check                6
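The construction described above (every check in every block, new genotypes split as evenly as possible, random allotment within blocks) can be sketched in a few lines. This is a minimal illustration, not the authors' software: the function name, the seed argument, and the choice to give any surplus new genotypes to the last blocks are assumptions.

```python
import random

def augmented_rcbd(checks, n_new, n_blocks, seed=None):
    """Return a list of blocks; each block is a randomized list of plot labels."""
    rng = random.Random(seed)
    new_ids = list(range(1, n_new + 1))
    rng.shuffle(new_ids)                      # random assignment of new genotypes to blocks
    base, extra = divmod(n_new, n_blocks)     # n need not be a multiple of r
    blocks, start = [], 0
    for b in range(n_blocks):
        # assumption: the last `extra` blocks each take one additional new genotype
        size = base + (1 if b >= n_blocks - extra else 0)
        entries = list(checks) + new_ids[start:start + size]
        start += size
        rng.shuffle(entries)                  # random allotment to plots within the block
        blocks.append(entries)
    return blocks

# c = 3 checks, r = 4 blocks, n = 13 new genotypes, as in the example above
plan = augmented_rcbd(["A", "B", "C"], n_new=13, n_blocks=4, seed=1)
for i, block in enumerate(plan, 1):
    print(f"Block {i}:", *block)
```

With these parameters the block sizes are 6, 6, 6, and 7, giving the 25 plots of the example.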

In the first stage of screening, there may be a very large number of new genotypes with n of 8,000, 30,000, or even over 100,000. In these cases, the block size may become larger than is considered necessary to retain relative homogeneity within each block. The class of experimental designs known as an “incomplete block design” (ICBD) can then be used. The incomplete blocks of an ICBD may or may not be grouped into complete blocks; when they are, the design is resolvable. An appropriate ICBD for c checks, r replicates of the checks, incomplete blocks of size k, s incomplete blocks within a complete block, and b incomplete blocks is selected for the check genotypes. Then the b incomplete block sizes are increased to include n/b new genotypes in each incomplete block. To illustrate, let c = 15 checks arranged in r = 5 replicates and b = rs = 25 incomplete blocks of size k = 3. Let n = 300 new genotypes, and then n/b = 300/25 = 12. By enlarging the 25 incomplete blocks from k = 3 to k = 15 to accommodate 3 + 12 = 15 experimental units, the 300 new genotypes can be put into these 25 incomplete blocks. The 12 new genotypes and the three checks are randomly allotted to the 15 experimental units in each of the 25 incomplete blocks. The blocks of genotypes are randomly allotted to the incomplete blocks in the field layout. The 15 check genotypes may, for example, be two standard genotypes and 13 promising and surviving new genotypes from previous screening cycles.

A randomized form of an ICBD may be obtained from a software toolkit such as Gendex (2009). Using the parameters of an enlarged block size of k + n/b = 3 + 12 = 15, v = c + n/r = 75, and r = 5, a randomized form of an ICBD is obtained. Then the n/r numbers for v that appear in an incomplete block are replaced by genotype numbers to accommodate the n = 300 new genotypes, but retaining the k = 3 check treatments in each incomplete block according to the plan for checks only.

A partitioning of the degrees of freedom in an ANOVA table for the above example is:

Source of variation            Degrees of freedom
Total                          375
Correction for mean            1
Block, R                       4
Genotype                       314
    Check                      14
    New                        299
    Check versus new           1
Incomplete blocks within R     20
Intrablock error               36
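The bookkeeping behind this table can be checked with a short calculation. The sketch below assumes a resolvable design in which c is a multiple of the check block size k; the function name and dictionary keys are illustrative only, and the intrablock error is obtained as the remainder after all other sources are accounted for.

```python
def augmented_icbd_df(c, r, k, n):
    """Degrees of freedom for an augmented resolvable incomplete block design.

    c checks in r replicates, check blocks of size k, n unreplicated new entries.
    Returns (number of incomplete blocks, total observations, df by source).
    """
    if c % k != 0:
        raise ValueError("this sketch assumes c is a multiple of k")
    s = c // k                      # incomplete blocks per replicate
    b = r * s                       # total number of incomplete blocks
    total = c * r + n               # one observation per plot
    df = {
        "mean": 1,
        "replicate": r - 1,
        "check": c - 1,
        "new": n - 1,
        "check vs new": 1,
        "incomplete blocks within replicate": r * (s - 1),
    }
    df["intrablock error"] = total - sum(df.values())   # remainder
    return b, total, df

b, total, df = augmented_icbd_df(c=15, r=5, k=3, n=300)
print(b, total)
for source, value in df.items():
    print(f"{source:38s}{value}")
```

Because the new genotypes each use up one degree of freedom, the error term depends only on the checks: 75 check plots less 1 + 4 + 14 + 20 leaves 36.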

When the new genotypes are unreplicated, they do not contribute to the estimation of the block and error variances and the estimation of the block effects (Federer and Raghavarao, 1975). Only the replicated check treatments do this. Computer codes for analysing the results from augmented block designs have been given by Wolfinger et al. (1997) and Federer (2003a).
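To make the role of the checks concrete, here is a minimal sketch for the simplest case, an augmented design in which every check appears in every block. Block effects are estimated from the check plots alone and then subtracted from the unreplicated entries. The yields are hypothetical numbers invented for illustration, and the function names are not taken from the cited computer codes.

```python
from statistics import mean

# Hypothetical yields: check_yields[block][check], new_yields[block][genotype]
check_yields = {
    1: {"A": 5.0, "B": 6.0, "C": 7.0},
    2: {"A": 6.0, "B": 7.0, "C": 8.0},
}
new_yields = {1: {1: 6.5}, 2: {2: 6.5}}

def block_effects(check_yields):
    """Block effect = mean of the checks in the block minus the overall check mean."""
    grand = mean(y for blk in check_yields.values() for y in blk.values())
    return {b: mean(blk.values()) - grand for b, blk in check_yields.items()}

def adjust_new(new_yields, effects):
    """Remove the estimated block effect from each unreplicated observation."""
    return {g: y - effects[b]
            for b, entries in new_yields.items()
            for g, y in entries.items()}

effects = block_effects(check_yields)
print(effects)                          # {1: -0.5, 2: 0.5}
print(adjust_new(new_yields, effects))  # {1: 7.0, 2: 6.0}
```

The two new genotypes have the same raw yield, but the one grown in the poorer block is ranked higher after adjustment. With incomplete blocks for the checks, the block effects would instead come from a fitted model rather than simple means.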

Augmented Complete Block Design for a QTL Mapping Study

A typical QTL experiment in maize consists of F2 plants obtained from the cross of two maize inbred lines referred to as parent 1 (P1) and parent 2 (P2). Subsequently, the F2 plants can be selfed to produce, say, 900 independent F5 lines. These 900 new entries (RILs) will be genotyped with molecular markers, and the genetic data and the respective phenotypic data will be used for QTL and QEI mapping. These lines may be crossed to an inbred tester from an opposite heterotic group to obtain testcross seeds. The check entries may include the parents P1 and P2, the F1 from the cross P1 × P2, and two other checks (check1 and check2) the breeder wishes to include. One possible augmented complete block design (CBD) may consist of 20 blocks of size 45 augmented by P1, P2, F1, check1, and check2. Thus, the block size comprises a total of 50 entries (45 new entries comprising testcross F5 lines and five other entries that will be repeated in every block). The same or a different group of test lines in each block can be used in all the sites where the experiment is planted, but with different randomization of the blocks. In this case, the augmented RCBD has c = 5 checks (P1, P2, F1, check1, and check2), r = 20 blocks, and n = 900 new genotypes. A possible plan is:

Block 1
P1…1, 2…P2…14, 15…F1…20, 21…check1…30, 31…check2…44, 45…
Block 2
P1…46, 47…P2…54, 55…F1…60, 61…check1…70, 71…check2…89, 90…
.
.
Block 20
P1…460, 470…P2…540, 550…F1…600, 610…check1…700, 710…check2…890, 900…

The distribution of the repeated checks in the field should avoid, as much as possible, appearance of the same replicated check more than once in the same row or column. This latinised augmented CBD may help to reduce bias due to unexpected soil trends running across columns or rows.

A partitioning of the degrees of freedom in an ANOVA table for this design in each site is:

Source of variation      Degrees of freedom
Total                    1000
Correction for mean      1
Block, B                 19
Genotype                 904
    Check                4
    New                  899
    Check versus new     1
B × check                76

Supposing that the trial were established in three different sites, then the partition of the degrees of freedom in the ANOVA table would be as follows:

Source of variation              Degrees of freedom
Total                            3000
Correction for mean              1
Site                             2
Block within site, B(S)          57
Genotype                         904
    Check                        4
    New                          899
    Check versus new             1
Genotype × site                  1808
    Check × site                 8
    New × site                   1798
    Check versus new × site      2
B(S) × check                     228
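The single-site and multi-site partitions for an augmented RCBD follow the same pattern, so they can be generated from c, r, n, and the number of sites. The sketch below is only a bookkeeping aid under the assumption that the same design parameters are used at every site; the function name and labels are illustrative.

```python
def augmented_rcbd_df(c, r, n, sites=1):
    """Degrees of freedom for an augmented RCBD repeated at `sites` sites."""
    g = c + n - 1                                   # genotype df (checks + new - 1)
    df = {"mean": 1}
    if sites > 1:
        df["site"] = sites - 1
    df["block within site"] = sites * (r - 1)
    df["genotype"] = g
    if sites > 1:
        df["genotype x site"] = g * (sites - 1)
    df["block x check within site"] = sites * (r - 1) * (c - 1)
    return df

for s in (1, 3):
    df = augmented_rcbd_df(c=5, r=20, n=900, sites=s)
    print(s, df, "total =", sum(df.values()))
```

For the QTL example this reproduces totals of 1000 for one site and 3000 for three sites, which equal the number of plots, sites × (c × r + n).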

Augmented Incomplete-Complete Block Design for an Association Mapping Study

This example supposes that 200 diverse bread wheat accessions from a genebank are to be used for an association mapping study. The accessions will be used to examine the possible relationship between various phenotypic traits (such as grain yield, resistance to leaf and yellow rust, bread making quality, protein content, etc.) and the molecular markers located along the seven chromosomes of the three genomes of wheat (A, B, and D). Ten sites with contrasting environmental conditions would be used to allow good discrimination of the 200 accessions. Differential environmental conditions must be used in order to obtain a good discrimination for resistance to different potential rust pathogens as well as for the other traits.

It is assumed that c = 15 checks can be arranged in r = 5 replicates and b = 25 incomplete blocks of size k = 3 are formed. The 200 accessions can be accommodated in 25 incomplete blocks of size 11 by enlarging the incomplete blocks from k = 3 to k = 11 by adding n/b = 200/25 = 8 new entries in each incomplete block.

The ANOVA table of the combined analysis across ten environments is:

Source of variation                  Degrees of freedom
Total                                2750
Correction for mean                  1
Site                                 9
Block within sites, R(S)             40
Genotype                             214
    Check                            14
    New                              199
    Check versus new                 1
Genotype × site                      1926
    Check × site                     126
    New × site                       1791
    Check versus new × site          9
Incomplete blocks within R × S       200
Intrablock error within sites        360

Augmented Row–Column Experimental Designs

Augmented row–column designs can be constructed either by adding rows and/or columns or by enlarging the intersections of the rows and columns of a square or rectangle. Considering the latter option, a 5 × 5 Latin square can be used for five checks A, B, C, D, and E, augmented with 250 new genotypes, adding 10 new genotypes to each row–column intersection as follows to obtain the schematic plan before randomization:

A 1–10      B 11–20     C 21–30     D 31–40     E 41–50
B 51–60     C 61–70     D 71–80     E 81–90     A 91–100
C 101–110   D 111–120   E 121–130   A 131–140   B 141–150
D 151–160   E 161–170   A 171–180   B 181–190   C 191–200
E 201–210   A 211–220   B 221–230   C 231–240   D 241–250

A randomization plan would be obtained for the Latin square and then the 11 entries in each row–column intersection would be randomly allotted to the 11 experimental units in each intersection. The new genotypes are randomly assigned to the numbers 1–250. A partitioning of the degrees of freedom in an ANOVA table is:

Source of variation        Degrees of freedom
Total                      275
Correction for the mean    1
Row                        4
Column                     4
Genotype                   254
    Check                  4
    New                    249
    Check versus new       1
Error                      12
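The schematic plan for the augmented Latin square, before randomization, can be generated from a cyclic Latin square for the checks with a fixed number of new genotypes added to each row–column intersection. The following sketch makes that construction explicit; the function name and the tuple layout are assumptions for illustration, and randomization is deliberately left out.

```python
def augmented_latin_square(checks, per_cell):
    """Cyclic Latin square of checks with `per_cell` new genotypes per intersection.

    Returns a t x t grid of (check, [new genotype numbers]) tuples.
    """
    t = len(checks)
    plan, next_id = [], 1
    for i in range(t):
        row = []
        for j in range(t):
            check = checks[(i + j) % t]              # cyclic shift gives a Latin square
            new = list(range(next_id, next_id + per_cell))
            next_id += per_cell
            row.append((check, new))
        plan.append(row)
    return plan

plan = augmented_latin_square("ABCDE", per_cell=10)
for row in plan:
    print("   ".join(f"{chk} {new[0]}-{new[-1]}" for chk, new in row))
```

Each intersection then holds 11 entries (one check plus ten new genotypes), giving 25 × 11 = 275 plots in total.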

An alternative row–column plan would be to set up a 25 row by 15 column rectangle as shown below.

If the variation in rows and in columns can be explained by linear, quadratic, and perhaps cubic trends and their interactions, then two checks would have been sufficient to obtain row and column solutions to adjust the new treatments, and 325 new treatments could have been included. An equal number of rows and columns results in the minimum number of check genotypes. For example, using a 20 × 20 square, 40 plots could be allocated to two check genotypes and 360 to new genotypes. There still would be more than 20 degrees of freedom associated with the error mean square. Another scenario supposes that one standard check genotype and four promising new genotypes in the final stage of evaluation are used. Utilizing new genotypes in their final stage of testing allows dual use of the results and efficient experimentation, eliminating the inclusion of too many check plots.

A randomization plan would involve randomly allocating the rows and columns in the 25 row by 15 column plan to the rows and columns in the experimental area, randomly assigning the letters A–E to the checks, and randomly allotting the numbers 1–250 to the new genotypes. A partitioning of the degrees of freedom in an ANOVA table is:

Source of variation               Degrees of freedom
Total                             375
Correction for the mean           1
Row                               24
Genotype                          254
    Check                         4
    New                           249
    Check versus new              1
Column (eliminating genotype)     14*
Error                             82*

*Need correction for confounding effects.

Federer et al. (1975) discuss a number of other arrangements including one used by Dr. A. Mangelsdorf. The Mangelsdorf design has a nice balanced property and was used for METs.

The first plan given above within this section is row–column–check connected in that solutions are obtainable for all effects. The plan immediately above is row–check connected and column–check connected but is not row–column–check connected. This means that functions of the column effects, such as linear, quadratic, cubic, etc., regressions are used in the analysis of such designs. In order to have a plan that is row–column–check connected, two of the transversals of the square or rectangle need to be adjacent to each other, a feature that an experimenter may consider as undesirable. Computer codes illustrating this type of analysis are given by Federer (2003b), Federer and Wolfinger (2003), and Wolfinger et al. (1997).

Augmented Resolvable Row–Column Experimental Designs

Experimental designs such as a lattice square or a lattice rectangle may be used to construct augmented lattice square and augmented lattice rectangle plans (Federer, 2002, 2003b). For such plans, row blocking and column blocking are included in each complete block, thus making the design resolvable. Since the proportion of experimental units in relation to the number of checks is less in an augmented lattice square, this is the plan that will be illustrated. There are k × k experimental units in each complete block, and 2k, 3k, etc., check genotypes may be used. To construct such a plan, a lattice square plan is obtained first for v = k² treatments. The complete blocks where treatments 1 to k and k + 1 to 2k appear together in a row or in a column are deleted. For 2k check genotypes, treatments 2k + 1, 2k + 2, …, k² are deleted in each of the r blocks. The rk(k – 2) new treatments are inserted into the deleted treatment spaces of the lattice square. To illustrate, with k = 7 and r = 7, a plan would be as shown in the schematic headed Replicate 1 to Replicate 7 below.

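Schematic plan for the 25 row by 15 column augmented row–column design (checks A–E, new genotypes 1–250):

A   1   2   B   3   4   C   5   6   D   7   8   E   9   10
11  A   12  13  B   14  15  C   16  17  D   18  19  E   20
21  22  A   23  24  B   25  26  C   27  28  D   29  30  E
E   31  32  A   33  34  B   35  36  C   37  38  D   39  40
41  E   42  43  A   44  45  B   46  47  C   48  49  D   50
51  52  E   53  54  A   55  56  B   57  58  C   59  60  D
D   61  62  E   63  64  A   65  66  B   67  68  C   69  70
71  D   72  73  E   74  75  A   76  77  B   78  79  C   80
81  82  D   83  84  E   85  86  A   87  88  B   89  90  C
C   91  92  D   93  94  E   95  96  A   97  98  B   99  100
101 C   102 103 D   104 105 E   106 107 A   108 109 B   110
111 112 C   113 114 D   115 116 E   117 118 A   119 120 B
B   121 122 C   123 124 D   125 126 E   127 128 A   129 130
131 B   132 133 C   134 135 D   136 137 E   138 139 A   140
141 142 B   143 144 C   145 146 D   147 148 E   149 150 A
A   151 152 B   153 154 C   155 156 D   157 158 E   159 160
161 A   162 163 B   164 165 C   166 167 D   168 169 E   170
171 172 A   173 174 B   175 176 C   177 178 D   179 180 E
E   181 182 A   183 184 B   185 186 C   187 188 D   189 190
191 E   192 193 A   194 195 B   196 197 C   198 199 D   200
201 202 E   203 204 A   205 206 B   207 208 C   209 210 D
D   211 212 E   213 214 A   215 216 B   217 218 C   219 220
221 D   222 223 E   224 225 A   226 227 B   228 229 C   230
231 232 D   233 234 E   235 236 A   237 238 B   239 240 C
C   241 242 D   243 244 E   245 246 A   247 248 B   249 250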
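The 25 row by 15 column schematic above follows a regular pattern: one check in every third column, the check positions moving one column to the right in each successive row, and the check letters shifting cyclically every three rows. The sketch below reproduces that pattern; the rule was inferred from the printed plan rather than taken from a stated construction, so it should be read as an assumption.

```python
def augmented_rectangle(checks="ABCDE", rows=25, cols=15):
    """Unrandomized row-column plan with checks on shifted transversals."""
    plan, next_id = [], 1
    step = cols // len(checks)              # 3: one check every three columns
    for i in range(rows):
        row = []
        for col in range(cols):
            if col % step == i % step:      # check position shifts with the row
                j = col // step
                row.append(checks[(j - i // step) % len(checks)])
            else:
                row.append(str(next_id))    # next unreplicated new genotype
                next_id += 1
        plan.append(row)
    return plan

plan = augmented_rectangle()
for row in plan[:4]:
    print(" ".join(f"{cell:>3}" for cell in row))
```

Every row contains each of the five checks once, so the checks occupy 125 of the 375 plots and the remaining 250 plots take the new genotypes.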

The symbol x indicates where one of the rk(k – 2) = 245 new genotypes would be entered. Row linear and quadratic effects and column linear and quadratic effects can be estimated (Federer, 2002). Checks 1–7 appear once with checks 8–14 in rows and in columns, but do not appear with each other. The diagonal elements need not be adjacent, as illustrated below.

A partitioning of the degrees of freedom in an ANOVA is:

Source of variation                                   Degrees of freedom
Total                                                 343
Correction for the mean                               1
Replicate or block                                    6
Genotype                                              258
    Check                                             13
    New                                               244
    Check versus new                                  1
Check × block                                         78
    Row linear within block                           7
    Column linear within block                        7
    Row linear × column linear within block           7
    Row quadratic within block                        7
    Column quadratic within block                     7
    Row quadratic × column quadratic within block     7
    Row cubic within block                            7
    Column cubic within block                         7
    Residual or error                                 22

To screen 30,000 new genotypes, k would be 33 and k = r = 33 replicates would be required. As stated earlier, the 2k = 66 checks could consist of two standard checks plus 64 new genotypes in their final stage of testing.

As an alternative design in this class, the checks could be in a lattice square experimental design. Then, each of the row–column intersections within each complete block could be enlarged to include the desired number of new genotypes.

Augmented Split Plot Experimental Designs

In order to compare the effect of environments and management procedures on new genotypes, the class of augmented split plot experimental designs has been proposed by Federer (2005b). The effects of factors such as tillage, fertilizers, insecticides, irrigation, planting density, date of planting, etc., on new genotypes could be assessed. The effect of the date of planting is often confused with site-to-site effects. The new genotypes to be assessed may appear in split plot treatments or in whole plot treatments. New genotypes can be tested for several factors at a time by using split split plot, split split split plot, etc., augmented designs. These designs allow for genotype-by-factor interactions and GEI, and are useful, especially in the final stages of screening genotypes. A schematic plan of a design is shown below for four whole plot treatments, such as tillage practices, with three checks (20, 21, and 22) and 19 new genotypes making up the 7 or 8 split plot treatments in each block, and r = 4 blocks or replicates of the check genotypes.

There are seven split plot treatments in Block 4 and eight in the other three blocks. The checks are given the highest numbers because SAS software subtracts the highest numbered effect from all the others for the estimated effects, and gives a standard error of a difference between an estimated effect of a genotype and the highest numbered one, rather than a standard error of an effect as indicated. It is usually more desirable to compare all new genotypes with a check, rather than compare all entries with a new genotype. The usual randomization procedure for a split plot experimental design would be used.

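Schematic plan for the augmented lattice square with k = 7 and r = 7 (checks 1–14; x marks a plot for a new genotype):

Replicate 1          | Replicate 2          | Replicate 3          | Replicate 4
1  x  x  x  x  x  14 | 1  x  x  x  x  x  13 | 1  x  x  x  x  x  12 | 1  x  x  x  x  x  11
8  2  x  x  x  x  x  | 14 2  x  x  x  x  x  | 13 2  x  x  x  x  x  | 12 2  x  x  x  x  x
x  9  3  x  x  x  x  | x  8  3  x  x  x  x  | x  14 3  x  x  x  x  | x  13 3  x  x  x  x
x  x  10 4  x  x  x  | x  x  9  4  x  x  x  | x  x  8  4  x  x  x  | x  x  14 4  x  x  x
x  x  x  11 5  x  x  | x  x  x  10 5  x  x  | x  x  x  9  5  x  x  | x  x  x  8  5  x  x
x  x  x  x  12 6  x  | x  x  x  x  11 6  x  | x  x  x  x  10 6  x  | x  x  x  x  9  6  x
x  x  x  x  x  13 7  | x  x  x  x  x  12 7  | x  x  x  x  x  11 7  | x  x  x  x  x  10 7

Replicate 5          | Replicate 6          | Replicate 7
1  x  x  x  x  x  10 | 1  x  x  x  x  x  9  | 1  x  x  x  x  x  8
11 2  x  x  x  x  x  | 10 2  x  x  x  x  x  | 9  2  x  x  x  x  x
x  12 3  x  x  x  x  | x  11 3  x  x  x  x  | x  10 3  x  x  x  x
x  x  13 4  x  x  x  | x  x  12 4  x  x  x  | x  x  11 4  x  x  x
x  x  x  14 5  x  x  | x  x  x  13 5  x  x  | x  x  x  12 5  x  x
x  x  x  x  8  6  x  | x  x  x  x  14 6  x  | x  x  x  x  13 6  x
x  x  x  x  x  9  7  | x  x  x  x  x  8  7  | x  x  x  x  x  14 7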
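The seven-replicate schematic can be generated, and its stated concurrence property checked, with a short script. The placement rule used here (checks 1 to k on the main diagonal, checks k + 1 to 2k on the cyclic sub-diagonal, shifted by one position per replicate) is inferred from the schematic and is an assumption of this sketch, as are the function names.

```python
from collections import Counter

def augmented_lattice_square(k=7):
    """Return k replicates, each a k x k grid of check numbers or 'x' (new genotype)."""
    reps = []
    for r in range(k):
        square = [["x"] * k for _ in range(k)]
        for i in range(k):
            square[i][i] = i + 1                               # checks 1..k on the diagonal
            square[i][(i - 1) % k] = k + 1 + (i - 1 - r) % k   # checks k+1..2k, shifted per replicate
        reps.append(square)
    return reps

def concurrences(reps, by="row"):
    """Count how often each pair of checks shares a row (or a column)."""
    counts = Counter()
    for square in reps:
        lines = square if by == "row" else list(zip(*square))
        for line in lines:
            counts[tuple(sorted(c for c in line if c != "x"))] += 1
    return counts

reps = augmented_lattice_square(7)
rows, cols = concurrences(reps, "row"), concurrences(reps, "col")
print(len(rows), set(rows.values()))   # 49 {1}
print(len(cols), set(cols.values()))   # 49 {1}
```

Each of the 49 pairs formed by one check from 1–7 and one from 8–14 occurs exactly once in a row and once in a column, and the 7 × (49 − 14) = 245 remaining plots are available for new genotypes.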

A partitioning of the degrees of freedom in an ANOVA would be:

Source of variation          Degrees of freedom
Total                        124
Correction for mean          1
Block, B                     3
Tillage, T                   3
B × T, error T               9
Genotype                     21
    Check                    2
    New                      18
    Check versus new         1
T × genotype                 63
    T × check                6
    T × new                  54
    T × check versus new     3
B × check within T           24
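As a check on this partition, the degrees of freedom can be computed from the number of checks c, new genotypes n, blocks r, and whole plot treatments t. This is a bookkeeping sketch only, with an illustrative function name; it assumes every whole plot treatment is crossed with every genotype entry in each block, as in the tillage example.

```python
def augmented_split_plot_df(c, n, r, t):
    """Degrees of freedom for an augmented split plot design.

    c checks, n unreplicated new genotypes, r blocks, t whole plot treatments.
    """
    g = c + n - 1                                   # genotype df
    return {
        "mean": 1,
        "block": r - 1,
        "whole plot treatment (T)": t - 1,
        "block x T (whole plot error)": (r - 1) * (t - 1),
        "genotype": g,
        "T x genotype": (t - 1) * g,
        "block x check within T": t * (r - 1) * (c - 1),
    }

df = augmented_split_plot_df(c=3, n=19, r=4, t=4)
print(df)
print(sum(df.values()))   # 124
```

The total of 124 equals the number of plots, t × (c × r + n) = 4 × (12 + 19).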

Codes for analysing data for this design and others in this class are given by Federer (2005b).

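Schematic plan for the augmented split plot design (checks 20, 21, and 22; new genotypes 1–19):

Tillage   Block 1                      Block 2
1         20 21 22 1 2 3 4 5           20 21 22 6 7 8 9 10
2         20 21 22 1 2 3 4 5           20 21 22 6 7 8 9 10
3         20 21 22 1 2 3 4 5           20 21 22 6 7 8 9 10
4         20 21 22 1 2 3 4 5           20 21 22 6 7 8 9 10

Tillage   Block 3                      Block 4
1         20 21 22 11 12 13 14 15      20 21 22 16 17 18 19
2         20 21 22 11 12 13 14 15      20 21 22 16 17 18 19
3         20 21 22 11 12 13 14 15      20 21 22 16 17 18 19
4         20 21 22 11 12 13 14 15      20 21 22 16 17 18 19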

Augmented Split Block Experimental Designs

Augmented split block experimental designs are another class of augmented experimental design for assessing the effects of various factors on new genotypes, as described by Federer (2005a), who discussed five different examples of this class and presented a numerical example and a code for analysis of the data. New genotypes may be considered to be random or fixed effects. One of the cases considered is an intercropping example for two crops with new genotypes for both crops. Allowing for interaction of factors with genotypes is an important aspect of this class of design. To illustrate one design within this class, an augmented randomized block experimental design is used for c = 3 checks (A, B, C), n = 25 new genotypes (1–25), and r = 4 blocks. Then, d = 4 dates of planting (D1, D2, D3, D4) are strip blocked across the entries in each of the four blocks. This is illustrated in the schematic layout below.

The date treatments are in an RCBD and the checks and new genotypes are in an augmented randomized block experimental design. The date experimental units are distributed across all the genotype entries in a block.

A possible partitioning of the degrees of freedom in an ANOVA table is:

Source of variation          Degrees of freedom
Total                        148
Correction for the mean      1
Block, B                     3
Genotype                     27
    Check                    2
    New genotype, G          24
    Check versus new         1
B × check                    6
Date, D                      3
B × D                        9
D × genotype                 81
    D × check                6
    D × G                    72
    D × check versus new     3
B × D × check                18

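Schematic layout for the augmented split block design (each date row is a strip across all entries in the block):

        Block 1                    Block 2
Date    A B C 1 2 3 4 5 6          A B C 7 8 9 10 11 12
D1
D2
D3
D4

        Block 3                    Block 4
Date    A B C 13 14 15 16 17 18    A B C 19 20 21 22 23 24 25
D1
D2
D3
D4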
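The split block layout can also be written out as one record per plot, which is the form most analysis software expects. The sketch below builds that list for the example with checks A, B, and C, 25 new genotypes, and four planting dates. The function name and the record layout are assumptions, and the randomization of dates and entries within blocks is omitted.

```python
from itertools import product

def augmented_split_block(checks, block_new, dates):
    """Return one (block, date, entry) record per plot, before randomization."""
    plots = []
    for block, new_entries in block_new.items():
        entries = list(checks) + list(new_entries)   # checks recur in every block
        for date, entry in product(dates, entries):  # each date strip crosses all entries
            plots.append((block, date, entry))
    return plots

# New genotypes 1-25 split over four blocks as in the schematic layout
block_new = {1: range(1, 7), 2: range(7, 13), 3: range(13, 19), 4: range(19, 26)}
plots = augmented_split_block("ABC", block_new, ["D1", "D2", "D3", "D4"])
print(len(plots))   # 148
print(plots[:3])
```

The count of 148 plots, (9 + 9 + 9 + 10) entries × 4 dates, matches the total degrees of freedom in the ANOVA table.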

Discussion

In the early stages of a plant breeding program, expected genetic gains may be increased by screening a large number of genotypes in contrast to having more precise comparisons of a smaller number of genotypes. This makes it necessary to evaluate many entries where there may not be sufficient seed to replicate each. For this reason Federer proposed augmented designs where a set of check entries is replicated an equal (or unequal) number of times in a specified field design and an additional set of new test entries is included in the experiment only once. In this review we show different types of augmented complete and incomplete block designs for the check treatments, with the test entries being added or “augmented” to the blocks.

This approach provides a very efficient means of screening test entries and has a considerable amount of flexibility. Augmented ICBD might be preferred over augmented CBD when the number of repeated checks is large. When soil variability runs in two directions, augmented row–column designs should be a good alternative, and when the experiment is “latinised” so that entries do not occur more than once in a row or column, precision increases further. The augmented incomplete block and/or row–column designs can be used for association mapping and/or genomic selection where a large number of entries (usually more than 1000) are needed but cannot be planted in all possible environments. The advantage of using these augmented designs is greatest when soil heterogeneity increases owing to limiting factors such as low water and nitrogen availability in the field.

Conclusions

There are many variations of split plot and split block experimental designs. Federer and King (2007) discuss several of these variations as well as combinations of the designs. Experimenters may find some of these variations suitable for augmenting with new genotypes that will fit the conditions for their experiment. Such designs as given in the last two sections above allow the experimenter to obtain interactions of new genotypes with a variety of factors. Instead of a single factor, a factorial combination of several factors could be used. For example, instead of date only, a factorial arrangement of date, fertilizer level, and insecticide could be used. Considerable flexibility is possible through the use of augmented experimental designs.

When it is advisable to use an augmented design, it may be used at several sites. For example, the Manglesdorf design presented by Federer et al. (1975) was used at several sites in Brazil. Methods for combining results over sites have been described by Federer et al. (2001), and they even allow for different designs at the different sites.

Statements

Acknowledgments

Sadly, Professor Walter T. Federer, the lead author of this chapter, passed away in April 2008. He was one of the greatest statisticians on the theme of experimental design for plant breeding, agronomy, and agriculture in general. Professor Federer was a unique, enthusiastic human being who was always ready to discuss serious scientific issues without losing his unique character of extreme kindness and gentlemanliness. I have the privilege to say that he was my friend.

Conflict of interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

References

  • 1

    Cullis, B. R., Smith, A. B., and Coombes, N. E. (2006). On the design of early generation variety trials with correlated data. J. Agric. Biol. Environ. Stat. 11, 381–393. doi: 10.1198/108571106X154443

  • 2

    Federer, W. T. (1956). Augmented (or hoonuiaku) designs. Hawaii. Plant. Rec. 55, 191–208.

  • 3

    Federer, W. T. (1961). Augmented designs with one-way elimination of heterogeneity. Biometrics 17, 447–473. doi: 10.2307/2527837

  • 4

    Federer, W. T. (2002). Construction and analysis of an augmented lattice square design. Biom. J. 44, 251–257. doi: 10.1002/1521-4036(200203)44:2<251::AID-BIMJ251>3.0.CO;2-N

  • 5

    Federer, W. T. (2003a). Exploratory model selection for spatially designed experiments – some examples. J. Data Sci. 1, 231–248.

  • 6

    Federer, W. T. (2003b). “Analysis for an experiment designed as an augmented lattice square design,” in Handbook of Formulas and Software for Plant Geneticists and Breeders, ed. M. S. Kang (Binghamton, NY: Food Products Press), 283–289.

  • 7

    Federer, W. T. (2005a). Augmented split block experiment design. Agron. J. 97, 578–586. doi: 10.2134/agronj2005.0578

  • 8

    Federer, W. T. (2005b). Augmented split plot experiment design. J. Crop Improv. 15, 81–96. doi: 10.1300/J411v15n01_07

  • 9

    Federer, W. T., and King, F. (2007). Variations on Split Plot and Split Block Experiment Designs. Hoboken, NJ: John Wiley and Sons Inc., 270. doi: 10.1002/0470108584

  • 10

    Federer, W. T., Nair, R. C., and Raghavarao, D. (1975). Some augmented row–column designs. Biometrics 31, 361–373. doi: 10.2307/2529426

  • 11

    Federer, W. T., and Raghavarao, D. (1975). On augmented designs. Biometrics 31, 29–35. doi: 10.2307/2529426

  • 12

    Federer, W. T., Reynolds, M., and Crossa, J. (2001). Combining results from augmented designs over sites. Agron. J. 93, 389–395. doi: 10.2134/agronj2001.932389x

  • 13

    Federer, W. T., and Wolfinger, R. D. (2003). “Augmented row–column design and trend analyses,” in Handbook of Formulas and Software for Plant Geneticists and Breeders, ed. M. S. Kang (Binghamton, NY: Food Products Press), 291–295.

  • 14

    Gendex. (2009). Gendex DOE Toolkit

  • 15

    Martin, R. J., Eccleston, J. A., and Chan, B. S. P. (2004). Efficient factorial experiments when data are spatially correlated. J. Stat. Plan. Inference 126, 377–395. doi: 10.1016/j.jspi.2003.08.001

  • 16

    Sprague, G. F., and Federer, W. T. (1951). A comparison of variance components in corn yield trials: II. Error, year × variety, location × variety, and variety components. Agron. J. 43, 535–541. doi: 10.2134/agronj1951.00021962004300110003x

  • 17

    Wolfinger, R. D., Federer, W. T., and Cordero-Brana, O. (1997). Recovering information in augmented designs, using SAS PROC GLM and PROC MIXED. Agron. J. 89, 856–859. doi: 10.2134/agronj1997.00021962008900060002x

  • 18

    Yates, F. (1936). A new method of arranging variety trials involving a large number of varieties. J. Agric. Sci. 26, 424–455. doi: 10.1017/S0021859600022760

Summary

Keywords

multi-environment trials, augmented experimental designs, genotype × environment interaction, quantitative trait loci (QTL)

Citation

Federer WT and Crossa J (2012) I.4 Screening Experimental Designs for Quantitative Trait Loci, Association Mapping, Genotype-by Environment Interaction, and Other Investigations. Front. Physiol. 3:156. doi: 10.3389/fphys.2012.00156

Received

15 March 2012

Accepted

03 May 2012

Published

01 June 2012

Volume

3 - 2012

Edited by

Jean-Marcel Ribaut, Generation Challenge Programme, Mexico

Reviewed by

Shan Lu, Nanjing University, China; Stanislav Kopriva, John Innes Centre, UK; Uener Kolukisaoglu, University of Tuebingen, Germany

Copyright

*Correspondence: José Crossa, Biometrics and Statistics Unit, International Maize and Wheat Improvement Center (CIMMYT), Apdo.Postal 6-641, 06600 Mexico DF, Mexico. e-mail:

This article was submitted to Frontiers in Plant Physiology, a specialty of Frontiers in Physiology.

Disclaimer

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.
