Abstract
An increase in the distribution of data points indicates the presence of genetic or environmental modifiers. Mapping of the genetic control of the spread of points, the uniformity, allows us to allocate genetic difference in point distribution to adjacent, cis effects or to independently segregating, trans genetic effects. Our genetic architecture-mapping experiment elucidated the “environmental context specificity” of modifiers, the number and effect size of positive and negative alleles important for uniformity in single and combined stress, and the extent of additivity in estimated allele effects in combined stress environments. We found no alleles for low uniformity in combined stress treatments in the maize mapping population we examined. The major advances in this research area since early 2011 have been in improved methods for modeling of distributions and means and detection of important loci. Double hierarchical generalized linear models and, more recently, a likelihood ratio formulation have been developed to better model and estimate the genetic and environmental effects in populations. These new methods have been applied to real data sets by the method authors and we now encourage additional development of the software and wider application of the methods. We also propose that simulations of genetic regulatory network models to examine differences in uniformity and systematic exploration of models using shared simulations across communities of researchers would be constructive avenues for developing further insight into the genetic mechanisms of variation control.
Introduction
There is useful information in the distribution of data as well as in the mean (Cleasby and Nakagawa, 2011; Geiler-Samerotte et al., 2013); genetic analysis of distributions can be especially informative (Hill and Mulder, 2010; Ronnegard and Valdar, 2012). Specifically, an increased spread of measured allele effects indicates the presence of a modifier, and is thus a clue to biological mechanisms. In Figure 1, we illustrate this point by showing the increased spread around the average in one symbol shape (Figure 1A, where the normal environment has points clustered around the mean whereas the stress environment has points spread broadly up and down from the mean), and then illustrate how the presence of a modifier that increases the growth trait under stress could be visualized, using blue filled symbols as compared to the yellow unfilled symbols (Figure 1B, compare normal to stressed effect). The color-coding thus represents the additional “dimension” when a modifier is present. In our hypothetical Figure 1 example, the modifier ameliorates the stress effect of the allele: without the contribution of the blue modifier points to the mean, the average trait value of any organism carrying that particular allele would be even lower than the two-fold decrease shown in the Figure 1 example. Of course it is also possible to have equal means when modifiers are present, which would mean that the allele is not detectable as a separate genetic variant. This type of modifier masking, without a detectable mean effect, could thus contribute to false negatives in typical quantitative trait locus (QTL) analyses of genetic architecture.
Figure 1
KEY CONCEPT 1. Modifier
allele or alleles that change the measured phenotype effect of another allele. This definition implies that the modifier effect is heritable and that the modifier allele effect is only measurable when the “receiving” genetic variant is present.
KEY CONCEPT 2. QTL (quantitative trait locus)
a particular allelic difference between DNA molecules that is associated with a difference in a measured phenotype of the organisms. To do such an analysis, there must be variation in the genotype (SNPs, markers) and variation in the phenotype (trait, measured value of trait).
KEY CONCEPT 3. Genetic architecture
list of the number of alleles and the pattern of allele effects in a genotype-phenotype mapping experiment. This can range from one or two large-effect alleles to many very small-effect alleles, or a mixture of these types. Most mapping populations only allow detection of relatively large effect SNPs (down to about 1% of the total variation of the measured trait); we assume that additional undetectable small-effect variants are present.
Our simple hypothetical example in Figure 1 includes both a modifier and an environmental difference. Environment-specific variants have been studied extensively (Lynch and Walsh,
KEY CONCEPT 4. Plasticity
difference in genetic architecture in a comparison of environments. This term is normally used in comparisons of the same population, so that the genetic variation is held constant while the environment is varied.
Basic research on the evolutionary trajectories and specific interactions that underlie genetic architecture differences has incorporated environment-specific differences, though combinations of stress have rarely been examined. In this focused review we emphasize multiple-factor combinations as an intermediate between single-factor lab-scale experiments and large-scale environmental dissection or clustering, such as crop modeling and weather record covariate analyses. Multiple-stress effects are relevant to breeding for our growing population, as typical crop yields are substantially lower than yields under optimum conditions, and the limiting factors vary. This yield gap could theoretically be narrowed by breeding more tolerant genotypes (Tollenaar and Lee,
KEY CONCEPT 5. Uniformity
spread of a group of measurements of experimental units such as individual plants; high uniformity indicates that most of the measurements in the replicates are close to the average, whereas low uniformity indicates that many points are far from the average of the replicates. The value of this number will scale with the value of the mean unless adjusted.
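The scaling of spread with the mean can be illustrated with a minimal Python sketch. It compares the raw standard deviation with the coefficient of variation (CV) for two sets of plant heights. The CV is only one common way of adjusting for the mean and is an assumption of this sketch, not necessarily the adjustment used in any particular analysis; the height values and the function name `spread_summary` are invented for illustration.

```python
import statistics


def spread_summary(values):
    """Return (mean, sample standard deviation, coefficient of variation)."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # raw spread, in the units of the trait
    cv = sd / mean                 # unitless spread relative to the mean
    return mean, sd, cv


# Hypothetical plant heights (cm); the tall group is the short group doubled.
short_plants = [48.0, 50.0, 52.0, 49.0, 51.0]
tall_plants = [2 * h for h in short_plants]

mean_s, sd_s, cv_s = spread_summary(short_plants)
mean_t, sd_t, cv_t = spread_summary(tall_plants)

print(f"short: mean={mean_s:.1f} sd={sd_s:.2f} cv={cv_s:.4f}")
print(f"tall:  mean={mean_t:.1f} sd={sd_t:.2f} cv={cv_t:.4f}")
```

In this toy case the standard deviation doubles when the mean doubles, while the CV is unchanged, so a comparison of unadjusted spread between tall and short groups could be mistaken for a difference in uniformity.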
Summary of the main results of our Frontiers article (Makumburage and Stapleton, 2011)
The three key results of our mapping experiment are the “context specificity” of modifiers, the comparison of the number and effect size (major QTL or minor QTL) of positive and negative alleles for uniformity in single and combined stress, and the extent of additivity in estimated allele effects in combined stress environments. To recap the first point, modifiers that map to the same locus as the mean effect (cis alleles) would be more likely to be transmitted together through the generations and thus not be dependent on the population context, as compared to trans alleles that could segregate independently. We found nine trans alleles for uniformity of the maize plant height trait, with these nine modifying loci spread over seven of the ten maize chromosomes. These nine loci are trans alleles, as there was no QTL at the same locus for mean height. We detected only one coincident cis QTL (in other words, one locus that had a significant effect on both uniformity and mean height). That single coincident QTL had different allele effect patterns across environments for height and uniformity, so it is not strictly cis in effect. Thus, the modifiers of plant height in this maize IBM94 RIL mapping population appear to be primarily segregating independently from the loci that contribute directly to tall and short plants in our experiment.
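The comparison of uniformity QTL with mean QTL by map position can be sketched in a few lines of Python. This is an illustrative sketch only, not the procedure used in the original study. It assumes each QTL is summarized as a hypothetical `(chromosome, start_cM, end_cM)` support interval and treats any overlap on the same chromosome as coincidence. The interval values and function names are invented.

```python
def intervals_overlap(a, b):
    """True if two (chromosome, start_cM, end_cM) intervals share a map position."""
    return a[0] == b[0] and a[1] <= b[2] and b[1] <= a[2]


def split_coincident_trans(uniformity_qtl, mean_qtl):
    """Split uniformity QTL into those that overlap a mean QTL and those that do not."""
    coincident, trans = [], []
    for u in uniformity_qtl:
        if any(intervals_overlap(u, m) for m in mean_qtl):
            coincident.append(u)
        else:
            trans.append(u)
    return coincident, trans


# Hypothetical support intervals: (chromosome, start in cM, end in cM).
uniformity_qtl = [(1, 40.0, 55.0), (3, 10.0, 20.0), (7, 80.0, 95.0)]
mean_qtl = [(1, 50.0, 70.0), (5, 30.0, 45.0)]

coincident, trans = split_coincident_trans(uniformity_qtl, mean_qtl)
print("coincident:", coincident)
print("trans:", trans)
```

Positional coincidence is only a first filter. As noted above, a coincident locus whose allele effect patterns differ between height and uniformity across environments is not strictly cis in effect.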
KEY CONCEPT 6. RIL
recombinant inbred line; these are experimental populations derived from the crossing of two inbred parents and subsequent inbreeding, so that cross-overs are now visible as “chunks” of the genome from each parent in each RIL. “Clones” of the same genotypes can be tested in many environments, and the lines have a known ancestral origin so that presence of a SNP can be modeled as independent from SNPs on other chromosomes.
Secondly, under single stress conditions there are stress-specific alleles that confer high and low uniformity as compared to the population mean, but under combined stress we only detected alleles that decrease uniformity. This pattern is different from the architecture of mean plant height, which has loci with both high and low allele effects under combined stress. In our hypothetical example (Figure 1B), the yellow-point modifier effect confers stress environment tolerance, and thus reduces the effect size of the blue allele under stress. This particular modifier interaction example was chosen as it illustrates a common modifier pattern in our data, with the stress modifier reducing the effect size of the allele at a locus. Thus, the stress-specific modifier would increase the spread of the points (decreasing the uniformity) as it conferred tolerance to the stress. All the uniformity alleles we found do indeed decrease uniformity, though we could not separate the effects of the modifier and the allele in the way we color-coded our hypothetical example, as we carried out separate Levene and mean analyses and compared them by map overlay. New methods for detecting modifier contributions are discussed in the next section of this review.
Finally, we found six loci that had predictable multiple-stress allele uniformity effects and three loci with surprising allele effects that could not be extrapolated from single stress effect estimates. One-third non-additive is probably an underestimate, as we first identified our QTL as having at least one significant genotype-environment interaction. After these G x E loci were identified, we examined the allele effects post hoc. We did not fit models designed to test for modifier contributions as illustrated in our example (Figure 1B). Fitting of models designed to specifically detect effects jointly in combined and single stress and to also separate modifier contributions from mean effects would be useful in future analyses, as such more specific models for additive or multiplicative combinations might be expected to increase our ability to detect higher-order effects even in small data sets. We discuss this point in more detail in the “Areas for Future Work” section below.
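One way to make the idea of a "predictable" combined-stress effect concrete is to compare the observed combined-stress allele effect with the sum of the two single-stress effects. The Python sketch below does this under stated assumptions: effects are expressed relative to the control environment, additivity is taken as a simple sum, and the locus names, effect values, and `TOLERANCE` cutoff are all hypothetical. A real analysis would use a formal interaction test rather than a fixed cutoff.

```python
def additive_deviation(effect_stress_a, effect_stress_b, effect_combined):
    """Observed combined-stress allele effect minus the additive expectation."""
    expected = effect_stress_a + effect_stress_b
    return effect_combined - expected


# Hypothetical allele effects on a uniformity score, relative to the control
# environment: (stress A alone, stress B alone, A and B combined).
loci = {
    "locus_1": (-0.25, -0.5, -0.75),  # combined effect equals the sum
    "locus_2": (-0.25, -0.5, -0.25),  # combined effect smaller than the sum
}

TOLERANCE = 0.125  # hypothetical cutoff for calling a locus additive

for name, (a, b, ab) in loci.items():
    deviation = additive_deviation(a, b, ab)
    label = "additive" if abs(deviation) <= TOLERANCE else "non-additive"
    print(name, deviation, label)
```

In this toy example `locus_2` shows a combined effect that is weaker than the additive expectation, which is the kind of pattern that cannot be extrapolated from single-stress estimates alone.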
Review of related results since our Frontiers publication
The major advance in this area since early 2011 has been the development of improved methods for modeling of distributions and means and detection of important loci. Double hierarchical generalized linear models and, more recently, a likelihood ratio formulation have been developed to better model and estimate the genetic and environmental effects in populations. Since our publication appeared, these models have also been applied to new data sets, and straightforward statistical models have been applied to well-understood biological data.
Conventional statistical methods, while still popular, have some inherent pitfalls that have been addressed by more recent methods. Levene's test (the method used in the article that is the focus of this review) is computed from the absolute differences between each observation and its group's mean or median. Generally, an F-statistic calculated on these absolute differences is used to make inferences about the trait's uniformity. This test is relatively robust to the distribution of the data points. Unfortunately, the test is unable to compare means and uniformity simultaneously and lacks the capacity to include covariates directly in the analysis.
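The construction of Levene's test described above can be sketched in Python with SciPy. The data are simulated and hypothetical (two genotype classes at a single marker with equal means and unequal spread), and the variable names are illustrative. The sketch runs `scipy.stats.levene` with `center="median"` and then reproduces the same statistic by hand as a one-way ANOVA F-statistic on the absolute deviations from each group's median.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Hypothetical plant heights (cm) for two genotype classes at one marker:
# same mean, different spread.
allele_a = rng.normal(loc=100.0, scale=5.0, size=50)
allele_b = rng.normal(loc=100.0, scale=15.0, size=50)

# Library call; center="median" uses deviations from the group median.
w_stat, p_value = stats.levene(allele_a, allele_b, center="median")

# The same statistic by hand: a one-way ANOVA F-statistic computed on the
# absolute deviations of each observation from its own group's median.
abs_dev_a = np.abs(allele_a - np.median(allele_a))
abs_dev_b = np.abs(allele_b - np.median(allele_b))
f_stat, f_p_value = stats.f_oneway(abs_dev_a, abs_dev_b)

print(f"Levene W = {w_stat:.3f}, p = {p_value:.3g}")
print(f"ANOVA F on absolute deviations = {f_stat:.3f}")
```

Because the test operates on absolute deviations only, it says nothing about a difference in means at the same marker, and there is no direct place to enter a covariate, which are the two limitations noted above.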
In determining how much organisms' genetics control responses to environments, a novel statistical method was developed that used double hierarchical generalized linear models (DHGLM) to estimate the genetic variance of both macro- and micro-environmental sensitivity simultaneously (Ronnegard and Valdar,
KEY CONCEPT 7. Environmental sensitivity
difference in measured phenotype value when the population is examined in different environments. The difference in environments can be loosely specified, such as difference in season or site. Alternatively, environments can be varied in a tightly specified way, as factorial experiments with all other aspects controlled.
In addition to the offspring size effect on precision, the inclusion of additional fixed effects in the models could further increase both the required family size and the required number of offspring per family. However, the required offspring numbers for detecting genetic variance in macro-environmental sensitivity are generally lower than for differences in micro-environmental sensitivity, and this expectation was borne out in the authors' analysis. Further, the environmental parameter used for the reaction norm slope was assumed to be known and without error. In many cases this assumption is acceptable, but there may be specific situations in which researchers wish to consider environmental parameters estimated from the data. Unfortunately, environmental parameters that are estimated from the data were shown to seriously bias the estimates of genetic variance in macro-environmental sensitivity. Provided these limitations are addressed in the experimental design, the DHGLM method remains a very useful way to increase our understanding of the genetic variance of environmental sensitivity and provides tools for discriminating between these types of environmental sensitivity.
A more recently developed method (Cao et al.,
KEY CONCEPT 8. Variance heterogeneity
differences in the spread of points between different samples or factors in an experiment; this is similar to uniformity but with the addition of the comparison of more than one experimental unit to the meaning.
The purpose of the likelihood ratio tests (LRTMV, LRTM, and LRTV) is to compare the likelihoods of the full and null models given the observed phenotypes, and to use that comparison to draw conclusions about the presence of mean differences and/or variance heterogeneity. Under Wilks's theorem, the likelihood ratio test statistic follows an approximate chi-squared distribution, which permits significance testing. The LRTV, however, is closely related to Bartlett's test of equality of variance, which has been shown to be sensitive to even slight violations of the normality assumption. Simulation studies of the LRT showed that LRTMV and LRTV (both of which include a test of variance heterogeneity) indeed have inflated Type I errors, and thus the authors recommended a bootstrap method for non-normal traits.
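The logic of the three likelihood ratio tests can be sketched in Python for normally distributed phenotypes grouped by genotype. This is a minimal illustration, not the published implementation. In particular, the nesting used here (a mean-only test that assumes a shared variance, and a variance-only test that allows group-specific means) is an assumption of the sketch, as are the function names and the simulated data. P-values use the chi-squared approximation only, without the bootstrap recommended for non-normal traits.

```python
import numpy as np
from scipy import stats


def normal_loglik(y, mean, var):
    """Gaussian log-likelihood of the values in y for a given mean and variance."""
    n = y.size
    return -0.5 * n * np.log(2 * np.pi * var) - np.sum((y - mean) ** 2) / (2 * var)


def lrt_mean_variance(groups):
    """Return {test: (statistic, df, p_value)} for joint, mean, and variance LRTs."""
    groups = [np.asarray(g, dtype=float) for g in groups]
    y = np.concatenate(groups)
    k = len(groups)

    # Null model: one mean and one variance (np.var defaults to the 1/n MLE).
    ll_null = normal_loglik(y, y.mean(), y.var())

    # Intermediate model: group-specific means, one shared variance.
    residuals = np.concatenate([g - g.mean() for g in groups])
    shared_var = np.mean(residuals ** 2)
    ll_means = sum(normal_loglik(g, g.mean(), shared_var) for g in groups)

    # Full model: group-specific means and group-specific variances.
    ll_full = sum(normal_loglik(g, g.mean(), g.var()) for g in groups)

    tests = {
        "LRTMV": (2 * (ll_full - ll_null), 2 * (k - 1)),
        "LRTM": (2 * (ll_means - ll_null), k - 1),
        "LRTV": (2 * (ll_full - ll_means), k - 1),
    }
    return {name: (s, df, stats.chi2.sf(s, df)) for name, (s, df) in tests.items()}


rng = np.random.default_rng(1)
# Hypothetical phenotypes for two genotype groups: equal means, unequal variances.
group_aa = rng.normal(loc=10.0, scale=1.0, size=80)
group_bb = rng.normal(loc=10.0, scale=2.0, size=80)

results = lrt_mean_variance([group_aa, group_bb])
for name, (statistic, df, p) in results.items():
    print(f"{name}: statistic={statistic:.3f}, df={df}, p={p:.3g}")
```

With this particular nesting the joint statistic is the sum of the mean and variance statistics. The sketch uses closed-form maximum likelihood estimates, so no numerical optimizer is needed.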
Compared with the double hierarchical generalized linear models (using the precursor to the 2011 macro- and micro-environmental model) in simulated data with strong normality (an advantage in uniformity testing), the LRT mean tests performed comparably. In the variance tests, however, the LRTV had the highest power. In single-purpose tests, the LRTM and LRTV are comparable to the DHGLMM and the DHGLMV. Further, joint tests (tests where differences in means and uniformity were tested simultaneously) were never as powerful as mean tests in the presence of just mean differences, and similarly, joint tests were never as powerful as variance tests in the presence of just variance heterogeneity.
Both the DHGLM and LRT methods have been applied to real data sets. The results of the cow herd analyses indicated that the within-herd micro-environmental model had the best fit, and that selection for increased milk production increased the environmental sensitivity (Mulder et al.,
Variance-detection methods have not been widely exploited in plant genetics since 2011. We found one example, in analysis of flowering time. The genetics of flowering in plants is extensively studied and this pathway was recently used for a data analysis incorporating uniformity (Shen et al.,
In addition to improved statistical methods since 2011, and applications of those methods, there is some recent published work on combined stress environment genetics. We recently examined the genetic architecture of combined ultraviolet radiation and drought stress QTL in maize, in both the IBM94 mapping populations and a subset of the nested association NAM population (Makumburage et al.,
Statistical formulations for plasticity epistasis and pleiotropy detection have been described recently (Zhou et al.,
Areas for future research
As uniformity decreases in the combination stress environment in our experiments with the maize IBM94 population, we propose that there are more modifiers involved in combination stress than in single stress responses. We suggest that a single-stress response would have a smaller network of transcription factors or physiological intermediaries, and genetic variation in those factors would be detectable in small mapping experiments. In contrast, the network of master regulators in combinations of stresses is hypothesized to balance input from different stress responses and thus either a) be a larger network with each individual transcription factor or physiological component playing a proportionally smaller, attenuating role, so that there would be no large-effect detectable allele that controls high genotype-by-environment effects in the combined-stress case, or b) have attenuating network interactions that repress QTL effects. We base our proposed mechanisms on both our results and on the more general observation that heritability decreases in stress environments. We favor the second explanation in theory, as the network of effects cannot be increased indefinitely as more environmental factors are applied, and as negative feedback/homeostasis is a defining feature of biological systems. These two hypotheses could be distinguished by increasing the power to detect small-effect QTL, either by increasing the sample size or improving the model power, or both; an increase in detection of QTL in combination stress would support the first hypothesis. In addition, we suggest that agronomically important factors such as heat and drought combinations be considered for follow-up experimental analyses; these factors are relatively difficult to manipulate and would likely require large numbers of experimental plots and careful fitting of covariates, but would provide results relevant to conditions predicted under climate change.
If attenuating combined environment networks are common in elite germplasm, then simulations of breeding strategies could incorporate this constraint.
In small mapping populations such as ours there is limited statistical power for detection of all combinatorial two-way interactions. We thus suggest that pathway or other complex biological priors be developed for follow-up analyses. Simulations, and improved, easy-to-use simulation methods, would help in developing priors for fitting complex causal models to the relatively small combination stress data sets that single investigators are able to generate. A careful comparison of models with negative interaction latent variable structure to models with increased numbers of latent variables would also be helpful in distinguishing between our two hypotheses for combination stress genetic architecture.
Why does heritability typically decline under stress? Is it an increase in noise as the system goes out of bounds, or a recruitment of more functions? We can restate this question as “does Ve or Vg*g*e increase?”. The key to addressing this question is developing better ways to partition the variance and to rigorously incorporate priors such as genetic regulatory pathway architecture. We suggest further exploration of causal models using methods such as Pearl's causal graph calculus (Pearl,
As we consider methods for complex trait and environment interaction, our choice of simulation model type becomes important. For example, are trans alleles for uniformity best thought of as nodes, as physical objects that change state such as protein that can be phosphorylated, or increased hormone concentrations, or should they be conceptualized as networks, with connections such as kinase activity or hormone movement? We suggest simulations of genetic regulatory network models to examine differences in uniformity, and systematic exploration of models using shared simulations across communities of researchers to better understand the constraints and power of different methods such as structural equation modeling and Boolean network construction. Simulations are especially useful for comparison of detection methods for precision and accuracy, and for ensuring that follow-up experiments have maximum power. Model systems for genetic architecture are also important to consider as simulations are constructed. For example, in yeast model systems that have one-step allele replacement, comprehensive simulation of knock-out effects should be part of any modeling effort. Plant model systems are especially well suited to multivariate trait data collection and analysis, and to developmental series-environment analyses, as well as to large-scale replication of genotypes by seed increase. Developmental and multi-trait factors should be incorporated into gene regulatory network models and explicitly tested for sensitivity to inform future experimental work.
Conflict of interest statement
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Statements
Acknowledgments
This project was supported by the National Research Initiative Competitive Grant no. 2009-35100-05028 from the USDA National Institute of Food and Agriculture. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Ann E. Stapleton How are genotype and phenotype associated in changing environments? My students and I examine the genetic architecture of maize responses when drought and other stresses are applied in combination. We also analyze microbial epiphyte interactions with plant genes and growth environments. I collaborate with computer scientists and statisticians for new analytical method development and to create cyberinfrastructure for democratic access to large-scale data analysis. We are funded by NSF and USDA. I was the recipient of the first UNCW award for mentoring of honors research students.
References
Bakir-Gungor, B., Egemen, E., and Sezerman, O. U. (2014). PANOGA: a web server for identification of SNP-targeted pathways from genome-wide association study data. Bioinformatics 30, 1287–1289. doi: 10.1093/bioinformatics/btt743
Cairns, J. E., Crossa, J., Zaidi, P. H., Grudloyma, P., Sanchez, C., Araus, J. L., et al. (2013). Identification of drought, heat, and combined drought and heat tolerant donors in Maize. Crop Sci. 53, 1335. doi: 10.2135/cropsci2012.09.0545
Cao, Y., Wei, P., Bailey, M., Kauwe, J. S. K., Maxwell, T. J., and Alzheimer's Disease Neuroimaging Initiative. (2014). A versatile omnibus test for detecting mean and variance heterogeneity. Genet. Epidemiol. 38, 51–59. doi: 10.1002/gepi.21778
Cleasby, I. R., and Nakagawa, S. (2011). Neglected biological patterns in the residuals. Behav. Ecol. Sociobiol. 65, 2361–2372. doi: 10.1007/s00265-011-1254-7
El-Soda, M., Malosetti, M., Zwaan, B. J., Koornneef, M., and Aarts, M. G. M. (2014). Genotype × environment interaction QTL mapping in plants: lessons from Arabidopsis. Trends Plant Sci. 19, 339–408. doi: 10.1016/j.tplants.2014.01.001
Geiler-Samerotte, K., Bauer, C., Li, S., Ziv, N., Gresham, D., and Siegal, M. (2013). The details in the distributions: why and how to study phenotypic variability. Curr. Opin. Biotechnol. 24, 752–759. doi: 10.1016/j.copbio.2013.03.010
Heslot, N., Akdemir, D., Sorrells, M. E., and Jannink, J. L. (2014). Integrating environmental covariates and crop modeling into the genomic selection framework to predict genotype by environment interactions. Theor. Appl. Genet. 127, 463–480. doi: 10.1007/s00122-013-2231-5
Hill, W. G., and Mulder, H. A. (2010). Genetic analysis of environmental variation. Genet. Res. 92, 381–395. doi: 10.1017/S0016672310000546
Lynch, M., and Walsh, B. (1998). Genetics and Analysis of Quantitative Traits. Sunderland, MA: Sinauer Associates, Inc.
Makumburage, G. B., Richbourg, H. L., LaTorre, K. D., Capps, A., Chen, C., and Stapleton, A. E. (2013). Genotype to phenotype maps: multiple input abiotic signals combine to produce growth effects via attenuating signaling interactions in maize. G3 (Bethesda). 3, 2195–2204. doi: 10.1534/g3.113.008573
Makumburage, G. B., and Stapleton, A. E. (2011). Phenotype uniformity in combined-stress environments has a different genetic architecture than in single-stress treatments. Front. Plant Sci. 2:12. doi: 10.3389/fpls.2011.00012
Marjoram, P., Zubair, A., and Nuzhdin, S. V. (2014). Post-GWAS: where next? More samples, more SNPs or more biology? Heredity 112, 79–88. doi: 10.1038/hdy.2013.52
Mulder, H. A., Crump, R. E., Calus, M. P. L., and Veerkamp, R. F. (2013a). Unraveling the genetic architecture of environmental variance of somatic cell score using high-density single nucleotide polymorphism and cow data from experimental farms. J. Dairy Sci. 96, 7306–7317. doi: 10.3168/jds.2013-6818
Mulder, H. A., Rönnegård, L., Fikse, W. F., Veerkamp, R. F., and Strandberg, E. (2013b). Estimation of genetic variance for macro- and micro-environmental sensitivity using double hierarchical generalized linear models. Genet. Sel. Evol. 45:23. doi: 10.1186/1297-9686-45-23
Pearl, J. (2000). Causality: Models, Reasoning, and Inference. New York, NY: Cambridge University Press.
Ronnegard, L., and Valdar, W. (2012). Recent developments in statistical methods for detecting genetic loci affecting phenotypic variability. BMC Genet. 13:63. doi: 10.1186/1471-2156-13-63
Sharma, R., Vleesschauwer, D. D., Sharma, M. K., and Ronald, P. C. (2013). Recent advances in dissecting stress-regulatory crosstalk in rice. Mol. Plant 6, 250–260. doi: 10.1093/mp/sss147
Shen, X., Pettersson, M., Rönnegård, L., and Carlborg, Ö. (2012). Inheritance beyond plain heritability: variance-controlling genes in Arabidopsis thaliana. PLoS Genet. 8:e1002839. doi: 10.1371/journal.pgen.1002839
Tollenaar, M., and Lee, E. A. (2002). Yield potential, yield stability and stress tolerance in maize. Field Crops Res. 75, 161–169. doi: 10.1016/S0378-4290(02)00024-2
Windhausen, V. S., Wagener, S., Magorokosho, C., Makumbi, D., Vivek, B., Piepho, H.-P., et al. (2012). Strategies to subdivide a target population of environments: results from the CIMMYT-Led Maize hybrid testing programs in Africa. Crop Sci. 52, 2143. doi: 10.2135/cropsci2012.02.0125
Zhai, Y., Lv, Y., Li, X., Wu, W., Bo, W., Shen, D., et al. (2014). A synthetic framework for modeling the genetic basis of phenotypic plasticity and its costs. New Phytol. 201, 357–365. doi: 10.1111/nph.12458
Zhou, T., Lyu, Y., Xu, F., Bo, W., Zhai, Y., Zhang, J., et al. (2013). A QTL model to map the common genetic basis for correlative phenotypic plasticity. Brief. Bioinform. bbt089. [Epub ahead of print]. doi: 10.1093/bib/bbt089
Summary
Keywords
QTL, genotype-environment interaction, modifier, uniformity, variance heterogeneity, combined stress effects, abiotic stress, crop
Citation
Landers DA and Stapleton AE (2014) Genetic interactions matter more in less-optimal environments: a Focused Review of “Phenotype uniformity in combined-stress environments has a different genetic architecture than in single-stress treatments” (Makumburage and Stapleton, 2011). Front. Plant Sci. 5:384. doi: 10.3389/fpls.2014.00384
Received
18 February 2014
Accepted
18 July 2014
Published
11 August 2014
Volume
5 - 2014
Edited by
Shawn Kaeppler, University of Wisconsin-Madison, USA
Reviewed by
Seth C. Murray, Texas A&M University, USA; Elhan Ersoz, Syngenta Biotechnology, USA
Copyright
© 2014 Landers and Stapleton.
This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
*Correspondence: stapletona@uncw.edu
This article was submitted to the journal Frontiers in Plant Science.