Maintenance of Complex Trait Variation: Classic Theory and Modern Data

Koch, Evan M.; Sunyaev, Shamil R.

doi:10.3389/fgene.2021.763363

REVIEW article

Front. Genet., 12 November 2021

Sec. Evolutionary, Population, and Conservation Genetics

Volume 12 - 2021 | https://doi.org/10.3389/fgene.2021.763363

Maintenance of Complex Trait Variation: Classic Theory and Modern Data

EM
Evan M. Koch ^1,2
SR
Shamil R. Sunyaev ^1,2^*

1. Department of Biomedical Informatics, Harvard Medical School, Boston, MA, United States
2. Division of Genetics, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA, United States

Article metrics

View details

Citations

6,5k

Views

1,8k

Downloads

Abstract

Numerous studies have found evidence that GWAS loci experience negative selection, which increases in intensity with the effect size of identified variants. However, there is also accumulating evidence that this selection is not entirely mediated by the focal trait and contains a substantial pleiotropic component. Understanding how selective constraint shapes phenotypic variation requires advancing models capable of balancing these and other components of selection, as well as empirical analyses capable of inferring this balance and how it is generated by the underlying biology. We first review the classic theory connecting phenotypic selection to selection at individual loci as well as approaches and findings from recent analyses of negative selection in GWAS data. We then discuss geometric theories of pleiotropic selection with the potential to guide future modeling efforts. Recent findings revealing the nature of pleiotropic genetic variation provide clues to which genetic relationships are important and should be incorporated into analyses of selection, while findings that effect sizes vary between populations indicate that GWAS measurements could be misleading if effect sizes have also changed throughout human history.

1 Introduction

Attempts to understand genetic architecture preceded the discovery of DNA as the model of heredity (Fisher, 1918), and much theoretical work on selection, the maintenance of variation, and the adaptation of complex traits began before the ability to record genotypes on a scale sufficient to meaningfully contribute to these questions (Walsh and Lynch, 2018). The modern genetic era has provided an opportunity to test classic theories and to expand models—both long-standing and relatively recent—based on new understandings of genetic architecture and mechanisms. Genome-wide association studies (GWAS) and other data-driven tools have raised additional questions, including how so much heritability for many traits is contributed by relatively common alleles when natural selection is often expected to remove deleterious variation from the population. The flood of methods and data has sharpened and revised our understanding of many components that fashion the structure of the genome—polygenicity, selection, the distribution of mutational effects, pleiotropy—but has left us wanting for models capable of reconciling these elements (Sella and Barton, 2019).

The analysis of GWAS data revealed an extraordinary degree of polygenicity, and showed that most heritability is explained by relatively common, mostly noncoding alleles of small effect. At first glance, this observation is surprising. Natural selection is expected to maintain the population near an optimum value for quantitative traits and to reduce the prevalence of potentially maladaptive phenotypes such as diseases. Such optimums and maladaptive phenotypes are defined within a given environmental context (Harpak and Przeworski, 2021). Selection generally acts by reducing the frequency of phenotypically relevant alleles, though it may drive allele frequency increases when shifts in the optimum phenotype occur. This basic logic led to the question whether the effect of natural selection is evident from GWAS data at all. Recent studies have reached a strong consensus that phenotypic effect sizes are negatively correlated with allele frequency (Gazal et al., 2018; Zeng et al., 2018; Schoech et al., 2019; Speed et al., 2020; Zeng et al., 2021). These findings are inconsistent with purely neutral models, but various models of natural selection influencing trait variation remain plausible (Walsh and Lynch, 2018). Uncertainty largely surrounds whether the focal trait is causally important for fitness compared to pleiotropically related ones, and whether selection is primarily stabilizing or has important directional components. In spite of many unresolved details, the emerging picture is that a vast supply of mutations with weak effects, coupled with generally inefficient selection against such alleles, is the basis of phenotypic variation.

Empirical results from GWAS on the distribution of effect sizes and allele frequencies still pose the challenge of which classic and emerging models from theoretical population genetics are able to best explain the emerging observations. Existing theories range from models of selection acting directly on the focal trait to models where selection on genetic variation is driven by simultaneous effects on other traits (pleiotropy), to even fully “apparent” selection, which assumes the focal trait is not subject to any selective constraint. In this review, we discuss a relevant subset of these models and how their predictions look in light of recent studies of selection in GWAS. We identify pleiotropy and variable effect sizes of genetic variants across time and space as important factors that have yet to be satisfactorily included into statistical methods and theoretical models.

2 Theoretical Models of Maintenance of Complex Traits and Predictions They Generate

Evolutionary quantitative genetics has subsisted for most of its existence on a limited set of possible measurements. Estimates of the genetic and mutational variance, as well as selection gradients, are informative, especially with respect to contemporary patterns of selection. However, most progress in explaining maintenance of genetic variation in phenotypic traits was theoretical. Now that GWAS have generated an abundance of matched phenotypic and genetic measurements we live in a much more data-rich world. If we turn our attention to a single, focal trait, what sort of data would we ideally wish for? We would probably include the impact of genetic variants (estimated as their effect size) on the trait in a range of environments, the fitness effects of these alleles, as well as their frequencies and linkage patterns (Johnson and Barton, 2005). These would yield a satisfying and useful description of the genetic architecture and the process of its development, but there are fundamental details not immediately obvious from this description. We would like to know whether fitness effects arise primarily through selection on the focal trait, and if so what form it takes. If there are substantial fitness effects unrelated to the focal trait, what other traits are involved and how does selection act on them? Is the population in equilibrium? Has the genetic architecture changed in the past and will it do so in the future? Questions like these can be addressed by modeling how selection acts on traits, the mutational distributions underlying them, and how these generate the genetic architectures we observe.

Textbook introductions to population genetics begin by assigning fitnesses to genotypes and examine the consequences for allele frequencies and overall patterns of genetic variation. Connecting trait values, such as those measured in GWAS, to selection on individual causative alleles requires the additional step of specifying how selection on phenotypes leads to fitness differences among genotypes. While slightly less familiar than other selection results, this task was also taken up by many of the authors of classical population genetics and has grown into a large branch of evolutionary theory.

The simplest and most obvious model predicts the selection on individual causative loci arising from stabilizing selection on a single polygenic trait with purely additive genetic variance (Wright, 1935; Robertson, 1956; Bulmer, 1972) (Figure 1A). In this model an individual’s trait value (z) is determined by the sum of effects from L independent loci: , where β_l is the effect size of the allele at locus l, g_l,_m and gl,_p are the maternal and paternal genotypes at each locus, and e is a normally distributed environmental effect centered at zero. If an individual’s fitness is a Gaussian function centered at the population mean M and with width V_S (w(z) = exp (−(M − z)²/2V_S)), then selection will change the average frequency of a causative allele at locus l with effect size β_l = β as follows:

FIGURE 1

Stabilizing selection tends to remove genetic variation in this trait from the population. A balance between mutation, selection, and drift generates the trait’s genetic variance in the population (Bulmer, 1972; Keightley and Hill, 1988). Such direct stabilizing selection leads to a negative correlation between minor allele frequencies and the effect size magnitude.

We can write the selection coefficient for this model as to acknowledge that stabilizing selection takes the underdominant form shown in Eq. 1 () rather than the more familiar additive one (E[Δx] = sx (1 − x)). The term appearing in the underdominant formula means that selection against the derived allele actually decreases as it approaches 50% frequency and actually switches signs after that point. The minor allele is therefore always disfavored. However, when selection is strong or the allele frequency is low, the differences are minor as x is small compared to 1/2. We generally omit the subscript in s_ud for ease of reading, but it is important to note that the interpretation of selection coefficients differs depending on whether stabilizing selection is explicitly modeled or not.

The variance-reducing property of stabilizing selection motivated the development of other models with variance-promoting features like overdominant side-effects of causative alleles (Robertson, 1956; Gillespie, 1984) and strong mutational pressure (Lande, 1976b). As always in evolution, we must also at least consider the possibility that a trait of interest has a negligible impact on organismal fitness. The population mean value of a trait controlled by strictly neutral mutation will drift in Brownian motion and have a genetic variance that depends on the mutation rate, the second moment of the distribution of mutation effect sizes, and the average pairwise coalescent time between randomly sampled loci (Lande, 1976a; Lynch and Hill, 1986; Koch, 2019). Var[z] = E[T₂]θμ₂, where T₂ is the average number of generations it takes for a pair of sites to coalesce, θ is the mutation rate per generation, and μ₂ is the second moment of the distribution of mutational effects. Crucially, there would be no relationship between the effect size and frequency of alleles (Figure 1B).

Of course, both intuitively and empirically, traits in natural and contemporary human populations at least appear to be under some selection (Kingsolver et al., 2001; Corbett et al., 2018; Sanjak et al., 2018), and involve some level of pleiotropy (Stearns, 2010). Models of apparent selection begin with the assumption that the focal trait is not itself under any selection but add pleiotropic fitness effects of the causative alleles. Individuals in the tails of a phenotypic distribution will carry more mutations overall, and if trait-affecting mutations have deleterious pleiotropic effects those individuals will also have lover fitness on average (Barton, 1990; Kondrashov and Turelli, 1992). Fitness that decreases away from the mean is reminiscent of stabilizing selection, but the strict deleterious model of apparent selection does not induce the negative correlation between allele frequencies and effect size magnitudes expected when the focal trait itself is actively selected. The negative correlation between β² and x may yet be rescued if the deleterious pleiotropic effects of variants affecting the focal trait arise from genetic covariance (Lande and Arnold, 1983) or correlated effect size magnitudes (Keightley and Hill, 1990). In this scenario, alleles with larger effects (or absolute magnitudes) on the focal, neutral trait are more likely to have larger effects on a second, selected trait. Allele frequencies are suppressed through selection on the second. In the extreme where the focal and selected trait are so closely biologically related that the effect sizes of mutations are deterministically linked, it is indistinguishable which trait causally impacts fitness, although a strong genetic covariance would be measurable. One can also imagine a model where each mutation has such a relationship with a unique pleiotropic trait, for instance, molecular effects in different pathways. Assuming that large-effect alleles for the focal trait induce stronger molecular effects, there may be strong selection without measurable genetic covariance between the focal trait and any individual pleiotropic trait.

The differences between models come down to how the statistical relationship between selection coefficients and effect sizes is specified: how s scales on average with β and what the random variation around this looks like. In multivariate stabilizing selection, s scales with β² as in direct stabilizing selection, but apparent selection models don’t have this restriction. Apparent selection models were extended, as described above, to include increasing selection with greater β in addition to the negative pleiotropic consequences (Keightley and Hill, 1990; Zhang and Hill, 2002; Eyre-Walker, 2010) (Figure 1D). Models of multivariate stabilizing selection paint a similar picture, but the focal and pleiotropic traits are explicitly under stabilizing selection (Zhang and Hill, 2003; Simons et al., 2018) (Figure 1C). All lead to a negative (β², x) relationship, so differences between models come down to the shape and variance of that relationship, along with impacts on the genetic architecture.

Directional selection on complex disease susceptibility is also a viable hypothesis. In this view, the disease phenotype is itself deleterious and alleles that increase susceptibility will be selected against (Charlesworth, 2001; Wright et al., 2003) (Figure 1E). This also implies a mutational bias towards susceptibility-increasing alleles. It is plausible that there is a fitness cost associated with carrying such alleles, even for late-onset diseases (Pavard and Coste, 2021). All of the pleiotropy arguments made for stabilizing selection would apply equally well here.

There is an emerging consensus that models of mutation-selection-drift balance are likely to explain the genetic architecture of many, if not most, complex traits (Sella and Barton, 2019). The models of apparent, stabilizing, and directional selection described above, with varying possible degrees of pleiotropic selection, all remain possibilities within this consensus and are not mutually exclusive. Progress in statistical genetics methodology and increasing GWAS sample sizes are starting to clarify these details.

3 Detecting Negative Selection in Genome-Wide Association Studies Data

As sample sizes increased and GWAS became sufficiently powered to detect larger numbers of loci for different traits, attention started shifting from the speculative question of how study design should be informed by selection and its effect on genetic architecture (Pritchard, 2001; Reich and Lander, 2001), to what the genetic architecture, as revealed through these studies, might say about selection. A transitional form was contributed by (Agarwala et al., 2013), who investigated how selection may have shaped the genetic architecture of Type 2 Diabetes, which had recently gone from 2 to 39 genome-wide significant loci. Using primarily the number of associations, conditional on the heritability and prevalence of the disease, they ruled out both neutrality of the focal trait and a model where selection is proportional to effect size: β ∝ |s|.

Following this, methods were developed that do not explicitly model natural selection on causative variants, but ask whether lower frequency variants contribute disproportionately to heritability. This heritability bias should only occur if rare variants have larger effect sizes on average, the most plausible explanation being negative selection correlated with the magnitude of effect sizes. A simple approach is to divide variants into MAF bins and estimate the heritability contribution of each in a mixed model framework (Yang et al., 2015). applied this approach to height and body mass index (BMI) and (Mancuso et al., 2016) to prostate cancer risk. Both found increased heritability in rare variants compared to common, and Mancuso et al. (2016) performed simulations to demonstrate that, conditional on disease heritability and prevalence, they could also rule out focal trait neutrality and directly proportional selection.

More sophisticated analyses using the same general idea as partitioning heritability by allele frequency have been developed and applied to a wide variety of human traits. Extensions of LD score regression (LDSC), a useful tool for partitioning heritability among large numbers of annotations (Finucane et al., 2015), were developed for features of negative selection. These analyses found that younger genetic variants contribute more heritability than older genetic variants at the same frequency (Gazal et al., 2017), a key feature of negative selection (Maruyama, 1974; Kiezun et al., 2013). They also confirmed earlier findings that rare variants have greater effect sizes than common ones for a larger number of traits (Gazal et al., 2018). Another popular and tractable approach, termed the alpha model, explicitly sets the MAF dependence of heritability contributions through a single parameter α: E[β²|x] ∝ (x(1 − x))^α (Zeng et al., 2018; Schoech et al., 2019; Speed et al., 2020; Zeng et al., 2021). An α < 0 indicates a heritability bias towards rare variants and some amount of negative selection. Applications of this model have been remarkable in their consistently negative α estimates across all analyzed traits. While some differences between traits are inferred, e.g. height has a smaller α than BMI, estimates are consistently within the range [− 0.5, −0.2].

The negative relationships between effect size magnitude and minor allele frequency inferred for so many traits are informative about the model of selection. In particular, they allow us to rule out neutral models where the focal trait and all underlying variation are unaffected by selection, as well as strict models of apparent selection where the causative variants are deleterious, but the strength of this selection is uncorrelated with effect sizes. However, many other models of selection may still be compatible with these findings (Figure 1, Figure 2). The model of direct stabilizing selection on a single trait first proposed by Wright and others (Equation 1) is plausible for some traits. On the other hand, a genetic correlation between the focal trait and another (or many) under stabilizing selection could produce the observed negative correlations even if the focal traits were completely neutral. There is also a lot of space in between with varying contributions to selection from the focal and pleiotropic traits. While seemingly semantic, the question is really about the extent to which variation in the underlying biology of the focal trait causes variation in fitness. This can apply even when the focal trait is something seemingly benign, an arbitrary bone for instance, whose size is governed chiefly by the biology of overall body size.

FIGURE 2

While the alpha model does not explicitly incorporate a population genetics model in any statistical analysis, it is possible to further interpret results using simulations and theory (Figure 2). In simulations, the idea is to use a model of choice to generate allele frequencies and effect sizes and then use the inference procedure to estimate what α corresponds to those model parameters. For theory, one derives E[β²|x] under the selection model and compare this to the approximate alpha model expectation of E[β²|x] ∝ x^α. Using this approach, Schoech et al. (2019) showed that the inferred α depends both on the distribution of fitness effects of new alleles affecting the trait (DFE), and on the average scaling of effect sizes and selection E[β²|s] ∝ s^2τ, where the parameter τ determines the scaling through the relationship E[β] ∝ s^τ (Eyre-Walker, 2010). The DFE dependence enters primarily through a frequency-threshold effect: alleles below the threshold have effect sizes roughly uncorrelated with frequency because, while above the dependence scales approximately like E[β²|x] ∝ x^−2τ. The threshold is the frequency below which most new mutations from the DFE are still mostly affected by drift rather than selection, and is therefore lower for a heavy-tailed DFE with a high average s. Zeng et al. (2021) used population genetic simulations to fit the DFE for various traits by conditioning on the values they had estimated for α, polygenicity, and SNP heritability. They found greater variation in the DFE among trait categories than variation in α estimates. α estimates that are relatively insensitive to the DFE are consistent with the predictions of Schoech et al. (2019) if most SNPs included in the analysis are above the frequency threshold where selection is detectable. These simulations therefore also demonstrate that polygenicity and heritability are informative about the DFE.

Simons et al. (2018) developed a model for the relationship between effect sizes and selection coefficients based on isotropic stabilizing selection and Fisher’s geometric model (the specifics of the model is discussed in a subsequent section). The number of trait dimensions in this model corresponds to the effective number of independent axes of genetic variation, a value that can be interpreted as the degree of pleiotropy. With a single dimension the selection coefficient is the same as in the classical model of one dimensional stabilizing selection: . When the number of traits becomes large the relationship becomes β ∼ N (0,(V_S/n_e)s), where n_e is the effective number of traits, and expressions for moderate pleiotropy interpolate between these extremes. Rather than fit this model to the heritability explained by different minor allele frequencies, Simons et al. (2018) analyzed the distribution of variance contributions, v = 2β²x(1 − x), among genome-wide significant SNPs. For a given mean among discovered loci, the variance of v is higher with greater pleiotropy (n_e), with a parametric likelihood derived by the authors. The high-pleiotropy model was found to fit the distribution of GWAS hits for standing height and BMI better than the no- and low-pleiotropy alternatives.

Zeng et al. (2021) also simulated varying degrees of pleiotropy using the Simons et al. (2018) model and found that α estimates were insensitive to changes in the degree of pleiotropy (n_e). This makes sense, given that the alpha model only attempts to fit the average effect size - frequency relationship and suggests that new approaches will be needed to investigate the nature of pleiotropy and the relative importance of the focal trait.

4 Model Building Using Geometry and Pleiotropy

Using the distribution of causative allele frequencies and their effects solely on the focal trait, what could be done to further interpret the results of GWAS studies? One advance would be to explicitly include selection in the next generation of models that build upon LDSC or α models (Sella and Barton, 2019). We may start by imagining what class of models could fit the joint distribution of (x, β). Assume a set of parameters Θ that describes the selection model. An analysis would use the likelihood p(x, β|Θ), which decomposes into p(x, β|Θ) = p(x|β, Θ)p(β|Θ). Since the distribution of effect sizes is not a major concern for selection, inference would focus on p(x|β, Θ). The distribution of frequencies for a given effect size is determined by integrating over the possible fitness effects of a mutation with effect size β: p(x|β, Θ) = ∫p(x|s, Θ)p(s|β, Θ)ds. The effect of selection, s, could be either additive or underdominant (stabilizing selection), but could also represent other models beyond these two such as overdominant or fluctuating selection. x can be replaced by the age or historical frequency path of the allele (Stern et al., 2021). p(x|s) can be tackled with standard population genetics, so the trickier problem is to provide p(s|β, Θ) in cases of pleiotropic selection.

In an early attempt to do this explicitly, Keightley and Hill (1990) proposed p(s|β, Θ) as the conditional distribution of a two-dimensional Wishart distribution. In this formulation both the mean and variance of s, conditional on β, are proportional to |β| plus a constant, and a correlation parameter determines how the variance scales with the mean. This contrasts with the model of direct stabilizing selection where s is proportional to β². Another approach decomposes this distribution as p(s|β, Θ) ∝ p(β|s, Θ)p(s|Θ). This has the appealing property of separating the link between fitness and trait effects from the distribution of fitness effects (DFE). Eyre-Walker (2010) proposed a form for p(β|s, Θ) where E[β] ∝ s^τ with multiplicative noise. Both of these models try to capture a space of potential relationships between effect size and selection without being over-parameterized. However, it is not actually clear how one should interpret results from either. A weak correlation parameter from the Keightley and Hill (1990) model would perhaps indicate the importance of pleiotropy, but the linear scaling between |β| and s would not make sense with direct stabilizing selection. Eyre-Walker’s τ doesn’t necessarily mean stronger or weaker selection. Would it mean anything for the relative importance of directional or stabilizing selection? Moreover, these two models make divergent predictions for the contribution of rare versus common alleles to the genetic variance (Eyre-Walker, 2010; Caballero et al., 2015; Sella and Barton, 2019).

Simons et al. (2018) made a strong argument for interpretability when deriving their distribution for p(β|s, Θ). The framework they used was multivariate stabilizing selection in a geometric model (Fisher, 1930). Models within this framework generally posit a multidimensional phenotypic space with a selection function that describes the fitness of each possible phenotypic combination. Typically, the fitness function is Gaussian and centered at some optimum phenotype. A mutation is a vector that moves an individual to a different point in phenotype space, thereby altering fitness. Assuming a population centered at its optimum value, with each phenotypic direction under equal stabilizing selection and mutational pressure, p(β|s, Θ) takes a simple parametric form depending only on the number of dimensions n_e and the strength of selection V_s.

Previous work using Fisher’s geometric model had used it to derive the DFE of new mutations (Martin and Lenormand, 2006; Lourenço et al., 2011) or the expected genetic variance and correlation of the focal trait with fitness (Zhang and Hill, 2003) rather than p(β|s). A major assumption of these studies was that the phenotypic effects of new mutations were drawn from a multivariate normal distribution with different dimensions representing different phenotypes. While a seemingly natural starting place, the assumption of normally distributed mutations is far from realistic and mathematically troublesome. There is accumulating evidence that the mutational effect distribution is substantially leptokurtic for many traits (Zhang et al., 2018; O’Connor et al., 2019; O’Connor, 2021). It is has also been shown that, for a single normal distribution of mutations, the DFE concentrates around a point value of s as the number of traits becomes large, an obviously unrealistic scenario (Waxman and Peck, 1998; Wingreen et al., 2003; Zhang and Hill, 2003). Thankfully, one may still rescue the utility of geometric models by using a mixture of normals.

For example, the Simons et al. (2018) likelihood can be derived from the geometric model with normal mutation proposed by Martin and Lenormand (2006) by integrating out a variance parameter: . If mutations are uncorrelated, equally affected by stabilizing selection, and drawn from a mixture of normal distributions, then the distribution of variances, p(σ²), fully describes the mutational distribution. It is straightforward using Bayes’ theorem to show that p(s|β) ∝ p(β|s)∫p(s|σ²)p(σ²)dσ² and contains two components. The first component, p(β|s), has the form derived by Simons et al. (2018), a normal distribution with variance proportional to s when the number of traits is large. This part is independent of σ². The second component, ∫p(s|σ²)p(σ²)dσ², is the DFE itself. In this example, the DFE is generated by the distribution of mutational effects. The variance of what normal distribution in the mixture a mutation comes from determines how strongly selected it is.

The above approach suggests that a fruitful way to propose future models would be to propose that there exist different mutational modes. Modes might represent different biological pathways and could be parameterized by which traits are involved, the correlation of mutational effects among these, and the distribution of mutational effect sizes. If summarized in Θ_M, we might then integrate over the distribution of modes. If β is conditionally independent of s given Θ_M, then the form of the DFE will be separable from the link between selection and effect sizes, though it is not always clear that this will be the case. Directional selection as well as antagonistic pleiotropy may be possible to model this way, at least for a population at equilibrium in its fitness landscape. To more directly analyze selection and the pleiotropic relationships among traits, a vector of effect sizes could replace the effect β on a single, focal trait.

5 Empirical Demonstrations of the Existence and Nature of Pleiotropy

Since models of the evolution and maintenance of complex trait variation strongly depend on assumptions regarding the degree of pleiotropy. Modeling and measurement of pleiotropy is key to the empirical questions of whether the focal trait is under meaningful direct selection and how selection coefficients depend on the phenotypic effects of individual variants.

Current estimates of polygenicity indirectly but strongly suggest highly pleiotropic genetic architecture for most complex traits (Zeng et al., 2018; O’Connor et al., 2019; Zeng et al., 2021). Indeed, it was estimated that 2% of genetic variation is involved in height and a similar proportion (1%) is involved in risk of Type 2 Diabetes (Zeng et al., 2021). It is clear that a model where every quantitative trait locus (QTL) affects just a single trait is, due to the finite nature of the human genome, inconsistent with high polygenicity (defined here as the probability that a variant has a non-zero effect on the focal trait). We do not know exactly what fraction of the genome plays any functional role; comparative and functional genomics produce a range of estimates generally on the order of 0.1 (Rands et al., 2014; Gulko et al., 2015). If 10% of the genome is of any functional importance, and trait-affecting mutations originate from this functional fraction, it clearly cannot harbor independent QTLs for a vast number of complex traits each with a polygenicity of 2% (Jordan et al., 2019).

With the abundance of GWAS data, many aspects of pleiotropy can be empirically estimated using corresponding well-developed statistical approaches. The specific relationships between causative QTL effect sizes on different traits that these approaches investigate are illustrated in Figure 3. The relationship between two phenotypes is most commonly expressed as global genetic covariance, which reflects the overall degree of pleiotropy in the form of correlation of QTL effects across all loci (Cov[β₁, β₂], where β₁ and β₂ are allelic effect sizes for phenotypes 1 and 2), scaled by the heterozygosity at each causative locus. A significant genetic covariance below one would indicate that individual QTL effects are correlated but not identical. Global genetic covariance is estimated using statistical approaches related to those used to estimate heritability including random effect models implemented into the GCTA software or LD-score regression (Lee et al., 2012; Bulik-Sullivan et al., 2015). Estimates of genetic covariance come with the same caveats as must apply to heritability estimates (Visscher et al., 2008). Measurements apply to the particular environment in which the different traits are measured and offer no guarantee of a fundamental relationship between traits. Under different conditions, gene-by-environment interactions can change which genetic variants contribute to heritability and different pleiotropic traits may associate with the new regime.

FIGURE 3

Using these and related statistical techniques, highly significant genetic covariances were estimated among various autoimmune diseases and among various psychiatric diseases and related phenotypes (Cotsapas et al., 2011; Lee et al., 2013; Watanabe et al., 2019; Lincoln et al., 2021). The analysis of genetic covariances between two traits has some limitations. The genetic covariance alone is not informative about biological mechanisms and per locus patterns. For example, the same genetic covariance may indicate either pleiotropy limited to just a few loci with very similar effects on both traits on the background of other non-pleiotropic loci or the broad pleiotropy of all loci but with non-identical effects (Figures 3A vs 3B). For autoimmune traits, it is possible that some loci impact immune function while others determine tissue or organ specificity.

The question of contribution of individual loci into global genetic correlation must be, therefore, addressed at the local level by studying individual loci. When studying individual loci, one of the challenges is that linkage disequilibrium confounds the analysis. Genetic covariance may imply real pleiotropy, meaning that the same genetic variants causally affect both traits. Alternatively, some variants may exclusively impact the first trait and other variants exclusively impact the second trait, but local genetic correlation can still be induced by linkage disequilibrium between the two sets. Consequently, the field has developed two different classes of methods to address this issue. Methods that estimate local genetic covariance (Shi et al., 2017) do not distinguish between functional pleiotropy versus non-independence induced by linkage disequilibrium. A different class of methods called “colocalization” (Giambartolomei et al., 2014; Hormozdiari et al., 2016; Chun et al., 2017) relies on linkage disequilibrium patterns to specifically test the hypothesis that the same causative variant (or variants) in the locus impacts both traits (Figure 3B). Multiple examples of local genetic correlations and individually colocalized loci have been described (van Rheenen et al., 2019; Aguet et al., 2020; Vuckovic et al., 2020). However, some QTLs involved in genetically correlated traits do not show obvious signals of colocalization, suggesting that genetic correlation does not necessarily imply pleiotropic effects of all variants (Lincoln et al., 2021).

A separate aspect of pleiotropy that statistical genetics addresses is the causal relationship between phenotypes (van Rheenen et al., 2019). There is an important distinction between “horizontal” pleiotropy with genetic variants exerting independent effects on both traits and a causal path or “vertical” pleiotropy, where one trait directly contributes to the other (Jordan et al., 2019). Examples of the latter include LDL cholesterol being a causative risk factor of heart disease (Zhu et al., 2018), the genetic component of smoking being a causative risk of lung cancer (McKay et al., 2017), and all molecular effects (considered as “molecular” phenotypes) leading to changes in a phenotype of the organism. If one trait is a cause of the other trait, every variant inducing an effect on the first trait also affects the second trait (Figure 3C). Moreover, these effect sizes are proportional and correspond to the causal effect of the first trait on the second trait. Because the first trait is usually just one of many causes, most variants affecting the second trait would not be expected to have any effect on the first trait. These considerations are a foundation of Mendelian Randomization methods that attempt to infer causal relationships even if genetic associations for the two phenotypes are measured separately in independent datasets (Pingault et al., 2018). This approach relies on a large number of QTLs and does not translate to individual loci.

Many recent studies of pleiotropy, colocalization and causality have focused on molecular phenotypes such as gene expression, chromatin accessibility or DNA methylation (Umans et al., 2020; Vuckovic et al., 2020; Ye et al., 2020; Morabito et al., 2021). Numerous QTLs for various molecular phenotypes have been identified for these classes of traits (most prominently expression QTLs or eQTLs). The main motivation of these studies is to identify the primary molecular events underlying genetic associations with human traits and diseases. However, it is not guaranteed that genetic covariance or colocalization of a molecular trait with a focal trait is indicative of an underlying causal impact of variation in the molecular trait on the focal trait. One example is that changes in BMI actually induce changes in DNA methylation rather than DNA methylation acting as a molecular mechanism mediating genetic effects on BMI (Wahl et al., 2017). The direction of causality was demonstrated by showing that SNPs which predict methylation levels at individual loci did not predict BMI levels, while a genetic risk score for BMI levels did predict methylation levels.

Even in the absence of genetic covariance, molecular effects may induce pleiotropic relationships between two traits. Imagine a scenario where the two traits are both mediated by a large number of molecular phenotypes (activities of many individual genes or other latent factors), but these molecular phenotypes do not exhibit correlated effects on the two traits (Figure 3D). In this case, genetic covariance might not exist or be very weak on aggregate but covariance between absolute (or squared) genetic effects may be substantial. The popular “omnigenic” model offers one version of such a scenario (Boyle et al., 2017). Genetic covariance may also be close to zero if the pleiotropic effect is limited to one or a small number of loci (in other words, with a substantial local genetic covariance and even colocalization in individual loci) (Liu et al., 2019).

These methodological developments and empirical results related to pleiotropy are important in light of the main subject of this review. They motivate consideration of evolutionary models that take into account groups of correlated traits. For causally related traits, selection effects would probably differ depending on whether selection primarily acts on the upstream or downstream trait along the causal chain. An interesting perspective is also brought by the consideration of molecular phenotype. If each molecular phenotype is pleiotropically involved with many downstream organismal phenotypes, and the focal trait is merely one of these, selection coefficients can depend on effect sizes even if the focal trait is neutral. Variants with larger effect sizes on molecular function would be under stronger selection because this molecular function impacts multiple other selected downstream traits in addition to the neutral focal trait.

Few studies have analyzed the effects of pleiotropy on selection by actually incorporating the measured effects of variants on multiple traits. Some mutation accumulation studies have tried to demonstrate whether pleiotropic mutations are under stronger selection. McGuigan et al. (2014) provide some evidence that mutations underlying combinations of correlated gene expression traits in Drosophila serrata are under stronger selection than the average mutation affecting a given trait. In humans, Sanjak et al. (2018) regressed lifetime reproductive success on genetic scores for multiple traits simultaneously in United Kingdom Biobank participants. Compared to the univariate, this full analysis lacked power, but quadratic and linear selection terms did change in both magnitude and sign for some traits, indicating the importance of accounting for pleiotropy. Stern et al. (2021) took a similar approach, but used the shape of genealogies at GWAS loci to look at historical rather than contemporary patterns of directional selection. Again, the authors found that many estimates of selection changed substantially, and were largely attenuated, when accounting for the correlated response in other traits. At the time of writing, no attempt has been made to account for pleiotropy in the alpha model approaches discussed above that have demonstrated negative selection on many human traits.

6 Constancy of Effect Sizes Across Time and Space

Everything discussed so far has assumed that genetic variants have well-defined additive effects on traits of interest, and that these effects are measurable in contemporary human populations. Although convenient, and the correct starting place for most analyses, recent research has demonstrated that causative variants for many traits and diseases have population-specific effect sizes. Such studies are possible when GWAS for the same traits have been performed in different populations (De Candia et al., 2013; Mancuso et al., 2016). One approach has been to estimate the cross-population genetic correlation, the correlation in causal effect sizes between the different samples, and these estimates are often less than one (Brown et al., 2016; Galinsky et al., 2019). This is most likely due to gene-by-environment (GxE) and gene-by-gene (GxG) interactions, with some effect driven by different measurement practices and diagnosis criteria.

Shi et al. (2021) estimated the impact of different functional annotations on the degree of cross-population effect size correlation of variants within those genomic regions. They found that the squared genetic correlation was depleted most strongly in regions under strong background selection as well as in and around functional elements such as exons, promoters, and enhancers. These regions are also enriched for heritability and, as previous research reviewed here has shown, variants residing there are likely under stronger selection. If a variant has different effect sizes in different contemporary populations, we should be more uncertain about its effect size in the ancestral population where the majority of its existence may have taken place. Cross-population genetic correlation could therefore be used as a measure of the temporal stability of allelic effects. Alternatively, the aggregate pleiotropic effects of an allele may stay roughly constant even as the effects on individual traits vary due to GxE or other factors.

7 Conclusion

Direct data on genotype-phenotype associations for numerous human traits have provided an opportunity to investigate which, if any, of the current theoretical models for the maintenance of complex trait variation fit observed genetic architectures. Depending on the degree and nature of pleiotropy, as well as the importance of the focal trait for selection, these models predict the relationship between β and s (Johnson and Barton, 2005). Selection analyses of human GWAS data have consistently demonstrated a negative relationship between effect size magnitudes and allele frequencies, implying that larger effect sizes are associated with stronger selection on average (Zeng et al., 2018; Schoech et al., 2019; Speed et al., 2020; Zeng et al., 2021). Models where the focal trait is neutral, or largely biologically unrelated to any aspect of fitness, are therefore ruled out. Within the class of alpha models, the scaling between β² and frequency varies across traits, likely reflecting differences in the DFE and the scaling between β and s. These estimates are difficult to interpret in terms of classical stabilizing selection models, and work is needed to reconcile tractable statistical models of how effect sizes change with frequency with realistic models of selection at the phenotypic level. Studies have also largely been limited to analyzing the average relationship of effect size to frequency. This limits the ability to capture the importance of pleiotropy which should create variance around that average. By directly modeling the variation in genome-wide significant variance contributions, Simons et al. (2018) were able to infer a high degree of pleiotropy for height and BMI.

All the approaches reviewed here infer the nature of selection on GWAS loci by analyzing the distribution of allele frequencies and effect sizes (x, β), with the overall trait heritability sometimes included. Future work along these lines may utilize fine-mapping (Weissbrod et al., 2020) or other techniques to better capture this distribution (O’Connor, 2021). An interesting approach was developed by O’Connor et al. (2019) who estimated the kurtosis of heritability contributions using the LDSC framework to measure trait polygenicity at different allele frequencies and functional genomic annotations. The kurtosis depends on the fourth moment of the distribution of effect sizes, and therefore contains additional information beyond that contained in the alpha model. A low kurtosis, and therefore high polygenicity, of common variants indicated a “flattening” due to selection strongly preventing any large-effect variants from reaching high frequencies. This indicates a high importance of the focal trait for selection, but more thought is needed to tell what degree of pleiotropy is consistent with these results.

The greatest advances in our ability to make sense of the maintenance of complex trait variation will likely come from analyses that utilize variant-level pleiotropy and account for effect sizes that vary across time and space. Methods to investigate pleiotropy in statistical genetics are already well-developed (van Rheenen et al., 2019) but have yet to intersect with analyses of stabilizing or negative selection. Effect size differences between populations are also well-documented (Brown et al., 2016; Shi et al., 2021), but have received less attention, likely in part due to the lack of large GWAS from diverse populations and the difficulty of standardizing measurement for some phenotypes. The portability of polygenic scores is also potentially more strongly impacted by differences in allele frequencies and linkage disequilibrium than effect size variation (Wang et al., 2020), and allele frequencies will differentiate more rapidly under stabilizing or negative selection (Yair and Coop, 2021). However, understanding effect size variation in space and time may ultimately end up being more important for modeling the maintenance of variation in complex traits as well as detecting selection on them (Mathieson, 2021).

Statements

Author contributions

EK and SS jointly wrote this review.

Funding

This work was supported by NIH grants R35GM127131, RO1MH101244, and R01HG010372 to SRS.

Acknowledgments

We thank Noah Connally for helpful comments on this manuscript. We are grateful for the suggestions of the editor and two reviewers which greatly improved this review.

Conflict of interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

The handling editor declared a past co-authorship with one of the authors (SS).

Publisher’s note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

Glossary

Causative Allele
The allele which causally affects the focal trait. Due to linkage disequilibrium, many alleles at a GWAS-identified locus are highly correlated. The causative allele refers only to the one which causally affects the trait.
Distribution of Mutational Effects
The distribution from which the phenotypic effects of new mutations are drawn. This can include the focal trait as well as related pleiotropic ones.
Effect Size Magnitude
The absolute value of the effect size of an allele. It is often useful to ignore the direction of effect that an allele has on the trait.
Focal Trait
All studies in quantitative genetics must choose some measurable aspects of biology to focus on. This can be something of obvious importance like diabetic status, or could be simply something easily queried in a biobank. Analyzed one at a time, we call the current trait the focal trait.
Genetic Architecture
The joint distribution of allele frequencies and effect sizes in a population or sample. This determines how much different frequency and effect size ranges contribute to heritability, and answers questions surrounding the importance of rare versus common variants; genetic covariance, The covariance between the effects different genotypes have on two traits. This measures the propensity for an individual with a high genetic value for one trait to also have a high (or low) genetic value for the second. It averages over all alleles and their effects on both traits, scaled by their contributions to the genetic variance.
Genetic Risk Score
A phenotypic prediction calculated for an individual using a weighted sum of the estimated effect sizes of variants found in that individuals genome.
Molecular Phenotype
Phenotypes such as gene expression, methylation levels, or metabolite concentration that are measured at the molecular level. These are hoped to represent “low-level” traits that mediate the effects of genetic variants on other phenotypes.
Overdominant
Selection where the heterozygous genotype has higher average fitness than either of the two homozygotes.
Pairwise Coalescent Time
The amount of time it takes two sampled loci to find a common ancestor going backwards in time. The longer this time, the more likely it is that mutations occur to differentiate the two loci.
QTL
Quantitative trait locus. A region in the genome that has been statistically associated with a quantitative trait.

References

1
AgarwalaV.FlannickJ.FlannickJ.SunyaevS.AltshulerD. (2013). Evaluating Empirical Bounds on Complex Disease Genetic Architecture. Nat. Genet.45, 1418–1427. 10.1038/ng.2804
- CrossRef
- Google Scholar
2
AguetF.BarbeiraA. N.BonazzolaR.JoB.KaselaS.LiangY.et al (2020). The GTEx Consortium Atlas of Genetic Regulatory Effects across Human Tissues. Science369, 1318–1330. 10.1126/science.aaz1776
- CrossRef
- Google Scholar
3
BartonN. H. (1990). Pleiotropic Models of Quantitative Variation. Genetics124, 773–782. 10.1093/genetics/124.3.773
- CrossRef
- Google Scholar
4
BoyleE. A.LiY. I.PritchardJ. K. (2017). An Expanded View of Complex Traits: From Polygenic to Omnigenic. Cell169, 1177–1186. 10.1016/j.cell.2017.05.038
- CrossRef
- Google Scholar
5
BrownB. C.YeC. J.PriceA. L.ZaitlenN. (2016). Transethnic Genetic-Correlation Estimates from Summary Statistics. Am. J. Hum. Genet.99, 76–88. 10.1016/j.ajhg.2016.05.001
- CrossRef
- Google Scholar
6
Bulik-SullivanB.FinucaneH. K.FinucaneH. K.AnttilaV.GusevA.DayF. R.et al (2015). An Atlas of Genetic Correlations across Human Diseases and Traits. Nat. Genet.47, 1236–1241. 10.1038/ng.3406
- CrossRef
- Google Scholar
7
BulmerM. G. (1972). The Genetic Variability of Polygenic Characters under Optimizing Selection, Mutation and Drift. Genet. Res.19, 17–25. 10.1017/s0016672300014221
- CrossRef
- Google Scholar
8
CaballeroA.TenesaA.KeightleyP. D. (2015). The Nature of Genetic Variation for Complex Traits Revealed by GWAS and Regional Heritability Mapping Analyses. Genetics201, 1601–1613. 10.1534/genetics.115.177220
- CrossRef
- Google Scholar
9
CharlesworthB. (2001). Patterns of Age-specific Means and Genetic Variances of Mortality Rates Predicted by the Mutation-Accumulation Theory of Ageing. J. Theor. Biol.210, 47–65. 10.1006/jtbi.2001.2296
- CrossRef
- Google Scholar
10
ChunS.CasparinoA.PatsopoulosN. A.Croteau-ChonkaD. C.RabyB. A.De JagerP. L.et al (2017). Limited Statistical Evidence for Shared Genetic Effects of eQTLs and Autoimmune-Disease-Associated Loci in Three Major Immune-Cell Types. Nat. Genet.49, 600–605. 10.1038/ng.3795
- CrossRef
- Google Scholar
11
CorbettS.CourtiolA.LummaaV.MooradJ.StearnsS. (2018). The Transition to Modernity and Chronic Disease: Mismatch and Natural Selection. Nat. Rev. Genet.19, 419–430. 10.1038/s41576-018-0012-3
- CrossRef
- Google Scholar
12
CotsapasC.VoightB. F.RossinE.LageK.NealeB. M.WallaceC.et al (2011). Pervasive Sharing of Genetic Effects in Autoimmune Disease. Plos Genet.7, e1002254. 10.1371/journal.pgen.1002254
- CrossRef
- Google Scholar
13
de CandiaT. R.LeeS. H.YangJ.BrowningB. L.GejmanP. V.LevinsonD. F.et al (2013). Additive Genetic Variation in Schizophrenia Risk Is Shared by Populations of African and European Descent. Am. J. Hum. Genet.93, 463–470. 10.1016/j.ajhg.2013.07.007
- CrossRef
- Google Scholar
14
Eyre-WalkerA. (2010). Genetic Architecture of a Complex Trait and its Implications for Fitness and Genome-wide Association Studies. Proc. Natl. Acad. Sci.107, 1752–1756. 10.1073/pnas.0906182107
- CrossRef
- Google Scholar
15
FinucaneH. K.Bulik-SullivanB.Bulik-SullivanB.GusevA.TrynkaG.ReshefY.et al (2015). Partitioning Heritability by Functional Annotation Using Genome-wide Association Summary Statistics. Nat. Genet.47, 1228–1235. 10.1038/ng.3404
- CrossRef
- Google Scholar
16
FisherR. A. (1930). The Genetical Theory of Natural Selection. (UK: Clarendon.
- Google Scholar
17
FisherR. (1918). The Correlation between Relatives on the Supposition of Mendelian Inheritance. Trans. Roy. Soc. Edinb.52, 399–433.
- Google Scholar
18
GalinskyK. J.ReshefY. A.FinucaneH. K.LohP.-R.ZaitlenN.PattersonN. J.et al (2019). Estimating Cross-Population Genetic Correlations of Causal Effect Sizes. Genet. Epidemiol.43, 180–188. 10.1002/gepi.22173
- CrossRef
- Google Scholar
19
GazalS.FinucaneH. K.FurlotteN. A.LohP.-R.PalamaraP. F.LiuX.et al (2017). Linkage Disequilibrium-dependent Architecture of Human Complex Traits Shows Action of Negative Selection. Nat. Genet.49, 1421–1427. 10.1038/ng.3954
- CrossRef
- Google Scholar
20
GazalS.LohP.-R.FinucaneH. K.GannaA.SchoechA.SunyaevS.et al (2018). Functional Architecture of Low-Frequency Variants Highlights Strength of Negative Selection across Coding and Non-coding Annotations. Nat. Genet.50, 1600–1607. 10.1038/s41588-018-0231-8
- CrossRef
- Google Scholar
21
GiambartolomeiC.VukcevicD.SchadtE. E.FrankeL.HingoraniA. D.WallaceC.et al (2014). Bayesian Test for Colocalisation between Pairs of Genetic Association Studies Using Summary Statistics. Plos Genet.10, e1004383. 10.1371/journal.pgen.1004383
- CrossRef
- Google Scholar
22
GillespieJ. H. (1984). Pleiotropic Overdominance and the Maintenance of Genetic Variation in Polygenic Characters. Genetics107, 321–330. 10.1093/genetics/107.2.321
- CrossRef
- Google Scholar
23
GulkoB.HubiszM. J.GronauI.SiepelA. (2015). A Method for Calculating Probabilities of Fitness Consequences for point Mutations across the Human Genome. Nat. Genet.47, 276–283. 10.1038/ng.3196
- CrossRef
- Google Scholar
24
HarpakA.PrzeworskiM. (2021). The Evolution of Group Differences in Changing Environments. Plos Biol.19, e3001072–14. 10.1371/journal.pbio.3001072
- CrossRef
- Google Scholar
25
HormozdiariF.van de BuntM.SegrèA. V.LiX.JooJ. W. J.BilowM.et al (2016). Colocalization of GWAS and eQTL Signals Detects Target Genes. Am. J. Hum. Genet.99, 1245–1260. 10.1016/j.ajhg.2016.10.003
- CrossRef
- Google Scholar
26
JohnsonT.BartonN. (2005). Theoretical Models of Selection and Mutation on Quantitative Traits. Phil. Trans. R. Soc. B360, 1411–1425. 10.1098/rstb.2005.1667
- CrossRef
- Google Scholar
27
JordanD. M.VerbanckM.DoR. (2019). HOPS: a Quantitative Score Reveals Pervasive Horizontal Pleiotropy in Human Genetic Variation Is Driven by Extreme Polygenicity of Human Traits and Diseases. Genome Biol.20, 222. 10.1186/s13059-019-1844-7
- CrossRef
- Google Scholar
28
KeightleyP. D.HillW. G. (1988). Quantitative Genetic Variability Maintained by Mutation-Stabilizing Selection Balance in Finite Populations. Genet. Res.52, 33–43. 10.1017/S0016672300027282
- CrossRef
- Google Scholar
29
KeightleyP. D.HillW. G. (1990). Variation Maintained in Quantitative Traits with Mutation-Selection Balance: Pleiotropic Side-Effects on Fitness Traits. Proc. R. Soc. B: Biol. Sci.242, 95–100. 10.1098/rspb.1990.0110
- CrossRef
- Google Scholar
30
KiezunA.PulitS. L.FrancioliL. C.van DijkF.SwertzM.BoomsmaD. I.et al (2013). Deleterious Alleles in the Human Genome Are on Average Younger Than Neutral Alleles of the Same Frequency. Plos Genet.9, e1003301–12. 10.1371/journal.pgen.1003301
- CrossRef
- Google Scholar
31
KingsolverJ. G.HoekstraH. E.HoekstraJ. M.BerriganD.VignieriS. N.HillC. E.et al (2001). The Strength of Phenotypic Selection in Natural Populations. The Am. Naturalist157, 245–261. 10.1086/319193
- CrossRef
- Google Scholar
32
KochE. M. (2019). The Effects of Demography and Genetics on the Neutral Distribution of Quantitative Traits. Genetics211, 1371–1394. 10.1534/genetics.118.301839
- CrossRef
- Google Scholar
33
KondrashovA. S.TurelliM. (1992). Deleterious Mutations, Apparent Stabilizing Selection and the Maintenance of Quantitative Variation. Genetics132, 603–618. 10.1093/genetics/132.2.603
- CrossRef
- Google Scholar
34
LandeR. (1976b). The Maintenance of Genetic Variability by Mutation in a Polygenic Character with Linked Loci. Genet. Res.26, 221–235. 10.1017/s0016672300016037
- CrossRef
- Google Scholar
35
LandeR.ArnoldS. (1983). The Measurement of Selection on Correlated Characters. Evolution37, 1210. 10.2307/2408842
- CrossRef
- Google Scholar
36
LandeR. (1976a). Natural Selection and Random Genetic Drift in Phenotypic Evolution. Evolution30, 314. 10.2307/2407703
- CrossRef
- Google Scholar
37
LeeS. H.RipkeS.NealeB. M.FaraoneS. V.PurcellS. M.PerlisR. H.et al (2013). Genetic Relationship between Five Psychiatric Disorders Estimated from Genome-wide SNPs. Nat. Genet.45, 984–994. 10.1038/ng.2711
- CrossRef
- Google Scholar
38
LeeS. H.YangJ.GoddardM. E.VisscherP. M.WrayN. R. (2012). Estimation of Pleiotropy between Complex Diseases Using Single-Nucleotide Polymorphism-Derived Genomic Relationships and Restricted Maximum Likelihood. Bioinformatics28, 2540–2542. 10.1093/bioinformatics/bts474
- CrossRef
- Google Scholar
39
LincolnM. R.ConnallyN.AxisaP.-P.GasperiC.MitrovicM.van HeelD.et al (2021). Joint Analysis Reveals Shared Autoimmune Disease Associations and Identifies Common Mechanisms. medRxiv, 1–15. 10.1101/2021.05.13.21257044
- CrossRef
- Google Scholar
40
LiuX.LiY. I.PritchardJ. K. (2019). Trans Effects on Gene Expression Can Drive Omnigenic Inheritance. Cell177, 1022–1034.e6. 10.1016/j.cell.2019.04.014
- CrossRef
- Google Scholar
41
LourençoJ.GaltierN.GléminS. (2011). Complexity, Pleiotropy, and the Fitness Effect of Mutations. Evolution65, 1559–1571. 10.1111/j.1558-5646.2011.01237.x
- CrossRef
- Google Scholar
42
LynchM.HillW. G. (1986). Phenotypic Evolution by Neutral Mutation. Evolution40, 915–935. 10.1111/j.1558-5646.1986.tb00561.x
- CrossRef
- Google Scholar
43
MancusoN.RohlandN.RohlandN.RandK. A.TandonA.AllenA.et al (2016). The Contribution of Rare Variation to Prostate Cancer Heritability. Nat. Genet.48, 30–35. 10.1038/ng.3446
- CrossRef
- Google Scholar
44
MartinG.LenormandT. (2006). A General Multivariate Extension of Fisher's Geometrical Model and the Distribution of Mutation Fitness Effects across Species. Evolution60, 893–907. 10.1111/j.0014-3820.2006.tb01169.x
- CrossRef
- Google Scholar
45
MaruyamaT. (1974). The Age of a Rare Mutant Gene in a Large Population. Am. J. Hum. Genet.26, 669–673.
- Google Scholar
46
MathiesonI. (2021). The Omnigenic Model and Polygenic Prediction of Complex Traits. Am. J. Hum. Genet.108, 1558–1563. 10.1016/j.ajhg.2021.07.003
- CrossRef
- Google Scholar
47
McGuiganK.ColletJ. M.AllenS. L.ChenowethS. F.BlowsM. W. (2014). Pleiotropic Mutations Are Subject to strong Stabilizing Selection. Genetics197, 1051–1062. 10.1534/genetics.114.165720
- CrossRef
- Google Scholar
48
McKayJ. D.HungR. J.HanY.ZongX.Carreras-TorresR.ChristianiD. C.et al (2017). Large-scale Association Analysis Identifies New Lung Cancer Susceptibility Loci and Heterogeneity in Genetic Susceptibility across Histological Subtypes. Nat. Genet.49, 1126–1132. 10.1038/ng.3892
- CrossRef
- Google Scholar
49
MorabitoS.MiyoshiE.MichaelN.ShahinS.MartiniA. C.HeadE.et al (2021). Single-nucleus Chromatin Accessibility and Transcriptomic Characterization of Alzheimer's Disease. Nat. Genet.53, 1143–1155. 10.1038/s41588-021-00894-z
- CrossRef
- Google Scholar
50
O'ConnorL. J.SchoechA. P.HormozdiariF.GazalS.PattersonN.PriceA. L. (2019). Extreme Polygenicity of Complex Traits Is Explained by Negative Selection. Am. J. Hum. Genet.105, 456–476. 10.1016/j.ajhg.2019.07.003
- CrossRef
- Google Scholar
51
O’ConnorL. J. (2021). The Distribution of Common-Variant Effect Sizes. Nat. Genet.53, 1243–1249. 10.1038/s41588-021-00901-3
- CrossRef
- Google Scholar
52
PavardS.CosteC. F. D. (2021). Evolutionary Demographic Models Reveal the Strength of Purifying Selection on Susceptibility Alleles to Late-Onset Diseases. Nat. Ecol. Evol.5, 392–400. 10.1038/s41559-020-01355-2
- CrossRef
- Google Scholar
53
PingaultJ.-B.O’ReillyP. F.SchoelerT.PloubidisG. B.RijsdijkF.DudbridgeF. (2018). Using Genetic Data to Strengthen Causal Inference in Observational Research. Nat. Rev. Genet.19, 566–580. 10.1038/s41576-018-0020-3
- CrossRef
- Google Scholar
54
PritchardJ. K. (2001). Are Rare Variants Responsible for Susceptibility to Complex Diseases?Am. J. Hum. Genet.69, 124–137. 10.1086/321272
- CrossRef
- Google Scholar
55
RandsC. M.MeaderS.PontingC. P.LunterG. (2014). 8.2% of the Human Genome Is Constrained: Variation in Rates of Turnover across Functional Element Classes in the Human Lineage. Plos Genet.10, e1004525. 10.1371/journal.pgen.1004525
- CrossRef
- Google Scholar
56
ReichD. E.LanderE. S. (2001). On the Allelic Spectrum of Human Disease. Trends Genet.17, 502–510. 10.1016/S0168-9525(01)02410-6
- CrossRef
- Google Scholar
57
RobertsonA. (1956). The Effect of Selection against Extreme Deviants Based on Deviation or on Homozygosis. J. Genet.54, 236–248. 10.1007/bf02982779
- CrossRef
- Google Scholar
58
SanjakJ. S.SidorenkoJ.RobinsonM. R.ThorntonK. R.VisscherP. M. (2018). Evidence of Directional and Stabilizing Selection in Contemporary Humans. Proc. Natl. Acad. Sci. USA115, 151–156. 10.1073/pnas.1707227114
- CrossRef
- Google Scholar
59
SchoechA. P.JordanD. M.LohP.-R.GazalS.O’ConnorL. J.BalickD. J.et al (2019). Quantification of Frequency-dependent Genetic Architectures in 25 UK Biobank Traits Reveals Action of Negative Selection. Nat. Commun.10, 790. 10.1038/s41467-019-08424-6
- CrossRef
- Google Scholar
60
SellaG.BartonN. H. (2019). Thinking about the Evolution of Complex Traits in the Era of Genome-wide Association Studies. Annu. Rev. Genom. Hum. Genet.20, 461–493. 10.1146/annurev-genom-083115-022316
- CrossRef
- Google Scholar
61
ShiH.GazalS.KanaiM.KochE. M.SchoechA. P.SiewertK. M.et al (2021). Population-specific Causal Disease Effect Sizes in Functionally Important Regions Impacted by Selection. Nat. Commun.12, 1098. 10.1038/s41467-021-21286-1
- CrossRef
- Google Scholar
62
ShiH.MancusoN.SpendloveS.PasaniucB. (2017). Local Genetic Correlation Gives Insights into the Shared Genetic Architecture of Complex Traits. Am. J. Hum. Genet.101, 737–751. 10.1016/j.ajhg.2017.09.022
- CrossRef
- Google Scholar
63
SimonsY. B.BullaugheyK.HudsonR. R.SellaG. (2018). A Population Genetic Interpretation of GWAS Findings for Human Quantitative Traits. Plos Biol.16, e2002985. 10.1371/journal.pbio.2002985
- CrossRef
- Google Scholar
64
SpeedD.HolmesJ.BaldingD. J. (2020). Evaluating and Improving Heritability Models Using Summary Statistics. Nat. Genet.52, 458–462. 10.1038/s41588-020-0600-y
- CrossRef
- Google Scholar
65
StearnsF. W. (2010). One Hundred Years of Pleiotropy: A Retrospective. Genetics186, 767–773. 10.1534/genetics.110.122549
- CrossRef
- Google Scholar
66
SternA. J.SpeidelL.ZaitlenN. A.NielsenR. (2021). Disentangling Selection on Genetically Correlated Polygenic Traits via Whole-Genome Genealogies. Am. J. Hum. Genet.108, 219–239. 10.1016/j.ajhg.2020.12.005
- CrossRef
- Google Scholar
67
UmansB. D.BattleA.GiladY. (2021). Where Are the Disease-Associated eQTLs?Trends Genet.37, 109–124. 10.1016/j.tig.2020.08.009
- CrossRef
- Google Scholar
68
van RheenenW.PeyrotW. J.SchorkA. J.LeeS. H.WrayN. R. (2019). Genetic Correlations of Polygenic Disease Traits: from Theory to Practice. Nat. Rev. Genet.20, 567–581. 10.1038/s41576-019-0137-z
- CrossRef
- Google Scholar
69
VisscherP. M.HillW. G.WrayN. R. (2008). Heritability in the Genomics Era - Concepts and Misconceptions. Nat. Rev. Genet.9, 255–266. 10.1038/nrg2322
- CrossRef
- Google Scholar
70
VuckovicD.BaoE. L.AkbariP.LareauC. A.MousasA.JiangT.et al (2020). The Polygenic and Monogenic Basis of Blood Traits and Diseases. Cell182, 1214–e11. 10.1016/j.cell.2020.08.008
- CrossRef
- Google Scholar
71
WahlS.DrongA.LehneB.LohM.ScottW. R.KunzeS.et al (2017). Epigenome-wide Association Study of Body Mass index, and the Adverse Outcomes of Adiposity. Nature541, 81–86. 10.1038/nature20784
- CrossRef
- Google Scholar
72
WalshB.LynchM. (2018). Evolution and Selection of Quantitative Traits. Oxford: OUP Oxford.
- Google Scholar
73
WangY.GuoJ.NiG.YangJ.VisscherP. M.YengoL. (2020). Theoretical and Empirical Quantification of the Accuracy of Polygenic Scores in Ancestry Divergent Populations. Nat. Commun.11, 3865–3869. 10.1038/s41467-020-17719-y
- CrossRef
- Google Scholar
74
WatanabeK.StringerS.FreiO.Umićević MirkovM.de LeeuwC.PoldermanT. J. C.et al (2019). A Global Overview of Pleiotropy and Genetic Architecture in Complex Traits. Nat. Genet.51, 1339–1348. 10.1038/s41588-019-0481-0
- CrossRef
- Google Scholar
75
WaxmanD.PeckJ. R. (1998). Pleiotropy and the Preservation of Perfection. Science279, 1210–1213. 10.1126/science.279.5354.1210
- CrossRef
- Google Scholar
76
WeissbrodO.HormozdiariF.BennerC.CuiR.UlirschJ.GazalS.et al (2020). Functionally Informed fine-mapping and Polygenic Localization of Complex Trait Heritability. Nat. Genet.52, 1355–1363. 10.1038/s41588-020-00735-5
- CrossRef
- Google Scholar
77
WingreenN. S.MillerJ.CoxE. C. (2003). Scaling of Mutational Effects in Models for Pleiotropy. Genetics164, 1221–1228. 10.1093/genetics/164.3.1221
- CrossRef
- Google Scholar
78
WrightA.CharlesworthB.RudanI.CarothersA.CampbellH. (2003). A Polygenic Basis for Late-Onset Disease. Trends Genet.19, 97–106. 10.1016/s0168-9525(02)00033-1
- CrossRef
- Google Scholar
79
WrightS. (1935). The Analysis of Variance and the Correlations between Relatives with Respect to Deviations from an Optimum. Journ Genet.30, 243–256. 10.1007/BF02982239
- CrossRef
- Google Scholar
80
YairS.CoopG. (2021). Population Differentiation of Polygenic Score Predictions under Stabilizing Selection. bioRxiv. 2021.09.10.459833.
- Google Scholar
81
YangJ.BakshiA.BakshiA.ZhuZ.HemaniG.VinkhuyzenA. A. E.et al (2015). Genetic Variance Estimation with Imputed Variants Finds Negligible Missing Heritability for Human Height and Body Mass index. Nat. Genet.47, 1114–1120. 10.1038/ng.3390
- CrossRef
- Google Scholar
82
YeY.ZhangZ.LiuY.DiaoL.HanL. (2020). A Multi-Omics Perspective of Quantitative Trait Loci in Precision Medicine. Trends Genet.36, 318–336. 10.1016/j.tig.2020.01.009
- CrossRef
- Google Scholar
83
ZengJ.De VlamingR.WuY.RobinsonM. R.Lloyd-JonesL. R.YengoL.et al (2018). Signatures of Negative Selection in the Genetic Architecture of Human Complex Traits. Nat. Genet.50, 746–753. 10.1038/s41588-018-0101-4
- CrossRef
- Google Scholar
84
ZengJ.XueA.JiangL.Lloyd-JonesL. R.WuY.WangH.et al (2021). Widespread Signatures of Natural Selection across Human Complex Traits and Functional Genomic Categories. Nat. Commun.12, 1164. 10.1038/s41467-021-21446-3
- CrossRef
- Google Scholar
85
ZhangX.-S.HillW. G. (2002). Joint Effects of Pleiotropic Selection and Stabilizing Selection on the Maintenance of Quantitative Genetic Variation at Mutation-Selection Balance. Genetics162, 459–471. 10.1093/genetics/162.1.459
- CrossRef
- Google Scholar
86
ZhangX.-S.HillW. G. (2003). Multivariate Stabilizing Selection and Pleiotropy in the Maintenance of Quantitative Genetic Variation. Evol57, 1761–1775. 10.1554/02-587
- CrossRef
- Google Scholar
87
ZhangY.QiG.ParkJ.-H.ChatterjeeN. (2018). Estimation of Complex Effect-Size Distributions Using Summary-Level Statistics from Genome-wide Association Studies across 32 Complex Traits. Nat. Genet.50, 1318–1326. 10.1038/s41588-018-0193-x
- CrossRef
- Google Scholar
88
ZhuZ.ZhengZ.ZhangF.WuY.TrzaskowskiM.MaierR.et al (2018). Causal Associations between Risk Factors and Common Diseases Inferred from GWAS Summary Data. Nat. Commun.9, 224. 10.1038/s41467-017-02317-2
- CrossRef
- Google Scholar

Summary

Keywords

population genetics, genome-wide association study, statistical genetics, evolution, quantitative genetics

Citation

Koch EM and Sunyaev SR (2021) Maintenance of Complex Trait Variation: Classic Theory and Modern Data. Front. Genet. 12:763363. doi: 10.3389/fgene.2021.763363

Received

23 August 2021

Accepted

19 October 2021

Published

12 November 2021

Volume

12 - 2021

Edited by

Mashaal Sohail, University of Chicago, United States

Reviewed by

Jeremy Berg, University of Chicago, United States

Diego Ortega-Del Vecchyo, National Autonomous University of Mexico, Mexico

Updates

This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Shamil R. Sunyaev, ssunyaev@hms.harvard.edu

This article was submitted to Evolutionary and Population Genetics, a section of the journal Frontiers in Genetics

Disclaimer

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.

Evolutionary, Population, and Conservation Genetics

REVIEW article

Maintenance of Complex Trait Variation: Classic Theory and Modern Data

Abstract

1 Introduction

2 Theoretical Models of Maintenance of Complex Traits and Predictions They Generate

3 Detecting Negative Selection in Genome-Wide Association Studies Data

4 Model Building Using Geometry and Pleiotropy

5 Empirical Demonstrations of the Existence and Nature of Pleiotropy

6 Constancy of Effect Sizes Across Time and Space

7 Conclusion

Statements

Author contributions

Funding

Acknowledgments

Conflict of interest

Publisher’s note

Glossary

References

Summary

Outline

Figures

Cite article

Article metrics

REVIEW article

Maintenance of Complex Trait Variation: Classic Theory and Modern Data

Abstract

1 Introduction

2 Theoretical Models of Maintenance of Complex Traits and Predictions They Generate

3 Detecting Negative Selection in Genome-Wide Association Studies Data

4 Model Building Using Geometry and Pleiotropy

5 Empirical Demonstrations of the Existence and Nature of Pleiotropy

6 Constancy of Effect Sizes Across Time and Space

7 Conclusion

Statements

Author contributions

Funding

Acknowledgments

Conflict of interest

Publisher’s note

Glossary

References

Summary

Outline

Figures

Cite article

Share article

Article metrics