Original Research ARTICLE
Polygenic Basis and Variable Genetic Architectures Contribute to the Complex Nature of Body Weight —A Genome-Wide Study in Four Chinese Indigenous Chicken Breeds
- 1Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction of Ministry of Education, College of Animal Science and Technology, Huazhong Agricultural University, Wuhan, China
- 2State Key Laboratory for Agro-Biotechnology, China Agricultural University, Beijing, China
- 3National Engineering Laboratory for Animal Breeding, China Agricultural University, Beijing, China
Body weight (BW) is one of the most important economic traits for animal production and breeding, and it has been studied extensively for its phenotype–genotype associations. While mapping studies have mostly aimed at finding as many loci as possible that contributed to the variation in BW, the role of other factors in its genetic architecture, including their frequencies in the population and their interactions, have been largely overlooked. To comprehensively characterized the genetic architecture of BW, we performed a genome-wide association study (GWAS) both at the single-marker and haplotype level on birds from four indigenous Chinese chicken breeds (Chahua, Silkie, Langshan, and Beard), rather than studying crosses between two founder lines. Additionally, samples from two more breeds (Red Junglefowl and Recessive White) were included to better reflect variable genetic characteristics across populations. Six loci were mapped in this study, revealing the polygenic basis underlying BW. Moreover, by further examining the frequencies of the significantly associated haplotypes in each subpopulation and their effect sizes, most of the loci were found to affect BW in the Beard chicken breed alone. Two loci in GGA9 and GGA27, however, had a common effect on BW across subpopulations, showing that different underlying genetic mechanisms contribute to the phenotypic variability. These findings, particularly the variable genetic architectures found in different loci, improve our understanding of the overall genetic contributions to the large variability in BW among Chinese indigenous chicken breeds. These findings thus will have important implications for future chicken breeding.
As a well-studied quantitative trait in chickens, body weight (BW) is not only a major breed characteristic but also a trait of economic importance. One of the most used study approaches is gene mapping. Phenotype to genotype associations have revealed substantial information regarding the polygenic basis of chicken BW. To date, several hundred quantitative trait loci (QTLs) affecting BW at different stages have been listed in chickenQTLdb (Hu et al., 2016). As sequencing has become a more routine study tool, comparative population genomics have further advanced our understanding from an evolutionary prospective (Rubin et al., 2010; Elferink et al., 2012; Qanbari et al., 2015). Although one earlier study found that the body composition of mice was determined by a very large number of loci, each with an infinitesimally small effect (Martinez et al., 2000), recent studies showed that some genes could still have more important roles than others in contributing to the phenotypic variability. Regardless, only a handful of genes have been found to be candidates with relatively large effects on BW in chickens (Rubin et al., 2010; Gu et al., 2011; Tang et al., 2011; Xie et al., 2012; Jia et al., 2016; Wang et al., 2016). One reason underlying this dilemma is likely the highly variable genetic architecture. Apart from the number of genetic variants affecting a trait, other factors, such as the frequencies in the population, the magnitude of the effect sizes and their interactions with each other, and the environment, are also important to a comprehensive understanding of the complex genetic architecture (Timpson et al., 2018).
The classical experimental design in mapping studies involves crosses between two founder lines, one of which is often a commercial line with limited genetic diversity. The abundant indigenous chicken breeds in China, which have a wide geographical distribution across the country and thus show remarkable differences in morphology, production and BW growth (Zhang et al., 2011), are valuable genetic resources to study polygenic basis and elucidate variable genetic architecture of the BW traits. Many QTLs have been repeatedly detected in several Chinese indigenous chicken breeds (Hu et al., 2016), which indicates a shared genetic basis. In contrast, due to the complex origin and demographic history of adaptation of the native birds (Miao et al., 2013), a few identified genes have common influences in several native breeds. One recent finding of a bone morphogenetic protein 10 (BMP10) mutation in Yuanbao chickens, which significantly decreases body length, reinforces this impression (Wang et al., 2016).
The aim of this study was to comprehensively characterize the genetic architecture affecting BW in Chinese chickens. We sampled birds from four Chinese indigenous breeds and recorded their BW from birth to 15 weeks of age. By performing a genome-wide association (GWA) study on both single-markers and haplotypes, we identified six loci contributing to the variation of BW in different breeds and characterized their frequencies across the subpopulations as well as the corresponding effect sizes. Epistatic scans further uncovered several pairs of interactive loci. These findings reveal that the complexity of BW not only originates from multiple loci contributing to the same trait but also the underlying variable genetic architectures. These will aid in the experimental design of future studies to provide a comprehensive understanding of the genetic contributions to phenotypic variance.
Materials and Methods
Animal Experimental Ethics
All animals used in the current study were cared for and used according to the guidances (HZAUMU2013-0005) approved by the Ethics Committee of Huazhong Agricultural University.
The samples of our studies were collected from the National Chickens Genetic Resources (NCGR, Jiangsu Province)1, which is the off-site conservation base for 29 Chinese indigenous chicken breeds. In the NCGR, approximately 60 families are maintained for each generation of a single breed. Within each family, the mating ratio is 1–12. In this study, four indigenous breeds consisting of two typical low-body-weight breeds [Chahua chicken (C) and Silkie (S)] and two intermediate and high-body-weight breeds [Beard chicken (B) and Langshan chicken (L)], were included. Rather than directly sampling from the conserved population, birds were specifically bred for our study by performing artificial insemination on hens with sperm pools; in this way, birds used in this study were randomly selected from the original covserved population. After hatching these fertilized eggs to chicks, approximately 100 birds with approximately equal numbers of cocks and hens were phenotyped from each breed (Supplementary Table S1). Additionally, 4 Red Junglefowls (RJ) [data from (Elferink et al., 2012)] and 99 Recessive White chickens (RW) were used in the haplotype-based analyses.
Live BW was measured at hatch and every week until 15 weeks of age, after which all chickens were euthanized. For each BW trait of a single breed, boxplots were generated by R (3.3.0)2 (R Core Team, 2017) to screen for outliers. Records that were more than 1.5 times the interquartile range away from the lower or upper quartile of the boxplots were marked for further examinations. Such outliers were maintained only when they were consistently high across the growth phase; otherwise, the data points were eliminated from further analysis. The phenotypic records of eight individuals (2C, 4L, and 2B) were removed, as more than one-third of their data points failed the quality control.
Blood samples were collected at 15 weeks of age. Genomic DNA was then extracted by the phenol-chloroform method and diluted to 50 ng/ml. Genotyping was performed using Illumina 60K Chicken SNP BeadChips (Groenen et al., 2011). Quality control was conducted on all 394 birds (after quality control of their phenotypic records) across four breeds by customized scripts in R (3.3.0) (R Core Team, 2017) using the following criteria: individual samples were excluded with call rates < 0.9; single-nucleotide polymorphisms (SNPs) were removed as a result of call rates < 0.9, minor allele frequency (MAF) < 0.05, or undetermined positions on the chromosome. For the Z chromosome, 1611 markers were excluded since they were falsely genotyped as heterozygous in female individuals.
Genetic Diversity and Population Structure Analysis
PLINK (1.9) (Purcell et al., 2007) was used to evaluate the genetic diversity within each subpopulation. Firstly, we calculated the allele frequencies of both alleles at each locus using the command “--freq” and then obtained the proportion of polymorphic loci (Ppoly), whose MAF > 0.05, across all 46211 SNPs. Next, the pairwise linkage disequilibrium values (r2) from the command “--ld-window” were used to select the independent SNP sets of each subpopulation and to plot the pattern of linkage disequilibrium (LD) decay. Markers were determined to be “independent” when the r2 value between them was below 0.2. The expected heterozygosity values (He) were derived at each of the independent SNP loci using the command “--hardy”; the Hardy–Weinberg equilibrium exact test p-values were acquired from the same command. Finally, we assessed the population structure by principal component analysis using TASSEL (5.2.30) (Bradbury et al., 2007).
Consequently, in addition to the two non-genetic factors “sex” and “birth-weight,” “breed” was included as the third fixed effect in the following models. To further correct for population stratifications, the polygenic effect as a random effect was also included. Therefore, the three fixed effects and the polygenic effect constituted the basic model for all of the following analyses done by the R package – GenABEL (Aulchenko et al., 2007b) throughout this study. (All scripts are available from the authors on request).
A two-step score test using mixed model and regression (GRAMMAR) (Aulchenko et al., 2007a) was adopted in GenABEL. Briefly, by implementing the kinship matrix from “ibs” into the function “polygenic” in GenABEL, we obtained the residuals from the model, which accounted for all the fixed effects, the polygenic effect, and covariates, if they were present. Then, these residuals were used as the dependent traits in a simple linear regression for single-marker, haplotype-based association, or epistatic scans.
Single-Marker Association Analysis
For the single-marker association study, the linear mixed model was constructed as follows:
Here, y is the phenotypic value, μ is the overall mean, β is the fixed effects, F is the design matrix of all three fixed effects, aS is the marker genotype effect, and XS is the vector of genotypes at the tested SNP, G is the random polygenic effect; its variance is defined as ∅, where ∅ is the kinship matrix from the whole-genome SNPs and is the additive genetic variance due to the polygenes. In addition, the residual effects 𝜀 ~ (0, I).
In GenABEL, the first step is to build a kinship matrix ϕ with the whole-genome SNPs using the function “ibs,” then, by using “polygenic” and “mmscore,” we further estimated the genetic effects of each SNP.
Meanwhile, a second R package, FarmCPU (Liu et al., 2016), was also used here. In FarmCPU, rather than using the whole-genome SNPs, the kinship matrix ∅ was defined by a selected set of “pseudo quantitative trait nucleotides (QTNs).” Briefly, to remove the confounding between the tested SNP and both population structure and kinship, one fixed model and one random model were iteratively tested. The random model was used to select and evaluate the set of “pseudo QTNs” for every tested SNP, and the fixed model then fitted these “QTNs” to control false positives. Estimated genetic effects of the tested SNP were obtained, when a stage of convergence of the two models was reached after the iterative process.
The genomic inflation factor λ of both methods was further examined. Values of λ below 1.1 were considered acceptable, and the test statistics were further divided by λ to ensure that there were no indications of population stratification or cryptic relatedness in the final corrected dataset.
When multiple SNPs on the same chromosome were found to be significantly associated with the same trait, tests for their independence were performed. We first included the most significant SNP as an additional covariate in the model, and then we tested the remaining SNP(s) individually. If there were tested SNP(s) still showing significant association with the traits, the most significant SNP from this round of testing was also included as one covariate; the model then contained two SNPs as covariates. The testing continued until no significantly associated SNP was found [see model (2) below]. Last, we defined those remaining significant SNPs as independent signals, which led to our final result set.
Here, y, μ, β, F, as, Xs, G, and 𝜀 are the same as described in model 1; N is the number of covariates included in the model, γi is the effect of the ith SNP as the covariate, and Ci is the vector of the corresponding covariate.
The genome-wide significance thresholds were defined using a randomization test based on 1,000 permuted datasets (Churchill and Doerge, 1994). Afterward, 1 and 5% significance thresholds of each trait were used to screen for significantly associated SNPs.
The proportion of the phenotypic variance explained by each of the significant SNPs was calculated as follows:
Where p and q are the frequencies of the two alleles at the tested locus. As all of the final SNPs were associated with BW at multiple weeks of age, we used the phenotype that had the most significant association with the SNPs. Thus, a is the marker genotype effect estimated from the selected BW trait and Vary is the variance of the selected BW trait.
Haplotype-Based Association Analysis
To further explore the effects of haplotypes harboring the identified significant SNPs, the corresponding blocks were first defined. By employing Haploview 4.2 (Barrett et al., 2005), pair-wise LD r2 values were evaluated for each breed separately. Haplotype blocks were either defined directly by the default settings implemented in Haploview or defined as the clusters of SNPs that had r2 > 0.2 between its adjacent SNPs. Within these selected regions, SNPs were phased using fastPHASE 1.2 (Scheet and Stephens, 2006) and then the frequencies of all identified haplotypes were calculated. We retained those haplotypes that had frequencies ≥ 5% in at least one of the subpopulations for further association analyses.
The haplotype effects were evaluated at two levels based on model 3.
Here, y, μ, β, F, G, and 𝜀 are the same as described in model 1.
First, we tested the haplotypes separately to obtain the individual effects. For example, there were six haplotypes defined for BW_Q2 (Supplementary Table S3). When the effect of haplotype “CGAG” was to be estimated, two “alleles” were defined accordingly, one was this “CGAG” and all remaining haplotypes were defined as the other “allele.” Therefore, its effect could be evaluated as the allele substitution effect. In addition, ah is the estimate of the tested haplotype, while Xh is a column vector filled with the counts of the selected haplotype (0, 1, or 2). Bonferroni correction was used to obtain the threshold and the total test number was the sum of the tested haplotypes across the six loci.
Second, the effects of those significantly associated haplotypes were further estimated in their diploid form. Using haplotype “CGAG” in BW_Q2 as an example, the estimates of homozygotes “CGAG/CGAG” were evaluated in this step. Here, the coefficient vector Xh consists of values inferring the corresponding genotypes of each sample (e.g., coefficient “66” means the corresponding individual had haplotype “CGAG” on both chromosomes, and there were tens of different genotypes for one locus); ah is the estimates of the two combined haplotypes. Again, Bonferroni correction was used. The total test number was the number of tested loci, which equaled six.
To perform the variance-heterogeneity genome-wide association study (vGWAS), we transformed the phenotypes following an inverse-normal transformation (customized R scripts can be found in Supplementary Data Sheet S1). Then, the values were squared before we used them in the standard single-maker GWA analyses by GenABEL, as model 1 [details can be found in Yang et al. (2012), Shen et al. (2014)].
Significant results from the vGWAS together with selected SNPs from the GWA analyses (a total of 10 SNPs) were further examined individually against the whole-genome SNP set, based on a two-locus model, to scan for the potential G × G pairs.
Here, y, μ, β, F, G, and 𝜀 are the same as model 1. The null model (model 4) included the marker genotype effects of two loci, where X1 and X2 are the coefficient vectors of additive effects, and a1 and a2 are the effect estimates of the two tested SNPs, respectively. The full model (model 5) fitted an extra interactive term with E, the design matrix for the interactive effect between the two loci, and a1 × 2, the estimated epistatic genetic effect. By comparing the residuals from these two models, F statistics were calculated to evaluate the significance of the effects. Accordingly, permutation tests based on 1000 randomized datasets were performed to empirically derive the significance for each of the 10 targeted SNPs.
Candidate Gene Search and Gene Ontology (GO) Analysis
Based on the pairwise r2 values calculated in the earlier steps, the average lengths between two markers, when their r2 dropped to 0.2, were obtained (Table 1). Given that r2 descreased over the longest distance (72 Kb) in Silkie chicken, the candidate regions were thus determined to 72 Kb upstream and downstream of the significant SNPs. Therefore, candidate genes were searched within these defined regions using ENSEMBL biomart3 (Kinsella et al., 2011). GO enrichment analysis on the identified candidate genes was further carried out using DAVID4.
Genetic Diversity of Beard Chickens Is Different From That of Other Subpopulations
As a result of random sampling from the original conserved populations, the observed genotypes at approximately 99% of SNP loci conformed to Hardy–Weinberg expectations in each subpopulation (Table 1). The distributions of MAF and He estimated from the whole-genome SNP set in the four subpopulations are presented in Figures 1A,B. While Chahua, Silkie, and Langshan breeds show similar distributions of these two parameters, Beard displays a rather different pattern. We observed a decreased proportion of MAF and He between 0 to 0.1 in Beard chickens, which revealed that a relatively smaller part of Beard’s genome has low levels of heterozygosity. The LD-decay pattern (Table 1 and Figure 1C) also shows that r2 decreases over shorter distances (an average of 3.5 Kb) in Beard than in the other three subpopulations. Overall, there was a clear difference in the level of genetic diversity between Beard and the other three subpopulations.
FIGURE 1. Genetic diversity of four subpopulations estimated by MAF, He, and LD decay. (A) Distributions of minor allele frequency (MAF) estimated from the whole-genome single-nucleotide polymorphism (SNP) set in each subpopulation. Rectangles of different colors indicate different clusters of MAF, while rectangles of different lengths indicate the proportions of markers within different clusters. (B) Distributions of expected heterozygosity (He) estimated from the derived independent SNP set in each subpopulation. Rectangles of different colors indicate different clusters of He, while rectangles of different lengths indicate the proportins of markers within different clusters. (C) Decay of linkage disequilibrium (LD) of the four subpopulations across the genome. The pairwise linkage disequilibrium (r2) is plotted against the corresponding physical distances. When r2 drops to 0.2, the longest interval between two SNPs is the 72 Kb observed in Silkie (shown with dotted lines on graph).
Large BW Variations Were Observed Among These Indigenous Breeds
After quality control, 388 birds from four Chinese native chicken breeds were included in this study and their live BW measurements were recorded from birth to 15 weeks of age. The descriptive statistics of the phenotypic measurements of each subpopulation are given in Supplementary Table S4 and their growth curves are presented in Figure 2. In agreement with their adult weight [mean values of BW at 300 days of age recorded in Animal Genetic Resources in China-Poultry (Zhang et al., 2011)], the 15-week BWs of Beard and Langshan chickens were significantly heavier than those of Chahua and Silkie (p < 0.01, t-test). While no significant differences were found between Chahua and Silkie throughout all examined weeks, there were differences between Beard and Langshan chickens from weeks 3 to 14. Therefore, these large variations in BW provided the basis for our next-step analysis.
FIGURE 2. Growth curves of the four studied subpopulations from birth to 15 weeks of age. X-axis of the main plot shows chickens at different weeks of age, while y-axis presents their mean body weights. The inset map of China at the left side is marked with different colors to show the geographical origins of the four subpopulations. Boxplot insets, as two examples, show the mean differences of four subpopulations at 8 and 15 weeks of age. Letters a–d in the boxplots indicate the statistical differences between subpopulations. Subpopulations with different letters have significantly different means (p < 0.05). For example, at 8 weeks of age, the mean body weights between Chahua and Silkie are not significantly different, as they both are labeled with the letter “a.” On the contrary, the mean body weight of Langshan is significantly different from that of Beard, as they are labeled with different letters.
Six Loci Were Found to Be Significantly Associated With BW Across Different Growth Stages in This Population
The principal component analysis of our dataset using the first two components showed that these four subpopulations had distinct genetic backgrounds (Figure 3A). Based on thresholds determined by permutation tests, significantly associated SNPs found by R package GenABEL were clustered into 13 QTLs (Supplementary Table S5). Of these 13 QTLs, five were found to be associated with more than one BW trait and thus were retained for further testing. Moreover, a second R package, FarmCPU was employed to perform a parallel GWA analysis. Three significant signals co-localized with QTLs detected by GenABEL, which were also included in the final result set (Supplementary Table S5). After testing for their independence, a total of 8 SNPs clustered into six independent QTLs on six chromosomes were identified at two significance levels in our single-marker association analysis (Table 2 and Figure 3B). These loci were named as a combination of BW_Q (QTLs associated with BW) and the chromosome number where the QTL was located. Moreover, when multiple independent QTLs co-localized on the same chromosome, ordered letters in the alphabet were added at the end to indicate their sequential position along the same chromosome.
FIGURE 3. Quantitative trait loci (QTL) profile of the six identified loci across the genome. (A) Mapping samples to the space of the first two Principle Components (PCs1 vs. PCs2) resulting from analysis of genomic kinship. Dots in different colors represent samples from different subpopulations. (B) Schematic of the results from the single-marker association study based on GenABEL and FarmCPU methods. The dashed lines in black, and green denote the 5 and 1% genome-wide significance, respectively, based on package GenABEL. The red line shows the 1% genome-wide significance based on package FarmCPU. Dots represent the results from GenABEL, while triangles represent the results from FarmCPU.
All six QTLs were found to be associated with BW at several weeks of age. Especially, two QTLs, BW_Q9 and BW_Q27, were comprised of two correlated SNPs that were only tens of Kbs apart. In addition, more QTLs, with larger average effects, were found for growth at intermediate and late growth phases than for growth at early phases. Each QTL explained only a small proportion of the phenotypic variances, which further revealed the highly polygenic nature of the BW trait.
Haplotypes Within the Six QTLs Revealed Both Unique and Shared Genetic Basis Underlying the Large Variations in BW Among the Four Subpopulations
Next, we examined the identified loci based on haplotypes around the significant SNPs (i.e., core SNPs) listed in Table 2. The effects of all six QTLs were confirmed to be significant in the haplotype-based association (Figure 4). To further clarify the patterns of haplotype frequencies across different subpopulations, additional data from 4 RJ and 99 RW chickens were used.
FIGURE 4. Significant haplotype profile across the genome. The physical position of each QTL is shown at the top; below is the corresponding significant haplotype within each locus. Core SNPs are marked with the underscores. Red letters mean that the corresponding haplotypes increase body weight, while blue letters denote an effect of decreasing the body weight. Their manifested body weight traits at different weeks of age are plotted below, where black and green diamonds denote the 5 and 1% genome-wide significance, respectively.
In general, the identified haplotypes were associated with BW traits for a longer period of time than those found in the single-marker association analysis. In concordance with this result, the estimated effects of these haplotypes were larger than those of the corresponding significant SNPs. For each locus, multiple haplotypes were defined across 2–6 SNPs. Nevertheless, except for BW_Q9, there was only one haplotype that was found to be significantly associated with the BW traits in the other five QTLs. In addition, the phenotypic values of individuals, which were homozygous for the identified haplotype, significantly deviated from the mean values.
Compared with other haplotypes within BW_Q2, the identified haplotype in this locus (hereafter called BW_H2, similar to the rules used in naming QTLs) had a positive additive effect on BW throughout 10 weeks of growth (Table 3). However, rather than being the major haplotype in the two high BW subpopulations, BW_H2 was uniquely found in the Beard chickens. Additionally, while the core SNP was polymorphic in both RJ and RW samples, no birds in RJ and only a few in RW had this haplotype (Supplementary Table S3). Therefore, we infer that this haplotype is newly evolved in Beard chickens, which shows its unique potential in the genetic improvement of BW.
TABLE 3. Estimated effects of the significant haplotypes and their corresponding homozygotes within six identified QTLs.
It is interesting that half of the QTLs had haplotypes (BW_H3, BW_H5 and BW_H20) that were found to significantly decrease the BW (Table 3). More notably, apart from BW_H20 that also represented a small proportion in Langshan (5%), the haplotypes were again represented by one major haplotype (>20%), which was uniquely found in Beard chickens. In four RJ samples, the core SNPs were all fixed for the allele other than those in the identified haplotypes. Consequently, none of these haplotypes were found in RJ. However, unlike BW_Q2, they are quite common in RW (Supplementary Table S3). This might indicate that, regardless of its decreased effects on BW in Beard and RW chickens, it is still preferable in selection due to its association with other traits.
The remaining loci were those containing two core SNPs. The identified haplotype, BW_H27, had a significantly positive effect on BW, compared with that of other haplotypes in locus BW_Q27 (Figure 5D). As expected, BW_H27 shared much higher frequencies in Langshan (23%), Beard (29%), and RW (44%) than it did in Chahua (4%), Silkie (0%), and RJ (not found) (Figure 5B). This observation suggested that this locus is a common contributor to BW in these four indigenous breeds (Chahua, Silkie, Langshan, and Beard). Unlike any loci described above, we found two haplotypes that were significantly associated with BW traits in BW_Q9 (Figure 5A). Moreover, the effects of these two haplotypes, BW_H9a and BW_H9b, were of different directions. BW_H9a had a negative additive effect, and was found to be associated with BW traits for five consecutive weeks during the late growth phase. However, at 14 weeks of age, when both BW_H9a and BW_H9b were significantly associated with the BW, the effect size of BW_H9b was larger than that of BW_H9a (Figure 5C). Additionally, haplotype BW_H9b was only found in Langshan and RW. Therefore, locus BW_Q9 contributes to decrease BW in Chuahua, Silkie, and Beard, while it is more likely to increase BW in Langshan. Therefore, although BW_Q27 and BW_Q9 both had effects on all four indigenous breeds, their underlying genetic architectures are not the same.
FIGURE 5. Genetic characteristics of the significant haplotypes in BW_Q9 and Q27. (A,B) Distribution of the significant haplotypes from BW_Q9 and Q27 in six subpopulations. (C) Estimated effects of the combined two haplotypes, BW_H9a and BW_H9b, on body weight at 14 and 15 weeks of age. (D) Estimated effects of the homozygous BW_H27 on body weight from 4 to 8 weeks of age.
Detected Significant Interactive Pairs Infer a Role of Epistasis in the Genetic Basis of BW
First, we screened for vGWAS signals, which suggest loci that are under the genetic control of plasticity through G × G or G × E interactions. Six loci were detected, two of which overlapped with the core SNPs found in BW_Q5 and BW_Q9 (Table 4). Then, together with six core SNPs found in the single-marker association analysis, a total of 10 SNPs (4 from vGWAS results) were utilized in the two-locus epistatic model. An exhaustive two-locus interaction scan was performed between the targeted SNPs (individually) and the remainder of the full SNP dataset. Based on permutation tests, interactive pairs, including three loci that originated from the vGWAS scan, were detected at a 10% genome-wide suggestive significance level (Supplementary Table S6). Moreover, 6 SNPs from 5 loci were found to significantly interact with the core SNP, rs15896134 in BW_Q2 (5% genome-wide significance). The combinations of genotypes between the detected two loci as well as their corresponding phenotypes are presented as boxplots in Supplementary Figure S1. This finding might imply a role of epistasis in driving haplotypes uniquely evolved in one of our subpopulations, such as BW_H2 in Beard chickens. However, we do advise caution in interpreting these results, as the significance of epistatic interactions were mostly due to deviated phenotypic values of a few individuals.
TABLE 4. Single-nucleotide polymorphisms (SNPs) of significant variance-heterogeneity detected by vGWAS in this study.
To explore the genetic basis underlying the large variation in BW among Chinese indigenous chicken breeds, we herein described a GWA study in a population that consisted of four Chinese indigenous subpopulations. Due to the small sample size and the mixed population structure, a stringent standard was applied to control the stratification. Significant results either were determined by two analytic methods or were found to be associated with BW traits at different weeks of age. Based on the idea that the haplotype would be in greater LD with the causal mutation than a single SNP would (Cedric et al., 2013), haplotype-based association analyses were further performed. Then, not only were all loci validated, but the power of the analyses was increased so that larger effects and more associated traits were obtained (Table 3). Finally, a total of six QTLs with seven underlying haplotypes were found to be significantly associated with BW traits from birth to 15 weeks of age.
We performed candidate gene searches and preliminary GO analysis within both the single and epistatic QTL regions. The QTL regions surrounding the significant SNPs were defined to a relatively small region (2 × 72 Kb) based on the average LD decay distance between two markers in Silkie, which was the longest among these four subpopulations. Therefore, a small number of candidate genes were identified and no enriched pathway was found. Because functional validation of these candidate genes was beyond the scope of the current study, we did not list all of them in the “Results” section, but we have included some of the promising candidates in the following discussion. These candidates had been functionally studied in previous reports.
Given the large confidence intervals normally obtained in classical QTL scans, it is not surprising that all six loci co-localize with the QTLs from previous studies in chickenQTLdb (Hu et al., 2016). The earlier studies in Chinese indigenous chickens had also detected BW_Q9 in Xinghua chicken (Fang et al., 2010) and BW_Q27 in Beard chicken (Sheng et al., 2013), which further confirmed their common role in affecting BW among different breeds. Meanwhile, candidate gene searching within the regions that were 72 Kb (defined in Figure 1C) upstream and downstream of the detected SNPs also identified two important functionally related genes. One is delta like non-canonical Notch ligand 1 (DLK1), which is 32 Kb upstream of the identified core SNP in BW_Q5. Previous studies had indicated its association with muscle tissue development in chickens (Shin et al., 2008) and obesity in humans (Persson-Augner et al., 2014). The other is growth differentiation factor 5 (GDF5), which contained the core SNP in BW_Q20 as an intron variant. Its effect on skeletal development in chickens (Francis-West et al., 1999) and adipogenesis in mice (Hinoi et al., 2014) suggests an important role in affecting BW.
The important contribution of epistasis to quantitative traits had long been established in several organisms (Carlborg et al., 2006; Wei et al., 2014; Mackay, 2015; Forsberg et al., 2017). Recently, scaning for genetic variance-heterogeneity (vGWAS) had been proposed as an effective method to reveal epistatic effects (Shen et al., 2014; Forsberg et al., 2015). In the present study, we also performed the epistatic scans first by vGWAS and then by the classical two-locus model. In the exhaustive two-locus epistatic scans, while half of the six genome-wide significant vGWAS results were paired with their suggestive significant interactive loci, only BW_Q2 from the GWA study results was identified with its significant interactive pairs. Although a number of different mechanisms other than epistasis can also lead to such genetic variance-heterogeneity (Struchalin et al., 2010; Rönnegård and Valdar, 2011, 2012), in our study, we did observe a substantial proportion of the vGWAS results showing significant gene–gene interaction effects on the phenotypes. Moreover, known candidate genes, such as ATP binding cassette subfamily D member 2 (ABCD2) (Liu et al., 2012), nuclear receptor interacting protein 1 (NRIP1) (Park et al., 2011), KIT proto-oncogene receptor tyrosine kinase (KIT) (Gutierrez et al., 2015) and butyrylcholinesterase (BCHE) (Lima et al., 2013), had been located in the vicinity of several significant interactive loci. However, we do advise caution in interpreting these results, as our study design is not ideal for such tests.
Studies aimed at revealing the major variant for BW trait had found that multiple important genes or pathways were responsible for the high variability in BW in Chinese indigenous chicken breeds. For example, BMP10 accounted for more than 20% of the phenotypic variation in the Yuanbao chicken (Wang et al., 2016). miR-16 in one major QTL from Xinghua chicken was found to decrease BW by inhibiting cell proliferation (Jia et al., 2016). Ubiquitin mediated proteolysis, identified from several pairs of interactive loci, played an important role in regulating BW in Beard chickens (Sheng et al., 2013). Nevertheless, most of these findings originated from a single indigenous breed, and there is a low degree of overlap between the results. This led to an assumption that the major contributor to BW in different breeds might not be the same. GO analysis in our study seems to support this assumption, since no significant enrichments of the candidate genes were revealed (detailed results not shown). Additionally, by further examining the significant haplotypes within each subpopulation, more loci had effects on BW in a single breed than had a common role in regulating BW across all breeds.
In concordance with the relatively higher level of genetic diversity observed in Beard chickens, four out of the six identified QTLs were the breed-specific loci, in which the significant haplotypes were uniquely found in Beard chickens (Supplementary Table S3). More notably, BW_Q3, 5, and 20 were transgressive loci, i.e., the haplotypes having negative effects on BW were of significantly higher proportion in the high BW breed (Beard) of this study. Similar findings had been observed in other chicken QTL scans (Abasht et al., 2006), even in one population under intense artificial selection for BW (Sheng et al., 2015), which suggested the presence of transgressive loci. One possibility underlying such observations would be epistasis, where the effects of these transgressive loci depended on the genetic context in which they resided. In our vGWAS analysis, BW_Q5 showed variance heterogeneity, suggesting that this locus was a candidate of G∗E or G∗G interactions. However, the sample size and marker density in our study limited our ability to find its interactive pair in the exhaustive two-locus scan. Another potential explanation would be that these loci are pleiotropic for traits that are negatively correlated with BW, similar patterns have been found in some plant studies (Qian et al., 2016; Li et al., 2018). Therefore, regardless that their effects on BW seemed to be against the selection in this population, the overall fitness could still increase as they might have positive effects (in terms of selection) on other trait(s). Further work, (e.g., more phenotyping) is needed to dissect the underlying mechanisms.
Unlike the four QTLs discussed above, BW_Q9 and Q27 were the breed-shared loci that were associated with BW across the four subpopulations. Earlier studies in Chinese indigenous chickens had already detected BW_Q9 in the Xinghua breed (Fang et al., 2010) and BW_Q27 in Beard chicken (Sheng et al., 2013), which further indicates the common role of these two loci in different breeds. However, by further examining the frequencies of their significant haplotypes across the four subpopulations as well as in the two comparisons, RJ and RW samples, the underlying genetic patterns of these two loci were not the same. Haplotype BW_H27, which increased BW, was present in greater proportions in the high BW subpopulations than in the low BW subpopulations. Locus BW_Q9 had two significant haplotypes that were of opposite effects, so that different combinations of these two haplotypes would contribute differently to the BW. Most studies that dissect the genetic architecture of a complex trait, such as BW, focus on the number of genetic variants affecting the trait and their physical positions in the genome. Herein, we showed an example where the additional information on the frequencies of underlying genetic variants across different subpopulations, their magnitude of effects, and gene–gene interactions provided an in-depth understanding of the genetic architecture of a complex trait, which will be important to future advances in animal breeding.
In summary, we performed a GWA study across four Chinese indigenous chicken breeds. The single-marker association study identified six QTLs that were significantly associated with BW from birth until 15 weeks of age, two of which (BW_Q9 and Q27) co-localized with known QTLs found in Chinese indigenous breeds. Additionally, four out of the six QTLs were polymorphisms in Beard chickens alone, thus, they were breed-specific loci that contributed to BW variability in one breed. This is probably caused by the complex origin and demographic history of domestic chickens (Miao et al., 2013). While most mapping studies have focused on the number of genetic variants that were responsible for the phenotypic variability, we have further examined their frequencies and the magnitude of effects at the haplotype level. The breed-shared loci BW_Q9 and Q27 showed distinct underlying genetic patterns, which suggests not only a polygenic basis but also a variant genetic architecture leading to the complex quantitative characteristics of BW.
Data Availability Statement
ZS, XH, and YG conceived and designed the experiments. DP, XG, and ZS collected the samples and performed the experiments. YY and ZS analyzed the data and drafted the manuscript. XH and YG revised the manuscript. All authors read and approved the final manuscript.
This study was supported by grants from the National Natural Science Foundation of China (31472083 and 31702108), the Fundamental Research Funds for the Central Universities (2662016QD018), and Open Research Program of State Key Laboratory for Agro-Biotechnology (2015SKLAB6-11). The funders had no role in the study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Conflict of Interest Statement
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
The data of red junglefowl samples were kindly provided by Prof. Martien A. M. Groenen from Wageningen Univerity. We thank Dr. Lucas Janss from Aarhus University for his suggestions on the language and the methods.
The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fgene.2018.00229/full#supplementary-material
FIGURE S1 | Two-locus genotype-phenotype map for all significant interactive pairs. X-axis is the genotype combination of the two tested loci, while y-axis presents the residuals of phenotypes after correction for fixed and random effects. Numbers shown at the top in blue are the individual counts for each genotype combination.
TABLE S1 | Data structure of the four subpopulations.
TABLE S2 | Summary statistics of SNP markers acorss whole-genome in this dataset.
TABLE S3 | Distribution of main haplotypes within each QTL across six sub-populations.
TABLE S4 | Descriptive statistics of the body weight traits in four sub-populaitons.
TABLE S5 | A full list of detected SNPs from single-marker association analyses.
TABLE S6 | Significant SNP pairs from the exhaustive two-loci epistatic san in this study.
DATA SHEET S1 | The R code for transforming the raw phenotypes for the vGWAS analysis.
DATA SHEET S2 | The genotype datasets of this study.
DATA SHEET S3 | The phenotype datasets of this study.
- ^ http://www.genebank.org.cn/read.asp?id=1
- ^ https://www.r-project.org/
- ^ http://www.ensembl.org/biomart
- ^ https://david.ncifcrf.gov/
Aulchenko, Y. S., de Koning, D.-J., and Haley, C. (2007a). Genomewide rapid association using mixed model and regression: a fast and simple method for genomewide pedigree-based quantitative trait loci association analysis. Genetics 177, 577–585. doi: 10.1534/genetics.107.075614
Bradbury, P. J., Zhang, Z., Kroon, D. E., Casstevens, T. M., Ramdoss, Y., and Buckler, E. S. (2007). TASSEL: software for association mapping of complex traits in diverse samples. Bioinformatics 23, 2633–2635. doi: 10.1093/bioinformatics/btm308
Elferink, M. G., Megens, H.-J., Vereijken, A., Hu, X., Crooijmans, R. P. M. A., and Groenen, M. A. M. (2012). Signatures of selection in the genomes of commercial and non-commercial chicken breeds. PLoS One 7:e32720. doi: 10.1371/journal.pone.0032720
Fang, M., Nie, Q., Luo, C., Zhang, D., and Zhang, X. (2010). Associations of GHSR gene polymorphisms with chicken growth and carcass traits. Mol. Biol. Rep. 37, 423–428. doi: 10.1007/s11033-009-9556-9
Forsberg, S. K. G., Andreatta, M. E., Huang, X.-Y., Danku, J., Salt, D. E., and Carlborg, Ö (2015). The multi-allelic genetic architecture of a variance-heterogeneity locus for molybdenum concentration in leaves acts as a source of unexplained additive genetic variance. PLoS Genet. 11:e1005648. doi: 10.1371/journal.pgen.1005648
Forsberg, S. K. G., Bloom, J. S., Sadhu, M. J., Kruglyak, L., and Carlborg, Ö (2017). Accounting for genetic interactions improves modeling of individual quantitative trait phenotypes in yeast. Nat. Publ. Group 49, 497–503. doi: 10.1038/ng.3800
Groenen, M. A., Megens, H.-J., Zare, Y., Warren, W. C., Hillier, L. W., Crooijmans, R. P., et al. (2011). The development and characterization of a 60K SNP chip for chicken. BMC Genomics 12:274. doi: 10.1186/1471-2164-12-274
Gu, X., Feng, C., Ma, L., Song, C., Wang, Y., Da, Y., et al. (2011). Genome-wide association study of body weight in chicken F2 resource population. PLoS One 6:e21872–e21875. doi: 10.1371/journal.pone.0021872
Gutierrez, D. A., Muralidhar, S., Feyerabend, T. B., Herzig, S., and Rodewald, H.-R. (2015). Hematopoietic kit deficiency, rather than lack of mast cells, protects mice from obesity and insulin resistance. Cell Metab. 21, 678–691. doi: 10.1016/j.cmet.2015.04.013
Hinoi, E., Nakamura, Y., Takada, S., Fujita, H., Iezaki, T., Hashizume, S., et al. (2014). Growth differentiation factor-5 promotes brown adipogenesis in systemic energy expenditure. Diabetes 63, 162–175. doi: 10.2337/db13-0808
Jia, X., Lin, H., Nie, Q., Zhang, X., and Lamont, S. J. (2016). A short insertion mutation disrupts genesis of miR-16 and causes increased body weight in domesticated chicken. Sci. Rep. 6:36433. doi: 10.1038/srep36433
Kinsella, R. J., Kähäri, A., Haider, S., Zamora, J., Proctor, G., Spudich, G., et al. (2011). Ensembl BioMarts: a hub for data retrieval across taxonomic space. Database 2011:bar030. doi: 10.1093/database/bar030
Li, Y.-H., Reif, J. C., Hong, H.-L., Li, H.-H., Liu, Z.-X., Ma, Y.-S., et al. (2018). Genome-wide association mapping of QTL underlying seed oil and protein contents of a diverse panel of soybean accessions. Plant Sci. 266, 95–101. doi: 10.1016/j.plantsci.2017.04.013
Lima, J. K., Leite, N., Turek, L. V., Souza, R. L. R., da Silva Timossi, L., Osiecki, A. C. V., et al. (2013). 1914G variant of BCHE gene associated with enzyme activity, obesity and triglyceride levels. Gene 532, 24–26. doi: 10.1016/j.gene.2013.08.068
Liu, J., Liang, S., Liu, X., Brown, J. A., Newman, K. E., Sunkara, M., et al. (2012). The absence of ABCD2 sensitizes mice to disruptions in lipid metabolism by dietary erucic acid. J. Lipid Res. 53, 1071–1079. doi: 10.1194/jlr.M022160
Liu, X., Huang, M., Fan, B., Buckler, E. S., and Zhang, Z. (2016). Iterative usage of fixed and random effect models for powerful and efficient genome-wide association studies. PLoS Genet. 12:e1005767. doi: 10.1371/journal.pgen.1005767
Martinez, V., Bünger, L., and Hill, W. G. (2000). Analysis of response to 20 generations of selection for body composition in mice: fit to infinitesimal model assumptions. Genet. Sel. Evol. 32, 3–21. doi: 10.1051/gse:2000103
Miao, Y.-W., Peng, M.-S., Wu, G.-S., Ouyang, Y.-N., Yang, Z.-Y., Yu, N., et al. (2013). Chicken domestication: an updated perspective based on mitochondrial genomes. Heredity 110, 277–282. doi: 10.1038/hdy.2012.83
Park, S. H., Lee, J. Y., and Kim, S. (2011). A methodology for multivariate phenotype-based genome-wide association studies to mine pleiotropic genes. BMC Syst. Biol. 5(Suppl. 2):S13. doi: 10.1186/1752-0509-5-S2-S13
Persson-Augner, D., Lee, Y.-W., Tovar, S., Dieguez, C., and Meister, B. (2014). Delta-like 1 homologue (DLK1) protein in neurons of the arcuate nucleus that control weight homeostasis and effect of fasting on hypothalamic DLK1 mRNA. Neuroendocrinology 100, 209–220. doi: 10.1159/000369069
Purcell, S., Neale, B., Todd-Brown, K., Thomas, L., Ferreira, M. A. R., Bender, D., et al. (2007). PLINK: a tool set for whole-genome association and population-based linkage analyses. Am. J. Hum. Genet. 81, 559–575. doi: 10.1086/519795
Qanbari, S., Seidel, M., Strom, T.-M., Mayer, K. F. X., Preisinger, R., and Simianer, H. (2015). Parallel selection revealed by population sequencing in chicken. Genome Biol. Evol. 7, 3299–3306. doi: 10.1093/gbe/evv222
Rubin, C.-J., Zody, M. C., Eriksson, J., Meadows, J. R. S., Sherwood, E., Webster, M. T., et al. (2010). Whole-genome resequencing reveals loci under selection during chicken domestication. Nature 464, 587–591. doi: 10.1038/nature08832
Scheet, P., and Stephens, M. (2006). A fast and flexible statistical model for large-scale population genotype data: applications to inferring missing genotypes and haplotypic phase. Am. J. Hum. Genet. 78, 629–644. doi: 10.1086/502802
Shen, X., De Jonge, J., Forsberg, S. K. G., Pettersson, M. E., Sheng, Z., Hennig, L., et al. (2014). Natural CMT2 variation is associated with genome-wide methylation changes and temperature seasonality. PLoS Genet. 10:e1004842. doi: 10.1371/journal.pgen.1004842
Sheng, Z., Pettersson, M. E., Honaker, C. F., Siegel, P. B., and Carlborg, Ö (2015). Standing genetic variation as a major contributor to adaptation in the Virginia chicken lines selection experiment. Genome Biol. 16:219. doi: 10.1186/s13059-015-0785-z
Sheng, Z., Pettersson, M. E., Hu, X., Luo, C., Qu, H., Shu, D., et al. (2013). Genetic dissection of growth traits in a Chinese indigenous × commercial broiler chicken cross. BMC Genomics 14:151. doi: 10.1186/1471-2164-14-151
Shin, J., Lim, S., Latshaw, J. D., and Lee, K. (2008). Cloning and expression of delta-like protein 1 messenger ribonucleic acid during development of adipose and muscle tissues in chickens. Poult. Sci. 87, 2636–2646. doi: 10.3382/ps.2008-00189
Struchalin, M. V., Dehghan, A., Witteman, J. C., van Duijn, C., and Aulchenko, Y. S. (2010). Variance heterogeneity analysis for detection of potentially interacting genetic loci: method and its limitations. BMC Genet. 11:92. doi: 10.1186/1471-2156-11-92
Tang, S., Ou, J., Sun, D., Xu, G., and Zhang, Y. (2011). A novel 62-bp indel mutation in the promoter region of transforming growth factor-beta 2 (TGFB2) gene is associated with body weight in chickens. Anim. Genet. 42, 108–112. doi: 10.1111/j.1365-2052.2010.02060.x
Timpson, N. J., Greenwood, C. M. T., Soranzo, N., Lawson, D. J., and Richards, J. B. (2018). Genetic architecture: the shape of the genetic contribution to human traits and disease. Nat. Rev. Genet. 19, 110–124. doi: 10.1038/nrg.2017.101
Wang, M.-S., Huo, Y.-X., Li, Y., Otecko, N. O., Su, L.-Y., Xu, H.-B., et al. (2016). Comparative population genomics reveals genetic basis underlying body size of domestic chickens. J. Mol. Cell Biol. 8, 542–552. doi: 10.1093/jmcb/mjw044
Xie, L., Luo, C., Zhang, C., Zhang, R., Tang, J., Nie, Q., et al. (2012). Genome-wide association study identified a narrow chromosome 1 region associated with chicken growth traits. PLoS One 7:e30910. doi: 10.1371/journal.pone.0030910
Yang, J., Loos, R. J. F., Powell, J. E., Medland, S. E., Speliotes, E. K., Chasman, D. I., et al. (2012). FTO genotype is associated with phenotypic variability of body mass index. Nature 490, 267–272. doi: 10.1038/nature11401
Keywords: genome-wide association study, haplotype-based association study, genetic architecture, body weight, polygenic basis, Chinese indigenous chicken
Citation: Yuan Y, Peng D, Gu X, Gong Y, Sheng Z and Hu X (2018) Polygenic Basis and Variable Genetic Architectures Contribute to the Complex Nature of Body Weight —A Genome-Wide Study in Four Chinese Indigenous Chicken Breeds. Front. Genet. 9:229. doi: 10.3389/fgene.2018.00229
Received: 31 January 2018; Accepted: 11 June 2018;
Published: 02 July 2018.
Edited by:Nguyen Hong Nguyen, University of the Sunshine Coast, Australia
Reviewed by:Tomasz Szwaczkowski, Poznań University of Life Sciences, Poland
Ikhide G. Imumorin, Georgia Institute of Technology, United States
Copyright © 2018 Yuan, Peng, Gu, Gong, Sheng and Hu. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.