Genome-Wide Association Analyses Highlight the Potential for Different Genetic Mechanisms for Litter Size Among Sheep Breeds

Reproduction is an important trait in sheep breeding as well as in other livestock. However, despite its importance the genetic mechanisms of litter size in domestic sheep (Ovis aries) are still poorly understood. To explore genetic mechanisms underlying the variation in litter size, we conducted multiple independent genome-wide association studies in five sheep breeds of high prolificacy (Wadi, Hu, Icelandic, Finnsheep, and Romanov) and one low prolificacy (Texel) using the Ovine Infinium HD BeadChip, respectively. We identified different sets of candidate genes associated with litter size in different breeds: BMPR1B, FBN1, and MMP2 in Wadi; GRIA2, SMAD1, and CTNNB1 in Hu; NCOA1 in Icelandic; INHBB, NF1, FLT1, PTGS2, and PLCB3 in Finnsheep; ESR2 in Romanov and ESR1, GHR, ETS1, MMP15, FLI1, and SPP1 in Texel. Further annotation of genes and bioinformatics analyses revealed that different biological pathways could be involved in the variation in litter size of females: hormone secretion (FSH and LH) in Wadi and Hu, placenta and embryonic lethality in Icelandic, folliculogenesis and LH signaling in Finnsheep, ovulation and preovulatory follicle maturation in Romanov, and estrogen and follicular growth in Texel. Taken together, our results provide new insights into the genetic mechanisms underlying the prolificacy trait in sheep and other mammals, suggesting targets for selection where the aim is to increase prolificacy in breeding projects.


INTRODUCTION
Reproduction is one of the most important traits in livestock production particularly for females. Selection for higher prolificacy in domestic sheep (Ovis aries) has led to variable litter size (LS) within and among breeds. For example, individual litter size of 1 to 8 has been recorded in the Hu sheep and Finnsheep (Yue, 1996;Davis et al., 2006a).
Previous studies reported that the exceptional prolificacy of the Booroola Merino was attributed to a single major gene, while a number of mutations of a major effect on litter size have been identified in other sheep breeds (Table 1; see also Xu and Li, 2017). Vage et al. (2013) detected a mutation FecG F in gene GDF9 strongly associated with litter size in Norwegian White Sheep and Finnish Landrace (Finnsheep) using a genome-wide association analysis. Demars et al. (2013) reported the mutations FecX Gr in Grivette sheep and FecX O in Olkuska sheep associated with the highly prolific phenotype by a genome-wide association analysis. Cao et al. (2016) found that nine candidate genes including the well-known FecB mutation played important roles in the variable litter size in Hu and Small-tailed Han sheep through methylated DNA-immunoprecipitation sequencing data. Miao et al. (2016) identified a set of differentially expressed genes (e.g., FecB) between low-and high-prolificacy breeds (Dorset vs. Small-tailed Han sheep) through implementing integrated analysis of miRNAs and lncRNAs. Lassoued et al. (2017) found the mutation FecX Bar associated with the prolificacy in Tunisian Barbarine. Despite its great importance the genetic mechanisms of the high prolificacy trait in domestic sheep are still poorly understood, partly due to shortage of studies conducted across multiple prolific sheep breeds. To date, numerous fecundity-associated mutations have been identified in different sheep breeds, but very few mutations have been consistently detected across the breeds. Despite the reproduction of ewes can be affected by the complex interactions of environmental conditions (i.e., climate, density, and food abundance) (Wilson et al., 2009), previous studies suggested that genetic factor could play important roles in the variable litter size of ewes.
In this study, we conducted multiple independent genomewide association studies (GWAS) on litter size in the sheep breeds of high (Wadi, Hu, Icelandic, Finnsheep, and Romanov) and low (Texel) prolificacy with a litter size ranging from 1 to 6 from different geographic regions ( Figure 1A) and genetic origins ( Figure 1B) of the world, respectively. Wadi sheep is a high-prolificacy native breed from the Shandong Province of China (Peng et al., 2017). Hu sheep is famous for early sexual maturity and high fecundity, and are distributed in the Taihu Lake area of Eastern China (Yue, 1996). Icelandic and  Mulsant et al., 2001;Souza et al., 2001;Wilson et al., 2001;Chu et al., 2011;Zhang et al., 2011;Cao et  Finnsheep (Finnish Landrace) sheep are northern European high-fecundity breeds (Mullen and Hanrahan, 2014;Eiriksson and Sigurdsson, 2017). Romanov sheep from the Volga Valley shows outstanding reproduction qualities: early sexual maturity, out-of-season breeding and extraordinary prolificacy (Deniskova et al., 2017). The Texel sheep is a relatively low-prolificacy breed originally from the island of Texel in the Netherlands and excels in muscle growth and lean carcasses (Casas et al., 2004). Our results will be important for further genetic improvement of the trait and for better understanding the molecular basis of reproduction in sheep as well as other mammals.

Sample Collection and Phenotyping
A total of 522 ewes from five sheep breeds of high (Wadi, n = 160; Hu, n = 117; Icelandic, n = 54; Finnsheep, n = 54; and Romanov, n = 78) and one low (Texel, n = 59) prolificacy were collected from farms in China, Iceland, Finland, and Russia ( Figure 1A). Animals included were as unrelated as possible based on analysis of pedigree records and farmers' knowledge. Data for the phenotype of litter size and the total number of litters collected from farm records are shown in Figure 2. The litter size ranged from 1 to 6 based on parity from 1 to 11 in six sheep breeds. Genomic DNA was extracted from the ear marginal tissues following a standard phenol/chloroform method and was diluted to 50 ng/µl for the SNP BeadChip genotyping (Köchl et al., 2005), except for the Icelandic samples which were isolated from whole-blood using MasterPure TM Complete DNA Purification Kit (Epicentre Biotech) following the manufacturers protocol.

Genotyping and Quality Control
All the samples were genotyped using the Ovine Infinium HD BeadChip according to the manufacturer's protocol. Genotypes of a total of 606,006 SNPs were obtained (genotype and phenotype datasets 1 ). We implemented quality control of these SNPs using PLINK v1.07 software (Purcell et al., 2007). The SNPs or individuals were excluded if they met any of the criteria: (1) no chromosomal or physical location, (2) call rate < 0.95, (3) missing genotype frequency > 0.05, and/or (4) minor allele frequency (MAF) < 0.05. SNPs were excluded from the analysis if a p-value of Fisher's exact test for Hardy-Weinberg equilibrium less than 0.001.

Genetic Relationships and Population Structure
To investigate the genetic relationships and population structure among the six domestic sheep, we performed global F ST , neighbor-joining (NJ) tree and principle component analysis (PCA). The global F ST value was calculated using GENEPOP v4.2 (Raymond and Rousset, 1995). The genetic distances between populations were calculated using an identity by state (IBS) similarity matrix (Kang et al., 2010). Then, the distances were used to construct a NJ tree with 1000 bootstraps using the package PHYLIP v.3.695 (Felsenstein, 1989). In addition, PCA was conducted using the SmartPCA program from the EIGENSOFT package version 4.2  based on the genotypes data.

Genome-Wide Association Analysis
To explore genetic structure within the breeds, multidimensional scaling (MDS) analysis was performed based on the independent SNPs using PLINK v1.07. Firstly, we implemented the option of 'indep-pairwise 50 5 0.05' in PLINK v1.07, which calculated pairwise linkage disequilibrium (LD) in a 50-SNP-window shifted at a pace of five SNPs. If the LD estimate was r 2 > 0.05, one of the pairs of SNPs was removed (Purcell et al., 2007). The independent SNPs retained by the LD criteria were then used in the MDS analysis, and the results were plotted using the GenABEL package in R v3.2.2 (Aulchenko et al., 2007). We performed genome-wide association studies within five sheep breeds of high prolificacy (Wadi, Hu, Icelandic, Finnsheep, and Romanov) and one low prolificacy (Texel) using the case/control design. We ranked all individuals within the breeds according to their litter size from the highest to lowest. Then, we selected individuals from two tails for each breed as 'case' and 'control, ' respectively. Based on the distribution of phenotypes, 114 samples (LS ≥ 2) in Wadi, 66 samples (LS ≥ 2) in Hu, 20 samples (LS > 2) in Icelandic, 37 samples (LS ≥ 2.5) in Finnsheep, 40 samples (LS ≥ 2.5) in Romanov and 28 samples (LS ≥ 1.6) in Texel sheep were selected as 'cases, ' while 28 samples (LS = 1) in Wadi, 15 samples (LS = 1) in Hu, 15 samples (LS ≤ 1.75) in Icelandic, 9 samples (LS ≤ 2) in Finnsheep, 26 samples (LS ≤ 2) in Romanov and 14 samples (LS ≤ 1.33) in Texel sheep were selected as 'controls.' In the GWAS, we used the function of "qtscore" in the GenABEL package. Associated SNPs were identified at both the genome-wide and chromosomewise significance levels (p < 0.05) after the Bonferroni correction (Bonferroni, 1936). To account for systematic biases caused by within-population substructure, the first and second dimensions from the MDS analyses were used as the covariates . The correlation analysis between litter size and parity within breeds showed that there were significant effects between litter size and parity in four breeds (Wadi, Hu, Icelandic, and Texel), and the effect of parity 1 on litter size was less than that of parities 2 through 10 (Supplementary Table S1 and Supplementary Figure S1). However, the parity of individuals within breeds was different, and we mainly focused on the mean of litter size of individual (total litter size/parity) in per breed. Therefore, we excluded the effect of parity from the model. The Quantile-Quantile (Q-Q) plots were visualized by plotting the distribution of obtained vs. expected genome-wide p-values. For genotype effect of potential SNPs on litter size in each breed, differences between means were analyzed by the Student's t-test. The p < 0.05 was considered statistically significant. All the results were presented as mean ± standard error (SE). We implemented pairwise tests of linkage disequilibrium (LD) between the most significant SNPs and their flanking SNPs within approximately 1 Mb upstream and downstream using PLINK v1.07. Regional association plots were generated using the R package v3.2.2.

Bioinformatics Analysis
We annotated the genes associated with litter size in each breed using the O. aries assembly Oar_v.4.0 2 . Further, we submitted the genes to the DAVID (database for annotation, visualization and integrated discovery) database 3 for gene ontology (GO) enrichment and pathways analyses (Huang et al., 2009a,b). The p-value of 0.1 and at least two genes from the input gene list in the enriched category were considered for the enriched GO terms. Also, we investigated the protein-protein interaction network for the candidate genes using the STRING database version 10.5 (Szklarczyk et al., 2017). In addition, differential expressions of the candidate genes in various tissues were examined using the EMBL-EBI Expression Atlas database 4 (Petryszak et al., 2016).

Population Relationship and Differentiation
Pairwise F ST value varied from 0.023 to 0.104 among the populations with the least genetic differentiation observed between Wadi and Hu sheep breeds (Supplementary Table S2).
The NJ tree showed that these breeds were clustered into two major groups according to their Chinese and European origins ( Figure 1B). A similar geographic pattern was seen in the PCA analyses with the grouping of Wadi and Hu sheep separated from the other four European breeds (Supplementary Figure S2).

Genome-Wide Association Analysis
After the quality control, 508,444 SNPs and 114 individuals (91 cases vs. 23 controls) in Wadi, 506,031 SNPs and 80 individuals (66 cases vs. 14 controls) in Hu, 443,125 SNPs and 23 individuals (8 cases vs. 15 controls) in Icelandic, 492,165 SNPs and 37 individuals (28 cases vs. 9 controls) in Finnsheep, 465,794 SNPs and 38 individuals (29 cases vs. 9 controls) in Romanov, 475,955 SNPs and 39 individuals (28 cases vs. 11 controls) in Texel sheep were retained in the working dataset for the GWAS. We did find several animals outlying the clusters of cases, which might cause biases in the association analyses (Supplementary Figure S3). We have repeated the association analyses without these animals, and found the results are very similar. Thus, we did not exclude these animals in the association analyses due to the small sample size for the breeds. The resulting genomic inflation factors were equal to 1.07 in Wadi, 1.14 in Hu,1.12 in Icelandic, 1.14 in Finnsheep, 1.10 in Romanov, and 1.05 in The 5% genome-wide significant threshold value is indicated by a dotted line. The significant SNPs surrounding the genes previously reported to be associated with reproduction are annotated at the chromosome-wise and genome-wide 5% significance after the Bonferroni correction.
In Hu sheep, we identified 98 and 9 SNPs at the chromosomewise and genome-wide (p < 2.18 × 10 −6 ) 5% significance after Bonferroni correction (Figure 3B and Supplementary  Tables S3, S4). The regional plot showed that the top significant SNPs rs429755189 and rs420460180 on chromosome 17 were in an LD block that contained gene GRIA2 (Figure 4B). For the rs429755189, average litter size of individuals with the genotypes G/G (n = 38, LS = 1.99 ± 0.07) and A/G (n = 52, LS = 1.94 ± 0.06) were significantly (p < 0.001) higher than that of ewes with the genotype A/A (n = 20, LS = 1.40 ± 0.09) in the present population ( Figure 5B). Among these significant SNPs, 3 (rs406357666, rs427436644 and rs412185353) are located within the genes SMAD1 and CTNNB1 ( Table 2 and Supplementary Table S3).
In Icelandic sheep, we found 22 SNPs at the chromosomewise 5% significance after the Bonferroni correction ( Figure 3C and Supplementary Tables S3, S4). The top significant SNP rs429836421 on chromosome 3 was located within gene NCOA1 (Figure 4C). For rs429836421, average litter size of individuals with the A/G genotype (n = 19, LS = 2.03 ± 0.05) is significantly The means LS were calculated for various breeds. Number of ewes per group of genotype is mentioned. Pairwise statistical comparisons between means of genotype's clades were performed using Student's t-test. * p < 0.05, * * p < 0.01, * * * p < 0.001.
In addition to the source breed where the target SNPs have been detected, we further assessed genotype effect of the most significant SNPs on litter size in the other five sheep breeds. In general, genotypes of the target SNPs did not show significant association with increased litter size in the breeds other than the source breed ( Supplementary  Table S7). Nevertheless, we observed some exceptions. For example, the genotype A/G of rs429836421, which was identified in Icelandic sheep, showed significant associations with increased litter size in both Icelandic and Hu sheep breeds. However, a lack of homozygotes for the SNPs such as the genotype G/G for rs412280524 in Finnsheep, G/G for rs423810437 in Romanov and C/C for rs161146164 in Texel sheep could be because of low frequency of the mutations and small sample size.

DISCUSSION
In this study, we conducted multiple independent GWAS in different sheep breeds to investigate the genetic mechanisms underlying the litter size in sheep. Coupled with population relationship and bioinformatics analyses, the GWAS identified different genes associated with the litter size in different breeds and revealed their differentially genetic regulation mechanisms associated with follicle growth and ovulation in the reproduction of ewes.
The diverse biological pathways identified from the novel genes annotation play an important role in follicle growth and ovulation of females in different sheep breeds (Figure 9). The three genes identified in Wadi sheep, BMPR1B, FBN1, and  MMP2, all play a crucial role in regulating hormone secretion (Mulsant et al., 2001;Basini et al., 2011;Zhang et al., 2011;Zhai et al., 2013). For example, BMPR1B gene can lead to an increased density of the follicle-stimulating hormone (FSH) and luteinizing hormone (LH) receptors with a concurrent reduction in apoptosis to increase the ovulation rate of ewes (Regan et al., 2015;Hu et al., 2016). As the main component of microfibrils in the extracellular matrix, the gene FBN1 regulates cumulus cell apoptosis by reducing the expression level of BMP15 involved in estrogen signaling in porcine ovaries (Zhai et al., 2013). The MMP2 gene plays a key role in ovulation and follicle atresia by regulating FSH and insulin like growth factor 1 (IGF1) (Knapp and Sun, 2017). In Hu sheep, the three genes GRIA2, SMAD1, and CTNNB1 are related to estrogen response element (Chang et al., 2013;Kumar et al., 2016;Vastagh et al., 2016). For example, the gene GRIA2 has been shown to participate in the glutamatergic pathway that regulates gonadotropin-releasing hormone (GnRH), a known prerequisite of the subsequent hormonal cascade inducing the ovulation in mice (Vastagh et al., 2016). The gene SMAD1 encodes an intracellular BMP signaling molecule, which is involved in mediating ovulation rate of ewes . The CTNNB1 gene enhances FSH and LH actions in follicles by stimulating WNT/CTNNB1 pathway and G protein-coupled gonadotropin receptors in female (Fan et al., 2010). In Icelandic sheep, the gene NCOA1 can alter the expression of multiple key genes PBP, AIB3, and FGFR2, which are important for aberrant labyrinth morphogenesis of the placenta and embryonic lethality Huang et al., 2011). In Finnsheep, the five candidate genes INHBB, NF1, FLT1, PTGS2, and PLCB3 played important roles in the development of folliculogenesis and LH signaling (Ding et al., 2006;Tal et al., 2014;De Cesaro et al., 2015;Ben Sassi et al., 2016;Cadoret et al., 2017). For example, the INHBB gene encodes an inhibitor of apoptosis, which regulates porcine ovarian follicular atresia (Terenina et al., 2017). The coding region of gene NF1 presents non-CpG methylation in the murine oocyte, which plays a critical role in mammalian development (Haines et al., 2001). The FLT1 gene has an important role in the activity of vascular endothelial growth factor that linked to folliculogenesis (Celik-Ozenci et al., 2003). The PTGS2 gene plays a critical role in the ovulation by stimulating LH signaling in zebrafish (Tang et al., 2017). The PLCB3 gene is highly expressed in bovine cells of the ovulatory-sized follicles, with the role of activating LH/LHR signaling (Castilho et al., 2014). In Romanov sheep, the gene ESR2 activates ovulation and regulates preovulatory follicle maturation through regulating estrogen response element (Laliotis et al., 2017;Rumi et al., 2017). In Texel sheep, the six candidate genes ESR1, GHR, ETS1, MMP15, FLI1, and SPP1 are relevant to estrogen and follicular growth (Putnova et al., 2001;Bachelot et al., 2002;Munoz et al., 2007;Xiao et al., 2009;Hatzirodos et al., 2015;Ogiwara and Takahashi, 2017). As a key gene affecting estrogen biosynthesis, ESR1 gene functions similarly to ESR2, and is critical for follicular growth and successful ovulation in ewes (Foroughinia et al., 2017). The GHR gene plays a role in follicular growth through stimulating IGF1 in mice (Bachelot et al., 2002). The ETS1 gene was linked to the regulator of protein signaling protein-2 (RGS2) involved in the ovulation in bovine (Sayasith et al., 2014). As a proteolytic enzyme gene, the MMP15 gene has been shown to mediate LH and its receptor in the preovulatory follicles of teleost medaka (Ogiwara and Takahashi, 2017). The FLI1 gene encodes a critical transcription factor, which regulates gene ETS1 (Vo et al., 2017). The SPP1 gene accounts for establishing and maintaining cellular interactions between steroidogenic and non-steroidogenic cells during the development of corpus luteum (Poole et al., 2013). In addition, the GO categories as well as protein-protein network and expression analysis showed that these genes played an essential role in follicle growth and ovulation of ewes. However, further expression analyses of these genes in each breed are necessary in future study. Taken together, the apparent difference for the litter size among the breeds might be explained by diverse regulation mechanisms.
Also, we calculated genetic differentiation among populations using the global F ST , PCA, and NJ tree methods to obtain a refined picture of population genetic relationships. The result showed that the genetic groups were consistent with the geographic origins of the breeds. The different genetic mechanisms associated with physiological processes for the litter size among sheep breeds could be related to the various environments in different geographic regions.
We noticed that previous studies had identified several genes of major effect such as BMPR1B, BMP15, and GDF9 for the prolificacy in ewes (Table 1). Different from early investigations, we detected a set of novel genes for the litter size in ewes. The main reason could be that most of early studies are based on genome-wide selection tests between prolific and nonprolific breeds using a lower density of SNPs. Instead, here we implemented GWAS within specific sheep breeds of high or low prolificacy using a high density SNP BeadChip array, which should lead to more reliable associations. In addition, the difference in threshold value used to define the 'case' and 'control' groups for each breed was also another potentially influential factor. When we implemented the GWAS using a two-step approach via the general linear model and genomewide efficient mixed-model analysis (GEMMA), we did not find interesting candidate genes associated with reproduction across the six breeds (see Supplementary Material for further details). The fact that no candidate genes associated with reproduction were detected could be due to that the power to detect such associations will be weak when treating the trait of interest as quantitative given the small sample size. Also, these populations could have been subjected to selection on litter size through environmental variables such as climate and diet. However, we did not obtain data for local environmental variables in our data. Thus, environmental variables as well as the age of reproduction for the ewes were not taken into account in the model of the GWAS, which would be essential for future study.

CONCLUSION
We revealed a set of novel functional genes for the litter size in different sheep breeds across the world. Our results suggested differentially genetic regulation mechanisms for the functional genes in the reproduction of sheep. The significant SNPs and genes identified here are useful for future molecular-based breeding for a higher fertility. Also, our results provide important insights into the regulation of reproduction in sheep and other mammals.

SUPPLEMENTARY MATERIAL
The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fgene. 2018.00118/full#supplementary-material FIGURE S1 | Parity effect for litter size in the six breeds. X-axis is labeled as the number of parity and Y-axis represents litter size. Pairwise statistical comparisons between means of litter size in parity's clades were performed using Student's t-test. * p < 0.05; * * p < 0.01, and * * * p < 0.001. TABLE S1 | Parity effect for litter size and pairwise statistical comparisons between means of litter size in parity's clades in the six breeds.