Uncovering the candidate genes related to sheep body weight using multi-trait genome-wide association analysis

In sheep, body weight is an economically important trait. This study sought to map genetic loci related to weaning weight and yearling weight. To this end, a single-trait and multi-trait genome-wide association study (GWAS) was performed using a high-density 600 K single nucleotide polymorphism (SNP) chip. The results showed that 43 and 56 SNPs were significantly associated with weaning weight and yearling weight, respectively. A region associated with both weaning and yearling traits (OARX: 6.74–7.04 Mb) was identified, suggesting that the same genes could play a role in regulating both these traits. This region was found to contain three genes (TBL1X, SHROOM2 and GPR143). The most significant SNP was Affx-281066395, located at 6.94 Mb (p = 1.70 × 10−17), corresponding to the SHROOM2 gene. We also identified 93 novel SNPs elated to sheep weight using multi-trait GWAS analysis. A new genomic region (OAR10: 76.04–77.23 Mb) with 22 significant SNPs were discovered. Combining transcriptomic data from multiple tissues and genomic data in sheep, we found the HINT1, ASB11 and GPR143 genes may involve in sheep body weight. So, multi-omic anlaysis is a valuable strategy identifying candidate genes related to body weight.


Introduction
GWAS have been widely used in gene mapping research to understand the genetic mechanisms governing economically important traits in sheep, including weight, reproductive fitness, horn number, ear type, hair color, and disease resistance (1)(2)(3)(4)(5)(6).The first study to use this approach in sheep focused on horn shape and revealed that the RXFP2 gene is related to horn type in sheep (7).
Body weight is the most important index of growth and development in farmed sheep.Studies have noted heritabilities of 0.30-0.35for weaning weight and 0.40-0.45for yearling weight, indicating that the heritability of these traits is moderately (8).Based on GWAS, Gholizadeh et al. (9) who used the Illumina ovine SNP50 BeadChip in 96 Baluchi sheep discovered the candidate genes TRBP and TRAMIL1 for birth weight; APIP and DAAM1 for weaning weight; PHF15, PRSS12, and MAN1A1 for 6-month weight; and SYNE1, WAPAL, and DAAM1 for yearling weight.Al-Mamun et al. (10) used data from 1781 Australian Merino sheep genotyped with the Illumina Ovine SNP 50 K BeadChip and found that the genes LAP3, NCAPG, and LCORL are related to body weight traits in sheep.Similarly, using GWAS, Ghasemi et al. (11) demonstrated that RAB6B and GIGYF2 are candidate genes for birth weight using Illumina Ovine SNP50 Bead Chip from 132 Lori-Bakhtiari sheep, and Lu et al. (1) performed a genome-wide associations of birth, weaning, yearling, and adult weights of 460 fine-wool sheep were determined using resequencing technology.The results showed that 113 single nucleotide polymorphisms (SNPs) reached the genome-wide significance levels for the four body weight traits and 30 genes were annotated effectively, including AADACL3, VGF, NPC1, and SERPINA12.When traits are highly correlated with each other, multi-trait analysis is more advantageous than single-trait analysis.Because multi-trait GWAS involves only one statistical test and considers both the intra-and inter-trait variance components of multiple traits (12), it can reduce the errors caused by multiple testing (13).Hence, it improves the accuracy (14,15) and precision of parameter estimation (16).Multi-trait GWAS also increases the statistical power by exploiting the genetic correlation between different traits.
In this study, we used the Affymetrix Ovis600K genotyping bead chip to identify candidate genes related to weaning weight and yearling weight using multi-trait and single-trait GWAS.Our findings provide a reference for understanding the inheritance mechanism of weight traits in sheep.

Ethics statement
All experimental procedures were in accordance with animal welfare legislation and were approved by the Experimental Animal Care and Use Committee of Xinjiang Academy of Agricultural and Reclamation Sciences (Shihezi, China, Ethics committee approval number: XJNKKXY-2020-34; December 30, 2020).

Sample collection and genotyping
A total of 218 ewes (a composite line bred from Australian Suffolk sheep, Chinese Hu sheep, and Chinese Kazakh sheep) were collected from the Xinjiang Academy of Agricultural and Reclamation Science.We recorded their weights at two stages: weaning and yearling.All sheep were fasted for 12 h before their weaning weights and first yearling weights were measured.
Single nucleotide polymorphisms (SNPs) were examined using the Affymetrix Ovis600K genotyping bead chip, which contains 604,721 SNPs.Plink 1.9 software (17) was used to control the quality of genotype data; (1) minor allele frequency (MAF) ≥5% SNPs, (2) SNP call rate ≥ 95%, (3) individual call rate ≥ 90%, and (4) SNPs mapped to X chromosomes and autosomes were evaluated.After the quality control was performed on the raw genotypes, a total of 218 animals and 479,470 SNPs were obtained.In this study, Beagle software was used to fill in the missing genotypes.The Haploview software (18) was used to analyze the linkage disequilibrium (LD).The haplotype block recognition algorithm proposed by Gabriel et al. (19), their criterion is that the one-sided upper 95% confidence bound on D′ is >0.98 and the lower bound is >0.70.

Estimation of genetic parameters of weaning weight and yearling weight
GCTA was developed as a method for estimation the variance explained by all the SNPs on a chromosome or on the whole genome for a complex trait (20).In this study, −-reml and −-reml-bivar were set to calculate heritability and genetic correlation, respectively.Using SAS9.4 (21), descriptive statistics were performed for these two traits, including mean, standard deviations, coefficients of variation.

Single-trait and multi-trait GWAS
In this study, a mixed linear model was used to analyze the association between SNPs and body weight traits, including weaning weight and yearling weight.The model used was as follows: where Y is the phenotype value vector, b is the fixed effect vector (year effect), p is the top three eigenvectors of principal component analysis (PCA), s is the SNP effect vector and SNP genotypes coded as 0, 1 and 2 for aa, Aa and AA, a is the individual residual polygene effect (random effect), c is the birth weight vector (covariant), e is the random residual effect vector, and X, K, Z, Q are the design matrices of b, p, s, a, and c, respectively.
Multivariate mixed linear models (mvLMMs) were used to conduct joint association analysis between SNPs and two traits due to strong genetic correlation between weaning weight and yearling weight.
GEMMA software (16) was used to perform single-trait and bi-trait GWAS.The Wald test was used to evaluate the significance of each genetic marker.In order to reduce false positives, the Bonferroni correction method was applied correction, and the threshold value was p = 0.05/479470 = 1.04 × 10 −7 (single-trait) and p = 0.05/479470/2 = 5.70 × 10 −8 (bi-trait).

Function annotations
In this study, we first downloaded the sheep Ovis aries (Oar_v3.1)gene annotation information 1 from the Ensembl database (22), then used the intersect parameter of BEDTools 2.1.2software to annotatin the significant SNPs and identifying gene within 200 kb upstream and downsteam of these SNPs (23).We collected quantitative trait loci (QTLs) related to sheep weight traits from the animal QTL database.

Expression analysis
We collected RNA-Seq data from the European Bioinformatics Institute (EBI) for 17 tissues, including muscle long dorsal, muscle biceps, spleen, lung, pituitary gland, brain, hypothallamus, mammary gland, kidney cortex, kidney medulla, heart, rectum, abomasum, uterus, colon, rumen, ovary tissues of juvenile and adult sheep (BioProject number: PRJEB6169).There are more than 3 samples each tissues of juvenile or adult sheep.After using Trimmomatic to remove adapter and low-quality sequences (24), all RNA-Seq datasets were processed using FastQC v0.11.3 3 and quality inspection was conducted.Reads were aligned to the sheep reference genome (OARv3.1)using STAR v.2.7.6a (25) and counted with the RNA-Seq by Expectation Maximization (RSEM) software v. 1.3.3 (26).The DESeq2 package in R was used to DEGs with significant differences between different samples (27).The tissue specificity index (τ) of the candidate genes was calculated, which is defined as where x i is the expression profile component normalized by the maximal component value and N is the number of tissues (28).

Descriptive statistics and genetic parameters
In this study, the descriptive statistics of weaning weight and yearling weight traits were analyzed including mean, standard deviations, coefficients of variation (Table 1).Based on SNP genotype, we used the GCTA software calculate genetic parameter of weaning and yearling weight traits.We found the heritability of weaning weight and yearling weight was 0.54 and 0.44, respectively.There was a significant positive genetic correlation between weaning weight and yearling weight with a correlation coefficient of 0.73 (p < 0.05).
In addition, we also found two SNPs that were only related to yearling weight, i.e., Affx-281233109 and Affx-28153321.

Multi-trait GWAS results
Since there are significant highly genetic correlation between weaning weight and yearling weight, the multuvariate model is more accurate.We used bi-traits GWAS to identify novel significant SNPs.In this study, 138 SNPs related to sheep weight were identified, and the Q-Q and Manhattan plots are displayed in Figure 1.In contrast to single-trait GWAS, multi-trait GWAS identified 93 novel SNPs (Table 3), which were mainly located on five chromosomes, including OAR3, 8, 10, 16, and 18.
Using multi-trait GWAS, we identified a new genomic region at 76.04-77.23Mb on chromosome 10 containing 22 significant SNP loci (Figure 3).These loci were missed in the single-trait GWAS analysis.However, the observed odds ratios of the two traits showed sufficient deviations, and the statistical significance could only be identified after considering the joint statistics of the two phenotypes (Figure 4).These results showed that multi-trait GWAS can increase statistical power and complement the results of single-trait GWAS.Of those significant SNPs, Affx-280892681, located at 76.34 Mb on the PCCA (Propionyl-CoA carboxylase subunit alpha) gene, was the most significant loci (p = 1.43 × 10 −212 ).But all near loci with it is not significant, we speculate that it is a false positive loci related to body weight.In this region, there are 12 significant SNP loci at 76.40-76.90Mb were strongly linkage disequilibrium (r 2 = 0.89).The Affx-122835917 SNP, located near the HINT1 gene, was associated with BW (p = 9.22 × 10 −10 ).

Expression profiles of candidate genes across multiple tissues
To validate biological function of these candidate genes in this study, we explored RNA-Seq data of multi-tissues of juvenile and adult sheep.We found there were 13 and 6 genes expressed in all 17 tissues of juvenile and adult stages, respectively.Of these genes, ARID1B (AT-Rich Interaction Domain 1B), DNM1L (Dynamin 1 Like), FANCM (Fanconi anemia complementation group M), HINT1 (Histidine Triad Nucleotide Binding Protein 1), and ZCCHC17 (Zinc Finger CCHC-Type Containing 17) were expressed in all development stages, and the expression of HINT1 gene was highest.According STRING Interaction Network database, the fatty acid-binding protein family gene, including FABP3, FABP5, FABP7 proteins, were interacted with HINT1.So we deem the HINT1 gene might play important role involving in body weight.
The expression patterns each gene varied in different tissues of different development stages.We calculated index of tissue specificity  By difference expression analysis, we found ASB11 gene was significantly higher expressed in muscle biceps tissue of juvenile sheep (p = 6.69 × 10 −4 ), and GPR143 gene was significantly higher expressed in hypothalamus tissue of adult sheep (p = 1.64 × 10 −4 ).

Discussion
Multi-trait GWAS is usually used to detect QTLs associated with multiple traits, when there is a covariance between traits.The higher the genetic and phenotypic correlation between traits, the higher is the statistical power of multi-trait GWAS.In this study, our results showed that the genetic correlation between weaning weight and yearling weight was 0.73 and Singh et al. (29) found the genetic correlation between the two traits in marwari sheep was 0.56.These findings indicate that there is a positive genetic correlation between weaning weight and yearling weight in sheep.To improve the power of GWAS results, we conducted bi-trait GWAS for two correlated traits.In comparison single trait GWAS, we yielded 93 novel SNPs related to these traits.Using the same strategy, Zhou et al. (30) conducted multitrait GWASs for chest, abdominal, and waist circumferences in Duroc Pig populations and detected four additional SNPs.Yan et al. (31) identified 16 novel loci associated with hematological traits in the White Duroc × Erhualian F2 resource population; Bolormaa et al. (32) discovered that multi-trait analysis improves the detection of polymorphic QTLs for 32 traits in beef cattle.Together with our findings, these results show that multi-trait GWAS can complement single-trait GWAS results and thus increase the statistical power of GWAS, when there is genetic correlation between different traits.
Very few studies have examined SNPs or QTLs related to weaning weight and yearling weight in sheep.According to the SheepQTLdb database (as of April 25, 2023), there are 11 QTLs and 4 QTLs related to weaning weight and yearling weight in sheep, respectively, based on QTL mapping or GWAS.These QTLs are distributed on OAR2-OAR4, OAR7, OAR9, OAR15, OAR19, and OAR24 (1, 9, 10, 33).Our study expanded this list considerably, identifying 148 SNPs that are significantly correlated with weaning weight and yearling weight through a combination of single-and multi-trait GWAS.However, we found that the candidate genetic markers of body weight identified in this study were less consistent than those reported from previous GWAS.This difference may be due to differences in the genetic background and breeds of sheep or their size Association mapping results for SNPs significantly related to weaning weight and yearling weight located at 6.74-7.04Mb on chromosome X.YW: yearling weight; WW: weaning weight.and population structure.Differences in the detection platforms or algorithms used for analysis and random or technical errors in some analyses may have also contributed to these differences.Nevertheless, this suggests that many important genetic markers and candidate genes of weight traits in the sheep genome remain to be discovered.In this study, we performed a genome-wide association study of body weights of 218 ewes.This is a small study.Although, many researchers insist on a large sample, and "the larger the sample, the more reliable is the result" is their dictum.Multiple problems have been cited with the studies on a small sample (34,35).Whether based on a small sample or a large sample, no single study is considered conclusive.A large number of small studies can be done easily in different condition.Anderson and Vingrys (36) argued that small samples may be enough to show the presence of an effect but not for estimating the effect size.If most of small studies point toward the same direction, a possibly robust conclusion can be drawn through a meta-analysis.Animal experiments can be done in highly controlled conditions to nearly eliminate all the confounders, thus it may be used  small sample to establish the cause-effect relationship (37).When more confounders were under control, sufficient power is achieved with a smaller sample.This study was only few confounders, such as birth year of ewes.So far, there are many examples exist of useful studies on small samples.For example, significant associations with body weight, growth-related and body conformation traits were identified by GWAS in 96 Baluchi sheep (38), 69 Egyptian Barki sheep (39), 150 Dazu Black goats (40), respectively.Furthermore, we also deem that estimated effects, confidence intervals and exact p values should be considered when interpreting a study's results, but only sample size (41), and some exact methods of statistical analysis may help in reaching more valid conclusions for small sample size.

SNP
In contrast to long-held notions whereby single genes were believed to encode single functions, most genes are now recognized to have multiple qualitatively distinct functions.This phenomenon is termed pleiotropy (42).Pleiotropy is defined as a condition in which a single locus affects two or more distinct phenotypic traits (43, 44).It is very common phenomenon in nature for pleiotropism.The present study also found an interesting phenomenon where both the GPR143 and SHROOM2 genes were significantly associated with weaning weight and yearling weight.Hence, these two genes appear to be pleiotropic.Lu et al. (1) also revealed that GPR143 and SHROOM2 are associated with birth weight, weaning weight, yearling weight, and adult weight in sheep, which is consistent with the results of the present study.Zhang et al. and Jahejo et al. also found SHROOM2 gene is closely associated with tibial cartilage dysplasia (45,46).It is of great significance to make a profound study of the pleiotropy so that it can reveal common genetic mechanisms between closely related phenotypes, as well as the molecular functions of genes.So, further functional data are required for the validation of these findings.
Due to high conservation across species, the identified genes related to body weight traits in humans and other animals may also be important for sheep growth and development.In this study, we found that some of these genes, including ARID1B, ASB11, DNM1L, HNF4A (Hepatocyte Nuclear Factor 4 Alpha), MKX (Mohawk Homeobox), PKIB (CAMP-Dependent Protein Kinase Inhibitor Beta), TBL1X and TMTC4 (Transmembrane O-Mannosyltransferase Targeting Cadherins 4) may be related to sheep body weight traits (Table 4).Liu et al. (47) found that ARID1B mutations are strongly associated with growth and weight traits in humans.ASB11 is a major regulator of human embryonic and adult regenerative myogenesis (48).Increased expression levels of the DNM1L proteins may correlate with the degree of weight gain, and is closely related to the development of obesity (49)(50)(51)(52).HNF4A mutations are associated with a considerable increase in birth weight and macrosomia, and the gene acts in the intestine and kidney to promote white adipose tissue energy storage (53)(54)(55)(56).MKX is a potential regulator of brown adipose tissue development associated with obesity-related metabolic dysfunction in children (57).PKIB plays a central role in human obesity and metabolism (58,59).TBL1X mainly plays an important role in maintaining precursor adipocytes in an undifferentiated state by inhibiting adipogenesis (60).Ma et al. (61) showed that TMTC4 is significantly related to the formation of human skeletal muscle.From the above elementary description of the candidate genes, we find some of them are more or less associated with muscle development and body weight in different species, which allows us to predict the genes might take part in similar processes in sheep genome.Subsequent studies, such as functional verification, will be done in the candidate genes, which could ultimately reveal the causal mutations underlying body weigh traits in sheep.Scatter plot comparing all Beta/SE values for the two traits across the genome.Novel loci detected by muti-trait GWAS are marked at the edges of the plot.

Conclusion
In this study, we identified 148 significant SNPs related to weaning weight or yearling weight based on single-trait and multi-trait GWAS.Two important chromosomal regions were discovered, including the 6.74-7.04Mb interval on chromosome X and the 76.04-77.23Mb interval on OAR10.Our results suggest that multi-trait GWAS is a powerful statistical tool for identifying novel loci missed by conventional single-trait GWAS.Incorporated transcript expression data of candidate genes, HINT1, ASB11 and GPR143 genes may involve in sheep body weight.This study show multi-omic anlaysis is a valuable strategy identifying candidate genes.Moreover, they provide key insights into the genetic determinants of weight traits in sheep.

FIGURE 3
FIGURE 3 Linkage disequilibrium (LD) blocks of the novel loci detected using multi-trait GWAS.Manhattan plot (top) and LD plot (bottom) of the 76.04-77.23Mb genomic region on chromosome 10.

TABLE 1
Descriptive statistics of body weight.

TABLE 2
Significant loci and genes identified using single-trait GWAS.

TABLE 3
Novel significant loci identified using multi-trait GWAS.

TABLE 4
Basic functions of the identified genes.