ORIGINAL RESEARCH article

Front. Genet., 19 July 2021

Sec. Livestock Genomics

Volume 12 - 2021 | https://doi.org/10.3389/fgene.2021.659507

Genome Divergence and Dynamics in the Thin-Tailed Desert Sheep From Sudan

  • 1. Institute of Animal Science, Chinese Academy of Agricultural Sciences, Beijing, China

  • 2. Dry Land Research Centre and Animal Production, Agricultural Research Corporation, Khartoum, Sudan

  • 3. Small Ruminant Genomics, International Center for Agricultural Research in the Dry Areas (ICARDA), Addis Ababa, Ethiopia

  • 4. Arid Land Research Centre, Tottori University, Tottori, Japan

  • 5. International Center for Agricultural Research in the Dry Areas (ICARDA), Amman, Jordan

  • 6. CAAS-ILRI Joint Laboratory on Livestock and Forage Genetic Resources, Institute of Animal Science, Chinese Academy of Agricultural Sciences, Beijing, China

  • 7. Livestock Genetics Program, International Livestock Research Institute, Nairobi, Kenya

  • 8. Animal and Veterinary Sciences, Scotland Rural College and Centre for Tropical Livestock Genetics and Health (CTLGH), Roslin Institute, Midlothian, United Kingdom

Abstract

With climate change bound to affect food and feed production, emphasis will shift to resilient and adapted indigenous livestock to sustain animal production. However, indigenous livestock comprise several varieties, strains and ecotypes whose genomes are poorly characterized. Here, we investigated genomic variation in an African thin-tailed Desert Sheep sampled in Sudan, using 600K genotype data generated from 92 individuals representing five ecotypes. We included data from 18 fat-tailed and 45 thin-tailed sheep from China, to investigate shared ancestry and perform comparative genomic analysis. We observed a clear genomic differentiation between the African thin-tailed Desert Sheep and the Chinese thin-tailed and fat-tailed sheep, suggesting a broad genetic structure between the fat-tailed and thin-tailed sheep in general, and that at least two autosomal gene pools comprise the genome profile of the thin-tailed sheep. Further analysis detected two distinct genetic clusters in both the African thin-tailed Desert Sheep and the Chinese thin-tailed sheep, suggesting a fine-scale and complex genome architecture in thin-tailed sheep. Selection signature analysis suggested differences in adaptation, production, reproduction and morphology likely underly the fine-scale genetic structure in the African thin-tailed Desert Sheep. This may need to be considered in designing breeding programs and genome-wide association studies.

Introduction

A common research thread that links population and quantitative genomics is the elucidation of patterns and processes underlying population genetic structure. Whether such structure is stable in time and space is increasingly addressed for its utility in determining how many genetically distinct populations exist and their inter-relationships (Nielsen et al., 1999; Waples and Gaggiotti, 2006). Insights from such investigations inform management decisions that define conservation units and the design of genetic monitoring and breeding programs (Palsbøll et al., 2007; Schwartz et al., 2007). The tropics and sub-tropics are home to a large reservoir of indigenous livestock with a high degree of adaptive resilience, and which support agricultural and non-agricultural industries with minimal anthropogenic interventions (FAO, 2007). Indigenous livestock can therefore provide a foundation to sustain production under increasing challenges resulting from global warming and rising human demand for livestock products. It is also likely that future livestock production will come from marginal areas where arable agriculture is at high risk of failure and thus particular attention would have to be given to the uniqueness of genetic features, as it is difficult to predict the future importance of traits and alleles (Taberlet et al., 2008).

Domestic sheep (Ovis aries) are central to national economies as a source of cash, meat, milk, fiber, etc., and to traditional societies as repositories of socio-cultural values. Sheep are also essential components of diverse production systems due partly to their versatility to adapt to local biophysical and production environments. Domestic sheep comprise three broad types: thin-tailed, fat-tailed, and fat-rumped sheep (Porter, 2020). Thin-tailed sheep are the most ancient and in the African continent, two types are recognized: the long-legged (Sahelian) and the tropical Dwarf (Djallonké) sheep. The Sahelian is confined to the hot arid marginal environments in eastern, western and northern Africa, while the Djallonke is well adapted to sub-humid and humid tropics of western and central Africa. The analysis of mitochondrial genomes has shown that the Sahelian and Djallonke comprise separate maternal ancestries (Brahi et al., 2015).

The long-legged thin-tailed sheep found in Sudan represents the complexity that is typical of most indigenous livestock. They are subdivided into Desert, Nilotic, Arid upland, Arid equatorial, and West African populations, including their inter-crosses, following their eco-geographic distribution (Abualazayium, 2004). Within the Sudanese thin-tailed Desert Sheep, a long-legged Sahelian thin-tailed sheep, at least eight ecotypes, Hammari, Kabashi, Shanbali, Shugor, Dubasi, Watish, Al Ahamda, and Borouge, have been described (Abualazayium, 2004). Whether these ecotypes represent real underlying genetic variation remains unknown. If confirmed, they could offer a powerful genetic model to investigate drivers of divergence in indigenous livestock. Understanding fine-scale genetic structure is also important to control confounding effects of population stratification in association studies (Price et al., 2010). In this study, we applied distance- and model-based comparative genomic approaches to 600K single nucleotide polymorphism (SNP) genotype data from 121 individuals of five ecotypes of the Sudanese thin-tailed Desert Sheep, to investigate their genome architecture and dynamics. We analyzed the dataset alongside similar data from one fat-tailed and four thin-tailed breeds of sheep from China, to investigate their shared genome ancestry and for comparative genomic assessment.

Materials and Methods

Sample Collection and DNA Extraction

Whole blood was collected through jugular venipuncture from 121 animals representing five ecotypes of thin-tailed Desert Sheep in Sudan (Supplementary Tables 1, 2). Blood samples were also collected from 65 individuals of four breeds of Chinese sheep, including one fat-tailed breed (Tan Sheep) from a dry lowland environment in Ningxia Province and three thin-tailed breeds (Oula, Zeku, and Black Tibetan) from the alpine high-altitude in Qinghai Province (Supplementary Table 1); these were used for comparative genomic analysis. The DNeasy® Blood and Tissue Kit (Qiagen Inc., United States) and the phenol-chloroform method were used to extract DNA from the Sudanese and Chinese samples, respectively. The samples were genotyped with the Ovine Infinium 600K BeadChip at GeneSeek Inc. (Sudanese samples) and Compass Biotechnology (Chinese samples). Of the 606,006 SNPs present in the BeadChip, 577,401 are autosomal, 27,314 are on the X-chromosome and 1,291 remain unmapped.

Data Screening and Quality Control

The genotypes from both sets of samples were merged and the data were screened for quality with PLINK (Chang et al., 2015). Samples with more than 10% missing genotypes, SNPs with less than 90% genotype call rates, Hardy-Weinberg equilibrium (HWE) threshold of 1e-10 and minor allele frequency (MAF) < 0.01 were discarded. Using the “genome – min 0.05 – max 1” flag, the Pi-HAT statistic was calculated to assess the level of genetic relatedness between individuals and determine outliers with the objective of excluding the outliers and at least one amongst a pair of individuals that showed close genetic relationship. The value of 0.1875 which represents the half-way point between the 2nd and 3rd degree relatives was used as the cut-off threshold. These filtering thresholds left 155 samples (Table 1) and 519,711 SNPs which were used for selection signature analyses. Using PLINK, the 519,711 SNPs were subjected to linkage disequilibrium (LD) pruning using the parameters 50, 5, and 0.5 representing window size (kb), step size (kb), and r2 threshold, respectively, resulting in 390,243 SNPs that were used for population structure and phylogenetic analysis.

TABLE 1

Ecotype/BreedAbb.NHO (mean ± SD)HE (mean ± SD)Pi-HATROH (Mb) (mean ± SD)FROH (mean ± SD)F (mean ± SD)Origin
Al-AhamdaAL190.345 ± 0.0100.347 ± 0.00010.092 ± 0.0091.886 ± 0.6340.014 ± 0.0260.006 ± 0.031Sudan
BuzeeBU180.347 ± 0.0080.348 ± 0.00010.096 ± 0.0131.735 ± 0.3240.015 ± 0.0130.002 ± 0.023
HammariHA230.341 ± 0.0040.343 ± 0.00000.081 ± 0.0051.841 ± 0.5260.013 ± 0.0090.005 ± 0.011
KabashiKA170.353 ± 0.0100.357 ± 0.00010.112 ± 0.0391.748 ± 0.7180.019 ± 0.0120.011 ± 0.029
ShanbaliSH150.355 ± 0.0080.353 ± 0.00010.110 ± 0.0251.504 ± 0.2270.009 ± 0.010−0.005 ± 0.024
Thin-tail DesertSDN920.329 ± 0.0080.332 ± 0.00010.056 ± 0.0161.852 ± 0.3350.014 ± 0.0150.009 ± 0.023
Black TibetanHZ150.336 ± 0.0450.332 ± 0.00010.253 ± 0.0681.208 ± 0.1250.005 ± 0.006−0.011 ± 0.135China
OulaQOL150.346 ± 0.0200.353 ± 0.00010.096 ± 0.0311.169 ± 0.1080.001 ± 0.0010.018 ± 0.020
ZekuZK150.348 ± 0.0050.352 ± 0.00020.103 ± 0.0491.247 ± 0.1240.001 ± 0.0010.011 ± 0.016
TanTS180.346 ± 0.0190.360 ± 0.00000.083 ± 0.0341.351 ± 0.2030.009 ± 0.0110.039 ± 0.053
Chinese sheepCHN630.329 ± 0.0080.346 ± 0.00010.097 ± 0.0361.319 ± 0.1760.004 ± 0.0070.089 ± 0.072

Genetic diversity estimates in the five ecotypes of Sudanese thin-tailed Desert Sheep and the four breeds of sheep from China analyzed in this study.

N: sample size; HE: expected heterozygosity; HO: observed heterozygosity; Pi-HAT: average relatedness; F: inbreeding coefficient; SD: standard deviation; Values in bold represent the average values for each statistic in across each sheep group from Sudan and China.

Genetic Diversity, Relationship, and Structure

The “het” and “ibc” options in PLINK were used to calculate the observed (HO) and expected (HE) heterozygosity, inbreeding coefficient F and Pi-HAT statistics. The “detectRUNS” package (Biscarini et al., 2019) was used to scan the genomes for runs of homozygosity (ROH) using the consecutive method (Marras et al., 2015). The parameters used to detect ROH were: (i) minimum number of SNPs in a sliding window = 50, (ii) minimum ROH length = 1 Mb, (iii) minimum number of consecutive SNPs for each ROH = 50, (iv) number of heterozygous SNPs per window = 1, (v) missing SNP calls per window = 5, (vi) minimum SNP density = 1 SNP/100 kb, and (vii) maximum gap between consecutive SNPs = 1 Mb. The ROH coefficient depicting genome-wide inbreeding (FROH) was computed as the ratio of the total length of ROH to the length of autosomes (2.45 Gb) (McQuillan et al., 2008).

To explore demographic dynamics in the African thin-tailed Desert Sheep and in the thin-tailed and fat-tailed sheep from China, the trends in LD over genomic distances were examined by calculating the correlation coefficient (r2) between alleles at two SNP loci using the “indep” option in PLINK. Following Sved (1971), the effective population size (NE) was estimated with the equation NEt = (1/4c) (1/r2 - 1), where NEt represents the effective population size t generations ago, t = 1/2c, r2 is the LD between pairwise SNPs and c is the genetic distance in Morgan between a pair of SNPs (Tortereau et al., 2012).

Weir and Cockerham (1984)FST statistic was calculated to determine the extent of genetic differentiation using Arlequin v.3.5.2 (Excoffier and Lischer, 2010). The significance of the pairwise FST values was determined following 1,000 permutation replications of the dataset.

To investigate genetic structure, we performed neighbour-joining (NJ) phylogeny, principal component analysis (PCA) and ADMIXTURE modeling. The NJ tree was generated to visualize relationships using pairwise FST genetic distances. MEGA7 (Kumar et al., 2016) was used to construct the NJ tree with the five ecotypes of thin-tailed Desert Sheep and the four sheep breeds from China anchoring at the nodes. PCA was performed using the “pca” option in PLINK and the first two PCs were visualized using GENESIS (Buchmann and Hazelhurst, 2014). Genomic ancestry and admixture were investigated using the unsupervised clustering algorithm implemented in the ADMIXTURE toolkit v1.3 (Alexander et al., 2009). The patterns of population structure were explored by varying the number of assumed ancestral clusters between 2 ≤ K ≤ 8. Five iterations were performed for each K, summarized using CLUMPP (Jakobsson and Rosenberg, 2007) and visualized with GENESIS.

Genome-Wide Scans for Signatures of Divergent Selection

The NJ tree, PCA and ADMIXTURE revealed evidence of broad- and fine-scale genetic structures in the datasets. To detect genomic regions showing divergence between the observed genetic structures, we analyzed the dataset for signatures of divergent selection.

Using the detectRUNS package, we implemented the frequency-based threshold approach to define genome-wide ROH islands in each genetic cluster that was revealed by the NJ tree, PCA and ADMIXTURE. This approach set a percentage of animals within a genetic cluster or a population having a SNP in an ROH region as the threshold. The threshold used in our analysis to select the genomic regions with a high frequency of ROH islands was 50%. Private ROH islands were determined by filtering out homozygous variants in ROHs in a specific genetic cluster, but which could not be found in ROHs of the other genetic clusters. This allowed the detection of either whole or segments of ROHs that were either shared or were private to a genetic cluster. The frequency of ROHs were plotted against their physical genomic positions and visualized with a Manhattan plot.

Signatures of selection were investigated using FST (Weir and Cockerham, 1984). This approach analyses allele frequency differentiation between populations and detects almost complete or long-term selection signatures. The FST algorithm was implemented with HIERFSTAT (Goudet, 2005) using a window size of 200 kb and a sliding step size of 60 kb. The window and slide-step sizes were inferred from LD decay patterns. The pairwise FST values for each SNP in each window between the genetic clusters tested were estimated as:

where p1, p2 and q1, q2 are the frequencies of alleles “A” and “a” in the first and second genetic clusters, respectively, and pr and qr are the frequencies of alleles “A” and “a,” respectively, across the tested genetic clusters (Zhi et al., 2018). The FST values were standardized into Z-scores as follows:

where μFST is the overall average value of FST and σFST is the standard deviation (SD) derived from all the windows tested for a given comparison. Supplementary Figure 1 shows the distribution of the ZFST values. We set the top 0.1% positive values of ZFST as the threshold to identify the candidate regions under selection.

We also investigated signatures of selection using XP-EHH (Sabeti et al., 2007). It assesses haplotype differences between two test populations, and it can detect a wider range of selection scenarios, including selection on standing variation or incomplete sweeps or on-going selection (Sabeti et al., 2007; Innan and Kim, 2008). The test uses the integrated EHH (iHH) of a core SNP in two test populations, rather than two alleles in one test population. The unstandardized XP-EHH statistic is calculated as:

where iHHA and iHHB are the integrated EHH of a given core SNP in the A and B test populations, respectively. A large and positive value of XP-EHH at a locus suggests selection in reference population while a negative value suggests selection in the alternate one. We used the software developed by Pickrell and Pritchard (2012) to estimate the unstandardized XP-EHH statistics for all SNPs in each comparison. These were then standardized using their means and variances, and the p-values were estimated from the standard normal distribution (Supplementary Figure 2). For each comparison, we determined the regions under selection based on the threshold P value < 0.001.

To investigate the functional significance of selection signatures, the candidate regions revealed by ROH, FST, and XP-EHH were identified using the intersectBed function of BedTools software (Quinlan and Hall, 2010). The O. aries reference genome assembly (OAR_v3.0) was used to annotate the candidate regions using the BioMart Tool in Ensembl and NCBI databases. The functional annotation tool in DAVID v6.8 (Huang et al., 2009) was used to perform enrichment and gene ontology (GO) analyses using O. aries as the background. The functions of genes overlapping the candidate regions were also determined from literature.

Results

Genetic Diversity and Differentiation

The average values of HO and HE for the Sudanese thin-tailed Desert Sheep and the Chinese breeds exceeded 0.300 (Table 1). Among the Sudan thin-tailed Desert sheep, Hammari (HA) had the lowest values of HO (0.341 ± 0.004) and HE (0.343 ± 0.000) while the highest values were observed in Shanbali (SH) (HO = 0.355 ± 0.008) and Kabashi (KA) (HE = 0.357 ± 0.000). Among the Chinese breeds, the Black Tibetan (HZ) had the lowest values of HO (0.336 ± 0.045) and HE (0.332 ± 0.0001), whereas the highest values were present in Zeku (ZK) (HO = 0.348 ± 0.005) and Tan Sheep (HE = 0.360 ± 0.000). The average Pi-HAT value for the Sudan thin-tailed Desert Sheep was 0.056 ± 0.016, ranging between 0.081 and 0.112 (Table 1). This supported a low level of genetic relatedness between the individuals, which were confirmed by the low values of inbreeding (average FROH = 0.014 and F = 0.009). This validated our sampling strategy which targeted only mature unrelated animals. The Sudanese thin-tailed Desert Sheep had on average longer lengths of ROH, higher FROH but lower F compared to the Chinese breeds.

We investigated the trends in LD decay over genomic distances (Figure 1A) and in NE over generation time (Figure 1B). Across all genomic distance intervals, the Sudanese thin-tailed Desert Sheep showed lower LD compared to the Chinese breeds. At 1,000 generations ago, the Sudanese thin-tailed Desert Sheep had lower NE compared to the Chinese sheep. However, the thin-tailed Desert Sheep showed an expansion in NE up to around 270 generations ago, after which it started to contract drastically up to the present time. The NE for the Chinese breeds remained stable up to around 500 generations ago, upon which it also started to contract but at a gradual pace to the present time.

FIGURE 1

Genetic differentiation was assessed by calculating pairwise FST values between ecotypes and breeds (Supplementary Table 3). The overall highest levels of genetic differentiation were between the fat-tailed Tan Sheep against all the other ecotypes and breeds. Among the Sudanese thin-tailed Desert Sheep, AL-Ahamda (AL) showed the highest FST values with the other ecotypes while the lowest values were between Buzee (BU) with both HA and SH. Among the Chinese breeds, the lowest values of FST occurred between the thin-tailed Oula (QOL) and Zeku (ZK).

To visualize the extent of genetic differentiation, we generated an NJ phylogeny with the ecotypes and breeds at the nodes using the pairwise FST genetic distance matrix (Figure 2A). The tree revealed a clear separation of the fat-tailed Tan Sheep from the Chinese thin-tailed sheep and the Sudanese thin-tailed Desert Sheep. The latter clustered very close together, suggesting lower genetic differentiation. The HZ was separated from both ZK and QOL, suggesting a genetic divergence within the Chinese thin-tailed sheep. ZK and QOL had the lowest value of FST between them. An NJ phylogeny for the five ecotypes of Sudanese thin-tailed Desert Sheep was generated to obtain a better picture of the extent of their genetic relationships (Figure 2B). It showed that AL was genetically differentiated from BU, HA, KA, and SH.

FIGURE 2

We used individuals as the taxonomic units and performed a PCA to ascertain the divergence revealed by the NJ tree. In the first instance, all the study individuals were included in the PCA, and PC1 and PC2 explained 26.50 and 8.09% of the total genetic variation, respectively (Figure 2C). PC1 separated the Chinese breeds into two groups showing a marked concordance with their tail type and geographic origin. One group included the Tan Sheep (fat-tailed sheep from the lowlands, Ningxia Province) while the other group comprised HZ, QOL, and ZK (thin-tailed sheep from the alpine high-altitude, Qinghai Province). This grouping corresponded to the NJ phylogeny. Although HZ appeared to be separated from both QOL and ZK, its divergence was not as distinct as on the NJ tree. PC2 separated the Chinese breeds from the five ecotypes of Sudanese thin-tailed Desert Sheep which, as in the NJ tree, clustered close together. To further investigate the clustering pattern in the latter, we performed the PCA but excluding the Chinese breeds. It revealed three outlier individuals of KA and a slight separation of AL from the other ecotypes (Figure 2D). Given the likelihood that the three outliers could be masking the divergence of AL, we performed another PCA while excluding them. It resulted in a clear divergence of AL from the other ecotypes (Figure 2E). Given this result, we excluded the three outliers from subsequent analyses as their extreme divergence could not be explained.

We used the ADMIXTURE tool to investigate genome architecture and complement the NJ phylogeny and PCA. Running ADMIXTURE using all the study individuals resulted in the lowest CV error at K = 3 (Figure 3A). At this K, three gene pools were observed (Figure 3B). The first comprised the thin-tailed Desert Sheep from Sudan, the second was made up of the thin-tailed sheep from China (ZK, QOL, and HZ) and the third is exclusive to the Chinese fat-tailed Tan Sheep. We suggest that this reveals the broad-scale genetic structure defining the dataset. Considering the results of the NJ phylogeny and PCA, which showed AL to be genetically divergent and a slight separation of HZ from both ZK and QOL, we investigated the ADMIXTURE patterns at 4 ≤ K ≤ 8 (Figure 3B). At K = 4, HZ diverges from ZK and QOL and this divergence is retained up to K = 8. At K = 5, AL diverges from the other ecotypes of Sudanese thin-tailed Desert Sheep by its unique genetic background which is also retained up to K = 8. As the pattern of genetic structure remains consistent from 5 ≤ K ≤ 8 and it corresponds to the clusters revealed by the NJ phylogeny (Figures 2A,B) and PCA (Figures 2C,E), we suggest that K = 5 represents the fine-scale genomic structure in the study individuals. We therefore refer the two genetic clusters in the Sudanese thin-tailed Desert sheep as SD_G1 (AL) and SD_G2 (BU, HA, KA, and SH). We also classify the Chinese breeds into three genetic clusters, i.e., CN_G1 (Tan Sheep), CN_G2 (ZK and QOL), and CN_G3 (HZ).

FIGURE 3

Signatures of Selection

Based on these genetic clusters, we used ROH, FST and XP-EHH to investigate the genetic signatures of their divergence. We performed the ROH analysis to identify ROH segments that were specific to SD_G1 or SD_G2. The FST and XP-EHH were used for comparative selection signature analyses that contrasted the two genetic clusters of the Sudanese thin-tailed Desert Sheep with each other and with two of the Chinese clusters. That was, for SD_G1, the following comparative analyses were performed: SD_G1 verses SD_G2, CN_G1, or CN_G2. A similar comparative analysis was performed for SD_G2. We excluded CN_G3 from these analyses because its genome showed an admixed profile. All the candidate regions and genes identified by each approach for each comparison involving SD_G1 and SD_G2 are shown in Supplementary Tables 4–10.

The ROH analysis identified 107 ROH regions, overlapping 286 genes, that were specific to SD_G1 (Figure 4A), and 78 ROH regions, spanning 281 genes, that were specific to SD_G2 (Figure 4B). In total, 88 ROH regions were common between SD_G1 and SD_G2. The most significant ROH region which was common to SD_G1 and SD_G2 occurred on OAR3 (10,700,001–11,800,000 bp) and spanned 19 genes (Table 2). For the Chinese groups, 146 and 69 ROH islands spanning 257 and 43 genes were specific to CN_G1 (Figure 4C) and CN_G2 (Figure 4D), respectively.

FIGURE 4

TABLE 2

RegOarStartStopSize (Mb)MethodCompressionNo. of genesGenes
112224820612225595680.078ROHXP-EHH_SD_G2
225525099256812930.156ROH_FSTSD_G24TTC39A, EPS15, LOC105609505, LOC105609934
32526638912528055820.142ROH_FSTSD_G22KY, LOC105604327
425525099256812930.156ROHXP-EHH_CN_G22TTC39A, EPS15
51060503701062839180.234ROH_FSTCN_G12ENSOARG00000006782, ENSOARG00000006800
668580173687765360.196ROH_FSTCN_G25BTBD8, U6, C1orf146, GLMN, RPAP2
72912844791571610.029ROHXP-EHH_CN_G11ATP6V1G1
81828158421828248460.009ROHXP-EHH_CN_G1
914616917147979150.181ROHXP-EHH_CN_G21U6
10310700001118000001.099ROHXP-EHH_CN_G1, CN_G219HSPA5, RABEPK, PPP6C, SCAI, ENSOARG00000025028, GOLGA1, ARPC5L, ENSOARG00000013275, WDR38, U6, OLFML2A, ENSOARG00000023155, oar-mir-181a-2, NR6A1, NR5A1, ADGRD2, PSMB7, NEK6, LHX2
111318000011326000000.799ROHXP-EHH_SD_G219PPP1R1A, PDE1B, NCKAP1L, GTSF1, ITGA5, ZNF385A, COPZ1, NFE2, CBX5, HOXC4, HOXC5, HOXC6, HOXC8, HOXC9, HOXC10, HOXC11, HOXC12, HOXC13, STAB2
1212404574124977070.093ROHXP-EHH_CN_G1, CN_G21U6
132067450612068658540.121ROHXP-EHH_CN_G14LOC105614286, LOC105608615, LOC105608603, LOC105614853
141359040271359052570.001ROHXP-EHH_CN_G21LIMA1
151359830811359830810.000ROHXP-EHH_CN_G21CERS5
161538138361538639130.050ROH_FSTCN_G13ENSOARG00000002929, ENSOARG00000023603, LOC105609946
171856662251858733530.207ROH_FSTCN_G1
1851074000011080000000.599__FSTSD_G2, CN_G15ENSOARG00000000146, CAMK4, STARD4, ENSOARG00000010495, WDR36, ENSOARG00000025339, ENSOARG00000025340, ENSOARG00000025340
1941700001420000000.299_XP-EHH_CN_G211ENSOARG00000013233, ENSOARG00000013103, SOWAHA, SHROOM1, GDF9, UQCRQ, LEAP2, AFF4, U6, ZCCHC10, ENSOARG00000013524
2049700001500000000.299_XP-EHH_CN_G210PCDHB14, PCDHB15, ENSOARG00000014442, TAF7, PCDHGA1, PCDHGA2, ENSOARG00000000218, PCDHGC3, ENSOARG00000023940, DIAPH1
2148997231491376480.140ROHXP-EHH_CN_G29LOC101121602, 5S_rRNA, U6, SRA1, APBB3, LOC105615270, AO21, AO45, SLC35A4, ENSOARG00000018230
2249139142491714180.032ROHXP-EHH_CN_G25ENSOARG00000018230, CD14, TMCO6, NDUFA2, IK
2349589412497045690.115ROHXP-EHH_CN_G25PCDHB6, PCDHB7, LOC101103233, LOC101102062, PCDHB14
24669896247700001350.104ROHXP-EHHFSTSD_G22LOC105613062, LOC105613064
2585447324856950880.248ROHXP-EHHFSTSD_G2, CN_G1, CN_G26ENSOARG00000011228, ENSOARG00000008596, AMTN, AMBN, ENAM, JCHAIN
261160519311161253360.073ROH_FSTSD_G24NSD2, LETM1, FGFR3, ENSOARG00000015218
2769700001700000000.300ROHXP-EHHFSTSD_G2, CN_G22PDGFRA, ENSOARG00000021645
2873500001739000000.400_XP-EHHFSTSD_G22LOC101118699, TRNAC-GCA
2983200001836000000.400_XP-EHHFSTSD_G27CENPC, STAP1, UBA6, GNRHR, ENSOARG00000007652, TMPRSS11D, TMPRSS11A
3069896247700001350.104ROH_FSTCN_G22LOC105613062, LOC105613064
31827957061283189810.362ROHXP-EHH_CN_G18ENSOARG00000010506, ENSOARG00000010556, ENSOARG00000010591, CEP57L1, SESN1, 5S_rRNA, U12, ARMC2
321071700001721000000.400_XP-EHHFSTSD_G25LOC101109370, LOC101109105, LOC105616210, LOC101107612, LOC101109626
3336524922366511870.126ROHXP-EHH_CN_G24MPHOSPH8, PARP4, ENSOARG00000024226, U6
3470812659708636860.051ROH_FSTCN_G1, CN_G22ENSOARG00000001156, ENSOARG00000001163
351124600001248000000.200ROH_FSTCN_G17SPNS2, MYBBP1A, GGT6, TEKT1, SMTNL2, FBXO39, XAF1
361356200001566000000.399__FSTSD_G1, CN_G13PHACTR3, ENSOARG00000015902, ZNF831
3742500001428000000.300ROH_FSTCN_G18LOC101109370, LOC101109105, LOC105616210, LOC101107612, LOC101109626, LOC101109633, LOC101109111, LOC101109377
381434400001346000000.200ROHXP-EHHFSTCN_G210ATP6V0D1, AGRP, RIPOR1, CTCF, CARMIL2, ACD, PARD6A, ENKD1, C16orf86, GFOD2
3938400001386000001.999ROHXP-EHH_CN_G16PKD1L3, IST1, U6, ZNF821, ATXN1L, AP1G1
401542448209424619890.014ROH_FSTCN_G21SBF2
411734524230345454540.021ROHXP-EHHFSTSD_G21FGF2
4234400001347000000.300_XP-EHHFSTSD_G25FGF2, ENSOARG00000023095, NUDT6, U4, LOC101104012
4334524230345454540.021ROHXP-EHH_CN_G21FGF2
441817316593174375610.121ROHXP-EHH_CN_G1, CN_G21LOC105603074
4523475785235808010.105ROH_FSTSD_G2, CN_G21EFL1 (EFTUD1)
462136959560371786720.219ROH_FSTCN_G26PAG11, LOC105604177, LOC101123081, PAG3, LOC105604087, LOC105604088
4726150615319252930.419ROHXP-EHH_CN_G1, CN_G21CSMD1
48134469113945640.050ROHXP-EHH_CN_G21ENSOARG00000026782

Candidate regions detected by at least two approaches of selection signature analysis in the SD_G1 (AL) sheep group.

Reg = candidate region; Oar = ovine chromosome.

The XP-EHH identified 32 candidate regions in the comparative analysis between SD_G1 and SD_G2 (Figure 5A). These regions spanned 73 putative genes and the most significant region was the same as that identified by ROH on OAR3 (Table 2). Between SD_G1 and CN_G1, XP-EHH identified 34 candidate regions (Figure 5B) and against CN_G2, it identified 46 candidate regions (Figure 5C). These regions spanned 83 and 225 genes, respectively. The most significant region with CN_G1 was on OAR6 (85,447,324–85,695,088 bp) and spanned six genes while the one with CN_G2 occurred on OAR14 (34,400,001–34,600,000 bp) and spanned 10 genes (Table 2). The XP-EHH analysis between SD_G2 and CN_G1 (Figure 5D) or CN_G2 (Figure 5E) identified 31 and 46 candidate regions which spanned 83 and 208 genes, respectively. The most significant regions occurred on OAR10 (78,200,001–78,500,000 bp) and OAR14 (34,400,001–34,600,000 bp) spanning 7 and 10 putative genes (Table 3), respectively.

FIGURE 5

TABLE 3

RegOarStartStopSize (Mb)MethodCompressionNo. of genesGenes
111026802141027869430.107ROHXP-EHH_SD_G13NPR1, INTS3, SLC27A3
21055678421056418920.074ROH_FSTCN_G13ENSOARG00000022142, ENSOARG00000006704, ETV3
31290000011292000000.200_XP_EHHFSTCN_G13MRPL39, MIR155, U6
42551000012553000000.200_XP_EHHFSTCN_G14ACAD11, ACKR4, DNAJC13, ENSOARG00000009070
521840000011845000000.500_XP_EHHFSTCN_G15PTPN4, EPB41L5, U4, TMEM185B, RALB
6310700001118000001.099ROHXP-EHH_CN_G1, CN_G219HSPA5, RABEPK, PPP6C, SCAI, ENSOARG00000025028, GOLGA1, ARPC5L, ENSOARG00000013275, WDR38, U6, OLFML2A, ENSOARG00000023155, oar-mir-181a-2, NR6A1, NR5A1, ADGRD2, PSMB7, NEK6, LHX2
71927000011931000000.400_XP-EHHFSTSD_G15ST8SIA1, ENSOARG00000023574, LOC101115359, CMAS, ABCC9
811763552118256680.062ROHXP_EHH_CN_G1, CN_G22DENND1, ENSOARG00000013754
910512650105837050.071ROHXP_EHH_CN_G22MAPKAP1, U5
101071000011073000000.200_XP_EHHFSTCN_G11TSPAN8
111297000011299000000.200_XP_EHHFSTCN_G1, CN_G22SOCS2, CRADD
12494273495944452130.172ROHXP-EHH_SD_G110ENSOARG00000022095, ENSOARG00000021327, ENSOARG00000022560, MEST, MIR335, COPG2, ENSOARG00000025252, ENSOARG00000025253, ENSOARG00000025253, TSGA13
1387332113873488490.017ROH_FSTSD_G1
1448500001488000000.300_XP_EHHFSTCN_G18COG5, GPR22, DUS4L, ENSOARG00000005784, ENSOARG00000024106, SLC26A4, CBLL1, SLC26A3
15541700001420000000.299_XP-EHH_CN_G211ENSOARG00000013233, ENSOARG00000013103, SOWAHA, SHROOM1, GDF9, UQCRQ, LEAP2, AFF4, U6, ZCCHC10, ENSOARG00000013524
1649700001500000000.299_XP-EHH_CN_G210PCDHB14, PCDHB15, ENSOARG00000014442, TAF7, PCDHGA1, PCDHGA2, ENSOARG00000000218, PCDHGC3, ENSOARG00000023940, DIAPH1
171074634231075591500.096ROH_FSTSD_G12CAMK4, ENSOARG00000025339
1828500001289000000.400_XP-EHHFSTSD_G11SNCAIP
1974969987750833850.113ROHXP_EHH_CN_G11LOC105606744
2049868921499291700.060ROHXP_EHH_CN_G22LOC101105495, LOC101104318
21624953079249682460.015ROHXP_EHH_CN_G21LOC105615434
22755764564559012430.137ROHXP_EHH_CN_G11DMXL2
2334293259343821380.089ROHXP_EHH_CN_G22SPTBN5, EHD4
24890416289906800860.264ROH_FSTSD_G1, CN_G1, CN_G28ENSOARG00000005108, FAM120B, PSMB1, ENSOARG00000005142, PDCD2, ENSOARG00000027055, ENSOARG00000027056, ENSOARG00000027057
251050000001504000000.400_XP-EHHFSTSD_G1
2678200001785000000.300_XP-EHHFSTSD_G17TPP2, METTL21C, ENSOARG00000005160, TEX30, POGLUT2 (KDELC1), ERCC5, LOC101114712
27730000176000000.300_XP_EHHFSTCN_G22ENSOARG00000006632, ENSOARG00000006641
281151283288513610050.078ROH_FSTCN_G21RNF213
291269500001701000000.600_XP-EHHFSTSD_G17DTL, INTS7, LPGAT1, ENSOARG00000022944, NEK2, ENSOARG00000004286, SLC30A1
3061370423613876250.017ROHXP_EHH_CN_G1
311356326781565224170.196ROHXP-EHHFSTSD_G12ENSOARG00000015902, ZNF831
3246300001467000000.400_XP-EHHFSTSD_G16RASSF2, SLC23A2, TMEM230, PCNA, CDS2, ENSOARG00000022618
3355900001564000000.500_XP-EHHFSTSD_G16FAM217B, PPP1R3D, SYCP2, PHACTR3, ENSOARG00000021945, ENSOARG00000015902
3456400001567000000.300_XP-EHHFSTSD_G17ENSOARG00000015902, ZNF831, ZNF831, PRELID3B, ATP5F1E, TUBB1, CTSZ
3533415419335149370.100ROHXP_EHH_CN_G21ZNF438
361434400001346000000.200ROHXP-EHH_CN_G210ATP6V0D1, AGRP, RIPOR1, CTCF, CARMIL2, ACD, PARD6A, ENKD1, C16orf86, GFOD2
3738400001386000001.999ROHXP-EHH_CN_G16PKD1L3, IST1, U6, ZNF821, ATXN1L, AP1G1
381670528271707339100.206ROH_FSTCN_G18ENSOARG00000026989, ENSOARG00000015678, ENSOARG00000015729, ENSOARG00000015756, 5S_rRNA, EXOC3, ENSOARG00000026990, SLC9A3
391819336673193496590.013ROH_FSTCN_G21LOC101122139
401931400001317000000.300_XP-EHHFSTSD_G12ENSOARG00000009783, MITF
4137200001377000000.500_XP-EHHFSTSD_G16PRICKLE2, PSMD6, ATXN7, THOC7, C3orf49, ENSOARG00000011488
4248100001484000000.300_XP-EHHFSTSD_G115NEK4, ENSOARG00000024525, SPCS1, GLT8D1, GNL3, SNORD69, SNORD19C, SNORD19B, SNORD19, PBRM1, PBRM1, SMIM4, NT5DC2, STAB1, NISCH
4347900001481000000.200_XP_EHHFSTCN_G26SFMBT1, ENSOARG00000000541, ENSOARG00000000606, ITIH4, ITIH3, ITIH1
442016646531167202820.074ROHXP-EHHFSTSD_G17PTCRA, CNPY3, GNMT, PEX6, PPP2R5D, MEA1, KLHDC3
452146517968466313170.113ROH_FSTSD_G11ANO1
4638518624385191330.001ROH_FSTCN_G21ENSOARG00000026107
472539166409392058200.039ROHXP_EHH_CN_G11ENSOARG00000000408 (LOC101114083)

Candidate regions detected by at least two approaches of selection signature analysis in the SD_G2 sheep group.

Reg = candidate region; Oar = ovine chromosome.

The FST analysis identified 73 candidate regions with extreme allele frequency differentiation between SD_G1 and SD_G2 (Figure 6A), which spanned 288 putative genes. The most significant region was on OAR6 (69,700,001–70,000,000 bp) spanning two genes (Table 2). Between SD_G1 and CN_G1, FST revealed 24 regions spanning 56 genes (Figure 6B) and between SD_G1 and CN_G2, it identified 38 regions spanning 89 genes (Figure 6C). For SD_G2, FST identified 35 and 33 candidate regions that differentiated the group with CN_G1 and CN_G2, respectively. These regions spanned 107 and 68 genes, respectively, and the most significant regions occurred on OAR20 (16,646,531–16,720,282 bp) and OAR3 (129,700,001–129,900,000 bp) spanning seven and two genes (Table 3), respectively.

FIGURE 6

In general, there were 48 candidate regions in SD_G1, overlapping 206 genes that were simultaneously identified by a combination of at least two methods (ROH, FST and/or XP-EHH) and/or two comparative analyses (SD_G1 verses SD_G2/CN_G1/CN_G2) (Table 2). Among these regions, 39 were identified by ROH with either FST or XP-EHH, four by all the three approaches and five by FST and XP-EHH. For SD_G2, 47 candidate regions overlapping 209 genes across 18 autosomes were identified by a combination of at least two approaches and/or two comparative analyses (Table 3). Among these regions, 24 were identified by a combination of ROH with either FST or XP-EHH, two by all the three approaches and 19 by FST and XP-EHH.

The three approaches (ROH, XP-EHH, and FST) identified a total of 697 and 765 putative genes that overlapped with candidate regions identified in SD_G1 (Table 4) and SD_G2 (Table 5), respectively. These genes were used for GO and enrichment analyses for each group and the top-most significant GO terms and KEGG pathways are shown in Tables 4, 5. The “hyaluronan metabolic process (GO:0030212)” is the most common GO term across the comparisons. Since it is associated with a wide range of functions (Stern et al., 2006), it may be relevant to the two groups of Sudanese thin-tailed Desert sheep.

TABLE 4

GroupTypeCategoryTermGene countP valueGenesBenjamini
SD_G1 vs. SD_G2 (334 annotated genes)GOTERM_BP_DIRECTGO:0009952Anterior/posterior pattern specification100.00001HOXC5, HOXC4, HOXC9, GRSF1, HOXC8, HOXC11, GLI3, HOXC10, ACVR2A, HOXC60.00933
GOTERM_MF_DIRECTGO:0004861Cyclin-dependent protein serine/threonine kinase inhibitor activity40.00042CDKN2D, KAT2B, CDKN2C, INCA10.08136
GOTERM_CC_DIRECTGO:0005737Cytoplasm580.00052SLC23A2, STAB2, SRA1, HOXC11, EFTUD2, EPB41L4B, PPP2R5E, SESN1, MMP28, MEIOC, ARIH1, TRIM21, PDGFRA, PARP4, ARID1B, GLTPD2, RNF167, NEIL2, RUFY3, SLFN14, PICK1, CACTIN, DTL, ARF5, GTSF1, ATG4D, GTF2A1, ITIH4, UBA6, SRF, PRICKLE2, DBF4B, PLA2G6, RPAP2, CENPC, EPB41L5, ORC4, STK36, PSMB1, STXBP5, APBB3, HOXC6, CDKN2D, KLHDC3, RRM1, ZFHX3, TGFB1, PPIL2, CDKN2C, STAT1, SNCAIP, ACVR2A, COPS2, NF1, TPP2, PDCD2, PTPN4, MPHOSPH80.10205
KEGG_PATHWAYoas05152Tuberculosis100.00244TGFB1, SC5, STAT1, IFNGR1, CATHL3, AKT3, MAPK1, CD14, FADD, BAC50.27089
GOTERM_MF_DIRECTGO:0043022Ribosome binding50.00255LETM1, SPCS1, EFL1, SLFN14, RICTOR0.24848
KEGG_PATHWAYoas05212Pancreatic cancer60.00279RALB, TGFB1, STAT1, AKT3, TGFA, MAPK10.27089
GOTERM_BP_DIRECTGO:0030212Hyaluronan metabolic process30.00407ITIH4, ITIH3, ITIH11.00000
GOTERM_CC_DIRECTGO:0005730Nucleolus190.00507ZFHX3, WDR36, IK, CBX5, RNMT, UTP3, STAT1, NAT10, KRI1, RPAP2, GNL3, ORC4, EXOSC5, NOVA1, CAMK4, TRIM68, NOL10, DTL, MPHOSPH80.49461
GOTERM_MF_DIRECTGO:0008289Lipid binding60.00607BPIFA1, STARD4, BPIFB1, FABP5, BPIFA3, FABP120.39428
GOTERM_BP_DIRECTGO:0019985Translesion synthesis30.00832POLN, SPRTN, DTL1.00000
SD_G1 vs. CN_G1 (224 annotated genes)GOTERM_BP_DIRECTGO:0045921Positive regulation of exocytosis30.01215CADPS2, VSNL1, CFTR1.00000
GOTERM_CC_DIRECTGO:0005922Connexon complex30.01325GJB2, GJA3, GJB61.00000
GOTERM_MF_DIRECTGO:0030345Structural constituent of tooth enamel20.02370ENAM, AMBN1.00000
GOTERM_BP_DIRECTGO:0070175Positive regulation of enamel mineralization20.02610ENAM, AMTN1.00000
GOTERM_CC_DIRECTGO:0005730Nucleolus120.02915RSL1D1, ZFHX3, WDR36, IK, PSPC1, CAMK4, ATXN1L, CRYL1, NOL10, SNRPB2, RPAP2, MPHOSPH81.00000
SD_G1 vs. CN_G2 (365 annotated genes)GOTERM_CC_DIRECTGO:0005737Cytoplasm600.00031GMEB2, BICDL1, SRA1, STMN3, MSI1, HOXC11, XPO4, SIN3A, SESN1, RNF17, ARIH1, AZI2, KPNA1, PARP4, DTX3L, MYT1, ARID1B, COMMD4, ASPM, LATS2, TIPRL, ZNF438, PICK1, TLN1, NEIL1, ACTL7B, PXN, TUBD1, PLA2G6, RPAP2, LIMA1, PSMB7, NUAK1, STK36, DND1, APBB3, GCN1, SCAI, HOXC6, GINS1, PTPN18, ACD, ZFHX3, SRMS, SPAG8, NEK6, PTK6, PARP14, PTPN14, MSMP, GDF9, CENPF, CDAN1, MYBBP1A, COPS2, NF1, PTPN9, MPHOSPH8, EEF1AKMT1, CDK5R20.06403
GOTERM_BP_DIRECTGO:0009952Anterior/posterior pattern specification80.00050HOXC5, HOXC4, HOXC9, HOXC8, HOXC11, GLI3, HOXC10, HOXC60.37470
GOTERM_MF_DIRECTGO:0003950NAD+ ADP-ribosyltransferase activity40.00244PARP4, SIRT4, PARP9, PARP140.44686
GOTERM_BP_DIRECTGO:0007194Negative regulation of adenylate cyclase activity30.00839P2RY13, GPR87, DRD31.00000
GOTERM_MF_DIRECTGO:0043565Sequence-specific DNA binding110.01275NR5A1, NFE2, ZFHX3, ZGPAT, LHX2, FEV, HOXC5, HOXC4, HOXC9, HOXC8, HOXC60.87513
GOTERM_BP_DIRECTGO:0019731Antibacterial humoral response30.01400PLA2G1B, PLA2G6, JCHAIN1.00000
GOTERM_MF_DIRECTGO:0043022Ribosome binding40.01621LETM1, HSPA5, EFL1, GCN10.87513
GOTERM_CC_DIRECTGO:0008180COP9 signalosome40.01830STOML2, HSPA5, COPS2, DOCK70.97140
KEGG_PATHWAYoas04721Synaptic vesicle cycle40.04776ATP6V1G1, CLTC, ATP6V0D1, CPLX11.00000

Enriched functional terms and their enrichment scores following DAVID analysis for genes identified by all methodologies in the SD_G1 (AL) Sudanese sheep group.

TABLE 5

GroupTypeCategoryTermGene countP valueGenesBenjamini
SD_G2 vs. SD_G1 (455 annotated genes)GOTERM_CC_DIRECTGO:0005737Cytoplasm770.00016SLC23A2, LPGAT1, PREP, OXT, DNPH1, EFTUD2, HOXA10, EPB41L4B, XPO1, RASSF2, SOX15, MMP28, TUBB1, XPO5, MEIOC, POLH, PNISR, PDGFRA, USP45, ATP1B2, MYT1, THOC7, ENAH, GLTPD2, RNF167, NEIL2, IRF4, RUFY3, SLFN14, ZNF438, PRKD1, CACTIN, TP53, KIZ, DTL, ATG4D, GTF2A1, ITIH4, SPTBN5, PCNA, UBA6, TUBGCP2, SATB2, SRF, PRICKLE2, CACNA1A, DBF4B, PSMB10, CENPC, EPB41L5, ORC4, FXR2, ATXN7, PSMB1, STXBP5, TSNAXIP1, DTD2, GINS1, CDKN2D, KLHDC3, NOP58, TGFB1, PPIL2, CDKN2C, STAT1, MICAL3, MAD2L1BP, KLHL2, SNCAIP, ACVR2A, GNMT, NSFL1C, TPP2, PDCD2, PTPN4, TUBA8, WRAP530.04116
GOTERM_MF_DIRECTGO:0043022Ribosome binding60.00083LETM1, YTHDF1, SPCS1, EFL1, SLFN14, RICTOR0.21739
KEGG_PATHWAYoas04925Aldosterone synthesis and secretion70.00587MC2R, NPR1, CAMK4, CREB3L2, ITPR2, CACNA1D, PRKD10.95128
GOTERM_CC_DIRECTGO:0005654Nucleoplasm460.00644PCNA, RNMT, TUBGCP2, MAPKAP1, KEAP1, DBF4B, IRF2BPL, EDC4, ORC4, GGA1, DHX33, CREB3L2, XPO5, PHACTR3, HMG20B, MAPK1, NUP88, POLH, CDKN2D, PNISR, PBRM1, PHC2, PPIL2, PHC1, POLN, SFMBT1, MICAL3, TMEM192, EVX1, ILF2, MYT1, THOC7, GATAD2B, GNL3, NSFL1C, STAG1, PRPF6, IRF4, CAMK4, GID8, ZNF438, TCEA2, RANGRF, CACTIN, TKT, DTL0.65751
GOTERM_BP_DIRECTGO:0030212Hyaluronan metabolic process30.00746ITIH4, ITIH3, ITIH11.00000
GOTERM_CC_DIRECTGO:0072562Blood microparticle70.00777ITIH4, TGFB1, AHSG, HRG, ITIH1, KNG1, JCHAIN0.65751
GOTERM_BP_DIRECTGO:0006611Protein export from nucleus40.00781XPO1, TGFB1, XPO5, NUTF21.00000
KEGG_PATHWAYoas05212Pancreatic cancer60.01308RALB, TGFB1, STAT1, TGFA, MAPK1, TP530.95128
GOTERM_MF_DIRECTGO:0004869Cysteine-type endopeptidase inhibitor activity40.01447AHSG, FETUB, HRG, KNG11.00000
SD_G2 vs. CN_G1 (282 annotated genes)GOTERM_BP_DIRECTGO:0007265Ras protein signal transduction50.00207RALB, CDK2, NF1, HRAS, TP530.93013
GOTERM_BP_DIRECTGO:0030212Hyaluronan metabolic process30.00302ITIH4, ITIH3, ITIH10.93013
GOTERM_BP_DIRECTGO:0007093Mitotic cell cycle checkpoint40.00431INTS3, MAD2L1BP, KNTC1, HRAS0.93013
GOTERM_CC_DIRECTGO:0070062Extracellular exosome470.00828CRB2, FBLN7, NDUFB9, ITIH4, RAB5B, ITIH3, SPTBN5, RALB, ECHS1, CAB39L, ARPC5L, PDXP, LCAT, SAT2, PKD1L3, MLLT3, ITIH1, PSMB7, LGALS1, FXR2, GMDS, TSPAN8, PSMB1, GLUL, TSPAN1, MEST, ATP6V1C1, MPDU1, ATP6V1G1, HSPA5, IST1, TMEM192, NUTF2, DNAJC13, SLC9A3, ANO1, MYL6, EHD4, PTPRA, CAMK4, GPD1, TMBIM1, CPE, PDCD2, RPL26, SLC26A4, SHBG0.81138
GOTERM_BP_DIRECTGO:0051603Proteolysis involved in cellular protein catabolic process40.00844PSMB7, HSPA5, PSMB1, PSMB101.00000
GOTERM_CC_DIRECTGO:0005839Proteasome core complex30.01096PSMB7, PSMB1, PSMB100.81138
GOTERM_CC_DIRECTGO:0015030Cajal body40.01261NOP58, CDK2, ZC3H8, WRAP530.81138
GOTERM_BP_DIRECTGO:0045807Positive regulation of endocytosis30.01837CALY, CBLL1, TSPAN11.00000
KEGG_PATHWAYoas05231Choline metabolism in cancer50.02594SLC44A5, DGKA, PLCG1, GPCPD1, HRAS1.00000
SD_G2 vs. CN_G2 (374 annotated genes)GOTERM_CC_DIRECTGO:0005737Cytoplasm730.00000SLC23A2, GMEB2, BICDL1, STMN3, MSI1, OXT, ENDOV, XPO1, SIN3A, SOX15, POGZ, XPO5, TNFAIP8L1, POLH, KPNA1, DTX3L, ATP1B2, MYT1, RC3H2, COMMD4, ENAH, ASPM, IRF4, ALDH1A2, ZNF438, HCLS1, NSMF, NEIL1, TP53, KIZ, PLIN5, ITIH4, SPTBN5, PCNA, TUBGCP2, SATB2, ACTL7B, PXN, PSMB10, SOCS2, LIMA1, FBXO40, EPB41L5, PSMB7, PSMB4, NT5E, FXR2, DPP9, PSMB1, TSNAXIP1, GCN1, SCAI, GINS1, KLHDC3, ACD, NOP58, SRMS, CRADD, NEK6, MGA, MAD2L1BP, KLHL2, PTK6, PARP14, EIF2S2, GNMT, GDF9, CDAN1, MYBBP1A, TPP2, PTPN9, PDCD2, WRAP530.00009
GOTERM_CC_DIRECTGO:0005839Proteasome core complex40.00095PSMB7, PSMB4, PSMB1, PSMB100.09601
GOTERM_CC_DIRECTGO:0070062Extracellular exosome610.00149CAB39L, SNAP23, ARPC5L, SAT2, ZDHHC1, TXNDC17, MYDGF, FAM162A, UBXN6, CDK5RAP2, IQCB1, NUTF2, SLC9A3, ANO1, DNAJC5, GPD1, TMBIM1, RAB35, RPL26, ATP6V0D1, FBLN7, CRB2, ITIH4, ITIH3, SPTBN5, PCNA, ECHS1, RALB, ARL6, LCAT, ITIH1, SELENBP1, PSMB7, PSMB4, NT5E, FXR2, KIF3B, TSPAN8, GMDS, CREG1, PSMB1, MEST, TBC1D15, MPDU1, HSPA5, SPHKAP, TMEM192, UBE2G1, ASXL1, TPPP3, EHD4, LRG1, PTPRA, CAMK4, CPE, FNDC11, AGRP, PDCD2, SLC26A4, SHBG, LAMTOR30.10010
GOTERM_BP_DIRECTGO:0051603Proteolysis involved in cellular protein catabolic process50.00173PSMB7, PSMB4, HSPA5, PSMB1, PSMB101.00000
GOTERM_BP_DIRECTGO:0030212Hyaluronan metabolic process30.00463ITIH4, ITIH3, ITIH11.00000
GOTERM_MF_DIRECTGO:0004298Threonine-type endopeptidase activity40.00683PSMB7, PSMB4, PSMB1, PSMB101.00000
GOTERM_BP_DIRECTGO:0045740Positive regulation of DNA replication40.01049CTC1, PCNA, PLA2G1B, HRAS1.00000
GOTERM_MF_DIRECTGO:0043022Ribosome binding40.01923SPCS1, HSPA5, EFL1, GCN11.00000
KEGG_PATHWAYoas03050Proteasome40.02103PSMB7, PSMB4, PSMB1, PSMB101.00000
KEGG_PATHWAYoas00564Glycerophospholipid metabolism50.04408PNPLA7, PLA2G1B, GPD1, LCAT, CDS21.00000

Enriched functional terms and their enrichment scores following DAVID analysis for genes identified by all methodologies in the SD_G2 (BU, HA, KA, and SH) Sudanese sheep group.

Discussion

The history of indigenous livestock and their physiological, anatomical and genetic responses to natural and artificial selection is at the core of their diversity (phenotypic and genetic) and resilience. Here, we present findings of the analysis of genomic variation in the indigenous African long-legged thin-tailed Desert Sheep from Sudan. The overall average HO and HE for the Sudanese thin-tailed Desert Sheep exceeded 0.300, suggesting high genetic variation. The values for the individual ecotypes are, however, similar to those reported in Barki sheep (Kim et al., 2016), an indigenous breed from a hot desert environment in Egypt, are higher than the values reported in Ethiopian sheep (Edea et al., 2017), but fall within the range reported in sheep raised by South African smallholder farmers (Molotsi et al., 2017) and in New Zealand breeds (Brito et al., 2017). The level of diversity in the four Chinese breeds analyzed here is similar to that of the Sudanese thin-tailed Desert Sheep in spite some of the breeds, such as the Tan Sheep and HZ having been exposed to artificial selection. Ascertainment bias and deliberate avoidance of inbreeding in the Chinese breeds could explain this result. The former should however be seen from the context that the Ovine Infinium® HD SNP BeadChip carries assays for 606,006 loci with an average spacing of around 5 kb across the genome. These loci were selected from groups that differed in their MAF across 75 animals from an international panel of 34 domestic sheep breeds and two wild species of sheep (Bighorn and Thinhorn) (Anderson et al., 2014). The chip was also validated using 288 samples, generating 99% average call rates across SNPs and animals, and a call rate repeatability of 99.9978%.

The lack of stringent artificial selection coupled with random mating in the Sudanese thin-tailed Desert Sheep may explain their high levels of genetic diversity but low levels of genetic differentiation and inbreeding. The former may be enhancing their fitness and could be responsible for their adaptive resilience to the desert environments where they are raised.

We investigated demographic dynamics by assessing the trends in LD over genomic distances and NE over the past 1,000 generations. All the samples analysed showed a rapid decay in LD within the first 300 kb. From around 0.1 Mb, the Chinese breeds had higher r2 values, which most likely reflected an attempt at their standardization as distinct breeds compared to the Sudanese thin-tailed Desert Sheep, where such efforts are lacking. However, for both subsets, the r2 values averaged below 0.15, suggesting very limited extension of high LD blocks across their genomes. This r2 value ranked below the minimum threshold range of 0.33 ≤ r2 ≤ 0.80 that is meaningful for GWAS (Ardlie et al., 2002; Carlson et al., 2004). Much denser marker coverage may thus be required for association analysis in the thin-tailed Desert Sheep and the Chinese breeds. Besides its importance for mapping traits, LD allows the estimation of NE over generation time (Pritchard and Przeworski, 2001), acts as a significant genetic parameter for understanding population dynamics and can act as a measure of long-term performance of a population with regard to diversity and inbreeding (Fernández et al., 2005), and is useful for evaluating the risk status of populations/breeds (FAO, 1998; Duchev et al., 2006). A contraction in NE from 270 and 500 generations ago was revealed in both the Sudanese thin-tailed Desert Sheep and the Chinese breeds, respectively. These contractions appeared not to have resulted in a concomitant reduction in genetic diversity. The contraction in the Chinese breeds may be associated with the start of their establishment as distinct breeds while that in the Sudanese thin-tailed Desert Sheep is difficult to explain. However, whole-genome sequence analysis has revealed declines in NE in Ethiopian, Libyan and Sudanese sheep (Ahbara, 2019) and in Ethiopian goats based on the analysis of 50K SNP genotype data (Tarekegn et al., 2020). Mbole-Kariuki et al. (2014) also reported a bottleneck event in East African shorthorn Zebu cattle from western Kenya. The reduction in population sizes of the other African sheep and goat populations falls within the time period of the shrinkage in Sudanese thin-tailed Desert Sheep, while that of the East African shorthorn Zebu cattle started around 240 years ago. Historical fluctuations in climatic patterns resulting in cycles of favorable and deteriorating conditions in the continent (Verschuren, 2007) have been used to explain the fluctuations in NE.

The NJ phylogeny, PCA and ADMIXTURE allowed us to reveal the underlying genetic structure and divergence in the datasets analysed. We hypothesized that the ADMIXTURE pattern at K = 3, which was supported by NJ phylogeny and PCA, revealed an underlying broad-scale genetic structure as it showed (i) a separation of African sheep from the Chinese breeds, (ii) a separation of fat-tailed sheep (Tan breed) from both African and Chinese thin-tailed sheep, and (iii) different genetic backgrounds in both the African and the Chinese thin-tailed sheep. While the first result suggest genetic divergence that has most likely arisen due to reproductive isolation between sheep in different continents coupled with genetic drift, the second result confirm the existence of differences in genetic makeup of fat-tailed and thin-tailed sheep. A similar result based on the analysis of microsatellites was reported previously between African fat-tailed and thin-tailed sheep (Muigai, 2003). Furthermore, the divergence of the fat-tailed Tan Sheep from the other Chinese breeds can be due to artificial selection for lamb pelt and lustrous white curly fleece in the Tan sheep. The clear divergence between the Sudanese thin-tailed Desert Sheep and the Chinese thin-tailed sheep suggest at least two possibilities: (i) the domestication of at least two autosomal gene pools of thin-tailed sheep in the Fertile Crescent and their subsequent independent dispersal westwards to Africa and eastwards to China, and (ii) the dispersal of one gene pool followed by genetic divergence driven by genetic drift due to reproductive isolation and natural selection driving adaptation to low altitude hot arid environments in Africa or high altitude alpine arid environments in China. Although mitogenome analysis has identified two waves of sheep migration comprising three maternal lineages across eastern Eurasian (Lv et al., 2015), the study did not include sheep from Africa and therefore the first scenario is difficult to ascertain. We therefore favor the second scenario given that recent analysis of autosomal genomic data has provided compelling evidence for climate-mediated selection pressure driving genetic divergence in sheep (Lv et al., 2014) while differences in precipitation affecting feed availability has also been shown to drive variation in natural selection at a global scale (Siepielski et al., 2017).

A combination of NJ phylogeny, PCA and ADMIXTURE (K > 5) results also revealed what we hypothesized to be a fine-scale genetic structure among the individuals analyzed. The analyses revealed at least two distinct genetic clusters in the Sudanese thin-tailed Desert Sheep; one which was specific to AL and another to the four remaining ecotypes. The analyses also identified two genetic clusters in the Sudanese thin-tailed sheep from China: one cluster was exclusive to HZ and another occurred in QOL and ZK and at low frequency in HZ. Given that the five ecotypes of thin-tailed Desert Sheep are all derived from a low altitude hot arid environment while the Chinese thin-tailed breeds are adapted to an alpine/sub-alpine high-altitude rangeland, this fine-scale genetic structure was unexpected. It may hint at a complex genome architecture in the thin-tailed sheep because it cannot be explained by differential selection for adaptation. It is, however, likely that artificial selection may be driving the divergence in the Chinese thin-tailed sheep. With a common genomic background observed in QOL and ZK and the same occurring at a low frequency in HZ, it suggests that HZ has been gradually diverging from QOL and ZK. HZ has a predominantly black coat while QOL and ZK have mainly white coats with some black to brown patches around heads and legs. QOL and ZK also occur in close geographic proximity and can thus cross mate while HZ is isolated and farmers keeping this breed avoid mating it with other breeds, so as to maintain its distinct black coat, genetic purity and recognition.

We investigated molecular signatures of selection to gain insights on the divergence of SD_G1 and SD_G2 genetic clusters of the Sudanese thin-tailed Desert Sheep. For this reason, we used three approaches and contrasted SD_G1 and SD_G2 with each other and with two of the genetic clusters: CN_G1 and CN_G2 observed in Chinese sheep. We used the ROH analysis to provide evidence for selection within a cluster. If such signatures overlapped with those identified by FST and/or XP-EHH and were observed in at least two comparative analyses, then we took this to be a reliable evidence of positive selection in the specific genetic cluster. We therefore limited our discussion to the putative genes found within the candidate regions that overlapped between at least two approaches and in more than two comparative analyses involving the two genetic groups observed in the Sudanese thin-tailed Desert Sheep (Tables 2, 3).

Based on our criteria, nine candidate regions that overlapped between at least two approaches and were observed in at least two comparative analyses were identified in SD_G1. These candidate regions were located on OAR3, 5, 6 (two regions), 10, 13, 18 (two regions), and 26 (one region) (Table 2). Within these candidate regions, we found genes that are associated with functions relating to adaptation to abiotic and biotic factors and morphology. For instance, the most significant region on OAR6 (Figures 7, 8) spanned six genes, one of which was AMBN (ameloblastin), a candidate gene for gastrointestinal nematode resistance in sheep (Atlija et al., 2016; Berton et al., 2017). The PDGFRA gene occurred in another candidate region on OAR6 (Figures 7, 8). It has been reported to be a key gene in cytokine signaling and was observed to be amongst genes in biological pathways that are involved in host immune responses against parasitic infections (Benavides et al., 2015). PDGFRA has also been reported to be tightly associated with the dominant white coat color in the pig (Johansson et al., 1992; Moller et al., 1996). Nazari-Ghadikolaei et al. (2018) also found significant association between PDGFRA and KIT with white coat color in Markhoz goat from Iran. The white coat color predominated in AL based on field observations made by the first author during sampling. PDGFRA has also been linked to fat deposition in Libyan long fat-tailed sheep (Mastrangelo et al., 2019) while CAMK4 found on OAR5 has been associated with tail morphology and fat deposition in sheep (Moradi et al., 2012; Ma et al., 2018; Ahbara et al., 2019; Li et al., 2020). These results likely suggest that divergent selection for parasite resistance, coat color, differential fat deposition and tail morphology may explain the divergence of SD_G1. With the lack of phenotypic data, our suggestions would need to be confirmed with more detailed studies that include an assessment of appropriate phenotypes.

FIGURE 7

FIGURE 8

Our criteria also revealed four candidate regions that overlapped between at least two approaches and in at least two comparative analyses in SD_G2 (Table 3). These regions were observed on OAR3 (three regions) and OAR8 (one region). The region on OAR8 was the strongest and it spanned three genes, FAM120B, PSMB1, and PDCD2. FAM120B has been suggested to be a potential candidate for twinning rate in humans (Painter et al., 2010) and in lowly ovulating mammals (Vinet et al., 2012). The PSMB1, which was the top candidate gene at this region, has been associated with functional adaptation of the transcriptome to mastitis-causing pathogens in mammary gland of livestock (Loor et al., 2011). The SOCS2, which was found in one of the candidate regions on OAR3, has been linked with body weight and milk production, and an increased susceptibility to inflammation of mammary glands (Rupp et al., 2015). It is important to take note that most pastoral societies in Africa prefer larger sized animals with higher milk production as offspring from such animals are thought to thrive better in hot arid environments. This indirect preference for such individuals may be responsible for this selection signal. SOCS2 has also been reported to play important roles in key adaptive phenotypes in sheep, including general immune response following infection (Yang et al., 2015; Al Kalaldeh et al., 2019) and resistance to gastrointestinal nematode (Haemonchus contortus) (Estrada-Reyes et al., 2019). The proteasome PSMB7 and the heat shock protein HSPA5, both found in the region on OAR3, were reported to be upregulated during blastocyst implantation in hamsters (Lei et al., 2014). These results suggest that the divergence of SD_G2 could be associated with differential resistance to bacterial infections, productive and reproductive performance. However, as for SD_G1, this finding will need to be investigated further with much detailed analyses of individuals with relevant phenotypes.

The candidate region on OAR3 (10,700,001–11,800,000 bp) differentiated both SD_G1 (Table 2) and SD_G2 (Table 3) from both CN_G1 and CN_G2 (Figure 9) and was therefore of interest. The top significant window at this region spanned five genes (NR6A1, NR5A1, ADGRD2, PSMB7, and NEK6) and the top-most significant position was close to NR5A1. The expression of NR5A1 drives Leydig cell differentiation from progenitor cells (Barsoum and Yao, 2010), thus initiating steroidogenesis (Martin and Tremblay, 2010). In mice, NR5A1 has been shown to be essential in the formation of pituitary, gonad and adrenal glands (Luo et al., 1994). Another region that differentiated SD_G1 and SD_G2 from both CN_G1 and CN_G2 occurred on OAR14. Two genes were present in this region, one of which was AP1G1. Proteomic analysis of sheep uterus has revealed the role of AP1G1 in prolificacy (La et al., 2020). Whether the function of NR5A1 and AP1G1 suggested inherent genetic differences in fertility and prolificacy between the Sudanese thin-tailed Desert Sheep and the Chinese breeds is difficult to say in the absence of appropriate phenotype data. Strong evidence suggests that NR6A1 is a strong candidate gene underlying vertebrae number in domestic pigs (Mikawa et al., 2007; Rubin et al., 2012) and the number of thoracic vertebrae in domestic sheep (Li et al., 2019). From a phenotypic perspective, this result is interesting as it suggests that the Sudanese thin-tailed Desert Sheep can be differentiated from the Chinese thin-tailed sheep based on the number of vertebrae. Indeed, the tail of Sudanese sheep consists of 23 caudal vertebrae (Ahbara, 2019), while that of Chinese sheep comprises less than 18 (Zhi et al., 2018).

FIGURE 9

The African long-legged thin-tailed sheep is raised in desert environments where they are exposed year-round, to complex interacting biophysical stressors, including high temperatures, physical exhaustion, direct solar radiation, feed and water scarcity. In this context, the revelation of the GO biological term “hyaluronan metabolic process (GO:0030212)” in four out of the six comparisons involving SD_G1 and SD_G2 is relevant. Hyaluronan (HA) comprises a major component of the extracellular matrix (ECM) in vertebrates and is a straight chain glycosaminoglycan which mediates diverse functions depending on molecular size and tissue concentration, both of which are regulated by the balance between its biosynthesis and catabolism (Hascall and Esko, 2015). HA occurs virtually in all vertebrate tissues and fluids, but skin, the first defense against environmental insults, is its largest reservoir containing more than 50% of the total body HA. The HA content of dermis is far greater than that of epidermis, and accounts for most of the 50% of the total body HA in skin. HA has excellent water retention ability and remarkable tissue hydration capacity, and at high concentrations, as found in the ECM of dermis and epidermis, it regulates water balance and osmotic pressure, functions as an ion exchange resin and regulates ion flow (Stern, 2010). HA also functions in the reepithelization process due to several of its properties, including being an integral part of the ECM of basal keratinocytes, the major constituent of epidermis, its free-radical scavenging function and its role in keratinocyte proliferation and migration (Tammi et al., 1989). It has been observed that the content of HA increases in the presence of retinoic acid (vitamin A) (Tammi et al., 1989) and the effects of retinoic acid against UV-induced skin damage may be correlated, at least in part, with an increase in skin HA content, giving rise to increased tissue hydration. It has been suggested that the free-radical scavenging property of HA contributes to protection against repeated exposure to the sun’s UV radiation (Tuhkanen et al., 1998; Averbeck et al., 2007). The rapid turnover of HA in tissues suggests tightly controlled modes for modulating steady state levels of HA and this is important because rapid increases are required in situations of extreme stress (Kobayashi et al., 2020). Therefore, the ability to provide immediate high HA levels acts as a survival mechanism for vertebrates and may explain the rapid rates of HA turnover under basal conditions. Furthermore, HA plays a role in innate immunity. Although it binds to CD44, there is evidence showing that HA degradation products transduce inflammatory signal through toll-like receptor 2 (TLR2), TLR4 or both in macrophages and dendritic cells. Thus, the HA metabolic process may be facilitating the adaptation to desert environments in African long-legged thin tailed sheep.

Conclusion

In this study, we analyzed the genetic diversity, structure and signatures of selection in the African thin-tailed Desert Sheep sampled from Sudan. We included one breed of fat-tailed and three breeds of thin-tailed sheep from China for comparative genomic analysis. We found high levels of genetic diversity but low levels of genetic differentiation among the five ecotypes of Sudanese thin-tailed Desert Sheep. The analysis also revealed a broad- and fine-scale genetic structures in the sheep analyzed, suggesting that these would need to be accounted for in genome-wide association analysis in the discovery of the genetic basis of important traits and in breeding program design. Selection signature analysis identified candidate regions that could potentially differentiate the two genetic clusters observed in the African thin-tailed Desert Sheep from the two genetic clusters observed in the Chinese thin-tailed sheep. These regions spanned a set of potential candidate genes associated with traits of adaptive, production and reproduction significance as well as morphological differentiation. While our study provides a foundation for understanding the genome structure and dynamics of African indigenous sheep, it reveals findings that could form the basis of studies that combine genomic and phenomic approaches in the quest to understand the genome architecture of indigenous livestock.

Statements

Data availability statement

The data presented in the study are deposited in the figshare repository, accession number doi: 10.6084/m9.figshare.14785278.v1.

Ethics statement

The blood samples from the Sudanese thin-tailed Desert Sheep were collected after consent was granted by flock owners and local administration officials in Sudan. No further permissions were required from the ethics committee of the Organization of Veterinary Service, Government of Sudan. All animal experiments in this study were fully approved by the Animal Care and Use Committee of the Institute of Animal Sciences, Chinese Academy of Agricultural Sciences (IAS-CAAS) with the following reference number: IASCAAS-AE-03, on September 1, 2014.

Author contributions

QZ, YM, and JM conceived, designed, and supervised the study and genotyped the Chinese breeds. JM, MR, and AH genotyped the Sudanese thin-tailed Desert Sheep. FE organised the sampling in Sudan. AAh and AAb analyzed the data with support from HB, RI and LX. AAb, AAh, J-LH, and JM wrote and revised the manuscript. All authors read and approved the final version.

Funding

This work was supported by the Sudan Agricultural Research Corporation (ARC) and Scientific Research and the Dry Land Research Centre (DLRC) and Animal Production. Sampling and genotyping of the Sudan thin-tailed Desert sheep was supported by the CGIAR Research Program on Livestock (Livestock CRP) and the China and Sudan contribution to the CGIAR through ICARDA. Additional financial support was also availed by the Science and Technology Innovation Project of Chinese Academy of Agricultural Sciences (ASTIP-IAS01) as well as the Modern Cashmere Sheep Industry System (CARS-39-01).

Acknowledgments

We acknowledge the assistance of the technical staff at El-Obeid Research Station, during sampling.

Conflict of interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Supplementary material

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fgene.2021.659507/full#supplementary-material

Supplementary Figure 1

Z-transformed average fixation index (ZFST) for autosomal 200 kb windows.

Supplementary Figure 2

Normalized distribution of XP-EHH scores.

Supplementary Table 1

The sampling locations of the thin-tailed Desert Sheep ecotypes from Sudan and the breeds from China that were used in the current study.

Supplementary Table 2

Phenotypic characteristics of the thin-tailed Desert Sheep ecotypes and the climatic characteristics of their production environment.

Supplementary Table 3

Pairwise genetic differentiation (FST) among the five ecotypes of Sudanese thin-tailed Desert Sheep and four breeds of sheep from China herein analyzed.

Supplementary Table 4

The candidate regions spanning genes within the SD_G1 sheep group identified through the analysis of ROH islands.

Supplementary Table 5

The candidate regions spanning genes within the SD_G2 sheep group identified through the analysis of ROH islands.

Supplementary Table 6

The candidate regions spanning genes within the SD_G1 (SD_G1 vs. SD_G2, CN_G1, and CN_G2) sheep group identified by XP-EHH.

Supplementary Table 7

The candidate regions spanning genes within the SD_G2 (SD_G2 vs. SD_G1, CN_G1, and CN_G2) sheep group identified by XP-EHH.

Supplementary Tables 8

The candidate regions spanning genes within the Sudanese sheep groups (SD_G1 vs. SD_G2) identified by FST.

Supplementary Table 9

The candidate regions spanning genes within the SD_G1 (SD_G1 vs. CN_G1 and CN_G2) sheep group identified by FST.

Supplementary Table 10

The candidate regions spanning genes within the SD_G2 (SD_G2 vs. CN_G1 and CN_G2) sheep group identified by FST.

References

  • 1

    AbualazayiumM. (2004). Animal Wealth and Animal Production in Sudan.Khartoum: Khartoum University Press.

  • 2

    AhbaraA.BahbahaniH.AlmathenF.Al AbriM.AgoubM. O.MwacharoJ. M. (2019). Genome-wide variation, candidate regions and genes associated with fat deposition and tail morphology in Ethiopian indigenous sheep.Front. Genet.9:699. 10.3389/fgene.2018.00699

  • 3

    AhbaraA. M. (2019). Autosomal Genome-wide Analysis of Diversity, Adaptation and Morphological Traits in African Indigenous Sheep.Ph.D. Thesis. Nottingham: University of Nottingham, 183.

  • 4

    Al KalaldehM.GibsonJ.LeeS. H.GondroC.van der WerfJ. H. J. (2019). Detection of genomic regions underlying resistance to gastrointestinal parasites in Australian sheep.Genet. Sel. Evol.51:37. 10.1186/s12711-019-0479-1

  • 5

    AlexanderD. H.NovembreJ.LangeK. (2009). Fast model-based estimation of ancestry in unrelated individuals.Genome Res.1916551664. 10.1101/gr.094052.10910.1080/10495390902786926

  • 6

    AndersonR.McEwanJ. C.BrauningR.PickeringN. K.KijasJ.DalrympleJ.et al (2014). “Development of a high density (600K) Illumina ovine SNP chip and its use to fine map the yellow fat locus,” in Proceedings of the (ISGC) Plant and Animal Genomics XXII Conference – January 2014, San Diego, CA. 10.1016/j.vetpar.2008.12.017

  • 7

    ArdlieK. G.KruglyakL.SeielstadM. (2002). Patterns of linkage disequilibrium in the human genome.Nat. Rev. Genet.3299309. 10.1038/nrg777

  • 8

    AtlijaM.ArranzJ.-J.Martinez-ValladaresM.Gutiérrez-GilB. (2016). Detection and replication of QTL underlying resistance to gastrointestinal nematodes in adult sheep using the ovine 50K SNP array.Genet. Sel. Evol.48:4. 10.1186/s12711-016-0182-4

  • 9

    AverbeckM.GebhardtC. A.VoigtS.BeilharzS.AndereggU.TermeerC. C.et al (2007). Differential regulation of hyaluronan metabolism in the epidermal and dermal compartments of human skin by UVB irradiation.J. Invest. Dermatol.127687697. 10.1038/sj.jid.570061410.1038/nature11837

  • 10

    BarsoumI. B.YaoH. H. (2010). Fetal Leydig cells: progenitor cell maintenance and differentiation.J. Androl.311115. 10.2164/jandrol.109.00831810.30861/9780860543251

  • 11

    BenavidesM. V.SonstegardT. S.KempS.MugambiJ. M.GibsonJ. P.BakerR. L.et al (2015). Identification of novel loci associated with gastrointestinal parasite resistance in a red Maasai x Dorper backcross population.PLoS One10:e0122797. 10.1371/journal.pone.0122797

  • 12

    BertonM. P.de Oliveira SilvaR. M.PeripolliE.StafuzzaN. B.MartinJ. F.ÁlvarezM. S.et al (2017). Genomic regions and pathways associated with gastrointestinal parasites resistance in Santa Ines breed adapted to tropical climate.J. Anim. Sci. Biotechnol.873. 10.1186/s40104-017-0190-4

  • 13

    BiscariniF.CozziP.GaspaG.MarrasG. (2019). detectRUNS: An R Package to Detect runs of Homozygosity and Heterozygosity in Diploid Genomes. Available online at: https://cran.r-project.org/web/packages/detectRUNS/vignettes/detectRUNS.vignette.html(accessed October, 2020).

  • 14

    BrahiO. H. D.XiangH.ChenX.FarougouS.ZhaoX. (2015). Mitogenome revealed multiple postdomestication genetic mixtures of West African sheep.J. Anim. Breed. Genet.132399405. 10.1111/jbg.12144

  • 15

    BritoL. F.McEwanJ. C.MillerS. P.PickeringN. K.BainW. E.DoddsK. G.et al (2017). Genetic diversity of a New Zealand multibreed sheep population and composite breeds’ history revealed by a high-density SNP chip.BMC Genet.18:25. 10.1186/s12863-017-0492-8

  • 16

    BuchmannR.HazelhurstS. (2014). Genesis Manual.Johannesburg: University of the Witwatersrand.

  • 17

    CarlsonC. S.EberleM. A.RiederM. J.YiQ.KruglyakL.NickersonD. A. (2004). Selecting a maximally informative set of single-nucleotide polymorphisms for association analyses using linkage disequilibrium.Am. J. Hum. Genet.74106120. 10.1086/381000

  • 18

    ChangC. C.ChowC. C.TellierL. C.VattikutiS.PurcellS. M.LeeJ. J. (2015). Second-generation PLINK: rising to the challenge of larger and richer datasets.GigaScience4:7. 10.1186/s13742-015-0047-8

  • 19

    DuchevZ.DistlO.GroeneveldE. (2006). Early warning system for loss of diversity in European livestock breeds.Archiv. Anim. Breed.49521531. 10.5194/aab-49-521-2006

  • 20

    EdeaZ.DessieT.DadiH.DoK. T.KimK. S. (2017). Genetic diversity and population structure of Ethiopian sheep populations revealed by high-density SNP markers.Front. Genet.8:218. 10.3389/fgene.2017.00218

  • 21

    Estrada-ReyesZ. M.TsukaharaY.AmadeuR. R.GoetschA. L.GipsonT. A.SahluT.et al (2019). Signatures of selection for resistance to Haemonchus contortus in sheep and goats.BMC Genom.20:735. 10.1186/s12864-019-6150-y

  • 22

    ExcoffierL.LischerH. E. (2010). Arlequin suite ver 3.5: A new series of programs to perform population genetics analyses under Linux and Windows.Mol. Ecol. Resour.10564567. 10.1111/j.1755-0998.2010.02847.x

  • 23

    FAO (1998). Secondary Guidelines for the National Farm Animal Genetic Resources Management Plans: Management of Small Populations at Risk.Rome: FAO.

  • 24

    FAO (2007). The State of the World’s Animal Genetic Resources for Food and Agriculture, edsRischkowskyB.PillingD. (Rome: FAO).

  • 25

    FernándezJ. B.VillanuevaB.Pong-WongR.ToroM. A. (2005). Efficiency of the use of pedigree and molecular marker information in conservation programs.Genetics17013131321. 10.1534/genetics.104.037325

  • 26

    GoudetJ. (2005). Hierfstat, a package for R to compute and test hierarchical F-statistics.Mol. Ecol. Notes5184186. 10.1111/j.1471-8286.2004.00828.x10.1016/S2095-3119(14)60928-X

  • 27

    HascallV.EskoJ. D. (2015). “Hyaluronan,” in Essentials of Glycobiology, 3rd Edn, edsVarkiA.CummingsR. D.EskoJ. D.StanleyP.HartG. W.et al (Cold Spring Harbor, NY: Cold Spring Harbor Laboratory Press), 197206.

  • 28

    HuangD. W.ShermanB. T.LempickiR. A. (2009). Bioinformatics enrichment tools: paths toward the comprehensive functional analysis of large gene lists.Nucleic Acids Res.37113. 10.1093/nar/gkn923

  • 29

    InnanH.KimY. (2008). Detecting local adaptation using the joint sampling of polymorphism data in the parental and derived populations.Genetics17917131720. 10.1534/genetics.108.086835

  • 30

    JakobssonM.RosenbergN. A. (2007). CLUMMPP: a cluster matching and permutation program for dealing with label switching and multimodality in analysis of population structure.Bioinformatics23, 18011806. 10.1093/bioinformatics/btm233

  • 31

    JohanssonM.EllegrenH.MarklundL.GustavssonU.Ringmar-CederbergE.AnderssonK.et al (1992). The gene for dominant white color in the pig is closely linked to ALB and PDGRFRA on chromosome 8.Genomics14965969. 10.1016/S0888-7543(05)80118-110.1111/age.12197

  • 32

    KimE. S.ElbeltagyA. R.Aboul-NagaA. M.RischkowskyB.SayreB.MwacharoJ. M.et al (2016). Multiple genomic signatures of selection in goats and sheep indigenous to a hot arid environment.Heredity116255264. 10.1038/hdy.2015.94

  • 33

    KobayashiT.ChanmeeT.ItanoN. (2020). Hyaluronan: metabolism and function.Biomolecules10:1525. 10.3390/biom10111525

  • 34

    KumarS.StecherG.TamuraK. (2016). MEGA7: molecular evolutionary genetics analysis 7.0 for bigger datasets.Mol. Biol. Evol.33, 18701874. 10.1093/molbev/msw054

  • 35

    LaY.TangJ.GuoX.ZhangL.GanS.ZhangX.et al (2020). Proteomic analysis of sheep uterus reveals its role in prolificacy.J. Proteom.210:103526. 10.1016/j.jprot.2019.103526

  • 36

    LeiW.HeringtonJ.GalindoC. L.DingT.BrownN.ReeseJ.et al (2014). Cross-species transcriptomic approach reveals genes in hamster implantation sites.Reproduction148607621. 10.1530/REP-14-0388

  • 37

    LiC.LiM.LiX.NiW.XuY.YaoR.et al (2019). Whole-genome resequencing reveals loci associated with thoracic vertebrae number in sheep.Front. Genet.10:674. 10.3389/fgene.2019.00674

  • 38

    LiX.YangJ.ShenM.XieX. L.LiuG. J.XuY. X.et al (2020). Whole-genome resequencing of wild and domestic sheep identifies genes associated with morphological agronomic traits.Nat. Commun.11:2815. 10.1038/s41467-020-16485-1

  • 39

    LoorJ. J.MoyesK. M.BionazM. (2011). Functional adaptations of the transcriptome to mastitis-causing pathogens: the mammary gland and beyond.J. Mammary Gland Biol. Neoplasia16305322. 10.1007/s10911-011-9232-2

  • 40

    LuoX.IkedaY.ParkerK. L. (1994). A cell-specific nuclear receptor is essential for adrenal and gonadal development and sexual differentiation.Cell77, 481490. 10.1016/0092-8674(94)90211-9

  • 41

    LvF. H.AghaS.KantanenJ.ColliL.StuckiS.KijasJ. W.et al (2014). Adaptations to climate-mediated selective pressures in sheep.Mol. Biol. Evol.3133243343. 10.1093/molbev/msu264

  • 42

    LvF. H.PengW. F.YangJ.ZhaoY. X.LiW. R.LiuM. J.et al (2015). Mitogenomic meta-analysis identifies two phases of 1migration in the history of eastern Eurasian sheep.Mol. Biol. Evol.3225152533. 10.1093/molbev/msv139

  • 43

    MaL.LiZ.CaiY.XuH.YangR.LanX. (2018). Genetic variants in fat-and short-tailed sheep from high-throughput RNA-sequencing data.Anim. Genet.49483487. 10.1111/age.12699

  • 44

    MarrasG.GaspaG.SorboliniS.DimauroC.Ajmone-MarsanP.ValentiniA.et al (2015). Analysis of runs of homozygosity and their relationship with inbreeding in five cattle breeds farmed in Italy.Anim. Genet.46110121. 10.1111/age.12259

  • 45

    MartinL. J.TremblayJ. J. (2010). Nuclear receptors in Leydig cell gene expression and function.Biol. Reprod.83314. 10.1095/biolreprod.110.083824

  • 46

    MastrangeloS.MoioliB.AhbaraA.LatairishS.PortolanoB.PillaF.et al (2019). Genome-wide scan of fat-tail sheep identifies signals of selection for fat deposition and adaptation.Anim. Prod. Sci.59835848. 10.1071/AN17753

  • 47

    Mbole-KariukiM. N.SonstegardT.OrthA.ThumbiS. M.BronsvoortB. M. C.KiaraH.et al (2014). Genome-wide analysis reveals the ancient and recent admixture history of East African shorthorn Zebu from western Kenya.Heredity113, 297305. 10.1038/hdy.2014.31

  • 48

    McQuillanR.LeuteneggerA.-L.Abdel-RahmanR.FranklinC. S.PericicM.Barac-LaucL.et al (2008). Runs of homozygosity in European popula1tions.Am. J. Hum. Genet.83359372. 10.1016/j.ajhg.2008.08.007

  • 49

    MikawaS.MorozumiT.ShimanukiS. I.HayashiT.UenishiH.DomukaiM.et al (2007). Fine mapping of a swine quantitative trait locus for number of vertebrae and analysis of an orphan nuclear receptor, germ cell nuclear factor (NR6A1).Genome Res.17586593. 10.1101/gr.6085507

  • 50

    MollerM. J.ChaudharyR.HellmenE.HöyheimB.ChowdharyB.AnderssonL. (1996). Pigs with the dominant white coat color phenotype carry a duplication of the KIT gene encoding the mast/stem cell growth factor receptor.Mamm. Genome7822830. 10.1007/s003359900244

  • 51

    MolotsiA. H.TaylorJ. F.CloeteS. W.MuchadeyiF.DeckerJ. E.WhitacreL. K.et al (2017). Genetic diversity and population structure of South African smallholder farmer sheep breeds determined using the OvineSNP50 beadchip.Trop. Anim. Health Prod.4917711777. 10.1007/s11250-017-1392-7

  • 52

    MoradiM. H.Nejati-JavaremiA.Moradi-ShahrbabakM.DoddsK. G.McEwanJ. C. (2012). Genomic scan of selective sweeps in thin and fat tail sheep breeds for identifying of candidate regions associated with fat deposition.BMC Genetics13:10. 10.1186/1471-2156-13-10

  • 53

    MuigaiA. (2003). Characterization and Conservation of Indigenous Animal Genetic Resources: The Fat-tailed and Thin-tailed Sheep of Africa.Ph.D. Thesis. Juja: Jomo Kenyatta University of Agriculture and Technology. 10.1007/s10437-013-9129-0

  • 54

    Nazari-GhadikolaeiA.Mehrabani-YeganehH.Miarei-AashtianiS. R.StaigerE. A.RashidiA.HusonH. J. (2018). Genome-wide association studies identify candidate genes for coat color and mohair traits in the Iranian Markhoz goat.Front. Genet.9:105. 10.3389/fgene.2018.00105

  • 55

    NielsenE. E.HansenM. M.LoeschckeV. (1999). Genetic variation in time and space: microsatellite analysis of extinct and extant populations of Atlantic salmon.Evolution53261268. 10.1111/j.1558-5646.1999.tb05351.x

  • 56

    PainterJ. N.WillemsenG.NyholtD.HoekstraC.DuffyD. L.HendersA. K. 9, et al. (2010). A genome wide linkage scan for dizygotic twinning in 525 families of mothers of dizygotic twins.Hum. Reprod.2515691580. 10.1093/humrep/deq084

  • 57

    PalsbøllP. J.BerubeM.AllendorfF. W. (2007). Identification of management units using population genetic data.Trends Ecol. Evol.221116. 10.1016/j.tree.2006.09.003

  • 58

    PickrellJ.PritchardJ. (2012). Inference of population splits and mixtures from genome-wide allele frequency data.PLoS Genet.8:e1002967. 10.1371/journal.pgen.1002967

  • 59

    PorterV. (2020). Mason’s World Dictionary of Livestock Breeds, Types and Varieties, 6th Edn. Wallingford: CABI Publishing. 10.1079/9781789241532.0000

  • 60

    PriceA. L.ZaitlenN. A.ReichD.PattersonN. (2010). New approaches to population stratification in genome-wide association studies.Nat. Rev. Genet.11459463. 10.1038/nrg2813

  • 61

    PritchardJ. K.PrzeworskiM. (2001). Linkage disequilibrium in humans: models and data.Am. J. Hum. Genet.69114. 10.1086/32127510.1086/519795

  • 62

    QuinlanA. R.HallI. M. (2010). BEDTools: a flexible suite of utilities for comparing genomic features.Bioinformatics26841842. 10.1093/bioinformatics/btq033

  • 63

    RubinC. J.MegensH. J.BarrioA. M.MaqboolK.SayyabS.SchwochowD.et al (2012). Strong signatures of selection in the domestic pig genome.Proc. Natl Acad. Sci. U.S.A.1091952919536. 10.1073/pnas.1217149109

  • 64

    RuppR.SeninP.SarryJ.AllainC.TascaC.LigatL.et al (2015). A point mutation in suppressor of cytokine signalling 2 (Socs2) increases the susceptibility to inflammation of the mammary gland while associated with higher body weight and size and higher milk production in a sheep model.PLoS Genet.11:e1005629. 10.1371/journal.pgen.1005629

  • 65

    SabetiP. C.VarillyP.FryB.LohmuellerJ.HostetterE.CotsapasC.et al (2007). Genome-wide detection and characterization of positive selection in human populations.Nature449913918. 10.1038/nature06250

  • 66

    SchwartzM. K.LuikartG.WaplesR. S. (2007). Genetic monitoring as a promising tool for conservation and management.Trends Ecol. Evol.222533. 10.1016/j.tree.2006.08.009

  • 67

    SiepielskiA. M.MorrisseyM. B.BuoroM.CarlsonS. M.CarusoC. M.CleggS. M.et al (2017). Precipitation drives global variation in natural selection.Science355959962. 10.1126/science.aag2773

  • 68

    SternR. (2010). “Hyaluronan and the process of aging in skin,” in Textbook of Aging Skin, edsFarageM. A.MillerK. W.MaibachH. I. (Berlin: Springer), 225238. 10.1007/978-3-540-89656-2_22

  • 69

    SternR.AsariA. A.SugaharaK. N. (2006). Hyaluronan fragments: an information-rich system.Eur. J. Cell Biol.85699715. 10.1016/j.ejcb.2006.05.009

  • 70

    SvedJ. A. (1971). Linkage disequilibrium and homozygosity of chromosome segments in finite populations.Theor. Popul. Biol.2125141. 10.1016/0040-5809(71)90011-6

  • 71

    TaberletP.ValentiniA.RezaeiH. R.NaderiS.PompanonF.NegriniR.et al (2008). Are cattle, sheep, and goats endangered species?Mol. Ecol.17275284. 10.1111/j.1365-294X.2007.03475.x

  • 72

    TammiR.RipellinoJ. A.MargolisR. U.MaibachH. I.TammiM. (1989). Hyaluronate accumulation in human epidermis treated with retinoic acid in skin organ culture.J. Invest. Dermatol.92326332. 10.1111/1523-1747.ep12277125

  • 73

    TarekegnG. M.KhayatzadehN.LiuB.OsamaS.HaileA.RischkowskyB.et al (2020). Ethiopian indigenous goats offer insights into past and recent demographic dynamics and local adaptation in sub-Saharan African goats.Evol. Appl.00, 116. 10.1111/eva.13118

  • 74

    TortereauF.ServinB.FrantzL.MegensH. J.MilanD.RohrerD.et al (2012). A high density recombination map of the pig reveals a correlation between sex-specific recombination and GC content.BMC Genom.13:586. 10.1186/1471-2164-13-586

  • 75

    TuhkanenA. L.TammiM.PelttariA.AgrenU. M.TammiR. (1998). Ultrastructural analysis of human epidermal CD44 reveals preferential distribution on plasma membrane domains facing the hyaluronan-rich matrix pouches.J. Histochem. Cytochem.46241248. 10.1177/002215549804600213

  • 76

    VerschurenD. (2007). “Decadal and century-scale climate variability in tropical Africa during the past 2000 years,” in Past Climate Variability Through Europe and Africa, edsBattarbeeR. W.GasseF.StickleyC. E. (Dordrecht, the Netherlands: Kluwer Academic Publishers), 139158.

  • 77

    VinetA.DrouilhetL.BodinL.MulsantP.FabreS.PhocasF. (2012). Genetic control of multiple births in low ovulating mammalian species.Mamm. Genome23727740. 10.1007/s00335-012-9412-4

  • 78

    WaplesR. S.GaggiottiO. (2006). What is a population? An empirical evaluation of some genetic methods for identifying the number of gene pools and their degree of connectivity.Mol. Ecol.1514191439. 10.1111/j.1365-294X.2006.02890.x

  • 79

    WeirB. S.CockerhamC. C. (1984). Estimating F-statistics for the analysis of population structure.Evolution3813581370. 10.1111/j.1558-5646.1984.tb05657.x

  • 80

    YangY.ZhouQ. J.ChenX. Q.YanB. L.GuoX. L.ZhangH. L.et al (2015). Profiling of differentially expressed genes in sheep T lymphocytes response to an artificial primary Haemonchus contortus infection.Parasit. Vectors8:235. 10.1186/s13071-015-0844-z10.1111/asj.1246810.1093/molbev/msx181

  • 81

    ZhiD.DaL.LiuM.ChengC.ZhangY.WangX.et al (2018). Whole genome sequencing of Hulunbuir short-tailed sheep for identifying candidate genes related to the short-tail phenotype.G38377383. 10.1534/g3.117.300307

Summary

Keywords

adaptation, climate change, genetic diversity, selection signatures, SNP genotypes

Citation

Abied A, Ahbara AM, Berihulay H, Xu L, Islam R, El-Hag FM, Rekik M, Haile A, Han J-L, Ma Y, Zhao Q and Mwacharo JM (2021) Genome Divergence and Dynamics in the Thin-Tailed Desert Sheep From Sudan. Front. Genet. 12:659507. doi: 10.3389/fgene.2021.659507

Received

27 January 2021

Accepted

25 June 2021

Published

19 July 2021

Volume

12 - 2021

Edited by

Xianyao Li, Shandong Agricultural University, China

Reviewed by

Maulik Upadhyay, Ludwig Maximilian University of Munich, Germany; Ahmed Rady A. E. Elbeltagy, Animal Production Research Institute, Egypt

Updates

Copyright

*Correspondence: Qianjun Zhao, Joram M. Mwacharo,

This article was submitted to Livestock Genomics, a section of the journal Frontiers in Genetics

Disclaimer

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.

Outline

Figures

Cite article

Copy to clipboard


Export citation file


Share article

Article metrics