Detection of Selection Signatures Among Brazilian, Sri Lankan, and Egyptian Chicken Populations Under Different Environmental Conditions

Extreme environmental conditions are a major challenge in livestock production. Changes in climate, particularly those that contribute to weather extremes like drought or excessive humidity, may result in reduced performance and reproduction and could compromise the animal’s immune function. Animal survival under extreme environmental conditions could reflect responses to natural selection and to artificial selection for production traits, which together may leave selection signatures in the genome over time. The aim of this study was to identify selection signatures that may be involved in the adaptation of indigenous chickens from two different climatic regions (Sri Lanka = Tropical; Egypt = Arid) and in non-indigenous chickens that derived from human migration events to the generally tropical State of São Paulo, Brazil. To do so, fixation index (Fst) and hapFLK analyses were conducted. Chickens from Brazil (n = 156), Sri Lanka (n = 92), and Egypt (n = 96) were genotyped using the Affymetrix Axiom® 600k Chicken Genotyping Array. Pairwise Fst analyses among countries did not detect major regions of divergence between chickens from Sri Lanka and Brazil, with ecotypes/breeds from Brazil appearing to be genetically related to Asian-Indian (Sri Lanka) ecotypes. However, several differences were detected in comparisons of Egyptian with either Sri Lankan or Brazilian populations, and common regions of difference on chromosomes 2, 3, and 8 were detected. The hapFLK analyses for the three separate countries suggested unique regions that are potentially under selection on chromosome 1 for all three countries, on chromosome 4 for Sri Lankan, and on chromosomes 3, 5, and 11 for the Egyptian populations. Some of the regions identified as under selection by the hapFLK analyses contained genes such as TLR3, SOCS2, EOMES, and NFAT5, whose biological functions could provide insights into adaptation mechanisms in response to arid and tropical environments.


INTRODUCTION
Extreme environmental conditions are a major challenge in livestock production. Changes in climate, particularly those that contribute to weather extremes like drought or extreme temperatures or humidity, may result in reduced performance and reproduction and could compromise the animal's immune function (St-Pierre et al., 2003). In chickens, extreme environmental temperatures lead to the generation of reactive oxygen species (ROS), causing oxidative stress and lipid peroxidation (Altan et al., 2003). However, chickens, particularly the local (indigenous) breeds, often adapt over time to tolerate extremely challenging environments. Local chicken populations are typically kept with limited management and veterinary services but are considered important genetic resources. They are reported to have been derived after many hundreds of years of successful adaptation to extreme environments (Hall and Bradley, 1995). In Egypt, there is undisputed evidence that chickens (domestic fowl) have been kept since 1840 B.C. (Coltherd, 1966), and Egypt was a major entry point of Indian chickens to the African continent (Eltanany and Hemeda, 2016; Osman et al., 2016). Egyptian local breeds are generally classified into three groups: the first group comprises the native breeds such as Fayoumi and Dandarawi, the second group includes the Baladi and Sinai strains, and the third group results from crosses between exotic and local strains accompanied by selection for various traits (Osman et al., 2016). The native/local breeds/ecotypes have been kept as backyard or free-range chickens and could have developed adaptation mechanisms to their respective climates. In spite of these successful adaptations, there is limited knowledge about the genomic regions involved in the adaptation of local village chickens to their specific environmental conditions.
There is also uncertainty about whether the geographical locations of local chicken populations could be the cause of their genetic differentiation (Mahammi et al., 2016). Domestication by humans and subsequent breed formation have led to chickens being adapted in physiology, morphology, fertility, and behavior to increase production (Ericsson et al., 2014). Selection pressure, natural or artificial, has been influential in enabling chickens to adapt to their environments and may leave signatures of selection in chicken population genomes. Signatures of selection, or selective sweeps as they are sometimes called, are particular patterns of DNA that are identified in regions of the genome that carry a mutation or have been under selection pressure in a population (Qanbari and Simianer, 2014). Whenever there is positive selection for a particular allele, such regions exhibit more extensive homozygosity than expected under Hardy-Weinberg equilibrium. These regions may contain genes with functional importance in particular processes and reflect allelic selection under differing environmental conditions.
There are many methods for detecting selection signatures in the genome. These methods are classified into intra-population and inter-population statistics. Inter-population statistical analyses can be categorized into single site or haplotype differentiation analyses (Qanbari and Simianer, 2014). To detect regions of divergence or similarity, most studies have used the single site differentiation statistic commonly known as the Fixation Index, Fst (Elferink et al., 2012; Gholami et al., 2015; Fleming et al., 2017), and hapFLK (Gholami et al., 2015) analyses to detect selection signatures in both commercial and noncommercial breeds. Inter-population statistics are reported to have more statistical power to detect selection signatures in recently diverged populations (Yi et al., 2010). The major concern with Fst is that it assumes the populations have the same effective population size and are derived independently from one ancestral population (Price et al., 2010). HapFLK is a method that extends the FLK statistic and accounts for both the hierarchical structure and haplotype information; its use greatly improves detection power, and it can detect signatures of selection that may be occurring across several populations.
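For orientation, one common textbook formulation of the fixation index at a single biallelic locus is sketched below. It is an illustrative definition only and is not necessarily the estimator computed in this study; the symbols $H_T$, $H_S$, $p_i$, $\bar{p}$, and $n$ are introduced here for the sketch, with $p_i$ the allele frequency in population $i$ and $\bar{p}$ the unweighted mean across the $n$ populations.

```latex
F_{ST} = \frac{H_T - H_S}{H_T},
\qquad
H_T = 2\,\bar{p}\,(1-\bar{p}),
\qquad
H_S = \frac{1}{n}\sum_{i=1}^{n} 2\,p_i\,(1-p_i)
```

Under this formulation, Fst is 0 when all populations share the same allele frequency and approaches 1 as populations become fixed for different alleles.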
In this study we applied both Fst and hapFLK statistical analyses on indigenous chicken breed/ecotype populations from three countries that have different climates [Brazil and Sri Lanka = Tropical, and Egypt = Arid] for regions where selection may have taken place and shaped the genome to enable the chickens to adapt to different environments.

Genotyping and Quality Control
Genotyping for all samples was conducted at GeneSeek (Lincoln, NE, United States) using the Affymetrix Axiom® 600k Array. Quality filtering of the SNP (single nucleotide polymorphism) genotype data was performed with PLINK 1.9 software (Chang et al., 2015), and only autosomal SNPs with a call rate >90% (--geno 0.1) and a minor allele frequency (MAF) > 0.02 were retained. In total, 523,186 SNPs were utilized for downstream analysis.
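The two per-SNP filters described above can be sketched in a few lines of Python. This is a minimal illustration and not the PLINK implementation: the function names, the encoding of genotypes as 0/1/2 allele counts with `None` for a missing call, and the toy SNPs are all assumptions made for the example.

```python
def call_rate(genotypes):
    """Fraction of samples with a non-missing call (None marks a missing call)."""
    called = [g for g in genotypes if g is not None]
    return len(called) / len(genotypes)


def minor_allele_frequency(genotypes):
    """MAF from 0/1/2 allele counts, ignoring missing calls."""
    called = [g for g in genotypes if g is not None]
    if not called:
        return 0.0
    freq = sum(called) / (2 * len(called))
    return min(freq, 1.0 - freq)


def passes_qc(genotypes, max_missing=0.1, min_maf=0.02):
    """Keep a SNP when its missing rate is at most max_missing and its
    MAF is at least min_maf (inclusive bounds are an assumption here)."""
    missing_rate = 1.0 - call_rate(genotypes)
    return missing_rate <= max_missing and minor_allele_frequency(genotypes) >= min_maf


# Toy data (hypothetical, not from the study): ten samples per SNP.
toy_snps = {
    "snp_a": [0, 1, 2, 1, 0, 1, 2, 0, 1, 1],        # complete and polymorphic
    "snp_b": [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],        # monomorphic, fails MAF
    "snp_c": [1, None, None, 2, 0, 1, 1, 0, 2, 1],  # 20% missing, fails call rate
}
kept = [name for name, calls in toy_snps.items() if passes_qc(calls)]
print(kept)
```

In practice the filtering would be run directly in PLINK on the full genotype files; the sketch only makes the two thresholds explicit.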

Population Stratification Analyses
Multi-dimensional scaling (MDS) was performed to examine population structure and stratification in two dimensions using the clustering algorithm in PLINK v1.9 (Chang et al., 2015). Shared ancestry, with no prior knowledge of the origin of the breeds, was explored using the Admixture software (Alexander et al., 2009) for K-values ranging from 1 to 12, where K is the number of expected subpopulations. The optimum value, K = 10, was determined based on the lowest cross-validation error.
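Choosing K by the lowest cross-validation error can be sketched as follows. The example assumes that the cross-validation errors are available as log lines of the form `CV error (K=3): 0.57`; the helper name `best_k` and the numbers in the toy log are hypothetical and are not results from this study.

```python
import re

# Pattern for lines such as "CV error (K=3): 0.57".
CV_LINE = re.compile(r"CV error \(K=(\d+)\): ([0-9.]+)")


def best_k(log_text):
    """Return (K with the lowest CV error, {K: error}) parsed from log text."""
    errors = {int(k): float(err) for k, err in CV_LINE.findall(log_text)}
    if not errors:
        raise ValueError("no cross-validation lines found")
    return min(errors, key=errors.get), errors


# Toy log with made-up values, for illustration only.
toy_log = """CV error (K=1): 0.60
CV error (K=2): 0.55
CV error (K=3): 0.57
"""
k_opt, cv_errors = best_k(toy_log)
print(k_opt, cv_errors)
```

With one log per K, the same function can be applied to the concatenated text of all runs.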

Fst Analyses
The Fst statistic is a widely used approach and was applied to determine genetic differentiation between populations (Barreiro et al., 2008; Bonhomme et al., 2010; Fariello et al., 2013). Three pairwise comparisons were performed in PLINK v1.9 (Purcell et al., 2007) for Brazil vs. Egypt, Sri Lanka vs. Egypt, and Brazil vs. Sri Lanka ecotypes to identify any genomic regions under increasing differentiation using an overlapping sliding window approach. The populations were designated as a case or control category based on the hypothesized proxy climatic phenotype of tropical (Brazil and Sri Lanka) vs. arid (Egypt) climatic conditions. For each comparison, the mean Fst (mFst) value was calculated in 100 kb sliding windows with a step size of 50 kb (i.e., 50% overlap between adjacent windows) using an in-house script (Karlsson et al., 2007). Genomic regions with the highest peaks, defined as the top 0.2% of the empirical distribution of the mFst values, were considered for downstream analyses.
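The windowing step can be illustrated with a short Python sketch. This is not the in-house script used in the study: the function names, the choice to start windows at position 0, the half-open window boundaries, and the toy SNP positions and Fst values are assumptions for the example.

```python
def window_means(positions, fst_values, window=100_000, step=50_000):
    """Mean Fst per sliding window on one chromosome.

    Returns a list of (start, end, mean_fst, n_snps) for non-empty windows,
    where each window covers the half-open interval [start, end).
    """
    if not positions:
        return []
    results = []
    last = max(positions)
    start = 0
    while start <= last:
        end = start + window
        values = [f for p, f in zip(positions, fst_values) if start <= p < end]
        if values:
            results.append((start, end, sum(values) / len(values), len(values)))
        start += step
    return results


def top_fraction(windows, fraction=0.002):
    """Windows in the top `fraction` of the empirical mFst distribution
    (at least one window is always returned)."""
    n_keep = max(1, int(len(windows) * fraction))
    return sorted(windows, key=lambda w: w[2], reverse=True)[:n_keep]


# Toy per-SNP values (hypothetical): base-pair positions and single-SNP Fst.
toy_positions = [10_000, 60_000, 120_000, 180_000]
toy_fst = [0.1, 0.3, 0.5, 0.1]
toy_windows = window_means(toy_positions, toy_fst)
toy_peaks = top_fraction(toy_windows)
print(toy_windows)
print(toy_peaks)
```

The list comprehension rescans all SNPs for every window, which keeps the sketch short; for a 600k array a sorted index or a vectorized approach would be more efficient.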

HapFLK Analyses
The hapFLK statistic accounts for varying effective population sizes and for the haplotype structure of the populations using a multipoint linkage disequilibrium model (Scheet and Stephens, 2006; Bonhomme et al., 2010; Fariello et al., 2013). This approach was used to identify possible regions under selection across chicken breeds/ecotypes within each country. It required estimation of a neighbor joining tree and a kinship matrix based on the matrix of Reynolds' genetic distances between ecotypes/breeds (Bonhomme et al., 2010). A phylogenetic tree was constructed among the populations from the three countries: Sri Lanka (KR, UPA, and GN), Brazil (Sedosa, Cochinchina, Ketros Oceania, Suri, Backyard Giant Indian, Shamo, Brahma, Backyard, Bantham, Brazilian Musician, Bakiva), and Egypt [Baladi (Bal), Fayoumi (Fay), and Dandarawi (Dan)]. To identify any regions under selection, analyses were performed separately across breeds/ecotypes within each climatic region (country). The number of haplotype clusters per chromosome was determined in fastPHASE using cross-validation based estimation and was set at 15 (Scheet and Stephens, 2006). The hapFLK values were generated for each SNP, and P-values were computed from a chi-square distribution with a Python script that is provided on the hapFLK webpage. A q-value threshold of 0.05 was applied to limit the number of false positives.
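The distance matrix that underlies the tree and kinship estimation can be illustrated with a simplified Python sketch. It uses a commonly cited large-sample approximation of Reynolds' distance for biallelic SNPs, D = -ln(1 - theta), and omits sample-size corrections; it is an assumption-laden illustration, not the hapFLK implementation. The function names, population labels, and allele frequencies are hypothetical.

```python
import math


def reynolds_distance(freqs_a, freqs_b):
    """Approximate Reynolds' distance between two populations.

    freqs_a and freqs_b hold reference-allele frequencies at the same
    biallelic loci. No sample-size correction is applied (assumption).
    """
    numerator = 0.0
    denominator = 0.0
    for p_a, p_b in zip(freqs_a, freqs_b):
        numerator += (p_a - p_b) ** 2
        denominator += 1.0 - p_a * p_b - (1.0 - p_a) * (1.0 - p_b)
    if denominator == 0.0:
        return 0.0  # every locus is fixed for the same allele in both populations
    theta = numerator / denominator
    if theta >= 1.0:
        return math.inf  # populations fixed for different alleles at every locus
    return abs(math.log(1.0 - theta))  # equals -ln(1 - theta) for 0 <= theta < 1


def distance_matrix(pop_freqs):
    """Pairwise distances as a nested dict keyed by population label."""
    return {
        a: {b: reynolds_distance(fa, fb) for b, fb in pop_freqs.items()}
        for a, fa in pop_freqs.items()
    }


# Toy allele frequencies at two loci (hypothetical populations and values).
toy_freqs = {
    "pop_a": [0.6, 0.2],
    "pop_b": [0.4, 0.2],
    "pop_c": [0.6, 0.2],
}
toy_matrix = distance_matrix(toy_freqs)
print(toy_matrix["pop_a"]["pop_b"])
```

A matrix of this kind is the input from which a neighbor joining tree can then be built with standard phylogenetic software.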

Gene Annotation
Gene annotation of the identified regions under possible selection was completed using NCBI's Genome Data Viewer (https://www.ncbi.nlm.nih.gov/genome/gdv/) on the chicken genome version Gallus gallus 5.

Population Stratification
The MDS plot in Figure 1 shows distinct separation among ecotypes from the three countries and separation of Brazilian and Sri Lankan ecotypes from the Egyptian ecotypes. The Brazilian breeds Cochinchina and Brahma (black circled) and Sedosa (red circled) are separated from the rest of the Brazilian breeds/ecotypes but are closer to the Sri Lankan ecotypes. The admixture analysis based on the SNP genotype calls showed evidence of shared ancestry among breeds/ecotypes within each country and limited shared ancestry across countries (Figure 2). Although the Brazilian breeds/ecotypes were sampled from one location, admixture results revealed limited crossover among breeds/ecotypes. The phylogenetic tree based on Reynolds' distances with all the SNPs that passed quality control is shown in Figure 3. Here, the Sri Lankan ecotypes were separated from Egyptian breeds, and some Brazilian breeds/ecotypes grouped in sub-trees. This is consistent with the MDS plot. The Brazilian breeds Cochinchina and Brahma, which are historically known to originate from Asia, are grouped in one sub-tree with the Sri Lankan ecotypes.

Fst Analyses
The Fst analyses for the comparisons of Brazil or Sri Lanka vs. Egypt generally indicated the strongest peaks on chromosomes 2, 3, and 8 (mFst > 0.28) (Figure 4).

Genes Under Putative Selection Within Egyptian, Sri Lankan, and Brazilian Populations
The hapFLK statistic is an extension of FLK that accounts for the haplotype information and hierarchical structure (Servin et al., 2013) and greatly improves the power to detect selection signatures that may be occurring across several populations. HapFLK analyses revealed significant unique selection signals within Sri Lankan, Egyptian, and Brazilian chicken populations. Eight significant regions on chromosomes 1 (1.71-2.72 Mb; 43.05-46.79 Mb), 2 (38.74-38.96 Mb), 3 (102.39-103.09 Mb), 4 (71.24-71.34 Mb), 5 (28.61-29.14 Mb), 10 (14.06-14.09 Mb), and 11 (18.79-20.20 Mb) were detected as strong selection signatures across the Egyptian breeds (Figure 5A). Multiple genes were identified within the regions under selection, and a majority of them, such as Suppressor of cytokine signaling 2 (SOCS2), Eomesodermin (EOMES), and Nuclear factor of activated T-cells 5 (NFAT5), are involved in the immune system (Tables 1, 2); to date, however, there are no annotated genes within the regions on chromosomes 4 and 10. Two regions with strong selection signals were detected on chromosomes 1 (34.44-34.53 Mb) and 4 (61.18-62.15 Mb) across the Sri Lankan chicken ecotypes (Figure 5B). One gene was identified within the chromosome 1 region, while 18 genes, including genes involved in the immune system such as Toll like receptor 3 (TLR3) and Nuclear factor kappa B subunit 1 (NFKB1), were identified within the chromosome 4 selection region (Tables 3, 4). In addition to immune response genes, hapFLK analyses revealed genes associated with production traits in the regions under selection across the Egypt and Sri Lanka chicken populations. Genes such as SNRPF, MRPL42, and ACSF3 on chromosomes 1 and 11 (Table 2) were identified across the Egypt populations, whilst MTNR1A and CYP4V2 on chromosome 4 (Table 4) were identified across the Sri Lanka populations.
There were no strong selection signals across the eleven Brazilian breeds/ecotypes, but two regions with strong signals were detected on chromosomes 1 and 14 across the two Brazilian breeds with Asian ancestry, Cochinchina and Brahma (Figure 5C). Three genes were identified within the selection signature region on chromosome 1, and there were no annotated genes within the chromosome 14 region (Tables 5, 6). No selection signals were detected across the remaining nine Brazilian breeds/ecotypes (results not shown). None of the selection signature regions detected by hapFLK in the populations of any country (Egypt, Sri Lanka, or Brazil) overlapped with regions detected by the Fst analyses.

DISCUSSION
The admixture of populations in the three countries indicates mixed genetic backgrounds of the chickens (Figure 2). The overlap across ecotypes/breeds within individual countries could be due to unrestricted inter-mating among chickens of different genetic backgrounds, resulting in chickens with ancestors from different groups that eventually contribute to the shared ancestry. Another factor that might contribute to the admixture within and across the respective countries is the movement of birds through trading. It is surprising that, although the Brazilian chickens were sampled from one location (Porto Ferreira), the Brazil population showed more admixture and more discrete breeds than the Egypt and Sri Lanka populations. Moreover, the Brazilian breeds/ecotypes clustered closer to the Sri Lankan ecotypes (Figures 1, 3). This is, however, not surprising because chickens in Brazil are not indigenous and are reported to have been imported from Asia (Komiyama et al., 2004). The population tree based on Reynolds' genetic distances complements the stratification shown by the MDS plot and the admixture of the populations. The Egyptian breeds are within their own sub-tree and appear to have some shared ancestry with some Asian breeds, as revealed by the admixture plot. The indication of shared ancestry is in agreement with previous findings which reported that Egyptian local/native breeds/ecotypes originated from Asia or the Indian sub-continent (Elferink et al., 2012; Elkhaiat et al., 2014; Eltanany and Hemeda, 2016). The MDS results allowed the analyses to be performed on a case/control basis, with the environmental/climatic conditions of the three countries as the proxy phenotype, so that the results can be viewed as regions of the genome under possible selection for environmental tolerance/adaptation in the local chicken populations of each of the three countries.
The Fst results indicated possible selection signatures on chromosomes 2 and 8 for the Brazil vs. Egypt comparison and on chromosome 3 for the Sri Lanka vs. Egypt comparison, as well as common regions of difference between the arid (Egypt) and tropical (Sri Lanka and Brazil) populations. The two genes detected in regions for the Brazil vs. Egypt comparison, TRMT1L and MicroRNA 6545, could suggest chicken adaptation and survival in hot conditions. TRMT1L-catalyzed tRNA modification is required for redox homeostasis to ensure proper cellular proliferation and oxidative stress survival. Cells that are deficient in TRMT1L exhibit a decrease in proliferation rates, alterations in protein synthesis, and perturbation of redox homeostasis, including hypersensitivity to oxidizing agents (Dewe et al., 2017). The second gene, MicroRNA 6545, is reported to be involved in reproductive processes and embryogenesis, including TGF-β and Wnt signaling that specifies the neural fate of the blastodermal cells (Shao et al., 2012). For the Sri Lanka vs. Egypt comparison, a gene that could be important in immune response, HS3ST5, was detected. HS3ST5 is involved in immunity and defense molecular functions (Szauter et al., 2011). Although we did not detect annotated genes in the regions common to the two comparisons of chickens from Brazil or Sri Lanka vs. Egypt, these regions could represent recent, important selection signatures that enable chicken survival in either tropical or arid conditions. The genomic regions common to chickens from Sri Lanka and Brazil when compared to Egypt could indicate that chickens from Sri Lanka and those from Porto Ferreira (Brazil) have been exposed to the same environmental conditions and may have evolved similar selection signatures for adaptation and survival.
The identification of genomic regions that may be under both artificial and natural selection could help identify possible selection signatures across breeds/ecotypes within a country. Several genomic regions under putative selection were identified in the current study using the hapFLK method across Egyptian breeds and Sri Lankan ecotypes. The hapFLK analyses identified several regions under selection on chromosomes 1, 2, 3, 4, 5, 10, and 11 across the three Egyptian breeds: Fayoumi, Dandarawi, and Baladi (Figure 5A and Table 1). Some genes detected in the genomic regions under selection across the Egyptian chickens are reported to be involved in the modulation of growth (Bolamperti et al., 2013) and the immune system (Szczesny et al., 2014; Zhang et al., 2018), and others could possibly be important in thermal/heat tolerance. These genes could be relevant in the adaptation of the Egyptian chickens to hot, arid conditions. One notable gene in a region under selection on chromosome 2 is SOCS2. Suppressor of cytokine signaling (SOCS) proteins generally play vital roles in the feedback inhibition of cytokine receptor signaling (Larsen and Röpke, 2002). SOCS2 encodes a multifunctional protein that is involved in growth hormone signaling through cytokine-dependent pathways and the JAK/STAT pathway (Metcalf et al., 2000; Rico-Bautista et al., 2006). This gene is important in the regulation of several biological processes that control growth, development, immune function, and homeostasis (Rico-Bautista et al., 2006), and it has been hypothesized to have an effect on breast meat yield during heat stress (Van Goor et al., 2015). The region under selection on chromosome 2 contains two genes, one of which, EOMES, is also important in the immune system.
EOMES is one of the two T-box proteins expressed in the immune system that are responsible for driving the differentiation and function of cytotoxic innate lymphocytes such as natural killer (NK) cells. NK cells are endowed with cytotoxic properties and contribute to the early defense against pathogens and immunosurveillance of tumors (Zhang et al., 2018). The region under selection on chromosome 11 contains 66 annotated genes, some of which are involved in immune response. One of these genes, NFAT5, is required for TLR-induced responses to pathogens, and previous studies have shown that TLR-induced, NFAT5-regulated genes such as TNF-α play a vital role in inflammatory responses (Buxadé et al., 2012; Tellechea et al., 2017). We have reported only a few genes and their associated roles/functions for the regions under selection across the Egyptian breeds. Most of the genes in these regions on the different chromosomes (1, 2, 3, 5, and 11) could play vital roles in the adaptation mechanisms that enable the survival of the Egyptian chicken breeds in hot, arid climatic conditions. Although we did not detect any annotated genes in the regions under selection on chromosomes 4 and 10, it is important to note that these could be recent selection signatures reflecting adaptation of the Egyptian breeds to their climate. Parallel studies have shown that domesticated animals often develop physiological and genetic adaptations when they encounter harsh or new environments such as hypoxia (Ramirez et al., 2007; Storz et al., 2010). A study conducted on Tibetan chickens, which primarily live at high altitudes of between 2,200 and 4,100 m, revealed several candidate genes involved in the calcium signaling pathway that possibly enable them to adapt to hypoxia (Wang et al., 2015). There were two regions under selection on chromosomes 1 and 4 across the Sri Lanka ecotypes.
As in the Egyptian breeds, the region under selection on chromosome 4 of the Sri Lanka ecotypes contains several genes, and two of them, Toll like receptor 3 (TLR3) and Nuclear factor kappa B subunit 1 (NFKB1), are important in the immune system. The TLR signaling pathway is an innate immune defense mechanism against pathogen attack in both vertebrates and invertebrates. TLR3 in chickens is orthologous to its mammalian counterpart (Kannaki et al., 2010), and together with TLR7 it is known to recognize pathogen associated molecular patterns (PAMPs) encoded by RNA viruses (Akira, 2001). TLR3 is able to recognize and bind to double-stranded RNA intermediates that are produced during viral replication (Iqbal et al., 2005), and the end product of its signaling pathway is the production of antiviral type I interferon (IFN)-α and -β (Guillot et al., 2005). Another important gene, NFKB1, could also be of importance to the survival of Sri Lanka chicken ecotypes in the hot, humid climatic conditions of Sri Lanka. NFKB transcription factors are important in immunity and inflammation (Hayden and Ghosh, 2008). TLRs are activated by binding to PAMPs, which in turn initiates MAPK- or nuclear factor kappa B (NFKB)-dependent cascades that lead to a proinflammatory response, resulting in the secretion of antibacterial substances, such as β-defensins and cytokines (Kogut et al., 2006). NFKB proteins are also involved in a wide range of processes, including cell development, growth, survival, and proliferation, and in many pathological conditions (Morgan and Liu, 2011). Sri Lanka has hot, humid climatic conditions that, besides being favorable for infection of livestock by pathogens, also present challenges like heat stress, especially during a drought, to which the animal must adapt. Challenges like heat stress result in the production of ROS, which are generated by a variety of cellular processes.
NFKB-regulated genes are vital in regulating the amount of ROS in cells (Morgan and Liu, 2011). The ROS have several stimulatory and inhibitory roles in NFKB signaling.
Chicken survival in challenging environments involves different adaptation mechanisms, among which is the ability to perform under harsh conditions. The current study indicated selection signatures containing genes associated with production traits in both the Egypt and Sri Lanka populations. For the Egypt populations, we identified MRPL42, which is a candidate gene associated with breast yield in heat-stressed chickens. The MRPL42 gene is vital in DNA synthesis, transcription, RNA processing, and translation (Van Goor et al., 2015). Another gene, ACSF3, belonging to the ACSF gene family, is reported to be correlated with egg laying performance in chickens (Tian et al., 2018). For the Sri Lanka chicken populations, the CYP4V2 gene, associated with control of fat deposition in chickens, was identified in the region under selection on chromosome 4 (Claire D'Andre et al., 2013). Because local chickens are mostly free range and exposed to hot, humid conditions in developing countries such as Sri Lanka, it could be vital for chickens to control the deposition of fat as an adaptation mechanism.
There were no regions of selection across all eleven Brazilian breeds/ecotypes, but we detected possible regions of selection on chromosomes 1 and 14 across two breeds known to have Asian ancestry, Cochinchina and Brahma. However, these regions did not overlap with regions under selection across the Asian Sri Lankan ecotypes. This could be because chickens were introduced to Brazil from Asia a few hundred years ago, and possibly because of the differences in climatic conditions between Porto Ferreira, São Paulo, and Sri Lanka. The chicken genomes from these locations could have been modified to enable chicken adaptation and survival in the respective changing climates.
There is clear evidence that chickens, particularly the domestic fowl, have been kept in Egypt for thousands of years, dating back to 1840 B.C. (Coltherd, 1966). For traditional breeds such as Fayoumi and Dandarawi, studies based on mitochondrial DNA (mtDNA) sequence variation have shown that these Egyptian indigenous breeds could have roots in the Indian subcontinent and southwest Asia (Elkhaiat et al., 2014; Eltanany and Hemeda, 2016), because Egypt was an entry route of Indian chickens to Africa. In spite of the fact that Egyptian chicken breeds might have an Asian origin, none of the regions under selection was shared between the Egyptian breeds and the Sri Lanka ecotypes. Asian chicken breeds could have been imported to Egypt thousands of years ago, and because of the difference in climatic conditions (hot arid for Egypt and hot humid for Sri Lanka), chickens in the two countries developed different adaptation mechanisms to survive in their respective climates.
The two methods, Fst and hapFLK, did not detect any overlapping regions, and we noted that hapFLK detected more selection signals, containing several important genes, than Fst. The hapFLK approach has been reported by previous simulation studies to greatly increase the power to detect selection signatures occurring across several populations (Bonhomme et al., 2010; Fariello et al., 2013). Because of this, we were able to detect several regions under selection within the Egypt and Sri Lanka populations with hapFLK that were not detected by the Fst analyses. HapFLK considers the hierarchical structure of the population, and this improves the detection power for soft sweeps.

CONCLUSION
There is evidence of stratification and admixture, particularly among breeds/ecotypes within each country's populations. The Fst differences between the Sri Lanka and Egypt populations could indicate differences in chicken adaptation due to the different climatic conditions in the two countries. The low Fst values between Sri Lanka and Brazil could possibly be due to shared ancestry of Asian origin from a few hundred years ago rather than to climate. This might change with continuing changes in climatic conditions, as local Brazilian chickens from the Porto Ferreira, São Paulo region might develop genome modifications to adapt to the climate. For the hapFLK analyses, there were no common regions under selection among breeds/ecotypes across the populations from the three countries. This could indicate climate-specific selection signals that have enabled those chickens to develop adaptation mechanisms in response to their respective climatic conditions. In that regard, Sri Lanka and Egypt chicken ecotypes/breeds may have developed mechanisms to survive in their hot humid and hot dry climates, respectively.