Breed Ancestry, Divergence, Admixture, and Selection Patterns of the Simbra Crossbreed

In this study, we evaluated an admixed South African Simbra crossbred population, as well as the Brahman (Indicine) and Simmental (Taurine) ancestor populations to understand their genetic architecture and detect genomic regions showing signatures of selection. Animals were genotyped using the Illumina BovineLD v2 BeadChip (7K). Genomic structure analysis confirmed that the South African Simbra cattle have an admixed genome, composed of 5/8 Taurine and 3/8 Indicine, ensuring that the Simbra genome maintains favorable traits from both breeds. Genomic regions that have been targeted by selection were detected using the linkage disequilibrium-based methods iHS and Rsb. These analyses identified 10 candidate regions that are potentially under strong positive selection, containing genes implicated in cattle health and production (e.g., TRIM63, KCNA10, NCAM1, SMIM5, MIER3, and SLC24A4). These adaptive alleles likely contribute to the biological and cellular functions determining phenotype in the Simbra hybrid cattle breed. Our data suggested that these alleles were introgressed from the breed's original indicine and taurine ancestors. The Simbra breed thus possesses derived parental alleles that combine the superior traits of the founder Brahman and Simmental breeds. These regions and genes might represent good targets for ad-hoc physiological studies, selection of breeding material and eventually even gene editing, for improved traits in modern cattle breeds. This study represents an important step toward developing and improving strategies for selection and population breeding to ultimately contribute meaningfully to the beef production industry.


INTRODUCTION
Cattle play an important part in the agricultural economy worldwide. Modern cattle were derived from at least two independent domestication events that gave rise to two subspecies of cattle (Loftus et al., 1994; Ajmone-Marsan et al., 2010). One is the humpless Taurine (Bos taurus taurus) cattle, with Bos primigenius primigenius ancestry, which was domesticated ∼10,500 years ago in Eastern Europe. The other is the humped zebu or Indicine (Bos taurus indicus) cattle, with Bos primigenius namadicus ancestry, which was domesticated ∼7,000 years ago in India (Bradley et al., 1996). Domestication of cattle resulted in animals with high overall genetic and phenotypic variability (Taberlet et al., 2008).
The rise of the "breed" concept, and associated intensive artificial selection, has resulted in specialized cattle breeds that underwent further organized selection to enhance production and adaptability (Iso-Touru et al., 2016). Taurine breeds have been intensively selected for milk and meat yield (Low et al., 2020). For example, selection for traits associated with meat production (e.g., fast growth, carcass quality, meat quality, and meat yield) and increased fertility gave rise to Simmental, which is the oldest and one of the most widespread Taurine beef breeds (Bordbar et al., 2020; Ríos-Utrera et al., 2020). In contrast, selection for high tolerance to parasites, heat resistance and overall hardiness gave rise to Indicine breeds, such as Brahman, the first beef cattle breed developed in the United States (Dikmen et al., 2018).
Various crossbreeds have also been developed to improve environmental adaptability and desirable performance (Paim et al., 2020). These cattle breeds combine the favorable traits/genes that characterize their purebred parental breeds. An added benefit inherent to crossbreeding is heterosis or hybrid vigor, which may give rise to qualities that are superior in the crossbreed compared with its parental inbred lines (Harrison and Larson, 2014; Frankham, 2015; Gouws, 2017). Furthermore, crossbreeding remains an important mechanism for increasing the overall genetic variation of modern cattle breeds (Kristensen et al., 2015), especially given the substantial losses incurred due to intensive selection for improved productivity and adaptability (Albertí et al., 2008; Taberlet et al., 2008). However, despite these benefits, it is still unclear whether the genetic composition of a crossbreed is stable over time (Paim et al., 2020). It is also not known whether crossbreeding may cause a reduction in performance and fitness due to genetic erosion and outbreeding depression (Harrison and Larson, 2014; Frankham, 2015; Gouws, 2017). Genetic erosion may cause a reduction in performance since genetic diversity is necessary for evolution to occur, while loss of genetic diversity is related to inbreeding that reduces reproductive fitness (Reed and Frankham, 2003).
The Simbra crossbreed was developed in the United States in the late 1960s, shortly after the first Simmental arrived from Europe (Gouws, 2016). It has been described as the "all-purpose American breed" and was developed by hybridization of the Brahman and Simmental breeds (Gouws, 2016). Generally, crossbreeding of Brahman with Taurine breeds produces hardy animals with better meat quality than purebred Brahmans (Crouse et al., 1989; Johnson et al., 1990; Schatz et al., 2014). The high tolerance of Simbra to harsh conditions (e.g., heat, humidity, parasites, seasonally poor pasture quality, and large distances required to be walked while grazing) is thus derived from its Brahman parentage. In turn, its good meat quality (e.g., carcass composition and conformation), early sexual maturity, milking ability, rapid growth, and docile temperament are attributed to its Simmental ancestry (Smith, 2010). Although Simbra cattle are mainly produced in the USA, the breed was also introduced to other countries. For example, Simbra was introduced to South Africa in the late 1990s, where it is among the 10 most popular breeds in the country (Scholtz et al., 2008).
Several population studies provided insight regarding genetic structure of popular South African cattle breeds (e.g., Simmental, Afrikaner and Nguni) (Bennett and Gregory, 1996;Pico, 2004;Martínez and Galíndez, 2006;Greyling et al., 2008;Sanarana et al., 2016;Pienaar et al., 2018). However, little is known about the genetic diversity and population structure within and between South African Simbra and the ancestral Brahman and Simmental breeds.
Various studies showed that information mined from whole genome data is useful for estimating proportional ancestry, maximizing genetic variability and for developing breeding strategies (Sharma et al., 2017; Bhati et al., 2020). In other words, knowledge emerging from genomic studies can be used to improve livestock in terms of meat and milk production, disease resistance and reproductive health (Sharma et al., 2017; Bhati et al., 2020). For example, genome-wide association studies (GWAS) have been used to identify genes involved in meat quality in different Taurine (Gutiérrez-Gil et al., 2008; McClure et al., 2012; Allais et al., 2014; Xia et al., 2016), Indicine (Tizioto et al., 2013; Magalhães et al., 2016), and crossbreeds (Bolormaa et al., 2011; Lu et al., 2013; Hulsman et al., 2014). Genome-based selection strategies are thus increasingly regarded as invaluable for ultimately improving cattle fitness, productivity, and quality (Daetwyler et al., 2014; Kim et al., 2017).
The overall goal of this study was to estimate the adaptive potential of the Indicine- and Taurine-derived genomic components in the South African Simbra cattle breed. We therefore aimed to (i) determine levels of heterozygosity; (ii) infer the overall population structure and admixture ancestry in Simbra cattle; and (iii) identify genomic regions subject to positive selection and associate these with putative productivity and adaptive traits. For this purpose, Simbra, Brahman and Simmental animals were genotyped, as part of the South African Beef Genomics Project, using Illumina's cost-effective low-density Bovine BeadArray (7K) technology, which allows the genotyping of a larger number of individuals. Several studies have successfully used this approach in genome-wide association studies as genotyping large numbers of individuals with thousands of SNPs remains prohibitively expensive for many research groups. The data generated in this study will be instrumental for informing and designing appropriate management and breeding strategies for maximizing the productivity of Simbra in South Africa and of cattle in general.

Animals
A total of 321 animals were genotyped in this study. These included animals from the South African Simbra crossbred population (Simbra, n = 69), as well as Brahman (Bos taurus indicus, n = 161) and Simmental (Bos taurus taurus, n = 91) populations. These animals were part of stud breeding programs aimed at producing herdbook-registered Simbra (3/8 Brahman, 5/8 Simmental; Figure 1), Brahman and Simmental cattle, and were not part of a designed experiment. They were selected based on phenotypic appearance, which was consistent with typical breed characteristics and pedigree information accepted by local breeders and breed societies.

SNP Genotyping and Quality Control
Genomic DNA was extracted at the ARC-Biotechnology Platform from blood/hair root samples using Qiagen's DNeasy extraction kit (Qiagen, Valencia, CA). The quality and quantity of the DNA were estimated using a Qubit® 2.0 fluorometer (Life Technologies, ThermoFisher Scientific, USA), Nanodrop 1000 spectrophotometer (Nanodrop Technologies, Wilmington, DE), and agarose gel electrophoresis. These DNAs were then used in genotyping experiments at the ARC-Biotechnology Platform as part of the SA Beef Genomics Project during the period 2015-2018. This was done using the Illumina BovineLD v2 BeadChip (7K) (Illumina, San Diego, CA), which features 7,931 single nucleotide polymorphism (SNP) probes that are distributed across the whole bovine genome, with <3 kilobase pair (kb) median gap spacing. Samples were processed according to the Illumina Infinium-II assay protocol (Illumina, Inc. San Diego, CA, 92122, USA). Only autosomal chromosomes were used, and SNP quality control was assessed using PLINK (Purcell et al., 2007). SNPs with a call rate <95% and minor allele frequencies (MAF) <5% across all breeds were removed. SNPs with a high linkage disequilibrium (LD) at a threshold of LD ≥0.8 were also pruned. The SNP & Variation Suite v.8.8.3 (Golden Helix Inc., Bozeman, MT, USA; www.goldenhelix.com) was used to estimate the identity-by-descent (IBD) values between pairs of individuals that can be used to detect and remove related and duplicate samples.
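The call-rate and MAF filters described above can be illustrated with a minimal sketch. This is not the PLINK pipeline used in the study; the function name `qc_filter`, the 0/1/2 genotype coding and the use of NaN for missing calls are assumptions made purely for illustration.

```python
import numpy as np

def qc_filter(genotypes, min_call_rate=0.95, min_maf=0.05):
    """Return a boolean mask of SNPs (columns) that pass both filters.

    genotypes: animals x SNPs array coded 0/1/2 (copies of one allele),
    with np.nan marking a missing call (hypothetical coding).
    """
    g = np.asarray(genotypes, dtype=float)
    called = ~np.isnan(g)
    # Fraction of animals with a non-missing genotype at each SNP.
    call_rate = called.mean(axis=0)
    with np.errstate(invalid="ignore", divide="ignore"):
        # Allele frequency from called genotypes only (two alleles per animal).
        freq = np.nansum(g, axis=0) / (2 * called.sum(axis=0))
        maf = np.minimum(freq, 1 - freq)
        passed = (call_rate >= min_call_rate) & (maf >= min_maf)
    return passed

if __name__ == "__main__":
    toy = [[0, 1, 0, np.nan],
           [1, 1, 0, 0],
           [2, 0, 0, 1],
           [1, 2, 0, 1]]
    print(qc_filter(toy))
```

In the toy matrix, the third SNP is monomorphic and the fourth has a missing call, so only the first two columns pass. LD pruning is a separate step and is not shown here.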

Genetic Diversity
Various analytical tools were used to estimate the genetic diversity among the Simbra, Brahman and Simmental populations. The observed heterozygosity estimates for each population, as an indication of within-breed diversity, were calculated from observed genotype frequencies obtained from PLINK (Purcell et al., 2007). Here, observed heterozygosity was calculated as (N − O)/N, where N is the number of non-missing genotypes for a given individual and O is the number of observed homozygous genotypes for that individual. We also estimated the inbreeding coefficient (F) as a measure of "excess" homozygosity using the SNP & Variation Suite.
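As an illustration of these two statistics, the sketch below computes per-animal observed heterozygosity as (N − O)/N and a simple method-of-moments inbreeding coefficient, F = (O − E)/(N − E), where E is the number of homozygous genotypes expected under Hardy-Weinberg proportions. The F estimator shown is an assumption for illustration (it omits small-sample corrections), and it may differ from the one implemented in the SNP & Variation Suite. The function names and 0/1/2 coding are hypothetical.

```python
import numpy as np

def _counts(genotypes):
    """Return the genotype array, called mask, N and O per animal."""
    g = np.asarray(genotypes, dtype=float)
    called = ~np.isnan(g)
    n = called.sum(axis=1)                    # N: non-missing genotypes
    o = ((g == 0) | (g == 2)).sum(axis=1)     # O: observed homozygotes
    return g, called, n, o

def observed_heterozygosity(genotypes):
    """Per-animal observed heterozygosity, (N - O) / N."""
    _, _, n, o = _counts(genotypes)
    return (n - o) / n

def inbreeding_f(genotypes):
    """Per-animal F = (O - E) / (N - E), with E from sample allele frequencies."""
    g, called, n, o = _counts(genotypes)
    p = np.nansum(g, axis=0) / (2 * called.sum(axis=0))
    expected_hom = 1 - 2 * p * (1 - p)        # per-SNP HWE homozygosity
    e = (called * expected_hom).sum(axis=1)   # E over each animal's called SNPs
    return (o - e) / (n - e)

if __name__ == "__main__":
    toy = [[0, 1, 2, 1], [1, 1, np.nan, 0]]
    print(observed_heterozygosity(toy))
    print(inbreeding_f(toy))
```

A negative F from this estimator indicates fewer homozygous genotypes than expected, that is, an excess of heterozygosity.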

Population Structure
Principal Components Analysis (PCA) (Patterson et al., 2006) and fastSTRUCTURE (Raj et al., 2014) analyses were used to identify patterns of admixture and relatedness among the Simbra cattle, in relation to the Simmental and Brahman populations. PCA was performed using the EIGENSTRAT methodology embedded in the SNP & Variation Suite. The fastSTRUCTURE analysis employed an admixture model and two clusters (K = 2; based on the number of ancestral populations) (Smith, 2010). The analysis was executed using independent allele frequencies, and a burn-in of 100 000 iterations, followed by 1 000 000 Markov Chain Monte Carlo iterations. Graphical display of the admixture output was generated using Distruct v1.1 (http://web.stanford.edu/group/rosenberglab/distruct.html).
Local ancestry for admixed Simbra animals was inferred using PCAdmix (Brisbin et al., 2012), which uses PCA to determine the posterior probabilities for the ancestry of a genomic region along each chromosome. More specifically, PCAdmix classifies blocks of SNPs by ancestry through PCA, projecting the loadings of admixed individuals based on the loadings of putative ancestors. It employs a Hidden Markov Model (HMM) to smooth the results and returns the posterior probabilities of ancestry affiliation for each block from the HMM (Brisbin et al., 2012).
To prepare input files for PCAdmix, haplotypes were built using Beagle 5.1 by phasing and imputing missing genotypes from the unphased SNP data (Browning et al., 2018). Chromosomes for each individual in a population were artificially strung together to create two haploid genomes for the individual to increase the amount of information used for PCA. Since PCAdmix requires predefined ancestral groups, we selected two main ancestral groups (Simmental and Brahman cattle) for the Simbra cattle. PCAdmix was run with a posterior probability threshold of 0.8. In order to remove highly linked alleles from different populations and avoid spurious ancestry transitions, ancestral populations were thinned using a pairwise SNP linkage disequilibrium (LD) value (r²) of <0.8. We defined a constant recombination rate of 1e-8 based on the assumption that 0.01 recombinations occur per 1,000 kb (equivalent to 1 cM) (Khayatzadeh et al., 2016).
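The constant recombination rate follows directly from the stated assumption of 1 cM per 1,000 kb. The short derivation below only restates that arithmetic; the symbol r for the per-base-pair rate is introduced here for illustration.

```latex
r \;=\; \frac{0.01\ \text{recombinations}}{1{,}000\ \text{kb}}
  \;=\; \frac{10^{-2}}{10^{6}\ \text{bp}}
  \;=\; 10^{-8}\ \text{recombinations per bp per generation}
```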

Identification of Selection Signatures
To identify signatures of selection we used LD-based methods that search for haplotypes driven to complete fixation (Vitti et al., 2013). These include the integrated haplotype score (iHS), which is a within-population statistic reflecting the amount of extended haplotype homozygosity (EHH) for a given SNP along the ancestral allele relative to the derived allele. Because of the limitation of this statistic when the selected allele is near fixation, we also used the method developed by Tang et al. (2007) that compares EHH profiles between pairs of populations. Based on EHHS, a so-called "site-specific EHH measure," the Tang et al. method estimates a weighted average of the EHH at both alleles of each SNP in each population. Then, the distribution of the standardized log-ratio of the integrated EHHS (iES) between pairs of populations (referred to as Rsb) is used to detect signals of selection. The advantage of the Tang et al. method is that it calculates EHH for the entire population instead of partitioning it into ancestral and derived alleles, which eliminates the allele frequency constraint and makes it capable of detecting selection sweeps near fixation. The Rsb scores for Simbra crossbred cattle were calculated using the Simmental and Brahman populations as reference populations.
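To make the Rsb definition concrete, the sketch below standardizes the per-SNP log-ratio of iES between a target and a reference population. It assumes the iES values have already been computed from phased haplotypes (in the study this was done with the rehh package), and it assumes standardization by the genome-wide median and standard deviation. The function name `rsb_scores` is hypothetical.

```python
import numpy as np

def rsb_scores(ies_target, ies_reference):
    """Standardized log-ratio of integrated EHHS between two populations.

    ies_target, ies_reference: per-SNP iES values (same SNP order).
    Positive scores indicate longer haplotype homozygosity in the target.
    """
    target = np.asarray(ies_target, dtype=float)
    reference = np.asarray(ies_reference, dtype=float)
    log_ratio = np.log(target / reference)
    # Center on the median and scale by the sample standard deviation
    # (an assumed standardization, ignoring SNPs with missing values).
    centered = log_ratio - np.nanmedian(log_ratio)
    return centered / np.nanstd(log_ratio, ddof=1)

if __name__ == "__main__":
    simbra = [2.0, 2.0, 2.0, 2.0, 8.0]      # toy iES values
    simmental = [2.0, 2.0, 2.0, 2.0, 2.0]
    print(rsb_scores(simbra, simmental))
```

In the toy input only the last SNP has a longer integrated EHHS in the target population, so it is the only SNP with a positive score.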
FIGURE 1 | Illustration of two hybridization schemes (A) and (B) used to establish the Simbra crossbreed (adapted from Paim et al., 2020). A composition of 5/8 Simmental and 3/8 Brahman is the optimum needed to retain the favorable traits of both parental breeds (O'Connor et al., 1997; Smith, 2010). Controlled breeding programs are used to establish the next Simbra generations with the optimum composition.
In this study, the ancestral alleles required for the computation of iHS were inferred as the most common alleles in the entire dataset following Bahbahani and Hanotte (2015). Haplotypes for the iHS and Rsb analyses were derived with fastPHASE (Scheet and Stephens, 2006) using 10 starts (T10) and 25 iterations (C25) of the expectation-maximization (EM) algorithm (Scheet and Stephens, 2006). The iHS and Rsb analyses were performed using the rehh package (Gautier and Vitalis, 2012) in R version 3.4.4. For the within-population analysis an iHS score >5 (equivalent to P-value = 1e-06), and for the analysis of between-population differences an Rsb score >5 (equivalent to P-value = 1e-06), were used to infer the candidate genomic regions under selection.
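The correspondence between a standardized score and its P-value can be sketched as follows, assuming the scores are approximately standard normal under neutrality and that a two-sided test is used. Both are assumptions for illustration; the function name is hypothetical and this is not the rehh code itself.

```python
import math

def score_to_neglog10_p(score):
    """Convert a standardized iHS or Rsb score to -log10 of a two-sided P-value.

    Assumes the score follows a standard normal distribution under neutrality.
    """
    # Two-sided tail probability of a standard normal: erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(score) / math.sqrt(2.0))
    return -math.log10(p_value)

if __name__ == "__main__":
    for z in (4.0, 4.8916, 5.0):
        print(z, round(score_to_neglog10_p(z), 2))
```

Under these assumptions an absolute score of about 4.89 corresponds to −log10(P) = 6, and a score of 5 gives about 6.2, so a cutoff of 5 is slightly more stringent than P = 1e-06.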
We also examined the gene content within genomic regions containing signatures of selection. This was done using the annotated UMD3.1 reference genome for the Taurine breed Hereford available on the Bovine Genome Database (https://bovinegenome.elsiklab.missouri.edu/). To determine potential overlap of these regions with previously published quantitative trait loci (QTLs), the bovine database (http://www.animalgenome.org/cgi-bin/QTLdb/BT/search) incorporated in the Animal QTL database (Animal QTLdb) of Hu et al. (2019) was used.

SNP Genotyping and Quality Control
After quality control to remove SNPs with <95% call rate, MAF < 0.05 and high LD (r² ≥ 0.8), 4 488 SNPs were retained for analyses. We also performed a sample filtering to limit the inclusion of very closely related individuals (Figure 2A). Accordingly, all 321 animals were retained for analysis (i.e., 69 Simbra, 161 Brahman, and 91 Simmental genomes), based on IBD values of ≥0.45. IBD represents the probability that two randomly chosen alleles of an individual are inherited from a common ancestor, with the length of haplotypes shared between individuals being inversely proportional to the time since divergence from that common ancestor (Browning and Browning, 2010).

Genetic Diversity
Among the three populations, Simbra and Simmental had comparable observed heterozygosity values (i.e., 0.427 with standard deviation [±SD] of 0.020 and 0.417 with ±SD 0.015, respectively), which were much higher than those for Brahman (0.295, ±SD 0.029, n = 161). Based on the inbreeding coefficient (F), limited diversity was also observed for the Brahman population (0.022, ±SD 0.103) in comparison with the Simmental (0.0003, ±SD 0.031) and Simbra (−0.011, ±SD 0.045) populations.
FIGURE 2 | Identity-by-descent (IBD) results of the crossbred South African Simbra population, as well as the ancestral South African Simmental and Brahman populations (A). Green indicates a closer genetic distance and red indicates a farther genetic distance. FastSTRUCTURE (Raj et al., 2014) results from the 7K SNP panel set at K = 2 according to the historical number of ancestral populations (Smith, 2010). Simmental ancestry is indicated in red, while Brahman ancestry is indicated in blue (B).

Simbra Population Structure and Genomic Content
FastSTRUCTURE separated the animals genotyped in this study into three distinct clusters (Figure 2B). A similar clustering pattern was observed using PCA (Figure 2C), where 55.66% of the genetic variability was explained by the first two principal components (with the first explaining 50.2%). These three clusters corresponded to the Brahman and Simmental ancestor populations, and the Simbra population, representing an admixture between the Taurine and Indicine cattle.
The Simbra hybrid genomes were partitioned into segments of inferred Simmental and Brahman ancestry using the PCAdmix algorithm (Figure 3). We used the default parameters in PCAdmix, thereby removing SNPs in high LD (r² > 0.8) and SNPs that were monomorphic between the breeds. Subsequent ancestry inference of each genome revealed that the South African Simbra breed is composed of a higher average proportion of Taurine (64.8%, ±SD 8) than Indicine (35.2%, ±SD 8) backgrounds (Figure 3A), as was expected for the breed (O'Connor et al., 1997; Smith, 2010). However, 19 of the 69 Simbra individuals had genomic compositions that deviated substantially from this expectation (Figure 3A); i.e., the Indicine contribution was <27.2% in 9 genomes and >43.2% in 10 genomes.
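The cutoffs of 27.2% and 43.2% coincide with the reported mean Indicine proportion plus or minus one standard deviation (35.2 ± 8). Assuming that this is how deviating individuals were defined, which the text does not state explicitly, the flagging step could be sketched as below. The function name and the toy proportions are hypothetical.

```python
import numpy as np

def flag_ancestry_deviation(indicine_pct, mean=35.2, sd=8.0, n_sd=1.0):
    """Flag animals whose Indicine ancestry falls outside mean +/- n_sd * sd.

    indicine_pct: per-animal Indicine ancestry in percent.
    Returns two boolean arrays: (below lower cutoff, above upper cutoff).
    """
    x = np.asarray(indicine_pct, dtype=float)
    lower = mean - n_sd * sd      # 27.2% with the reported values
    upper = mean + n_sd * sd      # 43.2% with the reported values
    return x < lower, x > upper

if __name__ == "__main__":
    toy = [20.0, 30.0, 35.2, 44.0, 50.0]   # illustrative values only
    low, high = flag_ancestry_deviation(toy)
    print(low.sum(), "below cutoff;", high.sum(), "above cutoff")
```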

Genomic Regions Containing Signatures of Positive Selection
Our analyses revealed the presence of nine genomic regions containing signatures of positive selection in the Simbra genome (Table 1). These regions were identified using intra-population iHS and inter-population Rsb analyses (Vitti et al., 2013). Focusing on the Simbra hybrid cattle, the intra-population iHS analysis identified eight of these regions, which were located on BTA 1, BTA 2, BTA 3, BTA 9, BTA 19, BTA 20, and BTA 21 (Table 2; Figure 4A). Additionally, the Rsb analyses identified five positive selection regions (i.e., on BTA 2, BTA 3, BTA 19, BTA 20, and BTA 21) using Simmental as reference population, and two using Brahman as reference population (i.e., on BTA 21 and on BTA 23) (Table 2; Figures 4B,C). Five of these genomic regions were detected using both the iHS and Rsb statistics. The region on BTA 21 was identified with Rsb analyses employing both Simmental and Brahman as reference populations, while the remainder (i.e., on BTA 2, BTA 3, BTA 19, and BTA 20) were detected using the Simmental reference population. Overall, five (BTA 1, BTA 3, BTA 5, BTA 21, and BTA 23) of the nine regions in which positive selection was detected were located within genetic ancestry blocks that displayed a deviation in the expected genomic composition for Simbra (Table 2).
FIGURE 3 | Local ancestry for the crossbred South African Simbra cattle population (A) and representative haplotypes (B,C) inferred using PCAdmix (Brisbin et al., 2012). The Brahman and Simmental cattle populations were used as source populations (Smith, 2010).

DISCUSSION
This is the first study to utilize genome-wide polymorphism data to investigate the genetic diversity, population structure and patterns of local ancestry of the South African Simbra hybrid breed and its Taurine and Indicine ancestor breeds. We also used the SNP data obtained to identify candidate genomic regions with signatures of adaptive introgression and positive selection. The availability of the genome sequencing data from the SA Beef Genomics Project will make it possible in the future to augment conventional livestock breeding and performance management programmes with genomic information.
Our results showed that hybridization of the Taurine and Indicine breeds conferred a higher genetic diversity on the Simbra breed in comparison with the purebred breeds (Ghafouri-Kesbi, 2010; Zhang et al., 2015). This was evident from the negative inbreeding coefficient (F) estimate, which indicated an excess of heterozygosity even beyond what is expected under Hardy-Weinberg equilibrium in the Simbra population (Maiorano et al., 2018). Compared to the two ancestral breeds, the South African Simbra population had the highest genetic diversity, although it was only marginally higher than that of the Simmental breed. Therefore, hybridization of subspecies remains an important tool for expanding the genetic variation within modern cattle breeds (Gregory and Cundiff, 1980). Also, the genetic diversity inherent to South African Simbra holds significant potential for improvements in production and environmental adaptability (Sölkner et al., 1998; Becker et al., 2013).
The limited diversity observed for the Brahman breed is most likely a consequence of intensive artificial selection for improved productivity (Albertí et al., 2008). It was previously suggested that the low genetic diversity in the Brahman breed may be partly ascribed to the use of elite sires (Makina et al., 2014). Such practices are consistent with the observed F value (0.022), which is suggestive of some inbreeding in the Brahman populations examined (van der Westhuizen et al., 2019). Genetic diversity within the Simmental population was higher than in the Brahman breed. This may be because the cattle BeadChip was optimized for use in Bos taurus taurus breeds (Cheruiyot et al., 2018).
Genome-wide polymorphism data indicated that the genomic background of the South African Simbra hybrid breed represents a mosaic of the Taurine and Indicine ancestor breeds, as was expected (Smith, 2010). Our data also confirmed the optimal 5/8 Simmental and 3/8 Brahman composition of the Simbra genomes included in this study, since this composition ensures maintenance of favorable traits from both breeds (i.e., meat tenderness of the Simmental breed and heat-tolerance of the Brahman breed) (O'Connor et al., 1997; Smith, 2010).
FIGURE 4 | [...] and Rsb analysis with the Simbra and Brahman cattle (C). The iHS and Rsb analyses were performed using the rehh package (Gautier and Vitalis, 2012) in R v. 3.4.4. The dashed line corresponds to a significance threshold (−log10) that was set at 6, which is equivalent to P-value = 1e−06.
Additionally, the PCA and FastSTRUCTURE data also clearly demonstrated that the South African Simbra has evolved into a unique breed, as three distinct clusters were identified. This suggests that, after initial formation and subsequent intense artificial selection and breeding, the Simbra breed composition has stabilized over time (Paim et al., 2020).
Our results suggested that crossbreeding, followed by selection, was key in shaping the genome of the South African Simbra hybrid breed (Ríos-Utrera et al., 2020). Consistent with previous studies (e.g., Bahbahani and Hanotte, 2015; Bahbahani et al., 2017), the two EHH-based statistics used in this study allowed for the identification of genomic regions that display signatures of positive selection in the hybrid genome. These included regions that were identified using the intra-population iHS statistics, as well as the inter-population Rsb statistics using the Simmental and Brahman cattle as reference populations. The candidate regions identified using the iHS and Rsb statistics support the role of selection pressures, and not natural demographic processes, in shaping the genomic pattern of these regions (Bahbahani et al., 2018). Also, 25% of the regions displayed ancestry deviation. Furthermore, only five genomic regions that displayed signatures of positive selection overlapped with regions containing locus-ancestry deviation. This may be because EHH-based statistics identify older signals of selection, while ancestry deviation is likely caused by recent post-admixture selection (Oleksyk et al., 2010; Bahbahani et al., 2018). The ancestry deviation observed in the young Simbra crossbreed, which was developed in the United States in the late 1960s (Gouws, 2016), is therefore most likely the result of recent post-admixture selection.
The South African Simbra hybrid breed appears to be evolving separately from its ancestral breeds, with selection driving the increase in prevalence of advantageous alleles derived from both the parent breeds (Xu et al., 2015). The presence of genomic regions displaying locus-ancestry deviation supports the likelihood that they are important for the adaptability of Simbra cattle to the local environment (Bahbahani et al., 2018). The inter-population Rsb statistics, using Brahman as reference, allowed for the identification of Taurine haplotypes in regions that are under selection. Similarly, Rsb statistics using Simmental as reference allowed for the identification of regions that support selection pressures on Indicine haplotypes. As suggested recently, the identified genomic regions under selection may have adaptive significance, maximizing the reproductive fitness of the animals and their adaptability to environmental challenges (Bahbahani et al., 2018).
Analysis of genes and known QTLs in regions of the Simbra genome that harbor signals of positive selection suggests that these are likely involved in its improved environmental adaptability and productivity (Paim et al., 2020; Ríos-Utrera et al., 2020). Many of the genes located in these genomic regions have previously been implicated in traits that are highly valued in the Simbra composite breed (Smith, 2010). The location of these regions also overlapped or co-occurred with previously reported bovine quantitative trait loci (QTLs) (https://www.animalgenome.org), which strongly reflect the overall breeding goals of the Simbra breed (Smith, 2010). For example, one of the adaptive regions located on BTA 23 co-occurred with a QTL associated with body weight (Lu et al., 2013). This region, which is derived from the Simmental ancestry, is important for growth performance in the Simbra breed (Pico, 2004; Amen et al., 2007; Smith, 2010; Maúre et al., 2018). The heritability of these traits may be due to positive selection of gene regions carrying beneficial polymorphisms in the genes affecting the traits, because a mutation that provides a fitness advantage will increase in frequency in the population (Taye et al., 2017).
Most of the genomic regions experiencing positive selection were implicated in traits that are valued in breeds of Indicine ancestry. For example, the region located on BTA 5 that displays locus-ancestry deviation (excess of Brahman parent alleles) co-occurred with a QTL associated with ovulation rate. This confirms that regions/genes related to fertility and reproduction are hotspots of selection in breeds living in tropical environments (Bahbahani et al., 2018). The region located on BTA 20 co-occurred with a QTL associated with heat intensity (i.e., heat tolerance), and is derived from the Brahman ancestry. Adaptation to the harsh South African environment that is valued in the Indicine parent breed will allow for the Simbra breed to adapt to climate change that will likely cause South Africa to become hotter and drier (Girvetz et al., 2019). Of the genomic regions displaying positive selection, and that co-occurred with known QTLs linked with production in the Simmental breed, many were also previously demonstrated to be under selection in Western and Russian Simmental populations (Mészáros et al., 2019). These included QTLs associated with carcass weight that are located on BTA 9, milk production located on BTA 2 and BTA 21, as well as fertility located on BTA 1, that display locus-ancestry deviation (excess of Simmental parent alleles) (Berkowicz et al., 2012; Do et al., 2014; Gebreyesus et al., 2019; Zhang et al., 2019). These genomic regions include genes that encode a SLC24A4 homolog located on BTA 21, which is known to be associated with milk production and fertility (Nayeri and Stothard, 2016). Our results could therefore highlight new regions and pathways that may contribute to variation in reproductive health, fertility, and milk production in cattle in general.
Many of the genes occurring in regions under positive selection in Simbra were previously identified using genome-wide association studies (GWAS) where they were linked to meat quality of Taurine, Indicine and composite breeds (Allais et al., 2014; Hulsman et al., 2014; Magalhães et al., 2016; Xia et al., 2016). For example, KCNA10 encoded on BTA 3 is likely involved in determining meat quality in Simbra and may be derived from the Simmental parent breed (Lang et al., 2000; Fleet et al., 2011). Other genes derived from the Brahman parent breed, including SMIM5 encoded on BTA 19 in a region that displays locus-ancestry deviation (excess of Brahman parent alleles), may negatively influence carcass and meat properties (e.g., marbling) (Mateescu et al., 2017; Taye et al., 2017). Some of the adaptive alleles identified in Simbra were implicated in the sensory characteristics of meat (e.g., tenderness, flavor, juiciness, and color), which are mainly affected by proteolytic activities of muscle (Taye et al., 2017). For example, a homolog of TRIM63 (also called MuRF-1), located on BTA 2, has been linked with meat tenderness in Nellore cattle (Indicine) (Muniz et al., 2016). MuRF-1 is an important component of the ubiquitin-proteasome system, which is the main proteolytic pathway in skeletal muscle growth in domestic animals (Koohmaraie et al., 2002). This pathway regulates the balance between the amounts of muscle proteins synthesized and degraded to control the skeletal muscle mass (Koohmaraie et al., 2002). Accordingly, the ubiquitin-proteasome system and its components have been linked to meat tenderness (Yin et al., 2010; Taye et al., 2017), productivity and economic value of animals (Sadri et al., 2016; Nakanishi et al., 2019).
The high number of genes identified in this study and other studies that are associated with meat quality underscores the complexity of this trait and that it is regulated by multiple interrelated causative factors and layers of feedback regulation (Diniz et al., 2019).
Some of the genomic regions subject to positive selection are likely involved in the overall health and fitness of the Simbra breed. For example, the region located on BTA 3, which is known to be under selection in Western and Russian Simmental populations (Mészáros et al., 2019) and is most likely derived from the Simmental parent breed, overlaps with a QTL associated with ketosis (QTL:179821). Ketosis is a metabolic disorder in which a negative energy balance (energy demand exceeding intake) affects animal health and productivity (Nayeri et al., 2019). It has been postulated that such failure to maintain internal homeostatic and homeorhetic regulation may be caused by intense genetic selection (Nayeri et al., 2019). Furthermore, metabolic disorders have been demonstrated to negatively influence the immune response in cattle (Wathes et al., 2009; Esposito et al., 2014). The results of this study can be used in further genetic analyses to identify causal variants that affect ketosis and other metabolic diseases.
Likewise, health and fitness traits that were likely derived from Indicine ancestry are also encoded in Simbra genomic regions subject to selection. These regions are located on BTA 5, BTA 19, BTA 20, and BTA 21, and appear to be derived from Brahman. BTA 5 harbors a gene encoding KCNA10 (potassium voltage-gated channel subfamily A member 10), which is known to influence potassium metabolism and to play a role in human and animal production and health (Lang et al., 2000; Fleet et al., 2011). This protein regulates acid-base balance and maintains cellular pH and electrical gradients (Lang et al., 2000; Fleet et al., 2011), processes that have previously been demonstrated to influence meat quality in cattle (Diniz et al., 2019). Likewise, BTA 21 contains the SLC24A4 gene, which encodes a member of the potassium-dependent sodium/calcium exchanger protein family and may influence pigmentation-related traits with possible health effects (e.g., UV protection) (Sulem et al., 2007). The selected region on BTA 19 contains a gene encoding the small integral membrane protein 5 (SMIM5), which is associated with udder health and clinical mastitis in Holstein cattle (Wu et al., 2015). The region under selection on BTA 20 harbors a gene that encodes MIER family member 3 (MIER3), an uncharacterized protein that is associated with survival in Holstein and Jersey cattle (Raven et al., 2014).
Finally, analysis of genome-wide polymorphisms showed that the genetic diversity of the South African purebred Brahman parental breed was slightly lower than that of the Simmental population. This is similar to what has been reported previously (Qu et al., 2006; Agung et al., 2016; Utrera et al., 2018). The low level of diversity in the Brahman breed may indicate relative homogeneity in the South African populations as a consequence of intensive artificial selection for improved productivity (Albertí et al., 2008; Taberlet et al., 2008). It was also previously suggested that the low genetic diversity observed in the Brahman breed may be partly ascribed to the use of elite sires (Makina et al., 2014). Such practices are consistent with the observed inbreeding coefficient (f) estimate of 0.022, which is suggestive of some inbreeding in the Brahman populations examined (van der Westhuizen et al., 2019). Although it cannot be excluded that the low genetic diversity in the Brahman population is partly due to the cattle BeadChip having been optimized for use in Bos taurus taurus breeds (Cheruiyot et al., 2018), it is important that genetic diversity be maintained and increased for sustainable production and management of this purebred cattle breed.
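To make the inbreeding coefficient more concrete, the following minimal Python sketch implements a simple method-of-moments estimator, f = (O - E) / (N - E), where O is the observed number of homozygous genotypes for an animal, E is the number expected under Hardy-Weinberg proportions, and N is the number of non-missing SNPs. This is an assumption about how such an estimate can be obtained, not a description of the software or settings used in this study, and the function name and toy genotypes are hypothetical.

```python
def inbreeding_f(genotypes, allele_freqs):
    """Method-of-moments inbreeding coefficient for one animal.

    genotypes: per-SNP count of one allele (0, 1 or 2); None marks a missing call.
    allele_freqs: population frequency of that same allele at each SNP.
    """
    observed_hom = 0    # O: homozygous genotypes actually observed
    expected_hom = 0.0  # E: homozygous genotypes expected under Hardy-Weinberg
    n_snps = 0          # N: SNPs with a non-missing genotype
    for g, p in zip(genotypes, allele_freqs):
        if g is None:
            continue
        n_snps += 1
        if g != 1:
            observed_hom += 1
        expected_hom += 1.0 - 2.0 * p * (1.0 - p)
    if n_snps == expected_hom:
        raise ValueError("f is undefined when all SNPs are monomorphic or missing")
    return (observed_hom - expected_hom) / (n_snps - expected_hom)


# Toy example: four SNPs, each with allele frequency 0.5, three homozygous calls.
f_toy = inbreeding_f([0, 2, 1, 0], [0.5, 0.5, 0.5, 0.5])
```

With these toy values, O = 3, E = 2 and N = 4, so the estimate is 0.5. Positive values indicate an excess of homozygosity relative to expectation, which is how a small positive estimate such as 0.022 is read as a sign of mild inbreeding.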

CONCLUSIONS
The SNP array data allowed for the assessment of genetic diversity, population structure and admixture of the South African Simbra population. Our findings contribute to the current knowledge of the genetics of the Simbra breed and provide insight into how genomic architecture changes with hybridization and crossbreed formation. The results of this study emphasize the importance of assessing the genetic diversity, population structure and admixture of other South African cattle breeds. They also emphasize the importance of implementing a management strategy to increase diversity in the purebred breeds.
The genome-wide SNP array further allowed for the identification of signatures of positive selection in the Simbra hybrid genome, and these putatively introgressed genomic regions may have adaptive significance, affecting important phenotypic traits (e.g., adaptation, reproduction, and production) in the breed. These include Indicine-derived alleles associated with heat tolerance and Taurine-derived alleles associated with body weight.
Knowledge of the genetics controlling meat quality will increase the ability of the industry to produce better meat, which will benefit consumers and should increase the demand for beef, an outcome of great interest to the beef industry (Mateescu et al., 2017). The identified adaptive introgression of alleles of Indicine- and Taurine-derived ancestral genes may lay the foundation for ad-hoc physiological studies and provide targets for selection (and potentially gene editing) that may increase production and health in modern cattle breeds. This study represents an important step toward developing and improving strategies for targeted selection and breeding that will ultimately contribute meaningfully to the beef production industry of South Africa.

DATA AVAILABILITY STATEMENT
The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.

ETHICS STATEMENT
The animal study was reviewed and approved by the Agricultural Research Council Animal Research Ethics Committee.

AUTHOR CONTRIBUTIONS
MN, FM, and MB conceived of the presented idea. MN, LD, and NM developed the theory. MN, NH, KH, and W-YC performed the computations. BG, BK, ED, and PS verified the analytical methods. All authors discussed the results and contributed to the final manuscript.

FUNDING
Financial support from the ARC is greatly appreciated. The genotypes were generated under the Technology Innovation Agency (TIA) Beef Genomics Project (BGP).