Genome Wide Assessment of Genetic Variation and Population Distinctiveness of the Pig Family in South Africa

Genetic diversity is of great importance and a prerequisite for genetic improvement and conservation programs in pigs and other livestock populations. The present study provides a genome-wide analysis of the genetic variability and population structure of pig populations from different production systems in South Africa relative to global populations. A total of 234 pigs sampled in South Africa and consisting of village (n = 91), commercial (n = 60), indigenous (n = 40), Asian (n = 5) and wild (n = 38) populations were genotyped using the Porcine SNP60K BeadChip. In addition, 389 genotypes representing village and commercial pigs from America, Europe, and Asia were accessed from a previous study and used to compare population clustering and relationships of South African pigs with global populations. Moderate heterozygosity levels were observed, ranging from 0.204 for Warthogs to 0.371 for village pigs sampled from Capricorn municipality in Limpopo province of South Africa. Principal Component Analysis (PCA) of the South African pigs resulted in four distinct clusters of (i) Duroc; (ii) Vietnamese; (iii) Bush pig and Warthog and (iv) a cluster with the rest of the commercial (SA Large White and Landrace), village, Wild Boar and indigenous breeds of Kolbroek and Windsnyer. The clustering demonstrated alignment with genetic similarities, geographic location and production systems. The PCA with the global populations also resulted in four clusters that were populated with (i) all the village populations, Wild Boars, SA indigenous breeds and the Large White and Landrace; (ii) Durocs; (iii) Chinese and Vietnamese pigs and (iv) Warthog and Bush pig. K = 10 (the number of population units) was the most probable ADMIXTURE-based clustering, which grouped animals according to their populations with the exception of the village pigs, which showed presence of admixture. AMOVA reported 19.92%–98.62% of the genetic variation to be within populations. Sub-structuring was observed between South African commercial populations as well as between indigenous and commercial breeds. Population pairwise FST analysis showed genetic differentiation (P ≤ 0.05) between the village, commercial and wild populations. A per-marker, per-population pairwise FST analysis revealed SNPs associated with QTLs for traits such as meat quality, cytoskeletal and muscle development, glucose metabolism processes and growth factors between domestic populations as well as between wild and domestic breeds. Overall, the study provided a baseline understanding of porcine diversity and an important foundation for porcine genomics of South African populations.


INTRODUCTION
Pigs were domesticated over 5,000 years ago, leading to the gradual and cumulative development of modern pig breeds with very distinctive phenotypes and production abilities (Zeder et al., 2006; Rothschild and Ruvinsky, 2010). The domesticated pig (Sus scrofa domesticus) originated from Sus scrofa, commonly known as the wild boar, which belongs to the Suidae family (Jones, 1998). This family includes species of wild pigs such as Phacochoerus africanus (Common warthog), Potamochoerus larvatus (Bush pig) and Hylochoerus meinertzhageni (Giant Forest hog), some of which are indigenous to Africa (Jones, 1998). Wild Boars are widely distributed across Europe, Asia, and North Africa and were introduced as game species on all other continents, including elsewhere in Africa (Jones, 1998; Scandura et al., 2011).
Pig breeds worldwide are either of well-defined ancestry or, in certain instances, crossbreds from populations of diverse origins (Amills et al., 2010). South African pig production consists of a commercial intensive sector with defined breeds and an extensive sector that is mainly associated with small-scale farmers in the rural areas. The village production system is characterized by non-descript populations raised under extensive low-input management. Commercial breeds such as the Large White, Landrace and Duroc are widely used and have a worldwide distribution in modern commercial farming systems, including in South Africa (Amills et al., 2010). Indigenous breeds classified under Sus indica, such as Kolbroek and Windsnyer, are geographically restricted to Southern Africa (Nicholas, 1999). The Kolbroek, which is of Chinese origin, is speculated to descend from pigs that ended up in the hands of South African farmers when a sailing ship was wrecked at Cape Hangklip (Ramsay et al., 1994). Although the origin of the Windsnyer is unknown, there are observed similarities to Chinese breeds (Nicholas, 1999), suggesting that it is also of Chinese origin. Regardless of their origins and domestication routes, pig breeds in South Africa have become closed genetic pools restricted to specific farming systems and molded by artificial selection and possibly genetic drift (Amills et al., 2010). In addition to these domesticated breeds, there are the wild Warthog, Bush pig and Red River Hog, which are native to Africa and are found roaming in forests or kept in zoos (Porter, 1993). The common Warthog (Phacochoerus africanus), which was first discovered at Cape Verde, Senegal, is one of the three species found in Africa. The Cape Warthog (Phacochoerus aethiopicus) is now extinct due to the rinderpest epizootic of the 1860s (Pallas, 1766; Gmelin, 1788; D'Huart and Grubb, 2003). Another Warthog species (Phacochoerus delamerei) was described in Somalia and later renamed Phacochoerus aethiopicus delamerei, as it is similar to the Cape Warthog (Lönnberg, 1908, 1912; Roosenvelt and Heller, 1915). Muwanika et al. (2003) studied the phylogeography of the common Warthog in Africa and found three clades representing West, South and East African Warthogs. There is not enough evidence to support the origin of the Bush pig, which was assumed to have originated from Asia (White and Harris, 1977). There are records of the Bush pig in Swellendam and Outeniqualand in the Western Cape province of South Africa (Rookmaaker, 1989). Hybrids between domestic pigs and Bush pigs have been recorded, with the introduction of Bush pigs to South Africa dating back as far as 1,400 years (Linnaeus, 1758; Mujibi et al., 2018). The existence of hybrids is a concern, as they could become asymptomatic carriers of diseases such as African swine fever (Jori and Bastos, 2009).
Indigenous breeds are often geographically restricted and harbor unique genetic variants that may provide future breeds with the flexibility to change in response to product market preferences and production environments. While low-input and indigenous breeds may not compete with exotic breeds in terms of production performance, they are considered hosts to unique genetic diversity that should be protected as sources of variation. Local pigs are important because of their hardiness and ability to survive in extreme conditions (Taverner and Dunkin, 1996;Zadik, 2005). Most indigenous breeds are, however, threatened by small and fragmented flock sizes, which predispose them to lose genetic diversity as a result of genetic drift and indiscriminate crossbreeding with exotic germplasm that can lead to genetic erosion and the eradication of the local genetic pool. Globally, 35% of pig breeds are classified as at risk or already extinct (FAO, 2009) demonstrating the threat to local biodiversity.
Genomic tools have emerged as an effective means of assessing diversity within and amongst populations. Swart et al. (2010) observed low differentiation among pig populations in Southern Africa using microsatellites. Heterozygosity levels ranged from 0.531 to 0.692 for commercial and indigenous breeds. The availability of the Porcine SNP60K BeadChip (Ramos et al., 2009) has opened new avenues for examining genetic diversity at a genome-wide scale, relative to microsatellites and other low-coverage markers. Mujibi et al. (2018) observed close clustering of Warthogs and Bush pigs using the Porcine SNP60K BeadChip. The Porcine SNP60K BeadChip has also been used to infer population structure and selection signatures in Chinese and European pig populations (Ai et al., 2013). Using this SNP panel in South African pig populations will provide comprehensive information on the genomic architecture of local, exotic and wild pig populations, which will guide future management and conservation. The objective of the present study was to provide a large-scale analysis of the genetic diversity and structure of South African local pig populations using the Porcine SNP60K BeadChip. The study investigated the diversity of South African pigs relative to global populations of 389 pigs consisting of village and out-group pigs from South America, Europe, the United States, and China, amongst other countries.

Breeds/Populations Sampled
A total of 234 South African samples were collected from different production systems, representing village populations, intensively farmed populations in conservation units and free-ranging populations. Village and non-descript pig populations were sampled from Alfred Nzo (ALN; n = 17) and Oliver Reginald Tambo (ORT; n = 22) districts in Eastern Cape province and Mopani (MOP; n = 27) and Capricorn (CAP; n = 25) districts in Limpopo province. Commercial pig breeds of Large White (LWT; n = 20), South African Landrace (SAL; n = 20) and Duroc (DUR; n = 20) were sampled from commercial farmers in Limpopo province. Indigenous populations of Kolbroek (KOL; n = 20) and Windsnyer (WIN; n = 20) were sampled from the Agricultural Research Council-Animal Production Institute in Pretoria, South Africa (Table 1). The Vietnamese Potbelly breed (VIT; n = 5) was sampled from the Johannesburg Zoo and represents a breed that is endangered in Vietnam, its country of origin, but has been raised in a conservation zoo in South Africa. European Wild Boar (n = 4), Warthogs (n = 31), and Bush pigs (n = 3) were sampled as representatives of the wild pig populations. The European Wild Boar and Bush pigs were sampled from the surrounding villages in the North-West, whilst the Warthog samples were collected from geographically separated National Parks in North-West (n = 4), Eastern Cape (n = 3), and Limpopo (n = 24). The distribution of the sampled individuals is illustrated in Figure 1. Ear tissue samples were collected using a tissue sampling applicator gun, while pliers were used to collect the hair samples, according to standard procedures and with ethical approval from the ARC-Irene Animal Ethics committee (APIEC16/028).

Genotyping and Quality Control
DNA was extracted at the Agricultural Research Council-Biotechnology Platform from the ear tissue and hair samples using a commercially available Perkin Elmer Genomic DNA kit according to the manufacturer's protocol. DNA concentration was quantified using the Qubit® 2.0 Fluorometer. Gel electrophoresis (5%) was used to assess the quality and integrity of the DNA. All 234 animals were genotyped using the PorcineSNP60 v2 genotyping BeadChip (Illumina, United States) containing 62,163 SNPs with an average gap of 43.4 kb. Genotyping was done using the standard Infinium assay at the ARC-Biotechnology Platform in South Africa. GenomeStudio version 2.0 (Illumina, United States) was used to process the genotype data, including raw data normalization, clustering and genotype calling. A final custom report was created to generate PLINK PED (pedigree) and MAP (SNP panel) files for use in downstream analysis.
Golden Helix SNP Variation Suite (SVS) version 8.5 was used to update the SNP marker file (Golden Helix Inc., 2016) based on the pig genome assembly (Sus scrofa v10.2). Markers were then filtered to exclude SNPs located on the sex chromosomes. From this data set, minor allele frequency (MAF) and deviation from Hardy-Weinberg equilibrium (HWE) were estimated per population for 10 populations, excluding BSP, VIT, and WBO, which were left out due to small sample sizes. Additional quality control (QC) was also performed per population to remove SNPs with a call rate below 85%, MAF < 0.02 and HWE P < 0.0001. The resultant filtered dataset was used to calculate observed (HO) and expected (HE) heterozygosities, the inbreeding coefficient (FIS) and effective population size (Ne).
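The per-SNP filters and diversity statistics in this paragraph can be illustrated with a minimal sketch. It assumes genotypes coded as 0, 1 or 2 copies of one allele, with `None` for a missing call. The function names `snp_summary` and `passes_qc` are hypothetical, the HWE test is left out for brevity, and FIS is computed as 1 - HO/HE, which may differ from the estimator implemented in SVS.

```python
def snp_summary(genotypes):
    """Summarise one SNP in one population.

    genotypes: list of 0/1/2 allele counts, None for a missing call
    (an assumed coding, not the study's file format).
    """
    called = [g for g in genotypes if g is not None]
    n = len(called)
    call_rate = n / len(genotypes) if genotypes else 0.0
    if n == 0:
        return {"call_rate": call_rate, "maf": 0.0,
                "h_obs": 0.0, "h_exp": 0.0, "f_is": 0.0}
    p = sum(called) / (2 * n)                  # frequency of the counted allele
    maf = min(p, 1 - p)                        # minor allele frequency
    h_obs = sum(1 for g in called if g == 1) / n   # observed heterozygosity
    h_exp = 2 * p * (1 - p)                    # expected heterozygosity under HWE
    f_is = 1 - h_obs / h_exp if h_exp > 0 else 0.0
    return {"call_rate": call_rate, "maf": maf,
            "h_obs": h_obs, "h_exp": h_exp, "f_is": f_is}


def passes_qc(summary, min_call_rate=0.85, min_maf=0.02):
    """Keep a SNP only if it meets the call-rate and MAF thresholds in the text."""
    return summary["call_rate"] >= min_call_rate and summary["maf"] >= min_maf


if __name__ == "__main__":
    s = snp_summary([0, 1, 1, 2, 0, 1, 2, 1, 0, None])
    print(s, passes_qc(s))
```

Averaging `h_obs` and `h_exp` over all SNPs that pass the filter gives population-level values of the kind reported in Table 2.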
Quality control was then performed across all populations combined to remove SNPs with a call rate below 85%, MAF < 0.02 and HWE P < 0.0001 and to generate a dataset used for analysis of molecular variance (AMOVA) and FST analysis. Using this dataset, further QC filtered out SNPs in high LD (r² = 0.2) and closely related individuals [Identity By Descent (IBD) ≥ 0.45] to produce a filtered dataset used for population structure analysis using ADMIXTURE and Principal Component Analysis (PCA).
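The LD-based filter can be sketched as follows. This is an illustration only: it takes r² to be the squared Pearson correlation of genotype allele counts and prunes greedily against every SNP already kept, whereas real tools usually work in sliding windows along each chromosome. The names `r_squared` and `prune_ld` are hypothetical, and the IBD filter is not shown.

```python
def r_squared(x, y):
    """Squared Pearson correlation between two genotype vectors (0/1/2 counts)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:       # monomorphic SNP: correlation undefined
        return 0.0
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return sxy * sxy / (sxx * syy)


def prune_ld(snps, threshold=0.2):
    """Greedy pruning: keep a SNP unless its r2 with a kept SNP exceeds threshold.

    snps: dict mapping SNP name -> genotype vector, in map order.
    """
    kept = []
    for name, genotypes in snps.items():
        if all(r_squared(genotypes, snps[k]) <= threshold for k in kept):
            kept.append(name)
    return kept


if __name__ == "__main__":
    example = {
        "s1": [0, 1, 2, 0, 1, 2],
        "s2": [0, 1, 2, 0, 1, 2],   # in complete LD with s1, so it is dropped
        "s3": [1, 2, 1, 1, 0, 1],   # uncorrelated with s1, so it is kept
    }
    print(prune_ld(example))
```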

Genetic Diversity Within Population
The MAF, HE and HO were calculated as measures of within-population genetic variation using PLINK 1.07 (Purcell et al., 2007). In addition, the inbreeding coefficient (FIS) was calculated in Golden Helix SNP Variation Suite (SVS) version 8.5 (Golden Helix Inc., 2016). Effective population size (Ne) trends across generations were estimated based on the relationship between r² (expected LD), Ne and c (recombination rate). The SNeP software tool (version 1.1) was used, based on the following formula suggested by Corbin et al. (2012):

$$N_{T(t)} = \left(4 f(c_t)\right)^{-1} \left(E\left[r^2_{adj} \mid c_t\right]^{-1} - \alpha\right)$$

where $N_{T(t)}$ is the effective population size estimated t generations ago, $c_t$ is the recombination rate t generations ago, $r^2_{adj}$ is the linkage disequilibrium estimate adjusted for sampling bias, and $\alpha$ is a constant.
The recombination rate was estimated using the following formula proposed by Sved (1971):

$$f(c) = c\left[\frac{1 - c/2}{(1 - c)^2}\right]$$

The Bush pig, Vietnamese Potbelly and Wild Boar were excluded from the within-population diversity analysis due to their small sample sizes. The few available samples came from zoos and game reserves in the country, where only a few animals are often rescued and kept in conservation.
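As a rough numerical illustration of the LD-based estimate, the sketch below assumes the relationship takes the form Ne = (1 / (4 f(c))) * (1 / r²adj - α), with f(c) = c (1 - c/2) / (1 - c)² as the recombination-rate modifier attributed to Sved (1971), and t = 1 / (2 f(c)) as the number of generations in the past. These forms, the default α = 1 and the function names are assumptions made for the example. This is not SNeP's code, and the sample-size adjustment of r² is not shown.

```python
def sved_mapping(c):
    """Recombination-rate modifier f(c), assumed form after Sved (1971)."""
    return c * (1 - c / 2) / (1 - c) ** 2


def ne_from_ld(r2_adj, c, alpha=1.0):
    """Estimate Ne and the generation it refers to from one LD bin.

    r2_adj: mean r2 for marker pairs at recombination rate c,
            already adjusted for sampling bias.
    c:      recombination rate between the marker pairs (in Morgans).
    alpha:  constant in the assumed formula (1.0 is an illustrative default).
    """
    f_c = sved_mapping(c)
    ne = (1.0 / (4.0 * f_c)) * (1.0 / r2_adj - alpha)
    generations_ago = 1.0 / (2.0 * f_c)
    return ne, generations_ago


if __name__ == "__main__":
    # Illustrative inputs only, not values from the study.
    ne, t = ne_from_ld(r2_adj=0.2, c=0.01)
    print(round(ne, 1), round(t, 1))
```

Closely spaced markers (small c) inform on Ne many generations ago, while distant markers inform on recent generations, which is how a trend across generations is obtained from a single sample.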

Population Differentiation and Structure
Analysis of Molecular Variance (AMOVA) was used to determine the genetic variance within populations (FIS), among populations within groups (FSC) and among groups (FCT) using ARLEQUIN v3.5 (Excoffier et al., 2005). The populations were categorized into village, commercial, indigenous and wild populations. Principal Component Analysis (PCA) using the eigenvector method in SVS version 8.5 (Golden Helix Inc., 2016) was used to determine population clustering. ADMIXTURE version 1.20 (Alexander and Lange, 2011) was used to detect the most likely number of clusters (K) for the population. ADMIXTURE was run from K = 2 to K = 15. The number of potential genetic clusters (K) was tested from 1-15 to reassign each sample to its population of origin. The optimum K-value was that with the lowest cross-validation error. Initially, all 13 populations sampled from South Africa were included in the population structure analysis. After this, the South African data set was merged with the Porcine SNP60K genotype data from Burgos-Paz et al. (2013) described above.
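Choosing the optimum K by the lowest cross-validation error can be sketched as below. The sketch assumes the ADMIXTURE logs contain lines of the form `CV error (K=10): 0.551`, as printed when cross-validation is enabled. The helper name `best_k` is hypothetical, and apart from the K = 10 value of 0.551 reported in this study, the numbers in the example are invented for illustration.

```python
import re

# Pattern for the cross-validation line in an ADMIXTURE log.
CV_LINE = re.compile(r"CV error \(K=(\d+)\):\s*([0-9.]+)")


def best_k(log_text):
    """Return (K, cv_error) for the run with the lowest cross-validation error."""
    errors = {int(k): float(err) for k, err in CV_LINE.findall(log_text)}
    if not errors:
        raise ValueError("no CV error lines found")
    k = min(errors, key=errors.get)
    return k, errors[k]


if __name__ == "__main__":
    # Only the K=10 value comes from the text; the others are made up.
    logs = "\n".join([
        "CV error (K=9): 0.556",
        "CV error (K=10): 0.551",
        "CV error (K=11): 0.553",
    ])
    print(best_k(logs))
```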
Population pairwise FST values were estimated according to the formula of Weir and Cockerham (1984) implemented in the Golden Helix SNP Variation Suite (SVS) version 8.5 (Golden Helix Inc., 2016). Based on population pairwise FST values and on PCA and ADMIXTURE-based clustering, per-marker FST was estimated between pairs of highly differentiated populations among the village populations, indigenous populations and commercial breeds, as well as between highly differentiated commercial breeds and wild populations. To reduce noise, a smoothed average FST value was used to identify genomic regions differentiating pairs of populations. Manhattan plots of per-marker FST values between pairs of populations were plotted against chromosomal coordinates using the porcine assembly (Sus scrofa 10.2). Highly differentiating SNPs (FST ≥ 0.8) were subsampled, and genes associated with these SNPs were searched using genome browse, including their associations with known QTLs in the pig genome, based on Sus scrofa 10.2 on Ensembl.
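The smoothing and thresholding steps can be illustrated with a short sketch that operates on per-marker FST values already computed elsewhere. The centred moving average and its window size are assumptions, since the text does not state which smoother was used. The function names and SNP labels are hypothetical.

```python
def smooth(values, window=5):
    """Centred moving average; windows are truncated at the chromosome ends."""
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        smoothed.append(sum(values[lo:hi]) / (hi - lo))
    return smoothed


def highly_differentiated(markers, threshold=0.8):
    """Return names of markers whose FST meets the threshold (FST >= 0.8 in the text).

    markers: list of (snp_name, fst) pairs in map order.
    """
    return [name for name, fst in markers if fst >= threshold]


if __name__ == "__main__":
    # Invented values for illustration.
    markers = [("snpA", 0.85), ("snpB", 0.30), ("snpC", 0.80)]
    print(smooth([fst for _, fst in markers], window=3))
    print(highly_differentiated(markers))
```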

Genotypes and Quality Control
The percentage of polymorphic SNPs and the number of SNPs (NSNP) remaining after QC, per population and overall, are presented in Table 2. Two hundred and eleven individuals with a genotyping rate of at least 85% remained after QC. Windsnyer pigs had the highest percentage of informative markers (95%) after QC, whilst Warthog had the lowest at 82%. A total of 31,705 SNPs were removed, leaving 30,458 polymorphic SNPs distributed over the 18 autosomal chromosomes, which were used for AMOVA and FST analysis. After LD and IBD pruning, 23,345 SNPs and 176 individuals were used for the population structure analysis.

Genetic Diversity Across Populations
Genetic diversity parameters among the 10 populations are summarized in Table 2. Expected heterozygosity values ranged from 0.204 ± 0.151 for Warthog to 0.371 ± 0.126 for Capricorn. The highest inbreeding coefficient (FIS) was for Warthog at 0.398 ± 0.475, while the Duroc had the lowest and slightly negative value of −0.067 ± 0.153. FIS values were positive for all village populations as well as Warthog, suggesting some level of inbreeding within these populations. MAF was highest in the village population from Capricorn (0.264 ± 0.147) and lowest in Warthog pigs (0.076 ± 0.109).
(Figure 4). Genetic structure of the South African breeds was further investigated using ADMIXTURE. The results presented in Figure 5 show the Warthog and Bush pig populations clustering together and clearly separated from the rest of the populations at K = 2. Duroc separated from the rest of the populations at K = 3, followed by Vietnamese at K = 4. K = 4 clustered animals in the same way as observed with PCA-based clustering. Beyond K = 8, the genetic clusters of the commercial, indigenous, Asian and wild breeds are maintained, whilst the added K is distributed within the village populations. K = 10, which was the optimal K (Supplementary Figure S1) with the lowest CV error (0.551), resulted in eight distinct genetic clusters of commercial, indigenous, Asian and wild breeds plus highly admixed clusters consisting of all village pig populations from Limpopo and Eastern Cape provinces of South Africa.

Population Differentiation
Population pairwise FST values are shown in Table 3. Low FST values were observed between village populations, ranging from 0.022-0.060 (P < 0.05) within South Africa and in global populations. The highest differentiation was found between Warthog and Duroc at FST = 0.481. Warthog and Kolbroek pigs also showed high differentiation at 0.468. All other populations had FST values above 0.282. The extent of differentiation between Warthog and all the other populations was high, ranging from 0.312 (Warthog and Creole from Colombia) to 0.589 (Warthog and Vietnamese). The highest FST observed was between the Vietnamese and Bush pig populations at 0.700 (Supplementary Table S3).

DISCUSSION
The Porcine SNP60K BeadChip was developed in 2009 (Ramos et al., 2009) and has been used to analyze genetic diversity and population structure in several pig populations (Ai et al., 2013; Burgos-Paz et al., 2013; Yang et al., 2017; Mujibi et al., 2018). This is the first report using the Porcine SNP60K BeadChip to explore the diversity of domestic and wild pig populations covering the commercial, village, wild and conserved pigs farmed and reared in Africa. Pigs are thought to have reached Sub-Saharan Africa through the Nile corridor and later dispersed to West-Central Africa (Blench, 2000). There are 541 pig breeds worldwide (Rischkowsky and Pilling, 2007), but the dominant commercial breeds in the pork industry are the Large White, Landrace, Duroc, Hampshire, Berkshire and Piétrain (Rothschild and Ruvinsky, 2010). The source of the improved breeds found in Southern Africa is believed to be the European settlers in the 1600s (Krige, 1950; Blench and MacDonald, 2000; Swart et al., 2010). This was when Jan van Riebeeck brought some pigs to the Cape of Good Hope (Naude and Visser, 1994). The Large White, South African Landrace and the Duroc are the breeds mostly found and used in the commercial sector, while the Kolbroek and Windsnyer are considered indigenous and are mostly found in rural areas (Kem, 1993; Ramsay et al., 2000). The Vietnamese, Bush pig and Wild Boar populations constitute a small component of the genetic pool of pigs in the country and are often restricted to game reserves and zoos.
The Porcine SNP60K BeadChip was designed using genomic resources from Western pig genomes (Ramos et al., 2009), and hence the number of SNPs after QC for the commercial population was higher (Table 2). The village populations had a higher number of polymorphic SNPs and moderate to high MAF compared to commercial pigs. Non-descript livestock populations, including pigs, are often observed to be highly diverse, probably due to open mating systems and gene flow between populations. In South Africa, similar highly diverse and polymorphic populations were observed in village chickens (Khanyile et al., 2015), cattle (Makina et al., 2014), and village goats (Mdladla et al., 2016). The Warthog and other indigenous pigs were observed to be the least polymorphic and diverse, which could be attributed to ascertainment bias, as the Kolbroek, Windsnyer, Vietnamese Potbelly, Warthog and Bush pigs were not used in the development of the Porcine SNP60K BeadChip. Overall, the porcine SNP panel showed moderate MAF for the village, commercial and indigenous purebred pig populations such as the Windsnyer, implying utility of the chip in the prevalent farmed pig populations of South Africa.
A study conducted by Swart et al. (2010) using microsatellite markers in various Southern African pig breeds revealed higher levels of within-population diversity than were observed in this study for the same breeds (Table 2). High heterozygosity levels (0.61-0.75) were also reported by Halimani et al. (2012). In contrast to Swart et al. (2010), the Large White had the lowest diversity (HO = 0.358) compared to the South African Landrace (HO = 0.372) and the Duroc and Kolbroek breeds. It must be noted that these previous studies used microsatellite markers, which are highly polymorphic and cannot be directly compared to SNPs, which are biallelic in nature. Higher gene diversity is therefore expected with microsatellite markers. However, results on genetic diversity from this study were comparable to other studies that used the Porcine SNP60K BeadChip in Chinese and Western pig populations (Ai et al., 2013). The heterozygosity values for the indigenous pigs were relatively similar to those of the commercial pigs (Table 2). A lower diversity was expected for the commercial pigs as they are under selection, while the indigenous pigs are known to be rich reservoirs of distinct alleles, coupled with the presence of gene flow (Amills et al., 2012). However, the indigenous pig populations are also of very small flock sizes and are often fragmented and restricted to specific farming communities and conservation units, hence diversity was low. Small and fragmented populations and the possibility of natural selection due to disease and unfavorable climatic conditions could explain the genetic diversity observed in the village populations. The high inbreeding levels observed in the Warthog populations might have been promoted by their family structuring, where pigs are organized into fragmented breeding and social units (Table 2). Somers et al. (1995) noted that a group of Warthogs consists of about 40% adults, with seasonal changes.
The number of mature individuals is estimated to be between 2000 and 5000 in the Kruger National Park (Ferreira et al., 2013). The geographical separation of the three national parks from which the warthogs were sampled could have created small and fragmented subpopulations, leading to escalated FIS values (Raschetti et al., 2013) due to the Wahlund effect. As expected, we found that the village pig populations of South Africa had high inbreeding values compared with other populations. The negative FIS values for commercial and indigenous populations are reflective of their intensive production environment, as individuals are outbred to avoid mating between close relatives. The low levels of effective population size (Ne) in the recent 12-22 generations for both commercial and indigenous populations are of concern (Supplementary Table S1). This is more so in the indigenous breeds, since low levels of genetic diversity are likely to diminish over time and increase the risk of extinction. The effective population size of the Kolbroek, 34 at 12 generations ago, is even lower than the minimum threshold Ne of 50 set by the FAO (2000). Franklin (1980) recommended an Ne of at least 500, while Willi et al. (2006) suggested an Ne of more than 1,000 to maintain the evolutionary potential of any population. The genetic diversity of these populations will likely continue to be negatively impacted by the small number of founders and by their being farmed in fragmented populations. The small effective population size of the Kolbroek might be due to pigs being raised in a research facility with limited boars and sows. Large White, Duroc and South African Landrace are commercial pigs that have undergone strong selection for meat and carcass traits, resulting in small effective population sizes. Long-term sustainability of the populations might be compromised by the small population size, as it increases the effects of genetic drift and the reduction in fitness traits (Frankham et al., 1998).
The high FIS values observed within populations across breeds are similar to previous studies (SanCristobal et al., 2006; Swart et al., 2010; Gama et al., 2013; Edea et al., 2014). An overall AMOVA FIS value of 93.95% was comparable to the value of 92.90% reported by Halimani et al. (2012) in indigenous pigs of Southern Africa. Diversity amongst South African populations, which ranged from FCT = 0.92 (village pigs) to FCT = 5.42 (commercial populations), might be due to gene flow between different populations within a sub-population. Moderate diversity within populations (i.e., FIS ranging from 19.92 in the category consisting of South African Wild Boar and worldwide Wild Boar to FIS = 35.52 in the category consisting of South African villages and worldwide villages), relative to elevated FCT in the same categories, implies a higher genetic variation distributed among groups from different geographic locations. The genetic variation observed amongst groups of the South African and Burgos-Paz et al. (2013) pig populations (i.e., FCT = 62.35-73.58) is higher than the variation reported amongst Angora goats from South Africa, France and Argentina using the 50K SNP BeadChip (Visser et al., 2016), which could be explained by limited exchange of breeding animals across geographic boundaries in the studied pig populations. The among-population within-group diversity values, ranging from FSC = 0.46 for South African villages to FSC = 18.17 for South African commercial populations, demonstrate population sub-structure and genetic differentiation between the well-defined commercial and indigenous breeds relative to non-descript village populations that are characterized by weak population boundaries.
The PCA demonstrates the impact of domestication and geographic history on the clustering of populations. European populations, as represented by Wild Boar, South African Landrace, and Large White, clustered together as expected (Figure 3). Considering that the Wild Boar is an ancestor of the domestic pigs of today, some gene flow from the Wild Boar may have remained in the domestic pigs. The clustering of the Wild Boars reflects a European ancestry of the populations within that cluster. The slight difference between the Wild Boar and domestic populations might have been due to geographic isolation and artificial selection. Geographic structures were evident amongst most of the pig populations and were aligned to production systems and their founder effects. The clustering of the Windsnyer and the village populations could be due to gene flow between indigenous breeds and village populations. Limpopo populations were in closer proximity to Large White and South African Landrace, and farmers in this region are more likely to buy pigs from commercial herds. The Large White and South African Landrace are also closer together, as these are both European breeds. It was interesting that the village populations were generally closer to the Windsnyer and Kolbroek, as these are both indigenous breeds in South Africa. Although not much is known about our indigenous breeds, different theories suggest that the Kolbroek might have Far Eastern alleles, while the Windsnyer is known to be dominant in other parts of Southern Africa such as Mozambique, Zambia and Zimbabwe (Holness, 1973, 1991). The village populations and other Large Whites and Landraces from the global data set clustered together with the South African village, commercial and indigenous pigs, demonstrating genetic similarities that could be aligned to founder effects and similarities in production systems.
The clustering of Duroc away from other commercial populations (Large White and South African Landrace) was expected. The Duroc breed was created in the United States with pigs of several ancestries, including African pigs (Porter, 1993). Studies conducted by Kotze and Visser (1996) and Swart et al. (2010) using the microsatellite markers on the Large White, South African Landrace and Duroc also reported similar results. The Large White and South African Landrace were more genetically similar when compared to the Duroc. The inclusion of global populations did not alter this clustering (Figure 4).
The distance of the Vietnamese Potbelly population from the rest of the domestic pigs is clear evidence of the independent domestication of the European and Asian subspecies of the wild boar. The PCA including pigs genotyped from all over the world clearly shows the geographical effect on the populations, as the Vietnamese Potbelly clustered in close proximity to the Chinese population.
ADMIXTURE K = 2 presented the first level of ancestry of the Suidae family, representing Phacochoerus africanus (Warthog) and Potamochoerus larvatus (Bush pig) versus Sus scrofa (domesticated pigs including the Wild Boar) species (Figure 5). The presence of the Wild Boar genomic signature in the domestic pigs from K = 2 to K = 7 is not surprising (Figure 5). It is well documented that the domestic pigs diverged from each other and originated from the ancestral wild boars around 8,000-10,000 years ago (Laval et al., 2000; Larson et al., 2005). The Asian and European ancestral wild boars also originated from different subspecies, thus the Vietnamese Potbelly diverged early (K = 2) from the rest of the domestic pig population. The results for the village populations showed high levels of admixture and weak between-population sub-structuring. As opposed to pigs from the commercial sector, which practices intensive production systems, pigs in the villages are farmed under semi-intensive or free-range production systems, which might explain the admixture observed in this study. There is considerable indiscriminate crossbreeding taking place in village populations (Rege and Gibson, 2003). European and Asian pigs were used to improve the South African pig breeds, but the actual contribution is unknown. Although phenotypically distinct from each other, the Bush pigs and Warthogs clustered together, which is suggestive of either a common founder effect or common selection pressures in the natural environments.
According to Wright (1978), FST values of less than 0.05 represent low differentiation, values between 0.05 and 0.15 represent moderate genetic differentiation, and those between 0.15 and 0.25 and beyond reflect highly differentiated populations. The low levels of genetic differentiation of the village populations from this study (Table 3) (Cumming, 1975; Somers et al., 1994). In South Africa, Warthog populations are restricted to nature reserves, creating a physical barrier and large genetic differentiation between them and other pig populations. This is in contrast to the greater interaction between village, commercial and indigenous populations. Low FST values between the South African village populations and village populations from South America from the Burgos-Paz et al. (2013) study (Supplementary Table S3) might indicate either common founder populations or similarities in production systems leading to common selection pressures. Ramírez et al. (2009) demonstrated that the African and South American pigs were derived from European and Far Eastern pigs. The very high genetic differentiation between the Vietnamese Potbelly and Bush pig agrees with the PCA and ADMIXTURE clustering.
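The interpretation bands attributed to Wright (1978) in this paragraph can be written as a small lookup, applied here to FST values reported in the Results. The function name `wright_category` is hypothetical, and assigning the exact boundary values 0.05 and 0.15 to the higher band is an assumption, since the text does not say which side they fall on.

```python
def wright_category(fst):
    """Classify an FST value using the bands quoted in the text.

    < 0.05        -> "low"
    0.05 to 0.15  -> "moderate"
    0.15 and up   -> "high"
    """
    if fst < 0.05:
        return "low"
    if fst < 0.15:
        return "moderate"
    return "high"


if __name__ == "__main__":
    # FST values taken from the Results section.
    reported = {
        "village pair (lower bound)": 0.022,
        "village pair (upper bound)": 0.060,
        "Warthog vs. Duroc": 0.481,
        "Vietnamese vs. Bush pig": 0.700,
    }
    for label, fst in reported.items():
        print(label, fst, wright_category(fst))
```

Under these bands, the upper end of the village range (0.060) already counts as moderate rather than low differentiation.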
Per-marker pairwise FST values were estimated between pairs of highly differentiated populations, which were drawn from the village, commercial, indigenous, Asian and wild populations (Table 3).
From the pairwise FST, Warthog was found to be genetically different from the rest of the populations. The per-marker pairwise FST analysis used a threshold of 0.8 and above to plot Manhattan graphs of the Warthog against the rest of the populations. For the SNPs with FST ≥ 0.8, we looked at candidate genes and QTLs associated with those SNPs to infer traits that might have genetically differentiated the Warthog from the Alfred Nzo, Duroc, Kolbroek, Large White, South African Landrace, and Windsnyer populations (Supplementary Figure S2).
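The thresholding step described above can be illustrated with a short sketch. This section does not state which FST estimator was used, so the code assumes the basic Wright definition for a biallelic SNP, FST = (HT - HS) / HT, computed from allele frequencies in two equally weighted populations. The function names, the tuple layout of `markers`, and the SNP identifiers are all hypothetical.

```python
from collections import Counter


def per_marker_fst(p1: float, p2: float) -> float:
    """FST for one biallelic SNP from the allele frequency in two populations.

    Assumes the basic definition FST = (HT - HS) / HT with equal population
    weights; the estimator actually used in the study is not stated here.
    """
    p_bar = (p1 + p2) / 2
    h_t = 2 * p_bar * (1 - p_bar)  # expected heterozygosity, pooled
    if h_t == 0:
        return 0.0  # SNP is monomorphic in both populations
    h_s = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2  # mean within-population
    return (h_t - h_s) / h_t


def outlier_snps(markers, threshold=0.8):
    """Keep (snp_id, chromosome, fst) for markers with FST >= threshold.

    `markers` is an iterable of (snp_id, chromosome, p1, p2) tuples.
    """
    hits = []
    for snp_id, chromosome, p1, p2 in markers:
        fst = per_marker_fst(p1, p2)
        if fst >= threshold:
            hits.append((snp_id, chromosome, fst))
    return hits


def count_by_chromosome(hits):
    """Count how many outlier SNPs fall on each chromosome."""
    return Counter(chromosome for _, chromosome, _ in hits)


# Made-up allele frequencies for illustration; not data from this study.
example = [
    ("snp_a", 1, 1.00, 0.00),
    ("snp_b", 1, 0.95, 0.05),
    ("snp_c", 4, 0.90, 0.10),
    ("snp_d", 13, 0.30, 0.30),
]
hits = outlier_snps(example)
print(hits)
print(count_by_chromosome(hits))
```

Counting the retained SNPs per chromosome mirrors the kind of per-chromosome summary reported in Table 4; the retained SNP positions would then be looked up against gene and QTL annotations.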
The majority of the SNPs above the threshold between the Warthog and the rest of the populations were from chromosomes 1, 4, 5, 12, 13, and 15 (Table 4). Chromosomes 2 (Warthog vs. Alfred Nzo), 3 (Warthog vs. Kolbroek), 6 (Warthog vs. South African Landrace) and 14 (Warthog vs. Large White) were less common. Chromosome 1, with a total of 12 SNPs, was associated with reproduction and growth traits, while the indigenous populations of Kolbroek and Windsnyer were differentiated on chromosome 4, which was also linked to reproduction and growth traits.
Warthog vs. Alfred Nzo had three SNPs (FST ≥ 0.8) that are associated with reproduction (RPL18, IL17B) and growth (IL17B, ARHGAP23) characteristics (Table 4). Good nutrition is known to be vital for maximizing growth performance. The genes IL17B and ARHGAP23 are linked to the inflammatory response (Liu, 2015; Bie et al., 2017) and to the gastrointestinal tract, where they play a role in the digestion and absorption of nutrients. Inflammatory responses lead to a reduction in feed intake, which in turn affects the growth of the animal (Liu, 2015). Selection on genes associated with inflammation in the Warthog vs. Alfred Nzo comparison might be an effect of the different diets these populations scavenge on. Medzhitov (2008) noted the inflammatory response to be a protective mechanism against stress and harmful environments.
The growth-linked genes ADGRB3 and ACY1 were dominant in differentiating the Warthog vs. Duroc populations, with an overall total of 10 SNPs. Emrani et al. (2017) associated ADGRB3 with body weight traits in broiler chickens. The association of the ADGRB3 gene with Duroc rather than the Large White or South African Landrace breeds might be linked to the higher percentage of intramuscular fat in Duroc compared to the other two commercial breeds (De Vries et al., 2000). Mature Warthog males can also reach up to 100 kg and possess good meat and carcass qualities (Hoffman and Sales, 2007).
A total of 20 significant SNPs (FST ≥ 0.8) were linked to the Warthog vs. Kolbroek populations. Growth traits were associated with five of the SNPs between Warthog vs. Kolbroek. Indigenous Kolbroek are reported to be smaller in size when compared to commercial breeds such as Large White (Chimonyo et al., 2005). Kutwana et al. (2015) reported no significant difference (P > 0.05) between the Kolbroek and Large White populations that had higher fat percentages when compared to the other commercial breeds (Nicholas, 1999).
Chromosome 13 was also notable, with significant SNPs differentiating Warthog vs. Kolbroek and Warthog vs. Windsnyer. Only two SNPs appeared for Warthog vs. South African Landrace, and both were on chromosome 6. The Warthog vs. Windsnyer comparison had a total of fourteen differentiating SNPs. The identification of the BRPF1 gene in the Warthog vs. Windsnyer comparison is an important observation, as this gene is associated with intramuscular fat (IMF). IMF is an important characteristic for the value and taste of pork, because meat that is high in IMF tends to be juicy and tender (Eikelenboom et al., 1996; de Koning et al., 1999). The gene ATPB2, associated with six significant SNPs, is linked to heat stress and reproductive performance (Dash et al., 2016). Heat stress might result in poor reproduction in both sows and boars. Pigs cannot sweat, which makes them sensitive to high environmental temperatures, and this is of particular concern to commercial pig farmers (Ross et al., 2015).
Genes linked to immune response and mastitis were observed in the Indigenous vs. Duroc comparisons. The PTPN22 gene on chromosome 4 has a regulatory effect on T- and B-cell activation in the immune response (Lamsyah et al., 2009). PTPN22 plays a role in susceptibility to tuberculosis. Pigs are generally natural hosts of mycobacterial infections (de Lisle, 1994). Porcine TB has been reported in South Africa, where infections commonly occur via infected cattle fecal matter fed to piglets as well as through interactions with wild pigs (Muwonge et al., 2012). The NXPH1 gene is associated with dry matter intake (DMI) in cattle (Olivieri et al., 2016). Both the PTPN22 and NXPH1 genes were fixed in the Duroc, implying natural selection of the Duroc when compared to both indigenous pigs and Wild Boars. Breeds in the commercial sector are mainly selected for growth, carcass and meat quality traits. The indigenous and village populations, on the other hand, have not been systematically selected for such traits.
The NPY5R gene, located on chromosome 8, was associated with feed efficiency and fat deposition. This gene was also reported in Jinhua and Rongchang pigs, which are Chinese breeds. Fat deposition genes observed in the Indigenous vs. Vietnamese, Villages vs. Kolbroek, and South African Landrace with Large White vs. Indigenous comparisons agree with suggestions that Kolbroek and other indigenous pigs tend to carry their weight in their bellies and backs (Hoffman et al., 2005). Hoffman et al. (2005) also reported that breed type and diet influence the composition of the meat. This study therefore presented a diverse genomic architecture of South African pigs, with differentiating selection pressures for meat and carcass quality traits in the different pigs raised in diverse production systems.

CONCLUSION
Overall, the study demonstrated the utility of the Porcine SNP60K BeadChip in elucidating the genetic diversity and population genomic structure of South African pig populations relative to other global populations. Village pigs demonstrated distinctiveness from other domestic and commercial populations within South Africa and when compared to global populations. The study provided baseline knowledge with regard to the genetic diversity of the domestic and wild pig populations of South Africa, which is a prerequisite for population/breed characterization, utilization and conservation. A more in-depth analysis of patterns of genetic variation is required to gain more insight into the factors shaping the genetic diversity of these populations.

DATA AVAILABILITY STATEMENT
The datasets generated for this study can be found in Dryad doi: 10.5061/dryad.b0t10b0.

ETHICS STATEMENT
Ear tissue samples were collected from pigs using the Tissue Sampling Applicator Gun while pliers were used to collect the hair samples according to standard procedures and ethical approval from ARC-Irene Animal Ethics committee (APIEC16/028).

AUTHOR CONTRIBUTIONS
NH collected samples, analyzed the data, and wrote the draft manuscript. FM, PS, and ED designed the experiment and sourced funding. KH analyzed the genomic data for the experiment. FM, PS, and ED coordinated the conduct of the study and writing of manuscript and revisions. All authors read and approved the manuscript.

ACKNOWLEDGMENTS
We would like to thank the Agricultural Research Council-Biotechnology Platform (ARC-BTP) for funding the genotyping of samples. We express our gratitude to all the pig farmers, the Department of Agriculture (Eastern Cape and Limpopo) and various stakeholders who allowed us to use their animals in this study. NH holds fellowships from the National Research Foundation and ARC-Professional Development Program.

SUPPLEMENTARY MATERIAL
The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fgene.2020.00344/full#supplementary-material
FIGURE S1 | Cross validation plot for inferring the number of K populations in the analysis of population structure.