Genome-Wide Assessment of a Korean Composite Pig Breed, Woori-Heukdon

A Korean synthetic pig breed, Woori-Heukdon (WRH; F3), was developed by crossing parental breeds (Korean native pig [KNP] and Korean Duroc [DUC]) with their crossbred populations (F1 and F2). This study in genome-wide assessed a total of 2,074 pigs which include the crossbred and the parental populations using the Illumina PorcineSNP60 BeadChip. After quality control of the initial datasets, we performed population structure, genetic diversity, and runs of homozygosity (ROH) analyses. Population structure analyses showed that crossbred populations were genetically influenced by the parental breeds according to their generation stage in the crossbreeding scheme. Moreover, principal component analysis showed the dispersed cluster of WRH, which might reflect introducing a new breeding group into the previous one. Expected heterozygosity values, which were used to assess genetic diversity, were .365, .349, .336, .330, and .211 for WRH, F2, F1, DUC, and KNP, respectively. The inbreeding coefficient based on ROH was the highest in KNP (.409), followed by WRH (.186), DUC (.178), F2 (.107), and F1 (.035). Moreover, the frequency of short ROH decreased according to the crossing stage (from F1 to WRH). Alternatively, the frequency of medium and long ROH increased, which indicated recent inbreeding in F2 and WRH. Furthermore, gene annotation of the ROH islands in WRH that might be inherited from their parental breeds revealed several interesting candidate genes that may be associated with adaptation, meat quality, production, and reproduction traits in pigs.


INTRODUCTION
In the swine industry, crossbreeding has been widely used to exploit the phenomenon of heterosis, or hybrid vigor. The main benefit of the heterosis is increased performance of the resulting crossbred offspring over the average performance of its purebred parent pigs in traits of interest (Johnson, 1981;Falconer and Mackay, 1996). Although not all pig traits that were targeted for hybrid vigor show the same degree of heterosis, there has been significant success in harnessing heterosis to improve productivity of several economically important traits (Baas et al., 1992;Cassady et al., 2002a;Cassady et al., 2002b). Since 1970, most commercial pig producers have been using a classical terminal crossbreeding system with three breeds (Landrace × Yorkshire dam × Duroc sire, or LYD) to produce market pork in South Korea. The LYD system resulted in higher productivities, including increased growth rate, by taking advantage of crossbreeding (Jin et al., 2006;Choi et al., 2016).
In recent years, the South Korea swine industry has been partly changing from emphasis on efficiently producing more pork, mostly by focusing on growth and reproduction, to addressing taste-oriented consumption. There has been consistent consumer demand in Korea for diversified and great-tasting pork, despite its higher price than the typical market pork that is mostly derived from the LYD system (Kim S. G. et al., 2020). In response to such demand, the National Institute of Animal Science in Korea developed Woori-Heukdon (WRH), which is a synthetic breed derived from crossbreeding Duroc (DUC) and the Korean native pig breed (KNP) (Kim Y.-M. et al., 2020). KNP is known to have great meat qualities, such as high glucose content and a high unsaturated/saturated fatty acid ratio. However, because it also has economically unfavorable characteristics, such as slow growth rate, late maturity, and light carcass weight, it has low productivity compared with the commercial breeds; therefore, the population has decreased (Park et al., 2007;Hur et al., 2013). To take advantage of genetic merits of both the Duroc and KNP populations, WRH were generated by the crossing scheme shown in Figure 1. F2 was shown to have a slightly better growth rate compared with WRH; however, meat qualities, such as meat color and shear force, were clearly better in WRH than F2. Therefore, WRH preserved the characteristics of KNP, including superior meat quality, and had an improved growth rate compared with KNP (Kim et al., 2016).
Runs of homozygosity (ROH) are contiguous homozygous stretches that are inherited from each parent. Although ROHs are known to arise from several genetic factors, including genetic drift and population bottlenecks, it has been used as an indicator of selection signatures throughout the genome. In fact, ROHs have been widely used to quantify autozygosity in pigs (Schachler et al., 2020;Wu et al., 2020); because ROH are widely but not randomly distributed across the genome, some ROH overlap with genomic regions associated with economically important traits in pigs. Furthermore, ROH can be used to distinguish between recent and ancient inbreeding (Keller et al., 2011). In particular, shared ROH within a population can be used to identify genomic regions potentially under selection, which could be associated with the environment or production systems. Several studies have assessed autozygosity in livestock species using ROH approaches (Peripolli et al., 2018;Xu Z. et al., 2019;Dzomba et al., 2021). Offspring inherit chromosomal segments from the same ancestor by descent (Broman and Weber, 1999). Consequently, the extent of ROH can be used to estimate the inbreeding coefficient (Bosse et al., 2012;Marras et al., 2015).
Currently, there is still a lack of studies to elucidate genomic characteristics of the composite breeds that were generated by using the indigenous pigs. The main objective of this study were: 1) to estimate various parameters to clarify genomic population structure of the crossbreds with DUC and KNP populations; and 2) to detect and investigate ROH that could be an indicative of selection signature in WRH population.

Quality Control of Genotype Data
The quality control (QC) process for the genotype dataset was conducted separately for SNPs and animals using PLINK v1.9. We applied three different QC procedures for the initial raw SNP dataset to further analyze genetic diversity, population structure, and ROH. All three genotype subsets used the following process: 1) SNPs in sex chromosomes or unmapped in Sscrofa11.1; 2) SNPs with a call rate less than 90%. Individuals with a call rate less than 90% were removed. The ROH analysis was conducted using a subset of data without additional QC process. To assess genetic diversity, we also removed SNPs with a minor allele frequency (MAF) less than .05. Furthermore, SNP filtering based on pairwise linkage disequilibrium (LD) was conducted to minimize the reduction of informativeness of the dataset (Lopes et al., 2013) using the indep-pairwise 50 5 0.5 command for population structure analyses.

Genetic Diversity
To assess genetic diversity within populations, we calculated observed heterozygosity (H O ), expected heterozygosity (H E ), and individual inbreeding coefficient (F HOM ) using PLINK v1.9. The inbreeding coefficient was calculated based on homozygous genotypes as follows:

Population Structure
To explore the pattern of genetic differentiation of samples, we conducted principal component analysis (PCA) using PLINK v1.9, and the first two principal components (PCs) were visualized using R v4.1.0 (www.r-project.org, accessed July 4, 2021). The PCA plot was also considered a QC process because it reveals potential misclassification.
To better understand the relationship among parental breeds and their crossbred populations, an admixture analysis for K = 2 that is based on the number of ancestral populations was performed using ADMIXTURE v1.3 (Alexander and Lange, 2011).

ROH Detection
To detect ROH, we used a dataset of 56,498 SNPs for 2,040 individuals that resulted from QC without filtering based on MAF and LD because of problematic factors in ROH discovery (Marras et al., 2015;Meyermans et al., 2020). To avoid short and common ROH caused by LD, we set the minimum length of ROH to 1 Mb for ROH discovery, as described by several studies (Purfield et al., 2012;Ferencakovic et al., 2013;Marras et al., 2015;Meyermans et al., 2020). ROHs were identified using a sliding window method with the homozyg command in PLINK v1.9. The parameters used in ROH detection were applied as follows: 1) the minimum length of ROH was 1 Mb (--homozyg-kb); 2) an ROH had at least one SNP per 50 kb on average (--homozyg-density); 3) the distance of consecutive SNPs in the same ROH was less than 1,000 kb (--homozyggap); 4) no heterozygous SNP (--homozyg-window-het and-homozyg-het) and one SNP with a missing genotype were allowed (--homozyg-window-missing). In addition, the thresholds for the minimal SNP numbers per window (--homozyg-window-snp) and in the ROH (--homozyg-snp) were calculated by the L-parameter for the populations (Lencz et al., 2007;Purfield et al., 2012;Meyermans et al., 2020). In this study, we classified ROH into three categories based on their physical length: short (1 to <3 Mb), medium (3 to <5 Mb) and long (>10 Mb).
ROH islands, which were defined as regions where SNPs in ROHs had p-values higher than a specific threshold for each population, were marked as a potential selection signature, and ROH islands were determined using R-script at the Open Science Framework (https://doi.org/10.17605/OSF.IO/XJTKV) provided by Gorssen et al. (2021). The population-specific threshold was determined based on z-scores obtained from the distribution of ROH incidences. Additionally, the top 0.1% of SNPs (p-value >.999) were used to form ROH islands (Purfield et al., 2017;Gorssen et al., 2020). Furthermore, a threshold that a ROH should be included in at least 30% of individuals within each population was set for ROH islands. For highly inbred populations such as KNP (F ROH = 41%), the threshold was set to 80% because SNPs with p-values higher than .999 were not found in this population, as described by Gorssen et al. (2021).
Using discovered ROHs, we calculated the inbreeding coefficient based on ROH using the following calculation proposed by McQuillan et al. (2008): Where L. ROH is the sum of all ROH of an individual and L. Autosomes is the total length of autosomal genome covered by SNPs (in this study, 2,262.6 Mb). Furthermore, we investigated

SNP Characteristics
The SNP coordinates for each chromosome (SSC) were updated from Sscrofa10.2 to Sscrofa11.1 according to manifest file provided from Illumina (Supplementary Table 1). After initial QC steps, three additional individuals (DUC = 3) were removed because of breed misclassification according to PCA (Supplementary Figure 1). As a result of the QC procedure, we retrieved three SNP subsets for population structure (12,801 SNPs for 2,040 individuals), genetic diversity (43,809 SNPs for 2,040 individuals), and ROH analyses (56,498 SNPs for 2,040 individuals).

Population Structure
The top three PCs (PC1, PC2, and PC3) explained approximately 39.3, 15.8, and 5.7% of total variation, respectively. As shown in Figure 2A, DUC and KNP were clearly separated, and the cluster of F1 was located in the middle of DUC and KNP populations based on PC1. F2 and WRH were also located between DUC and KNP; however, both populations were closer to DUC. Furthermore, WRH showed a dispersed cluster based on PC2. Clustering patterns for (F1, F2, and WRH) were clear based on PC1 and PC3, which explained approximately 39.3 and 5.7% of the total variation, respectively ( Figure

ROH Patterns
The mean number and size of discovered ROH per pig within each population are described in Table 2. Among the five pig populations, the largest mean size of ROH was observed in KNP   Table 2). This result showed that the highest F ROH value was observed in KNP (0.409) followed by WRH (0.186), DUC (0.178), F2 (0.107), and F1 (0.035), which is the same order as mean ROH length. The correlation between F HOM and F ROH for each population was high in most populations, including DUC (0.84), KNP (0.83), F2 (0.86), and WRH (0.88); however, F1 had a correlation of 0.10 (Supplementary Figure 2). Because F HOM is influenced by allele frequency and sampling (Zhang et al., 2015) but the level of homozygosity in F ROH is independent of allele frequencies, the low correlation between F HOM and F ROH in F1 might be caused by sampling bias derived from the low sample size of F1 (N = 11) (Dixit et al., 2020). Therefore, we focused on F ROH to assess the degree of inbreeding, which is also better for detecting both common and rare variants than F HOM (Wang et al., 2019).

ROH Islands and Gene Annotation
As shown in Figure 5 and Supplementary Table 5, the ROH islands, which were defined as regions where SNPs in ROH had p-values higher than a threshold for each population, were observed for parental (DUC and KNP) and crossbred (F1, F2, and WRH) populations. We identified a total of 365 ROH islands of 1-Mb bins throughout the autosomal regions (Supplementary Table 6). DUC had ROH islands on SSC1-3, SSC7, and SSC14, whereas KNP had ROH islands on 11 autosomes, excluding SSC2, SSC3, SSC4, SSC6, SSC14, SSC15, and SSC18. Crossbred populations, especially F1 and F2, had similar occurrence patterns of ROH islands, with ROH islands on SSC1, SSC9, and SSC14-17 in both populations. Additional islands only in F1 were on SSC4, SSC5, SSC13, and SSC18, and those only in F2 were on SSC2 and SSC7. For WRH, SSC1, SSC3, SSC7, SSC9, and SSC14 had ROH islands. For all 1-Mb bin of ROH islands, we annotated 2,165 genes and 11 QTL information and are listed in Supplementary Table 6.

DISCUSSION
To investigate the population structure of DUC, KNP, and their crossbred populations (F1, F2, and WRH), we conducted PCA and ADMIXTURE analyses (Figure 2 and Figure 3). The most distant genetic relationship was observed between DUC and KNP based on PCA (Figure 2). In addition, KNP separated from DUC at K = 2 and had 100% distinct ancestry (Figure 3; Supplementary Table 2). The results are consistent with those of previous reports that showed clear genetic difference based on population structure and F ST analyses (Edea et al., 2014;   Lee et al., 2020). For the crossbred populations (F1, F2, and WRH) used in this study, all three populations were located between KNP and DUC in the PCA (Figure 2). In particular, F1 was located in the middle of KNP and DUC, which was also shown in other F1 populations generated from purebred parental pig breeds (Grossi et al., 2017) and cattle (Gobena et al., 2018). The ADMIXTURE analysis of K = 2 also revealed that F1 had 48.9% DUC and 51.1% KNP ancestry (Supplementary Table 2). The genetic distance of F2 and WRH crossbred populations was closer to DUC than that to KNP, which can be explained by the higher genetic composition of DUC than that of KNP in the crossing scheme of F2 and WRH (Figure 1) (Kim Y.-M. et al., 2020). This study only used parental breeds (DUC and KNP) and crossbred populations (F1 and F2) that were used to develop WRH, even though the previous study used additional Chinese and commercial breeds. Therefore, we inferred that a better estimation of ancestry for WRH was obtained in this study. However, WRH was shown to have a somewhat dispersed cluster in PCA (Figure 2A). This could be explained by newly generated breeding group of WRH using parental breeds (DUC and KNP) and crossbred populations (F1 and F2). The first national project to develop a Korean composite pig breed (WRH) started in 2008 by generating F1 and F2 populations. Subsequently, the first founder population of WRH was developed in 2010. Since then, the initial founder stock of WRH was used for breeding projects as a closed population until 2018. Because of difficulties maintaining a sufficiently large effective population size with a limited population size, they were subject to inbreeding (Dickerson, 1973). To decrease the level of inbreeding of this population, a recent project was initiated to construct new breeding group of WRH population since 2018. Therefore, recent introduction of new breeding group to the previous one might cause the somewhat widely distributed cluster shown in population structure analyses.
The effect of heterosis is difficult to quantify; however, heterozygosity can be used as an indicator of heterosis (Iversen et al., 2019). Genetic diversity levels were assessed by mean expected heterozygosity rate of five pig populations. The expected heterozygosity gradually increased according to crossing stages used in this study (F1, F2, and WRH) ( Table 1), and those values were higher than in the parental breeds. Similar to this result, a previous study reported that crossbred pigs generated between Dutch Landrace and Dutch Large White had higher heterozygosity levels compared with their parental breeds (Iversen et al., 2019). Among the five pig populations, the lowest degree of genetic diversity was observed in KNP, and this result is concordant with those of previous studies that reported the genetic diversity of KNP relative to most of studied other indigenous and commercial pig breeds using genomic datasets (Kim Y.-M. et al., 2020;Lee et al., 2020). Other studies that assessed genetic diversity of indigenous pigs from other countries also revealed loss of genetic diversity in indigenous pig breeds due to conservation status (Diao et al., 2019;Munoz et al., 2019). Although KNP has not undergone systematic artificial selection during conservation breeding, such loss of genetic diversity could be explained by a small effective  Frontiers in Genetics | www.frontiersin.org February 2022 | Volume 13 | Article 779152 6 population size and the founder effect or a population bottleneck (Munoz et al., 2019). In terms of breeding history, the KNP restoration project started in 1988 using only nine individuals (four males and five females) as a founder population (Kim et al., 2016); simultaneously, the preservation program was conducted as a closed population (Kim Y.-M. et al., 2020). We suggest that those characteristics of the breeding system might have caused a high level of inbreeding, as KNP had the greatest F ROH value (.409) ( Table 2).
In ROH analyses, we found a clear difference in ROH patterns between parental breeds and their crossbred populations (F1 and F2). Among parental breeds, DUC had 24.4% short and 12.7% long ROH segments (Figure 4). Additionally, KNP had abundant medium (56.5%) and long (41.9%) ROH segments. The short ROH segments may indicate evolutionary events from old inbreeding or selection, whereas long ROH may reflect recent inbreeding (Keller et al., 2011;Mastrangelo et al., 2017;Xu L. et al., 2019). DUC was shown to have more old inbreeding than Frontiers in Genetics | www.frontiersin.org February 2022 | Volume 13 | Article 779152 7 recent inbreeding or selection pressure due to intensive selection programs , whereas KNP underwent recent inbreeding of small populations (Valluzzi et al., 2021); this was also confirmed by the genetic diversity and inbreeding results in this study. Of the crossbred populations, F1 has the lowest number of total ROH segments (N = 438) among populations used in this study (Supplementary Table 3), and most of them are short ROHs (90.2%) and no long ROHs. These ROH patterns in F1 also caused the short length of concatenated ROH regions compared to F2 and WRH (Supplementary Table 4), which might be caused by the increase of heterozygous SNPs in the genome due to the hybridization of parental breeds. This is also supported by the highest value of observed heterozygosity in F1 (Table 1). However, a caution is required to interpret the ROH results derived from F1, because there was highly likely to be an underestimation in ROH numbers and size due to a low sample size (N = 11). Furthermore, we identified a total of 8,810 and 36,635 ROHs for F2 and WRH, respectively (Supplementary Table 3). In both populations, short ROH segments gradually decreased, whereas medium and long ROH segments increased; these changes also increased the inbreeding coefficient based on ROH ( Table 2) and the length of concatenated ROH region (Supplementary Table 4) in those populations. A previous study also revealed that, in a 3-way crossbred population [(Pietrain × Large White) × Duroc)], G0 had smaller homozygous segments than their parental populations, and ROH size was increased in the G1 population, which were the offspring of G0 (Ganteil et al., 2020). Consequently, WRH showed a similar proportion of size-classified ROH to DUC, but WRH had a higher proportion of long ROH than DUC; this indicated recent inbreeding events, as discussed earlier in the population structure analyses. We also observed parental ROHs in crossbred populations using coordinates of concatenated ROH regions for each of the populations (Supplementary Table 4). Most ROH regions in crossbreds (>98.8%) were considered to be from the ROH regions shared between parental breeds (DUC and KNP). In particular, the proportions of the ROH regions over the initial total length of ROH region overlapped between parental breeds were 91.55 and 99.55% for F2 and WRH, respectively (data not shown). We suggest that such increased ROH regions in F2 and WRH might be in part due to the inbreeding.
The discovery of ROH islands revealed numerous homozygous regions over five pig populations used in this study ( Figure 5; Supplementary Table 6). Furthermore, we found ROH islands in the parental breeds that were shared with their crossbred populations on SSC1-3, SSC7, SSC9, SSC13, SSC14, and SSC16, which indicated inheritance of homozygous regions (Supplementary Table 5). Most of those regions in crossbred populations were shorter than those in parental populations, which might be explained by ROH degeneration due to an increase of heterozygous SNPs or recombination (Bosse et al., 2012). The breakage of the ROHs is supported by the proportion of parental ROH regions in crossbred populations (Supplementary Table 4), which indicated that most of ROH regions from the crossbreds (F1, F2 and WRH) belongs to shared ROH regions between parental breeds. Previous studies also revealed ROH persistence in several crossbred pigs: Landrace × Large White (Landrace × Large White) × Duroc, and (Pietrain × Large White) × Duroc (Howard et al., 2016;Gomez-Raya et al., 2019;Ganteil et al., 2020).
In this study, we annotated genes to ROH islands to identify inheritance of homozygous regions that might be potential selection signatures from parental breeds to WRH, which is the last stage of the crossbreeding scheme. First, we found some candidate genes in ROH islands that overlapped between KNP and WRH. We found an ROH island on SSC9 (49-50 Mb) harboring the Cytotoxic And Regulatory T Cell Molecule (CRTAM) gene. This gene was reported to have an association with adaptive immune response in cattle (Ben-Jemaa et al., 2021). As KNP is an indigenous pig breed that has been adapted to the local environment of South Korea for a long period, we suggest that CRTAM might be a candidate gene that is associated with local adaptation of KNP and WRH populations. At the same location (SSC9; 49-50 Mb) of a ROH island, we found the Heat Shock Protein Family A Member 8 (HSPA8) gene, which is also known as Hsp70. HSPA8 is known to be associated with pork tenderness because this gene was down-regulated in tender samples (Hamill et al., 2012).
We also retrieved ROH islands that were shared between DUC and WRH. We found the ADAMTS Like 3 (ADAMTSL3) gene at 52-53 Mb on SSC7. ADAMTSL3 is a member of the ADAMTS superfamily of proteins, and this gene was previously reported as a candidate gene for body length in Large White pigs (Li et al., 2017) and height in humans (Weedon et al., 2008). In addition, we located the Cytoplasmic Polyadenylation Element Binding Protein 1 (CPEB1) gene at 52-53 Mb on SSC7. The CPEB1 is an RNA-binding protein that regulates mRNA translation by controlling the poly(A) tail length (Nagaoka et al., 2016). CPEB1 was reported to increase the rate of meiotic resumption and expression of cyclin B when mRNA was injected into immature oocytes (Nishimura et al., 2010). The purpose of using DUC in the crossing scheme includes complementing the body size and reproductive traits of KNP; thus, we suggest that ADAMTSL3 and CPEB1, which are located in ROH islands of DUC and WRH, might be associated with production and reproductive traits in both populations.

CONCLUSION
This study has shown genomic characteristics of crossbred pig populations derived from Korean Duroc and Korean native pigs as founder breeds. Population structure analysis showed genetic influence of founder breeds to crossbred populations. We also observed that WRH had two distinct subgroups due to newly introduced breeding group. For crossbred populations, the genetic diversity was gradually increased according to their crossbreeding stage (F1, F2 and WRH). In ROH analyses, short ROHs were decreased, while medium and long ROHs were increased from F1 to WRH, suggesting that recent Frontiers in Genetics | www.frontiersin.org February 2022 | Volume 13 | Article 779152 8 inbreeding is ongoing in WRH. In this study, there is a partial limitation to conclude on F1 due to its small samples size (N = 11). Furthermore, we identified shared ROH islands which contain candidate genes (CRTAM, HSPA8, ADAMTSL3 and CPEB1 genes) between WRH and founder breeds, suggesting inheritance of homozygous region that might be potential signatures of selection.

DATA AVAILABILITY STATEMENT
The SNP dataset for DUC used in this study is deposited in FigShare, https://doi.org/10.6084/m9.figshare.18318584.v1. The datasets for other populations are available from the corresponding author upon request.

ETHICS STATEMENT
All experimental protocols for this study were approved by the Institutional Animal Care and Use Committee at National Institute of Animal Science (approval number: NIAS 2020-479).