GWAS on Imputed Whole-Genome Resequencing From Genotyping-by-Sequencing Data for Farrowing Interval of Different Parities in Pigs

The whole-genome sequencing (WGS) data can potentially discover all genetic variants. Studies have shown the power of WGS for genome-wide association study (GWAS) lies in the ability to identify quantitative trait loci and nucleotides (QTNs). However, the resequencing of thousands of target individuals is expensive. Genotype imputation is a powerful approach for WGS and to identify causal mutations. This study aimed to evaluate the imputation accuracy from genotyping-by-sequencing (GBS) to WGS in two pig breeds using a resequencing reference population and to detect single-nucleotide polymorphisms (SNPs) and candidate genes for farrowing interval (FI) of different parities using the data before and after imputation for GWAS. Six hundred target pigs, 300 Landrace and 300 Large White pigs, were genotyped by GBS, and 60 reference pigs, 20 Landrace and 40 Large White pigs, were sequenced by whole-genome resequencing. Imputation for pigs was conducted using Beagle software. The average imputation accuracy (allelic R 2) from GBS to WGS was 0.42 for Landrace pigs and 0.45 for Large White pigs. For Landrace pigs (Large White pigs), 4,514,934 (5,533,290) SNPs had an accuracy >0.3, resulting an average accuracy of 0.73 (0.72), and 2,093,778 (2,468,645) SNPs had an accuracy >0.8, resulting an average accuracy of 0.94 (0.93). Association studies with data before and after imputation were performed for FI of different parities in two populations. Before imputation, 18 and 128 significant SNPs were detected for FI in Landrace and Large White pigs, respectively. After imputation, 125 and 27 significant SNPs were identified for dataset with an accuracy >0.3 and 0.8 in Large White pigs, and 113 and 18 SNPs were found among imputed sequence variants. Among these significant SNPs, six top SNPs were detected in both GBS data and imputed WGS data, namely, SSC2: 136127645, SSC5: 103426443, SSC6: 27811226, SSC10: 3609429, SSC14: 15199253, and SSC15: 150297519. Overall, many candidate genes could be involved in FI of different parities in pigs. Although imputation from GBS to WGS data resulted in a low imputation accuracy, association analyses with imputed WGS data were optimized to detect QTNs for complex trait. The obtained results provide new insight into genotype imputation, genetic architecture, and candidate genes for FI of different parities in Landrace and Large White pigs.

The whole-genome sequencing (WGS) data can potentially discover all genetic variants. Studies have shown the power of WGS for genome-wide association study (GWAS) lies in the ability to identify quantitative trait loci and nucleotides (QTNs). However, the resequencing of thousands of target individuals is expensive. Genotype imputation is a powerful approach for WGS and to identify causal mutations. This study aimed to evaluate the imputation accuracy from genotyping-by-sequencing (GBS) to WGS in two pig breeds using a resequencing reference population and to detect singlenucleotide polymorphisms (SNPs) and candidate genes for farrowing interval (FI) of different parities using the data before and after imputation for GWAS. Six hundred target pigs, 300 Landrace and 300 Large White pigs, were genotyped by GBS, and 60 reference pigs, 20 Landrace and 40 Large White pigs, were sequenced by wholegenome resequencing. Imputation for pigs was conducted using Beagle software. The average imputation accuracy (allelic R 2 ) from GBS to WGS was 0.42 for Landrace pigs and 0.45 for Large White pigs. For Landrace pigs (Large White pigs), 4,514,934 (5,533,290) SNPs had an accuracy >0.3, resulting an average accuracy of 0.73 (0.72), and 2,093,778 (2,468,645) SNPs had an accuracy >0.8, resulting an average accuracy of 0.94 (0.93). Association studies with data before and after imputation were performed for FI of different parities in two populations. Before imputation, 18 and 128 significant SNPs were detected for FI in Landrace and Large White pigs, respectively. After imputation, 125 and 27 significant SNPs were identified for dataset with an accuracy >0.3 and 0.8 in Large White pigs, and 113 and 18 SNPs were found among imputed sequence variants. Among these significant SNPs, six top SNPs were detected in both GBS data and imputed WGS data, namely, SSC2: 136127645, SSC5: 103426443, SSC6: 27811226, SSC10: 3609429, SSC14: 15199253, and SSC15: 150297519. Overall, many candidate genes could be involved in FI of different parities in pigs. Although imputation from GBS to WGS data resulted in a low imputation accuracy,

InTRODUcTIOn
Reproductive traits play an important role in pig industry and directly affect the sow reproductive performance. In recent years, the researchers extensively studied reproductive traits (such as litter size, birth weight, and number of teats) and identified many quantitative trait loci (QTLs) and candidate genes in pigs. However, a few studies focused on farrowing interval (FI). FI was defined as the number of days between two consecutive litters in sow's productive life. This trait is one of the major determinants of the efficiency of sow reproduction. The heritability ranged from 0.04 to 0.16 for FI in pigs (Serenius et al., 2003;Cavalcante Neto et al., 2009). The previous literatures have shown that these traits in different parities should be considered as different traits for farrowing traits (Roehe and Kennedy, 1995;Noguera et al., 2002;Onteru et al., 2011). Thus, FIs of different parities need to be used as different traits in analyses. In summary, a total of 28,720 QTLs were reported for 677 complex traits in pigs (PigQTL, https://www. animalgenome.org/cgi-bin/QTLdb/SS/index, March 1, 2019). Among them, 2,129 QTLs were associated with reproductive traits. However, there are no QTLs and genes found for FI in pigs.
Genome-wide association studies (GWASs) are an effective method for identifying the genetic variations involved in complex traits. In pigs, GWAS has been widely conducted to uncover the genetic architecture behind economically important traits, such as reproduction- , growth-, and meat-related traits (Jiang et al., 2018). Using this approach, a range of quantitative trait nucleotides, quantitative trait genes, and QTLs involved in important traits in pigs were identified (Jun et al., 2011;Ma et al., 2014;Derks et al., 2017;Zhang et al., 2018). However, because of the limited number of single-nucleotide polymorphisms (SNPs), the power of GWAS is very limited, and several causal loci were missed in previous studies. This led to an inability to identify the causal loci of complex traits. Whole-genome sequence (WGS) data containing the majority of SNPs were optimized to enhance the accuracy and power of GWAS and the detection of QTLs associated with complex traits. To obtain credible GWAS results, a large number of genotyped individuals were required in association analyses. Although the cost of resequencing is rapidly decreasing, it is still expensive to resequence thousands of individuals. An efficient imputation strategy for WGS data was thus recommended to detect causal loci. Using this method, low-density SNPs were imputed to high-density or WGS data at low cost. A small number of resequenced individuals (called as "reference population") and a large number of individuals with low-density SNPs (called as "target population") were used for genotype imputation. Based on the reference and target genotype data, genotype imputation used linkage disequilibrium (LD) of haplotypes in reference sequence data to predict the SNPs missing from target sequence data. Then, low-density SNPs were imputed to WGS data using the reference data.
Recently, genotype imputation has been successfully implemented and obtained reliable results in humans (Howie et al., 2012) and livestock (Sanchez et al., 2017;Ye et al., 2018;Berg et al., 2019). In cattle, 12 QTLs for mammary gland morphology (Pausch et al., 2016) and 34 QTLs for milk protein composition (Sanchez et al., 2017) were found using imputed data. Based on the imputed WGS data, Berg et al. (2019) reported that the detected QTLs increased with increasing SNP density and identified a clear peak on SSC7 for teats number in Large White and Dutch Landrace pigs. To detect the missing QTLs, the imputed WGS data were used to perform association analyses, and an important QTL was detected on SSC1 for lumbar number in Sutai pigs (Yan et al., 2017).
In general, the imputed data contributed benefit for association studies. However, only few literatures reported the factors that affected genotype imputation in livestock. According to the reported literatures, the imputation accuracy was affected by sequencing depth, size of reference population, the relationship between reference and target population (Ye et al., 2018), and marker density of target population (Ventura et al., 2016). Because the imputation would result in a poor imputation accuracy using multiple reference populations (Berg et al., 2019), a single-breed reference population may be optimal for imputation. To date, few studies analyzed the imputation accuracy from real genotype to imputed WGS data and performed GWAS using imputed WGS data in livestock.
To the best of our knowledge, no studies have reported on the imputation of genotyping-by-sequencing (GBS) data to WGS using whole-genome resequencing data of individuals as a reference population. In this study, the target populations were genotyped by GBS technology, and the reference populations were sequenced using whole-genome resequencing. Then, association analyses were performed for both the unimputed and imputed data. In this context, the objectives of this study were (1) to impute the GBS data to WGS data and analyze the accuracy of imputation to WGS and (2) to perform GWAS to reveal the genetic architecture behind FI of different parities in Large White and Landrace pigs.

Animals and Phenotype Records
A total of 660 pigs, 320 Landrace and 340 Large White pigs, from the national core pig breeding farm of Sichuan Tianzow Breeding Technology Co.,Ltd. (http://www.tianzow.com/ areashow.php?id=790,Nanchong,China), were used in this study. The ear tissues for 660 pigs were collected and stored in 75% alcohol, which was approved by the Institutional Animal Care and Use Committee of the Sichuan Agricultural University (DKY-B20140302).
All pigs with common genetic background were introduced from Canadian Hylife Company at 2008. The farrowing records were collected from parity 1 to 4 during the period of 2012-2015.
The FI values were defined as the number of days between two adjacent litters. Due to the different genetic architecture of each parity, FIs of different parities were considered as different traits. The following reproductive traits were defined and recorded for each pig: (1) FI from parity 1 to 2 (FI_L12), (2) parity 2 to 3 (FI_L23), and (3) parity 3 to 4 (FI_L34) in Landrace pigs; and (4) FI from parity 1 to 2 (FI_Y12), (5) parity 2 to 3 (FI_Y23), and (6) parity 3 to 4 (FI_Y34) in Large White pigs. Totals of 1,980 FI records were collected. The normal transformation of phenotypic data was conducted by R software (Aulchenko et al., 2007).

DnA extraction
The genomic DNA was extracted from ear tissues using the OMEGA Tissue DNA Kit (Omega Bio-Tek) as per the manufacturer's instructions. The Nanodrop-2000 spectrophotometer was used to measure the quality and quantity of the genomic DNA samples. The genomic DNA samples with the ratio of light absorption (A260/280) between 1.8 and 2.0, concentration ≥50 ng/µL, and total volume ≤50 µL were used for sequencing.

Reference Sequence Data
A total of 60 pigs, 20 Landrace and 40 Large White pigs, were selected for resequencing by random selection. The resequencing was performed by Illumina HiSeq PE150 platform, with average sequencing depth of 20-fold. The initial quality of resequencing data was performed by FastQC (http://www.bioinformatics. bbsrc.ac.uk/projects/fastqc/); 3.0-T clean data were contained. The clean reads were mapped to Sscrofa11.1 reference sequence by the BWA (version 0.7.15) software (Li and Durbin, 2009). After that, GATK (version 3.5) software  was used to realign the mapped reads and called SNPs. A total of 21,104,245 SNPs were called by GATK. A quality control procedure was adopted by removing SNPs with minor allele frequency (MAF) <0.05, missing rate (Miss) >0.1, Hardy-Weinberg equilibrium (HWE) >1.0 × 10 -6 , read depth (dp) <6, and SNPs with no position information and located on sex chromosomes. Quality control was conducted using VCFtools (version 4.2) (Danecek et al., 2011). After quality control, a total of 10,501,384 SNPs remained. The WGS data were used as the reference sequence for further study.

Target Sequence Data
The remained 600 pigs, 300 Landrace and 300 Large White pigs, were selected as the target population. A total of 600 samples were genotyped using GBS with Illumina HiSeq PE150 platform.
Quality control with MAF >0.01, Miss <0.2, HWE <1.0 × 10 -6 , and dp > 3 was performed by VCFtools (version 4.2) (Danecek et al., 2011). Then, the SNPs with no position information and located on sex chromosomes were excluded from this dataset. After quality control, a total of 325,557 SNPs were retained for the target population.

Genotype Imputation
The imputation from GBS SNP genotypes to WGSs for Large White pigs and Landrace pigs was performed by Beagle (version 3.3.2) (Browning and Browning, 2007) with default parameter settings. The genotype imputation was separately conducted for each breed. For Landrace pigs, the GBS data for 300 target Landrace pigs were imputed to WGS data, using the WGS reference data of 20 Landrace pigs. For Large White pigs, using the WGS reference data of 40 Large White pigs, the GBS data of 300 target Large White pigs were imputed to WGS data. The imputation accuracy at each SNP was assessed using the estimated squared correlation between the allele dosage and true allele dosage for the marker (allelic R 2 ). After imputation, two filter criteria were conducted: (1) removing SNPs with an imputation accuracy < 0.3 and MAF < 0.01 and (2) removing SNPs with an imputation accuracy < 0.8 and MAF < 0.01.

Genome-Wide Association Study
Single marker regression analyses were performed independently on GBS data and imputed genotype data using GEMMA  software. The following univariate mixed liner model was used to test the association between SNPs and FI: y X Z Wa e = + + + α α β β where y is the vector of phenotypic values; α is the vector of fixed effects, including farrowing year, farrowing month, parity; β is the marker effects; a is the vector of the remaining polygene effect; e is the vector of residual effects; X, Z, and W are incidence matrices for α, β, and a, respectively. The Bonferroni correction method was used to determine the threshold values in this study. The genome-wide significance level (0.05/N) and suggestive level (1/N) were used in this study, where N is the number of analyzed SNPs (Supplementary Table 1).
The Manhattan and QQ plots were drawn by R package "qqman" (Turner, 2014). The genomic inflation factor (λ = the observed P value/the expected P values) was calculated to evaluate the extent of false positive signals using the GenABEL package in R (Aulchenko et al., 2007).

candidate Genes Searching
In order to highlight candidate genes at genome-wide significant loci, candidate genes were searched within a 20-Kb region centering each top SNP on pig genes Sscrofa11.1 (http://asia.ensembl.org/biomart/martview/). The gene function was carried out by NCBI database (https://www. ncbi.nlm.nih.gov/) based on the description of gene function and reported literatures. Furthermore, this study performed the gene ontology (GO) analyses by DAVID Bioinformatics Resources (Dennis et al., 2003). The Fisher exact test was used to detect the significant GO terms, and the genes involved in significant GO terms (P < 0.05) were used for further analyses (Dennis et al., 2003;Rivals et al., 2007).

Genotype Data
The overview of numbers of SNPs for each breed is shown in Table 1. After quality control, a total of 10,501,384 and 325,557 SNPs remained from 18 autosomes for reference and target population, respectively. After imputation, this study obtained 12,835,977 and 13,144,579 SNPs from 18 autosomes for 300 Landrace and 300 Large White pigs, respectively. After removing SNPs with allelic R 2 < 0.3, a total of 4,514,934 SNPs for Landrace pigs and 5,533,290 SNPs for Large White pigs were retained.
After filtering SNPs with allelic R 2 <0.8, 2,093,778 and 2,468,645 were retained for Landrace and Large White pigs, respectively.

Imputation Accuracy
The genotype imputation was conducted by Beagle software, and imputation accuracy at each SNP was assessed using allelic R 2 . The plots of imputation accuracy are shown in Figure 1 for each chromosome before and after filtering data. After imputation, the GBS data were imputed to WGS data with a poor imputation accuracy. The average accuracy of whole genome was 0.42 for Landrace pigs and 0.45 for Large White pigs. The lowest accuracy and highest accuracy were 0.37 (for SSC17) and 0.46 (for SSC4) in Landrace pigs. For Large White pigs, the lowest accuracy and highest accuracy were 0.40 (for SSC10) and 0.51 (for SSC4). After quality control, these retained SNPs had accuracy lower than 0.3, resulting in an average accuracy of 0.73 and 0.72 for Landrace and Large White pigs. After removing the SNPs with the accuracy lower than 0.8, the average accuracies were 0.94 and 0.93 for Landrace and Large White pigs. Factors That Affect Imputation Accuracy To investigate factors that affect imputation accuracy, the distance and MAF difference between an imputed SNP and its closest SNP on the GBS data and MAF of imputed SNPs were analyzed. Figures  2A, B showed the average imputation accuracy versus MAF of imputed SNPs for the two populations. The average imputation accuracy was poor for SNPs with MAF smaller than 0.1 in two breeds. The average imputation accuracy was comparatively stable at MAF 0.10 to 0.45, but there was a temporary reduction when the MAF approached 0.5. After quality control, the average imputation accuracy with different MAFs was stable and both close to 0.72 (0.93 for accuracy > 0.8) in two breeds (Supplementary Figure 1). The large distance and MAF difference between imputed SNPs and their nearest SNPs on GBS data would result in a low imputation accuracy. The average imputation accuracy decreased with increasing distance and MAF difference, as illustrated by chromosome 1 for two breeds (Figures 2C-F).

Whole-Genome Resequencing Association Analyses
This study performed GWAS on two target populations (Landrace and Large White pigs) in three scenarios, using data before and after imputation. In the first scenario, using the GBS data, GWAS was conducted for each breed. In the second scenario, using imputed WGS data (imputation accuracy > 0.3), GWAS was conducted for each breed. In the third scenario, imputed WGS data with accuracy higher than 0.8 were used in association analyses for each breed.
FIGURe 1 | Imputation accuracy from GBS to WGS data for each chromosome in Landrace (A) and Large White pigs (B). Imputation accuracy before filtering (orange), after filtering with allelic R 2 > 0.3 (blue), and after filtering with allelic R 2 > 0.8 (red).
FIGURe 2 | Average imputation accuracy versus minor allele frequency of imputed SNPs, distance, and MAF difference between the imputed SNPs and their closest SNPs on GBS data for Landrace (A, c, and e) and Large White pigs (B, D and F). SNPs were grouped in bins of 1,000 SNPs with similar MAF differences.

GWAS for Data Before Imputation
Using the original GBS data, this study conducted singlemarker association studies for FI in each parity and each population. For Landrace pigs, the Manhattan plots are shown in Figure 3A. The Q-Q plots are shown in Supplementary  Figure 2A, and the genomic inflation factors were between 0.99 and 1.05 (Supplementary  Figure 4A. The Q-Q plots are shown in Supplementary Figure  3A, and the genomic inflation factors were between 1.08 and 1.17 (Supplementary Table 5). Totals of 111, 8, and 9 genomewide significant SNPs were associated with FI_Y12, FI_Y23, and FI_Y34 (Table 2), respectively. The significant chromosome regions on SSC7 (2.04-2.08, 2.20-2.24, and 133.40-133.44 Mb) and  Mb) showed three clear signals. Nine consecutive SNPs were located in the region of SSC7: 2.04 to 2.08 Mb for FI_Y12, and the top SNP was SSC7: 2,061,622 bp (P = 3.32 × 10 -10 ). In the region of SSC7: 2.20 to 2.24 Mb, the significant locus with the top SNP SSC7: 2,221,168 bp (P = 9.60 × 10 -9 ) harbored 13 adjacent SNPs for FI_Y12. In addition, six significant SNPs were located in the region of 5.

GWAS for Data After Imputation
For imputed WGS with accuracy >0.3, in Landrace pigs, totals of 4, 404,137, 4,402,124, and 4,380,454 SNPs were analyzed for FI_L12, FI_L23, and FI_L34, respectively. In Large White pigs, the totals were 5, 486,741, 5,486,265, and 5,474,718 SNPs for FI_ Y12, FI_Y23, and FI_Y34, respectively (Supplementary Table 1). Based on the genome-wide-significance threshold (0.05/N) and  suggestive threshold (1/N), a total of 125 genome-wide significant SNPs and 259 suggestive SNPs were detected in two populations. For Landrace pigs, the results are shown in Figure 3B and Table 3. The Q-Q plots are shown in Supplementary Figure 2B, and the genomic inflation factors were between 0.95 and 1.03 (Supplementary Table 5). No genome-wide significant loci were found for FI of each parity using imputed data. Meanwhile, a total of 15 suggestive loci (Supplementary Table 3) were detected for For Large White pigs, the results are shown in Figure 4B and Table 3. The Q-Q plots are shown in Supplementary Figure 3B, and the genomic inflation factors were between 1.01 and 1.03 (Supplementary Table 5). A total of 125 genome-wide significant SNPs were identified, including 102 significant SNPs for FI_Y12 and 23 SNPs for FI_Y34 ( Table 3). In addition, a total of 244 suggestive SNPs were found, including 111 SNPs for FI_Y12, 1 for FI_Y23, and 132 for FI_Y34 (Supplementary Table 3). Notably, four interesting peaks were observed, as shown in Figure 4B. A significant region at 2.14 to 2.18 Mb on SSC7 with the top SNP of SSC7: 2,159,059 bp (P = 6.93 × 10 -11 ) was found for FI_Y12. At 10.67 to 10.71 Mb in SSC14, this study detected a significant locus (top SNP SSC14: 10,694,678 bp, P = 2.00 × 10 -9 ) for FI_Y12. The third region was located in SSC15: 154.91 to 154.95 Mb with the top SNP SSC15: 154,933,940 bp (P = 1.72 × 10 -9 ) for FI_Y34. The fourth region was located in SSC9: 38.20 to 38.24 Mb (Supplementary Table 3). In this region, a total of 119 consecutive SNPs and two candidate genes (ZC3H12C and RDX gene) for FI_Y34 were detected; the top SNP was located at SSC9: 38,215,712 bp (P = 3.74 × 10 -8 ).
After filtering the imputed data to those with imputation accuracy of higher than 0.8, in Landrace pigs, totals of 2,043,321, 2,044,003, and 2,032,985 SNPs were analyzed for FI_L12, FI_ L23, and FI_L34, respectively. In Large White pigs, the totals were 2, 453,952, 2,453,789, and 2,449,239 SNPs for FI_Y12, FI_Y23, and FI_Y34, respectively (Supplementary Table 1). The GWAS results are shown in Figure 3C, Figure 4C, and Table 4 and Supplementary Table 4. The Q-Q plots are shown in Supplementary Figures 2C and 3C, and the genomic inflation factors were between 0.94 and 1.03 (Supplementary Table 5). Based on the genome-wide significance threshold (0.05/N) and suggestive threshold (1/N), a total of 27 genome-wide significant

comparing GWAS Results for GBS Data and Imputed Data
In this study, a total of six genome-wide significant loci were simultaneously uncovered for genotype data both before and after imputation. In the region of 136.11 to 136.15 Mb, the first locus with the top SNP SSC2: 136,127,645 bp was detected, and the P values of the top SNP were 6.04 × 10 -11 and 5.81 × 10 -11 for GBS data and imputed data, respectively. The second locus was located in the region 103.41 to 103.45 Mb on SSC5. The P values of the top SNP SSC5: 103,426,443 bp in the data before and after imputation were 5.22 × 10 -16 and 4.29 × 10 -16 , respectively. The third locus was located in the region of SSC15: 27.79 to 27.83 Mb; the position of the top SNP was SSC6: 27,811,226 bp (P = 2.56 × 10 -8 for original data; P = 3.44 × 10 -8 for imputed data).
In the region of SSC10: 3.59 to 3.63 Mb, the fourth common top SNP SSC10: 3,609,429 bp was found, with P values of 3.40 × 10 -12 (original) and 2.81 × 10 -13 (imputed). The fifth locus was found in the region of SSC14: 15.18 to 15.22 Mb, and the P values of the top SNP SSC14: 15,199,253 bp were 8.60 × 10 -9 (original) and 8.07 × 10 -9 (imputed). At the sixth locus, the P values of the top SNP SSC15: 150,297,519 bp were 4.32 × 10 -13 and P = 3.82 × 10 -13 for the data before and after imputation, respectively. Moreover, a clear peak (SSC9: 38.20-38.24 Mb) was observed in the imputed data with accuracy higher than 0.3 and 0.8.

candidate Genes
The GWAS results using imputed data were used to detect functional genes in this study. Based on the pig genes Sscrofa11.1, a total of 10 candidate genes were detected near (within 20 Kb) the genome significant SNPs for imputed data. And, within a 20-Kb region centering each suggestive SNP, 20 genes were found for imputed data. These genes were used to conduct GO analyses in DAVID software. A total of 210 GO terms were detected, of which four significant GO terms (P < 0.05) were identified. The significant GO terms were associated with cellular response to heat, secretory granule, regulation of gene expression, and response to hypoxia. Considering the genes involved in these significant GO terms and functional annotation in NCBI database and reported literatures, CHST11, NF1, and ADRA1A promising candidate genes were suggested for FI.

DIScUSSIOn
This is the first study to investigate the imputation accuracy from GBS to WGS data and perform GWAS using the GBS and imputed WGS data for FI of different parities in Landrace and Large White pigs. The previous study demonstrated that using multiple reference populations, the imputation from low-density SNPs chip to WGS resulted in a poor imputation accuracy in pigs (Berg et al., 2019). Thus, 20 Landrace and 40 Large White pigs were separately used as a single-breed reference population for each breed. A total of 300 Landrace and 300 Large White pigs were separately used as the target population in this study. Subsequently, the imputations were conducted for each breed by Beagle software. In summary, GWAS using the imputed WGS demonstrated that use of imputed WGS would improve identification of genetic variants. The imputation accuracy and GWAS results were discussed in the following sections.

Imputation From GBS to WGS
Imputation accuracy is known to be affected by the size of the reference population, population structure, imputation method, and marker density (Ye et al., 2018). To investigate the accuracy of imputation to WGS in Landrace and Large White pigs, the GBS data were directly imputed to WGS data, using a single-breed reference population with Beagle software. After imputation, 35.17% to 42.10% of SNPs had imputation accuracy higher than 0.3. Differences in the reference population size would result in differences in imputation accuracy. Specifically, a limited reference population would result in poor imputation accuracy (Binsbergen et al., 2014). In our study, the average imputation accuracy was higher for Large White pigs than for Landrace pigs. In addition, for most chromosomes (with the exceptions of chromosomes 3, 10, and 16), the accuracy for each chromosome was also higher for Large White pigs than for Landrace pigs. This can be explained by there being only 20 Landrace pigs in the reference population compared with 40 Large White pigs. These results demonstrate that the reference population size contributed to imputation accuracy and that imputation accuracy increased with increasing population size. In addition, the differences in population structure and genetic architecture between Large White and Landrace pigs probably also resulted in the differences in imputation accuracy between two breeds. These factors would result in different rates of LD decay and different numbers of independent chromosome segments (Goddard et al., 2009). Lower LD decay and a high rate of shared haplotypes would result in high imputation accuracy. Using Beagle software, it was reported that the rates of imputation accuracy from BovineHD bead chip to WGS were 0.77 to 0.83 in bovines (Binsbergen et al., 2014). In addition, in a study by Ni et al. (2015) on chicken, the imputation accuracy from a 600 K chip to WGS data was found to be more than 0.95. The imputation accuracy was also reported to increase with increasing density of the target SNP chip, sequencing cost, number of reference individuals (Ye et al., 2018), and MAF (Berg et al., 2019). In that study, it was found that accuracy of imputation from 60-and 600-K chip data to WGS data was 0.62 and 0.81, while the accuracy ranged from 0.421 to 0.897 for 1 to 24 reference individuals in hens (Ye et al., 2018). However, imputation accuracy in this study was similar to that in a study with 90 reference individuals (accuracy of 0.46) in bulls (Binsbergen et al., 2014). Expanding the reference population size would improve imputation accuracy (Bouwman and Veerkamp, 2014). In comparison to these previous studies, the imputation accuracy from GBS to WGS data was relatively low in our study. Possible reasons for this include the limited reference population (reference populations of 20 and 40 individuals) and the lowdensity target genotypes (about 320-K genotypes).
Quantitative traits are controlled by few genes with large effects and numerous polygenes with minor effects. Loci with a low allele frequency may have large effects on the complex traits (Manolio et al., 2009). However, average imputation accuracy was quite low for the genetic variants with lower MAF. Indeed, accuracy is generally lower for SNPs with low MAF. Factors that could affect imputation accuracy are MAF of imputed SNPs, distance, and MAF difference between the imputed SNPs and their nearest SNPs on target data. As expected, the average imputation accuracy sharply decreased for SNPs with MAF of imputed SNPs below 0.1, while it was comparatively stable at MAF of imputed SNPs 0.1 to 0.45 in this study. These findings are in agreement with the literatures (Chen and He, 2012;Hayes et al., 2012;Ye et al., 2018;Bolormaa et al., 2019). In another study, the imputation accuracy showed a sharp decline when MAF was smaller than 0.2 in hens (Ye et al., 2018). However, the average imputation accuracy was here found to decrease at MAF > 0.45. The large distance and MAF difference would result in a low imputation accuracy (Binsbergen et al., 2014;Berg et al., 2019). In this study, the distance and MAF difference between imputed SNPs and their closest SNPs on GBS data were large at MAF <0.1 and MAF > 0.45 in two populations (Supplementary Figures  4 and 5). And the average imputation accuracy decreased with increasing distance and MAF difference. Thus, the low average imputation accuracy at MAF < 0.1 and MAF > 0.45 may have been caused by the large distance and MAF difference between imputed SNPs and their closest SNPs on GBS data. Furthermore, the appropriate selection of key individuals used as a reference population would contribute to increasing the accuracy for low-MAF SNPs compared with random selection (Pausch et al., 2013;Moghaddar et al., 2015;Ye et al., 2018). However, the use of a reference population with the closest relationship would result in a lower imputation accuracy than random selection (Yu et al., 2014). In this study, random selection of the reference population resulted in poor imputation accuracy. In further study, the appropriate selection of key individuals for sequencing should be performed to determine whether it can contribute to increasing the accuracy of imputation from lowdensity data to WGS data in pigs.

The Identified QTls and Potential candidate Genes
Based on low-density SNP chips, association analyses achieved low power and accuracy for detecting loci associated with complex traits. Using WGS data, GWAS is optimized to identify genetic variants for complex traits. However, the imputation from GBS to WGS data obtained a poor imputation accuracy. The imputed WGS data contained the most causal loci. The use of imputed WGS data instead of GBS data improved the identification of loci of interest. In this study, association analyses using imputed WGS data rather than GBS data could reduce the mapping noise and highlight the peaks that are important for FI. All of the SNPs for imputed WGS with accuracy higher than 0.8 matched those in the imputed WGS with accuracy higher than 0.3.
In this imputed GWAS, a total of 90 significant SNPs associated with FI_Y12 were located on 12 QTL regions reported to be associated with reproductive traits, including age at puberty , gestation length (Wilkie et al., 1999;Chen et al., 2010), corpus luteum number (Rohrer et al., 1999;Schneider et al., 2014), teat number (Hernandez et al., 2014), litter size (Hernandez et al., 2014), number of stillborn (Onteru et al., 2012), and number of offspring born alive (Tribout et al., 2008). For FI_Y34, the peak SNP (SSC12: 43730214) was included within seven reproduction-related QTL regions associated with teat number (Hirooka et al., 2001) and gestation length (Chen et al., 2010), in which the NF1 gene was found. This study further confirmed the importance of these 19 QTLs in pigs.
GWAS with imputed WGS data may be effective to detect putative candidate genes. Numerous putative candidate genes for FI located near these identified loci were found in two populations. Of these, the product of the CHST11 gene is localized at the Golgi membrane and is a key enzyme in the biosynthesis of chondroitin sulfates (Mikami et al., 2003). This gene was previously found to be highly expressed in ovarian cancer (Oliveira-Ferrer et al., 2015). The GO terms of the CHST11 gene are related to chondrocyte development, postembryonic development, embryonic digit morphogenesis, developmental growth, and embryonic viscerocranium morphogenesis. The NF1 gene is known as a key transcription factor that modulates the tissue-specific transcription of various genes (Ivanov et al., 1990). In bovines, the expression of the NF1 gene was also found to dramatically increase in the development of lactation (Ivanov et al., 1990). The NF1 gene also plays a significant role during embryogenesis in mice (Gutmann et al., 1995). Moreover, studies involving knockout of the NF1 gene suggested that this gene significantly affects osteoblast development in embryonic stem cells (Yu et al., 2010) and is expressed in lactation and mammary glands. The ADRA1A gene is a member of the G protein-coupled receptor superfamily and modulates the mitogenic response and growth and proliferation of cells. It encodes the α-epinephrine receptor for epinephrine, norepinephrine, and catecholamine (Freitas et al., 2008) and plays an important role in smooth muscle contraction, myocardial inotropism, and hepatic glucose metabolism (Stéphane et al., 2007). Furthermore, this gene was found to play an important role in fetal sheep (Giussani et al., 1995). Therefore, investigation into the molecular mechanisms associated with these identified loci and genes could provide valuable insight into the genetic architecture behind FI in pigs and help to improve pig breeding.
cOnclUSIOn Using a single-breed reference population, imputation from GBS to WGS data resulted in a poor imputation accuracy. After imputation, the average imputation accuracy (allelic R 2 ) was 0.42 and 0.45 for Landrace and Large White pigs, respectively.
The different size of reference population may have contributed to this difference of imputation accuracy in Landrace and Large White pigs. Although the imputation accuracy was low, the imputed WGS data promoted the detection of loci affecting quantitative traits. The use of the imputed WGS for GWAS appeared to reduce the mapping noise and highlight the important peaks in this study. These results provide useful novel insight into the genetic variants and genes associated with FI of different parities in Landrace and Large White pigs. However, further studies are needed to determine the optimal imputation strategy from GBS to WGS data and to validate these identified SNPs and genes.

DATA AVAIlABIlITY STATeMenT
The datasets supporting the results of this article are included within the article. All genotypic and phenotype data were deposited in Figshare