Using Different Single-Step Strategies to Improve the Efficiency of Genomic Prediction on Body Measurement Traits in Pig

In genomic prediction, single-step method has been demonstrated to outperform multi-step methods. This study investigated the efficiency of genomic prediction for seven body measurement traits in Yorkshire population and simulated data using single-step method. For Yorkshire population, in total, 592 individuals were genotyped with Illumina PorcineSNP80 marker panel. We compared the prediction accuracy obtained from a traditional pedigree-based method (BLUP), a genomic BLUP (GBLUP) and a single-step genomic BLUP (ssGBLUP) through 20 replicates of 5-fold cross-validation (CV). In addition, we also compared the performance of two-trait ssGBLUP and single-trait ssGBLUP for the traits with different gradients of genetic correlation. Our results indicated the GBLUP method generally provided lower accuracies of prediction than BLUP and ssGBLUP methods, and the average standard deviation of unbiasedness was as large as 0.278. For single-step methods, the accuracies of ssGBLUP for seven body measurement traits ranged from 0.543 to 0.785, and the unbiasedness of ssGBLUP ranged from 0.834 to 1.026, respectively. ssGBLUP generally generated 1% on average higher prediction accuracy than traditional BLUP, the improvement of ssGBLUP and the performance of GBLUP was lower than expected mainly due to the small number of genotyped animals, it was further demonstrated by our simulation study. We simulated two traits with heritabilities 0.1, 0.3, and with high genetic correlation 0.7, our results also showed that the prediction accuracies were low for GBLUP compared with other three methods with different genotyped reference population sizes and the accuracies were improved with increasing the genotyped reference population size. However, the increase was small for ssGBLUP compared with BLUP when the genotyped reference population size was <500. Our results also demonstrated that the accuracies of genomic prediction can be further improved by implementing two-trait ssGBLUP model, the maximum gain on accuracy was 2 and 2.6% for trait of chest width compared to single-trait ssGBLUP and traditional BLUP, while the gain was decreased with the weakness of genetic correlation. Two-trait ssGBLUP even performed worse than single trait analysis in the scenario of low genetic correlation.

In genomic prediction, single-step method has been demonstrated to outperform multi-step methods. This study investigated the efficiency of genomic prediction for seven body measurement traits in Yorkshire population and simulated data using single-step method. For Yorkshire population, in total, 592 individuals were genotyped with Illumina PorcineSNP80 marker panel. We compared the prediction accuracy obtained from a traditional pedigree-based method (BLUP), a genomic BLUP (GBLUP) and a single-step genomic BLUP (ssGBLUP) through 20 replicates of 5-fold cross-validation (CV). In addition, we also compared the performance of two-trait ssGBLUP and single-trait ssGBLUP for the traits with different gradients of genetic correlation. Our results indicated the GBLUP method generally provided lower accuracies of prediction than BLUP and ssGBLUP methods, and the average standard deviation of unbiasedness was as large as 0.278. For single-step methods, the accuracies of ssGBLUP for seven body measurement traits ranged from 0.543 to 0.785, and the unbiasedness of ssGBLUP ranged from 0.834 to 1.026, respectively. ssGBLUP generally generated 1% on average higher prediction accuracy than traditional BLUP, the improvement of ssGBLUP and the performance of GBLUP was lower than expected mainly due to the small number of genotyped animals, it was further demonstrated by our simulation study. We simulated two traits with heritabilities 0.1, 0.3, and with high genetic correlation 0.7, our results also showed that the prediction accuracies were low for GBLUP compared with other three methods with different genotyped reference population sizes and the accuracies were improved with increasing the genotyped reference population size. However, the increase was small for ssGBLUP compared with BLUP when the genotyped reference population size was <500. Our results also demonstrated that the accuracies of genomic prediction can be further improved by implementing two-trait ssGBLUP model, the maximum gain on accuracy was 2 and 2.6% for trait of chest width compared to single-trait ssGBLUP and traditional BLUP, while the gain was decreased with the weakness of genetic correlation. Two-trait ssGBLUP even performed worse than single trait analysis in the scenario of low genetic correlation.

INTRODUCTION
The purpose of animal breeding is to genetically improve the performance of a population, which is usually achieved by selecting the best animals among the current generation to be served as parents of the next generation. In pig breeding, the main objective traits are growth and reproduction traits, such as days to 100 kg, backfat thickness and total number of piglets born. Body measurement traits are usually not concerned because of the difficulty of phenotypic data collection and its economic importance. However, the genetic evaluation of body measurement traits is becoming more important recently. The estimation of live weight by body measurements have been applied to different animal species (Enevoldsen and Kristensen, 1997;Thiruvenkadan, 2005). And the accuracy of weight prediction using body length and heart girth for pigs from different age categories had been reported by Mutua et al. (2011).
Since the historic work of Meuwissen et al. (2001), combining genome data with corresponding statistical models has been successfully applied to genome selection. The key issue of genomic selection is to predict individual genomic breeding values (GEBV) using genome-wide marker information. Many statistical methods have been developed to predict GEBV, which are basically different in the assumption of distribution of SNP effects. The linear BLUP models (at either the SNP level or the individual animal level) assume that effects of all SNP are normally distributed with same variance (Meuwissen et al., 2001;VanRaden, 2008). On the other hand, the Bayesian Alphabet methods (e.g., BayesA, BayesB, and BayesCpi) (Meuwissen et al., 2001;Habier et al., 2011) allow each SNP effect to have its own variance. Many studies have reported that Bayesian methods performed similar to genomic BLUP (GBLUP) model in real data (Hayes et al., 2009a) and GBLUP is also simpler and lower computation-demanding than the Bayesian Alphabet methods.
Generally, genomic prediction utilizes information of genotyped animals. In practice, however, only a subset of individuals can be genotyped. Furthermore, in order to make use of phenotype information of non-genotyped individuals, a single-step GBLUP (ssGBLUP) has been developed by constructing H matrix using marker genotypes and pedigree jointly instead of G matrix or pedigree-based relationship matrix alone (Legarra et al., 2009;Christensen and Lund, 2010). Field data of cattle, pigs and chickens indicated that single-step method leads to higher accuracy and much simpler than multistep genomic selection methods Chen et al., 2011;Forni et al., 2011;Christensen et al., 2012;Simeone et al., 2012;Li et al., 2014;Song et al., 2017).
Genomic selection usually handles a single trait only. However, many traits are genetically correlated. As in traditional genetic evaluation, a multi-trait model is expected to increase the accuracy of the GEBV by making use of information from genetically correlated traits which will be more profound for traits with low heritability or with a small number of phenotypic records (Jia and Jannink, 2012;Guo et al., 2014). Many studies report multi-trait model for genetically correlated traits could lead to more accurate predictions than single trait genomic prediction (Calus and Veerkamp, 2011;Jia and Jannink, 2012;Guo et al., 2014;Wang et al., 2017).
The objective of this study was to: (1) estimate genetic parameters of seven body measurement traits; (2) assess the accuracy of genomic selection using single-step strategy in the scenario of small genotyped animals; (3) improve the accuracy of genomic prediction using two-trait ssGBLUP with different gradients of genetic correlation; (4) investigate the impact of different genotyped reference population sizes on the accuracy of genomic prediction using simulated data.

Ethics Statement
The whole procedure for collecting pig blood samples was carried out in strict accordance with the protocol approved by the Animal Care and Use Committee of China Agricultural University (Permit Number: DK996).

Simulated Data
The multiple-trait genomic simulation software GPOPSIM (Zhang et al., 2015) was used to simulate the genomic data and phenotypic data. Trait A (h2 = 0.1) and trait B (h2 = 0.3) with high genetic correlation of 0.7 were simulated, the population included 30,000 individuals with 10 generations. We simulated 18 chromosomes with a total length of 18 Morgan, each chromosome included 2,834 markers which were evenly distributed, a total of 51,012 markers and 306 QTLs, and the mutation rates of markers and QTLs were 1.25 × 10 −3 and 2.5 × 10 −3 , respectively. Mimicking the pig breeding, phenotypic values were generated by adding simulated phenotypes and fixed effect values, and the fixed effects include sex-generation which were generated through uniform distribution from 0 to 1. We randomly selected 1,000 individuals from generation 10 as validation population, genotyped reference populations of different sizes of 50, 100, 300, 500, 1,000, 2,000, and 3,000 were randomly sampled from generation 8, 30,519 animals were traced back to construct pedigree relationship matrix. was performed using a PorcineSNP80 BeadChip (Illumina, San Diego, CA, USA) which includes 68,528 SNPs across the entire pig genome. In total, 592 pigs with body measurement traits records were genotyped. Missing genotypes of SNPs with known chromosomal positions were imputed by Beagle (Browning and Browning, 2009), and those with unknown position or on the X chromosomal were discarded from the study. SNPs loci with minor allele frequency < 0.05, call frequency score < 0.90 and Hardy-Weinberg equilibrium with a P-value < 10 −7 were excluded, in addition, the individuals with reliability of EBV <0.35 for 7 body measurement traits were removed. After quality control, all genotyped individuals remained and 55,759 SNPs were finally used.

Statistical Models
Four methods, a traditional BLUP method with pedigreebased relationship matrix, a GBLUP method based on genomic relationship matrix, a single-trait single step GBLUP (ssGBLUP) and two-trait ssGBLUP methods with combined relationship matrix constructed from marker and pedigree information, were used to predict breeding values.

BLUP
The traditional animal model with a pedigree-based relationship matrix was applied to predict breeding values for each trait separately. The model was defined as:

y=Xb+γW+Zg+e,
for real data of pig, where y is the vector of phenotypic values, b is the vector of fixed effects including herd-year-season-sex, X is the incidence matrix associating b with y, W is the covariate of body weight, γ is the regression coefficient, g is the vector of additive genetic effects, following a normal distribution of N(0, Aσ g 2 ), in which A is the matrix of additive genetic relationships, σ g 2 is the variance of additive genetic effect. Z is the incidence matrix associating g with y; e is the vector of random residuals with distribution of N(0, Iσ e 2 ), in which I is the identity and σ e 2 is the residual variance. For simulated data, the model is the same as real data except b is the vector of fixed effects including sex-generation and no covariate.
where y c is the vector of corrected phenotypic values which were computed as the EBV plus estimated residual for each individual for simulated and real data, u is the overall mean, 1 is a vector of 1, a is the vector of genomic breeding values, following a normal distribution of N(0, Gσ a 2 ), in which σ a 2 is the variance of addictive genetic effect and G is the marker-based genomic relationship matrix (VanRaden, 2008). e is the vector of random errors, following a normal distribution of N(0, Iσ e 2 ), in which σ e 2 is the residual variance.

Single-Trait ssGBLUP
The ssGBLUP model using information of both genotyped and non-genotyped phenotype information, and using both marker and pedigree information for genetic evaluations. The single-trait ssGBLUP has the same model as BLUP, except vector g is assumed to follow a normal distribution N(0, Hσ g 2 ). Following Legarra et al. (2009), Christensen and Lund (2010), and Aguilar et al. (2011), the H was defined as: in which A 11 , A 12 , and A 22 were the sub-matrices of A (the pedigree-based relationship matrix), and subscripts 1 and 2 refer to non-genotyped and genotyped animals, respectively. Compared with the H matrix, the inverse of H was simple which used to solve the mixed model equations, the inverse of H was: To avoid singularity problems, G w = 0.95G a +0.05A 22 (Aguilar et al., 2010;Lourenco et al., 2014), G a is an adjusted G, in order to avoid the differences in scale and location between the coefficients of G and pedigree relationship matrix (A 22 ), the G matrix was adjusted according to Christensen et al. (2012), where α and β are adjustment factors derived from the following equations: Avg.offdiag (G) β + α = Avg.offdiag(A 22 ), in which Avg.diag is the average of the diagonal elements, and Avg.offdiag is the average of the off-diagonal elements.

Two-Trait ssGBLUP
The two-trait model was defined as: where y 1 y 2 is the vector of observation values of trait I and II, b 1 and b 2 are the vector of fixed effects of herd-year-season-sex for real data and sex-generation for simulated data of trait I and II, X 1 and X 2 are the incidence matrix associating b 1 and b 2 with y 1 and y 2 , W 1 W 2 is the vector of covariate of body weight of trait I and II for real data and no covariate for simulated data, γ 1 and γ 2 are the regression coefficient associating W 1 and W 2 , g 1 g 2 is the vector of additive genetic effects of the two traits, following a normal distribution of N (0, H⊗M) , where is the variance and covariance matrix of the genomic breeding values of the two traits, Z 1 and Z 2 are the incidence matrix associating g 1 and g 2 with y 1 and y 2 , e 1 e 2 is the vector of random errors with distribution of N(0, I⊗R), where I is the identity matrix and R = σ 2 e1 σ 2 e12 σ 2 e12 σ 2 e2 is the residual variance and covariance matrix of the two traits.

Evaluation of the Accuracy of Genomic Prediction
In this study, the accuracy and unbiasedness of prediction were obtained through 5-fold cross-validation (CV) for real data. The genotyped individuals were randomly partitioned into five nearly equal sized subpopulations. Of the five subpopulations, a single subpopulation was retained as the validation population and the remaining four subpopulations were used as reference population. The cross-validation process was then repeated five times, with each of the five subpopulations used exactly once as the validation population. For all scenarios, the 5-fold CV was replicated 20 times, resulting in 20 averaged accuracies of genomic prediction. The two-trait ssGBLUP with CV was also implemented to predict GEBV for the traits with different gradients of genetic correlation. For BLUP, GBLUP and ssGBLUP, the validation set was same in each replicate of 5-fold CV. For real data, the accuracy of genomic prediction was evaluated as r(EBV, PBV) for BLUP and ssGBLUP methods and r(y c , PBV) for GBLUP method, the correlation between predicted breeding values (PBV) and EBV or y c in the validation population. In addition, b(EBV or y c , PBV), the regression of EBV or y c on PBV was also calculated to assess the possible inflation or unbiasedness of predictions. It should be noted here that EBV and y c were calculated based on a 7-trait animal model as described before, while PBV was calculated for validation population by traditional single-trait BLUP, GBLUP, or ssGBLUP assuming the phenotypic values of validation population were missing. For simulated data, the true breeding values (TBV) of all individuals were available, r(TBV, PBV) and b(TBV, PBV) in the validation population were used to assess the accuracy and unbiasedness.

Descriptive Statistics and Estimates of Genetic Parameters
Number of records, birth year of animals, the descriptive statistics, estimated heritabilities and genetic correlations of the 7 body measurement traits and two simulated traits were presented in Tables 1, 2. The average of body measurement traits ranged from 17.98 to 113.52, and the standard deviation (SD) was between 1.03 and 6.30, and the average (SD) of Trait A and Trait B were 2.25 (3.13) and 1.65 (2.51), respectively. The heritabilities of the 7 body measurement traits ranged from 0.27 to 0.50. Across the 7 analyzed traits, the estimated heritability of BL was the highest of 0.5 and CG was the lowest of 0.27. The standard errors (SE) of estimated heritability for all traits were 0.04. The genetic and phenotypic correlations between traits were also show in Table 2. CW and RW have the highest genetic correlation of 0.84 with SE of 0.03, CG and AG have the second highest genetic correlation of 0.75 with SE of 0.06, some traits had negative or positive medium or low genetic correlation (−0.28-0.25), e.g., BL and RW with −0.28, CW and CG with 0.25. There were also very low genetic correlations among 7 body measurement traits, e.g., BL and BH with −0.01, CW and TG with 0.01. The standard error for genetic and phenotypic correlation ranged from 0.06 to 0.11 and 0.01 to 0.02, respectively. In addition, genetic correlation estimates were generally higher than the phenotypic correlations and the values of genetic correlation and phenotypic correlation had consistency, the traits with high genetic correlation had also high phenotypic correlation.
Accuracy and Unbiasedness Using Single-Step Strategy and GBLUP Methods on Yorkshire Data Table 3 presented the accuracy and unbiasedness of genomic prediction from 20 replicates of 5-fold cross validation on the 7 body measurement traits by applying single-step strategy. In a fold of each 5-fold CV, about 100 genotyped individuals were randomly selected as validation populations, the other individuals (about 400) were genotyped reference population for genome prediction, and all non-genotyped animals were added to the analysis for BLUP and ssGBLUP methods. In all scenarios, because of the small size of reference population, GBLUP method generally yielded lower prediction accuracies than BLUP and ssGBLUP methods, the accuracies ranged from 0.32 for BL and CG to 0.504 for RW, the average unbiasedness of prediction was 1.033, however the average SD of unbiasedness was large as 0.278. For ssGBLUP, the average accuracy and unbiasedness of prediction were 0.693 and 0.941, respectively, and the highest accuracy was 0.785 for CG, however the lowest accuracy was 0.543 for BL. The regression coefficients of prediction were also shown in Table 3 which ranged from 0.834 for BL to 1.026 for CG. In 20 replications of 5-fold CV, the SD of prediction accuracy for 7 body measurement traits were all under 0.08, indicating the accuracy in different folds was similar, while the SD of regression coefficients were 0.148 averagely, implying the unbiasedness in different folds changed dramatically. The accuracy and unbiasedness of traditional BLUP with the same reference and validation population as ssGBLUP was also demonstrated in Table 3. Generally, ssGBLUP provided higher accuracies of predictions than traditional BLUP in all 7 body measurement traits, while the improvement was only 1% on average. The highest increase was 0.013 for AG and the least increase was 0.006 for RW. On the other hand, traditional BLUP generated comparable bias of prediction with ssGBLUP in most traits, the largest difference was for trait of CW, the regression coefficient from BLUP and ssGBLUP was 0.946 and 0.895, respectively. Although the prediction accuracies of ssGBLUP were higher than traditional BLUP, the scope of increase was small, which could be due to small number of genotyped animals.

The Comparison of Two-Trait Model With Single-Trait Model on Yorkshire Data
In order to evaluate the efficiency of multi-trait model on the improvement of genomic prediction, two-trait ssGBLUP model was compared with single-trait ssGBLUP model. Meanwhile, genetic correlations of three different gradients (high, medium and low) were taken in account in two-trait ssGBLUP to evaluate the impact of genetic correlations on the genomic prediction e.g., high, medium and low genetic correlations of CW with RW (0.84), CG (0.25) and TG (0.01), and three gradients genetic correlations of 0.75 (CG), 0.20 (RW) and 0.06 (BH) with AG. Compared to single-trait model, as shown in Table 4, the accuracies of genomic prediction for CW from two-trait ssGBLUP were increased from 0.684 (Table 3) to 0.703 and 0.697 in the scenarios of high and medium correlations but slightly decreased to 0.676 in low genetic correlation. The gain on accuracy was 2% in situation with high genetic correlation, 1.3% in medium genetic correlation. In addition, the standard deviation of the accuracy in 20 replicates of 5-fold cross validation was all decreased to 0.045, 0.051, and 0.056 from 0.079. On the other hand, the bias of genomic prediction using twotrait model was decreased as well, the regression coefficients were 0.970, 0.953, and 0.959 in high, medium and low genetic correlations, while it was 0.895 in single-trait model. Moreover, the standard deviation of the regression coefficients was also decreased to 0.081, 0.105, and 0.079 from 0.165.
Similarly, the superiority of two-trait model was also demonstrated on AG. However, only tiny improvement on accuracy was obtained, the accuracies of genomic prediction in three gradient genetic correlations were nearly same and only slightly higher than the those in single-trait model. Likewise, there was no improvement on unbiasedness of genomic prediction from two-trait model as the regression coefficients with 1.002 from single-trait model was already very close to 1.0. Table 4 and Figure 1 further indicated the outperformance of two-trait ssGBLUP model. As similar as its performance on trait of CW and AG, in the scenarios of high (0.84) and medium (0.25) genetic correlations for RW and CG with CW, the gains on accuracy were 1.5 and 0.1%, respectively. Likewise, in low genetic correlation (0.01) for TG, the prediction accuracy was slightly decreased from 0.743 to 0.736. In addition, the standard deviation of the accuracy in 20 replicates of 5-fold cross validation was decreased to 0.036 and 0.019 from 0.058 and 0.021 for RW and CG, but no difference for TG. The prediction bias of twotrait model was decreased as well, the regression coefficients were 1.009, 1.022, and 0.989 in high, medium and low genetic correlations, lower than the corresponding 0.971, 1.026, and 0.986 in single-trait model. In addition, the standard deviation of the regression coefficients was also decreased to 0.103, 0.166, and 0.085 from 0.136, 0.169, and 0.086.
The same trend was shown in Figure 1 for high, medium and low genetic correlations of AG with CG (0.75), RW (0.20) and BH (0.06), the improvement in the accuracy of two-trait ssGBLUP over single-trait ssGBLUP was consistently increased with the increase of the genetic correlation from medium to high. Moreover, the accuracy was also decreased in low genetic correlation for BH. Similarly, for RW, it had high correlation (0.84) with CW and medium correlation (0.20) with AG, the gain on its accuracy was 1.5 and 0.3% in these two situations, respectively. The bias of genomic prediction using two-trait model was decreased as well, the regression coefficients were 1.009 and 0.982 in high and medium genetic correlations, while it was 0.971 in single-trait model. The same trend was also for CG which was high correlated with AG (0.75) and medium correlated with CW (0.25) as shown in Table 4 and Figure 1.

Accuracy and Unbiasedness With Different Genotyped Reference Population Sizes on Simulated Data
As shown in Figure 2, the same validation population was predicted through different genotyped reference population sizes using BLUP, GBLUP, ssGBLUP, and two-trait ssGBLUP methods. The accuracies of prediction were low for GBLUP compared with other three methods in all scenarios , while the accuracy of genomic prediction from GBLUP was rapidly increased with increasing the reference population size, especially when the reference population size was enlarged over 500. The accuracy of traditional BLUP with the same reference and validation population as ssGBLUP was also shown in Figure 2. Generally, ssGBLUP provided higher accuracies of predictions than   traditional BLUP in different genotyped reference population sizes for Trait A and Trait B, however, the increase was tiny especially in Trait A with low heritability of 0.1 when the genotyped reference population size was below 500, this was consistent with the results of real pig data in this study. In all scenarios, two-trait ssGBLUP produced the highest accuracy for Trait A and Trait B with high genetic correlation of 0.7, but the scope of improvement was low for Trait B with heritability of 0.3. Figure 3 presented the unbiasedness of genomic prediction with different genotyped reference population sizes in Trait A and Trait B. Obviously, the regression coefficients of GBLUP were rapidly increased close to 1.0 with increasing the genotyped reference population size, and the magnitude of the increase was even larger for Trait A which with low heritability of 0.1 than Trait B, the regression coefficient was close to 1 when the genotyped reference population reached 3,000 for Trait A. Furthermore, the biases of genomic prediction for BLUP were a bit larger compared with ssGBLUP and two-trati ssGBLUP methods. However, there was no obvious difference in unbiasedness for Trait B when using BLUP, ssGBLUP and two-trait ssGBLUP methods which were all very close to 1.

DISCUSSION
In this study, the estimated heritabilities of 7 body measurement traits were medium to high ranging from 0.27 to 0.50. The estimates of heritability of body length (BL) and body height (BH) were consistent with the study by Do et al. (2014) where the estimated heritabilities of BL and BH ranged from 0.331 to 0.559 at the ages of 70, 145, and 180 days, but the estimates of heritability of BL in our study was higher than those reported FIGURE 2 | Accuracies with different genotyped reference population sizes on simulated data. The figure indicates the accuracy was improved with increasing the genotyped reference population size. However, the increase was small when the genotyped reference population size was <500 for single-step methods. The heritabilities of Trait A (A) and Trait B (B) were 0.1 and 0.3. by Nikkila et al. (2013), where the estimates of heritability of BL was 0.29 in commercial gilts. Zhou et al. (2016) reported that heritabilities were 0.34 and 0.25 for BL, 0.05 and 0.25 for BH, 0.30 and 0.27 for CG at ages of 210 and 240 days in Chinese indigenous Laiwu pig. The difference in heritability estimates may be due to population structure, environmental factors, etc.
The genetic correlations of 7 body measurement traits ranged from −0.28 to 0.84, these traits were highly, medium or low correlated. Usually, the medium or highly correlated traits was more reasonable for multiple-trait BLUP model, while in the estimation of breeding values in this study, two or three highly correlated trait model performed as similar as the 7-trait model (data not shown). The genetic correlation between CW and RW was lower than the value of 0.56 reported by Kutwana et al. (2015). However, research on other livestock, the genetic correlations between BL and CG were higher in sheep which were 0.52 and 0.57 (Janssens and Vandepitte, 2004;Jafari et al., 2014). In cattle, Kolkman et al. (2010) reported that the genetic correlations between BL and RW was close to 0 which was −0.28 in our study. These results imply that genetic correlations among different livestock are quite different. In addition, that both genetic and phenotypic correlation coefficients were of the same sign and magnitude, Roff (1996) reported that the genetically and environmentally (residual) correlations are likely to share the same pattern.
In the single-step strategy, the construction of H matrix involves several parameters and some of them were shown to influence the accuracy and bias of prediction Koivula et al., 2015;Song et al., 2017). In this study, to avoid singularity problems, w was set to 0.95 in each analysis as other studies (Aguilar et al., 2010;Lourenco et al., 2014). However, in Danish Landrace and Yorkshire populations, Guo et al. (2015) reported the weight value of 0.5 was fit for three traits, and 0.25 was found to be an ideal value for the Danish Duroc population . Generally, the markers may not explain all genetic variance, the weighting factor w is the proportion of genetic variance not captured by markers which is population and trait specific, an optimal weighting factor on A matrix may best use both markers and pedigree information. In addition, in ssGBLUP, the scales of A and G matrix may differ, the adjustment factors (β and α) was used to adjust G matrix to be consistent with scale as A matrix, the commonly used adjustment method was reported by Christensen et al. (2012) as shown in section Single-Trait ssGBLUP of Statistical models. It was reported that the adjusted G resulted in better performance of ssGBLUP than the original G in which the estimates of β and α were 0.859 and 0.298, respectively . Therefore, in this study, we used adjusted G to make the scale of the G matrix to be comparable with that of the A matrix.
We investigated the genomic evaluation of body measurement traits of Yorkshire by applying single-step strategy. In this study, the accuracies of prediction ranged from 0.543 to 0.785 with a 5-fold cross-validation, in order to take advantage of all information, all non-genotyped animals were added to construct H matrix in ssGBLUP. As both the genomic and pedigree information was utilized by ssGBLUP, previous studies have shown that the ssGBLUP method was superior to both the GBLUP method Gao et al., 2012;Li et al., 2014;Guo et al., 2015;Song et al., 2017) and traditional pedigree-based BLUP (Gao et al., 2012;Koivula et al., 2015), in which only genomic or pedigree information was used. Our findings are consistent with these reports and further proved the superiority of single-step strategy, however, compared to the traditional BLUP, in our study, ssGBLUP only generated tiny higher accuracies with 1% on average increase in 7 body measurement traits and simulated data when the number of genotyped reference animals was small.
The lower improvement of ssGBLUP might possibly attribute from the following reasons. (1) The genotyped reference was not large enough to improve the genomic predictive ability.
In real data, about 400 genotyped reference animals could not probably provide more extra information compared to the pedigree information consisting of 5,000 individuals. Similarly, it also deduced the lower accuracy of GBLUP than BLUP in this study. In simulation study, GBLUP also produced lower accuracy than BLUP even the genotyped reference population size reached 3,000, it was still very small compared to nongenotyped individuals of 26,000 utilized by BLUP. Meanwhile, the improvement was also tiny for ssGBLUP compared to BLUP when the genotyped reference population size was small. Lourenco et al. (2014) also reported that GBLUP performed worse than BLUP and ssGBLUP only generated 3% higher accuracy than BLUP for fat percentage in all parities in a relatively small genotyped dairy population; (2) Inappropriate w value may cause the larger bias of ssGBLUP. In constructing H matrix, the weighting factor (w) was the proportion of genetic variance not captured by markers, theoretically, w value depends on the trait, the same w value was assigned in our study, deducing the larger bias of genomic prediction of ssGBLUP than BLUP, e.g., BL and CW. This indicated that in implementation of singlestep genomic evaluation, it is better to test different w to find out the optimal parameter for prediction; (3) In our study, the heritabilities for 7 body measurement traits were high from 0.27 to 0.5, which can obtain sufficient accuracy for traditional BLUP method, and improvement from genomic prediction was not large as expected. Other studies also indicated that for low-heritability traits, a very large number of records will be required in the reference population to subsequently achieve high accuracies of GEBV in unphenotyped animals, while for the traits with medium or high heritabiliries the tiny improvement will be achieved when added a small reference population Hayes et al., 2009b), our findings in simulated data are consistent with these reports.
Genetic correlation between traits has been used to improve the statistical power to detect QTL controlling traits of interest (Chesler et al., 2005;Xu et al., 2009). A series of researches of multiple traits have also been carried out in genomic selection and showed that multiple-trait model can improve the accuracy of genomic prediction compared to single-trait model (Calus and Veerkamp, 2011;Jia and Jannink, 2012;Guo et al., 2014). Our findings further proved that two-trait ssGBLUP yielded higher accuracy and lower bias than single-trait ssGBLUP.
Our results also indicated that the performance of two-trait ssGBLUP was related with the strength of genetic correlation.
The stronger the genetic correlation is, the higher improvement on the accuracy of genomic prediction, e.g., in the scenarios of high and medium genetic correlations, the maximum gain on accuracy was 2 and 2.6% for trait of CW compared to single-trait ssGBLUP and traditional BLUP, while the gain was decreased in the low genetic correlation. The same trend was illustrated on other body measurement traits, the difference was the scope of the improvement. e.g., compared to CW, only tiny improvement on accuracy was obtained for trait of AG in both high and medium correlations. Generally, in our study, as the heritabilities for 7 body measurement traits highly ranged from 0.27 to 0.5, the improvement on accuracy from two-trait genomic model was not as large as on the low-heritability trait. As shown in the results of simulated data, the Trait A with heritability of 0.1 produced larger gain on accuracy than Trait B with heritability of 0.3 when using two-trait ssGBLUP method (Figure 3). This was consistent with other previous reports (Calus and Veerkamp, 2011;Jia and Jannink, 2012;Guo et al., 2014).
However, in our study, in low genetic correlation (e.g., 0.01 and 0.06), the accuracies of genomic prediction were lower than that of separate single trait analysis, which was also reported by Jia and Jannink (2012) and Wang et al. (2017), The reason may be that, in this case, sampling from multiple trait model leads to nonzero estimates of correlation, which then leads to error information sharing across traits, deducing the lower accuracy of two-trait model. Therefore, for one trait correlated with several traits, the largest improvement of the accuracy and unbiasedness of genomic prediction on this target trait can be obtained by choosing one trait with highest genetic correlation with the target trait through two-trait ssGBLUP model. However, only two-trait model was investigated in this study, multiple-trait model is worth exploring in the future, however, it should be noted that much more computational demand will be increased as more random variance components needed to be estimated, and the equation is also difficult to converge. Furthermore, the traditional animal model assumes the linear relationship between the observations breeding values and the covariates, however, in practice, the nonlinear relationship is more common. In theory, the single-step genomic prediction using information from both genotyped and non-genotyped animals could be extended to generalized linear model or Bayesian model (Fernando et al. (2014), while the higher computational demand and the efficiency of genomic prediction in practice should be taken into account in the further investigation.

CONCLUSION
In this study, we used different single-step strategies and GBLUP method to investigate the efficiency of genomic prediction on body measurement traits in Yorkshire and simulated data. We compared the prediction accuracy of a traditional pedigreebased method (BLUP) with the GBLUP and single-step genomic BLUP (ssGBLUP) through 20 replicates of 5-fold cross-validation (CV) in pig data. Meanwhile, we investigated the impact of genotyped reference population size on the accuracy of genomic prediction through simulation study. In addition, we also compared the performance of two-trait ssGBLUP and single-trait ssGBLUP for the traits with different gradients of genetic correlation. Our results indicated that ssGBLUP generally generated higher prediction accuracy than traditional BLUP, the improvement was lower than expected mainly due to the small number of genotyped animals. Our results also demonstrated that the accuracies of genomic prediction can be further improved by implementing two-trait ssGBLUP model.

DATA AVAILABILITY
The simulated data and 7 body measurement traits data of Yorkshire supporting the conclusions of this article are available from the Figshare: https://figshare.com/articles/single-step_strategies/7434203.

AUTHOR CONTRIBUTIONS
XD and QZ designed and supervised the study. HS conducted the statistical analyses and interpretation of results. JZ conceived the experimental traits and sample preparation for DNA genotyping. HS simulated the data and wrote the manuscript, which was critically remarks by XD and QZ. All authors read and approved the manuscript.