Impact of the Order of Legendre Polynomials in Random Regression Model on Genetic Evaluation for Milk Yield in Dairy Cattle Population

The random regression test-day model has become the most commonly adopted model for routine genetic evaluations in dairy populations, which allows accurately accounting for genetic and environmental effects over lactation. The objective of this study was to explore appropriate random regression test-day models for genetic evaluation of milk yield in a Holstein population with a relatively small size, which is the common situation in regional or independent breeding companies to preform genetic evaluation. Data included 419,567 test-day records from 54,417 cows from the first lactation. Variance components and breeding values were estimated using a random regression test-day model with different orders (from first to fifth) of Legendre polynomials (LP) and accounted for homogeneous or heterogeneous residual variance across the lactation. Models were compared based on Akaike information criterion (AIC), Bayesian information criterion (BIC), and predictive ability. In general, models with a higher order of LP showed better goodness of fit based on AIC and BIC values. However, models with third, fourth, and fifth order of LP led to similar estimates of genetic parameters and predictive ability. Models with assumption of heterogeneous residual variances achieved better goodness of fit but yielded similar predictive ability compared with those with assumption of homogeneous residual variances. Therefore, a random regression model with third order of LP is suggested to be an appropriate model for genetic evaluation of milk yield in local Chinese Holstein populations.


INTRODUCTION
The random regression test-day model has been widely used in genetic evaluation for production traits in dairy cattle and has many advantages, including more accurately accounting for genetic and environment effects at different stages of lactation, thus resulting in more reliable genetic evaluation (Schaeffer and Dekkers, 1994;Jamrozik et al., 1997b;Schaeffer et al., 2000;Jensen, 2001;Schaeffer, 2004). The test-day model is significantly better than the lactation model (using full and extended 305 days lactation records) with 2-3% increase in accuracy for bulls and 6-8% for cows for milk yield first lactation (Kistemaker, 1997). In addition, the test-day model allows predicting estimated breeding value (EBV) for each test-day, each particular period, or the full lactation (305 days) and persistency (Jamrozik et al., 1997b).
The functions generally used to model the lactation curve include Woods' model (Wood, 1967), Wilmink's function (Wilmink, 1987), spline function (White et al., 1999), and Legendre polynomial (LP) function (Kirkpatrick et al., 1990). Because of differences in production environments and management systems, optimal functions for test-day models in different countries may be different (Reinhardt et al., 2002;Mrode et al., 2003). Several studies have shown that LPs performed well in random regression test-day models, but there is no "gold standard" reported in literature on choosing optimal order of LPs in the model, and the choice of the order of fit is highly dependent on the practical data structures and populations. For example, a fourth order LP (LP4) was used for national genetic evaluations in Canada and Italy, and a fifth order LP (LP5) was used in the United Kingdom (Muir et al., 2007). The joint Nordic test-day model was a multivariate model for milk, protein, and fat in lactation one to three (in total, nine traits), the genetic and permanent environmental effects were modeled by second order LP and an exponential term e −0.04×DIM (Lidauer et al., 2015).
The dairy herd improvement (DHI) project was first implemented in four provinces in the 1990s in China. In 2006, the Ministry of Agriculture of China approved a project to promote the DHI project in eight provinces where there were many large dairy populations (Miglior et al., 2009). Recently, the project has been expanded to 25 provinces in China, where provincial DHI laboratories and data centers have been established (Yearbook, 2016), and there were about 700,000 cows recording milk production in China each year. Many organizations and companies prefer to conduct genetic evaluation based on their own populations by considering the differences among populations. In general, these populations are relatively small in size. Therefore, an appropriate parametric function that can sufficiently account for the lactation curve and avoid overparameterization is critical for relatively small populations when using a random regression test-day model to estimate genetic parameters and breeding values. Relatively little research has been placed on the impact of parametric functions on genetic evaluation in a numerically small size dairy cattle population.
The aim of this study was to investigate the impact of the order of LPs on estimation of genetic parameters and prediction of breeding values for the milk yield trait and find appropriate orders of LPs in random regression test-day models for genetic evaluation in numerically small cattle populations, such as genetic evaluations at the regional level in China.

Data
Data were obtained from the database at the Dairy Cattle Research Centre DHI Lab, Shandong Academy of Agricultural Sciences. First were lactation records from 2004-2015 that fulfill the following criteria: ages at calving between 20 and 38 months, daily milk yield between 5 and 80 kg, days in milk (DIM) between 5 and 305, and cows with at least three test-day records were extracted. The final data set consisted of 419,567 testday records from 54,417 cows. The number of records in each DIM class ranged from 576 to 1768. Descriptive statistics of the data are presented in Table 1. The traced pedigree included 104,884 individuals.

Model
We used first to fifth order LPs (LP1-LP5) to fit random regression test-day models, respectively. The model equation was as follows: where y itklm is the observation of cow i measured at time t within herd-year-season (HYS) subclass l and age (AGE) subclass m; DIM t is the fixed effect of DIM in 301 classes from 5 to 305 days; a ik and pe ik are the kth random regression of additive genetic and permanent environmental effects for cow i, respectively; z itk is the kth order of LPs for cow i measured at DIM t, nr and np denote the order of LPs (from one to five) for additive genetic and permanent environmental effects, respectively; and e itklm is the random residual. LPs in this study were defined as follows (Kirkpatrick et al., 1990): where x is the standardized DIM, calculated as x = 2(DIM−DIM min ) (DIM max −DIM min ) − 1, and DIM min and DIM max are the minimum and maximum DIM, respectively. In this study, March, April, May, September, and October were defined as calving season one; June, July, and August as calving season two; and November, December, January, and February as calving season three, and there were 1891 herd-year-season classes in total. Calving age was classified into four levels: 20-23, 24-27, 28-31, 32 months or later. Residual variance was assumed to be either homogeneous or heterogeneous across lactation. For models with heterogeneous residual variances, residuals were divided into 10 classes   (Pereira et al., 2013). The heterogeneous residual variances were handled by putting different weights on residual variance for different periods of DIM. The weights were calculated as w i =¯v v i , where v i is the residual variance of the ith DIM class ( Table 2), andv is the mean of residual variances.
Additive genetic variance for a particular DIM was calculated as σ 2 g k = z k Gz k , z k is a column vector of LP coefficients at the kth DIM, and G is covariance matrix of additive genetic effect. Permanent environmental variance for a particular DIM was σ 2 p k = z k Pz k , matrix P is a covariance matrix of permanent environmental effect, and z k is the same as above; EBV of a particular animal at a particular DIM was calculated as EBV mk = z k a, a is column vector of additive genetic random regression coefficients of a particular animal, and z k is same as above; the EBV for the whole lactation was calculated as EBV m305 = 305 t=5 EBV mk . The estimation of variance components and prediction of breeding values using different models were carried out by the DMU package (Madsen et al., 2014).

Model Comparison
Models with different orders of LPs were compared using the following methods based on full and reduced data sets: 1. Akaike information criterion (AIC; Akaike, 1973) and Bayesian information criterion (BIC; Schwarz, 1978). AIC  1 v li is the residual variance of ith class in lactation with LP of order l.
was computed as AIC = −2logL + 2p, where −2logL is the restricted maximum log likelihood value, p is the total number of parameters estimated in the corresponding model. Bayesian information criterion was computed as BIC = −2logL + plog(n), where −2logL and p are the same as defined in AIC, n is the difference between number of test-day records and the rank of the fixed effects design matrix (Strabel et al., 2005). For both AIC and BIC, a lower value indicates a better goodness of fit. 2. Pearson correlation coefficient and Spearman's rank correlation coefficient between EBV of 305-day (EBV305) estimated based on data with (full data) and without (reduced data) the phenotypes of animals calving in the last year was used to assess the prediction accuracy and the similarity between the rankings of animals based on those two groups of EBV305.

), and repeatabilities
Rep calculated for each day along the lactation trajectory based on the estimated covariance function coefficients. The curve of genetic variances showed a sharp decrease in the early lactation and an increase from the middle to the end of lactation. The curve of permanent environmental variances presented similar trends as the genetic variances.
Estimates of genetic and permanent environmental variance were slightly different in models with different orders because higher order had more inflexion points. For estimates of residual variance, fitting higher order of polynomial yielded a lower horizontal line when assuming homogeneity in the model. In addition, when assuming heterogeneity in the model, although the whole horizontal line was split into 10 segments, the same trend as the homogeneous model was observed with respect to the order of fit. In particular, peaks can be seen at the beginning of lactation when fitting lower orders of polynomials. Similarly, the heritability was high at the beginning and end of lactation and lower in the middle.  Repeatability decreased in early lactation and increased to the end of lactation when using the models with homogeneous residual variance although it decreased at the first stage of lactation and increased to about a DIM of 200, then decreased slightly to the end of lactation when using the models with heterogeneous variances. Table 4 shows the heritability (h 2 ) and repeatability (Rep) for 305-day milk yield and the minimum and maximum values for TD milk yield. h 2 for 305 days estimated from models with different LPs ranged from 0.250 to 0.257, and Rep were from 0.741 to 0.749 when considering homogeneous variance. When considering heterogeneous variance, h 2 for 305 days were between 0.250 and 0.260, and Rep were between 0.738 and 0.749. h 2 for TD from models with different LPs ranged from 0.142 to 0.372 and Rep were between 0.566 and 0.786 when considering homogeneous variance. When considering heterogeneous residual variances, h 2 for TD ranged from 0.143 to 0.326, and Rep were between 0.520 and 0.821. Models with LP1 and LP2 led to higher estimated heritability and lower repeatability than the models with higher order. Heritability and repeatability for 305 days milk yield estimated from the models with LP3, LP4, and LP5 were very similar, ranging from 0.249 to 0.250 for heritability and from 0.748 to 0.749 for repeatability. Models with homogeneous or heterogeneous residual variances led to similar estimates of heritability and repeatability except for the LP1 model. Table 5 presents Pearson correlation and Spearman's rank correlation coefficients between EBV305 from full and reduced data. In general, the highest correlation was observed from LP3. For both Pearson correlations and Spearman's rank correlations, the correlation coefficients increased from LP1 to LP3 and kept relatively stable from LP3 to LP5 for both homogeneous and heterogeneous residual variances. Models with homogeneous and heterogeneous residual variances yielded similar prediction accuracies across different orders of LP. A similar pattern was found for the rank correlations.

DISCUSSION
In this study, various criteria were used to compare random regression test-day models with different order of LPs. Different criteria for model comparison have been discussed by Druet et al. (2003) and Odegard et al. (2003). In our study, models with higher order obtained lower AIC and BIC values, which was in line with previous studies (Pereira et al., 2013). This means models with LP5 fit data best regardless of complexity. However, models with higher orders introduced more parameters and resulted in a heavier computational demand. Therefore, in practice, model selection needs to balance between goodness of fit and model complexity.
The further improvements of goodness of fit by increasing order of LP became smaller when using higher order of LP. In particular, when assuming homogeneous residual variance, there was a smaller improvement by increasing order of fit from LP3 to LP4 and from LP4 to LP5 based on values of AIC and BIC. Similar results were observed by Lopez-Romero and Carabano (2003) based on a Spanish Holstein population, in which the reduction in residual variance was small when increase order of fit from LP4 to LP6. In addition, they reported the order of LP3 was the optimum choice based on the BIC values although higher order (LP6) was needed for permanent environmental effects.
The trajectory of additive genetic variances and permanent environmental variances showed a decrease at the beginning of lactation and an increase at the end of lactation. A similar pattern was reported by Van Der Werf et al. (1998), Pool et al. (2000, and Strabel et al. (2005) in Polish, Dutch, and Australian Holstein populations. Higher additive genetic variances at the beginning and end of lactation might be attributed to variations in the number of TD records, milk yield level, or non-genetic factors, for example, pregnancy effects (Meyer, 2005). Other studies have also shown that fitting a higher order of LP produced higher estimates of genetic variances at the edges of lactation and an oscillatory pattern along the lactation trajectory, which might be unlikely biologically (Jamrozik et al., 1997a;Pool et al., 2000;Lopez-Romero and Carabano, 2003;Meyer, 2005). This indicates that a model with a higher order (e.g., LP5) may not be more optimal than a model with a lower order (e.g., LP3 or LP4).
Heterogeneous residual variances were generally observed over the lactation when analyzing test-day records in dairy cattle populations (White et al., 1999;Brotherstone et al., 2000;Jensen, 2001). Therefore, it is reasonable to overcome this impact. It can be clearly seen that the residual variances changed across 10 DIM groups defined in this study although the differences were small when fitting higher orders of LP. More specifically, the estimated residual variances were larger in the early lactation as also shown in other studies (Swalve, 1995;Jensen, 2001). In addition, Figure 1 reveals that the slope of the permanent environmental variance curve was affected by the change of residual variance over the lactation when assuming homogeneous residual variance in the model (Figure 1). This pattern can be explained by the fact that the permanent environmental effects have the ability to absorb part of the heterogeneity of the residuals (Ødegard et al., 2003).
In practical breeding programs, a critical procedure is to obtain accurate breeding values for the candidates via genetic evaluation and then make selection decisions based on the rankings of EBVs of the animals. Therefore, predictive ability is an essential property in these circumstances. In this study, predictive ability of the models was assessed by both Pearson correlations and Spearman's rank correlations between EBV305 from a full and a reduced data set to evaluate the predictive abilities between models. A model with better goodness of fit does not necessarily indicate better predictive ability, whereas a poorly fit model, such as underfitting, induces bias. This was demonstrated by the highest coefficients of Pearson and Spearman's rank correlation from models using LP of order three regardless of the assumption of residual variance. Hence, this suggests that fitting a model with order three of LPs is reasonable to achieve better predictive ability for milk yield.

CONCLUSION
The results in this study show that goodness of fit increased with increased order of LP. The results demonstrate that further improvement became small when using models with order of LP higher than three. This indicates that a minimum of LPs of order three is necessary for estimating variance components and breeding value of milk yield. Considering a balance between goodness of fit, computational demand, and predictive ability, random regression test-day models using LP3 or LP4 appear to be the appropriate models for implementation of genetic evaluation in different Chinese Holstein populations.

AUTHOR'S NOTE
This manuscript has been released as a pre-print at bioRxiv (Li et al., 2019).

DATA AVAILABILITY STATEMENT
The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.

ETHICS STATEMENT
Ethical review and approval was not required for the animal study because The data are milk production of dairy cattle collected from dairy farms. Written informed consent was obtained from the owners for the participation of their animals in this study.

AUTHOR CONTRIBUTIONS
JL, HG, and GS conceived and designed the study. JL and HG conducted all analysis and wrote the draft. GS revised manuscript. PM maintained the DMU software package used in the statistical analysis. RL, WL, PB, GX, YG, and XD helped with the data collection and project coordination. All authors provided critical feedback and helped shape the manuscript.