Genetic Parameters and Genome-Wide Association Studies of Eight Longevity Traits Representing Either Full or Partial Lifespan in Chinese Holsteins

Due to the complexity of longevity trait in dairy cattle, two groups of trait definitions are widely used to measure longevity, either covering the full lifespan or representing only a part of it to achieve an early selection. Usually, only one group of longevity definition is used in breeding program for one population, and genetic studies on the comparisons of two groups of trait definitions are scarce. Based on the data of eight traits well representing the both groups of trait definitions, the current study investigated genetic parameters and genetic architectures of longevity in Holsteins. Heritabilities and correlations of eight longevity traits were estimated using single-trait and multi-trait animal models, with the data from 103,479 cows. Among the cows with phenotypes, 2,630 cows were genotyped with the 150K-SNP panel. A single-trait fixed and random Circuitous Probability Unification model was performed to detect candidate genes for eight longevity traits. Generally, all eight longevity traits had low heritabilities, ranging from 0.038 for total productive life and herd life to 0.090 for days from the first calving to the end of first lactation or culling. High genetic correlations were observed among the traits within the same definition group: from 0.946 to 0.997 for three traits reflecting full lifespan and from 0.666 to 0.997 for five traits reflecting partial productive life. Genetic correlations between two groups of traits ranged from 0.648 to 0.963, and increased gradually with the extension of lactations number regarding the partial productive life traits. A total of 55 SNPs located on 25 chromosomes were found genome-wide significantly associated with longevity, in which 12 SNPs were associated with more than one trait, even across traits of different definition groups. This is the first study to investigate the genetic architecture of longevity representing both full and the partial lifespan simultaneously, which will assist the selection of an appropriate trait definition for genetic improvement of longevity. Because of high genetic correlations with the full lifespan traits and higher heritability, the partial productive life trait measured as the days from the first calving to the end of the third lactation or culling could be a good alternative for early selection on longevity. The candidate genes identified by this study, such as RPRM, GRIA3, GTF2H5, CA5A, CACNA2D1, FGF10, and DNAJA3, could be used to pinpoint causative mutations for longevity and further benefit the genomic improvement of longevity in dairy cattle.

Due to the complexity of longevity trait in dairy cattle, two groups of trait definitions are widely used to measure longevity, either covering the full lifespan or representing only a part of it to achieve an early selection. Usually, only one group of longevity definition is used in breeding program for one population, and genetic studies on the comparisons of two groups of trait definitions are scarce. Based on the data of eight traits well representing the both groups of trait definitions, the current study investigated genetic parameters and genetic architectures of longevity in Holsteins. Heritabilities and correlations of eight longevity traits were estimated using single-trait and multi-trait animal models, with the data from 103,479 cows. Among the cows with phenotypes, 2,630 cows were genotyped with the 150K-SNP panel. A single-trait fixed and random Circuitous Probability Unification model was performed to detect candidate genes for eight longevity traits. Generally, all eight longevity traits had low heritabilities, ranging from 0.038 for total productive life and herd life to 0.090 for days from the first calving to the end of first lactation or culling. High genetic correlations were observed among the traits within the same definition group: from 0.946 to 0.997 for three traits reflecting full lifespan and from 0.666 to 0.997 for five traits reflecting partial productive life. Genetic correlations between two groups of traits ranged from 0.648 to 0.963, and increased gradually with the extension of lactations number regarding the partial productive life traits. A total of 55 SNPs located on 25 chromosomes were found genome-wide significantly associated with longevity, in which 12 SNPs were associated with more than one trait, even across traits of different definition groups. This is the first study to investigate the genetic architecture of longevity representing both full and the partial lifespan simultaneously, which will assist the selection of an appropriate trait definition for genetic improvement of longevity. Because of high genetic correlations with the full lifespan traits and higher heritability, the partial productive life trait measured as the days from the first calving to the end of the third lactation or culling could be a good alternative

INTRODUCTION
Longevity is an economically important trait in dairy cattle, due to its large impact on the efficiency of dairy farming (Weigel et al., 1995;Essl, 1998). The improvement of longevity allows a farm to have not only a higher voluntary culling rate but also a lower involuntary culling rate. On one hand, by reducing involuntary culling rate, the extra costs used for the replacement heifers can be reduced (Jairath et al., 1994;Veerkamp et al., 1995;Brotherstone et al., 1997;Essl, 1998). On the other hand, genetic improvements of a population can be speeded up due to the possibility to increase voluntary culling rate while keep a relatively constant population size (Essl, 1998).
Since 1990s, longevity has been included in the total selection index in many countries (Interbull 1 ) (Miglior et al., 2017). For example, by giving 5 ∼ 14% weights to productive life (covering the full lifespan trait) in the selection index, the longevity in United States Holsteins has started to improve. However, the longevity traits representing the full lifespan is only available after the individual being culled or dead, and thus, the selection response of longevity trait was slow down by the balance between a long generation interval (collecting large scale phenotype) and low selection accuracy (data available). An early selection of longevity could be achieved by using traits which measure a partial lifespan, such as the days from the first calving to a certain lactation. For example, a total of 5 traits including productive days during period from first calving to the end of the first (Lon11), second (Lon12), third (Lon13), fourth (Lon14), or fifth lactation (Lon15) were used to evaluate longevity in the Nordic Cattle Genetic Evaluation 2 . Despite the potential to improve longevity genetically, in Chinese Holsteins, the selection for longevity traits has not yet been implemented. The investigation of genetic parameters for longevity traits with different definitions is critical for the selection of proper longevity traits to be added to the selection index [i.e., China Dairy Performance Index (CPI)] (Dairy Association of China, Beijing 3 ) for the genetic improvement of longevity in Chinese Holsteins. A large number of studies on genetic analysis of longevity were performed in various dairy cattle populations, but all studies only investigated one of the two groups of trait definition, either full lifespan or partial lifespan (Tsuruta et al., 2005;Sewalem et al., 2007;van Pelt et al., 2015;Clasen et al., 2017;Imbayarwo-Chikosi et al., 2017) and none of them ever explored the genetic relationships among longevity traits with different definitions.
By incorporating the information of genetic makers associated with the target trait into selection decisions, faster genetic progress could be achieved (Liu et al., 2017(Liu et al., , 2020. With the help of single nucleotide polymorphism (SNP) panel, genomewide association studies (GWAS) have been used as primary strategies to identify genetic makers for complex traits since Klein (2005). For longevity traits, GWAS have been performed for productive life (Cole et al., 2011;Nayeri et al., 2017) and herd life in United States Holsteins (Nayeri et al., 2017), for productive life and herd life in Italy Holsteins (Steri et al., 2019), for productive life in Thai crossbred Holsteins (Saowaphak et al., 2017), German-Austrian Fleckvieh (Mészáros et al., 2014), and United States composite beef breeds (1/2 Red Angus, 1/4 Charolais, 1/4 Tarentaise) (Hay and Roberts, 2017), and for partial productive life in Nordic Holsteins, Red cattle and Jersey . Nevertheless, the genetic markers of longevity obtained from various GWAS did not overlap well across populations. In addition to the possible differences in linkage disequilibrium (LD) structures across populations and the insufficient detection power due to low heritabilities, the different trait definitions being used in different studies could also be a reason.
The objectives of this study were (1) to estimate heritabilities and genetic correlations for longevity traits with different definitions, including both traits representing full lifespan and traits representing partial lifespan; and (2) to identify genetic variants associated with longevity. Results of this study will assist the selection of appropriate trait definitions to be used for genetic improvement of longevity in dairy cattle.

Longevity in Chinese Holsteins
The descriptive statistics of longevity traits in the Chinese Holstein population are presented in Table 1, and the distributions of each longevity trait is presented in Supplementary Figure 1. The Chinese Holstein cows had 2.7 ± 1.6 lactations in average. In line with the definitions, the length in days of full lifespan traits (from 822 to 1, 618 days) were larger than the partial productive life, except for the milking life (738 days). The coefficient of variation for partial lifespan traits increased gradually with the extension of number of lactations, and the coefficients of variation for full lifespan traits were larger than that for partial lifespan traits, except for herd life. With the extension of the number of lactations, the average partial productive life change from 331 (Lon11) to 749 days (Lon15). However, the magnitude of increase in partial productive life with increasing number of lactations decreased gradually. For example, the increase of productive life was 231 day when increasing from Lon11 to Lon12, but only 18 days when increasing from Lon 14 to Lon15. The change of phenotype value from Lon 11 to Lon15 reflects both individual variation for real longevity and impacts from censored records.

Heritabilities and Genetic Correlations
The estimates of variance component and heritability for eight longevity traits from single-trait animal models are shown in Table 2. Generally, the partial productive life traits had higher heritabilities (ranged from 0.051 for Lon14 to 0.090 for Lon11) than those for the full lifespan traits (ranged from 0.038 for PL and HL to 0.040 for ML). Among the five partial productive life traits, heritabilities decreased with the increase of number of lactations. Among the three full lifespan traits, ML had the highest heritability. Standard errors of heritabilities were low (lower than 0.010), suggesting that all the heritabilities were accurately estimated. Genetic and phenotypic correlations among HL, PL, and ML and among Lon11, Lon12, Lon13, Lon14, and Lon15 estimated using multi-trait animal models are presented in Table 3. Genetic correlations among three traits representing the full lifespan were close to the unity and with standard errors lower than 0.01. For any of the two partial productive life traits, the traits with the closest lactation number usually had higher genetic correlations. For example, the genetic correlation between Lon11 and Lon12 was 0.912, while genetic correlations between Lon11 and Lon13, Lon14, and Lon15 were 0.774, 0.698, and 0.666, respectively. For the partial productive life traits with a period at least 2 lactations (including Lon12, Lon13, Lon14, and Lon15), genetic correlations among these traits were higher than 0.9, ranging from 0.922 between Lon12 and Lon15 to 0.997 between Lon14 and Lon15. The genetic correlations between PL and various partial productive life traits increased gradually with increasing number of lactations of the partial productive life traits. For example, a very high genetic correlation was observed between PL and Lon15 (0.963 ± 0.007), whereas a moderate genetic correlation was observed between PL and Lon11 (0.648 ± 0.044).

GWAS
The SNPs reached the genome-wide significant level in association with longevity traits in Chinese Holsteins are presented in Table 4. In total, 55 SNPs located on 25 chromosomes were genome-wide significantly associated with longevity traits. There were 12 SNPs significantly associated with more than one trait, in which 8 SNPs were shared among the traits within the same group of partial productive life and 4 SNPs were across two groups. For example, the SNP rs135565406 was significantly associated with HL, ML, PL, Lon12, and Lon15. In terms of chromosomes, the chromosome with most SNPs (9 SNPs) significantly associated with longevity traits was chromosome X. For each trait, the number of significantly associated SNPs range from 6 (ML) to 11 (HL, PL, or Lon12) with an average of 9.
Manhattan plots and Q-Q plots for each longevity trait in Chinese Holsteins are presented in Figures 1, 2 Table 1) ranged from 1.06 for PL to 1.15 for Lon11. Results from the Q-Q plots and λ showed that the population stratification was well controlled, since the deviation of the observed distribution of the P-values from the expected distribution was minor.

Annotation of Candidate Genes
Genes harboring or closest (with 200 kb distance) to the significant SNPs were suggested as potential candidate genes for longevity traits. By using this strategy, a total of 106 proteincoding genes and 1 micro RNAs were identified as candidate genes for longevity traits in Chinese Holsteins (Table 4). There are 13 significant SNPs located within CACNA2D1, NAF1, SRD5A2, CDADC1, NCAM1, CAR, HMOX2, CALN1, UBE2E2, CTNNA3, GRIA3, and GUCY2F genes. Furthermore, genes NEFM (rs43559099 and rs134248248), NEFL (rs43559099 and 1 Lon11, the days from the first calving to the end of the first lactation or culling; Lon12, the days from the first calving to the end of the second lactation or culling; Lon13, the days from the first calving to the end of the third lactation or culling; Lon14, the days from the first calving to the end of the fourth lactation or culling; Lon15, the days from the first calving to the end of the fifth lactation or culling; PL, productive life referring the days from the first calving to culling or death; ML, milking life referring the days from the first calving to culling or death but excluding all dry periods; HL, herd life referring the days from birth to culling or death. rs134248248), and GRIA3 (rs135876977and rs41622390) harbor significant SNPs associated with more than one longevity trait.
In terms of chromosomes, there were the most candidate genes in Chromosome X (14 genes), BTA25 (15 genes), BTA 7 (14 genes), and BTA 18 (9 genes). We listed all of the candidate genes detected in current study together with their functions and associated traits information in Supplementary

Longevity in Chinese Holsteins
Our study showed that the average number of lactations in Chinese Holsteins was 2.7, based on the data of dairy cows born from 1999 to 2017. A previous study using data from 1990s showed that Chinese Holsteins usually culled or dead at 3.4 -4.0 lactations (Chu et al., 1995;Li et al., 2000), which was 20.5 -32.5% higher than that observed in current study. There are many possible reasons for the shorter longevity in current Chinese Holstein population. The downtrend of genetic merit on longevity trait in Chinese Holstein population is important reason. On one hand, no direct selection for longevity has been performed in Chinese Holsteins since longevity has not yet been included in the selection index due to the difficulty to collect records of longevity. The longevity has not received the attention it deserves in the past, and a national data collecting system has not yet been established. On the other hand, the intensive selection for milk production traits, which had negative genetic correlations with the longevity traits, has been performed in Chinese Holsteins since the 1990s. Because of its huge impact on the profit of dairy farm, the longevity in dairy cows should not be neglected in the breeding scheme. A similar trend of decrease in the number of lactations was also observed in the United States Holstein population, dating back to the 1990s (Hare et al., 2006). Interestingly, by including the longevity (productive life) into the selection index (TPI), a slight rise of longevity has been achieved since 1994 (Garcia-Ruiz et al., 2016). The example from the United States Holsteins showed that the genetic improvement of longevity can be achieved by selection, when proper longevity traits are employed with sufficient selection intensity. Actually, longevity trait was gradually included into selection index in many countries since 1990s, such as Germany (RZG), Canada (LPI), Denmark (NTM), Netherlands (NVI), and Australia (BPI). The weight of longevity in total selection index ranged from 5 to 20% in various country.

Heritabilities and Correlations
In the present study, the estimated heritabilities of three full lifespan traits were relatively low (0.038 ∼ 0.040), which was Frontiers in Genetics | www.frontiersin.org  (Klassen et al., 1992;Short and Lawlor, 1992;Chauhan et al., 1993;Jairath et al., 1994). For example, there were a high genetic correlation (0.986) between milking life and total productive life in Dutch Holsteins (Vollema and Groen, 1996) and high genetic correlations (0.98 ∼ 1.00) among milking life, total productive life and number of lactation in Canadian Holsteins (Jairath et al., 1994). Because of the high genetic correlations, only one of three full lifespan traits is needed in the breeding program in order to select for longevity representing the full lifespan. In the present study, heritabilities for partial productive life ranged from 0.051 to 0.090, which is similar to the findings (0.022 ∼ 0.090) of crossbred Danish dairy cattle (consisting of Danish Holstein, Jersey, and Red cattle) (Clasen et al., 2017). Furthermore, the FIGURE 1 | Manhattan plots for 8 longevity traits in Chinese Holsteins. Lon11, the days from the first calving to the end of the first lactation or culling; Lon12, the days from the first calving to the end of the second lactation or culling; Lon13, the days from the first calving to the end of the third lactation or culling; Lon14, the days from the first calving to the end of the fourth lactation or culling; Lon15, the days from the first calving to the end of the fifth lactation or culling; PL, productive life referring the days from the first calving to culling or death; ML, milking life referring the days from the first calving to culling or death but excluding all dry periods; HL, herd life referring the days from birth to culling or death.
heritabilities for partial productive life was slightly higher than those of full lifespan traits (0.038 ∼ 0.040) in current study. The genetic parameters for longevity estimated by the current study were reliable; and performing the genetic evaluation on longevity traits was feasible in Chinese Holstein population. Based on the breeder's equation ( BV/t = r BV, BV iσ BV L ), the genetic gain ( BV/t) of longevity from direct selecting the full lifespan traits was relatively slow due to the long generation interval (L) to obtain the phenotypes, which makes the partial productive life traits become attractive. For Chinese Holstein cows born in 2005, it needs 4.96 years on average to obtain phenotype records for total productive life of these animals, while it is 4.35 years for Lon13 on average. In current study, the genetic correlations among partial productive life traits ranged from 0.648 between Lon11 and Lon15 to 0.963 between Lon14 and Lon15, which is similar to the results in Nordic Holsteins from the Nordic Genetic Evaluation 4 . With the increase of lactation number, the information the partial productive life traits carried are closer to the total productive life, and correspondingly, the genetic correlations between total productive life and partial productive life traits gradually increased. The high genetic correlations between the total productive life and the partial productive life indicates the potential to implement an early selection on longevity using the partial productive life. To balance between a high genetic correlation with the total productive life and data availability, Lon13 (from the first calving to the end of third lactation or culling) could be considered as the target trait while keeping other FIGURE 2 | Quantile-quantile (Q-Q) plots of genome-wide association study for 8 longevity traits in Chinese Holsteins. Lon11, the days from the first calving to the end of the first lactation or culling; Lon12, the days from the first calving to the end of the second lactation or culling; Lon13, the days from the first calving to the end of the third lactation or culling; Lon14, the days from the first calving to the end of the fourth lactation or culling; Lon15, the days from the first calving to the end of the fifth lactation or culling; PL, productive life referring the days from the first calving to culling or death; ML, milking life referring the days from the first calving to culling or death but excluding all dry periods; HL, herd life referring the days from birth to culling or death.
partial productive life traits as information traits (only used in genetic evaluation by multi-trait model to increase prediction for the target, but not included in selection index). Further studied needed to be done in order to confirm the reasoning in using the traits Lon13 for selecting longevity in Chinese Holstein.

GWAS
In current study, two significant SNPs (rs137712544 and rs110628337) on BTA2 are within the reported QTL (37,604,171-49,295,365 bp) for the total productive life in German Holsteins (Kuhn et al., 2003). The gene RPRM within this QTL was considered as a candidate gene for longevity, which is a pleiotropic gene involved in suppression of cancer, regulation of mitotic cell cycle, cell cycle arrest, and regulation of survival. The motif prediction and comparison analysis of protein structure for the gene RPRM showed that it plays important roles on bovine fertility, including sexual maturation, steroidogenesis, gametogenesis, gonadal differentiation and gonadotrophin secretion (Durosaro et al., 2015). In the present study, the SNP rs41622390 located within the gene GRIA3 was significant associated with Lon13 and Lon14, and it also was a top associated SNP (P = 4.39E-09) for Lon13. In a study on United States Holsteins (Cole et al., 2011), the gene GRIA3 was reported to be associated with total productive life, somatic cell score, and daughter pregnancy rate. Result from current study confirmed the findings in United States Holsteins, and the gene GRIA3 was considered as a candidate gene for longevity in Holsteins.
The genes GTF2H5 and CA5A are associated with more than one longevity traits and being detected by the top associated SNP for Lon15 (rs134672623) and Lon14 (rs134612709), respectively. The gene GTF2H5, participating in the interstrand adducts removal process of DNA repair, was reported to be associated with mastitis (Chen et al., 2015) and lipomatous myopathy (Peletto et al., 2017) in cattle, lentivirus susceptibility in sheep (White et al., 2012), ovarian cancer (Gayarre et al., 2016), and trichothiodystrophy (Michalska et al., 2019) in human. The gene CA5A may play an important role in ureagenesis and gluconeogenesis and participates in a variety of biological processes, including respiration, calcification, acid-base balance, bone resorption, and the formation of aqueous humor, cerebrospinal fluid, saliva, and gastric acid (van Karnebeek et al., 2014). This gene was reported to be associated with heat stress in African indigenous cattle (Taye et al., 2017), somatic cell count in Chinese dairy cow (Chen et al., 2015) and European Holsteins (Wijga et al., 2011), and productivity and environmental adaptation traits in Rustaqi and Jenoubi cattle (Iraqi indigenous cattle) (Alshawi et al., 2019). Therefore, GTF2H5 and CA5A were suggested as the candidate genes for longevity in Holsteins. Furthermore, the genes RIBC2, FBLN1, and ATXN10 (identified by the association with HL, ML, PL, Lon11 and Lon15), NPY1R and NAF1 (identified by association with HL, Lon13, and Lon14), and TCEAL1, MORF4L2, GLRA4, PLP1, and RAB9B (identified by association with HL, ML, and Lon13) are novel findings from current study. Beside the genes FBLN1, NPY1R, and MORF4L2 associated with milk production, fertility and health traits in previous study (Supplementary  Table 2), few literatures reported that these novel genes are associated with longevity or related traits. These genes could be potential candidate genes for longevity.
Among all 55 significant SNPs identified by the present study, most of them were within the reported QTL for calving traits (26 SNPs, mainly including calving ease, stillbirth, calf size, and birth weight), health traits (24 SNPs, mainly including mastitis, somatic cell count/score, abomasum displacement and ketosis), fertility traits (22 SNPs, mainly including non-return rate and gestation length), and immunity (16 SNPs, mainly including blood immunoglobulin G level). This phenomenon has also been observed in the previous GWAS for longevity in North American (Nayeri et al., 2017), United States (Cole et al., 2011), and Nordic . Holstein population, Nordic Red cattle population , and German-Austrian Fleckvieh population (Mészáros et al., 2014), where the most significant SNPs were located within the previously identified QTL regions for production, type, diseases resistance, somatic cell count/score, fertility and calving traits. All longevity traits with different definitions essentially measure the resistance to culling caused by various problems, such as low productivity, reproduction disorders, and health problems. In the Chinese Holstein population, reproduction disorders (e.g., dystocia and infertility, accounting for 19%) and udder health problems (e.g., mastitis, accounting for 9%) were the main culling causes, according the survey data in the year Yan et al. (2016). The finding of shared genetic markers between longevity and other functional traits further confirms the fact that longevity is genetically related to health, fertility and calving traits. Among 109 potential candidate genes for longevity detected in the present study, 63 genes had been reported to be associated with clinical mastitis, somatic cell count/score, fertility traits (the first calving age, pregnancy rate and the age at puberty), calving traits (birth weight and calving ease score), other health traits (tick resistance, ketosis, brucellosis, and foot-and-mouth disease), embryonic development and heat stress in cattle. For example, the gene CACNA2D1 encodes a member of the alpha-2/delta subunit family, which is associated with voltage-gated calcium channels. In Indian Sahiwal cattle (Magotra et al., 2019), and Chinese Holstein, Sanhe, and Simmental cattle (Deng et al., 2011;Yuan et al., 2011a,b), the gene CACNA2D1 were significantly associated with somatic cell score or clinical mastitis. The gene FGF10 are involved in various cellular processes, including chemotaxis, cell migration, differentiation, survival, apoptosis, embryonic development and angiogenesis, which can inhibit dominant follicle growth and estradiol secretion (Gasperin et al., 2012), and plays important roles follicle selection (Diogenes et al., 2017) and vitro embryo production (Castilho et al., 2017) in cattle. The gene DNAJA3 plays an important antiviral role against foot-and-mouth disease by both degrading VP1 and restoring of IFN-β signaling pathway (Zhang et al., 2019). Furthermore, in post-GWAS analysis for mastitis resistance (Cai et al., 2018) and transcriptome comparative analysis for brucellosis (Rossetti et al., 2011) in cattle, the gene DNAJA3 also showed the significant statistical signal. We suggested that the genes CACNA2D1, FGF10 and DNAJA3 can be considered as candidate genes for longevity.
In current study, the X chromosome had the greatest number of significant SNPs (Table 4), which was in agreement with GWAS results for total productive life trait in United States Holsteins (Cole et al., 2011). In the present study, many genes close to the significant SNPs on X chromosome were found, including THOC2, RNF128, TCEAL1, MORF4L2, PLP1, RAB9B, GUCY2F, NXT2, PIR, VEGFD, ASB11 and ASB9. Excluded the gene THOC2 (associated with fertility in bovine), MORF4L2 (associated with heat stress in Holstein), VEGFD and ASB11 (associated with fertility in pig), no studies about the functions of these genes on various traits in livestock (especially in cattle) were available (Supplementary Table 2).

CONCLUSION
This is the first study to investigate the genetic architecture of longevity traits representing a full or a partial lifespan simultaneously. Because of high genetic correlations with the total productive lifespan traits and higher heritability, the partial productive life measured as days from the first calving to the end of third lactation or culling (Lon13) could be a good alternative trait for early selection on longevity. The shared underlying biological processes among different longevity traits were further confirmed by the detection of shared significant variants. The genes RPRM, GRIA3, GTF2H5, CA5A, CACNA2D1, FGF10, and DNAJA3 were suggested to be candidate genes for longevity in Holsteins, which could be used to pinpoint causative mutations and further benefit genomic prediction for longevity in dairy cattle. This study proved the feasibility of genetic evaluation on longevity in Chinese dairy population, and it should be considered in selection index of Chinese dairy cattle.

Phenotype and Pedigree
The dates of birth, calving, drying, and culling or death were collected for 132,690 Chinese Holstein cows born from 1999 to 2017, which were raised in 31 herds in China, including 22 herds in Beijing, two herds in Hebei, and one herd each in Tianjin, Yunnan, Henan, Heilongjiang, Jilin, and Inner Mongolia, respectively. These electronic records of each cow were extracted from the farm management software (AfiFarm 5 ). A total of eight longevity traits were analyzed in this study, including three traits representing full lifespan and five traits representing partial lifespan. Traits representing full lifespan were herd life (HL) referring the days from birth to culling or death, total productive life (PL) referring the days from the first calving to culling or death, and milking life (ML) referring the days from birth to culling or death but excluding all dry periods. Traits representing partial lifespan were the days of a cow staying during the period from the first calving to the end of the first (Lon11), second (Lon12), third (Lon13), fourth (Lon14), or fifth lactation (Lon15), according to the study by Clasen et al. (2017). In order to reflect the real longevity of cows, no correction for production or for functional performance has been performed for any of the eight longevity traits. Only cows with age at first calving ranged from 600 to 1800 days and did not change herds during the data collection period were kept for further analyses. Besides, for HL, PL and ML, only culled or dead cows were kept; whereas for all five partial lifespan traits, only cows culled or finished the corresponding lactation were kept. Ultimately, there were 78,227 cows available for HL, PL,and ML,and 103,479,90,279,82,826,79,571,and 78,144 cows available for Lon11, Lon12, Lon13, Lon14, and Lon15, respectively. To obtain an adequate pedigree, cows with phenotypic records were traced back as many generations as possible. The final pedigree included 437,418 females and 12,401 males born from 1969 to 2017.

Genotype
A total of 2,629 cows born from 2003 to 2015 were genotyped with the Illumina 150 K bovine bead chip (Illumina, Inc., San Diego, CA, United States). Genotype imputation was performed using the Beagle 5.1 software (Browning and Browning, 2009). The SNPs were removed from the dataset if they exhibited: (1) Minor allele frequency (MAF) lower than 0.05; (2) Fisher's exact test P-value for Hardy-Weinberg Equilibrium (HWE) less than 10-6; or (3) unknown position. Quality control for animals available in each trait was performed, respectively. Ultimately, the numbers of SNPs used for GWAS ranged from 116,547 for PL to 116,570 for Lon12 (Supplementary Table 3).

Genetic Parameter Estimation
Variance and co-variance components for eight longevity traits were estimated using the average information restricted maximum likelihood algorithm implemented in the DMU software (Madsen et al., 2006). Heritabilities were estimated using single-trait animal models. Correlations between the longevity traits within the same group were estimated using a multi-trait model, that is, a three-trait animal model for the three traits representing full lifespan (HL, PL, and ML), and a two-trait animal model for each pair of the five partial lifespan traits (Lon11, Lon12, Lon13, Lon14, or Lon15), instead of a fivetrait animal model which did not meet the convergence criteria. Genetic correlation between PL and any one trait measuring partial productive life was estimated using a two-trait animal model. The effects included in the model were the same for all longevity traits: y = afc + hy + ys + a + e where y is the vector of phenotypes for longevity traits, afc is the vector of fixed effect of age at first calving (≤22, 23, 24, 25, 26, 27, 28, 29, and ≥30 month); hy is a vector of fixed effect of herdbirth year; ys is a vector of fixed effect of birth year-season; a is the vector of additive genetic effects; and e is the vector of random residual effects. It was assumed that a ∼ N 0, Aσ 2 a and e ∼ N 0, Iσ 2 e , where A is the matrix of additive genetic relationships constructed from the pedigree, σ 2 a is the additive genetic variance, I is the identity matrix, and σ 2 e is the residual variance.

Genome-Wide Association Studies
The de-regressed estimate breeding values (dEBV) (VanRaden et al., 2009) generated from BLUP solutions of the above mentioned single-trait model were used as pseudo phenotypes for GWAS. The descriptive statistics of dEBV for 8 longevity traits are listed in Supplementary Table 4. A Fixed and Random Circuitous Probability Unification model implemented in the FarmCPU software  was used to perform singletrait GWAS. To reduce false positives caused by the population structure, top 50 or 90 principal components (PCs) calculated by the PLINK software 6 were added to the GWAS model as covariates to control the inflation factor λ below 1.2. The proportions of phenotypic variance explained by these PCs are listed in Supplemental Table 3. Bonferroni correction was used to control the false positive resulted from multiple comparisons.
The significance threshold was defined as 0.05/N, where N was the number of SNPs being tested. The Quantile-quantile (Q-Q) plots and the genomic inflation factor λ (Devlin and Roeder, 1999) were used to determine whether the observed distributions of -log (P-value) was against the expected distribution under no association hypothesis. The ARS_UCD1.2 assembly from the UC Santa Cruz genome annotation database 7 was used to refer the SNP positions and search the genes related to the significant SNPs.

DATA AVAILABILITY STATEMENT
The data analyzed in this study is subject to the following licenses/restrictions: This manuscript utilizes proprietary data.
Requests to access these datasets should be directed to YW, wangyachun@cau.edu.cn.

ETHICS STATEMENT
Ethical review and approval was not required for the animal study because All phenotypic data were recorded as part of routine dairy cattle management and genetic evaluations. The DNA samples were obtained for the purpose of routine genomic evaluations in previous projects. Thus, no additional animal handling or experiment was performed specifically for this study.

AUTHOR CONTRIBUTIONS
HZ, GS, and YW organized the study. HZ and AL led the manuscript preparation. HZ performed the data analysis and HL did the data curation of genotype. XY and XL did the data curation of phenotype. LL provided support for collection of raw data. All authors contributed to the article and approved the submitted version.

ACKNOWLEDGMENTS
We thank the Dairy Association of China (Beijing, China) for providing the pedigree.