Unraveling the Complexity to Observe Associations Between Welfare Indicators and Hair Cortisol Concentration in Dairy Calves

Using levels of the stress hormone cortisol as an indicator of welfare is a common but debated practice. In this observational study, hair cortisol concentration (HCC) was determined using a commercially available ELISA in samples from 196 dairy calves, 7 to 302 days of age, collected on 12 Swedish farms. Animal welfare was assessed using animal-based indicators on the day of sampling. First, methodological factors with the potential to impact HCC and the effect of age were analyzed using generalized additive models. This revealed a significant peak in hair cortisol in young calves (around 50 days of age) and an association between fecal contamination of hair samples and the level of cortisol extracted. Second, associations between welfare indicators and HCC were explored using cluster analysis and regularized regression. The results show a complex pattern, possibly related to different coping styles of the calves: indicators of poor welfare were associated with both increased and decreased hair cortisol levels. High cortisol levels were associated with potential indicators of competition, while low cortisol levels were associated with signs of poor health or a poor environment. The results of the regularized regression did not change between the analysis without the contaminated hair samples and the analysis with all samples (including a contamination score), indicating that it may be possible to use a contamination score to correct for contamination.


INTRODUCTION
Determining the activity of the hypothalamic-pituitary-adrenocortical (HPA) axis by measuring cortisol has been used as a potential indicator of stress and welfare for many years, although its use is complicated by the complexity of the system (Mormède et al., 2007). Persistent increases in cortisol levels, indicating chronic stress, have been suggested to be the most harmful for the health and well-being of any animal, although chronic stress can also be associated with decreased cortisol levels due to adrenal exhaustion. Yet, the most common methods for measuring cortisol, such as sampling blood and saliva, are mainly suitable for studying acute changes during a limited time (Otovic and Hutchinson, 2015). In addition, these methods are sensitive to stress associated with sampling, and natural daily variation and adaptation to recurrent stressors may also affect the results (Ladewig and Smidt, 1989; Knights and Smith, 2007). Measuring cortisol metabolites in feces or urine enables retrospective measures that are unaffected by the sampling procedure, but these measures are influenced by passage rate and require the collection of all feces or urine excreted by the animal during the time period of interest, which causes practical difficulties (Möstl and Palme, 2002). Hair cortisol, on the other hand, is a non-invasive technique used for studying retrospective cortisol levels (Lee et al., 2015; Roth et al., 2016; Burnard et al., 2017). As cortisol is incorporated into the hair during its growth, this method gives a picture of the total amount of circulating cortisol over time, and the sampling is not subject to the problems of circadian variation or the interference of momentary stress during sampling (Lee et al., 2015). Previous longitudinal studies have confirmed the relationship among hair cortisol concentration (HCC), serum levels, and fecal levels of cortisol in cattle (González-de-la-Vara et al., 2011; Tallo-Parra et al., 2015).
However, there are factors that can impact the analysis of HCC. For example, the washing and extraction protocol, sampling site and type of hair, season, age, and fecal contamination can impact HCC (Davenport et al., 2006; Moya et al., 2013; Burnett et al., 2014; Meyer et al., 2014; Tallo-Parra et al., 2015; Heimbürge et al., 2020a; Otten et al., 2020). In experimental studies, it is possible to control for these aspects within the study design, but in observational studies, some factors can pose a challenge and require control at the analysis stage. One such example is urine or fecal contamination, which may be unavoidable when maintaining a consistent hair sampling site. Furthermore, although an effect of age has been shown, with newborn and younger calves (6 weeks of age) having higher HCC than older animals (24 weeks and 6 months of age) (Heimbürge et al., 2020b; Hayashi et al., 2021), the pattern of decrease over time between these ages remains to be investigated.
The use of cortisol as a welfare indicator also has important limitations that have been pointed out in several publications (Rushen, 1991; Korte et al., 2007; Otovic and Hutchinson, 2015). While stress and animal welfare are closely linked, high stress does not automatically cause poor welfare and low stress does not necessarily define good welfare (Botreau et al., 2007). In addition, the concept of welfare stretches beyond the pure presence or absence of stress and includes the feelings experienced by the animal (Duncan, 2005). Thus, the relationship between HCC and welfare may depend on which welfare indicator is being measured. Previous studies on cattle have shown increased levels of hair cortisol in clinically compromised cows and cows subjected to heat stress (Burnett et al., 2015; Ghassemi Nejad et al., 2017). However, only weak associations between the results of welfare assessments and HCC in dairy cows have been observed at the herd level (van Eerdenburg et al., 2021).
In this observational study, the associations among HCC, age, and various animal-based indicators of welfare in young dairy calves are explored to assess the usefulness of hair cortisol as an indicator of welfare. We also investigate the effect of potential confounders and discuss the possibilities and limitations of controlling for the effect of fecal contamination.

MATERIALS AND METHODS
The cross-sectional study included 196 calves (1-43 weeks of age) from 42 pens on 12 Swedish dairy farms. These farms participated in another ongoing research project and were selected due to the presence of verotoxin-producing Escherichia coli (VTEC). A detailed description of the selection of animals for sampling can be found in Tamminen et al. (2020). In short, calves in pens where VTEC was present were selected for sampling using a systematic randomization process. The sampled calves were housed indoors in groups under typical Swedish conditions, most commonly on straw bedding, although some calves were kept on sawdust or slatted floors. Ethical approval for the study was granted by the regional ethical committee (Uppsala Djurförsöksetiska Nämnd, Dnr: C 85/15).

Animal-Based Assessment
The farm visit was planned so that the undisturbed animals could be observed for 20 min around morning feeding time, as this is a time point of high activity (Bokkers and Koene, 2001). If multiple pens were included on a farm, the observer rotated between the pens every 5 min until each pen had been observed for a total of 20 min. Two out of 42 pens contained more than 20 animals and were too large for the whole pen to be observed from one location. In these cases, the pen was observed for 2 × 20 min from two opposite locations to account for the possibility of missing behaviors performed by animals standing far away from the assessor. The protocol, which consisted of clinical scoring, undisturbed behavioral observations, and a behavioral test, was developed based on previous publications on the assessment of calf welfare (Lidfors, 1993; Jensen and Kyhn, 2000; Bokkers and Koene, 2001; Whay et al., 2003; Windschnurer et al., 2008; Bokkers et al., 2009; Leach et al., 2009; Welfare Quality®, 2009). The full protocol is presented in Table 1. The measures consisted of a combination of long-term indicators, such as body condition score and coat condition, and measures reflecting a shorter time period, such as ruminal fill. When applicable for the behavior being observed, such as mounting and licking, both performer and receiver were noted. After the undisturbed behavioral observations, fear of unknown humans was tested using a distance avoidance test. Finally, calves were restrained, individually assessed (e.g., for nasal discharge and tear staining), and sampled for VTEC and hair. At this time, the reactivity of the calf to restraint was also scored. All animal-based assessments were carried out by the first author.

Individual assessment | Definition
Body condition score | Palpation and scoring from 1 (very thin) to 5 (fat) based on the amount of muscle/fat covering the lumbar vertebrae and hip bones. Split into poor (score 1-2) and normal (3-4). No score 5 was observed.
Cleanliness (below hocks; upper hind legs; body) | Ocular observation and scoring from 1 to 3: no dirt; minor splashing of manure (smaller than a hand in total); distinct dried plaques of manure (larger than a hand) or 50% of area covered in wet manure.
Coat condition | Ocular observation and scoring: poor (dull, shaggy or long rough hairs); normal (shiny and short hairs).
Conjunctivitis | Ocular observation of inflammation (redness/swelling) around the eye.
Fearfulness (distance avoidance test) | Freestanding calf is approached at a speed of 1 step per second with the experimenter aiming at reaching shoulder level. Stop at a distance of one arm's length and arm is lifted to touch the animal. Distance when avoidance reaction occurs is noted. Scores 1 to 5: >2 m, <2 m, <1 m, withdrawal when arm is lifted, or the animal can be touched (for at least 2 s).
Fecal consistency | Ocular observation and assessment of consistency. Scored as: diarrhea (feces watery or slimy); loose (feces smooth and homogeneous); firm (feces firm and break up).
Hairless patches | Ocular observation and counting of areas >2 cm² with hair loss, extensive thinning of the coat due to parasites, or hyperkeratosis with nondamaged skin. Up to 20 areas were counted. One area larger than the size of a hand counted as 20 small areas.
Lameness | Ocular observation and scoring from 1 to 3: no lameness; uneven temporal rhythm; reluctance to bear weight on a leg.
Nasal discharge | Ocular observation of flow/discharge from the nostrils.
Number of wounds/inflammation | As described above for hairless patches, but observation and counting of areas with damaged skin, either in the form of a scab or a wound, dermatitis due to ectoparasites, or ear lesions due to torn-off ear tags.
Reactivity (during sampling) | Assessed during restraining for hair sampling. Animal scored from 1 to 4 after release: 1. Animal is relaxed and not trying to resist being restrained. Curious and/or contact-seeking toward persons performing the sampling; 2. Animal is nervous and stepping around during sampling. May try to avoid sampling by moving the hind and lifting hind legs although not kicking; 3. Animal is forcefully trying to avoid being restrained by trying to escape or forcefully kicks; 4. Animal is paralyzed by the restraining. Increased respiration rate and introvert behavior. No forceful attempts to escape or react to the sampling.
Ruminal fill | Ocular observation and scoring as: poor (visible triangle formed in left paralumbar fossa and small rumen); normal (filled rumen is visible in paralumbar fossa).
Size within group | Ocular observation of group and scoring as: small (calf is smaller than the average size in the group); normal (calf size does not differ from other animals); large (calf is larger than the average size in the group).

Analysis of HCC
Hair samples were collected from the tail of each calf (just above the tail switch) with scissors. Hair was cut as close to the skin as possible. Samples were wrapped in aluminum foil and stored in the dark at room temperature for 3-7 months before analysis. Of the 318 sampled animals, 196 samples, including both individuals colonized by VTEC O157:H7 and randomly selected negative controls (1-3 controls per colonized individual), were selected for analysis of hair cortisol. The protocol for preparing and analyzing hair was based on previous studies on cattle and other species (Meyer et al., 2014; Tallo-Parra et al., 2015; Roth et al., 2016) but was adapted and modified by the authors. Prior to analysis, hair from the centimeter closest to the cut end of the hair (closest to the skin) was cut with scissors and placed in a 15-ml tube. Samples were washed by adding 4 ml of isopropanol and vortexing for 4 min before the isopropanol was decanted.
The procedure was repeated three times. Samples were left to dry at room temperature under a fume hood for 2-3 days until all the isopropanol had evaporated. For samples where fecal contamination remained after the washing procedure, the dirtiness of the hair was scored from 1 to 4 (1 = no dirt, shiny; 2 = <5 dirt particles, shiny; 3 = >5 dirt particles, shiny; 4 = >5 dirt particles, dry or dull). Approximately 50 mg of each hair sample was added to a 2-ml microtube with a screwcap. If less than 50 mg of washed hair was available, as much as possible was used. Exact duplicates were prepared for 30 samples (12%). In addition, a duplicate sample with a different degree of remaining dirt or of a different color was prepared from 21 of the individual samples. Samples were prepared farm by farm in random order. For exact duplicates, every 10th sample was selected during preparation (if the sample did not have enough hair, the 11th sample was taken). The 21 samples of different colors and dirtiness were selected based on availability during sample preparation. Three chrome steel balls (3.2 mm in diameter, BioSpec Products, Cat. No. 11079132c) per tube were added, and the tubes were frozen in liquid nitrogen for 2 min before grinding the hair for 3 × 60 s at 6,500 RPM in a bead beater (Precellys Evolution). After grinding, 1.2 ml of methanol was added and the samples were placed on a rocker. After 18 h, the samples were centrifuged. The first 41 samples were centrifuged for 2 min at 7,000 g, and 0.6 ml of methanol was transferred to new 1.5-ml tubes. As the first samples had occasional hair particles in the final product, the centrifugation protocol was changed and the remaining samples were centrifuged two times. After the initial centrifugation, 0.8 ml of methanol was transferred to new 1.5-ml tubes, the samples were recentrifuged at 10,000 g for 5 min, and 0.6 ml of methanol was transferred to new 1.5-ml tubes.
Methanol was evaporated at 38°C under a fume hood after the centrifugation. Phosphate-buffered saline (0.01 M PBS, pH 7.4) was added to the dried samples. The first 92 samples were dissolved in 200 µl PBS and the last 159 samples in 150 µl, as the dried sample appeared to dissolve better in this volume. The concentration of cortisol was determined using an ELISA kit designed for salivary cortisol (Salimetrics Europe Ltd, Art 1-3002) according to the manufacturer's instructions (validated by Moya et al., 2013). Each sample was analyzed in duplicate, and high and low controls were run on each assay. Inter- and intraassay coefficients of variation (CV) were calculated according to the instructions from the manufacturer (Salimetrics, 2018). Final cortisol content (pg/mg) in hair was calculated using the formula presented by Meyer et al. (2014).
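The conversion from an assay reading to cortisol content per mg of hair can be illustrated as a unit conversion. The Python sketch below is only an illustration under stated assumptions: it assumes the assay reading is expressed in µg/dl and all volumes in ml, and that the calculation corrects for the fraction of methanol recovered and for the reconstitution volume. The function name, argument names, and the example reading of 0.1 µg/dl are hypothetical and are not taken from the study.

```python
def hair_cortisol_pg_per_mg(assay_ug_per_dl, hair_mg, methanol_added_ml,
                            methanol_recovered_ml, buffer_ml):
    """Convert an assay reading (ug/dl) to hair cortisol content (pg/mg).

    Assumed units: reading in ug/dl, hair mass in mg, volumes in ml.
    """
    values = (hair_mg, methanol_added_ml, methanol_recovered_ml, buffer_ml)
    if min(values) <= 0:
        raise ValueError("mass and volumes must be positive")
    # 1 ug/dl equals 10,000 pg/ml.
    pg_per_ml = assay_ug_per_dl * 10_000
    # Cortisol present in the reconstituted (dried-down) methanol fraction.
    pg_in_recovered_fraction = pg_per_ml * buffer_ml
    # Scale up from the recovered methanol to all methanol added to the hair.
    pg_in_total_extract = (pg_in_recovered_fraction * methanol_added_ml
                           / methanol_recovered_ml)
    return pg_in_total_extract / hair_mg


# Hypothetical reading of 0.1 ug/dl with the volumes described in the text:
# 50 mg hair, 1.2 ml methanol added, 0.6 ml recovered, 150 ul (0.15 ml) PBS.
example_hcc = hair_cortisol_pg_per_mg(0.1, 50, 1.2, 0.6, 0.15)
```

With these illustrative inputs the result is about 6 pg/mg. Because the reconstitution volume enters as a multiplier, the 200 µl and 150 µl protocols give different values for the same assay reading.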

Data Management and Statistical Methods
Data were entered in Excel (Microsoft Corp., Redmond, WA) and exported to R Statistical Software (R Core Team, 2018) for statistical analysis. Hair cortisol was log10-transformed to approximate a normal distribution before calculation of Pearson's correlation coefficient for exact duplicates and duplicates of different dirtiness or colors. Plots of the correlation were generated using the package ggplot2 (Wickham, 2016). The impact of remaining dirt, age, hair color, month, and methodological changes during the process was analyzed using a generalized additive model (GAM) with a log link function in the package gamm4 (Wood, 2004; Wood and Scheipl, 2017). In this analysis, one sample from each pair of exact duplicates was excluded, but all other samples were included. A random effect of pen nested within herd was included to account for the clustering of the observations. The model was reduced based on the change in R-squared after the removal of one variable at a time, and the final model was evaluated by residual plots.
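The duplicate comparison described above can be sketched as follows. This is a minimal Python illustration, not the R code used in the study; the function names and the duplicate values are made up for the example.

```python
import math


def pearson_r(x, y):
    """Pearson's correlation coefficient for two equally long sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def log10_pearson(first, second):
    """Correlate duplicate measurements after log10 transformation."""
    log_first = [math.log10(v) for v in first]
    log_second = [math.log10(v) for v in second]
    return pearson_r(log_first, log_second)


# Hypothetical duplicate HCC values in pg/mg (not data from the study).
first_run = [2.0, 5.0, 9.0, 20.0, 80.0]
second_run = [2.4, 4.6, 10.1, 18.0, 95.0]
r_duplicates = log10_pearson(first_run, second_run)
```

The transformation matters because HCC values are right-skewed; on the raw scale, a few very high values would dominate the coefficient.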
To explore and visualize the observations of the calves and HCC, a cluster analysis of mixed data was performed using the CluMix package, in which the distance between individual calves was based on Gower's general similarity coefficient and variables were clustered based on similarity using measures of association (Hummel et al., 2017, 2019). The associations between hair cortisol (log10-transformed) and welfare indicators were also analyzed using regularized regression (elastic net) in the package glmnet (Friedman et al., 2010). For animals with multiple samples, only one sample (with the lowest CV value) was included. Associations between welfare parameters and HCC were first explored excluding the samples with fecal contamination (n = 139) and second with all samples, but then with the dirtiness score included in the analysis (n = 196). There were small amounts of missing data in the welfare indicators, randomly scattered across observations. The proportion of missing data per indicator was low, for the majority of indicators <1%. The indicators with the highest proportions of missing values were cleanliness of the body (8 missing) and the distance avoidance test (7 missing at random, 23 missing by design). These were followed by cleanliness of the upper hind legs (6), cleanliness below hocks and coat condition (5), and ruminal fill (4). Another 10 variables were missing 1 observation. Missing values were assumed to be missing at random, except for the distance avoidance test, where 23 observations were missing because the pens were too small or crowded for carrying out the test. These were treated as a separate category in the analysis. The remaining missing values were imputed using non-parametric random forest imputation in the package missForest before the elastic net regression (Stekhoven and Bühlmann, 2012; Stekhoven, 2013).
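To make the idea of a Gower-type distance for mixed data concrete, the following Python sketch computes the distance between two calves described by one quantitative and two categorical variables. It is a simplified illustration and does not reproduce the CluMix implementation (for example, ordinal variables and variable weights are not handled); the variable names, values, and the age range of 295 days are assumptions made for the example.

```python
def gower_distance(a, b, numeric_ranges):
    """Gower-type distance between two records with mixed variable types.

    a, b: dicts mapping variable name to value (None marks a missing value).
    numeric_ranges: dict mapping each quantitative variable to its range in
    the data set; variables not listed are treated as categorical.
    """
    total = 0.0
    used = 0
    for name, value_a in a.items():
        value_b = b.get(name)
        if value_a is None or value_b is None:
            continue  # skip variables not observed in both records
        if name in numeric_ranges:
            span = numeric_ranges[name]
            diff = abs(value_a - value_b) / span if span > 0 else 0.0
        else:
            diff = 0.0 if value_a == value_b else 1.0
        total += diff
        used += 1
    if used == 0:
        raise ValueError("no variables observed in both records")
    return total / used


# Hypothetical calves; the age range (302 - 7 days) is illustrative.
calf_a = {"age_days": 30, "fed_milk": "yes", "coat_condition": "normal"}
calf_b = {"age_days": 130, "fed_milk": "no", "coat_condition": "normal"}
ranges = {"age_days": 295}
distance_ab = gower_distance(calf_a, calf_b, ranges)
```

Each variable contributes a value between 0 and 1, so quantitative and categorical indicators can be combined without one type dominating the distance.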
Two variables (lameness and conjunctivitis) were excluded due to little variance between the calves, and qualitative indicators with several levels were reduced by grouping if some levels contained very few observations and grouping was biologically plausible (Supplementary Table S2). The counts of hairless patches and wounds or swellings were also condensed into categories (Supplementary Table S2). For many behavioral observations, a high number of calves did not perform the behavior, leading to a distribution with a high frequency of zeros. This was handled by adding a categorical variable describing whether the behavior was observed and a quantitative variable with the number of times the behavior was observed (Robertson et al., 1994). The models were tuned and fitted using leave-one-out cross-validation in the package caret (Kuhn, 2020), and the models that generated the smallest mean squared errors were selected (James et al., 2013). Herd was included as a fixed effect to account for clustering.
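The handling of zero-heavy behavioral counts can be sketched as a two-part encoding. The Python example below is an illustration only; the exact coding used in the study is not given in the text, and the function name and counts are hypothetical.

```python
def two_part_encoding(counts):
    """Split a zero-heavy count into an observed flag and the count itself.

    Returns a list of (observed, count) pairs, where observed is 1 if the
    behavior was seen at least once and 0 otherwise.
    """
    encoded = []
    for count in counts:
        if count < 0:
            raise ValueError("counts must be non-negative")
        encoded.append((1 if count > 0 else 0, count))
    return encoded


# Hypothetical number of times five calves were displaced during observation.
displaced_counts = [0, 0, 3, 0, 1]
displaced_features = two_part_encoding(displaced_counts)
```

If a regularized model retains both columns, the flag captures whether the behavior occurred at all, and the count captures the additional association with how often it occurred.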

RESULTS
The cortisol levels in hair ranged from 1.57 to 138.22 pg/mg with a median of 9.19 pg/mg. For the high control, the average CV was 9.1%, and for the low control it was 10%, giving an interassay CV (average of the high and low controls) of 9.5% (n = 12). The mean intraassay CV was 4.3% (n = 12). These values are well within the recommendations of the manufacturer (15 and 10%, respectively). Correlation between the 30 exact duplicates was high (Pearson's correlation coefficient = 0.94), while the correlation between the 21 duplicates of different colors and dirtiness was lower (Pearson's correlation coefficient = 0.76). There was no association between HCC and colonization with VTEC O157:H7 (Tamminen, 2020).

Factors Influencing the Extraction of Cortisol
The GAM showed that the dirtiness of hair was associated with a significant increase in cortisol extracted and that the increase was larger for the dirtiest samples (Table 2). There was a significant peak in hair cortisol levels in calves around 50 days of age and a tendency for darker hair to be associated with higher HCC (Figure 1). Other parameters (storage time, number of centrifugations, methanol) did not influence the cortisol concentrations (Table 2). A preliminary analysis indicated that lower HCC was seen in the samples where 200 µl of PBS was used to dissolve the cortisol, which supported our suspicion that the dried samples did not dissolve completely. However, in a stratified analysis of samples from farms where both preparation volumes were applied, no difference was observed. It was, therefore, concluded that the effect observed was a result of farm differences and not the amount of PBS. Also, removing the PBS variable did not impact the explanatory abilities of the model, further supporting that the effect was related to farm variations. Thus, this variable was excluded from the final model.

Association Between HCC, Age, and Indicators of Welfare
Detailed results of the welfare and behavior assessments (including all samples regardless of contamination level) and a visualization of the cluster analysis can be found in Supplementary Tables S1 and S2 and Supplementary Figure S1. HCC clustered (as expected from the previous analysis) closely to the dirtiness score. Interestingly, hair cortisol also clustered closely to the score in the distance avoidance test (which was in the same cluster as the dirtiness score), although no clear pattern of association between HCC and the score in the distance avoidance test could be distinguished. Most notably, a group from Herd 1, where the distance avoidance test was missing by design (indicating a small and crowded pen), appeared to have higher HCC levels. In addition, HCC clustered close to age, herd, sex, being fed milk, and stretching (although the latter had relatively few observations). Young animals being fed milk formed a cluster that appeared to have higher HCC values compared with the older animals. Animals with poor body condition (and poor coat condition) also showed high similarity and clustered closely together. However, in this group, cortisol levels appeared to be decreased rather than increased. There was also a cluster where calves with a high frequency of coughing during the behavioral observations and with nasal discharge and coughing observed during the clinical observations appeared to have a slightly lower HCC. There was some clustering associated with the herd, but the degree of similarity within herd differed between herds. Regularized regression using clean samples and all samples yielded comparable models with similar R-squared (0.59 and 0.58) and mean squared error (0.42 and 0.46). In the model including all samples, the dirtiness score of hair (Scores 3 and 4) had the largest coefficients (Score 2 was also included but with a smaller coefficient).
The coefficients of variables describing the association between HCC and the welfare indicators in the two models are presented in Figure 2. Variables with positive coefficients (i.e., associated with increased hair cortisol) in both models were a mixture of potential indicators for both poor and good welfare (Boissy et al., 2007; Botreau et al., 2007). Being fed milk had large coefficients in both models. The behaviors with the largest positive coefficients, i.e., affecting cortisol the most, present in both models were the qualitative and quantitative variables for being displaced and being rubbed on by others. That both the qualitative and the quantitative indicators for displacement were included in the final models implies that the effect of being displaced on cortisol concentration increases the more often calves are displaced. Together with poor ruminal fill (model with all samples), these variables are potential indicators for competition and/or low dominance. Moderate tear staining was also included in both models. There were also variables with smaller positive coefficients not usually related to poor welfare (e.g., locomotor play, which is considered an indicator of good welfare, and grooming or comfort behaviors such as self-licking and rubbing). The variables associated with low cortisol levels (i.e., variables with negative coefficients) in both models were related to poor health (nasal discharge, coughing), dirtiness on the upper hind legs (indicating diarrhea), and high reactivity during sampling. Performing stereotypic behaviors, such as cross-sucking and tongue movements, also had small negative coefficients in both models.

FIGURE 2 | Coefficients from the regularized regression model that minimized root mean squared error. Positive coefficients were positively associated with HCC and negative coefficients were negatively associated with HCC. Behavioral variables marked with (#) are quantitative measures of the number of times the behavior was observed as opposed to binary variables indicating whether a behavior was observed (yes/no).

There were some differences between the two models. In the model including only clean samples, mounting behaviors had relatively large coefficients while poor ruminal fill and dirtiness below hocks had relatively large coefficients in the model with all samples. For the negative coefficients, poor cleanliness of the body had a larger negative coefficient in the model with only clean samples compared with the model with all samples. These differences are likely associated with correlations between variables (leading to different variables being picked in the different models) and in some cases (such as for mounting) a result of few observations.

DISCUSSION
The results of this observational study display the complexity of the HPA axis and how factors related to contamination of samples, environment, and individual differences are associated with HCC. Many of these are closely interconnected and need to be considered and interpreted in association with each other.

Methodological Considerations Regarding the Extraction and Analysis of Hair Cortisol
The mean cortisol content in the samples was 15.4 pg/mg, which is in the higher range of what previous studies using an ELISA from the same manufacturer have reported (Moya et al., 2013; Burnett et al., 2014, 2015; Ghassemi Nejad et al., 2017). However, considering the peak in hair cortisol values observed in young animals, this is not surprising as these other studies included older animals. As visualized in Figure 1, a significant increase in hair cortisol levels of dairy calves with a peak around 50 days of age was observed in the GAM. In the regularized regression, the variable "Fed milk," i.e., belonging to the age group where the peak was observed, was associated with increased hair cortisol levels. The effect of young age on HCC is consistent with the increased HCC in 6-week-old calves compared with 24-week-old calves observed by Wood and Scheipl (2017).
Calves were weaned around 2 months of age on the farms included in this study, and the observed peak may be related to increased stress due to reduced milk allowance before weaning. There are also additional changes in the life of a young calf occurring during the first 2 months. For example, most of the farms kept animals in single pens for about 2 weeks before they were group housed. This transition, including social challenges and increased activity, may have influenced hair cortisol levels. Calves are also disbudded during this time period and although this is done with local analgesics in Sweden, the procedure may induce stress and pain when the analgesics wear off. Experimental studies where hair is clipped and the newly grown hair regularly sampled should provide additional information about the role of different challenges in a young calf's life.
Correlation between duplicates was high for the exact duplicates, while duplicates with different levels of contamination and color showed more variation. This, in combination with the results of the GAM, emphasizes that fecal contamination had a large influence on HCC, and the association is supported by a dose-response relationship, i.e., increasing levels of contamination are associated with a larger effect (Hill, 1965). As the risk of dirtiness on the tail is high, we suggest avoiding the use of hair from this location when studying dairy calves, contrary to the recommendations in previous studies on cattle (Moya et al., 2013; Burnett et al., 2014; Vesel et al., 2020). Dark hair color was also associated with increased HCC, while mixed samples did not differ from white hair samples. This could be a result of the relatively low number of samples with dark and mixed hair in this study, of contamination being more difficult to observe in dark samples, or of other unmeasured parameters (such as the thickness of hair).
Fecal contamination of the hair sample was negatively correlated with the cleanliness of the calf in the welfare assessment (i.e., the samples from clean calves were less likely to be contaminated), and the variables clustered relatively close in the cluster analysis. Due to this correlation, it is possible that some aspects of the high HCC observed in dirty samples are actually associated with calves living in a suboptimal, dirty environment. There is some support for this in cluster analysis where it appears that high HCC was generally associated with poor cleanliness, except among the younger individuals that stand out with very high cortisol levels and good cleanliness. This discrepancy between the group of young animals and others may explain the relatively large change in the coefficient for poor cleanliness of body when more samples were added in the regularized regression. With the larger sample size, the influence of young animals, and thereby the usefulness of the variable cleanliness of body, decreased. In addition, some variables with few observations, such as mounting others, also decreased in importance when additional samples were added. However, we note that the majority of the larger coefficients not related to cleanliness remained comparable between models. Thus, we argue that using a contamination score can be a way to handle samples exposed to contamination in the analysis, as long as the potential for confounding between environment and contamination is kept in mind.

Hair Cortisol as an Indicator for Welfare in Dairy Calves
It is important to note that in this study, welfare was only assessed at one time point and the behaviors were observed during a relatively short time period. This means that calves are compared based on a snapshot during an active time period. Hair cortisol, on the other hand, represents an average of cortisol levels during the time of hair growth, and there are likely past events influencing hair cortisol levels that are unaccounted for in this study. Still, it is interesting that, despite the limited observation time, we find relatively short-term measures, such as poor ruminal fill and being displaced, related to increased hair cortisol levels. However, as the personality and feeding behavior of calves have been shown to be relatively constant over time (Lecorps et al., 2018; Neave et al., 2018), the short-term measures may still be related to retrospective cortisol levels. Body condition score (BCS) is also a measure of feed intake, although a more long-term one than ruminal fill, but this variable was not associated with high HCC levels. In the cluster analysis, this measure clustered with another long-term measure, coat condition, and these individuals appeared to have low HCC. A possible explanation could be that this represents a group that has been exposed to stress for a longer time and exhibits a habituated cortisol response. Another possible explanation is that these long-term characteristics are associated with a poor, nonstimulating environment, which causes hypostimulation (Korte et al., 2007). Longitudinal studies where confounders can be limited or controlled and HCC followed over time are needed to confirm these associations.
The usefulness of cortisol as an indicator of welfare lies in the potential to identify chronically stressed individuals struggling with coping. In this study, indicators of poor welfare were found among variables associated with both high and low hair cortisol levels. The variables associated with the largest negative coefficients in the regularized regression were related to poor health, while the largest positive coefficients appeared to be related to competition and dominance. However, some of the variables with smaller positive coefficients were not related to stress and poor welfare. That locomotor play was related to higher cortisol levels may reflect activity influencing cortisol (for example, human endurance athletes have higher HCC than controls; Skoluda et al., 2012) or may be a result of the correlation between play and young age. The association between high reactivity and low cortisol, however, was not anticipated, as previous studies on cattle have not identified associations between HCC and temperament (Cooke et al., 2017; Lockwood et al., 2017). In many other species, fearfulness and reactivity have been associated with different coping styles and used to separate proactive and reactive individuals (Koolhaas et al., 1999). Proactive coping is characterized by a low HPA axis activity but a high sympathetic reactivity, whereas reactive coping is associated with a higher HPA axis activity and low sympathetic reactivity. The association between high reactivity and low cortisol observed in our study may be attributable to calves having different coping styles. The animals that were fearful and reacted violently to restraining may be examples of proactive individuals (with a low HPA axis activity) exposed to an unfamiliar situation, whereas the reactive individuals (with high HPA axis activity) are generally more prone to handle new situations in a more passive manner. This could explain why high cortisol levels were observed in dynamic calves interacting with other calves and the environment.
While high HCC was associated with indicators of both poor and good welfare, low HCC was mainly associated with indicators related to poor welfare and reduced health. This is in agreement with a study that found lower hair cortisol levels in horses with clearly compromised welfare (Pawluski et al., 2017). It also supports the idea that the most severe forms of chronic stress may be associated with lower levels of cortisol, due to a decreased responsiveness to adrenocorticotropic hormone or habituation (Korte et al., 2007; Mormède et al., 2007). However, the association between low cortisol levels and indicators of poor clinical health (such as coughing and nasal discharge) contradicts the suggestion from previous studies on cows that high cortisol levels can be useful to identify stress associated with clinical disease (Comin et al., 2013; Burnett et al., 2015). This discrepancy may also be a result of confounding due to age or environmental factors, as mentioned above. A restrictive, nonstimulating environment can lead to a lower baseline of blood cortisol in healthy calves (Fisher et al., 1997). Such nonstimulating environments could be connected to environments where calves are more likely to develop coughing and other health-related problems (such as the poor coat condition and BCS mentioned above). If so, low hair cortisol levels could be used as a herd-level indicator of a suboptimal environment and/or poor management of calves, where calves are also more likely to become sick. However, additional confirmation of these associations and increased understanding of how environment, personality, and coping styles influence hair cortisol levels are needed.

CONCLUSION
This study highlights the importance of fecal contamination of hair when analyzing HCC. To avoid contamination, the use of hair from the tail of dairy calves should be avoided. When contamination is present, the results suggest that contamination scores can be a way of addressing the problem in observational studies. The results also emphasize the complexity of the HPA axis in relation to different indicators of welfare and individual differences. While poor welfare can be associated with both increased and decreased HCC, the results suggest that low HCC in calves could be a sign of long-term exposure to a suboptimal environment where calves are more likely to become sick. The results also point to the need to understand how different coping mechanisms affect HCC in dairy calves and suggest new hypotheses that should be explored in future studies.

DATA AVAILABILITY STATEMENT
The datasets presented in this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found below: Mendeley Data, doi: 10.17632/bxzdpbx5rn.1.

ETHICS STATEMENT
The animal study was reviewed and approved by Uppsala Djurförsöksetiska Nämnd, Dnr: C 85/15. Written informed consent was obtained from the owners for the participation of their animals in this study.