Prediction of fresh and ripened cheese yield using detailed milk composition and udder health indicators from individual Brown Swiss cows

The composition of raw milk is of major importance for dairy products, especially fat, protein, and casein (CN) contents, which are used worldwide in breeding programs for dairy species because of their role in human nutrition and in determining cheese yield (%CY). The aim of the study was to develop formulas based on detailed milk composition to disentangle the role of each milk component on %CY traits. To this end, 1,271 individual milk samples (1.5 L/cow) from Brown Swiss cows were processed according to a laboratory model cheese-making procedure. Fresh %CY (%CYCURD), total solids and water retained in the fresh cheese (%CYSOLIDS and %CYWATER), and 60-days ripened cheese (%CYRIPENED) were the reference traits and were used as response variables. Training-testing linear regression modeling was performed: 80% of observations were randomly assigned to the training set, 20% to the validation set, and the procedure was repeated 10 times. Four groups of predictive equations were identified, in which different combinations of predictors were tested separately to predict %CY traits: (i) basic composition, i.e., fat, protein, and CN, tested individually and in combination; (ii) udder health indicators (UHI), i.e., fat + protein or CN + lactose and/or somatic cell score (SCS); (iii) detailed protein profile, i.e., fat + protein fractions [CN fractions, whey proteins, and nonprotein nitrogen (NPN) compounds]; (iv) detailed protein profile + UHI, i.e., fat + protein fractions + NPN compounds and/or UHI. Aside from the positive effect of fat, protein, and total casein on %CY, our results allowed us to disentangle the role of each casein fraction and whey protein, confirming the central role of β-CN and κ-CN, but also showing α-lactalbumin (α-LA) to have a favorable effect, and β-lactoglobulin (β-LG) a negative effect. 
Replacing protein or casein with individual milk protein and NPN fractions in the statistical models appreciably increased the validation accuracy of the equations. The cheese industry would benefit from an improvement, through genetic selection, of traits related to cheese yield and this study offers new insights into the quantification of the influence of milk components in composite selection indices with the aim of directly enhancing cheese production.

Introduction
As cheese consumption per capita continues to rise in Europe, North America, and Oceania (1), the dairy sector is looking for new sustainable ways to improve the cheese-making ability of milk. Cheese yield is usually expressed as a percentage (%CY) representing the amount of cheese produced from 100 kg of milk (2). This trait is fundamental not only to determine the profitability of dairy industries and farmers, but also to monitor the sustainability of the dairy chain. The variability in %CY is affected by many factors relating to milk quality and processing characteristics. The composition of the raw milk is of major importance, especially the content of fat and protein (or caseins) (3), which are used worldwide in breeding programs for dairy species (4) not only because of their nutritional role, but also because of their effect on %CY (5). The %CY is traditionally determined from bulk milk at the dairy industry level (6-8), but for research and genetic purposes, it can also be determined at the individual level through laboratory model cheese-making procedures (9,10). These techniques provide the opportunity to study individual animal variability using small quantities of milk (e.g., from 1.7 to 7,000 mL) (11,12) in procedures consisting of a series of highly controlled, standardized steps (e.g., cutting time, heating temperature). They also offer the possibility to measure the recovery of nutrients (%REC) in the curd from the weight and composition of the processed milk and whey. Few studies have estimated the heritability of measured cheese-making traits (5), including in other dairy species (13,14). This is mainly because of the high costs and the labor intensity of each step in the cheese-making procedure at the individual animal level (10).
To overcome the economic and practical obstacles of individual analysis, Fourier-transform infrared spectroscopy has been used to investigate the suitability of predicted %CY and %REC traits at the population level (15,16). However, unlike the predictions of milk chemical components (i.e., milk fatty acids and protein fractions) (17,18), predictions of the technological features of milk, such as cheese-making traits, are often not sufficiently accurate to be classified as gold standard (19), so their application at the population level is still under review.
These limitations can be overcome using predictive formulas developed on the basis of the contribution of specific milk components to %CY. Since the early 1900s, many predictive %CY formulas have been constructed based on milk fat and protein content. These formulas are useful for (i) predicting %CY of specific cheeses, for example Cheddar (20) and Parmigiano Reggiano (21), and (ii) comparing predicted and measured %CY in order to monitor the efficiency of the cheesemaking process (i.e., precision of weights and measurements) (22). As previously mentioned, milk fat and protein are currently used worldwide in the selection programs of dairy cattle, and in some countries their economic weights are based on their contribution to cheese yield. However, other milk components are also recognized as influencing %CY and %REC traits, such as somatic cell count (SCC) and lactose, which are used as indicators of mastitis (23). Nevertheless, it is well known that the levels of efficiency by which milk nutrients are transferred to the curd and the cheese vary according to the breed of the cow (24), and hence can be only partially explained by differences in coagulation, curd firming and syneresis (25). In fact, the differences can be explained mainly in terms of the different milk protein profiles (26). Brown Swiss milk is usually considered particularly suitable for cheese production due to its fat and protein composition (27).
If each milk protein fraction has different effects on cheesemaking efficiency as previously found by Cipolat-Gotet et al. (28), we speculate that %CY prediction formulas based on protein or total casein (CN) will not be able to completely explain these effects. It is therefore essential to understand the role of the detailed milk composition, including milk protein fractions and udder health indicators, in the retention of milk nutrients and water in the curd and cheese, and to quantify the relative importance of these nutrients. This information could also be used to refine the selection goals for a dairy population, and to establish more precisely the economic weights of milk components in selection indices. For these reasons, the general aim of the present study was to identify and quantify the effects of detailed milk components on %CY. The specific objectives were: (i) to study %CY in terms of fresh cheese (%CY CURD ), milk solids (%CY SOLIDS ) and water (%CY WATER ) retained in the curd, and ripened cheese (%CY RIPENED ); (ii) to quantify the effects on %CY of the milk components mainly retained in cheese (fat, protein and/or CN); (iii) to quantify the effects on %CY of milk traits mainly related to udder health [lactose and

Ethical statement
All the dairy cows involved in this study were reared in commercial private farms and were not subjected to any invasive procedures. Milk samples from dairy cows used for the project were collected by technicians of breeders associations during routine milking within current milk-recording schemes (ICAR, International Committee for Animal Recording) and hence certified by local authorities.

Experimental design, animals and milk sampling
The present study is part of the Cowability-Cowplus projects. The milk from 1,271 Brown Swiss cows was collected once during the evening milking and divided into 3 subsamples per cow. Sampling took place over the course of one year. The sampled cows represented different stages of lactation (25-388 days in milk) and parities (1-5) and were clinically healthy. Cows belonged to 85 herds (15 cows per herd, with a few exceptions) selected from 610 farms located in Trento Province (Italy) and representing different environments and dairy farming systems as described by Berton et al. (29). Briefly, the dairy farming systems were classified into 4 categories, 1 traditional and 3 modern types, which differed in the use and type of total mixed ration.
Among the 3 subsamples collected for each animal, one was analyzed for composition at the Milk Quality Laboratory of the Trento Breeders Association (Trento, Italy), and the others were transported to the Milk Laboratory of DAFNAE (Department of Agronomy, Food, Natural Resources, Animals and Environment) of the University of Padova (Legnaro, Padua, Italy) for cheese-making and quantification of the milk protein fractions.

Milk protein fractions
Individual milk samples (2 aliquots of 1 mL each per cow) were mixed with preservative (bronopol, 2-bromo-2-nitropropane-1,3-diol, 0.6:100 vol/vol) to prevent microbial development, frozen at −20 °C in portable chilling devices immediately after collection, then stored at −80 °C until analysis. Frozen individual milk aliquots were prepared following the method proposed by Bobe et al. (31). The contents of the CN fractions (α S1 -, α S2 -, β-, and κ-CN) and whey proteins (β-LG and α-LA) were assessed by the validated reversed-phase HPLC method (32). The remaining NPN content was estimated as the difference from the total milk nitrogen content.

Model cheese-making and cheese yield traits
Individual milk samples were processed within 20 h of milking according to the model cheese-making method described in detail by Cipolat-Gotet et al. (33). Briefly, 1,500 mL of milk from each cow was heated to 35 °C in a stainless-steel micro-vat, a thermophilic starter culture was added, and the milk was mixed with rennet and monitored for gelation time. The starter was an industrial freeze-dried formulation of thermophilic lactic bacteria (Delvo-Tec TS-10A DSL; DSM Food Specialties, Delft, The Netherlands). At a fixed time after gelation (10 min) each curd was cut with a vertical crosscut centered on the vertical axis of the vat. Five min after the first cut, the curd was reduced to cubes of about 1 cm³. After a further 5 min, the curd was separated from the whey and suspended on a cheese mold for 30 min over the whey-containing vat and turned every 2 min to facilitate draining. The curd was then pressed for 60 min at 250 kPa, turning every 20 min, and salted for 60 min in liquid brine at a saturation of 20% NaCl. The whey collected from each vat was also weighed, sampled, and analyzed for fat, protein, lactose, and total solids content with a MilkoScan FT2 (Foss, Hillerød, Denmark).
At the end of the cheese-making process and after brining, each cheese wheel was weighed. Curd components (fat, protein, and total solids) were measured as the difference in composition between the milk before processing and the whey. All the cheeses were then ripened at 15 °C and 85% relative humidity for the first month, then at 12 °C and the same relative humidity for the second month (a total of 60 days).
With the aforementioned measurements we were able to obtain four %CY traits. The classical formulas for %CY at 0 d (fresh curd) and at 60 d (after ripening) were as follows:

$$\%CY_{CURD} = \frac{\text{weight of wheel at 0 d (g)}}{\text{weight of milk (g)}} \times 100$$

$$\%CY_{RIPENED} = \frac{\text{weight of wheel at 60 d (g)}}{\text{weight of milk (g)}} \times 100$$

Frontiers in Veterinary Science frontiersin.org

Cheese yield was also calculated in terms of total solids (TS) and water retained in the fresh curd, as follows:

$$\%CY_{SOLIDS} = \frac{\text{milk TS (g)} - \text{whey TS (g)}}{\text{weight of milk (g)}} \times 100$$

$$\%CY_{WATER} = \frac{\text{milk water (g)} - \text{whey water (g)}}{\text{weight of milk (g)}} \times 100$$

where milk and whey water were obtained as the difference between the respective weight and TS.
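As an illustration of the yield definitions above, the four traits can be computed from the weights recorded for one vat. This is only a sketch: the function name, argument names and the example weights are hypothetical, and the %CY SOLIDS expression is assumed by analogy with the one given for %CY WATER (solids retained = milk TS minus whey TS).

```python
def cheese_yield_traits(milk_g, milk_ts_g, whey_g, whey_ts_g,
                        wheel_0d_g, wheel_60d_g):
    """Return the four %CY traits, each as a percentage of milk weight."""
    # Water is obtained by difference from total solids (TS), as in the text.
    milk_water_g = milk_g - milk_ts_g
    whey_water_g = whey_g - whey_ts_g
    return {
        "CY_CURD": wheel_0d_g / milk_g * 100,
        "CY_RIPENED": wheel_60d_g / milk_g * 100,
        # Assumption: solids retained in the curd = milk TS - whey TS.
        "CY_SOLIDS": (milk_ts_g - whey_ts_g) / milk_g * 100,
        "CY_WATER": (milk_water_g - whey_water_g) / milk_g * 100,
    }


# Hypothetical vat: 1,500 g of milk with 195 g TS, 1,275 g of whey with 86.7 g TS,
# a 225 g wheel after brining and a 131 g wheel after 60 d of ripening.
traits = cheese_yield_traits(1500.0, 195.0, 1275.0, 86.7, 225.0, 131.0)
for name, value in traits.items():
    print(f"{name}: {value:.2f}%")
```

The example weights were chosen only so that the output lands near the means reported for this data set; they are not measurements from the study.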

Statistical analysis
Editing
Before the statistical analysis, all trait values (milk composition, protein fractions, and %CY traits) falling outside 3 standard deviations (SD) of the mean were removed to exclude outliers, so that the results shown in Table 1 are already presented without outliers.
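The editing rule can be sketched as follows. The function name is hypothetical, and two details are assumptions not stated in the text: the sample SD is used, and the filter is applied in a single pass rather than iteratively.

```python
from statistics import mean, stdev


def remove_outliers(values, n_sd=3.0):
    """Keep only the values lying within n_sd sample SDs of the mean."""
    m = mean(values)
    s = stdev(values)  # assumption: sample SD, computed once on all values
    lower, upper = m - n_sd * s, m + n_sd * s
    return [v for v in values if lower <= v <= upper]


# Hypothetical fat percentages: 20 plausible records plus one aberrant value.
fat = [4.0, 4.1, 4.2, 4.3, 4.4] * 4 + [12.0]
cleaned = remove_outliers(fat)
print(len(fat), "->", len(cleaned))
```

In practice the same filter would be applied trait by trait, so a cow could lose one trait value while keeping the others.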

Regression models
Linear regression models were tested separately for predicting %CY traits (%CY CURD , %CY SOLIDS , %CY WATER and %CY RIPENED ) using different combinations of milk components as predictors selected on the basis of their correlations, technological roles and effects on cheese production (18,34,35). To quantify the weight of each nutrient on %CY the regressions tested included the major milk nutrients transferred to cheese. Four groups of predictive equations were identified, in which different combinations of predictors were tested separately to predict %CY traits:
(i) basic composition, i.e., fat, protein and casein, tested individually and combined;
(ii) udder health indicators (UHI), i.e., fat + protein or casein + lactose and/or SCS;
(iii) detailed protein profile, i.e., fat + protein fractions, specifically:
a. casein fractions
b. casein fractions + whey proteins
c. casein fractions + whey proteins + NPN compounds
(iv) detailed protein profile + UHI, i.e., fat + protein fractions + NPN compounds and/or lactose and/or SCS.
For all the %CY traits, we tested regression models both with and without intercept, although the results from the models with intercept are not reported as our main goal was to quantify the real contribution of each of the predictors to %CY. However, the fitting statistics of the models with and without intercept were comparable (data not shown). Values of the adjusted coefficients of determination of calibration (adjR 2 CAL ) were calculated using the following formula:

$$\text{adj}R^2_{CAL} = 1 - \frac{(1 - R^2)(N - 1)}{N - p - 1}$$

where R 2 is the sample R-squared, N is the total sample size and p is the number of independent variables.
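A minimal sketch of the adjusted R-squared computation, assuming the conventional definition in terms of R 2, N and p described above. The function name and the example inputs are hypothetical.

```python
def adjusted_r2(r2, n, p):
    """Conventional adjusted R-squared for n observations and p predictors."""
    if n - p - 1 <= 0:
        raise ValueError("need more observations than predictors + 1")
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)


# Hypothetical calibration: R2 = 0.80 on 1,017 training records with 2 predictors.
adj = adjusted_r2(0.80, 1017, 2)
print(f"adjusted R2 = {adj:.4f}")
```

With about a thousand records and only a handful of predictors the penalty is tiny, which is consistent with calibration and validation statistics being close to each other.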
For all the predictors, P-values are not reported since they were always lower than 0.001. Multicollinearity for two groups of variables was checked by evaluating the tolerance, variance inflation factor, eigenvalues and condition index (Supplementary Table 1). The two groups included the following predictor variables: group (1) fat, protein, lactose and SCS; group (2) fat, protein fractions, NPN compounds, lactose and SCS. The results of those tests indicated the absence of multicollinearity among the predictors of each group (Supplementary Table 1).

Validation
The accuracies of the %CY predictive formulas were assessed through a training-testing procedure. A training data set (80% of the total observations) was used to build the predictive equations, and a testing data set (20% of the total) was used as validation. Observations were randomly assigned to the training and testing sets, and the training-testing procedure was repeated 10 times for each of the %CY traits, changing the training and testing set samples each time. For each of the 10 training-testing tests of the prediction procedure of a given trait, the observed and the predicted values of the testing data set were used to calculate the coefficient of determination of validation (R 2 VAL ) and the root mean square error of validation (RMSE VAL ). The beta coefficient of each predictor, R 2 VAL and RMSE VAL for each trait are presented as the average of the 10 training-testing replicates carried out.
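The training-testing procedure can be sketched in plain Python as below. Everything here is illustrative: the data are synthetic, the function names are hypothetical, and R 2 VAL is taken as the squared correlation between observed and predicted values, which is an assumption because the text does not state which definition was used.

```python
import math
import random


def fit_no_intercept(X, y):
    """Least-squares coefficients of y = X b (no intercept) via normal equations."""
    p = len(X[0])
    # Augmented matrix [X'X | X'y]
    A = [[sum(r[i] * r[j] for r in X) for j in range(p)]
         + [sum(r[i] * yi for r, yi in zip(X, y))] for i in range(p)]
    for c in range(p):  # Gauss-Jordan elimination with partial pivoting
        piv = max(range(c, p), key=lambda k: abs(A[k][c]))
        A[c], A[piv] = A[piv], A[c]
        for k in range(p):
            if k != c:
                f = A[k][c] / A[c][c]
                A[k] = [a - f * b for a, b in zip(A[k], A[c])]
    return [A[i][p] / A[i][i] for i in range(p)]


def r2_and_rmse(obs, pred):
    """Squared Pearson correlation (assumed R2_VAL) and RMSE of validation."""
    n = len(obs)
    mo, mp = sum(obs) / n, sum(pred) / n
    sxy = sum((o - mo) * (q - mp) for o, q in zip(obs, pred))
    sxx = sum((o - mo) ** 2 for o in obs)
    syy = sum((q - mp) ** 2 for q in pred)
    rmse = math.sqrt(sum((o - q) ** 2 for o, q in zip(obs, pred)) / n)
    return sxy ** 2 / (sxx * syy), rmse


def repeated_validation(X, y, n_rep=10, train_frac=0.8, seed=1):
    """Average coefficients, R2_VAL and RMSE_VAL over random 80/20 splits."""
    rng = random.Random(seed)
    idx = list(range(len(y)))
    coefs, r2s, rmses = [], [], []
    for _ in range(n_rep):
        rng.shuffle(idx)  # new random assignment at each replicate
        cut = round(train_frac * len(idx))
        train, test = idx[:cut], idx[cut:]
        b = fit_no_intercept([X[i] for i in train], [y[i] for i in train])
        pred = [sum(x * bj for x, bj in zip(X[i], b)) for i in test]
        r2, rmse = r2_and_rmse([y[i] for i in test], pred)
        coefs.append(b)
        r2s.append(r2)
        rmses.append(rmse)
    mean_b = [sum(c[j] for c in coefs) / n_rep for j in range(len(X[0]))]
    return mean_b, sum(r2s) / n_rep, sum(rmses) / n_rep


# Synthetic records only: fat and protein (%) and a simulated fresh yield.
gen = random.Random(42)
X = [[gen.gauss(4.22, 0.89), gen.gauss(3.71, 0.40)] for _ in range(500)]
y = [0.97 * f + 2.95 * pr + gen.gauss(0.0, 1.0) for f, pr in X]
mean_b, mean_r2, mean_rmse = repeated_validation(X, y)
print(mean_b, mean_r2, mean_rmse)
```

Averaging over the 10 replicates, as described above, reduces the dependence of the reported coefficients and accuracies on any single random split.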
The adjR 2 CAL values were similar to the R 2 VAL values, highlighting the absence of over-fitting and multicollinearity in the proposed regression models. The adjR 2 CAL values are shown in Tables 2-4. Attention was given to R 2 VAL as it provides information on the effectiveness of prediction when applied externally (i.e., at the population level) and was therefore considered more important for the purpose of this study.

Descriptive statistics
Variability in milk composition is well known to be a major factor in determining the efficiency of the cheese-making process. Table 1 summarizes the descriptive statistics for single test-day milk yield (dMY), milk components, and %CY traits. The average fat, protein, and lactose contents were 4.22, 3.71, and 4.85%, respectively, with fat having the highest coefficient of variation (CV, 21%). Regarding protein fractions, the casein index, defined as the percentage of casein on total protein, was 77.9%. As expected, β-CN was the predominant casein fraction (41.2%), followed by α S1 -CN (32.9%), κ-CN (12.1%), and α S2 -CN (11.8%). Milk NPN was in a ratio of ∼11.0% with protein. This group consists mainly of milk urea together with free amino acids and peptides (36). Our results show that the average %CY CURD was 15.0%, %CY SOLIDS was 7.22%, and %CY WATER 7.80%. The contribution of water to the total fresh %CY was therefore around 52.0%, and solids 48.0%. After ripening, the %CY decreased to 8.73% (%CY RIPENED ).
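The shares quoted above follow directly from the reported means, as this short check shows. The variable names are hypothetical, and the relative weight loss during ripening is derived here from the two yield means rather than taken from a reported figure.

```python
# Mean yields reported in the descriptive statistics (% of milk weight).
cy_curd, cy_solids, cy_water, cy_ripened = 15.0, 7.22, 7.80, 8.73

# Contribution of water and solids to the fresh cheese yield.
water_share = cy_water / cy_curd * 100
solids_share = cy_solids / cy_curd * 100

# Relative weight lost between fresh curd (0 d) and ripened cheese (60 d).
ripening_loss = (cy_curd - cy_ripened) / cy_curd * 100

print(f"water: {water_share:.1f}%, solids: {solids_share:.1f}%")
print(f"weight loss during ripening: {ripening_loss:.1f}%")
```

The two shares do not sum to exactly 100% because solids and water were obtained by difference between milk and whey, whereas the fresh yield was obtained by weighing the wheel.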

Prediction of cheese yield based on fat, protein or casein alone
As is well known, the addition of rennet triggers the coagulation process and causes the casein micelles to aggregate and form a network which traps the majority of the fat globules. Most of the milk water and soluble compounds (lactose, whey proteins, many minerals, vitamins, etc.) are then expelled during syneresis and constitute the whey. The most important determinants of %CY, therefore, are the casein micelles and fat globules, although factors affecting whey expulsion are also important.
Even though milk fat and protein contents are correlated (in the present study r = 32.0%, Supplementary Table 2), if they are not standardized in the milk before cheese-making, their ratio is far from constant, especially in milk from different farming systems, parity, or lactation stages. It would therefore be expected that trying to predict %CY traits using only one milk component would result in merely moderate accuracies.
Using fat as the only predictor of %CY traits, we were able to predict %CY CURD with a regression coefficient of 3.50 (Table 2). This value is expected because the intercept of the prediction equation was fixed at 0.00 and the ratio between %CY CURD (15.00%) and milk fat content (4.22%) was 3.55. Nevertheless, the validation accuracy of this prediction equation was very modest (R 2 VAL = 0.29) and the corresponding RMSE VAL was high (2.21%). As the ratio between moisture and total solids in this type of fresh model cheese (Table 1) is slightly in favor of the former, the higher regression coefficient of fat (Table 2) for predicting %CY WATER (1.81) than for predicting %CY SOLIDS (1.70), and their sum almost coinciding with the regression coefficient for %CY CURD (3.50) were also expected. Given that lipids, quantitatively, account for a major part of cheese solids in full-fat cheeses, it is not surprising that the validation accuracy of the fat-based equation predicting %CY SOLIDS (R 2 VAL = 0.57) was about twice the R 2 VAL previously seen for %CY CURD (R 2 VAL = 0.29), whereas for %CY WATER it was almost negligible (R 2 VAL = 0.06).
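The link between a no-intercept regression coefficient and the ratio of the two means can be illustrated with a toy example. The five records below are hypothetical and were chosen only so that their means equal the 15.00% and 4.22% quoted above; the function names are also hypothetical.

```python
def slope_through_origin(x, y):
    """Least-squares slope of y = b * x with the intercept fixed at zero."""
    return sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)


def ratio_of_means(x, y):
    """Mean of y divided by mean of x."""
    return (sum(y) / len(y)) / (sum(x) / len(x))


# Hypothetical records: milk fat (%) and fresh cheese yield (%).
fat = [3.2, 3.8, 4.2, 4.6, 5.3]
cy_curd = [13.1, 14.2, 15.0, 15.6, 17.1]

slope = slope_through_origin(fat, cy_curd)
ratio = ratio_of_means(fat, cy_curd)
print(f"slope through origin: {slope:.2f}, ratio of means: {ratio:.2f}")
```

The two quantities coincide only when yield is exactly proportional to the predictor. When the underlying relationship has a positive intercept, as in these toy data, the slope forced through the origin falls slightly below the ratio of the means.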
Similarly, in the case of the %CY CURD predictive equations based on milk protein or on milk casein, the regression coefficients were also equal to the ratio between the average of %CY CURD and the average of the predicting nutrient (Tables 1 and 2). Moreover, the regression coefficients of %CY WATER were slightly higher than those of %CY SOLIDS (Table 2). The determination coefficient of the protein-based %CY CURD equation (R 2 VAL = 0.49) was, instead, much larger than that of the fat-based equation (R 2 VAL = 0.29), and even larger in the case of milk casein content as the predictor (R 2 VAL = 0.53). The slightly lower content of protein in cheese compared with fat explains the lower determination coefficients of the %CY SOLIDS equations based on protein (0.42) and on casein (0.43), compared to when only fat was included as the predictor (0.57). In contrast, the hydrophilic properties of most proteins explain their higher accuracies compared with fat in predicting %CY WATER (0.31 for protein, 0.33 for casein and 0.06 for fat; Table 2).
After ripening, the ratio between moisture and total solids was less variable than in fresh cheese, and more related to the chemical composition of cheese, as long as the cheese-making procedure and ripening conditions were constant. This explains why predicting %CY RIPENED always has a greater validation accuracy than predicting %CY CURD (Table 2). The cross-validation approach used in this study showed that, as expected, the prediction equations of the combined-nutrients models that always included milk fat content and protein or CN or protein fractions were on average more accurate than the single-nutrient (fat, protein or CN) models. The box-plots of all the regression models together (Figure 1) clearly show that, on average, predicting %CY SOLIDS was more accurate than predicting %CY RIPENED and %CY CURD , and much more accurate than predicting %CY WATER . But it is worth noting that, for each %CY trait, the worst validation accuracies (circles = outlier values) were those of the single-nutrient equations, i.e., when fat, protein and casein were tested individually in the formula.

Prediction of cheese yield based on fat and protein, or casein
Protein and fat are widely used in dairy cattle selection programs and in the milk quality payment system because of their nutritional value and their acknowledged contribution to %CY and the production of other dairy products (i.e., yogurt, butter).
The R 2 VAL for the %CY CURD/SOLIDS/RIPENED traits increased and the RMSE VAL decreased when fat was associated with protein or CN (Table 2). This outcome was partly expected, as CN (and indirectly protein) is the main actor in the coagulation process and whey expulsion.
Although fat exerts a large effect on %CY and %REC traits in the processing of milk from many dairy species, such as cattle (37), sheep (38) and goats (39), the regression coefficients of %CY CURD and %CY SOLIDS prediction reflect its own recovery. Indeed, the regression coefficients of fat when combined with protein or CN for predicting %CY SOLIDS were 0.82 and 0.80, respectively (Table 2). These values are slightly lower than the average recovery of milk fat (REC FAT = 89.79%) in the individual model cheese-making procedures carried out in this project (5). As explained in the previous study, the difference could be due to the fact that, as it was not possible to sample the wheels of curd at the beginning of ripening, REC FAT was obtained by difference [(fat in milk − fat in whey)/fat in milk], so the nutrients retained could be slightly overestimated due to small losses in the whey during the procedure and particularly during pressing.
The regression coefficients of fat on the %CY CURD equations were slightly higher (0.97 with protein, 0.88 with CN), consistent with the small regression coefficients obtained for %CY WATER (0.19 and 0.13). It is acknowledged that fat globules contribute to %CY by retaining moisture and acting as a physical obstacle to water (40), but not by exerting any active role in the coagulation process. This means that a small proportion of curd moisture is related to fat content, probably due to the amphiphilic properties of phospholipids and saponified lipids (41).
The regression coefficient of protein for predicting %CY SOLIDS (1.02) was much greater than the recovery of protein during model cheese-making (average REC PROTEIN 78.08%) (5). This is due to the fact that the majority of other solids retained in the curd, especially hydrophilic solids (lactose, minerals, etc.), are proportional to the quantity of whey retained, which in turn is much more proportional to protein (i.e., whey proteins) than to fat (6). For the same reasons, the regression coefficient of casein (1.33) was much larger than the theoretical recovery of casein alone (1.00).
The regression coefficients of protein (2.95) and casein (3.91) when combined with fat for predicting %CY CURD (Table 2) were much higher than unity because of the effect of proteins on the retention of whey in the curd. In fact, the regression coefficients of protein and casein for predicting %CY WATER were 1.88 and 2.51, respectively. In the case of both %CY CURD and %CY WATER , the regression coefficient of protein was about 75% that of casein, reflecting their ratio (casein number) in the milk (Table 1).
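To show how such no-intercept equations are applied, the sketch below evaluates the fat + protein equations at the mean milk composition. The coefficients and means are the ones quoted in the text; the dictionary layout and function name are hypothetical, and the result is illustrative rather than a reproduction of the published tables.

```python
# Regression coefficients quoted in the text for the fat + protein models.
EQUATIONS = {
    "CY_CURD": {"fat": 0.97, "protein": 2.95},
    "CY_SOLIDS": {"fat": 0.82, "protein": 1.02},
    "CY_WATER": {"fat": 0.19, "protein": 1.88},
}


def predict_cy(coefs, composition):
    """No-intercept prediction: sum of coefficient x component content (%)."""
    return sum(b * composition[name] for name, b in coefs.items())


# Mean fat and protein contents (%) from the descriptive statistics.
mean_milk = {"fat": 4.22, "protein": 3.71}
predicted = {trait: predict_cy(c, mean_milk) for trait, c in EQUATIONS.items()}
for trait, value in predicted.items():
    print(f"{trait}: {value:.2f}%")

# Ratio of the protein and casein coefficients for %CY_CURD (about 75%).
protein_to_casein = 2.95 / 3.91
```

At the mean composition the three predictions come out close to the observed means of 15.0, 7.22 and 7.80%, which is what one would expect from equations fitted without an intercept on the same population.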
As previously mentioned, protein or casein alone were better than fat in the prediction of %CY WATER , as these components remain bound to water, so that the addition of fat in the prediction equations of %CY WATER did not modify the validation accuracy (Table 2).
Similarly to %CY SOLIDS , all the coefficients for %CY RIPENED were lower than those for %CY CURD . This can be explained by (i) the reduction in %CY RIPENED (∼40%) due to water loss by evaporation from the crust and migration from the inner part of the wheel toward the surface (42); (ii) the microbiological, physical and biochemical reactions occurring during ripening (42,43), which may act as background noise in the prediction of %CY RIPENED . However, the lower regression coefficient of protein and casein was not as pronounced as for %CY SOLIDS due to the water retained in the cheese wheels at the end of the ripening period.

Contribution of udder health indicators to cheese yield
The inclusion in the statistical model of traits associated with udder health (SCS and lactose) only slightly increased the validation accuracy of the %CY prediction equations (Table 3), although the regression coefficients obtained are useful for increasing our knowledge of the relationships between these traits and the efficiency of the cheese-making process. Lactose percentage and SCC are associated with the udder health status of dairy cows (44-46). During mastitis, the mammary gland tissues are damaged, secretory cell activity is reduced causing a reduction in the synthesis of lactose (47), and in addition the permeability of the membrane increases causing leakage of lactose into the blood stream (23). Milk SCC encompasses a mixture of epithelial cells and leukocytes and has been widely used as an indicator of intramammary infections. SCC is of further importance as a widely accepted parameter for establishing the hygienic quality of raw milk and is currently used in the milk payment system (48,49). UHI could also have consequences for milk technological properties. High SCC is correlated with reduced fat and lactose contents, but also with an increased level of whey proteins and lower concentrations of caseins (50). Moreover, a variation in the lactose percentage affects the pH of milk and is associated with lower clotting ability (51). All these modifications to the milk composition could cause a reduction in %CY (52), with a consequent decrease in the efficiency and sustainability of the whole process.
Most of the lactose in milk is lost in the whey in the cheesemaking process, and fresh curd usually contains only ∼1% of lactose (53), which is bound to the water retained in the curd. Our results reflect this recovery, and show that lactose in combination with fat, protein/casein and SCS mainly affected %CY CURD and %CY WATER , whereas the effect on %CY SOLIDS and %CY RIPENED was very small with average regression coefficients of 0.04 and 0.12, respectively (Table 3). Adding lactose to the protein and fat predictors in the %CY CURD model reduced the protein coefficient by about 19%. However, when the overall protein content was replaced in the model by its fractions (caseins and whey proteins) the contribution of lactose to the %CY CURD decreased to 0.09 (Table 4) because the total solids of the whey retained in the curd were also associated with the whey proteins.
Moving on to the contribution of milk SCS to %CY traits, high levels of SCC reduced the production efficiency of specific cheeses, such as Parmigiano Reggiano (52) and Cheddar (54) but also in model cheeses produced at individual cow level (55). However, according to Bobbo et al. (23), the effect of SCS on milk production, quality, coagulation, and cheese-making traits is nonlinear because a very low SCS has a slight influence on the cow's performance as the animal could be affected by undetectable intramammary infections. Indeed, Wall et al. (56) suggested that immunocompromised cows might have a very low SCC. Previous studies conducted on commercial productions or where a small number of observations were collected failed to demonstrate this trend.
For this reason, we tested both the linear and quadratic (data not shown) regressions for the effect of SCS on %CY traits, but we observed no differences in the fitting statistics between the two models.
Overall, our results confirm those previously reported by Bobbo et al. (23): a slight negative effect of SCS on %CY CURD,WATER,RIPENED, but no effect on %CY SOLIDS in all the models in which SCS was included (Tables 3, 4). Indeed, SCS mainly affects the water retained in the curd and has no effect on %CY SOLIDS . In contrast to our observations on the other predictors tested, the SCS coefficient was not lower for %CY RIPENED than for %CY CURD, meaning that SCS affects not only fresh cheese yield and recoveries, but also weight loss during ripening.

Prediction of cheese yield based on fat and the protein fractions
Replacing protein or total casein with the individual milk protein and NPN fractions in the statistical models increased the validation accuracy of the equations obtained (Table 4). The inclusion of protein fractions provided important information on the relative values of each fraction for cheese production, and insights into the mechanisms of the cheese-making process. Protein fractions have been widely studied for their role in the cheese-making process (28,57), yet to the best of our knowledge this is the first study to quantify the effect of single protein fractions on %CY traits based on a large number of individual model cheese-making procedures.
In comparing the contributions to %CY traits of the four major casein fractions included in the predictive formulas together with fat, we should consider the regression coefficient for total casein (Table 2) as the reference value (3.91 for %CY CURD, about a third of which is represented by solids, the remaining two thirds by retained moisture). The regression coefficients of α S1 -CN and α S2 -CN were lower than expected (2.49 and 2.39, respectively; Table 4), although the coefficient of α S2 -CN for %CY SOLIDS was similar to that of total casein (1.38), whereas the coefficient of α S1 -CN was lower (0.89). Both α S1 -CN and α S2 -CN were characterized by %CY WATER regression coefficients that were much lower than that of total casein (1.14 and 0.77, respectively), a not unexpected result as these caseins have moderate to high hydrophobicity (58).
Conversely, the other casein fractions had much larger regression coefficients for %CY CURD than did total casein (β-CN = 5.26, κ-CN = 4.79; Table 4). This is due to them having a much larger effect on moisture retention, especially β-CN compared with κ-CN (%CY WATER: 3.94 and 3.08, respectively). Given that κ-CN causes loss of stability of casein micelles after rennet-induced proteolysis into para-κ-CN, its high relative weight was expected. The effect on %CY SOLIDS is similar to that of total casein for β-CN (1.37), and much larger (2.37) for κ-CN. This is probably because β-CN increases casein and water retention in the curd (28,59), whereas κ-CN enhances the recovery of other solids (i.e., fat globules). As further confirmation of its importance in retaining other non-casein solids, the κ-CN coefficient on %CY SOLIDS showed very low variability among the predictive formulas, and a value higher than its own weight, on average 2.48.
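To make the scale of these coefficients concrete, the following minimal sketch converts each %CY CURD coefficient quoted in the text into the expected change in fresh cheese yield for a given change in the milk component, with all other predictors in the same equation held constant. The size of the increment (0.1 percentage points) and the function and variable names are assumptions made for the example. Total casein comes from a different equation (Table 2) than the individual fractions (Table 4), so the comparison is indicative only.

```python
# Regression coefficients for %CY CURD as quoted in the text
# (percentage points of yield per percentage point of component in milk).
CURD_COEFFICIENTS = {
    "total casein": 3.91,  # reference value, Table 2
    "alpha_s1-CN": 2.49,   # Table 4
    "alpha_s2-CN": 2.39,   # Table 4
    "beta-CN": 5.26,       # Table 4
    "kappa-CN": 4.79,      # Table 4
}


def yield_change(coefficient, delta_component):
    """Expected change in %CY for a change in one milk component.

    In a linear model the partial effect is simply coefficient * delta,
    with every other predictor in the same equation held constant.
    """
    return coefficient * delta_component


DELTA = 0.1  # hypothetical increment, in percentage points of milk

for name, coefficient in CURD_COEFFICIENTS.items():
    change = yield_change(coefficient, DELTA)
    print(f"{name:>12}: {change:+.3f} percentage points of %CY CURD")
```

For instance, with the reference coefficient of 3.91, an increase of 0.1 percentage points in total casein corresponds to an expected increase of about 0.39 percentage points in %CY CURD.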
Each casein fraction affected moisture retention in fresh cheese yield differently, although these differences tended to decrease during ripening. Once water had evaporated, the regression coefficients of the major caseins were very similar (2.26-2.28), with the exception of α S2 -CN, which dropped to well below unity (0.81). Similarly, the regression coefficient of fat (on average %CY RIPENED = 0.64) even dropped to below the average recovery ratio of milk fat in the curd. This shows that not only does moisture decrease greatly during ripening, but dry mass also decreases as a result of the complex biochemical and microbiological processes that characterize cheese ripening (42,60).
However, when whey proteins were added to the model, the picture changed completely. Although not directly involved in the renneting process, whey proteins contribute to the recovery of nutrients in the curd. Our results agree with those of Bonfatti et al. (18), whose study reported that variation in protein composition affects the cheese-making ability of milk. β-LG had a large, unfavorable effect, and a strong negative coefficient for all the %CY traits, especially %CY CURD and %CY WATER, on average -7.33 and -5.65, respectively. It is worth noting, however, that this negative effect concerned not only moisture retention, but also the recovery efficiency of milk solids fractions (%CY SOLIDS: -1.70). Previous studies have reported the adverse influence of β-LG on various traits describing the cheese-making process: Cipolat-Gotet et al. found a strong inverse relationship between β-LG and fat and protein recoveries (28), whereas Ketto et al. and Amalfitano et al. found β-LG to be associated with poor coagulation properties (26, 61).
When the overall protein content was replaced by its fractions (caseins and whey proteins) in the model, the contribution of lactose to %CY CURD decreased to 0.09 (Table 4) because the total solids of the whey retained in the curd were also associated with the whey proteins. This may also be related to the positive correlation between lactose and α-LA, the latter being directly involved in lactose synthesis (62).
In contrast, α-LA had a very strong favorable effect on all %CY traits. Bearing in mind that it has a very low concentration and variability in milk (0.09 ± 0.02%), that it is not retained in the curd, and that it is not known to have any direct effect on the cheese-making process, the actual contribution of α-LA to cheese yield is quite small, but it could be considered a marker of favorable/adverse conditions that need to be further investigated. It is worth pointing out that the negative regression coefficients of β-LG and the positive regression coefficients of α-LA remain after the cheese has ripened (%CY RIPENED ), confirming that the effect of whey proteins is not limited to moisture retention in the fresh curd (Table 4).
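The reasoning that a large coefficient can still imply a small actual contribution may be illustrated by scaling the coefficient by the variability of the component. The sketch below uses the standard deviation of α-LA reported in the text (0.02%). The regression coefficient is a purely hypothetical placeholder, since the α-LA coefficient is not quoted here and should be read from Table 4. The function name is also an assumption.

```python
def effect_per_sd(coefficient, sd):
    """Change in %CY associated with a one-standard-deviation change
    in a milk component, given its regression coefficient."""
    return coefficient * sd


ALPHA_LA_SD = 0.02  # standard deviation of alpha-LA in milk (%), from the text

# HYPOTHETICAL placeholder, NOT a result of the study: replace with the
# alpha-LA coefficient for the trait of interest from Table 4.
PLACEHOLDER_COEFFICIENT = 10.0

effect = effect_per_sd(PLACEHOLDER_COEFFICIENT, ALPHA_LA_SD)
print(f"Effect of +1 SD of alpha-LA: {effect:+.2f} percentage points of %CY")
```

Even with a coefficient as large as the placeholder, a one-standard-deviation change in α-LA would shift cheese yield by only a fraction of a percentage point, which is why the component is better interpreted as a marker than as a direct contributor.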
The inclusion of whey proteins in the model also caused evident modifications to the regression coefficients of caseins. The only casein fraction remaining unchanged after the inclusion of whey proteins was β-CN. In contrast, the value of the α S2 -CN coefficient decreased by an average of more than 80% when whey proteins were added as predictors and became negative for %CY WATER.
The inclusion of milk NPN compounds in the regression models is also worth some consideration. Although the direct contribution of these substances to curd formation is negligible, adding NPN together with fat and all the protein fractions made a positive contribution to %CY SOLIDS (0.78, and 0.90 after inclusion of UHI; Table 4) and increased the negative effect on %CY WATER (-1.61 and -0.73, respectively), so that the resulting effect on %CY CURD is negative (-0.73) in the first equation, and slightly positive (0.19) after adding UHI. Inclusion of the NPN fraction in the model also affected some of the other regression coefficients, particularly those of α S2 -CN, which decreased by about 40%.

Possible practical applications and the need for further research
The regression coefficients obtained with the different models confirm some previous findings, but also contribute new knowledge and shed fresh light on the relationships between milk composition and the mechanisms and efficiency of the cheese-making process. These coefficients may be used for estimating the relative importance of different milk components for the dairy industry.
The ratio of about 3:1 between the regression coefficients of protein and fat (Table 2) represents their relative values in terms of the gross revenue (not profit) of milk destined for cheese-making. In addition, the fat and protein (or casein) coefficients have been widely used in the milk payment system, especially in countries where milk production is mainly destined for cheese manufacture. The destination of milk is also the basis for breeding and selection choices. It is worth noting that ratios equal to or greater than 3:1 are used in the selection indices for most dairy populations, particularly in European countries, but not in the USA (4), where milk is used mainly for direct human consumption.
SCS is the most used udder health indicator worldwide, with very few exceptions, but its weight is highly variable in both payment schemes and selection indices, which reflects different production environments, levels of knowledge, and objectives. SCS is included in selection indices mainly as an indicator of health costs and the durability of the cows (63), whereas in payment schemes it is included as an indicator of low hygiene standards and because it is associated with the efficiency of the cheese-making process. Knowledge and quantification of the latter are still fragmentary and need further research. This information also needs to be incorporated into the selection indices for dairy populations. New insights are now obtainable with the use of differential cell count as an indicator of mastitis (64), but not yet as an indicator of effects on cheese-making.
This study shows that lactose is probably more effective than SCS as an indicator of the relationships between udder health and cheese-making efficiency. However, it is also evident that the complexity of these relationships cannot be captured from the physiological, metabolic, technological, and economic points of view by single indicators, but that both SCS and lactose should be considered together with at least β-LG and NPN, and perhaps also α S2 -CN, in milk. New UHI should be defined for improving the efficiency of selection to enhance the cow's health and durability, but also the efficiency of the dairy industry, which is the starting point for a more sustainable chain. This last objective could be given a further boost by knowledge of the different impacts and relative importance of protein fractions on cheese-making efficiency. New knowledge in this area is of critical interest to the dairy chain sector.

Conclusions
In this study, we directly evaluated detailed milk components in relation to their contribution, individually and in combination, to different %CY traits. The large number and variability of individual samples, and the direct measurement of %CY traits, allowed us to gather information on the effectiveness of predictions for application at the dairy cattle population level. Knowledge concerning the relationships between UHI and efficiency of the cheese-making process needs to be integrated with new information on β-LG and milk NPN, and perhaps also α S2 -CN. The results for the protein fractions provided a much more detailed understanding of the mechanisms that determine cheese yield. Despite the economic importance of the information contained in the detailed protein profile, it is not yet routinely used in milk payment schemes and genetic selection indices as it can only be obtained with expensive, time-consuming methods. This study offers new insights into the quantification of the influence of milk components in composite selection indices with the aim of directly enhancing cheese production.

Data availability statement
The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.