Evaluation of physical and mental health conditions related to employees’ absenteeism

Background: Employees’ health conditions matter not only to employees themselves but also to companies and society, which seek to keep medical costs low and productivity high.

Data and methods: In this analysis, 15,574 observations from 2,319 employees at four operational sites of a large corporation were used. The dataset contained physical and mental health conditions obtained from annual mandatory medical checkups, the Brief Job Stress Questionnaire (BJSQ), and work record information. Health and other factors related to long-term absenteeism (over three days in a quarter) were analyzed. Data were collected between February 2021 and January 2022 and converted into quarterly observations. A logit (logistic regression) model was used in the analysis.

Results: Age and gender were identified as important basic characteristics. The estimate for female gender was positive and that for age was negative; both were significant at the 1% level. Among the variables obtained from the medical checkups, the estimates for diastolic blood pressure, HbA1c, anamnesis, heart disease history, smoking, increased weight, and frequency of alcohol consumption were positive and significant at the 1% level; furthermore, those for taking antihypertensive medications and kidney disease history were positive and significant at the 5% level. In contrast, the estimates for systolic blood pressure and amount of alcohol consumption were negative and significant at the 1% level. The estimates for taking antihyperglycemic medications and health guidelines were negative and significant at the 5% level. Among the variables obtained from the BJSQ, the estimates for amount of work felt, fatigue, and support from family and friends were positive and significant at the 1% level, and the estimate for irritation was positive and significant at the 5% level. The estimates for controlling job and physical complaints were negative and significant at the 1% level, and those for usage of employee’s ability to work and suitability of the work were negative and significant at the 5% level. As all four operational sites were located in the northeastern region of Japan (cold and snowy in winter), the seasonal effects were significant at the 1% level. The effect of year was also significant, and significant differences were observed among the sites at the 1% level.

Conclusion: Some physical and mental health conditions were strongly associated with long-term absenteeism. By improving these conditions, corporations could reduce the number of employee absence days. As absenteeism is costly for corporations because replacement employees and their training are needed to maintain operations, employers must be concerned about rising healthcare (direct and indirect) costs and invest in improving employees’ health conditions.

Limitations: This study’s results were based on only one corporation, and the dataset was observational. The employees were primarily operators working inside the building, and most of them were healthy. Therefore, sample selection biases might exist, and the results cannot be generalized to other types of jobs, working conditions, or companies. As medical checkups and the BJSQ are mandatory for most companies in Japan, the framework of this study can be applied to other companies. Although we used the BJSQ results, better mental health measures might exist. Similar analyses for different corporations are necessary.


Introduction
The International Labour Organization (ILO) (1) estimated that losses due to health problems would account for approximately 3.94% of annual global GDP. The World Health Organization (WHO) (2) reported that the economic loss caused by work-related health problems [any illness caused or made worse by workplace factors (3)] would be 4%-6% of GDP in most countries. Maintaining and improving employee health are serious issues for employers. WHO (2) also mentioned that "workplace health initiatives can help reduce sick leave absenteeism by 27% and health-care costs for companies by 26%". Several studies have been conducted on the productivity, characteristics, and health conditions of employees (4-17). Various authors have also evaluated monetary costs and returns on health investments (18-21). Loeppke et al. (22) stated that health-related productivity costs were over four times higher than medical costs. In their analysis, they developed a database by integrating the medical and pharmacy claims data with the productivity and health information obtained from the 15,380 Health and Performance Questionnaire (HPQ) respondents of four companies. Then, they added information collected on employer business measures to the database.
Health-related productivity losses have been attributed to absenteeism (repeatedly being absent from work due to health problems) (23) and presenteeism (being present at work but with reduced productivity due to health conditions) (24). Presenteeism is a complicated problem (25), and its proper measurement is difficult. Worker absence is a good proxy for employees' health conditions (26). Since most of the corporation's employees clock in and out of work, and additional trained employees are required to maintain corporate operations, absenteeism is the cornerstone metric guiding corporate policy for healthcare investment (27). Nawata (28) evaluated the health factors affecting absenteeism using data obtained from 1,136 employees at one operational site of a large corporation. However, that study had two limitations: (i) the number of observations was not large and the observation period was just three months, and (ii) only limited factors of physical health conditions obtained from medical checkups were used, ignoring the factors representing mental health conditions. Mental health is important for employee well-being, productivity, and absenteeism (29-42). Goetzel et al. (43) emphasized that employers must be concerned about rising mental healthcare costs. Bryan et al. (44, p. 1519) found "that a change in mental health has an effect on absenteeism more than three times greater than a change in physical health".
Since 2015, annual stress checks have been mandatory for companies with 50 or more workers in Japan under the amendments to the Industrial Safety and Health Act (45). The Japanese government also launched the Stress Check Program to screen workers with high psychological stress in the workplace (46). These amendments aim to prevent workers' mental disorders and improve working conditions that might cause job stress. Medical checkups and stress checks are performed as part of the regular operations of companies, and all costs are paid by the companies. That is, not only are all direct costs covered, but the time needed for medical checkups and stress checks is also treated as paid working hours. Hence, employers must be aware of these results to improve employee health. Tsutsumi and Kawakami (46) mentioned that the Japanese Stress Check Program might be effective in improving workers' mental health. The Brief Job Stress Questionnaire (BJSQ) (47) is usually used for stress checks, in which each worker answers 57 job stress questions. Watanabe et al. (48) also reported that the BJSQ helped to measure psychosocial factors at work. Therefore, we used the BJSQ results to represent the employees' mental health conditions. The BJSQ comprises four parts: job concerns, health conditions, people around the worker, and satisfaction. The 57 questions are then summarized into 19 items scored from 1 to 5; a higher score represents better conditions, that is, 5 is the best and 1 is the worst (49).
In this study, both physical and mental health conditions related to absenteeism were analyzed using 15,574 observations obtained from 2,319 employees over the period of February 2021 to December 2022. To the best of the author's knowledge, this study is the first attempt to analyze this relationship using a large individual-level dataset.

Data
The dataset contained information on medical checkups, BJSQ answers, and work records obtained from employees at four

Models
We set three indices determined by employee, year, and quarter, given by $(i, t, q)$, and converted the data into quarterly (one-dimensional) observations. We define $L\_Absence_{itq} = 1$ if the $i$-th employee has long-term absenteeism in the $q$-th quarter of year $t$ and 0 otherwise. A consecutive integer index (hereafter, observation number) $\ell$ $(= 1, 2, \ldots, n)$ was assigned to each $(i, t, q)$. Note that the assignment is one to one, and $(i, t, q)$ is uniquely determined when $\ell$ is given. Let $q_1$ be the final quarter in which the $(i-1)$-th employee worked and $q_2$ be the first quarter in which the $i$-th employee worked in year $t$, with $\ell_1 = (i-1, t, q_1)$ and $\ell_2 = (i, t, q_2)$. If these employees worked throughout the year, $q_1 = 4$ and $q_2 = 1$. The observation number is assigned so that $\ell_2 = \ell_1 + 1$, and so that the observation number of $(i, t, q+1)$ is one larger than that of $(i, t, q)$ if the $i$-th employee worked in both quarters $q$ and $q+1$ of year $t$. Let $n_1$ be the number of observations in year $t$. Then, the observation number starts from $n_1 + 1$ in the next year. We obtained 18,549 (person-quarter) observations. Of these observations, 7.9% had $L\_Absence_\ell = 1$. The basic model used in the analysis is the logistic regression (logit) model with the fixed time effect given by

$$P(L\_Absence_\ell = 1 \mid x_\ell) = \Lambda(x_\ell'\beta + \gamma_{tq}), \quad \ell = 1, 2, \ldots, n, \quad (1)$$

where $\Lambda$ is the distribution function of the logistic distribution given by $\Lambda(\omega) = \exp(\omega)/(1 + \exp(\omega))$; $x_\ell$ is a vector of covariates representing the employee's characteristics and health conditions; $\beta$ is a vector of unknown parameters; $\gamma_{tq}$ is the fixed time effect; and $n$ is the number of all observations. Since the medical checkup and BJSQ results are available only once a year, we assume that $x_\ell$ does not change within year $t$, so that $x_\ell \equiv x_{tiq} = x_{ti}$ for any possible $1 \le q \le 4$, where $x_{ti}$ represents the medical checkup and BJSQ results of the $i$-th employee in year $t$. Table 1 shows an example of the assignment of the observation number $\ell$ and $x_\ell$ when employees worked throughout the year. Hereafter, we omit the subscript $\ell$ to avoid unnecessary complications.
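The model and the numbering scheme described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record layout, the function names, and the example records are hypothetical, and the three-day threshold is taken from the definition of long-term absenteeism given in the abstract. It is not the author's implementation (the estimation itself was done in EViews).

```python
import math

def logistic_cdf(w):
    """Logistic distribution function: exp(w) / (1 + exp(w))."""
    # Branching keeps the evaluation numerically stable for large |w|.
    if w >= 0:
        return 1.0 / (1.0 + math.exp(-w))
    e = math.exp(w)
    return e / (1.0 + e)

def long_absence(absence_days, threshold=3):
    """Return 1 if quarterly absence days exceed the threshold, else 0."""
    return 1 if absence_days > threshold else 0

def assign_observation_numbers(records):
    """Number person-quarter records 1..n, ordered by year, employee, quarter.

    records: iterable of (employee, year, quarter) tuples that were observed.
    """
    ordered = sorted(records, key=lambda r: (r[1], r[0], r[2]))
    return {rec: number for number, rec in enumerate(ordered, start=1)}

def absence_probability(x, beta, gamma_tq):
    """P(L_Absence = 1) = logistic_cdf(x'beta + gamma_tq)."""
    index = sum(xi * bi for xi, bi in zip(x, beta)) + gamma_tq
    return logistic_cdf(index)

# Hypothetical records: employee 1 worked all four quarters of 2021,
# employee 2 worked only in the third and fourth quarters.
records = [(1, 2021, q) for q in (1, 2, 3, 4)] + [(2, 2021, 3), (2, 2021, 4)]
numbers = assign_observation_numbers(records)
```

In this sketch, the last quarter of employee 1 receives number 4 and the first observed quarter of employee 2 receives number 5, which mirrors the consecutive numbering described in the text.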

Selection of covariates
As shown by Nawata (50), the selection of covariates is important. If we do not add the appropriate covariates, we obtain misleading results. However, if we add covariates that are irrelevant to absenteeism, we may lose estimation efficiency due to multicollinearity among covariates and a reduction in the number of observations caused by missing values. The numbers of factors obtained from the medical checkups and the BJSQ were 41 and 19, respectively. Quarter, site location, and year dummies were the other potential covariates. Therefore, it was necessary to control the number of covariates.
The basic characteristics of employees are as follows: Female (dummy variable) is 1 if female and 0 if male, and Age (age of an employee).
Since these factors were fundamental, not affected by the health conditions and highly significant in all models, we selected health factors based on the models with these factors.
We employed the procedure used in Nawata (28) to select the proper medical checkup and BJSQ covariates. Because the dataset was observational, causality problems might exist; we therefore analyzed the variables possibly related to absenteeism. The procedure is based on likelihood ratio statistics and the Akaike information criterion (AIC), one of the most widely used criteria in model selection. It is important to use likelihood ratio statistics because t-test statistics may provide misleading results in binary choice and similar models (51, 52).
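Let $x_1, x_2, \ldots$ be (potential) covariates. The medical checkup covariates were selected by the following stepwise procedure, which adds covariates one at a time:

i) Estimate the models given by

$$P(L\_Absence = 1) = \Lambda(\beta_0 + \beta_1 Female + \beta_2 Age) \quad \text{and} \quad P(L\_Absence = 1) = \Lambda(\beta_0 + \beta_1 Female + \beta_2 Age + \beta_3 x_j).$$

Let the log likelihoods of the first and second equations and their difference be

$$\log L_0, \quad \log L_j^1, \quad \text{and} \quad LR_j^1 = \log L_j^1 - \log L_0, \quad (2)$$

using observations without missing values. $2 \cdot LR_j^1$ is the likelihood ratio test statistic of $H_0: \beta_3 = 0$ and asymptotically follows $\chi^2(1)$ under the null hypothesis. Choose the $x_j$ that maximizes $LR_j^1$.

ii) Without loss of generality, we can assume that the first variable $x_1$ maximizes $LR_j^1$. Let the model be $P(L\_Absence = 1) = \Lambda(\beta_0 + \beta_1 Female + \beta_2 Age + \beta_3 x_1 + \beta_4 x_j)$, $j = 2, 3, \ldots$, and calculate the second-stage log likelihoods and the difference,

$$\log L_j^2 \quad \text{and} \quad LR_j^2 = \log L_j^2 - \log L_1^1, \quad (3)$$

using observations without missing values. Let $x_2$ be a variable that maximizes $LR_j^2$.

iii) Repeat these steps $m + 1$ times, adding covariates one at a time, until $LR_j^{m+1} < 1$ for all $j > m$. This corresponds to minimizing the AIC. The selected model becomes Eq. (4), given by

$$P(L\_Absence = 1) = \Lambda(\beta_0 + \beta_1 Female + \beta_2 Age + \beta_3 x_1 + \cdots + \beta_{m+2} x_m). \quad (4)$$

This procedure selected the following variables from the medical checkups. The details of the selection procedure are available from the author upon request.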
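The stopping rule can be made concrete with a short Python sketch of forward selection. It relies on the fact that AIC = -2 log L + 2k, so adding one covariate lowers the AIC only when the log-likelihood gain exceeds 1. The function `forward_select`, the callable `loglik`, the toy log-likelihood values, and the covariate named `Height` are all hypothetical. The sketch also ignores the re-estimation on observations without missing values that the procedure above requires at each step.

```python
def forward_select(candidates, loglik, base=()):
    """Forward stepwise selection based on log-likelihood gains.

    candidates: iterable of covariate names.
    loglik: callable taking a tuple of covariate names and returning the
            maximized log likelihood of the model with those covariates.
    base: covariates that always stay in the model (e.g. Female and Age).
    """
    selected = list(base)
    remaining = [c for c in candidates if c not in selected]
    current = loglik(tuple(selected))
    while remaining:
        # Log-likelihood gain from adding each remaining candidate alone.
        gains = {c: loglik(tuple(selected + [c])) - current for c in remaining}
        best = max(gains, key=gains.get)
        # Stop once no candidate reaches a gain of 1: adding a covariate
        # lowers AIC = -2 logL + 2k only if the gain exceeds 1.
        if gains[best] < 1.0:
            break
        selected.append(best)
        remaining.remove(best)
        current += gains[best]
    return selected

# Toy log likelihood (hypothetical numbers): each covariate adds a fixed gain.
toy_gain = {"HbA1c": 12.0, "Smoking": 8.5, "Height": 0.3}

def toy_loglik(names):
    return -4000.0 + sum(toy_gain.get(name, 0.0) for name in names)

chosen = forward_select(["Height", "Smoking", "HbA1c"], toy_loglik,
                        base=("Female", "Age"))
```

With these toy numbers the candidate with the largest gain enters first, and the candidate whose gain stays below 1 is never added.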
SBP (systolic blood pressure) mmHg, DBP (diastolic blood pressure) mmHg, GOT (glutamic-oxaloacetic transaminase) units per liter (U/L), GPT (glutamic-pyruvic transaminase) U/L,

The following variables were selected from the BJSQ (stress check) answers using the same procedure as for the medical checkup variables. These variables take integer values from 1 to 5; a larger value is better (less stress), with 1 the worst and 5 the best. Tsutsumi et al. (53) considered cut-off points to identify high-stress employees. Since the cut-off points were obtained from the BJSQ answers and we assumed that the answers were associated with absenteeism continuously, we directly used the values of these items in the analysis.
Since the year and seasonal factors were important, we assumed that the time effect consisted of year and quarter effects, given by $\gamma_{tq} = \zeta_t + \eta_q$. The following dummy variables were used to represent the effects of the year, season, and site: Y22 (year dummy) is 1 if the year is 2022 and 0 if 2021; Q1, Q3, and Q4 are quarter dummies representing the first, third, and fourth quarters, respectively (the base is the second quarter, in which the probability of long-term absence is the lowest); and Site2, Site3, and Site4 are site dummies representing the second, third, and fourth sites (the base is the first site, which has the largest number of employees).
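The dummy coding described above can be written out as a small Python helper. The function name `time_site_dummies` and the integer coding of the sites (1 to 4) are assumptions made for illustration; only the dummy names and base categories come from the text.

```python
def time_site_dummies(year, quarter, site):
    """Return the year, quarter, and site dummies for one observation.

    Base categories: year 2021, second quarter, and first site.
    """
    if year not in (2021, 2022):
        raise ValueError("year must be 2021 or 2022")
    if quarter not in (1, 2, 3, 4):
        raise ValueError("quarter must be 1, 2, 3, or 4")
    if site not in (1, 2, 3, 4):
        raise ValueError("site must be 1, 2, 3, or 4")
    return {
        "Y22": int(year == 2022),
        "Q1": int(quarter == 1),
        "Q3": int(quarter == 3),
        "Q4": int(quarter == 4),
        "Site2": int(site == 2),
        "Site3": int(site == 3),
        "Site4": int(site == 4),
    }

base_case = time_site_dummies(2021, 2, 1)   # every dummy equals 0
late_case = time_site_dummies(2022, 4, 2)   # Y22, Q4, and Site2 equal 1
```

An observation in the base categories has all seven dummies equal to zero, so its time and site effects are absorbed by the constant term.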
Forty-five covariates were used in the analysis, and $x'\beta + \gamma_{tq}$ in Eq. (1) becomes Eq. (5), which contains the basic characteristics, the selected medical checkup and BJSQ variables, and the year, quarter, and site dummies. The study design is summarized in Figure 2, and the variables not used in the analysis are listed in Appendix A. A summary of the covariates is provided in Table 2. The list of abbreviations used in the study is given in Table A1 in Appendix B. The total number of observations used in the estimation of Eq. (5) is 15,574, of which 1,148 have L_Absence = 1 and 14,426 have L_Absence = 0.

Results of estimation
Table 3 presents the estimation results. We used EViews 12 for the analysis. The gross percentage of long-term absenteeism (L_Absence = 1) was 7.4%, McFadden's $R^2$ was 0.0921, and the likelihood ratio statistic of the equation was 775.05. Among the basic characteristics, the estimate for Female was positive, and its t-value was quite large and highly significant. The odds ratio (OR) for females compared to males was 2.26 with a 95% confidence interval (CI) ranging from 1.86 to 2.74. The estimate for Age was negative and significant at the 1% level. The OR for employees aged 40 years compared with those aged 30 years was 0.69, with a 95% CI of 0.69-0.75. Figure 3 shows the ORs and 95% CIs for Female and Age.
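How an OR and its CI follow from a logit coefficient can be sketched as follows. The sketch assumes a Wald-type interval, exp(beta ± 1.96·SE), which the text does not state explicitly. The coefficient and standard error for Female are not quoted in this section, so they are back-calculated here from the reported OR (2.26) and CI (1.86-2.74) purely to show the relationship; the function name `odds_ratio_ci` is hypothetical.

```python
import math

def odds_ratio_ci(beta, se, delta=1.0, z=1.959964):
    """Odds ratio and Wald-type CI for a change of `delta` in a covariate."""
    center = delta * beta
    half_width = z * abs(delta) * se
    return (math.exp(center),
            math.exp(center - half_width),
            math.exp(center + half_width))

# Back-calculated (not reported) coefficient and standard error for Female.
beta_female = math.log(2.26)
se_female = (math.log(2.74) - math.log(1.86)) / (2 * 1.959964)

or_female, lower, upper = odds_ratio_ci(beta_female, se_female)
```

The `delta` argument covers comparisons such as a 10-year difference in age, for which the OR is exp(10·beta) rather than exp(beta).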
Concerning multicollinearity among covariates, the variance inflation factors (VIFs) were not large except for SBP (3.29), DBP (3.24), GOT (5.68), and GPT (6.22); the VIFs for these variables are in parentheses. Excluding these variables, the largest VIF was 2.78, which is not large; thus, the multicollinearity problem is not particularly serious. The correlation coefficient between the two BP variables is relatively high (0.814). As the standard deviations (SDs) of SBP and DBP differ, we standardize them and define S_SBP = SBP/s1 and S_DBP = DBP/s2, where s1 (= 19.70) and s2 (= 13.41) are the SDs of SBP and DBP. We consider the level and difference of BP as BP_L = (S_SBP + S_DBP)/2 and BP_D = S_SBP - S_DBP. These correspond to the first and second principal components of the standardized BP levels. In the two-variable case, this method makes the estimators for the variables concerned most efficient (54). We then estimated the logit model again and obtained the results shown in Table 4.
Note that these changes are simple linear transformations, and the estimation results of the other variables and the model fitness remain the same as those of Table 3. The estimate for BP_L was negative and not significant at the 1% level; however, the estimate for BP_D was negative and highly significant (p-value is 0.0000).
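Why the fit is unchanged can be checked numerically: the pair (BP_L, BP_D) is an invertible linear transformation of (SBP, DBP), so any linear index in SBP and DBP can be rewritten exactly in terms of BP_L and BP_D. The sketch below uses the SDs reported above; the coefficient values and the blood pressure readings are hypothetical, and the function names are chosen for illustration.

```python
S1, S2 = 19.70, 13.41  # SDs of SBP and DBP reported in the text

def bp_level_difference(sbp, dbp):
    """Return (BP_L, BP_D) from raw SBP and DBP in mmHg."""
    s_sbp, s_dbp = sbp / S1, dbp / S2
    return (s_sbp + s_dbp) / 2.0, s_sbp - s_dbp

def transform_coefficients(b_sbp, b_dbp):
    """Map coefficients on (SBP, DBP) to coefficients on (BP_L, BP_D).

    Uses S_SBP = BP_L + BP_D / 2 and S_DBP = BP_L - BP_D / 2, so the
    linear index b_sbp * SBP + b_dbp * DBP is left unchanged.
    """
    c_level = S1 * b_sbp + S2 * b_dbp
    c_diff = (S1 * b_sbp - S2 * b_dbp) / 2.0
    return c_level, c_diff

# Hypothetical coefficients and readings, for illustration only.
b_sbp, b_dbp = -0.02, 0.03
sbp, dbp = 130.0, 85.0

index_original = b_sbp * sbp + b_dbp * dbp
bp_l, bp_d = bp_level_difference(sbp, dbp)
c_level, c_diff = transform_coefficients(b_sbp, b_dbp)
index_transformed = c_level * bp_l + c_diff * bp_d
```

Both indices agree up to floating-point rounding, which is why the likelihood and the estimates for the other covariates do not change after the transformation.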
The correlation coefficient between GPT and GOT is 0.901. When we considered the level and difference of these variables, the p-values were 0.139 and 0.287 for the level and difference variables, respectively, and the results were not significant.

Basic characteristics
Age is a significant factor affecting long-term absenteeism. However, the number of paid leave days given to employees depends on the number of working years at the company, and it increases as the working years increase [Article 39 of the Labor Standards Act (55)]. Evidently, the working experience of younger employees tends to be shorter than that of older employees. Unfortunately, working years were not available in the dataset. Therefore, additional studies concerning age and working years are required.
Gender is an important factor. The OR for females compared to males is 2.26 [in this case, because the probability of long-term absenteeism is relatively low, the OR approximates the probability ratio (56)]. This means that the long-term absenteeism probability of females is approximately twice that of males. The corporation depends heavily on female employees. Almost three-fourths of employees are female, and their absence directly affects corporate performance. Labor and health policies aimed at female employees are necessary.
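The approximation mentioned in brackets can be illustrated with the standard conversion from an odds ratio to a probability ratio, RR = OR / (1 - p0 + p0·OR), where p0 is the baseline probability. The baseline probability for male employees is not reported in this section, so the value 0.04 below is a hypothetical placeholder, and the function name is chosen for illustration.

```python
def risk_ratio_from_odds_ratio(odds_ratio, baseline_prob):
    """Convert an odds ratio to a probability (risk) ratio.

    baseline_prob is the event probability in the reference group.
    """
    return odds_ratio / (1.0 - baseline_prob + baseline_prob * odds_ratio)

# OR reported in the text; the baseline probability is an assumed value.
rr_low_baseline = risk_ratio_from_odds_ratio(2.26, 0.04)
rr_limit = risk_ratio_from_odds_ratio(2.26, 0.0)
```

As the baseline probability approaches zero, the ratio approaches the OR itself; for small but positive baseline probabilities it is somewhat below the OR, consistent with the statement that the probability for females is approximately twice that for males.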

Variables obtained from the medical checkups
Concerning blood pressure (BP), the relationships of SBP and DBP to absenteeism are opposite. The estimate for SBP is negative, and that for DBP is positive. This means that higher SBP may reduce long-term absenteeism, whereas higher DBP may increase it. The results of Table 4 suggest that the (standardized) difference between SBP and DBP does matter, but the BP level may be less important. There have been many studies on the relationship between BP and diseases (especially cardiovascular diseases) (50). Therefore, it might be necessary to reevaluate the relationship between BP and disease from this viewpoint.
HbA1c is positively related to absenteeism. Anamnesis and histories of heart disease and kidney disease may increase the probability of long-term absenteeism. Compared to those without them, the ORs are 1.36, 2.88, and 2.53 for those with anamnesis, heart disease, and kidney disease, respectively. Therefore, special healthcare provided by the corporation may be necessary for such employees.
Smoking habits and large weight increases can also affect absenteeism. The ORs are 1.67 and 1.44 compared to those without them. These results are consistent with those of previous studies (20, 57-59). Health guidelines may reduce absenteeism, as expected. These factors are modifiable, and it may be worthwhile for the corporation to help improve them. The results for alcohol consumption are mixed. If an employee drinks more frequently, absenteeism may increase; however, if an employee drinks more alcohol at once, the probability of absenteeism may decline. We cannot determine the reasons for this finding, and further studies regarding alcohol consumption are necessary. Antihypertensive medications would increase absenteeism, whereas antihyperglycemic medications would decrease it.

Variables obtained from the BJSQ
The BJSQ primarily represents employees' mental elements. Among them, the estimate for the burden concerning the quantity of work (M_Burden) is positive and highly significant (t-value = 8.45, p-value = 0.000). This is not a good sign, and it is reasonable to consider that employees might be overworking because they cannot take days off due to too much work. In the worst case, overwork results in employee suicide (60). Employers and managers at operational sites should pay attention to avoiding employee overwork. W_Control, Ability_Usage, and W_Suitability are significant at the 1% or 5% levels. These variables represent motivation, suitability, and willingness to work. Because the meanings of these variables are similar, we consider the case in which the values of all three variables increase by one. The estimated covariances $\widehat{Cov}(\hat{\beta}_i, \hat{\beta}_j)$ are -0.00021, -0.00034, and -0.00021 for (i = 29, j = 30), (i = 29, j = 31), and (i = 30, j = 31), respectively, and they enter the variance of the sum, $\hat{V}(\hat{\beta}_{29} + \hat{\beta}_{30} + \hat{\beta}_{31}) = \sum_{i} \hat{V}(\hat{\beta}_i) + \sum_{i \neq j} \widehat{Cov}(\hat{\beta}_i, \hat{\beta}_j)$. Therefore, the OR is 0.65 with a 95% CI of 0.58-0.74 (the OR and CI are calculated by comparing $\sum_i \hat{\beta}_i x_i$ and $\sum_i \hat{\beta}_i (x_i + 1)$). This finding suggests that employers and managers can reduce long-term absenteeism by about one-third through suitable work arrangements that would motivate employees.
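The calculation for a simultaneous one-point increase in several covariates can be sketched as follows, assuming a Wald-type interval. The three covariances are the values quoted above; the coefficient estimates and variances are not quoted in this section, so the numbers used for them are hypothetical placeholders rather than the Table 3 estimates, and the result is not meant to reproduce the reported OR. The function name is chosen for illustration.

```python
import math

def combined_odds_ratio(betas, variances, covariances, z=1.959964):
    """OR and Wald-type CI when every listed covariate rises by one point.

    covariances: dict mapping index pairs (i, j) with i < j to Cov(b_i, b_j).
    """
    total = sum(betas)
    # Var(sum) = sum of variances + 2 * sum of pairwise covariances.
    var_total = sum(variances) + 2.0 * sum(covariances.values())
    se_total = math.sqrt(var_total)
    return (math.exp(total),
            math.exp(total - z * se_total),
            math.exp(total + z * se_total))

# Covariances quoted in the text for the pairs (29, 30), (29, 31), (30, 31).
covs = {(29, 30): -0.00021, (29, 31): -0.00034, (30, 31): -0.00021}

# Hypothetical coefficients and variances (placeholders, not Table 3 values).
betas = [-0.2, -0.1, -0.1]
variances = [0.002, 0.002, 0.002]

or_all, ci_low, ci_high = combined_odds_ratio(betas, variances, covs)
```

Because the quoted covariances are negative, the variance of the sum is smaller than the sum of the individual variances, which narrows the interval relative to treating the three estimates as uncorrelated.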
Irritation is significant at the 5% level. However, the sign is positive. Further studies are necessary to address employee irritation. Fatigue and physical complaints (P_Complaint) are significant at the 1% level. For a one-point improvement, the ORs become 0.85 and 0.81, suggesting that long-term absenteeism could be reduced by about 15% and 20% through a one-point improvement in these factors. The estimate for support from family and friends (F_Support) is positive and significant at the 1% level. It is reasonable to assume that family and friends advise employees to take days off more often when their conditions are poor. Surprisingly, the estimate for work and family life satisfaction is not significant even at the 5% level.

Year, quarter, and site dummies
The estimate of the dummy variable for 2022 (Y22) is highly significant, suggesting a large difference between 2021 and 2022. The average daily number of new COVID-19 patients in Japan was 4,010 in February-December 2021 and 75,175 in January-December 2022 (62). This huge difference might have affected absenteeism; however, further investigation is necessary to evaluate the relationship of COVID-19 to absenteeism.
All estimates for quarter dummies are significant. The seasonal factors are important, especially in the fourth quarter (Q4, October-December). The operational sites are located in the northeastern region of Japan. The first snowfall occurs in mid-November; moreover, the daytime becomes shorter, and the weather becomes colder daily in the fourth quarter, which might affect the behaviors of employees. The estimate for Site2 is positive and significant. The percentages of observations answered the BJSQ are 93.

Odds ratios and 95% confidence intervals of significant variables of positive estimates obtained from medical checkups.
Odds ratios and 95% confidence intervals of significant variables of negative estimates obtained from medical checkups.

Conclusion
This study analyzed the physical and mental health factors of employees that may be related to absenteeism. The dataset included the results of annual medical checkups, the BJSQ, and work records at four operational sites of a large corporation. The sample period was from February 2021 to January 2022. Because there were too many potential covariates, health-related covariates were selected using the stepwise procedure. Subsequently, 15,574 observations from 2,319 employees were used in a logistic regression (logit) model.
The long-term absenteeism probability for females was much higher than that for males. The corporation depends heavily on female employees. Labor and health policies aimed at female employees are necessary for the corporation. Opposite relations were observed for SBP and DBP. These results suggest that the (standardized) difference between SBP and DBP was more important than the BP level. HbA1c, anamnesis, and histories of heart disease and kidney disease were positively related to the probability of long-term absenteeism. Smoking habits and large weight increments were positively associated with absenteeism. Health guidelines might reduce absenteeism. It may be worthwhile for the corporation to help improve them. The results for alcohol consumption were mixed. Antihypertensive medications would increase absenteeism, whereas antihyperglycemic medications would decrease it. Among the BJSQ variables, the estimate for the quantity of work was positive and highly significant, and employers and managers should pay attention to avoiding overworking employees. Improving workers' motivation through suitable work arrangements could reduce long-term absenteeism by one-third. Fatigue and physical complaints were also important, and long-term absenteeism could be reduced by improving physical conditions. The estimate for support from family and friends was positive and significant. However, the estimate for work and family life satisfaction was not significant even at the 5% level.

Odds ratios and 95% confidence intervals of significant variables obtained from the BJSQ.
Odds ratios and 95% confidence intervals of year, seasonal, and site dummies.
The estimate of the dummy variable for 2022 was highly significant. Therefore, COVID-19 might have affected absenteeism. Seasonal factors were important, particularly in the fourth quarter. The estimate for Site2 was positive and significant, and it may be necessary to revise the labor and health management policies at the site. Among the major countries, Japan is the only one that performs annual mandatory health checkups and job stress checks for most employees regardless of their health conditions (63). It is extremely costly to conduct such a survey in other countries. The results of this paper would help when similar studies or policy analyses are carried out in other countries.
The results of this study are based on the operational sites of one corporation, and the dataset was observational. The employees were mainly operators working inside the buildings, and most of them were healthy. Therefore, sample selection biases might exist, and results may differ for different working conditions, job types, or companies. Hence, the results of this study cannot be generalized. However, annual medical checkups and the BJSQ for employees are mandatory for most companies, and the framework of this study is applicable to most companies in Japan. The influence of presenteeism was not evaluated. How to implement measures that improve employees' health conditions is also important. These are the limitations of the study and should be investigated in future studies.

FIGURE 1 Distribution of quarterly absence days.
FIGURE 2 Flow chart of the study.

$\hat{\beta}_{29}$, $\hat{\beta}_{30}$, and $\hat{\beta}_{31}$ are the estimators of the coefficients of these variables.

TABLE 1
Assignment example of observation number $\ell$ in year $t$ and values of $x_\ell$ when employees worked throughout year $t$.

TABLE 2
Summary of covariates.

TABLE 3
Results of estimation.