Smoking and BMI mediate the causal effect of education on lower back pain: observational and Mendelian randomization analyses

Objective: Low back pain (LBP) has been associated with education in previous observational studies, but causality remains unclear. This study aims to assess the impact of education on LBP and to explore mediation by multiple lifestyle factors. Design: Univariable Mendelian randomization (MR) was performed to examine the overall effect of education on LBP. Subsequently, multivariable MR was conducted to assess both the direct effect of education on LBP and the influence of potential mediators. Indirect effects were estimated using either the coefficient product method or the difference method, and the proportion of mediation was calculated by dividing the indirect effect by the total effect. The observational study utilized data from the NHANES database collected between 1999 and 2004, and included 10,580 participants aged 20 years and above. Results: Increasing education by 4.2 years leads to a 48% reduction in the risk of LBP (OR=0.52; 95% CI: 0.46 to 0.59). Compared to individuals with less than a high school education, those with education beyond high school have a 28% lower risk of LBP (OR=0.72; 95% CI: 0.63 to 0.83). In the MR study, smoking accounts for 12.8% (95% CI: 1.04% to 20.8%) of the total effect, while BMI accounts for 5.9% (95% CI: 2.99% to 8.55%). The combined mediation effect of smoking and BMI is 27.6% (95% CI: 23.99% to 32.7%). In the NHANES study, only smoking exhibits a mediating effect, accounting for 34.3% (95% CI: 21.07% to 41.65%) of the effect, while BMI does not demonstrate a mediating role. Conclusions: Higher levels of education provide a protective effect against the risk of LBP. Additionally, implementing interventions to reduce smoking and promote weight loss among individuals with lower levels of education can also decrease this risk.


Introduction
Low back pain (LBP) is a pervasive health issue with a considerable global impact (1,2). It is the leading cause of disability worldwide, affecting an estimated 632 million people and influencing all aspects of their lives, from occupational productivity to psychosocial well-being (3). Despite its wide-ranging effects, the etiology of LBP remains complex and multifactorial, and the influence of lifestyle should not be overlooked (2,4,5).
In recent years, there has been growing interest in the social determinants of health and their role in the development and progression of chronic diseases. One such determinant, educational attainment, has been linked with a wide range of health outcomes (6-8). Education can affect health through various pathways, including healthy lifestyle, employment opportunities, and psychosocial factors, among others (9,10). Generally, higher levels of education are associated with better health and lower mortality (7,11). However, the role of education in the etiology of LBP is less well understood. Although some studies have suggested that individuals with less education may have a higher prevalence of LBP (12,13), it is unclear whether the effect of education on LBP is realized through a healthy lifestyle. It is also important to note that these studies are subject to confounding and reverse causation, making it challenging to infer a causal relationship.
To elucidate the causal effect of educational attainment on LBP and to understand the role of lifestyle in this relationship, we selected four unhealthy lifestyle factors (smoking, alcohol consumption, sedentary TV viewing, and high BMI) as potential mediators and investigated their complex relationships using Mendelian randomization (MR) analysis, a method that uses genetic variation as an instrumental variable to estimate causal effects. This approach addresses the problems of confounding and reverse causation that are common in observational studies (14,15). MR has been increasingly used in epidemiology and has proved particularly useful in exploring causal relationships, such as the effect of education on health outcomes (16,17).
In addition to MR, this study utilized data from the National Health and Nutrition Examination Survey (NHANES), a research program designed to assess the health and nutritional status of adults and children in the U.S. (18). The NHANES data provide a valuable resource for examining the associations between education, lifestyle factors, and LBP in a representative sample of the U.S. population. With these approaches, our study aims to clarify the causal role of education in LBP and to investigate the mediating roles of smoking, alcohol consumption, leisure TV time, and BMI. By gaining insight into these relationships, we hope to contribute to the development of effective strategies for the prevention and management of LBP.

Study design
We employed two research methods to investigate the impact of education level on LBP and the mediating effects of various lifestyle factors. Initially, we utilized a two-sample MR approach to examine the causal relationship between education level and LBP. We also explored potential mediators such as smoking, alcohol consumption, sedentary TV time, and BMI through multivariable Mendelian randomization (MVMR) analysis. Subsequently, to examine the robustness of the identified mediators, we conducted an observational study with data from the NHANES collected between 1999 and 2004. This study followed the guidelines outlined in Strengthening the Reporting of Observational Studies in Epidemiology using Mendelian Randomization (Supplementary STROBE-MR Checklist) and Strengthening the Reporting of Observational Studies in Epidemiology (Supplementary STROBE-cross-sectional studies Checklist).

Data sources

Data for Mendelian randomization
Genetic instruments for educational attainment were obtained from the Social Science Genetic Association Consortium genome-wide association study (GWAS) meta-analysis, which included 766,345 participants of European ancestry (19). The International Standard Classification of Education was utilized to establish the corresponding years of education for each major educational qualification. One standard deviation (SD) represents 4.2 years of schooling. Genetic variant data for LBP were obtained from the FinnGen consortium (https://www.finngen.fi) under the accession number finn-b-M13_LOWBACKPAINORANDSCIATICA, which comprised 13,178 cases and 164,682 controls of European ancestry. The diagnosis of the cases was based on the World Health Organization's International Classification of Diseases inclusion criteria (ICD-10 M54.5, ICD-9 724.2, ICD-8 728.70).
The genetic instruments for smoking phenotypes were obtained from a comprehensive meta-analysis on tobacco and alcohol consumption, which encompassed more than 30 GWASs with over 1.2 million individuals of European ancestry (20). Based on previously documented research (21), we used the smoking index to measure exposure to smoking, where larger smoking index scores represent greater exposure. Genetic variables for alcohol consumption were derived from the latest GWAS pooled analysis (22), containing genetic data from 3,383,199 individuals. From this, we extracted data for the subset of European ancestry (N=2,669,029). As in most studies, exposure to alcohol consumption was measured as the amount of alcohol consumed per week. The genetic instruments for BMI were obtained from the Genetic Investigation of Anthropometric Traits consortium GWAS meta-analysis, which included approximately 700,000 individuals of European descent. One SD represents a difference of 4.8 kg/m² (23). Genetic variables for leisure TV watching were derived from a behavioral GWAS containing 422,218 individuals. Participants were asked: "In a typical day, how many hours do you spend watching TV?" The mean daily reported leisure TV watching was 2.8 h (SD 1.5 h) (24).

Data for NHANES study
The present cross-sectional study utilized NHANES data obtained from the Centers for Disease Control and Prevention for the years 1999 to 2004. The primary objective of the NHANES project is to evaluate the health and nutritional status of noninstitutionalized Americans through stratified multistage probability surveys (18). Data can be accessed via the NHANES website (http://www.cdc.gov/nchs/nhanes.htm) (accessed on August 10, 2022). Our study included individuals aged 20 years or older who had completed an interview and excluded pregnant women, as well as individuals with missing data on education, LBP, or covariates.
The potential covariates assessed in this study were based on the existing literature (4,25-28), and included age, gender, race, marital status, household income, education level, smoking status, alcohol consumption, BMI, physical activity level, and sedentary TV time, as well as hypertension and diabetes mellitus. The education level is defined as the highest grade completed or the highest degree earned. LBP is considered a binary categorical variable, indicating whether LBP has occurred in the past three months. Smoking status is categorized into three groups: never smoked (fewer than 100 cigarettes in a lifetime), former smokers (more than 100 cigarettes but have quit), and current smokers (more than 100 cigarettes and still smoking). Race categories include non-Hispanic white, non-Hispanic black, Mexican American, and others. Marital status is categorized as married, living with a partner, or living alone. Family income is classified into three groups based on the poverty income ratio (PIR) (29): low (≤ 1.3), medium (1.3 to 3.5), and high (> 3.5). Drinking alcohol is defined as consuming an average of at least one alcoholic beverage per month. Physical activity levels are categorized as sedentary, moderate (at least 10 minutes of exercise resulting in light sweating or a mild to moderate increase in respiration/heart rate), and vigorous (at least 10 minutes of activity resulting in heavy sweating or increased respiration/heart rate). Sedentary TV time refers to the total amount of time spent watching TV or using a computer outside of work on a typical day. BMI is calculated using standardized techniques based on weight and height measurements. Previous diseases (hypertension and diabetes mellitus) were identified through a questionnaire by asking participants if they had ever been informed by their physician about these conditions.

Selection of instrumental variables
The screening of instrumental variables (IVs) must adhere to three fundamental assumptions (15): (1) the genetic variants are strongly associated with the exposure; (2) the genetic variants influence the outcome solely through the exposure of interest; and (3) the genetic variants are independent of confounders that may affect the outcome. We established a threshold of p < 5×10⁻⁸ to select genetic variants associated with the exposure. Additionally, single nucleotide polymorphisms (SNPs) were clumped to remove variants in linkage disequilibrium (LD; r² > 0.001 within a 10,000 kb window) using a European LD reference panel (30,31). SNPs associated with LBP at p < 5×10⁻⁶ were removed (32). To evaluate the strength of the IVs, we computed the F-statistic and required F > 10 in all MR analyses to prevent weak IV bias (33,34).
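The filtering steps described above can be sketched in a few lines. This is only an illustration under stated assumptions: the text does not say how the F-statistic was computed, so the common single-variant approximation F = (beta/se)² is assumed here, LD clumping is omitted, and the `Snp` record, the `select_instruments` helper, and all numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Snp:
    rsid: str          # hypothetical identifier
    beta: float        # SNP-exposure effect estimate
    se: float          # standard error of beta
    p_exposure: float  # p-value for the SNP-exposure association
    p_outcome: float   # p-value for the SNP-outcome (LBP) association


def f_statistic(snp: Snp) -> float:
    """Single-variant approximation F = (beta / se)^2 (an assumption)."""
    return (snp.beta / snp.se) ** 2


def select_instruments(snps, p_exp=5e-8, p_out=5e-6, f_min=10.0):
    """Keep SNPs that pass the exposure, outcome, and strength filters."""
    return [
        s for s in snps
        if s.p_exposure < p_exp      # genome-wide significant for the exposure
        and s.p_outcome >= p_out     # not itself associated with the outcome
        and f_statistic(s) > f_min   # guards against weak-instrument bias
    ]


# Made-up summary statistics, for illustration only.
candidates = [
    Snp("rs_a", 0.030, 0.004, 1e-13, 0.40),  # passes all three filters
    Snp("rs_b", 0.012, 0.004, 3e-3, 0.70),   # fails the exposure threshold
    Snp("rs_c", 0.025, 0.004, 4e-10, 1e-7),  # associated with the outcome
]
selected = select_instruments(candidates)
```

In this toy input only `rs_a` survives; in practice the clumping step would be applied before the strength filter.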

Statistical analysis
We initially conducted two-sample univariable MR (UVMR) to estimate the total effect of education on LBP (g), as well as the effects of education on each potential mediator (a). Subsequently, we treated these potential mediators as exposures and extracted their respective IVs, while removing SNPs that overlapped with education. We then performed UVMR again to examine the effects of each potential mediator on LBP. Finally, we incorporated the potential mediators identified by UVMR, along with education, into MVMR to construct various models. These models were used to investigate the independent effects of these mediators on LBP (b) and the direct effects of education on LBP (g*). Our primary analytical method was inverse variance weighting (IVW), which is the most statistically efficient approach when all IVs are valid (35).
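For readers unfamiliar with IVW, a minimal sketch of the fixed-effect estimator from per-SNP summary statistics follows. It assumes first-order weights (each Wald ratio weighted by beta_exp²/se_out²) and uses made-up inputs; it is not the implementation in the R packages used in this study, and the function name `ivw_estimate` is hypothetical.

```python
def ivw_estimate(beta_exp, beta_out, se_out):
    """Fixed-effect IVW estimate from per-SNP summary statistics.

    Each Wald ratio beta_out / beta_exp is weighted by
    beta_exp^2 / se_out^2 (first-order weights, an assumption).
    """
    num = sum(bx * by / sy**2 for bx, by, sy in zip(beta_exp, beta_out, se_out))
    den = sum(bx**2 / sy**2 for bx, sy in zip(beta_exp, se_out))
    estimate = num / den
    se = (1.0 / den) ** 0.5
    return estimate, se


# Made-up SNP effects in which every Wald ratio equals -0.5.
beta_exp = [0.1, 0.2, 0.3]
beta_out = [-0.05, -0.10, -0.15]
se_out = [0.01, 0.01, 0.01]
est, se = ivw_estimate(beta_exp, beta_out, se_out)
```

Because all ratios agree in this toy example, the pooled estimate equals the common ratio; with real data the weights favour SNPs with larger exposure effects and more precise outcome estimates.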
In the NHANES study, data were weighted using interview weights. For statistical description, categorical variables were expressed as proportions (%), while continuous variables were described using either the mean with SD or median with interquartile range (IQR). To compare differences between groups within the complex survey sample, we employed the Wilcoxon rank-sum test and the chi-squared test, the latter with Rao & Scott's second-order correction (36,37). Logistic regression, adjusted for the complex survey design, was used to determine odds ratios (ORs) and 95% confidence intervals (95% CIs) for assessing the associations between covariates and LBP. We also used an adjusted linear regression model for continuous outcomes such as BMI. Various models were developed to analyze the association between education level and LBP. Subgroup analyses, based on potential mediating factors including smoking status (never vs. former or current smokers), alcohol consumption (no vs. yes), BMI (<25 vs. ≥25 kg/m²), and TV watching time (<3 vs. ≥3 hours per day), were conducted. Interactions between subgroups and education level were examined using likelihood ratio tests.
Our analyses were conducted using the "TwoSampleMR" and "MendelianRandomization" packages for MR, "mice" for imputation, and "survey" for weighting in R software (version 4.3.1, R Foundation for Statistical Computing, Vienna, Austria, http://www.R-project.org). We performed a total of nine UVMR analyses and four MVMR analyses, with statistical significance determined using the Bonferroni correction: p < 0.05/13 (approximately 3.85×10⁻³).

Mediating effects analysis
To decompose the overall effect (g) of education on LBP, we considered (i) the direct effect (g*) of education on LBP after adjusting for each mediator, and (ii) the indirect effects of education through each mediator. The indirect effect of each mediator was calculated using the product method; for instance, the indirect effect of education through smoking on LBP was determined by multiplying the effect of education on smoking (a) by the effect of smoking on LBP (b) (38). To derive the joint indirect effect of smoking and BMI, we employed the difference method (g − g*), where g* is the direct effect after adjusting for both smoking and BMI (15). For all mediators, the proportion mediated was quantified by dividing the respective indirect effect by the total effect (39).
In the NHANES study, education level was reclassified into two categories: high school level or below, and above high school level. Similarly, smoking status was dichotomized as never smoked versus former or current smoker. Multivariable logistic regression was employed to investigate the association between education and mediators (a'), while also examining the impact of mediators on LBP (b').
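The two decompositions described in this section can be written out as a short worked example. The sketch below plugs the MR log-odds reported in the Results (g = -0.658, g* = -0.476) into the difference method; for the product method, b is taken as the log of the adjusted smoking OR (1.46), while the value of a is purely hypothetical because the text does not report it. The function names are illustrative.

```python
import math


def proportion_mediated_difference(total, direct):
    """Difference method: indirect = total - direct, as a share of total."""
    return (total - direct) / total


def proportion_mediated_product(a, b, total):
    """Product method: indirect = a * b, as a share of total."""
    return (a * b) / total


# Difference method with the MR log-odds reported in the Results.
g = -0.658       # total effect of education on LBP
g_star = -0.476  # direct effect after adjusting for smoking and BMI
joint = proportion_mediated_difference(g, g_star)

# Product method: b is the log of the adjusted smoking OR from the Results;
# a is a HYPOTHETICAL value, since the text does not report it.
a_hypothetical = -0.25
b_smoking = math.log(1.46)
smoking_share = proportion_mediated_product(a_hypothetical, b_smoking, g)

print(f"joint proportion mediated: {joint:.3f}")
```

With the rounded published coefficients the joint proportion comes out at roughly 0.277, close to the reported 27.6%; the small gap presumably reflects rounding of the inputs.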

Sensitivity analysis
In the MR analysis, we used MR-Egger, weighted median, weighted mode, and mv-lasso (40) (applied in MVMR) as complementary methods to test the robustness of the IVW results. Consistency across these methods provides greater robustness against bias from horizontal pleiotropy. Heterogeneity was assessed using Cochran's Q statistic in both IVW and MR-Egger (41). To address pleiotropy, we tested the MR-Egger intercept, removed outlier SNPs with the MR-PRESSO package (42), and visualized the results using leave-one-out analysis, forest plots, funnel plots, and scatter plots. Following the approach outlined by Burgess (43), we conducted analyses on each overlapping dataset to assess bias and type I error rates (https://sb452.shinyapps.io/overlap/). In addition, we conducted secondary analyses using different datasets to verify the robustness of our findings.
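As an illustration of the heterogeneity check, Cochran's Q for the IVW model can be computed from the same per-SNP summary statistics. The sketch assumes first-order weights and made-up inputs, and the helper name `cochran_q` is hypothetical; the MR-Egger variant of Q is not shown.

```python
def cochran_q(beta_exp, beta_out, se_out):
    """Cochran's Q for the IVW model, with df = number of SNPs - 1."""
    weights = [bx**2 / sy**2 for bx, sy in zip(beta_exp, se_out)]
    ratios = [by / bx for bx, by in zip(beta_exp, beta_out)]
    ivw = sum(w * r for w, r in zip(weights, ratios)) / sum(weights)
    q = sum(w * (r - ivw) ** 2 for w, r in zip(weights, ratios))
    return q, len(ratios) - 1


# Made-up data: identical Wald ratios versus one discordant SNP.
beta_exp = [0.1, 0.2, 0.3]
se_out = [0.01, 0.01, 0.01]
q_homog, df = cochran_q(beta_exp, [-0.05, -0.10, -0.15], se_out)
q_heter, _ = cochran_q(beta_exp, [-0.05, -0.10, 0.15], se_out)
```

Q is compared against a chi-squared distribution with `df` degrees of freedom; a large value relative to `df` suggests that the SNP-specific estimates disagree more than sampling error alone would explain.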
In the NHANES study, multiple imputation by chained equations was conducted using the "mice" package to impute missing values for covariates (44), excluding education level, LBP, smoking status, and BMI. Five datasets were generated in total. We analyzed each dataset separately and applied Rubin's rules to combine their estimates and variances into a final result (45). Additionally, we excluded participants with BMI outside the range of 18.5 to 40 kg/m² and repeated the analysis to assess the robustness of the results.
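Rubin's rules, as used to combine the five imputed analyses, amount to averaging the point estimates and adding the within- and between-imputation variances. A minimal sketch with made-up log-odds ratios follows; the function name `pool_rubin` and all numbers are hypothetical.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool m imputed-data estimates and their variances with Rubin's rules."""
    m = len(estimates)
    pooled = mean(estimates)                # pooled point estimate
    within = mean(variances)                # average within-imputation variance
    between = variance(estimates)           # sample variance across imputations
    total = within + (1 + 1 / m) * between  # total variance
    return pooled, total


# Made-up log-odds ratios from five imputed datasets, for illustration only.
est = [-0.30, -0.32, -0.31, -0.29, -0.33]
var = [0.0020, 0.0021, 0.0019, 0.0020, 0.0020]
pooled, total_var = pool_rubin(est, var)
```

The factor (1 + 1/m) inflates the between-imputation component to account for using a finite number of imputations.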

Instrumental variables and demographic characteristics
We acquired summary data on the association between SNPs and phenotypes from the GWAS for each respective phenotype (Supplementary Table S1). A total of 317 SNPs were selected as IVs for education, with a strong instrument indicated by an F-statistic of 19.6. For smoking index, alcohol consumption, BMI, and leisure TV, we extracted 123 SNPs (F=17.8), 98 SNPs (F=16.9), 521 SNPs (F=29.3), and 148 SNPs (F=17.1), respectively, ensuring the absence of weak IVs.
A total of 15,332 participants over the age of 19 completed interviews in the NHANES study conducted between 1999 and 2004. After excluding pregnant women (n=833) and individuals with missing data on educational attainment (n=57), LBP (n=8), smoking status (n=13), BMI (n=1498), and other covariates (n=2343), a total of 10,580 participants were included. The detailed process of inclusion and exclusion is illustrated in Figure 1.
Supplementary Table S2 presents the weighted basic characteristics of both excluded and included participants. Overall, compared with the excluded group, the included group had a higher proportion of participants who were male, non-Hispanic white, educated beyond high school, married, and high-income, and who consumed alcohol and engaged in exercise. Table 1 provides a baseline characterization of participants categorized by education level, with 39% of participants suffering from LBP. The mean age was 46.0 (16.4) years, and females accounted for 50%. A higher level of education is associated with a higher rate of alcohol consumption, increased physical activity, less smoking, fewer diagnoses of hypertension/diabetes, less time watching TV, and a lower prevalence of LBP.
In the NHANES study, we initially examined the association between each covariate and LBP using univariate logistic regression (Supplementary Table S3). The results revealed that gender, race, BMI, family income, smoking status, physical activity level, TV watching time, and hypertension or diabetes mellitus were associated with the prevalence of LBP. However, no association was observed between alcohol consumption and LBP, which is consistent with the result obtained from the MR analysis.
In the multivariable analysis, we constructed multiple models to adjust for confounding variables. Although age, marital status, and alcohol consumption did not show an association with LBP in the univariate analysis, these factors were included in the models based on the existing literature (46,47). The findings consistently demonstrated an inverse association between education level and the prevalence of LBP. After adjusting for all confounders, individuals with education beyond high school had reduced odds of LBP (OR=0.72; 95% CI: 0.63 to 0.83), compared to those with less than a high school education, as detailed in Table 2. The results of the subgroup analysis showed that the association between education level and LBP remained consistent across all subgroups. Furthermore, no interaction effects were observed between education level and the potential mediating factors, as illustrated in Figure 3.

FIGURE 1 The NHANES study's flow diagram.

Mediating effect
In the UVMR analysis, alcohol consumption showed no effect on LBP and was consequently excluded from subsequent MVMR analysis. In screening potential mediators, we constructed four models that individually adjusted for different mediators, as depicted in Figure 4. The results consistently indicated a direct effect of education on LBP with no evidence of complete mediation. After adjusting for education, BMI, and smoking index in model-1, the OR for leisure TV time became non-significant (OR=1.14; 95% CI: 0.84 to 1.54), and it was therefore excluded from further analysis. The adjusted causal effects of smoking and BMI on LBP were OR=1.46 (95% CI: 1.03 to 2.09) and OR=1.18 (95% CI: 1.08 to 1.30), respectively.
In the NHANES study, after adjusting for other covariates, education level had an inverse association with smoking status (OR=0.62; 95% CI: 0.56 to 0.68); however, it no longer demonstrated an association with BMI (b = -0.16; 95% CI: -0.47 to 0.15). Additionally, both smoking status and BMI were independently associated with the occurrence of LBP, with ORs of 1.24 (95% CI: 1.10 to 1.40) and 1.02 (95% CI: 1.01 to 1.03), respectively, as shown in Supplementary Table S4.
In the MR analysis, the total effect (g, as shown in Figure 5) of education level on LBP was -0.658 (95% CI: -0.79 to -0.53), and in the NHANES study, it was -0.306 (95% CI: -0.39 to -0.22). The direct effect (g*, as shown in Figure 5) was -0.476 (95% CI: -0.71 to -0.24) in the MR analysis and -0.282 (95% CI: -0.37 to -0.20) in the NHANES study. The mediation analysis revealed that smoking accounted for 12.8% (95% CI: 10.4 to 20.8) of the total effect in the MR study, while BMI accounted for 5.90% (95% CI: 2.99 to 8.55), and their combined mediated proportion was 27.6% (95% CI: 23.99 to 32.7). In contrast, in the NHANES study, only smoking showed a mediating effect with a proportion of 34.3% (95% CI: 21.07 to 41.65), whereas BMI did not act as a mediator, as detailed in Table 3.
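The effects in this paragraph are on the log-odds scale, whereas the abstract and earlier sections report odds ratios. The conversion is a simple exponentiation, sketched below with the MR total effect; the helper name `to_odds_ratio` is hypothetical.

```python
import math


def to_odds_ratio(log_odds, ci_low, ci_high):
    """Exponentiate a log-odds estimate and its confidence limits."""
    return tuple(round(math.exp(x), 2) for x in (log_odds, ci_low, ci_high))


# Total effect of education on LBP from the MR analysis (log-odds scale).
or_mr = to_odds_ratio(-0.658, -0.79, -0.53)
```

The point estimate and upper limit reproduce the reported OR of 0.52 and upper bound of 0.59; the lower limit differs from the reported 0.46 in the second decimal, presumably because the published log-odds limits are rounded.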

Sensitivity analysis
In the MR analysis, we observed heterogeneity among the studies (Supplementary Table S5 and Supplementary Figures S1-4); however, no pleiotropy was detected (Supplementary Table S6 and Supplementary Figures S1-4). To explore potential SNPs exerting substantial effects on the outcomes, we reanalyzed the data after removing outlier SNPs and generated scatter plots (Supplementary Figure S5), funnel plots (Supplementary Figure S6), leave-one-out plots (Supplementary Figure S7), and forest plots (Supplementary Figure S8). None of the SNPs exhibited a substantial impact on the results. Furthermore, upon implementing the MR-Egger, weighted median, weighted mode, and mv-lasso (applied in the MVMR) methods, we observed a widening of the 95% CIs, resulting in a loss of statistical significance for certain findings. However, the causal direction remained consistent with the IVW method (Figure 2 and Supplementary Figures S1-4). Lastly, we reassessed the causal relationship between educational attainment, potential mediating variables, and LBP using an alternative dataset and obtained consistent results (Supplementary Table S7). However, in the analysis of sample overlap, we observed that high overlap rates introduce bias and increase the likelihood of Type I errors in the causal effect between education and smoking, alcohol consumption, and leisure TV time (Supplementary Table S8).
Within our NHANES study, we employed two distinct strategies: a multiple imputation approach and the exclusion of extreme BMI values. The findings derived from both methods were consistent with our initially presented results, reinforcing the hypothesis that an elevated level of education is inversely associated with the prevalence of LBP (Supplementary Tables S9-10).

Discussion
Our findings reveal an inverse association between education level and the prevalence of LBP, as evidenced by both the MR analysis and the NHANES database. Smoking was identified as a crucial mediator in the causal relationship between education and LBP, whereas BMI only exhibited mediating effects in the MR study.
Our findings align with previous research on the impact of education on LBP. For example, a prospective study demonstrated a reduction in disability due to back pain with increasing levels of education (48). Additionally, a meta-analysis of 64 studies revealed a higher likelihood of experiencing disabling back pain among individuals with lower educational attainment (49).
Mediation analysis revealed smoking and BMI as mediating variables in the causal pathway from education to LBP, which aligns with previous research indicating that lower education levels are associated with higher rates of smoking and obesity (50), both of which can elevate the risk of LBP (4,27). However, our NHANES database study found only smoking, not BMI, to be a mediator. This finding appears inconsistent with some prior studies (51). We explored possible reasons for this discrepancy. One possibility is that in the NHANES study, education is included as a categorical variable in the model, which may obscure the relationship with BMI. Another possibility is that there may exist a non-linear relationship between BMI and education, which would not be captured by standard multivariable regression analysis (52). Furthermore, it is important to take into account factors such as diet and levels of physical activity that may influence the relationship between BMI and education (53). In summary, while our analysis did not find a significant correlation between education and BMI, this does not conclusively disprove a potential relationship. The relationship between BMI and education could be complex and influenced by various factors. Future research employing advanced statistical methods and accounting for potential confounders and interaction effects might provide further insights into this complex topic.

FIGURE 3 The association between education level and low back pain in subgroups. Each stratification factor was adjusted for all other variables (age, sex, marital status, race, household income, smoking status, physical activity, alcohol consumption, hypertension or diabetes mellitus, body mass index, and time spent watching television) except for the stratification component itself. LBP, low back pain; OR, odds ratio.
Another intriguing aspect of our study is the finding that smoking and BMI together mediate 27.6% (95% CI: 23.99 to 32.7) of the effect of education on LBP, leaving approximately three quarters of the effect unaccounted for. This further highlights the complex relationship between education and LBP. It is well known that individuals with higher educational attainment are more likely to engage in cognitive work as opposed to physical labor. Furthermore, they often report greater job satisfaction and enjoy better access to quality healthcare resources (54,55), which previous research has identified as significant determinants of chronic LBP (56). Therefore, the potential role of these non-lifestyle factors in mediating the impact of education on LBP warrants further investigation.
Our findings highlight the importance of targeting smoking cessation in the prevention and management of LBP, especially among individuals with lower educational attainment. Smoking has been implicated in impairing blood flow, leading to reduced oxygen and nutrient supply to spinal tissues, which may promote the development of LBP (57). Furthermore, nicotine may increase pain sensitivity, potentially intensifying LBP symptoms (58). Therefore, interventions that reduce smoking prevalence could significantly alleviate the LBP burden. Although BMI was not identified as a mediator between education and LBP in the NHANES study, obesity remains a notable risk factor for LBP. It is hypothesized that obesity, especially abdominal obesity, exerts additional mechanical stress on the lower back, resulting in pain (59). In addition, adipose tissue can secrete pro-inflammatory cytokines, which could contribute to LBP (60,61). Accordingly, weight management may still offer benefits for LBP prevention and treatment.
Our study has several limitations. First, the smoking index used in MR was derived from a combination of various smoking-related indicators, not all of which were available in the NHANES database. As a result, we could only classify smoking status into two or three categories in the NHANES study, a categorization that is dimensionally different from that used in the MR analysis. This may lead to some inconsistencies between the findings of the two methods. Second, despite excluding SNPs associated with known confounders in MR and adjusting for them in the NHANES study, residual confounding due to unmeasured or unknown factors cannot be completely ruled out. Third, in the MR analyses, partial sample overlap between education and smoking, alcohol consumption, and leisure TV time introduces bias and increases the probability of Type I errors. Further research is still needed to confirm the causal relationships among them. Additionally, the categorization of education into two groups for calculating the direct and indirect effects on LBP during the mediation analysis may result in a loss of granularity in the measurement of the exposure. These limitations emphasize the need for further research to refine the evidence in this area.

Conclusions
In conclusion, education can reduce the prevalence of LBP, partly through its effects on smoking cessation and weight management. This implies the necessity of comprehensive strategies to prevent and manage LBP, which not only encompass direct interventions like pain management and physical therapy but also emphasize the importance of education and promoting healthy lifestyles.

FIGURE 2 Univariate Mendelian randomization analysis. (A) The effects of education and potential mediators on lower back pain. (B) The effects of education on potential mediators. Potential mediators include smoking index, alcohol consumption, BMI, and leisure TV time. LBP, low back pain; BMI, body mass index; OR, odds ratio; IVW, inverse variance weighting.

FIGURE 5 Total effect of education on low back pain (A) and mediating effects model for smoking and BMI (B). The effect estimates in Mendelian randomization are derived from the inverse variance weighting method, while in the NHANES study, they are obtained through weighted multivariable logistic regression. BMI, body mass index; NHANES, National Health and Nutrition Examination Survey.

TABLE 1
Weighted population characteristics by categories of education level.

TABLE 2
Weighted association between education level and low back pain.

TABLE 3
Mediating effects of smoking and body mass index.
BMI, body mass index; MR, Mendelian randomization; NHANES, National Health and Nutrition Examination Survey; P, proportion.