Exploring the role of vitamin D in cognitive function: mediation by depression with diabetes modulation in older U.S. adults, a NHANES weighted analysis

Background The relationship between vitamin D levels, depressive symptoms, and cognitive function has yet to be definitively understood in the elderly, particularly when considering the impact of chronic diseases. This study focuses on how depression mediates the impact of 25-hydroxyvitamin D3 (25(OH)D3) on cognitive performance in older U.S. adults. Methods We analyzed data from 2,745 elderly individuals extracted from the NHANES 2011–2014 cycles, applying weighted processing to account for the complex multi-stage sampling design characteristic of NHANES data. Utilizing weighted data for covariate and model selection, we conducted mediation analyses on both the overall population and subgroup data. Significant mediation pathways were validated using a stratified weighted bootstrap approach. For significant subgroup pathways, we explored interactive mechanisms through interactive mediation analysis. Results Mediation analyses, thoroughly accounting for the impact of chronic conditions, revealed significant pathways in both the weighted overall population and the weighted diabetes subgroup. After 1,000 stratified weighted bootstrap replications, the proportion of mediation effects were 10.6% [0.040, 0.268] and 20.9% [0.075, 0.663], respectively. Interactive mediation analysis for diabetes indicated that the interaction between diabetes and depression was not significant in the direct pathway (estimates = 0.050, p = 0.113) but was significant in the mediation pathway, yielding the largest effect size compared to other covariates (estimates = 0.981, p < 0.001). Conclusion This study highlights the mediating role of depression in the relationship between vitamin D levels and cognitive function in the elderly, particularly emphasizing diabetes as a key moderator. Our findings suggest targeted interventions addressing both vitamin D sufficiency and depression could significantly benefit cognitive health, especially in diabetic individuals.


Introduction
Cognitive dysfunction, increasingly prevalent in geriatric populations, not only escalates healthcare burdens but also severely diminishes patients' quality of life (1). Vitamin D, a fat-soluble steroid, has been demonstrated to have profound effects on physical and mental health due to its significant roles in the cardiovascular, endocrine, and nervous systems. The relationship between vitamin D levels and depression is supported by both the presence of vitamin D-related components in brain tissue and the plausible biological mechanisms through which vitamin D influences neurological functions (2,3). These mechanisms include the regulation of neurotransmitter synthesis, which is crucial for stabilizing neurological function, and the modulation of inflammatory responses, which have been implicated in the pathophysiology of depression (2,3). Two follow-up cohort studies have found that vitamin D deficiency is a risk factor for depression in the elderly (4,5). Additionally, existing research has established that the relationship between depression and cognitive function is unidirectional, with depression occurring prior to the onset of cognitive impairments and serving as a longitudinal risk factor for cognitive decline (6,7). Therefore, we posit that there may exist a pathway in elderly individuals where vitamin D influences cognitive function through the mediation of depression. This hypothesis highlights the importance of considering the interplay between nutritional status, mental health, and cognitive health in the aging population, suggesting a sequential and interconnected process that warrants further investigation.
The elderly represent a distinct demographic, often grappling with a greater number of chronic conditions than their younger counterparts. Likewise, various chronic diseases are known to influence the onset of depression and cognitive decline. Evidence indicates that older adults with diabetes are at heightened risk for developing both depression and cognitive dysfunction (8,9). Clinically significant depression is observed in one out of every four individuals with type 2 diabetes mellitus (10). Evidence indicates that diabetes significantly contributes to cognitive dysfunction and predisposes individuals to cognitive decline, with a 50% increased risk of developing dementia (11). Other common chronic diseases among older adults, such as hypertension, stroke, and heart disease, have also been reported to be associated with the occurrence of depression and cognitive dysfunction (12–17). Without accounting for such covariates as chronic diseases, we might struggle to accurately differentiate between the direct and indirect effects of vitamin D levels on cognition, and to precisely determine the significance of the mediating effect of depression. Neglecting these covariates could obscure the true relationship between vitamin D and cognitive outcomes, as well as the mediating role of depression.
In our research, we extend beyond common demographic and lifestyle factors to particularly emphasize the impact of chronic disease-related covariates on how vitamin D and depression mediate cognitive function. This includes considering the effects of chronic diseases both in mediating pathways and direct pathways. Moreover, we have conducted subgroup mediation effect analyses on chronic disease factors that demonstrated statistical significance in preliminary screenings, using them as grouping variables. All analyses, including variable selection, model fitting, and subgroup analyses, are based on the complex stratified weighted data from the National Health and Nutrition Examination Survey (NHANES), utilizing the official NHANES weights. This approach ensures our findings reflect the real situation across the U.S. civilian, non-institutionalized population. Our objective is to delineate and validate the pathways through which vitamin D impacts cognitive function by mediating depression, with a special focus on the role of chronic diseases.

Study population
Data from the 2011-2014 NHANES cycles are used in our study, as only these cycles include our selected research variables. We excluded all rows with missing values in the key variables of cognitive function, depression, and 25(OH)D3. The remaining dataset, comprising 2,745 individuals aged 60 and over, retained covariates with less than 10% missing values for further analysis. The study population, based on complex multistage probability weighted data, consisted of approximately 46% men and 54% women, with a total weighted sample size of 50,667,046. The median age of the weighted sample was 68 years, with an interquartile range of 63 to 74 years (Supplementary Table S1). This survey design ensures that the sample is representative of the non-institutionalized elderly civilian US population, capturing a broad range of demographic and socioeconomic backgrounds to provide a robust basis for our analysis.
Participants were included based on their willingness and ability to participate in comprehensive health assessments. These assessments were conducted by trained interviewers and healthcare providers, either in participants' homes or at Mobile Examination Centers (MEC). Ethical approval for the data collection was obtained from the National Center for Health Statistics Ethics Review Board, and informed consent was secured from all participants. Detailed interview protocols and measurement methodologies are described in the NHANES documentation.

Cognitive function assessment
The NHANES 2011-2014 cycles included a comprehensive cognitive assessment battery that consisted of three tests: the Consortium to Establish a Registry for Alzheimer's Disease (CERAD), the Animal Fluency Test (AF), and the Digit Symbol Substitution Test (DSST). The CERAD evaluated immediate and delayed memory function through a set of learning and recall trials involving 10 unrelated words. The assessment comprised three Word Learning Subtests and one Delayed Word Recall test. We aggregated these scores to arrive at a total CERAD score (18). The AF assessed categorical verbal fluency by asking participants to name as many animals as possible within a minute, with a preliminary test involving naming three articles of clothing. The DSST gauged processing speed, sustained attention, and working memory by requiring participants to match symbols and numbers on a paper form within 2 minutes. To create a composite cognitive function score as our dependent variable, we averaged the z-scores of the three tests.
Depressive symptoms were assessed using the Patient Health Questionnaire-9 (PHQ-9), a widely accepted and validated self-rating scale administered during face-to-face interviews at MEC. The PHQ-9 consists of nine items that evaluate various facets of depression, including anhedonia, depressed mood, sleep disturbance, and suicidal ideation, among others. Respondents rated each item on a scale from 0 to 3, resulting in a total score ranging from 0 to 27, where higher scores indicate more severe depressive symptoms. To achieve analytical consistency with cognitive scores and 25(OH)D3 trends, we transformed the PHQ-9 scores using the formula: Depression Score = 27 − original PHQ-9 total score. This transformation aligns the depression scores directionally with cognitive function scores and 25(OH)D3 levels, so that higher values consistently represent better outcomes, thereby making the model's interpretation more straightforward and consistent. We analyzed the depression scores as a continuous mediator variable.
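The two scoring steps described in this section can be illustrated with a minimal Python sketch. The study itself was carried out in R; the function names below are hypothetical, and the use of unweighted population standard deviations for the z-scores is an assumption made only for illustration, since the text does not state how the standardization was performed.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Standardize raw scores to mean 0 and SD 1 (population SD; an assumption)."""
    mu = mean(values)
    sd = pstdev(values)
    return [(v - mu) / sd for v in values]


def composite_cognition(cerad, af, dsst):
    """Average the three per-test z-scores for each participant."""
    standardized = [z_scores(cerad), z_scores(af), z_scores(dsst)]
    return [mean(triple) for triple in zip(*standardized)]


def reversed_phq9(total):
    """Reverse a PHQ-9 total (0-27) so that higher values mean fewer symptoms."""
    if not 0 <= total <= 27:
        raise ValueError("PHQ-9 total must lie between 0 and 27")
    return 27 - total
```

After the reversal, all three analysis variables point in the same direction, so positive path estimates can be read uniformly as "higher is better".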

Laboratory measurement of 25(OH)D3
The concentrations of serum 25(OH)D3, 25(OH)D2 and C3-epi-25(OH)D3 were analyzed using high-performance liquid chromatography tandem mass spectrometry (LC-MS/MS) at the National Center for Environmental Health, CDC, Atlanta, GA. Calibration and quality control followed NHANES laboratory procedures. The total serum 25(OH)D level was computed by summing the levels of 25(OH)D3 and 25(OH)D2. We selected 25(OH)D3 as the independent variable for our mediation analysis for several reasons. First, it is the most metabolically stable and abundant form of vitamin D metabolite (21), with its serum concentrations serving as a reliable biomarker for assessing an individual's vitamin D status (22). Measurement of 25(OH)D3 thus represents a more accurate estimation of in vivo vitamin D status (23). Second, in the case of the continuous variable 25(OH)D2, the proportion of truncated values is excessively high (74.1%), which compromises its accuracy and, by extension, affects the accuracy of the combined value of total 25(OH)D.

Covariates
We selected covariates from the NHANES dataset based on their relevance to depression and cognitive function, as determined by literature reviews and expert consultation. These covariates include age, gender, poverty income ratio (PIR), marital status, educational background, Body Mass Index (BMI), waist circumference, smoking status, alcohol consumption, season of exam, and historical chronic diseases such as diabetes, hypertension, heart diseases (including congestive heart failure, coronary heart disease, angina/angina pectoris, and heart attack), emphysema, chronic bronchitis, malignancy, hyperlipidemia, stroke, and sleep disorders. All information was acquired through standardized questioning, physical examinations, and laboratory tests, provided by qualified medical staff.

Statistical analysis
To ensure our statistical analysis accurately represents the complex structure of NHANES, we adapted all our statistical analysis methods to align with its multistage probability sampling design, explicitly incorporating complex weights to achieve accurate estimates that reflect the U.S. population. The cognitive score was divided into quartiles, with quantile thresholds determined using survey-weighted quantile analysis. We present the descriptive results in two forms: weighted data, to provide estimates that are generalizable to the population level, and unweighted data, which represent the raw data without considering the survey's stratification and clustering. For categorical variables, we display frequencies and percentages, denoted as n (%). Continuous variables are described using median values with the interquartile range, indicated as M (Q1, Q3), for both weighted and unweighted data. Variable selection was conducted using weighted data. For comparisons between categorical variables and the outcome variable, we applied an adjusted Chi-squared test, which accounts for the design effect in survey data. This adjustment makes the test appropriate for contingency tables, particularly those with small expected frequencies in cells. For continuous variables, the comparison with the outcome variable was carried out using a Wilcoxon rank-sum test tailored for complex survey samples. The variables extracted for our study were those with less than 10% missing values. Missing data were filled using Random Forest Imputation via the missForest package. This technique excels at handling nonlinear relationships between variables and offers a robust, high-accuracy approach for imputing both numerical and categorical data without requiring distributional assumptions (24). For subgroup analysis, we selected age, gender, and chronic disease-related variables as grouping factors, ensuring that each subgroup demonstrated a power value greater than 0.8 given its actual number of data points. This criterion supports the statistical robustness and representativeness of our analysis, focusing on variables with sufficient statistical power to detect meaningful differences.
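To make the idea of survey-weighted quartile thresholds concrete, the following Python sketch computes a weighted quantile as the smallest observed value at which the cumulative share of sampling weight reaches the target. This is a simplified illustration with hypothetical function names; the study used R, and survey software may apply different interpolation rules, so thresholds from a real survey routine can differ slightly.

```python
def weighted_quantile(values, weights, q):
    """Smallest value whose cumulative weight share reaches q (0 < q <= 1)."""
    if not 0 < q <= 1:
        raise ValueError("q must be in (0, 1]")
    pairs = sorted(zip(values, weights))
    total = sum(w for _, w in pairs)
    cumulative = 0.0
    for value, weight in pairs:
        cumulative += weight
        if cumulative >= q * total:
            return value
    return pairs[-1][0]  # guard against floating-point shortfall


def quartile_group(score, cuts):
    """Return 1-4 given the three weighted cut points (25th, 50th, 75th)."""
    return 1 + sum(score > c for c in cuts)
```

For example, with values 1 to 4 and weights 1, 1, 1, 5, the weighted median is 4 rather than 2, because the heavily weighted observation represents most of the population.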
From both overall and subgroup perspectives, we employed survey-weighted generalized linear modeling (GLM) with a Gaussian family to assess the effects of the independent variables on the mediator, the mediator on the outcome variable, and the independent variables on the outcome variable. This initial analysis aimed to determine the statistical significance of these relationships, thereby establishing the necessity for mediation analysis. Furthermore, we assessed the interaction effects between 25(OH)D3 and depression within the cohort in a weighted analysis to mitigate the potential influence of interaction effects. This step is critical in ensuring that the mediation analysis is not confounded by unaccounted interaction between variables.
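The relationships examined in this preliminary step map onto the regressions of a standard single-mediator model. As a sketch, let X denote 25(OH)D3, M the depression score, Y the composite cognitive score, and Z a vector of covariates; these symbols are illustrative shorthand rather than notation taken from the analysis, and survey-design terms are omitted for brevity:

```latex
\begin{align}
M &= i_1 + aX + \gamma_1^{\top} Z + \varepsilon_1, \\
Y &= i_2 + c'X + bM + \gamma_2^{\top} Z + \varepsilon_2, \\
Y &= i_3 + cX + \gamma_3^{\top} Z + \varepsilon_3 .
\end{align}
```

Under this linear specification, with the same covariates and weights in all three equations, the indirect (mediated) effect is $ab$, the direct effect is $c'$, and the total effect satisfies $c = ab + c'$. One common definition of the proportion mediated is therefore $ab / (ab + c')$.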
In our analysis, we employed structural equation modeling (SEM) specifically tailored for the complex, weighted survey design. We integrated key aspects of the survey's methodology, including the designation of primary sampling units (PSUs) and the stratification of the data into distinct layers. Each model took into account the hierarchical structure of the data through nesting, alongside the application of survey weights, which are pivotal in ensuring our analyses accurately represent the broader U.S. civilian, non-institutionalized population. By strategically incorporating chronic disease covariates into the mediation and direct effect paths in varying configurations, we constructed five multifactorial mediation analysis models, each designed to explore different aspects of the relationships among chronic conditions, mediators, and outcomes. We conducted a comprehensive evaluation of several fit indices for SEM, including the comparative fit index (CFI), Tucker-Lewis index (TLI), Akaike information criterion (AIC), Bayesian information criterion (BIC), standardized root mean square residual (SRMR), and sample-size adjusted BIC (SABIC), to select the model with the best fit. The chosen model was then used to perform mediation analysis on both the weighted total population data and weighted subgroup data. A pathway was considered significant when the p-values for the direct effect, the mediation effect, and the total effect were all less than 0.05. For the validation of significant pathways, we developed a stratified weighted bootstrap mediation analysis for complex survey data, with an emphasis on maintaining the integrity of the survey's design, including stratification, clustering, and weighting. This approach generates 1,000 weighted bootstrap samples while ensuring that each stratum within the data has at least two primary sampling units (PSUs). This requirement is crucial for preserving the representativeness and variability of the survey's stratified design, thereby ensuring that the integrity of the survey structure is upheld in the analysis. For significant subgroup pathways, we incorporated depression interaction terms into both the mediation and direct paths. This method allows us to analyze the moderating mechanism of the subgroup on depression. Finally, as a sensitivity analysis, we applied the same mediation analysis model to unweighted NHANES data to examine the stability of significant pathways.
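The general shape of a stratified, PSU-level bootstrap for a mediation proportion can be sketched in Python as follows. This is not the authors' R implementation: it assumes a single mediator with no covariates, resamples whole PSUs with replacement within each stratum without rescaling weights, and uses a percentile interval. All function names and the nested-dictionary data layout are hypothetical.

```python
import random


def weighted_cov(u, v, w):
    """Weighted covariance of u and v with weights w."""
    total = sum(w)
    mu_u = sum(wi * ui for wi, ui in zip(w, u)) / total
    mu_v = sum(wi * vi for wi, vi in zip(w, v)) / total
    return sum(wi * (ui - mu_u) * (vi - mu_v)
               for wi, ui, vi in zip(w, u, v)) / total


def proportion_mediated(rows):
    """rows: (weight, x, m, y) tuples. Returns a*b / (a*b + c')."""
    w, x, m, y = (list(col) for col in zip(*rows))
    sxx, smm = weighted_cov(x, x, w), weighted_cov(m, m, w)
    sxm, sxy, smy = (weighted_cov(x, m, w), weighted_cov(x, y, w),
                     weighted_cov(m, y, w))
    det = sxx * smm - sxm ** 2
    a = sxm / sxx                            # path X -> M
    b = (smy * sxx - sxy * sxm) / det        # path M -> Y, adjusting for X
    c_prime = (sxy * smm - smy * sxm) / det  # direct path X -> Y
    return a * b / (a * b + c_prime)


def stratified_psu_bootstrap(data, n_boot=1000, seed=0):
    """data: {stratum: {psu_id: [rows]}}. Returns a 95% percentile interval."""
    for stratum, psus in data.items():
        if len(psus) < 2:
            raise ValueError(f"stratum {stratum!r} has fewer than two PSUs")
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_boot):
        sample = []
        for psus in data.values():
            ids = list(psus)
            # Resample whole PSUs within the stratum, keeping its PSU count.
            for chosen in rng.choices(ids, k=len(ids)):
                sample.extend(psus[chosen])
        estimates.append(proportion_mediated(sample))
    estimates.sort()
    return estimates[int(0.025 * n_boot)], estimates[int(0.975 * n_boot) - 1]
```

Resampling PSUs rather than individuals keeps the within-cluster correlation intact in every replicate, which is the reason each stratum needs at least two PSUs. Production survey bootstraps typically also rescale the weights in each replicate, which this sketch leaves out.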
All statistical analyses were performed in R 4.2.3, and results were considered statistically significant when the two-sided p-value was < 0.05.

Baseline characteristics of the study population
Table 1 presents the baseline characteristics of the study population, with statistical descriptions provided for both data adjusted using NHANES's complex survey weights and for unweighted data from the 2011-2014 NHANES dataset. The medians for continuous variables and the proportions for categorical variables differ between the weighted and unweighted data, underscoring the impact of the NHANES complex weighting. Nevertheless, the directional trends across both the weighted and unweighted data remain similar with respect to the association with cognitive function categories. Individuals with conditions such as hypertension, diabetes, heart disease, and stroke are more prevalent in the cohort with poor cognitive function than in those with better cognitive function.

Covariable selection and preliminary analysis for mediation analysis
In our analysis, the covariables selected for further study based on the weighted data were stroke, alcohol consumption, education level, age group, gender, income-to-poverty ratio, diabetes, hypertension, heart disease, and marital status (Supplementary Table S1). Preliminary analysis confirmed the statistical significance of the relationships between 25(OH)D3 levels, depression, and cognitive function, which laid the groundwork for the subsequent mediation analysis. Upon controlling for all selected covariates, both 25(OH)D3 (Estimate = 0.049, p = 0.010) and depression (Estimate = 0.066, p = 0.005) were identified as independent predictors of cognitive function. A significant positive association was also established between 25(OH)D3 levels and depression scores (Estimate = 0.098, p = 0.001). This association persisted (Estimate = 0.094, p = 0.001) when taking into account the influence of chronic diseases (stroke, hypertension, diabetes and heart disease). In the results presented in Table 2, the survey-weighted regression analyses demonstrate significant associations between 25(OH)D3 levels, depression scores, and cognitive outcomes across both the entire cohort and most stratified subgroups after adjusting for NHANES complex survey designs. In Figure 1, the analysis reveals no significant interaction between 25(OH)D3 levels and depression scores in influencing cognitive outcomes, both in the univariate model and the multivariate model.

Selection of stratified weighted mediation analysis models
Upon evaluating various fit indices such as the Comparative Fit Index (CFI), Tucker-Lewis Index (TLI), Standardized Root Mean Square Residual (SRMR), Akaike Information Criterion (AIC), Bayesian Information Criterion (BIC), and Sample-Size Adjusted BIC (SABIC), Model 1 proved to be the most appropriate. It demonstrated an optimal balance among these measures, suggesting that it best captures the data structure when compared to other models, as detailed in Table 3. The results of the path analysis for Model 1 in the weighted total population are presented in Figure 2.

Mediation analysis for weighted total and subgroup data
Mediation analysis was conducted using Model 1 on both the weighted total population data and weighted subgroup data. As presented in Table 4, significant pathways were identified in the overall data and within the diabetes subgroup. In contrast, other pathways did not show statistical significance either in the mediation effect, direct effect, or total effect.

Validation of significant pathways
We developed a stratified weighted bootstrap approach to validate the robustness of mediation pathways in both the weighted overall population and the weighted diabetic subgroup. Figure 3, based on 1,000 bootstrap resamples, illustrates that the mediated effect accounts for 20.9% [0.075, 0.663] in the diabetic subgroup. For the general population, the mediated effect is 10.6% [0.040, 0.268]. Both effects were statistically significant, as indicated by their 95% bootstrap confidence intervals not encompassing zero.

Interactive mediation analysis for depression and diabetes
We further explored the moderating role of diabetes in the mediation analysis. Figure 4 presents the path analysis results after incorporating interaction terms into both the direct and mediated pathways. Notably, the introduction of an interaction term between diabetes and depression yields a statistically significant estimate of 0.981 in the mediation pathway, surpassing all other variables in the mediation analysis. However, this interaction term is not significant in the direct pathway. Therefore, we consider diabetes to serve as an important moderator of depression in the mediation pathway.

Sensitivity analysis
We applied Model 1 to unweighted data for both overall and subgroup mediation analyses, as shown in Table 4. The results indicate that pathways for the general population and the diabetes subgroup remain significant. Additionally, in the unweighted scenario, numerous other significant pathways emerged, including subgroups of elderly under 70 years, females, non-diabetics, and those with hypertension (Table 5).

Discussion
In our study, leveraging the complex multi-stage sampling design of NHANES and fully accounting for the impact of chronic conditions in the elderly, we conducted an in-depth analysis of how vitamin D affects cognitive function through depression as a mediating factor. Several studies across diverse regions have reported that lower vitamin D is associated with cognitive impairment in the elderly (25,26), which is consistent with our hypothesis. However, other studies showed that there is no association between vitamin D status and cognitive impairment (27–29). Given the conflicting conclusions in existing research, it is plausible that the relationship between vitamin D and cognitive function is intricate, potentially influenced by mediating factors and different subgroups. Our research, drawing on weighted data from the entire U.S. civilian, non-institutionalized population, confirms the presence of a pathway in which vitamin D impacts cognitive function through depression, both in the general population and specifically within the diabetic subgroup.
Many NHANES-based studies overlook the importance of the survey's sampling weights in mediation analyses (30,31). We used the R survey package to specify the complex multi-stage sampling design and combined it with the lavaan package, ensuring that weighted data are utilized throughout the mediation analysis process (32). This methodology not only significantly reduces bias in our estimates but also enhances the statistical power of subgroup analyses by improving generalizability. By considering the design effect and employing appropriate weighting methods, our approach effectively addresses the challenges of unbalanced subgroup sizes, thereby ensuring more accurate and representative analysis outcomes (33). For significant mediation pathways, we implemented a validation process using a robust stratified weighted bootstrap approach, which we developed specifically to accommodate the complex multi-stage sampling design of NHANES. This innovative method underscores the stability and replicability of our findings.
Beyond the general population, the diabetes subgroup also exhibited the significant pathway, prompting further investigation into the mechanisms by which diabetes influences the mediation effect of depression. Our analysis revealed that the interaction term between depression and diabetes presented the highest estimate value (0.981) among all covariates within the mediation pathway, positioning diabetes as a crucial conditional moderator in the mediation of depression. This assertion is well-supported theoretically, as depression and diabetes frequently co-occur (34). Both depression and diabetes are intricately linked to disruptions in brain function and neuroplasticity, evidenced by changes in brain chemistry and structure, such as altered prefrontal neurotransmitter levels and hippocampal atrophy, which impact cognition and mood (35). Research indicates that the prevalence of depression is threefold higher in individuals with prediabetes and twofold higher in diabetes patients compared to the general population (36).
The sensitivity analyses, using unweighted NHANES data across subgroups, increased confidence in our findings. Beyond the significant pathways identified in the general population and the diabetic subgroup, significant pathways also emerged among the elderly under 70, females, non-diabetics, and hypertensive individuals. This outcome suggests that the impact of vitamin D, mediating through depression on cognitive function, may be more extensive than previously thought, underlining its significance across various population segments. The sensitivity analysis also hints at a potential overestimation of effects when the survey design is not considered, thereby emphasizing the importance of incorporating NHANES weighting throughout our research for accurate and meaningful public health insights. Our findings highlight the potential for interventions such as targeted vitamin D supplementation to not only improve vitamin D levels but also address depression as a key intermediary, thereby enhancing cognitive function. This approach holds particular promise for diabetic and elderly populations, where the interplay of vitamin D, depression, and cognitive decline appears most significant. These findings pave the way for future randomized controlled trials designed to evaluate the significant pathways identified in our research. Such trials are crucial for refining and potentially enhancing current public health guidelines and clinical practices for the elderly, particularly in the context of leveraging vitamin D to mediate the impact of depression on cognitive function.
This study has several limitations. Firstly, although our findings are supported by existing literature suggesting potential pathways between vitamin D, depression, and cognitive function, the cross-sectional nature of our research means that it lacks the temporal data necessary to establish causality or the sequence of occurrence among these variables. Secondly, the study is geographically limited to the U.S. elderly population, and despite controlling for multiple variables, including some collected through self-reported questionnaires which may introduce bias, unmeasured confounders may still exist. Thirdly, because our analyses incorporated weights from NHANES' complex multi-stage sampling design, and many machine learning algorithms do not support analyses with complex weights, we employed path analysis. This analytical approach, being a conventional statistical method, is subject to limitations related to p-values, confidence intervals, and statistical power. Fourthly, while this study comprehensively analyzes the mediating role of depression in the … Future research should explore potential variables that influence vitamin D status, such as BMI, gender, and sun exposure, to expand the research directions and deepen our understanding of the underlying determinants.

Conclusion
In conclusion, our study utilizes NHANES data to uncover significant pathways by which vitamin D affects cognitive function through depression, with diabetes serving as a pivotal moderator. Throughout the mediation analysis, we employed NHANES' complex multi-stage sampling design with weighted data and validated the mediation effect proportion using a stratified weighted bootstrap approach. Our results are representative of the entire U.S. civilian, non-institutionalized elderly population. The findings highlight the need to concurrently address vitamin D sufficiency and mental health in developing cognitive health strategies for the elderly, emphasizing the importance of customized interventions for individuals with diabetes.

Model 1: Comprehensive mediation with chronic disease covariates. Model 2: Mediation with chronic disease covariates, selective direct effect without chronic disease covariates. Model 3: Mediation and direct effect without chronic disease covariates. Model 4: Direct effect with all covariates, mediation excluding chronic disease covariates. Model 5: Direct effect with only chronic disease covariates, mediation excluding chronic disease covariates. CFI, comparative fit index; TLI, Tucker-Lewis index; SRMR, standardized root mean square residual; AIC, Akaike information criterion; BIC, Bayesian information criterion; SABIC, sample-size adjusted Bayesian information criterion.

FIGURE 2
Weighted overall population path analysis diagram. The figure illustrates the path diagram for the variables analyzed in our selected SEM, using the weighted total population. Black lines with associated values represent path estimates on depression and cognitive function, respectively. The red lines highlight the study's focal path estimates. The value of 0.40 adjacent to 'Cognitive function' indicates the estimated residual variance for the dependent variable within the SEM. PIR, poverty income ratio; Heart, heart disease. A blue asterisk indicates statistical significance.

FIGURE 3
Diagram of weighted overall population and weighted diabetes subgroup pathways, along with stratified weighted bootstrap results. Red dots represent the medians of 1,000 resampling iterations, and red lines indicate confidence intervals.

FIGURE 4
Path analysis diagram for the weighted overall population with the addition of diabetes and depression interaction terms. The figure illustrates the path diagram for the variables analyzed in the selected SEM, using the weighted total population. Black lines with associated values represent path estimates on depression and cognitive function, respectively. The red lines highlight the focal path estimates of the mediation effect of depression moderated by diabetes. The value of 0.40 adjacent to 'Cognitive function' indicates the estimated residual variance for the dependent variable within the SEM. PIR, poverty income ratio; Heart, heart disease. A blue asterisk indicates statistical significance.

TABLE 1
Characteristics of adults aged 60 and over, stratified by cognitive function category in the 2011-2014 NHANES.

TABLE 2
Survey-weighted regression for independent & mediator variables with outcome.

TABLE 3
Comparative fit indices for weighted SEM models with varied inclusion of chronic disease covariates.

TABLE 4
Mediation analysis of 25(OH)D3 on cognitive function via depression.