ORIGINAL RESEARCH article

Front. Nutr., 26 November 2025

Sec. Nutritional Epidemiology

Volume 12 - 2025 | https://doi.org/10.3389/fnut.2025.1705683

Dietary patterns and obesity are associated with type 2 diabetes risk in elderly Chinese men: a machine learning approach


Haowei Sun¹, Lijin Zhu¹, Peng Wang¹, Keqing Yuan², Saida Salima Nawrin², Yufei Cui², Longfei Li¹,²*
  • ¹College of Physical Education and Health, Heze University, Heze, Shandong, China
  • ²Graduate School of Medicine, Tohoku University, Sendai, Japan

Background: Type 2 diabetes mellitus (T2DM) is a major global public health issue, with a particularly high prevalence in China, especially among older men. Obesity, dietary habits, and metabolic risk factors are key contributors to the development of T2DM. However, research on the relationship between dietary patterns, obesity, and T2DM in elderly Chinese men remains limited. Objective: This study aims to examine the links between obesity, dietary habits, blood pressure, and the risk of developing T2DM in elderly Chinese men. We utilize unsupervised machine learning methods along with SHAP-based model interpretation to identify significant lifestyle and metabolic factors associated with T2DM risk.

Methods: A cross-sectional study was conducted with 982 participants aged 60 years and older from community health centers in Heze City, China. Unsupervised machine learning methods (UMAP) were used to identify dietary patterns, and supervised machine learning with SHAP was applied to evaluate the importance of obesity, dietary patterns, and lifestyle factors on T2DM risk. Logistic regression analyses were performed to investigate the associations between obesity, dietary habits, blood pressure, and T2DM risk. Sensitivity analyses were performed to verify the robustness of the findings.

Results: Four distinct dietary patterns were identified: “high-fiber nutrient-dense,” “staple–protein,” “seafood-eggs,” and “sugary and processed foods.” The prevalence of newly diagnosed T2DM in males was 48.37%. Obesity was inversely associated with T2DM risk across all models (odds ratios: 0.272–0.278, all P < 0.05). Compared with the high-fiber nutrient-dense pattern, adherence to the staple–protein, seafood–eggs, and sugary and processed foods patterns was significantly associated with increased obesity and T2DM risk (all P < 0.01). Shapley Additive Explanations (SHAP) analysis highlighted dietary behaviors, total energy intake, and physical activity as major contributors to T2DM prediction. Sensitivity analyses confirmed the robustness of these associations, independent of total caloric intake and BMI.

Conclusion: In this population of elderly Chinese males, unhealthy dietary patterns are positively associated with obesity and T2DM risk, whereas obesity itself showed an inverse relationship with T2DM. These findings underscore the importance of promoting nutrient-dense diets and targeted lifestyle interventions to reduce T2DM risk in this population.

1 Introduction

Type 2 diabetes mellitus (T2DM) has become a significant global public health concern, with its incidence increasing consistently over recent decades (1, 2). In China, the impact of T2DM is especially severe (3), largely due to the country's rapid economic growth, accelerating urbanization, and consequent shifts in lifestyle patterns (4). These societal transformations have contributed to rising rates of obesity, unhealthy eating habits, and reduced physical activity—well-established risk factors that play a critical role in the onset of T2DM (5).

The development of type 2 diabetes mellitus is influenced by a combination of genetic susceptibility and environmental factors, reflecting its multifactorial nature (6). Among these, lifestyle behaviors that are subject to modification—such as dietary habits, physical inactivity, tobacco smoking, alcohol intake, and weight management—are recognized as key contributors to both the onset and progression of the disease (7). Increasing evidence indicates that unhealthy dietary patterns and excess body weight are closely associated with elevated blood pressure and impaired glucose metabolism, particularly among older adults, especially males (1, 8).

Numerous large-scale population studies have shown that a nutritious diet lowers the incidence of chronic illnesses, including T2DM (9–11), but most of this research has focused on Western populations. Much of it has also concentrated on individual food items, which limits the conclusions that can be drawn about optimal dietary patterns for health promotion (12). Dietary patterns encompass the quantity, variety, balance, and combination of the foods, beverages, and nutrients consumed regularly (13–15). In addition, a significant proportion of these investigations involve healthy adults or younger populations, with relatively little focus on vulnerable groups such as elderly individuals with diabetes. Recent cohort studies in China indicate that dietary patterns characterized by high intakes of vegetables, fruits, whole grains, and fish are significantly associated with a decreased risk of T2DM, whereas diets rich in red meat, refined grains, and sodium are linked to an increased risk (16, 17). Nevertheless, evidence regarding these dietary associations among elderly Chinese populations, especially males, a demographic considered particularly vulnerable and understudied, remains limited.

To address these gaps, advanced data-driven approaches such as unsupervised machine learning have been increasingly applied in nutritional epidemiology. Unlike traditional statistical methods (e.g., PCA or factor analysis), which require predefined assumptions about dietary groupings, unsupervised learning can uncover previously hidden or emergent dietary patterns in an unbiased manner, providing a more nuanced representation of real-world eating behaviors. In particular, Uniform Manifold Approximation and Projection (UMAP) has demonstrated superior capability in capturing non-linear relationships and preserving both local and global structure within high-dimensional dietary data, enabling more accurate characterization of complex dietary patterns (18–21). Moreover, Shapley Additive Explanations (SHAP) enable the interpretation of complex models by quantifying the contribution of individual features to predicted outcomes, thereby enhancing both transparency and clinical relevance (22). Despite their potential, relatively few studies have employed these methodologies to examine dietary patterns (23, 24). This unsupervised methodology facilitates the data-driven identification of naturally occurring dietary patterns, as opposed to relying on predefined categories, thereby uncovering subtle and previously unrecognized relationships between diet and disease.

Therefore, the present study aims to examine the associations between obesity, dietary patterns, blood pressure, and T2DM risk among elderly Chinese males. By leveraging UMAP for dietary pattern extraction and SHAP for model interpretability, this study seeks to identify key nutritional and metabolic predictors of T2DM and provide evidence to inform targeted prevention strategies and public health interventions in China.

2 Materials and methods

2.1 Study design and participants

This cross-sectional study was conducted to investigate the associations between obesity, dietary patterns, blood pressure, and the risk of type 2 diabetes mellitus in elderly Chinese males. A total of 982 participants aged ≥60 years were recruited from community health centers in Heze City between 2018 and 2019. Participants were selected using a multistage stratified random sampling method to ensure representativeness. Of the 982 participants, 453 (46.1%) were rural residents and 529 (53.9%) were urban residents.

Inclusion criteria were: (1) male sex and age ≥60 years; (2) residence in the study area for at least 1 year; and (3) ability to provide informed consent. Exclusion criteria were: (1) diagnosis of type 1 diabetes; (2) severe cardiovascular, renal, or hepatic disease; (3) cognitive impairment that affected the ability to complete dietary assessments; and (4) use of medications that significantly alter metabolic function. The study was conducted in accordance with the Declaration of Helsinki and was approved by the Institutional Review Board of Heze University (Approval No. 20180314). All participants or their families signed informed consent documentation before sample collection.

2.2 Diagnosis of obesity

Anthropometric measurements were conducted following standardized protocols. Height and weight were measured using a calibrated stadiometer and a digital scale, respectively, with participants wearing light clothing and no shoes. Body mass index (BMI) was calculated as weight (kg) divided by height squared (m²) (25). Subjects were categorized into four BMI groups according to the World Health Organization guidelines for Asians: underweight (<18.5 kg/m²), normal weight (18.5–22.9 kg/m²), overweight (23–24.9 kg/m²), and general obesity (≥25 kg/m²) (26). General obesity was additionally defined as BMI ≥28 kg/m², based on the criteria recommended by the Working Group on Obesity in China (WGOC) (28). Abdominal (central) obesity was defined as a waist circumference (WC) ≥90 cm in men (27, 29).
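The cut-offs above can be sketched as a small classifier. This is an illustrative sketch only: the function names are hypothetical, and treating values between 22.9 and 23 kg/m² as normal weight is an assumption, since the text does not say how that gap was handled.

```python
def bmi(weight_kg: float, height_m: float) -> float:
    """Body mass index: weight (kg) divided by height squared (m^2)."""
    if height_m <= 0:
        raise ValueError("height must be positive")
    return weight_kg / height_m ** 2


def bmi_category(value: float) -> str:
    """Four-group classification using the Asian cut-offs quoted in the text."""
    if value < 18.5:
        return "underweight"
    if value < 23.0:   # assumption: 22.9-23.0 is counted as normal weight
        return "normal weight"
    if value < 25.0:
        return "overweight"
    return "general obesity"


def wgoc_general_obesity(value: float) -> bool:
    """Alternative general-obesity definition (WGOC): BMI >= 28 kg/m^2."""
    return value >= 28.0


def abdominal_obesity(waist_cm: float) -> bool:
    """Abdominal (central) obesity in men: waist circumference >= 90 cm."""
    return waist_cm >= 90.0
```

Note that a man with a BMI of 26 kg/m² is obese under the first definition but not under the WGOC definition, so any reanalysis needs to state which one is used.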

2.3 Blood pressure

Blood pressure (BP) was measured using an automated sphygmomanometer after participants had rested for at least 5 min in a seated position. Three measurements were taken at 1-min intervals, and the average of the last two readings was recorded. Hypertension was defined as systolic BP ≥140 mmHg, diastolic BP ≥90 mmHg, or current use of antihypertensive medication (30).
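As a minimal sketch of the rule just described (the function and argument names are hypothetical), the recorded value discards the first reading and averages the remaining two:

```python
from statistics import mean


def recorded_bp(readings):
    """Average the last two of three (systolic, diastolic) readings in mmHg."""
    if len(readings) != 3:
        raise ValueError("expected exactly three readings")
    last_two = readings[1:]
    systolic = mean(s for s, _ in last_two)
    diastolic = mean(d for _, d in last_two)
    return systolic, diastolic


def is_hypertensive(systolic, diastolic, on_antihypertensives=False):
    """Hypertension: SBP >= 140, DBP >= 90, or current antihypertensive use."""
    return systolic >= 140 or diastolic >= 90 or on_antihypertensives
```

For example, readings of 150/95, 138/88, and 142/86 mmHg give a recorded value of 140/87 mmHg, which meets the systolic criterion even though the diastolic value does not.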

2.4 Diagnosis of diabetes

Fasting blood samples were collected in the morning after at least 8 h of overnight fasting. Blood glucose, glycated hemoglobin (HbA1c), and lipid profiles (total cholesterol, triglycerides, high-density lipoprotein cholesterol, and low-density lipoprotein cholesterol) were analyzed using standard enzymatic methods in a certified laboratory. Type 2 diabetes mellitus was defined according to the American Diabetes Association criteria as fasting plasma glucose (FPG) ≥7.0 mmol/L, HbA1c ≥6.5%, self-reported physician diagnosis, or current use of antidiabetic medication (31).
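The case definition is a logical OR over four criteria, which can be written out as follows. The function and argument names are hypothetical, and handling a missing laboratory value as "criterion not met" is an assumption rather than something stated in the text.

```python
def has_t2dm(fpg_mmol_l=None, hba1c_percent=None,
             physician_diagnosis=False, on_antidiabetic_medication=False):
    """T2DM if any one criterion is met; missing lab values count as not met."""
    fpg_positive = fpg_mmol_l is not None and fpg_mmol_l >= 7.0
    hba1c_positive = hba1c_percent is not None and hba1c_percent >= 6.5
    return (fpg_positive or hba1c_positive
            or physician_diagnosis or on_antidiabetic_medication)
```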

2.5 Dietary assessment and pattern analysis

A validated semi-quantitative food frequency questionnaire (FFQ) comprising 81 items was used to assess food and nutrient intake (32). Trained dietitians conducted structured interviews in which participants were asked to recall the frequency of their consumption of each food item over the previous 4 weeks (33). To facilitate accurate estimation of portion sizes, trained interviewers used food photographs depicting standard reference portions. For liquid items such as milk, juice, and other beverages, intake was recorded in milliliters (ml) and subsequently converted to grams for analysis purposes (34). The frequency of food intake was recorded using seven categories: (1) never; (2) less than once per week; (3) once per week; (4) 2–3 times per week; (5) 4–6 times per week; (6) once per day; and (7) two or more times per day. The 81 food items were classified into 21 predefined food groups according to similarities in nutrient profiles, culinary use, and grouping schemes adopted in previous studies (32, 35). The average intake of each food item was calculated by multiplying the reported frequency by the portion size, using data from the China Food Composition table (2009) as a reference (36).
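The frequency-times-portion calculation can be illustrated as below. The text does not give the numeric weight assigned to each frequency category, so the per-day values in `TIMES_PER_DAY` are assumptions (range midpoints, with "two or more times per day" set to 2), and the names are hypothetical.

```python
# Assumed conversion of the seven FFQ categories to times per day.
TIMES_PER_DAY = {
    1: 0.0,       # never
    2: 0.5 / 7,   # less than once per week (assumed 0.5 times/week)
    3: 1.0 / 7,   # once per week
    4: 2.5 / 7,   # 2-3 times per week (midpoint)
    5: 5.0 / 7,   # 4-6 times per week (midpoint)
    6: 1.0,       # once per day
    7: 2.0,       # two or more times per day (assumed lower bound)
}


def daily_intake_g(category: int, portion_g: float) -> float:
    """Average daily intake (g/day) = frequency per day x portion size (g)."""
    return TIMES_PER_DAY[category] * portion_g


def group_intake_g(items):
    """Sum daily intake over the (category, portion_g) items of one food group."""
    return sum(daily_intake_g(cat, portion) for cat, portion in items)
```

Under these assumed weights, a 140 g portion eaten 2–3 times per week contributes about 50 g/day.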

2.6 Clustering dietary patterns using machine learning

To identify dietary patterns, Uniform Manifold Approximation and Projection (UMAP) was applied in conjunction with varimax rotation (37). Dimensionality reduction was first conducted using UMAP, followed by the extraction of latent features through factor analysis based on the raw dataset. The UMAP parameters were set as follows: umap_min_dist = 0.1, max_components = 2, n_neighbors = 10, and Chebyshev distance as the metric. These values were optimized through systematic parameter tuning to ensure the generation of distinct and well-separated clusters.

To determine the optimal number of clusters, the two-dimensional UMAP-reduced features were passed to K-means clustering with default settings. The number of clusters was determined using the elbow method and confirmed against the number of peaks in the density contour map.
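The elbow method compares the within-cluster sum of squares (inertia) across candidate values of K and looks for the point where further increases stop paying off. The sketch below is a dependency-free illustration on a toy two-dimensional embedding, not the authors' pipeline: the toy data and the deterministic farthest-point initialization are assumptions made so that the example is reproducible, whereas the study used K-means with default settings.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two 2D points."""
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2


def farthest_point_init(points, k):
    """Deterministic initialization: repeatedly pick the point farthest from
    the centers chosen so far (an assumption, not the library default)."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centers)))
    return centers


def kmeans_inertia(points, k, max_iter=100):
    """Run Lloyd's algorithm and return the within-cluster sum of squares."""
    centers = farthest_point_init(points, k)
    for _ in range(max_iter):
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(k), key=lambda j: sq_dist(p, centers[j]))
            clusters[nearest].append(p)
        updated = [
            (sum(x for x, _ in cl) / len(cl), sum(y for _, y in cl) / len(cl))
            if cl else centers[j]          # keep the old center if a cluster empties
            for j, cl in enumerate(clusters)
        ]
        if updated == centers:             # converged
            break
        centers = updated
    return sum(min(sq_dist(p, c) for c in centers) for p in points)


# Toy stand-in for a UMAP embedding: four tight groups of five points each.
group_centres = [(0, 0), (10, 0), (0, 10), (10, 10)]
offsets = [(0, 0), (0.5, 0), (-0.5, 0), (0, 0.5), (0, -0.5)]
embedding = [(cx + dx, cy + dy) for cx, cy in group_centres for dx, dy in offsets]

inertia_by_k = {k: kmeans_inertia(embedding, k) for k in range(1, 9)}
for k, inertia in inertia_by_k.items():
    print(k, round(inertia, 2))
```

On this toy data the inertia falls from 1004 at K = 1 to 4 at K = 4, and values beyond K = 4 can only shave off part of that small remainder, which is the flattening the elbow method looks for. On real embeddings the bend is usually less sharp, which is presumably why the contour-map peaks were used as a second check.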

The two-dimensional representations produced by UMAP were then subjected to factor analysis with varimax rotation, and the resulting factors were interpreted based on their loadings on the 21 predefined food groups. Cluster labeling was performed according to the relative contributions of each food group to the factors, with “trend values” reflecting the standardized factor scores across clusters.

To complement the unsupervised identification of dietary patterns, a supervised classification model was subsequently applied to evaluate disease risk. Specifically, the dietary clusters derived from the UMAP–K-means procedure were incorporated as input features—together with metabolic and lifestyle variables—into an XGBoost classifier trained to predict Type 2 diabetes mellitus (T2DM) status. This hybrid analytical framework enabled both data-driven discovery of dietary structures and quantitative assessment of their predictive relevance for T2DM. The interpretability of the model was further enhanced by applying SHAP, which quantified the relative contribution of each feature to predicted T2DM risk.

To evaluate the relative contribution of various features to Type 2 diabetes mellitus risk, we trained an XGBoost classifier using the study dataset. Model performance was assessed using 5-fold cross-validation. Hyperparameters were optimized via grid search to maximize predictive performance. To interpret the contribution of individual features to T2DM risk, we applied SHAP (SHapley Additive exPlanations) analysis using the shap-viz package. SHAP values quantify the impact of each feature on the model's predictions, allowing for ranking of the most influential variables. Force plots and summary plots were generated to visualize how dietary patterns, energy intake, and lifestyle factors influence T2DM risk, thereby enhancing interpretability and providing insight into potential mechanisms linking these factors to disease onset.
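To make the quantity behind the SHAP plots concrete, the sketch below computes exact Shapley values for a toy three-feature risk score by averaging each feature's marginal contribution over all feature orderings. It is not the study's XGBoost model or the SHAP library's tree algorithm: the scoring function, its coefficients, and the use of a fixed baseline vector to represent "absent" features are all assumptions for illustration.

```python
from itertools import permutations


def shapley_values(predict, x, baseline):
    """Exact Shapley values: average marginal contribution of each feature
    over every ordering in which features are switched from baseline to x."""
    n = len(x)
    phi = [0.0] * n
    orderings = list(permutations(range(n)))
    for order in orderings:
        current = list(baseline)
        previous = predict(current)
        for i in order:
            current[i] = x[i]              # reveal feature i
            value = predict(current)
            phi[i] += value - previous     # marginal contribution of feature i
            previous = value
    return [total / len(orderings) for total in phi]


def toy_risk(features):
    """Hypothetical risk score with one interaction term (illustration only)."""
    diet, energy, activity = features
    return 0.3 * diet + 0.2 * energy - 0.1 * activity + 0.05 * diet * energy


feature_names = ["dietary pattern", "total energy intake", "physical activity"]
x = [1.0, 1.0, 1.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(toy_risk, x, baseline)
for name, value in zip(feature_names, phi):
    print(f"{name}: {value:+.3f}")
```

The attributions sum to the difference between the prediction for `x` and the prediction for the baseline, and the interaction term is split equally between the two features involved. Brute-force enumeration scales factorially with the number of features, so it is only practical for toy cases like this one.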

Unsupervised machine learning algorithms were implemented using the Python programming language (version 3.7.10) within the Anaconda distribution (version 2.1.1) on a Mac operating system (macOS 12.5). The required libraries, including umap-learn (version 0.3.10) and scikit-learn (version 0.21.3), were installed and executed in a Jupyter Notebook environment (version 6.4.6).

2.7 Statistical analysis

Statistical analyses were conducted using JMP software (version 16.2.0) and Python (version 3.7.10). The distribution of variables was assessed for normality using the Kolmogorov–Smirnov test. Since all continuous variables met the assumption of normality, no transformations were necessary. Descriptive statistics were calculated for each cluster, with means and standard deviations reported for continuous variables, and frequencies with percentages for categorical variables. To examine differences related to T2DM, analysis of variance (ANOVA) was applied to continuous data, while Chi-square tests were utilized for categorical variables.

Other lifestyle-related and sociodemographic information—including age, smoking status, sleep quality, sleep duration, living arrangements, and educational level—was obtained through a self-administered questionnaire. Average sleep duration per night was estimated based on participants' self-reported bedtime and wake-up time, without differentiating between weekdays and weekends. Sleep duration was then categorized into two groups: “6–8 h per day” (considered typical) and “atypical” (shorter or longer than 6–8 h). Educational attainment was defined by the highest level of formal education completed and categorized into four groups: “Illiterate,” “Primary school,” “Middle school,” and “College or above.” Physical activity was assessed using the International Physical Activity Questionnaire (IPAQ). Total weekly physical activity was calculated in terms of metabolic equivalents (METs) multiplied by hours per week (METs × h/week). Blood samples were collected in siliconized vacuum tubes containing sodium fluoride for biochemical analysis. Fasting blood glucose (FBG) was measured using an enzymatic method (Eerotec). Serum uric acid (UA) levels were determined enzymatically using the Pureauto SUA kit (Sekisui Medical Co. Ltd.). Triglycerides, low-density lipoprotein cholesterol (LDL-C), and high-density lipoprotein cholesterol (HDL-C) levels were also measured using enzymatic methods with appropriate reagent kits. Participants reported any history of physical illness and current use of medication, including treatment for diabetes and dyslipidemia, via a binary (“yes” or “no”) format. A two-tailed P-value < 0.05 was considered statistically significant.
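The sleep-duration variable described above can be derived as in the following sketch. The 24-hour "HH:MM" input format and the inclusive treatment of the 6 h and 8 h boundaries are assumptions; the function names are hypothetical.

```python
def sleep_duration_h(bedtime: str, wake_time: str) -> float:
    """Hours between bedtime and wake-up time, allowing for sleep that
    crosses midnight. Times are 24-hour 'HH:MM' strings."""
    bed_h, bed_m = map(int, bedtime.split(":"))
    wake_h, wake_m = map(int, wake_time.split(":"))
    minutes = ((wake_h * 60 + wake_m) - (bed_h * 60 + bed_m)) % (24 * 60)
    return minutes / 60


def sleep_category(hours: float) -> str:
    """Two-group classification; boundaries assumed to be inclusive."""
    return "6–8 h per day" if 6 <= hours <= 8 else "atypical"
```

A participant reporting a 22:30 bedtime and a 06:00 wake-up time is assigned 7.5 h and falls in the typical group.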

2.8 Sensitivity analyses

To evaluate the robustness of our findings, we conducted sensitivity analyses focusing on BMI and total energy intake. Participants were stratified by BMI categories (underweight, normal weight, overweight, and obese), and multivariate logistic regression models were rerun to assess whether associations between obesity and T2DM were consistent across BMI strata. Additionally, total daily energy intake was estimated from the validated food frequency questionnaire (FFQ), and participants were divided into quartiles based on energy intake. Multivariate logistic regression models were subsequently reanalyzed with total energy intake included as a covariate to determine whether the observed associations between dietary patterns, obesity, blood pressure, and T2DM risk remained robust after adjustment for caloric intake. These sensitivity analyses served as an internal validation approach to examine the robustness of the observed associations under different model assumptions, thereby increasing confidence in the stability of the findings.
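Dividing participants into energy-intake quartiles can be sketched with the standard library. The cut-point convention is an assumption: `statistics.quantiles` defaults to the "exclusive" method, and other software may place boundaries slightly differently, which matters only for values close to a cut-point. The function name and the example intakes are hypothetical.

```python
from statistics import quantiles


def energy_quartiles(energy_kcal):
    """Label each participant's daily energy intake as Q1 (lowest) to Q4."""
    q1, q2, q3 = quantiles(energy_kcal, n=4)   # three cut-points

    def label(value):
        if value <= q1:
            return "Q1"
        if value <= q2:
            return "Q2"
        if value <= q3:
            return "Q3"
        return "Q4"

    return [label(value) for value in energy_kcal]


# Hypothetical intakes in kcal/day, deliberately unsorted.
intakes = [2200, 1400, 2800, 1800, 2600, 1600, 2000, 2400]
print(energy_quartiles(intakes))
```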

3 Results

3.1 Descriptive characteristics by type 2 diabetes mellitus

As shown in Table 1, 48.37% of the male participants were newly diagnosed with Type 2 diabetes mellitus. Notably, an inverse association between obesity and T2DM was observed in subsequent analyses (Table 1). No statistically significant differences were observed between diabetic and non-diabetic males regarding age, body mass index (BMI), hip and waist measurements, level of physical activity, systolic and diastolic blood pressure, income, serum insulin, HOMA-IR values, LDL and HDL cholesterol, triglyceride levels, C-reactive protein (CRP), uric acid concentration, fasting plasma glucose, total caloric intake, glycated hemoglobin (HbA1c), education level, tobacco and alcohol use, family history of diabetes, duration and quality of sleep, marital and employment status, residential area, presence of general or abdominal obesity, or ethnic background. Nonetheless, males with diabetes were more frequently prescribed antidiabetic drugs and had a higher likelihood of reporting a family history of hypertension compared with those without T2DM (P < 0.05).

Table 1. Demographic characteristics of the diabetes.

3.2 Clustering dietary patterns using unsupervised machine learning

A dataset with 81 dimensions (81 variables) was reduced to two dimensions using the UMAP algorithm. The first and second dimensions were plotted on the x and y axes. A 2D scatter plot of the clusters is presented in Figure 1. A density contour map overlaid on the scatter plot revealed four distinct peaks, indicating four natural dietary clusters. These clusters represent four principal dietary patterns in the study population.

Figure 1
Scatter plot with contours illustrating K-means clustering. Clusters A, B, C, and D are shown in red, blue, green, and black, respectively, on UMAP axes, with distinct regions for each cluster.

Figure 1. Scatter plot and contour map.

The data were reduced to two dimensions [Dimension 1 (D1) and Dimension 2 (D2)] using UMAP. Each point represents an individual participant. The contour lines connect points of equal estimated (Gaussian) density in the 2D space. The number of clusters (n = 4) was chosen by contour map analysis. Colors indicate cluster assignment by K-means clustering (K = 4; see “Section 2” for details).

3.3 UMAP-derived visualization of four principal dietary patterns based on factor loadings

To investigate dominant dietary behavior patterns, we conducted a factor analysis combined with Uniform Manifold Approximation and Projection (UMAP), using categorized food groups as the basis for analysis. Table 2 shows the factor loadings for each dietary pattern, highlighting the food groups that contributed substantially in either direction (loadings ≥0.25 or ≤ −0.25). The first pattern, referred to as “high-fiber nutrient-dense,” was characterized by high consumption of vegetables, legumes, mushrooms, tubers, fruits, whole grains, seaweed, wheat-based products, nuts, and dairy. The second pattern, labeled “staple–protein,” was associated with greater intake of white rice, red and poultry meats, organ meats, eggs, legumes, and various condiments. The third pattern, termed “seafood–eggs,” was marked by frequent consumption of seafood, eggs, dairy, nuts, fruits, and tea, while showing an inverse relationship with sugar-sweetened beverages. The fourth dietary pattern, described as “sugary and processed foods,” was characterized by high intake of fast food, sugary drinks, desserts, and alcoholic beverages, alongside lower consumption of dairy and white rice. Both the high-fiber nutrient-dense and seafood–eggs patterns provide high-quality protein and fiber, but the former emphasizes vegetables and whole grains, while the latter focuses on seafood and eggs. The staple–protein pattern relies on refined carbohydrates and red meat, with moderate sugar intake. The sugary and processed foods pattern is energy-dense but lacks fiber and high-quality protein. Cumulatively, the four dietary patterns accounted for 12.33, 24.62, 33.03, and 39.10% of the total variance in dietary intake as each pattern was added in turn.
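The four variance figures are most naturally read as cumulative totals, because their successive differences reproduce the 6.07–12.33% per-pattern range quoted in the Discussion. The arithmetic can be checked directly; this reading is an inference from the reported numbers, not a statement from the authors.

```python
# Cumulative variance explained (%) as each pattern is added, from the text.
cumulative = [12.33, 24.62, 33.03, 39.10]

# Per-pattern share = difference between successive cumulative totals.
individual = [
    round(current - previous, 2)
    for previous, current in zip([0.0] + cumulative[:-1], cumulative)
]
print(individual)
```

The per-pattern shares come out as 12.33, 12.29, 8.41, and 6.07%, which sum back to 39.10%.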

Table 2. Description of dietary patterns identified through factor analysis and UMAP.

3.4 Importance of different features interpreted by SHAP value

The SHAP visualization (Figure 2) illustrates the relative contribution of the top 20 features used by an XGBoost machine learning model to predict the risk of Type 2 diabetes mellitus in elderly Chinese males. The model was trained using a comprehensive feature set, including demographic information, anthropometric measures, blood pressure, dietary intake, and physical activity variables. Its performance was evaluated via five-fold cross-validation, achieving an average area under the ROC curve of 0.83 and an accuracy of 84% (see Supplementary Figure 1), indicating a high level of discriminatory capability. Notably, dietary behaviors, total energy consumption, and physical activity levels were identified as the most prominent negative predictors. To better elucidate how dietary patterns influence the model's predictions and to clarify the underlying mechanisms, we used the shap-viz package to produce force plots (also shown in Figure 2). This figure displays the ranked influence of each feature on T2DM risk and conveys the overall predictive magnitude. In the visualization, red-colored features denote variables that contribute to a decreased probability of T2DM onset. These visual outputs enhance model interpretability and support users in understanding the basis of predictions, thereby facilitating more informed dietary choices.

Figure 2
SHAP summary plot illustrating the average impact of features like dietary patterns and total energy intake on a model output, with dietary patterns having the highest mean SHAP value. SHAP waterfall plot showing the distribution of feature impacts such as dietary patterns and physical activity levels on model output, with color indicating feature value gradient from high to low. SHAP force plot visualizing individual prediction impact with features like C-reactive protein and dietary patterns contributing positively, and DBP and education level contributing negatively.

Figure 2. SHAP values of dietary patterns for T2DM model. (A) SHAP summary plot. Absolute feature contribution of the 20 features with the highest mean absolute SHAP value. (B) SHAP waterfall plot. Complete distribution of the SHAP values for the top 20 features based on the highest mean absolute SHAP value. Each sample of the test set is represented as a data point per feature, and the x axis shows the positive or negative effect on the model's prediction of the feature. The color coding depicts the value of the feature and is scaled independently based on the range observed in the data. (C) SHAP force plot. Representative force plots, depicting how the features contributed to the prediction for specific data points.

3.5 Association of clusters with type 2 diabetes mellitus and obesity

Table 3 displays the outcomes of five sequential logistic regression models assessing the link between obesity and the risk of Type 2 diabetes mellitus (T2DM) based on a cross-sectional analysis. Throughout all models, obesity was consistently found to be inversely related to the risk of T2DM, as evidenced by odds ratios between 0.272 and 0.278, all with P-values under 0.05. This negative association remained robust even with the addition of covariates in each subsequent model.

Table 3. Multivariable logistic regression analysis of the association between obesity and type 2 diabetes mellitus.

In the multivariable logistic regression analysis, dietary patterns were significantly associated with obesity risk among elderly Chinese males (Table 4). Using the high-fiber nutrient-dense dietary pattern as the reference, participants adhering to the staple–protein pattern exhibited a higher risk of obesity across all models (Model 1: β = 0.352, 95% CI: 0.120–0.584, P = 0.002; Model 5: β = 0.348, 95% CI: 0.128–0.568, P = 0.003). Similarly, the seafood–eggs pattern was positively associated with obesity (Model 1: β = 0.415, 95% CI: 0.182–0.648, P = 0.002; Model 5: β = 0.386, 95% CI: 0.159–0.613, P = 0.003). Notably, the sugary and processed foods pattern demonstrated the strongest association with obesity risk, with consistently significant effects after full adjustment (Model 1: β = 0.601, 95% CI: 0.372–0.830, P = 0.002; Model 5: β = 0.581, 95% CI: 0.358–0.804, P = 0.003). These associations remained robust after adjusting for a wide range of sociodemographic, lifestyle, biochemical, and clinical covariates, suggesting that dietary patterns characterized by lower nutritional quality are independently linked to increased obesity risk in this population.
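The obesity results are reported as β coefficients, whereas the T2DM results elsewhere are reported as odds ratios. Assuming the β values are log-odds coefficients from the logistic models, as the text implies, exponentiating the coefficient and its confidence limits puts both sets of results on the same scale. The helper name below is hypothetical.

```python
import math


def to_odds_ratio(beta, ci_low, ci_high):
    """Convert a logistic-regression coefficient and its 95% CI to an odds ratio."""
    return math.exp(beta), math.exp(ci_low), math.exp(ci_high)


# Model 1 coefficients quoted in the text (reference: high-fiber nutrient-dense).
model_1 = {
    "staple-protein": (0.352, 0.120, 0.584),
    "seafood-eggs": (0.415, 0.182, 0.648),
    "sugary and processed foods": (0.601, 0.372, 0.830),
}

for pattern, coefficients in model_1.items():
    odds, low, high = to_odds_ratio(*coefficients)
    print(f"{pattern}: OR {odds:.2f} (95% CI {low:.2f} to {high:.2f})")
```

Because every lower confidence limit for β is above zero, every corresponding odds ratio interval lies above 1, which matches the reported significance.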

Table 4. Multivariable logistic regression analysis of the association between dietary patterns and obesity.

As shown in Table 5, all three non-reference dietary patterns were significantly associated with increased risk of type 2 diabetes mellitus (T2DM) compared with the high-fiber nutrient-dense pattern. In Model 1, ORs were 1.31 (95% CI: 1.05–1.64) for staple–protein, 1.28 (95% CI: 1.02–1.61) for seafood–eggs, and 1.54 (95% CI: 1.20–1.97) for sugary and processed foods, with P = 0.012. These associations remained significant across Models 2–5 after progressively adjusting for demographic factors, lifestyle behaviors, metabolic biomarkers, blood pressure, T2DM status, and family history or medication use (all P < 0.01), indicating a consistent positive relationship between unhealthy dietary patterns and T2DM risk in this population.

Table 5. Multivariable logistic regression analysis of the association between dietary patterns and type 2 diabetes mellitus.

3.6 Sensitivity analysis

To assess the robustness of our findings, sensitivity analyses were performed by categorizing participants according to BMI and dividing them into quartiles based on total daily energy intake, estimated using the validated FFQ. Multivariate logistic regression models were then rerun with BMI and energy intake included as covariates. These adjustments had minimal impact on the associations between dietary patterns, obesity, blood pressure, and T2DM risk: effect sizes and statistical significance were largely unchanged compared with the original models (see Supplementary Tables 1–6), and the associations remained robust across BMI strata and after adjustment for total energy intake.

3.7 Age-matched analysis

As shown in Figure 3 (Supplementary materials), the distribution of Type 2 diabetes mellitus was imbalanced across different levels of obesity and clusters.

Figure 3
Dot plot showing age distribution from sixty to ninety across three categories: General Obesity, Normal Weight, and Overweight. Red dots indicate diabetes (DM) absence, and blue dots indicate its presence.

Figure 3. A dot plot was used to illustrate the age distribution across different scales. The color coding represents each scale distinctly.

4 Discussion

This study explored the associations between obesity, dietary patterns, blood pressure, and type 2 diabetes mellitus (T2DM) risk in elderly Chinese males, using unsupervised machine learning techniques and SHAP-based model interpretation. We identified four distinct dietary patterns—“high-fiber nutrient-dense,” “staple–protein,” “seafood-eggs,” and “sugary and processed foods”—and examined their impact, alongside obesity and blood pressure, on the prevalence and risk of T2DM. Furthermore, the results remained similar even when sensitivity analyses were performed. The findings provide important insights into modifiable risk factors and their role in predicting diabetes outcomes.

4.1 Type 2 diabetes mellitus prevalence and clinical characteristics

The overall prevalence of newly diagnosed type 2 diabetes mellitus among elderly males was notably high (48.37%), substantially exceeding the reported national prevalence among Chinese adults aged ≥60 years, which ranges from ~20 to 25% (38). Interestingly, despite the high prevalence of type 2 diabetes, no significant differences were observed in BMI, waist circumference, blood pressure, lipid profile, or other metabolic markers between participants with and without diabetes. This finding may reflect the relatively homogeneous characteristics of this elderly population, many of whom may have adopted long-term lifestyle modifications or health management strategies that attenuate traditional risk factors (39, 40). Furthermore, unmeasured factors—such as habitual dietary patterns, genetic predisposition, or ongoing medication use—could contribute to diabetes risk and partially explain the absence of significant differences (41, 42). Comparable results have been reported in other older populations, indicating that standard anthropometric and metabolic markers may have limited utility to discriminate diabetes risk among elderly individuals (30, 43).

4.2 Machine learning-based dietary pattern classification

To overcome the limitations of conventional dietary pattern analyses, we employed UMAP (Uniform Manifold Approximation and Projection) to reduce dimensionality and cluster high-dimensional dietary intake data (19). This unsupervised approach effectively identified four distinct dietary patterns: “high-fiber nutrient-dense,” “staple–protein,” “seafood–eggs,” and “sugary and processed foods,” demonstrating the capacity of machine learning methods to uncover latent dietary structures without predefined assumptions. The high-fiber nutrient-dense pattern was characterized by high intake of plant-based and nutrient-rich foods, whereas the sugary and processed foods pattern was characterized by high consumption of processed and high-sugar items, reflecting contrasting dietary behaviors. The “staple–protein” and “seafood–eggs” patterns occupied intermediate positions and reflected regionally or culturally typical dietary habits. These four dietary patterns accounted for 6.07%–12.33% of the variance in dietary intake individually, and cumulatively explained up to 39.10% of the total variance. Although the percentage of variance explained by individual clusters may appear modest, this is consistent with previous studies on dietary patterns, where typical variance explained ranges from 5% to 30% due to the inherent complexity and heterogeneity of dietary behaviors (44, 45). Therefore, the identified patterns capture meaningful distinctions in dietary habits among elderly Chinese men and provide a robust basis for examining associations with obesity and type 2 diabetes risk.
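As a rough illustration of the "embed, then cluster" idea described above, the sketch below groups points in a two-dimensional embedding into four clusters. It rests on several assumptions: the embedding is synthetic (random points standing in for a UMAP projection of dietary intake data, not study data), and k-means is used only as a stand-in, since this section does not state which clustering algorithm was applied to the embedding. The function name `kmeans` and all parameter values are hypothetical.

```python
import math
import random
from collections import Counter


def kmeans(points, k, iters=50):
    """Lloyd's k-means on 2-D points; returns (labels, centroids)."""
    # Deterministic farthest-point initialisation: start from the first point,
    # then repeatedly add the point farthest from the centroids chosen so far.
    centroids = [points[0]]
    while len(centroids) < k:
        farthest = max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        centroids.append(farthest)
    labels = None
    for _ in range(iters):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j])) for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the algorithm has converged
        labels = new_labels
        # Update step: each centroid moves to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return labels, centroids


# Synthetic 2-D embedding (assumption): four well-separated groups of 25 points,
# standing in for the output of a dimensionality-reduction step such as UMAP.
rng = random.Random(0)
centers = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0), (10.0, 10.0)]
embedding = [
    (cx + rng.gauss(0, 0.5), cy + rng.gauss(0, 0.5))
    for cx, cy in centers
    for _ in range(25)
]

labels, centroids = kmeans(embedding, k=4)
cluster_sizes = sorted(Counter(labels).values())
print(cluster_sizes)
```

In a real analysis the cluster labels would then be cross-tabulated against food-group intakes to name each pattern; the choice of four clusters here simply mirrors the number of patterns reported.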

4.3 Associations between dietary patterns, obesity, and T2DM

Our results demonstrate that elderly males adhering to dietary patterns with lower nutritional quality are at higher risk of obesity (46). The sugary and processed foods pattern had the strongest association with obesity, suggesting that diets high in refined sugars and processed items may drive adiposity more than traditional staple-based or seafood-rich patterns (47, 48). These findings align with prior studies indicating that nutrient-dense, high-fiber diets are protective against excessive weight gain, whereas energy-dense, nutrient-poor diets promote metabolic dysregulation (49). The robustness of these associations after adjusting for sociodemographic, lifestyle, biochemical, and clinical factors underscores the independent role of dietary quality in obesity risk (50).

All non-reference dietary patterns were significantly associated with higher T2DM prevalence compared with the high-fiber nutrient-dense pattern. The staple–protein, seafood–eggs, and sugary and processed foods patterns showed consistent positive associations across fully adjusted models. Unhealthy dietary patterns were associated with impaired glycemic regulation in this cross-sectional analysis, potentially through mechanisms such as increased insulin resistance, chronic inflammation, or unfavorable lipid profiles (50, 51). Sensitivity analyses further indicated that these associations were largely independent of total energy intake, emphasizing that the type and combination of foods consumed may exert a stronger influence on T2DM risk than overall caloric load (52, 53).

Stratified analyses revealed that T2DM prevalence varied across dietary clusters and adiposity levels, indicating potential interactions between diet and body fat in influencing diabetes risk (54). Although overall blood pressure did not differ significantly between diabetic and non-diabetic participants, it may still serve as a modifier in diabetes pathophysiology through sex-specific hormonal or vascular mechanisms (55, 56). The inverse association observed between obesity and T2DM risk may reflect survivor bias, age-related changes in fat distribution, or metabolic heterogeneity among older adults, highlighting the complexity of risk factor interactions in this demographic (57, 58). Our sensitivity analyses demonstrated that the associations between dietary patterns, obesity, and T2DM risk persisted after accounting for BMI stratification and caloric intake, indicating the internal stability and robustness of the findings.
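To make the notion of an inverse association concrete, the following sketch computes an unadjusted odds ratio and a Wald 95% confidence interval from a 2×2 table. The counts are invented for illustration and are not taken from this study, and the calculation is a simplification: the associations reported here come from multivariable logistic regression with covariate adjustment, which this crude estimate does not reproduce. The function name `odds_ratio_ci` is hypothetical.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Unadjusted odds ratio with a Wald confidence interval.

    a: exposed with outcome      b: exposed without outcome
    c: unexposed with outcome    d: unexposed without outcome
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the usual large-sample approximation.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts (assumption, not study data):
# 25 of 100 exposed participants and 40 of 100 unexposed participants have the outcome.
odds_ratio, lower, upper = odds_ratio_ci(a=25, b=75, c=40, d=60)
print(f"OR = {odds_ratio:.2f}, 95% CI {lower:.2f} to {upper:.2f}")
```

An odds ratio below 1 whose interval excludes 1, as in this toy table, is what "inversely associated" means in the text; whether such an estimate reflects a true protective effect or an artifact such as survivor bias cannot be decided from the table alone.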

4.4 Feature importance and interpretability using SHAP

To enhance model transparency and understand the predictive contributions of individual features, we employed SHAP values (59). SHAP analysis indicated that dietary patterns, total energy intake, and physical activity were key predictors in the model, with dietary patterns exhibiting the greatest overall contribution. Their effects were mainly negative, implying that optimizing these factors may offer protective benefits against diabetes mellitus. This result is consistent with previous studies, indicating that adherence to certain dietary patterns may play a beneficial role in the prevention and management of T2DM (60–63). The use of the shapviz package allowed for a more nuanced understanding of how each variable contributed to individual model predictions (64). Variables displayed in red were considered indicative of protective effects, while cumulative plots provided a multi-layered view of predicted risk. Such interpretability is essential for translating complex outputs from machine learning models into practical insights for clinical decision-making and public health planning.
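The idea behind SHAP values can be sketched with an exact Shapley computation on a toy model. Everything in the example is an assumption made for illustration: the three features, the coefficients of `toy_risk`, and the all-zero baseline are hypothetical, and features are "switched off" by replacing them with a single baseline value. It is not the XGBoost and shapviz pipeline used in this study, and tree-based SHAP implementations average over background data rather than a single baseline.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values of predict(x) relative to predict(baseline)."""
    n = len(x)

    def value(coalition):
        # Features in the coalition take their observed value; the rest stay at baseline.
        z = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return predict(z)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size that exclude feature i.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi


def toy_risk(z):
    """Hypothetical risk score with an interaction between diet and activity."""
    diet, energy, activity = z
    return 0.5 - 0.2 * diet + 0.1 * energy - 0.15 * activity - 0.05 * diet * activity


x = [1.0, 1.0, 1.0]          # hypothetical participant
baseline = [0.0, 0.0, 0.0]   # hypothetical reference values
phi = shapley_values(toy_risk, x, baseline)
for name, contribution in zip(["diet", "energy", "activity"], phi):
    print(f"{name}: {contribution:+.3f}")
```

The contributions add up exactly to the difference between the prediction for the participant and the prediction at the baseline, which is the property that lets per-feature values be read as shares of an individual prediction. A negative value means the feature pushes the predicted risk down relative to the baseline.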

4.5 Public health implications and recommendations

These findings have several important implications for dietary guidance, health education, and strategies for the prevention of type 2 diabetes mellitus (T2DM). First, the high prevalence of diabetes, despite the absence of significant differences in conventional risk markers, suggests that dietary patterns provide additional predictive value that could complement existing clinical screening tools. Second, the application of interpretable machine learning techniques, such as UMAP and SHAP, represents a promising approach for advancing precision nutrition and personalized prevention strategies.

Public health interventions should focus on promoting nutrient-rich dietary habits and culturally appropriate food choices, emphasizing increased consumption of vegetables, whole grains, and seafood, while limiting processed and sugary foods.

4.6 Strengths and limitations

This study has several key strengths. First, it draws on a large, well-characterized cohort of elderly Chinese males, enabling robust analyses of dietary patterns, obesity, blood pressure, and T2DM risk. Second, the application of advanced analytical approaches, including unsupervised machine learning (UMAP) combined with factor analysis, allowed for the identification of distinct dietary patterns and the effective visualization of complex multidimensional dietary behaviors. Third, the use of SHAP values within an XGBoost modeling framework provided interpretable insights into the relative contributions of demographic, lifestyle, and dietary factors to T2DM risk, enhancing the translational relevance of our findings for public health interventions. Finally, comprehensive multivariable logistic regression models with extensive covariate adjustment and sensitivity analyses support the validity and reliability of the observed associations between dietary patterns, obesity, and T2DM risk.

Nonetheless, several limitations should be acknowledged. First, the cross-sectional design prevents causal inferences regarding the relationships between dietary patterns, obesity, and T2DM, limiting conclusions about temporal or mechanistic pathways. Second, dietary intake was assessed through self-reported measures, which may be prone to recall bias or misreporting, particularly in an older population. Third, since the study focused exclusively on elderly Chinese males, the generalizability of these findings to women or to other ethnic and age groups may be limited. Dietary behaviors, metabolic responses, and T2DM risk profiles can differ by sex due to hormonal, physiological, and sociocultural factors. Similarly, other ethnic groups may exhibit distinct dietary patterns, obesity prevalence, or genetic predispositions affecting T2DM risk. Future studies including women and diverse populations are needed to assess whether the identified dietary patterns and their associations with obesity and T2DM risk are consistent across broader demographic groups. Fourth, residual confounding from unmeasured variables, such as genetic predispositions or detailed nutrient metabolism data, cannot be entirely excluded. Fifth, this study did not include an independent external validation dataset due to cohort constraints. While sensitivity analyses served as an internal robustness assessment, future research should validate these findings in independent prospective cohorts and other populations to enhance generalizability and clinical applicability. Finally, although advanced machine learning techniques were employed, the complexity of these models may restrict their immediate application in routine clinical practice without further validation.

Overall, the combination of rigorous statistical modeling, interpretable machine learning analyses, and detailed dietary pattern characterization constitutes a major methodological strength, whereas the observational design and reliance on self-reported dietary data remain important limitations to be addressed in future longitudinal studies.

5 Conclusions

In this study of elderly Chinese males, we observed a high prevalence of newly diagnosed type 2 diabetes mellitus (T2DM), with nearly half of participants affected. Contrary to traditional assumptions, obesity was inversely associated with T2DM risk, even after adjustment for a wide range of demographic, lifestyle, and biochemical covariates. Dietary patterns emerged as key correlates of both obesity and T2DM risk. Specifically, adherence to unhealthy dietary patterns (namely staple–protein, seafood–eggs, and sugary and processed foods) was consistently associated with an increased risk of obesity and T2DM relative to the high-fiber nutrient-dense pattern. Machine learning analyses further highlighted the prominent role of dietary behaviors, total energy intake, and physical activity as influential predictors of T2DM risk. These findings suggest that, beyond traditional anthropometric measures, diet quality and lifestyle factors play a critical role in shaping metabolic health in this population. Intervention strategies focusing on promoting nutrient-dense diets and modifiable lifestyle behaviors may be effective in reducing the burden of obesity and T2DM among elderly Chinese males. Importantly, this study is among the first to leverage unsupervised machine learning with SHAP analysis to uncover dietary and metabolic correlates of T2DM in elderly Chinese men, offering novel insights for precision prevention strategies.

Data availability statement

The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.

Ethics statement

This study was conducted in accordance with the Declaration of Helsinki and with local legislation and institutional requirements. The protocol was reviewed and approved by the Ethics Committee of Heze University, Institutional Review Board (IRB), Heze University, Heze, Shandong Province, China (Approval No. 20180314). All participants, or their legal guardians when necessary, provided written informed consent prior to participation.

Author contributions

HS: Writing – original draft, Formal analysis. LZ: Writing – original draft, Formal analysis. PW: Investigation, Writing – original draft. KY: Software, Writing – original draft. SS: Methodology, Writing – original draft. YC: Methodology, Writing – original draft. LL: Methodology, Writing – review & editing, Data curation, Formal analysis, Investigation, Software, Supervision.

Funding

The author(s) declare that no financial support was received for the research and/or publication of this article.

Acknowledgments

The authors thank the study participants for taking part and the staff at Heze University for their dedicated work.

Conflict of interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Generative AI statement

The author(s) declare that no Gen AI was used in the creation of this manuscript.

Any alternative text (alt text) provided alongside figures in this article has been generated by Frontiers with the support of artificial intelligence and reasonable efforts have been made to ensure accuracy, including review by the authors wherever possible. If you identify any issues, please contact us.

Publisher's note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

Supplementary material

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fnut.2025.1705683/full#supplementary-material

References

1. Zheng Y, Ley SH, Hu FB. Global aetiology and epidemiology of type 2 diabetes mellitus and its complications. Nat Rev Endocrinol. (2018) 14:88–98. doi: 10.1038/nrendo.2017.151

2. Chatterjee S, Khunti K, Davies MJ. Type 2 diabetes. Lancet. (2017) 389:2239–51. doi: 10.1016/S0140-6736(17)30058-2

3. Huang Y, Fan W, Zhu J, Zhan L, Chen W. DB1 epidemiology, treatment and economic burden of type 2 diabetes mellitus with nephropathy complications in China. Value Health. (2012) 15:A603. doi: 10.1016/j.jval.2012.08.013

4. Ning F, Zhang D, Xue B, Zhang L, Zhang J, Zhu Z, et al. Synergistic effects of depression and obesity on type 2 diabetes incidence in Chinese adults. J Diabetes. (2020) 12:142–50. doi: 10.1111/1753-0407.12968

5. Ogurtsova K, da Rocha Fernandes JD, Huang Y, Linnenkamp U, Guariguata L, Cho NH, et al. IDF diabetes atlas: global estimates for the prevalence of diabetes for 2015 and 2040. Diabetes Res Clin Pract. (2017) 128:40–50. doi: 10.1016/j.diabres.2017.03.024

6. Klen J, Dolžan V. Glucagon-like peptide-1 receptor agonists in the management of type 2 diabetes mellitus and obesity: the impact of pharmacological properties and genetic factors. Int J Mol Sci. (2022) 23:3451. doi: 10.3390/ijms23073451

7. Issaka A, Paradies Y, Stevenson C. Modifiable and emerging risk factors for type 2 diabetes in Africa: a systematic review and meta-analysis protocol. Syst Rev. (2018) 7:139. doi: 10.1186/s13643-018-0801-y

8. Müller M, Canfora EE, Blaak EE. Gastrointestinal transit time, glucose homeostasis and metabolic health: modulation by dietary fibers. Nutrients. (2018) 10:275. doi: 10.3390/nu10030275

9. Global, regional, and national comparative risk assessment of 84 behavioural, environmental and occupational, and metabolic risks or clusters of risks, 1990-2016: a systematic analysis for the global burden of disease study 2016. Lancet. (2017) 390:1345–422. doi: 10.1016/S0140-6736(17)32366-8

10. Pan A, Lin X, Hemler E, Hu FB. Diet and cardiovascular disease: advances and challenges in population-based studies. Cell Metab. (2018) 27:489–96. doi: 10.1016/j.cmet.2018.02.017

11. Ley SH, Hamdy O, Mohan V, Hu FB. Prevention and management of type 2 diabetes: dietary components and nutritional strategies. Lancet. (2014) 383:1999–2007. doi: 10.1016/S0140-6736(14)60613-9

12. Shinozaki N, Murakami K, Asakura K, Masayasu S, Sasaki S. Identification of dish-based dietary patterns for breakfast, lunch, and dinner and their diet quality in Japanese adults. Nutrients. (2020) 13:67. doi: 10.3390/nu13010067

13. Evert AB, Dennison M, Gardner CD, Garvey WT, Lau KHK, MacLeod J, et al. Nutrition therapy for adults with diabetes or prediabetes: a consensus report. Diabetes Care. (2019) 42:731–54. doi: 10.2337/dci19-0014

14. Neuenschwander M, Ballon A, Weber KS, Norat T, Aune D, Schwingshackl L, et al. Role of diet in type 2 diabetes incidence: umbrella review of meta-analyses of prospective observational studies. BMJ. (2019) 366:l2368. doi: 10.1136/bmj.l2368

15. Churuangsuk C, Hall J, Reynolds A, Griffin SJ, Combet E, Lean MEJ. Diets for weight management in adults with type 2 diabetes: an umbrella review of published meta-analyses and systematic review of trials of diets for diabetes remission. Diabetologia. (2022) 65:14–36. doi: 10.1007/s00125-021-05577-2

16. Cui Z, Zhou H, Liu K, Wu M, Li S, Meng S, et al. Dietary copper and selenium intakes and the risk of type 2 diabetes mellitus: findings from the China health and nutrition survey. Nutrients. (2022) 14:2055. doi: 10.3390/nu14102055

17. Tsitsou S, Athanasaki C, Dimitriadis G, Papakonstantinou E. Acute effects of dietary fiber in starchy foods on glycemic and insulinemic responses: a systematic review of randomized controlled crossover trials. Nutrients. (2023) 15:2383. doi: 10.3390/nu15102383

18. Murakami K, Shinozaki N, Fujiwara A, Yuan X, Hashimoto A, Fujihashi H, et al. A systematic review of principal component analysis-derived dietary patterns in Japanese adults: are major dietary patterns reproducible within a country? Adv Nutr. (2019) 10:237–49. doi: 10.1093/advances/nmy079

19. Li L, Momma H, Chen H, Nawrin SS, Xu Y, Inada H, et al. Dietary patterns associated with the incidence of hypertension among adult Japanese males: application of machine learning to a cohort study. Eur J Nutr. (2024) 63:1293–314. doi: 10.1007/s00394-024-03342-w

20. Tatoli R, Lampignano L, Bortone I, Donghia R, Castellana F, Zupo R, et al. Dietary patterns associated with diabetes in an older population from southern italy using an unsupervised learning approach. Sensors. (2022) 22:2193. doi: 10.3390/s22062193

21. Hozumi Y, Wang R, Yin C, Wei GW. UMAP-assisted K-means clustering of large-scale SARS-CoV-2 mutation datasets. Comput Biol Med. (2021) 131:104264. doi: 10.1016/j.compbiomed.2021.104264

22. Rynazal R, Fujisawa K, Shiroma H, Salim F, Mizutani S, Shiba S, et al. Leveraging explainable AI for gut microbiome-based colorectal cancer classification. Genome Biol. (2023) 24:21. doi: 10.1186/s13059-023-02858-4

23. Liu Q, Wen Q, Lv J, Shi Z, Guo Y, Pei P, et al. The prospective associations of lipid metabolism-related dietary patterns with the risk of diabetes in Chinese adults. Nutrients. (2022) 14:980. doi: 10.3390/nu14050980

24. Yin Z, Chen J, Zhang J, Ren Z, Dong K, Kraus VB, et al. Dietary patterns associated with cognitive function among the older people in underdeveloped regions: finding from the NCDFaC Study. Nutrients. (2018) 10:464. doi: 10.3390/nu10040464

25. Ortega FB, Lavie CJ, Blair SN. Obesity and cardiovascular disease. Circ Res. (2016) 118:1752–70. doi: 10.1161/CIRCRESAHA.115.306883

26. Koromani F, Oei L, Shevroja E, Trajanoska K, Schoufour J, Muka T, et al. Vertebral fractures in individuals with type 2 diabetes: more than skeletal complications alone. Diabetes Care. (2020) 43:137–44. doi: 10.2337/dc19-0925

27. Sun Y, Liu B, Snetselaar LG, Wallace RB, Caan BJ, Rohan TE, et al. Association of normal-weight central obesity with all-cause and cause-specific mortality among postmenopausal women. JAMA Netw Open. (2019) 2:e197337. doi: 10.1001/jamanetworkopen.2019.7337

28. Liu R, Dang S, Zhao Y, Yan H, Han Y, Mi B. Long-term waist circumference trajectories and body mass index with all-cause mortality in older Chinese adults: a prospective nationwide cohort study. Arch Public Health. (2022) 80:94. doi: 10.1186/s13690-022-00861-y

29. Alberti KG, Zimmet P, Shaw J. The metabolic syndrome–a new worldwide definition. Lancet. (2005) 366:1059–62. doi: 10.1016/S0140-6736(05)67402-8

30. Tang O, Daya N, Matsushita K, Coresh J, Sharrett AR, Hoogeveen R, et al. Performance of high-sensitivity cardiac troponin assays to reflect comorbidity burden and improve mortality risk stratification in older adults with diabetes. Diabetes Care. (2020) 43:1200–8. doi: 10.2337/dc19-2043

31. Standards of medical care in diabetes. Diabetes Care. (2005) 28 Suppl 1:S4–36. doi: 10.2337/diacare.28.suppl_1.S4

32. Rimm EB, Giovannucci EL, Stampfer MJ, Colditz GA, Litin LB, Willett WC. Reproducibility and validity of an expanded self-administered semiquantitative food frequency questionnaire among male health professionals. Am J Epidemiol. (1992) 135:1114–26; discussion 1127–36. doi: 10.1093/oxfordjournals.aje.a116211

33. Renall N, Lawley B, Vatanen T, Merz B, Douwes J, Corbin M, et al. The fecal microbiotas of women of Pacific and New Zealand European ethnicities are characterized by distinctive enterotypes that reflect dietary intakes and fecal water content. Gut Microbes. (2023) 15:2178801. doi: 10.1080/19490976.2023.2178801

34. Jin F, Zhang J, Shu L, Han W. Association of dietary fiber intake with newly-diagnosed type 2 diabetes mellitus in middle-aged Chinese population. Nutr J. (2021) 20:81. doi: 10.1186/s12937-021-00740-2

35. Seidelmann SB, Feofanova E, Yu B, Franceschini N, Claggett B, Kuokkanen M, et al. Genetic variants in SGLT1, glucose tolerance, and cardiometabolic risk. J Am Coll Cardiol. (2018) 72:1763–73. doi: 10.1016/j.jacc.2018.07.061

36. Yang Y, Wang G-y, Pan X-c. China Food Composition. Vol. 42. Beijing, China: Peking University Medical Press (2009). p. 795–9.

37. Ægidius HM, Veidal SS, Feigh M, Hallenborg P, Puglia M, Pers TH, et al. Multi-omics characterization of a diet-induced obese model of non-alcoholic steatohepatitis. Sci Rep. (2020) 10:1148. doi: 10.1038/s41598-020-58059-7

38. Magliano DJ, Boyko EJ, IDF Diabetes Atlas, 10th edition scientific committee. IDF DIABETES ATLAS. 10th ed. International Diabetes Federation (2021).

39. Shapiro YN, Peppercorn JM, Yee AJ, Branagan AR, Raje NS, Donnell EKO. Lifestyle considerations in multiple myeloma. Blood Cancer J. (2021) 11:172. doi: 10.1038/s41408-021-00560-x

40. Ikegami K, Refetoff S, Van Cauter E, Yoshimura T. Interconnection between circadian clocks and thyroid function. Nat Rev Endocrinol. (2019) 15:590–600. doi: 10.1038/s41574-019-0237-z

41. Siegel KR, Bullard KM, Imperatore G, Ali MK, Albright A, Mercado CI, et al. Prevalence of major behavioral risk factors for type 2 diabetes. Diabetes Care. (2018) 41:1032–9. doi: 10.2337/dc17-1775

42. Wainberg M, Mahajan A, Kundaje A, McCarthy MI, Ingelsson E, Sinnott-Armstrong N, et al. Homogeneity in the association of body mass index with type 2 diabetes across the UK biobank: a Mendelian randomization study. PLoS Med. (2019) 16:e1002982. doi: 10.1371/journal.pmed.1002982

43. Tang O, Matsushita K, Coresh J, Sharrett AR, McEvoy JW, Windham BG, et al. Mortality implications of prediabetes and diabetes in older adults. Diabetes Care. (2020) 43:382–8. doi: 10.2337/dc19-1221

44. McCann SE, Marshall JR, Brasure JR, Graham S, Freudenheim JL. Analysis of patterns of food intake in nutritional epidemiology: food classification in principal components analysis and the subsequent impact on estimates for endometrial cancer. Public Health Nutr. (2001) 4:989–97. doi: 10.1079/PHN2001168

45. Zhou F, Wu F, Zou S, Chen Y, Feng C, Fan G. Dietary, nutrient patterns and blood essential elements in Chinese children with ADHD. Nutrients. (2016) 8:352. doi: 10.3390/nu8060352

46. Dashti HS, Hivert MF, Levy DE, McCurley JL, Saxena R, Thorndike AN. Polygenic risk score for obesity and the quality, quantity, and timing of workplace food purchases: a secondary analysis from the ChooseWell 365 randomized trial. PLoS Med. (2020) 17:e1003219. doi: 10.1371/journal.pmed.1003219

47. Allman-Farinelli M, Ijaz K, Tran H, Pallotta H, Ramos S, Liu J, et al. Virtual reality food court to study meal choices in youth: design and assessment of usability. JMIR Form Res. (2019) 3:e12456. doi: 10.2196/12456

48. Machado PP, Steele EM, Levy RB, da Costa Louzada ML, Rangan A, Woods J, et al. Ultra-processed food consumption and obesity in the Australian adult population. Nutr Diabetes. (2020) 10:39. doi: 10.1038/s41387-020-00141-0

49. Jakubczyk K, Koprowska K, Gottschling A, Janda-Milczarek K. Edible flowers as a source of dietary fibre (total, insoluble and soluble) as a potential athlete's dietary supplement. Nutrients. (2022) 14:2470. doi: 10.3390/nu14122470

50. Merino J, Guasch-Ferré M, Li J, Chung W, Hu Y, Ma B, et al. Polygenic scores, diet quality, and type 2 diabetes risk: an observational study among 35,759 adults from 3 US cohorts. PLoS Med. (2022) 19:e1003972. doi: 10.1371/journal.pmed.1003972

51. Sayón-Orea C, Razquin C, Bulló M, Corella D, Fitó M, Romaguera D, et al. Effect of a nutritional and behavioral intervention on energy-reduced mediterranean diet adherence among patients with metabolic syndrome: interim analysis of the PREDIMED-plus randomized clinical trial. JAMA. (2019) 322:1486–99. doi: 10.1001/jama.2019.14630

52. Duan MJ, Vinke PC, Navis G, Corpeleijn E, Dekker LH. Ultra-processed food and incident type 2 diabetes: studying the underlying consumption patterns to unravel the health effects of this heterogeneous food category in the prospective lifelines cohort. BMC Med. (2022) 20:7. doi: 10.1186/s12916-021-02200-4

53. Jin Q, Shi N, Aroke D, Lee DH, Joseph JJ, Donneyong M, et al. Insulinemic and inflammatory dietary patterns show enhanced predictive potential for type 2 diabetes risk in postmenopausal women. Diabetes Care. (2021) 44:707–14. doi: 10.2337/dc20-2216

54. Duran EK, Cook NR, Bobadilla M, Kim E, Manson JE, Buring JE, et al. Plasma placental growth factor concentrations are elevated well in advance of type 2 diabetes mellitus onset: prospective data from the WHS. J Am Heart Assoc. (2019) 8:e012790. doi: 10.1161/JAHA.119.012790

55. Khurana S, Grandbois J, Tharmalingam S, Murray A, Graff K, Nguyen P, et al. Fetal programming of adrenal PNMT and hypertension by glucocorticoids in WKY rats is dose and sex-dependent. PLoS ONE. (2019) 14:e0221719. doi: 10.1371/journal.pone.0221719

56. Shang Y, Zhang Y, Leng W, Lei X, Chen L, Zhou X, et al. Sex differences in type 2 diabetes mellitus-related left ventricular remodeling: a cardiovascular magnetic resonance study. J Diabetes Res. (2022) 2022:1427864. doi: 10.1155/2022/1427864

57. Iona A, Bragg F, Guo Y, Yang L, Chen Y, Pei P, et al. Adiposity and risks of vascular and non-vascular mortality among Chinese adults with type 2 diabetes: a 10-year prospective study. BMJ Open Diabetes Res Care. (2022) 10:e002489. doi: 10.1136/bmjdrc-2021-002489

58. Pearsey HM, Henson J, Sargeant JA, Davies MJ, Khunti K, Suzuki T, et al. Zinc-alpha2-glycoprotein, dysglycaemia and insulin resistance: a systematic review and meta-analysis. Rev Endocr Metab Disord. (2020) 21:569–75. doi: 10.1007/s11154-020-09553-w

59. Qi X, Wang S, Fang C, Jia J, Lin L, Yuan T. Machine learning and SHAP value interpretation for predicting comorbidity of cardiovascular disease and cancer with dietary antioxidants. Redox Biol. (2025) 79:103470. doi: 10.1016/j.redox.2024.103470

60. Raj AJ, Biswakarma S, Pala NA, Shukla G, Vineeta Kumar M, Chakravarty S, et al. Indigenous uses of ethnomedicinal plants among forest-dependent communities of Northern Bengal, India. J Ethnobiol Ethnomed. (2018) 14:8. doi: 10.1186/s13002-018-0208-9

61. Hatta M, Horikawa C, Takeda Y, Ikeda I, Yoshizawa Morikawa S, Kato N, et al. Association between obesity and intake of different food groups among Japanese with type 2 diabetes mellitus-Japan diabetes clinical data management study (JDDM68). Nutrients. (2022) 14:3034. doi: 10.3390/nu14153034

62. de Koning L, Chiuve SE, Fung TT, Willett WC, Rimm EB, Hu FB. Diet-quality scores and the risk of type 2 diabetes in men. Diabetes Care. (2011) 34:1150–6. doi: 10.2337/dc10-2352

63. Ahmad S, Demler OV, Sun Q, Moorthy MV, Li C, Lee IM, et al. Association of the mediterranean diet with onset of diabetes in the women's health study. JAMA Netw Open. (2020) 3:e2025466. doi: 10.1001/jamanetworkopen.2020.25466

64. Zeng Z, Wu J, Qin G, Yu D, He Z, Zeng W, et al. Using time-series chest radiographs and laboratory data by machine learning for identifying pulmonary infection and colonization of Acinetobacter baumannii. Respir Res. (2024) 25:2. doi: 10.1186/s12931-023-02624-x

Keywords: type 2 diabetes mellitus, dietary patterns, obesity, unsupervised machine learning, SHAP analysis

Citation: Sun H, Zhu L, Wang P, Yuan K, Nawrin SS, Cui Y and Li L (2025) Dietary patterns and obesity are associated with type 2 diabetes risk in elderly Chinese men: a machine learning approach. Front. Nutr. 12:1705683. doi: 10.3389/fnut.2025.1705683

Received: 15 September 2025; Revised: 02 November 2025;
Accepted: 11 November 2025; Published: 26 November 2025.

Edited by:

Mohd Dilshad Ansari, SRM University (Delhi-NCR), India

Reviewed by:

Shula Shazman, Open University of Israel, Israel
Djeane Debora Onthoni, University of Tartu, Estonia

Copyright © 2025 Sun, Zhu, Wang, Yuan, Nawrin, Cui and Li. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Longfei Li, li.longfei.q4@alumni.tohoku.ac.jp
