AUTHOR=Wang Yazhi, Zhang Mingkang, Wang Peng TITLE=Data-driven cluster analysis on the association of aging, obesity and insulin resistance with new-onset diabetes in Chinese adults: a multicenter retrospective cohort study JOURNAL=Frontiers in Medicine VOLUME=Volume 12 - 2025 YEAR=2025 URL=https://www.frontiersin.org/journals/medicine/articles/10.3389/fmed.2025.1640017 DOI=10.3389/fmed.2025.1640017 ISSN=2296-858X ABSTRACT=Background: Type 2 diabetes mellitus (T2DM) is an endocrine and metabolic disorder that can lead to multi-organ damage and dysfunction, imposing a significant financial burden on national healthcare systems. Early identification of high-risk individuals and prevention of T2DM remain major challenges for clinicians. This study aimed to use easily obtainable clinical indicators to perform cluster analysis on healthy individuals in order to accurately identify high-risk populations requiring early intervention. Methods: This was a multicenter retrospective cohort study with a median follow-up period of 3 years. A total of 12,607 Chinese adults without diabetes at baseline were included. The K-means clustering algorithm was applied to five standardized indicators: age, body mass index (BMI), fasting blood glucose (FBG), triglycerides (TG), and high-density lipoprotein cholesterol (HDL-C). After clustering, multivariate Cox proportional hazards regression analysis was used to evaluate and compare the risk of diabetes incidence among the clusters. Results: The study population of 12,607 subjects was clustered into four distinct groups: Cluster 1 (metabolic health cluster), Cluster 2 (low HDL-C cluster), Cluster 3 (old age and mild metabolic disorder cluster), and Cluster 4 (severe obesity and insulin resistance cluster). The clusters accounted for 37.95%, 29.99%, 24.95%, and 7.11% of the study population, respectively.
The clinical characteristics and diabetes incidence risks varied significantly among the four clusters. Cluster 4 exhibited the highest diabetes incidence rate, followed by Cluster 3, Cluster 2, and Cluster 1. In all covariate-adjusted models, the diabetes incidence rates in Cluster 3 and Cluster 4 were significantly higher than those in Cluster 1 and Cluster 2. However, no significant difference was observed between Cluster 3 and Cluster 4. Conclusion: Cluster-based analyses can effectively identify individuals at high risk of diabetes among adults without diabetes at baseline. The high-risk groups (Clusters 3 and 4) are often associated with aging, obesity, and insulin resistance (IR) and warrant early, targeted interventions.
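
The clustering step described in the Methods (z-score standardization of five indicators, then K-means with four clusters) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not state which software, initialization, or stopping rule was used, so the plain Lloyd's iteration, the random seed, and the helper names (`standardize`, `kmeans`, `make_synthetic_rows`) are all assumptions. The data are synthetic and the profile values are invented purely to make the example run; they are not taken from the study. The Cox regression step is not shown.

```python
import random
from statistics import mean, pstdev

# Column order assumed for this sketch: age, BMI, FBG, TG, HDL-C.
FEATURES = ["age", "bmi", "fbg", "tg", "hdl_c"]


def standardize(rows):
    """Z-score each column so all five indicators share one scale."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) or 1.0 for c in cols]  # guard against zero variance
    return [[(v - m) / s for v, m, s in zip(r, mus, sds)] for r in rows]


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, n_iter=100, seed=0):
    """Plain Lloyd's algorithm; returns (labels, centroids)."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _step in range(n_iter):
        # Assignment step: nearest centroid for every point.
        new_labels = [
            min(range(k), key=lambda j: sq_dist(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = [mean(c) for c in zip(*members)]
    return labels, centroids


def make_synthetic_rows(n_per_profile=50, seed=1):
    """Hypothetical data only: four made-up profiles with Gaussian noise."""
    rng = random.Random(seed)
    profiles = [  # illustrative values, NOT from the study
        (35, 21.0, 4.8, 0.9, 1.6),
        (40, 24.0, 5.0, 1.6, 1.0),
        (62, 24.5, 5.4, 1.5, 1.3),
        (45, 31.0, 5.6, 3.0, 1.0),
    ]
    noise = (5.0, 1.5, 0.3, 0.3, 0.15)
    return [
        [rng.gauss(m, s) for m, s in zip(profile, noise)]
        for profile in profiles
        for _ in range(n_per_profile)
    ]


rows = make_synthetic_rows()
labels, centroids = kmeans(standardize(rows), k=4)

# Share of the sample in each cluster, in percent.
shares = [100 * labels.count(j) / len(labels) for j in range(4)]
for j, share in enumerate(shares):
    print(f"cluster {j}: {share:.2f}% of sample")
```

Standardizing first matters because K-means relies on Euclidean distance: without it, age (spanning decades) would dominate indicators such as HDL-C, whose values vary by only fractions of a unit.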