ORIGINAL RESEARCH article
Front. Aging Neurosci.
Sec. Neurocognitive Aging and Behavior
Volume 17 - 2025 | doi: 10.3389/fnagi.2025.1532884
Development and Validation of Deep Learning- and Ensemble Learning-based Biological Ages in the NHANES Study
Provisionally accepted- Zhejiang University, Hangzhou, China
Conventional machine learning (ML) approaches for constructing biological age (BA) have predominantly relied on blood-based markers, which limits their scope. We analyzed data from 24,985 participants in the National Health and Nutrition Examination Survey (NHANES) from 1999 to 2010, with follow-up extending to December 31, 2019, or until death or loss to follow-up. Using the Least Absolute Shrinkage and Selection Operator (LASSO), we selected 30 features encompassing blood and urine biochemistry, physical examination data, behavioral traits, and socioeconomic factors. These features were used to train a deep neural network (DNN) model and an ensemble learning model, yielding Deep Biological Age (DBA) and Ensemble Biological Age (EnBA), respectively, with chronological age (CA) as the reference label. Model performance was assessed using mean absolute error (MAE), and interpretability was explored with SHapley Additive exPlanations (SHAP). The ability of DBA and EnBA to predict mortality was compared with that of Phenotypic Age (PhenoAge) using the area under the curve (AUC) and hazard ratios (HRs) from Cox proportional hazards models adjusted for demographics and lifestyle factors. Sensitivity analyses were performed to confirm robustness. DBA and EnBA predicted chronological age accurately (MAE = 2.98 and 3.58 years, respectively) and demonstrated strong predictive capability for all-cause mortality, with AUCs of 0.896 (95% CI: 0.891-0.898) and 0.889 (95% CI: 0.884-0.894), respectively. Higher DBA and EnBA accelerations were significantly associated with increased mortality risk (HRs of 1.059 and 1.039, respectively). SHAP analysis identified prescription medication usage, hepatitis B surface antibody status, and vigorous physical activity as the top contributors to DBA. BA acceleration was also linked to an elevated risk of death from specific chronic conditions, including cardiovascular disease, cerebrovascular disease, and cancer.
We developed and validated two ML-based BA models that accurately predicted both all-cause and cause-specific mortality. These models show promise for early identification of high-risk individuals, potentially facilitating timely preventative interventions.
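To illustrate two quantities named in the abstract, the following is a minimal sketch of how MAE and BA acceleration might be computed from paired chronological and predicted ages. It is not the authors' code. The abstract does not define acceleration, so the sketch assumes one common convention: the residual of predicted age regressed on chronological age. The function names and the toy ages are hypothetical.

```python
from statistics import mean


def mean_absolute_error(chronological_age, predicted_age):
    """Average absolute gap, in years, between predicted and chronological age."""
    if len(chronological_age) != len(predicted_age):
        raise ValueError("inputs must have the same length")
    return mean(abs(p - c) for c, p in zip(chronological_age, predicted_age))


def age_acceleration(chronological_age, predicted_age):
    """Residuals of predicted age regressed on chronological age (simple OLS).

    ASSUMPTION: this residual definition is a common convention, not
    necessarily the one used in the article.
    """
    ca_mean = mean(chronological_age)
    ba_mean = mean(predicted_age)
    sxx = sum((c - ca_mean) ** 2 for c in chronological_age)
    sxy = sum(
        (c - ca_mean) * (p - ba_mean)
        for c, p in zip(chronological_age, predicted_age)
    )
    slope = sxy / sxx
    intercept = ba_mean - slope * ca_mean
    # Positive residual: predicted age is older than expected for that CA.
    return [
        p - (intercept + slope * c)
        for c, p in zip(chronological_age, predicted_age)
    ]


# Hypothetical toy data, not NHANES values.
ca = [30.0, 40.0, 50.0, 60.0, 70.0]
ba = [33.0, 38.0, 52.0, 63.0, 66.0]

mae = mean_absolute_error(ca, ba)
accel = age_acceleration(ca, ba)
print(f"MAE: {mae:.2f} years")
print("Acceleration:", [round(a, 2) for a in accel])
```

Under this convention the accelerations sum to zero across the sample, so a positive value marks a participant whose predicted age exceeds what is typical for their chronological age.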
Keywords: Aging, biological age, machine learning, deep learning, deep neural networks
Received: 22 Nov 2024; Accepted: 23 Jun 2025.
Copyright: © 2025 Huang, Yang, Wang, Abula, Dong and Li. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
* Correspondence: Wenyuan Li, Zhejiang University, Hangzhou, China
Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.