ORIGINAL RESEARCH article

Front. Med.

Sec. Gastroenterology

Volume 12 - 2025 | doi: 10.3389/fmed.2025.1678076

This article is part of the Research Topic: Advancing Gastrointestinal Disease Diagnosis with Interpretable AI and Edge Computing for Enhanced Patient Care

Predicting the risk of metabolic-associated fatty liver disease in the elderly population in China: Construction and evaluation of interpretable machine learning models

Provisionally accepted
Yingxin Zeng1*, Chaobing Yang2, Xin Yang1, Xinmei Zhang1, Guodong Xia3*
  • 1 Department of Gastroenterology, The Affiliated Hospital of Southwest Medical University, Luzhou, China
  • 2 Department of Critical Care Medicine, The Affiliated Hospital of Southwest Medical University, Luzhou, China
  • 3 Department of Gastroenterology, The Affiliated Hospital of Southwest Medical University; Health Management Center, The Affiliated Hospital of Southwest Medical University, Luzhou, China

The final, formatted version of the article will be published soon.

With the incidence of metabolic dysfunction-associated fatty liver disease (MAFLD) rising in the elderly population, this study aimed to develop an optimal screening model for identifying high-risk elderly individuals from routine health examination data by comparing ten machine learning (ML) algorithms. The study included 2,635 individuals aged 60 years and older who underwent annual health examinations at the Health Management Center of The Affiliated Hospital of Southwest Medical University from January to December 2024. Initial feature selection was performed using least absolute shrinkage and selection operator (LASSO) regression, followed by univariate and multivariate logistic regression analyses, which identified nine independent predictors. Predictive models were constructed from these nine variables using ten ML algorithms, and model performance was evaluated in terms of discrimination, calibration, and clinical utility. The SHapley Additive exPlanations (SHAP) method was used to visualize feature importance and to provide individual-level interpretability. After 10-fold cross-validation and hyperparameter tuning, the Random Forest (RF) model performed best, achieving an area under the curve (AUC) of 0.892 (95% CI: 0.870–0.914) in the validation cohort. Feature importance analysis showed that the TyG-BMI index, height, and albumin level were important predictors of MAFLD risk. Machine learning models, particularly the random forest algorithm, can effectively predict the risk of MAFLD in the elderly population. These models may assist clinicians in early screening and intervention, thereby improving patient outcomes.
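The abstract names the TyG-BMI index as a leading predictor but does not define it. As a minimal sketch, the commonly used definition is the triglyceride-glucose (TyG) index, ln(fasting triglycerides [mg/dL] × fasting glucose [mg/dL] / 2), multiplied by body mass index. Whether the authors computed it exactly this way is an assumption, and the function names and example values below are hypothetical.

```python
import math


def tyg_index(triglycerides_mg_dl: float, glucose_mg_dl: float) -> float:
    """TyG index: ln(fasting TG [mg/dL] * fasting glucose [mg/dL] / 2).

    Inputs must already be in mg/dL; values reported in mmol/L
    need to be converted before calling this function.
    """
    if triglycerides_mg_dl <= 0 or glucose_mg_dl <= 0:
        raise ValueError("laboratory values must be positive")
    return math.log(triglycerides_mg_dl * glucose_mg_dl / 2.0)


def bmi(weight_kg: float, height_m: float) -> float:
    """Body mass index: weight in kg divided by height in metres squared."""
    if weight_kg <= 0 or height_m <= 0:
        raise ValueError("weight and height must be positive")
    return weight_kg / height_m ** 2


def tyg_bmi(triglycerides_mg_dl: float, glucose_mg_dl: float,
            weight_kg: float, height_m: float) -> float:
    """TyG-BMI: product of the TyG index and BMI (assumed definition)."""
    return tyg_index(triglycerides_mg_dl, glucose_mg_dl) * bmi(weight_kg, height_m)


# Hypothetical participant, not taken from the study data.
example = tyg_bmi(triglycerides_mg_dl=150.0, glucose_mg_dl=100.0,
                  weight_kg=70.0, height_m=1.65)
print(f"TyG-BMI = {example:.1f}")
```

Because height enters through BMI and is also reported as a separate predictor, a composite such as this one would share information with the standalone height variable.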

Keywords: machine learning, metabolic-associated fatty liver disease, predictive model, older adults, random forest

Received: 01 Aug 2025; Accepted: 06 Oct 2025.

Copyright: © 2025 Zeng, Yang, Yang, Zhang and Xia. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

* Correspondence:
Yingxin Zeng, 15244959726@163.com
Guodong Xia, 894242130@qq.com

Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.