AUTHOR=Hong Yan, Chen Xinrong, Wang Ling, Zhang Fan, Zeng ZiYing, Xie Weining TITLE=Machine learning prediction of metabolic dysfunction-associated fatty liver disease risk in American adults using body composition: explainable analysis based on SHapley Additive exPlanations JOURNAL=Frontiers in Nutrition VOLUME=Volume 12 - 2025 YEAR=2025 URL=https://www.frontiersin.org/journals/nutrition/articles/10.3389/fnut.2025.1616229 DOI=10.3389/fnut.2025.1616229 ISSN=2296-861X ABSTRACT=Background: Metabolic dysfunction-associated fatty liver disease (MAFLD) is a prevalent and progressive liver disorder closely linked to obesity and metabolic dysregulation. Traditional anthropometric measures such as body mass index (BMI) are limited in their ability to capture fat distribution and associated risk. This study aimed to develop and validate machine learning (ML) models for predicting MAFLD using detailed body composition metrics and to explore the relative contributions of adipose tissue features through explainable ML techniques. Methods: Data from the 2017–2018 National Health and Nutrition Examination Survey (NHANES) were used to construct predictive models based on anthropometric, demographic, lifestyle, and clinical variables. Six ML algorithms were implemented: decision tree (DT), support vector machine (SVM), generalized linear model (GLM), gradient boosting machine (GBM), random forest (RF), and XGBoost. The Boruta algorithm was used for feature selection, and model performance was evaluated using cross-validation and a validation set. SHapley Additive exPlanations (SHAP) were employed to interpret feature contributions. Results: Among the six models, the GBM algorithm exhibited the best performance, achieving area under the receiver operating characteristic curve (AUC) values of 0.875 (training) and 0.879 (validation), with minimal fluctuations in sensitivity and specificity.
SHAP analysis identified visceral adipose tissue (VAT), BMI, and subcutaneous adipose tissue (SAT) as the most influential predictors. VAT had the highest SHAP value, underscoring its central role in MAFLD pathogenesis. Conclusion: This study demonstrates the effectiveness of integrating body composition features with machine learning techniques for MAFLD risk prediction. The GBM model offers robust predictive accuracy and interpretability, with potential applications in clinical decision-making and public health screening strategies. SHAP analysis provides meaningful insights into the relative importance of adiposity measures, reinforcing the value of fat distribution metrics beyond conventional obesity indices.
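
The two quantities the abstract reports, AUC and SHAP-style feature attributions, can be illustrated with a minimal standard-library sketch. Everything below is an assumption made for illustration: `toy_risk` and its coefficients are hypothetical, the inputs are made-up standardised values, and neither reflects the study's GBM model or the NHANES data. The attribution routine computes exact Shapley values by enumerating feature coalitions, which is the quantity that SHAP tooling approximates efficiently for larger models.

```python
from itertools import combinations
from math import factorial


def auc(labels, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one instance x relative to a baseline.

    Features outside a coalition are set to their baseline value.
    Cost grows as 2**n, so this is only practical for a handful of features.
    """
    names = list(x)
    n = len(names)

    def value(coalition):
        mixed = {k: (x[k] if k in coalition else baseline[k]) for k in names}
        return predict(mixed)

    phi = {}
    for name in names:
        others = [k for k in names if k != name]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {name}) - value(s))
        phi[name] = total
    return phi


def toy_risk(f):
    """Hypothetical risk score with a VAT x BMI interaction (NOT the paper's model)."""
    return 0.6 * f["vat"] + 0.3 * f["bmi"] + 0.1 * f["sat"] + 0.2 * f["vat"] * f["bmi"]


# Made-up standardised inputs: one individual versus an all-zero reference.
person = {"vat": 1.0, "bmi": 1.0, "sat": 1.0}
reference = {"vat": 0.0, "bmi": 0.0, "sat": 0.0}
phi = shapley_values(toy_risk, person, reference)

# Made-up labels and scores, used only to exercise the AUC helper.
toy_auc = auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1])
```

In this toy setup the attributions sum to the difference between the individual's score and the reference score, and the interaction term is split equally between the two features involved. That additivity is what allows per-feature SHAP values to be ranked against each other, as the abstract does for VAT, BMI, and SAT.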