ORIGINAL RESEARCH article

Front. Med.

Sec. Rheumatology

Development of a Machine Learning-Based Predictive Model for Osteoporosis Risk and Its Application in Clinical Decision Support

Provisionally accepted
  • Jiangxi University of Traditional Chinese Medicine, Nanchang, China

The final, formatted version of the article will be published soon.

Objective: This study aimed to develop an interpretable machine learning model for predicting osteoporosis (OP) risk from real-world clinical data and to build a web-based visualization tool to support clinical decision-making.

Methods: A total of 5,328 individuals from the Affiliated Hospital of Jiangxi University of Chinese Medicine (2015–2024) were included. Multidimensional data were collected, including demographic characteristics, anthropometric measures, lumbar spine bone mineral density (L1–L4), and more than 90 blood biochemical and inflammatory markers. Key variables were identified using univariate analysis followed by least absolute shrinkage and selection operator (LASSO) regression. Five machine learning algorithms (Decision Tree, Random Forest, XGBoost, CatBoost, and Multi-Layer Perceptron (MLP)) were developed and compared. SHapley Additive exPlanations (SHAP) analysis was conducted to enhance model interpretability, and a web-based tool was subsequently built on the best-performing model.

Results: Five key predictive variables were ultimately selected: age, sex, body mass index (BMI), uric acid (UA), and alkaline phosphatase (ALP). Among the five models evaluated, the Random Forest model achieved the highest AUC (0.759) in the test set, demonstrating moderate discriminative performance and good model stability. SHAP analysis revealed that BMI contributed most to the model's predictions, while increased age, female sex, elevated ALP, and reduced UA were associated with a higher risk of osteoporosis. Based on this model, a web-based tool was developed to enable individualized risk prediction and feature-level visualization, providing a quantitative reference for clinical risk assessment.

Conclusions: The osteoporosis prediction model developed in this study provided quantitative risk estimates and interpretable outputs using a limited set of features, offering a feasible technical approach for early screening of osteoporosis. Future work should focus on external validation and recalibration in multicenter populations to further evaluate and optimize the model's predictive performance and clinical applicability.
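
As an aid to reading the reported AUC, the following minimal sketch shows one common way the statistic can be computed: as the proportion of (positive, negative) case pairs in which the positive case receives the higher predicted risk, with ties counted as one half. The function name `roc_auc` and the labels and scores are hypothetical illustrations, not the study's code or data.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Return the AUC as the share of positive/negative pairs ranked correctly.

    labels: 1 for a case with the condition, 0 for a case without it.
    scores: predicted risk for each case (higher means higher risk).
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive case correctly ranked above negative
            elif p == n:
                wins += 0.5   # a tie counts as half a correct ranking
    return wins / (len(pos) * len(neg))


# Hypothetical toy data (NOT from the study): 4 positive and 4 negative cases.
labels = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2, 0.5, 0.8]

auc = roc_auc(labels, scores)
print(f"AUC = {auc:.3f}")  # 14 of 16 pairs ranked correctly -> 0.875
```

Under this reading, an AUC of 0.759 means that the model would assign the higher risk score to the affected individual in roughly three out of four randomly drawn pairs of one affected and one unaffected individual.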

Keywords: osteoporosis, machine learning, LASSO regression, random forest, SHAP, clinical decision support

Received: 06 Aug 2025; Accepted: 29 Oct 2025.

Copyright: © 2025 Shao, Wu, Deng, Cheng, Huang, Sun and Li. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

* Correspondence:
ZiChen Shao, szc1119899361@outlook.com
Huanan Li, lihuanan1974@126.com

Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.