ORIGINAL RESEARCH article

Front. Physiol.

Sec. Metabolic Physiology

Volume 16 - 2025 | doi: 10.3389/fphys.2025.1607276

This article is part of the Research Topic: Novel Rehabilitation Approaches for Non-Communicable Diseases in the Era of Precision Medicine

Machine Learning-Based Prediction of Knee Pain Risk Using Lipid Metabolism Biomarkers: A Prospective Cohort Study from CHARLS

Provisionally accepted
Biao Guo1, YUAN LI2*, Weihang Peng3, Yabin Liu4, Fei He1, Zhe Zhai5
  • 1Xi’an University of Posts and Telecommunications, Xi'an, Shaanxi Province, China
  • 2Faculty of Health Sciences and Sports, Macao Polytechnic University, Macao, China
  • 3Faculty of Applied Sciences, Macao Polytechnic University, Macao, Macau Region, China
  • 4Xi'an Physical Education University, Xi'an, Shaanxi Province, China
  • 5Harbin Sport University, Harbin, Heilongjiang Province, China

The final, formatted version of the article will be published soon.

Knee pain significantly affects the health and quality of life among middle-aged and elderly individuals. However, the predictive value of lipid metabolism biomarkers for knee pain risk remains unclear. This study aimed to evaluate whether lipid-related metabolic indicators can effectively predict the incidence of knee pain using machine learning techniques. Data from the China Health and Retirement Longitudinal Study (CHARLS, 2011–2013) were analyzed, incorporating multiple lipid biomarkers and composite indices. Five machine learning models were trained and assessed for performance. SHAP (SHapley Additive exPlanations) analysis was further applied to interpret model outputs and identify the most influential predictors. The results indicated a notably higher prevalence of knee pain in high-altitude, cold regions such as Qinghai and Sichuan provinces. Composite metabolic indices, including lipid accumulation product (LAP), triglyceride-glucose index (TyG), and TyG-BMI, demonstrated superior predictive value compared to traditional single lipid markers. The Stacked Ensemble model achieved the best discrimination (AUC = 0.85) and calibration (Brier score = 0.13). SHAP results confirmed the dominant contribution of LAP and TyG-related indices to prediction outcomes. These findings underscore the value of combining metabolic indicators with interpretable machine learning approaches for early identification and personalized prevention of knee pain in aging populations.
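The abstract names the composite indices and the Brier score without giving their formulas. As a rough illustration only, the sketch below uses the definitions most commonly seen in the literature: TyG = ln(TG [mg/dL] × fasting glucose [mg/dL] / 2), TyG-BMI = TyG × BMI, LAP = (waist [cm] − 65) × TG [mmol/L] for men and (waist [cm] − 58) × TG [mmol/L] for women, and the Brier score as the mean squared difference between predicted probability and observed outcome. These formulas, the units, and all function and parameter names are assumptions for illustration; the authors' exact definitions and preprocessing may differ.

```python
import math


def tyg_index(tg_mg_dl: float, glucose_mg_dl: float) -> float:
    """Triglyceride-glucose index (assumed definition): ln(TG * FPG / 2), both in mg/dL."""
    return math.log(tg_mg_dl * glucose_mg_dl / 2)


def tyg_bmi(tg_mg_dl: float, glucose_mg_dl: float, bmi: float) -> float:
    """TyG-BMI (assumed definition): TyG index multiplied by body mass index (kg/m^2)."""
    return tyg_index(tg_mg_dl, glucose_mg_dl) * bmi


def lap(waist_cm: float, tg_mmol_l: float, sex: str) -> float:
    """Lipid accumulation product (assumed definition), with sex-specific waist offsets."""
    if sex == "male":
        offset = 65.0
    elif sex == "female":
        offset = 58.0
    else:
        raise ValueError("sex must be 'male' or 'female'")
    return (waist_cm - offset) * tg_mmol_l


def brier_score(y_true: list[int], p_pred: list[float]) -> float:
    """Mean squared error between predicted probabilities and binary outcomes (0 or 1)."""
    if len(y_true) != len(p_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((p - y) ** 2 for y, p in zip(y_true, p_pred)) / len(y_true)


if __name__ == "__main__":
    # Hypothetical participant values, not taken from CHARLS.
    print(tyg_index(150.0, 100.0))
    print(tyg_bmi(150.0, 100.0, 24.0))
    print(lap(90.0, 1.5, "male"))
    print(brier_score([1, 0], [0.8, 0.3]))
```

A lower Brier score indicates better-calibrated probabilities; a model that always predicts 0.5 scores 0.25, which gives some context for the reported value of 0.13.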

Keywords: Knee Pain, Metabolic biomarkers, machine learning, Lipid accumulation product, CHARLS

Received: 07 Apr 2025; Accepted: 16 Jun 2025.

Copyright: © 2025 Guo, LI, Peng, Liu, He and Zhai. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

* Correspondence: YUAN LI, Faculty of Health Sciences and Sports, Macao Polytechnic University, Macao, China

Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.