SYSTEMATIC REVIEW article

Front. Med.

Sec. Precision Medicine

This article is part of the Research Topic: Advancements and Challenges in AI-Driven Healthcare Innovation

Clinical applications of large language models in knee osteoarthritis: a systematic review

Provisionally accepted
Zebing Ma1, Yibing Liu1, Ziyan Zhang2, Rui Chen3, Huayu Fan3, Lili Ni4*, Xiangyang Cao3,5,6,7*
  • 1Hunan University of Chinese Medicine, Changsha, China
  • 2Central South University, Changsha, China
  • 3Luoyang Orthopedic Hospital of Henan Province (Orthopedic Hospital of Henan Province), Zhengzhou, China
  • 4The Second Affiliated Hospital of Hunan University of Chinese Medicine, Changsha, China
  • 5Institute of Intelligent Medical and Bioengineering, Henan Academy of Traditional Chinese Medicine Sciences, Zhengzhou, China
  • 6Henan Province Artificial Intelligence Engineering Research Center for Bone Injury Rehabilitation, Zhengzhou, China
  • 7Henan University of Chinese Medicine, Zhengzhou, China

The final, formatted version of the article will be published soon.

Background and aims: Knee osteoarthritis (KOA) is a common chronic degenerative disease that significantly impairs patients' quality of life. With the rapid advancement of artificial intelligence, large language models (LLMs) have shown potential to support medical information extraction, clinical decision-making, and patient education through their natural language processing capabilities. However, the current landscape of LLM applications in KOA, and the methodological quality of the underlying studies, has not yet been systematically reviewed. This systematic review therefore aims to summarize existing clinical studies on LLMs in KOA, evaluate their performance and methodological rigor, and identify current challenges and future research directions.

Methods: Following the PRISMA guidelines, a systematic search was conducted in PubMed, the Cochrane Library, Embase, and Web of Science for literature published up to June 2025. The protocol was preregistered on the OSF platform. Studies were screened using standardized inclusion and exclusion criteria. Key study characteristics and performance evaluation metrics were extracted. Methodological quality was assessed using tools such as Cochrane RoB, STROBE, STARD, and DISCERN. Additionally, the CLEAR-LLM and CliMA-10 frameworks were applied to provide complementary evaluations of quality and performance.

Results: A total of 16 studies were included, covering various LLMs such as ChatGPT, Gemini, and Claude. Application scenarios encompassed text generation, imaging diagnostics, and patient education. Most studies were observational, and overall methodological quality ranged from moderate to high. Based on CliMA-10 scores, LLMs exhibited upper-moderate performance in KOA-related tasks. The ChatGPT-4 series consistently outperformed other models, especially in structured output generation, interpretation of clinical terminology, and content accuracy. Key limitations included insufficient sample representativeness, inconsistent control over hallucinated content, and the lack of standardized evaluation tools.

Conclusion: LLMs show notable potential in the KOA field, but their clinical application is still exploratory and limited by issues such as sample bias and methodological heterogeneity. Model performance varies across tasks, underscoring the need for improved prompt design and standardized evaluation frameworks. With real-world data and ethical oversight, LLMs may contribute more significantly to personalized KOA management.

Keywords: knee osteoarthritis, large language models, artificial intelligence, clinical decision support, systematic review, ChatGPT

Received: 22 Jul 2025; Accepted: 31 Oct 2025.

Copyright: © 2025 Ma, Liu, Zhang, Chen, Fan, Ni and Cao. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

* Correspondence:
Lili Ni, 415491070@qq.com
Xiangyang Cao, cxy1260@126.com

Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.