AUTHOR=Fahy Stephen, Oehme Stephan, Milinkovic Danko Dan, Bartek Benjamin

TITLE=Enhancing patient education on the role of tibial osteotomy in the management of knee osteoarthritis using a customized ChatGPT: a readability and quality assessment

JOURNAL=Frontiers in Digital Health

VOLUME=Volume 6 - 2024

YEAR=2025

URL=https://www.frontiersin.org/journals/digital-health/articles/10.3389/fdgth.2024.1480381

DOI=10.3389/fdgth.2024.1480381

ISSN=2673-253X

ABSTRACT=Introduction: Knee osteoarthritis (OA) significantly impacts the quality of life of those afflicted, with many patients eventually requiring surgical intervention. While Total Knee Arthroplasty (TKA) is common, it may not be suitable for younger patients with unicompartmental OA, who might benefit more from High Tibial Osteotomy (HTO). Effective patient education is crucial for informed decision-making, yet most online health information has been found to be too complex for the average patient to understand. AI tools like ChatGPT may offer a solution, but their outputs often exceed the public's literacy level. This study assessed whether a customized ChatGPT could be used to improve readability and source accuracy in patient education on knee OA and tibial osteotomy. Methods: Commonly asked questions about HTO were gathered using Google's “People Also Asked” feature and formatted to an 8th-grade reading level. Two ChatGPT-4 models were compared: a native version and a fine-tuned model (“The Knee Guide”) optimized for readability and source citation through Instruction-Based Fine-Tuning (IBFT) and Reinforcement Learning from Human Feedback (RLHF). The responses were evaluated for quality using the DISCERN criteria and for readability using the Flesch Reading Ease Score (FRES) and Flesch-Kincaid Grade Level (FKGL). Results: The native ChatGPT-4 model achieved a mean DISCERN score of 38.41 (range 25–46), indicating poor quality, while “The Knee Guide” achieved 45.9 (range 33–66), indicating moderate quality. Cronbach's alpha was 0.86, indicating good interrater reliability. “The Knee Guide” achieved better readability, with a mean FKGL of 8.2 (range 5–10.7, ±1.42) and a mean FRES of 60 (range 47–76, ±7.83), compared to the native model's FKGL of 13.9 (range 11–16, ±1.39) and FRES of 32 (range 14–47, ±8.3). These differences were statistically significant (p < 0.001). Conclusions: Fine-tuning ChatGPT significantly improved the readability and quality of HTO-related information. “The Knee Guide” demonstrated the potential of customized AI tools in enhancing patient education by making complex medical information more accessible and understandable.
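
The two readability metrics named in the abstract follow standard published formulas, which can be sketched as below. This is an illustrative sketch only, not the tooling the authors used (the abstract does not say how the scores were computed). The function names `count_syllables` and `readability_scores` are hypothetical, and the vowel-group syllable counter is a rough assumption that will differ from dictionary-based counts.

```python
import re


def count_syllables(word: str) -> int:
    """Rough heuristic (an assumption, not a dictionary lookup):
    count runs of consecutive vowels, discounting a trailing silent 'e'."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    # Discount a final "e" (as in "make"), but not "-le" endings (as in "table").
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def readability_scores(text: str) -> tuple[float, float]:
    """Return (FRES, FKGL) for the given text using the standard Flesch formulas."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")

    words_per_sentence = len(words) / len(sentences)
    syllables_per_word = sum(count_syllables(w) for w in words) / len(words)

    # Flesch Reading Ease Score: higher means easier to read.
    fres = 206.835 - 1.015 * words_per_sentence - 84.6 * syllables_per_word
    # Flesch-Kincaid Grade Level: approximate US school grade.
    fkgl = 0.39 * words_per_sentence + 11.8 * syllables_per_word - 15.59
    return fres, fkgl


if __name__ == "__main__":
    sample = "A tibial osteotomy shifts weight away from the worn part of the knee."
    fres, fkgl = readability_scores(sample)
    print(f"FRES={fres:.1f}, FKGL={fkgl:.1f}")
```

Read against these formulas, the reported mean FKGL of 8.2 for "The Knee Guide" corresponds to roughly an 8th-grade reading level, while the native model's 13.9 corresponds to college-level text.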