ORIGINAL RESEARCH article

Front. Digit. Health

Sec. Health Informatics

Volume 7 - 2025 | doi: 10.3389/fdgth.2025.1623399

Artificial Intelligence in Patient Education: Evaluating Large Language Models for Understanding Rheumatology Literature

Provisionally accepted
Claudia Mendoza Pinto1, Pamela Munguia-Realpozo1, Ivet Etchegaray Morales2, Edith Ramírez-Lara1, Juan Carlos Solis Poblano1*, Maximo Alejandro García-Flores1, Jorge Ayón-Aguilar1
  • 1Mexican Social Security Institute, Mexico City, Mexico
  • 2Universidad Autonoma de Puebla, Puebla, Mexico

The final, formatted version of the article will be published soon.

Background: Inadequate health literacy hinders positive health outcomes, yet medical literature often exceeds the general population's comprehension level. While health authorities recommend patient materials be at a sixth-grade reading level, scientific articles typically require college-level proficiency. Large language models (LLMs) like ChatGPT show potential for simplifying complex text, possibly bridging this gap.

Objective: This study evaluated the effectiveness of ChatGPT 4.0 in enhancing the readability of peer-reviewed rheumatology articles for layperson comprehension.

Methods: Twelve open-access rheumatology articles authored by the senior investigators were included. Baseline readability was evaluated utilizing Flesch-Kincaid Grade Level (FKGL) and Simple Measure of Gobbledygook (SMOG) indices. Each article was processed by ChatGPT 4.0 with a prompt requesting simplification to a sixth-grade level. Two expert rheumatologists evaluated the generated summaries' appropriateness (accuracy, absence of errors/omissions). Readability changes were analyzed using paired t-tests.

Results: ChatGPT significantly improved readability (P < 0.0001), reducing the average reading level from approximately 15th grade (FKGL: 15.06, SMOG: 14.08) to 10th grade (FKGL: 10.52, SMOG: 9.48). The expert reviewers deemed the generated summaries appropriate and accurate. The average word count was significantly reduced from 3517 to 446 words (P = 0.047).

Conclusions: ChatGPT effectively lowered the reading complexity of specialized rheumatology literature, making it more accessible than the original publications. However, the achieved 10th-grade reading level still exceeds the recommended sixth-grade level for patient education materials. While LLMs are a promising tool, their output may require further refinement or expert review to meet optimal health literacy standards and ensure equitable patient understanding in rheumatology.
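
For readers unfamiliar with the two readability indices named in the Methods, the following Python sketch shows how FKGL and SMOG grade levels are conventionally computed from sentence, word, and syllable counts. It is illustrative only: the abstract does not state which software the authors used, the function names are hypothetical, and the syllable counter is a rough vowel-group heuristic (an assumption), so its scores may differ from those of dedicated readability tools.

```python
import math
import re


def count_syllables(word: str) -> int:
    """Estimate syllables by counting vowel groups.

    ASSUMPTION: a crude heuristic for illustration; dedicated tools
    typically use pronunciation dictionaries or more elaborate rules.
    """
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    # Treat a trailing 'e' as silent ("make"), except "-le" endings ("table").
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def split_text(text: str):
    """Return (sentences, words) using simple punctuation-based splitting."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)?", text)
    return sentences, words


def fkgl(text: str) -> float:
    """Flesch-Kincaid Grade Level (standard published coefficients)."""
    sentences, words = split_text(text)
    syllables = sum(count_syllables(w) for w in words)
    return (0.39 * len(words) / len(sentences)
            + 11.8 * syllables / len(words)
            - 15.59)


def smog(text: str) -> float:
    """SMOG grade, normalized to a 30-sentence sample.

    Polysyllables are words with three or more syllables.
    """
    sentences, words = split_text(text)
    polysyllables = sum(1 for w in words if count_syllables(w) >= 3)
    return 1.0430 * math.sqrt(polysyllables * 30 / len(sentences)) + 3.1291


if __name__ == "__main__":
    technical = "Rheumatoid arthritis is characterized by chronic inflammation."
    plain = "Your joints hurt."
    print(f"technical: FKGL={fkgl(technical):.2f}, SMOG={smog(technical):.2f}")
    print(f"plain:     FKGL={fkgl(plain):.2f}, SMOG={smog(plain):.2f}")
```

Both formulas reward shorter sentences and fewer long words, which is consistent with a simplification prompt lowering the grade level alongside the word count. SMOG was designed for samples of 30 sentences, so scores for very short passages such as the two above are only indicative.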

Keywords: rheumatology, health literacy, patient education, readability, large language models, ChatGPT, peer-reviewed literature, artificial intelligence

Received: 18 Jun 2025; Accepted: 29 Sep 2025.

Copyright: © 2025 Mendoza Pinto, Munguia-Realpozo, Etchegaray Morales, Ramírez-Lara, Solis Poblano, García-Flores and Ayón-Aguilar. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

* Correspondence: Juan Carlos Solis Poblano, jchemato@yahoo.com

Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.