ORIGINAL RESEARCH article
Front. Cell Dev. Biol.
Sec. Molecular and Cellular Pathology
Volume 13 - 2025 | doi: 10.3389/fcell.2025.1600202
This article is part of the Research Topic: Artificial Intelligence Applications in Chronic Ocular Diseases, Volume II
Large language model-based multimodal system for detecting and grading ocular surface diseases from smartphone images
Provisionally accepted
- 1Ningbo Eye Hospital, Ningbo, Zhejiang Province, China
- 2Affiliated Eye Hospital to Wenzhou Medical University, Wenzhou, Zhejiang Province, China
- 3West China Second University Hospital, Sichuan University, Chengdu, Sichuan Province, China
- 4First People’s Hospital of Aksu, Xinjiang, China
- 5Shenzhen Eye Hospital, Shenzhen, China
Background: The development of medical artificial intelligence (AI) models is primarily driven by the need to address healthcare resource scarcity, particularly in underserved regions. An affordable, accessible, interpretable, and automated AI system for non-clinical settings is crucial to expanding access to quality healthcare.
Methods: This cross-sectional study developed the Multimodal Ocular Surface Assessment and Interpretation Copilot (MOSAIC) using three multimodal large language models (gpt-4-turbo, claude-3-opus, and gemini-1.5-pro-latest) to detect three ocular surface diseases (OSDs) and to grade keratitis and pterygium. A total of 375 smartphone-captured ocular surface images collected from 290 eyes were used to validate MOSAIC. Its performance was evaluated in both zero-shot and few-shot settings on four tasks: image quality control, OSD detection, keratitis severity analysis, and pterygium grading. The interpretability of the system was also evaluated.
Results: MOSAIC achieved 95.00% accuracy in image quality control, 86.96% in OSD detection, 88.33% in distinguishing mild from severe keratitis, and 66.67% in determining pterygium grades with five-shot settings. Performance improved significantly as the number of learning shots increased (p<0.01). The system attained ROUGE-L F1 scores of 0.70 to 0.78, indicating that its descriptions of the images were interpretable.
Conclusion: MOSAIC exhibited strong few-shot learning capabilities, achieving high accuracy in OSD management with minimal training examples. This system has significant potential for smartphone integration to enhance the accessibility and effectiveness of OSD detection and grading in resource-limited settings.
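For readers unfamiliar with the interpretability metric reported above, the following is a minimal sketch of how a ROUGE-L F1 score can be computed between a reference description and a model-generated one. It is not the authors' evaluation code: the whitespace tokenization, the lowercasing, the function names `lcs_length` and `rouge_l_f1`, and the example sentences are all assumptions made for illustration.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(reference, candidate):
    """ROUGE-L F1: harmonic mean of LCS-based precision and recall.

    Assumption: tokens are lowercased, whitespace-separated words.
    """
    ref = reference.lower().split()
    cand = candidate.lower().split()
    if not ref or not cand:
        return 0.0
    lcs = lcs_length(ref, cand)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand)  # share of candidate tokens in the LCS
    recall = lcs / len(ref)      # share of reference tokens in the LCS
    return 2 * precision * recall / (precision + recall)


# Hypothetical sentences, not taken from the study data.
reference = "the cornea shows a dense central infiltrate"
candidate = "the cornea shows a central infiltrate"
score = rouge_l_f1(reference, candidate)  # 12/13, about 0.92
```

In this made-up pair the candidate omits one word of the reference, so precision is 1.0, recall is 6/7, and the F1 is 12/13. A score in the reported 0.70 to 0.78 range would correspond to a looser but still substantial word-order overlap with the reference text.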
Keywords: Ocular surface disease, Large Language Model, Multimodal model, Keratitis, Conjunctivitis, Pterygium
Received: 26 Mar 2025; Accepted: 15 May 2025.
Copyright: © 2025 Li, Wang, Xiu, Zhang, Wang, Wang, Chen, Yang and Chen. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
* Correspondence:
Zhongwen Li, Ningbo Eye Hospital, Ningbo, Zhejiang Province, China
Weihua Yang, Shenzhen Eye Hospital, Shenzhen, 518040, China
Wei Chen, Affiliated Eye Hospital to Wenzhou Medical University, Wenzhou, Zhejiang Province, China
Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.