ORIGINAL RESEARCH article

Front. Artif. Intell.

Sec. Machine Learning and Artificial Intelligence

Volume 8 - 2025 | doi: 10.3389/frai.2025.1579375

AI in Conjunctivitis Research: Assessing ChatGPT and Deepseek for Etiology, Intervention, and Citation Integrity via Hallucination Rate Analysis

Provisionally accepted
Muhammad Hasnain1*, Khursheed Aurangzeb2, Musaed Alhussein2, Imran Ghani3, Muhammad Hamza Mahmood1
  • 1Lahore Leads University, Lahore, Pakistan
  • 2King Saud University, Riyadh, Riyadh, Saudi Arabia
  • 3Virginia Military Institute, Lexington city, Virginia, United States

The final, formatted version of the article will be published soon.

Large language models and their applications have gained significant attention due to their strengths in natural language processing. In this study, ChatGPT and Deepseek are used as AI models to assist in diagnosis, based on the responses they generate to clinical questions. ChatGPT, Claude, and Deepseek are also used to analyze images in order to assess their potential diagnostic capabilities, using the sensitivity analyses described in the article. We employ prompt engineering techniques and evaluate the models' ability to generate high-quality responses. We propose several prompts and use them to elicit key information on conjunctivitis. Our findings show that Deepseek excels at offering precise and comprehensive information on specific topics related to conjunctivitis, providing detailed explanations and in-depth medical insights. Conversely, ChatGPT provides generalized public information on the infection, which makes it more suitable for broader and less technical discussions. Deepseek performed better in this study, with a 7% hallucination rate compared to ChatGPT's 13%. Claude achieved 100% accuracy in binary classification, outperforming ChatGPT's 62.5% accuracy. Deepseek showed limited performance in interpreting the conjunctivitis image dataset. This comparative analysis serves as a reference for scholars and health professionals applying these models in varying medical contexts.
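
For readers unfamiliar with the two metrics quoted above, the following is a minimal Python sketch of how a hallucination rate and a binary classification accuracy are commonly computed. It is not the authors' evaluation code. The function names and all counts are hypothetical, chosen only so that the arithmetic reproduces percentages of the same size as those reported; the abstract does not state the study's actual sample sizes or labels.

```python
from typing import Sequence


def hallucination_rate(flags: Sequence[bool]) -> float:
    """Fraction of generated items (e.g. citations) flagged as hallucinated."""
    if not flags:
        raise ValueError("at least one item is required")
    return sum(flags) / len(flags)


def binary_accuracy(predicted: Sequence[int], actual: Sequence[int]) -> float:
    """Fraction of predictions that match the ground-truth labels."""
    if not actual or len(predicted) != len(actual):
        raise ValueError("inputs must be non-empty and of equal length")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)


# Hypothetical review outcome: 7 of 100 generated items judged hallucinated.
example_flags = [True] * 7 + [False] * 93
print(f"hallucination rate: {hallucination_rate(example_flags):.0%}")

# Hypothetical labels (1 = conjunctivitis, 0 = healthy) for 8 images.
example_actual = [1, 1, 1, 1, 0, 0, 0, 0]
example_predicted = [1, 1, 1, 0, 0, 0, 1, 1]
print(f"accuracy: {binary_accuracy(example_predicted, example_actual):.1%}")
```

With small test sets, a single misclassified image shifts accuracy by several percentage points, so figures such as 62.5% or 100% are best read alongside the number of images evaluated.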

Keywords: ChatGPT, Comprehensiveness, Deepseek, Eye infection, Prompts

Received: 19 Feb 2025; Accepted: 17 Jul 2025.

Copyright: © 2025 Hasnain, Aurangzeb, Alhussein, Ghani and Hamza Mahmood. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

* Correspondence: Muhammad Hasnain, Lahore Leads University, Lahore, Pakistan

Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.