ORIGINAL RESEARCH article

Front. Digit. Health

Sec. Connected Health

Volume 7 - 2025 | doi: 10.3389/fdgth.2025.1588143

AI-Generated Draft Replies to Patient Messages: Exploring Effects of Implementation

Provisionally accepted
  • 1University Medical Center Groningen, Groningen, Netherlands
  • 2Elisabeth Tweesteden Hospital (ETZ), Tilburg, Netherlands

The final, formatted version of the article will be published soon.

Introduction

The integration of Large Language Models (LLMs) into Electronic Health Records (EHRs) has the potential to reduce administrative burden. Validating these tools in real-world clinical settings is essential for responsible implementation. In this study, the effect of implementing LLM-generated draft responses to patient questions in our EHR is evaluated with regard to adoption, use, and potential time savings.

Materials and Methods

Physicians across 14 medical specialties in a large non-English academic hospital were invited to use LLM-generated draft replies during this 16-week prospective observational clinical cohort study, choosing either the drafted or a blank reply. The adoption rate, the level of adjustment of the initial drafted responses relative to the final sent messages (using ROUGE-1 and BLEU-1 natural language processing scores), and the time spent on these adjustments were analyzed.

Results

A total of 919 messages by 100 physicians were evaluated. Clinicians used the LLM draft in 58% of replies. Of these, 43% used a large part of the suggested text for the final answer (replies with ≥10% match to the draft: ROUGE-1 similarity 86%, vs. blank replies: 16%). Total response time did not differ significantly between blank replies and drafted replies with ≥10% match (157 vs. 153 seconds, p=0.69).

Discussion

General adoption of LLM-generated draft responses to patient messages was 58%, although the level of adjustment of the drafted message varied widely between medical specialties. This suggests safe use in a non-English, tertiary setting. The current implementation has not yet resulted in time savings, but a learning curve can be expected.
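The similarity between the drafted and the final sent message was quantified with ROUGE-1 and BLEU-1, which compare texts at the single-word (unigram) level: ROUGE-1 recall measures what fraction of the reference's words appear in the candidate, while BLEU-1 measures what fraction of the candidate's words appear in the reference. The sketch below is a minimal, illustrative implementation of these two unigram scores; it is not the authors' actual analysis pipeline, and the example messages and function names are hypothetical.

```python
from collections import Counter

def rouge1_recall(reference: str, candidate: str) -> float:
    # ROUGE-1 recall: fraction of reference unigrams also found in the candidate
    ref = Counter(reference.lower().split())
    cand = Counter(candidate.lower().split())
    overlap = sum((ref & cand).values())  # clipped unigram matches
    return overlap / max(sum(ref.values()), 1)

def bleu1_precision(reference: str, candidate: str) -> float:
    # BLEU-1 (modified unigram precision, brevity penalty omitted for clarity):
    # fraction of candidate unigrams also found in the reference
    ref = Counter(reference.lower().split())
    cand = Counter(candidate.lower().split())
    overlap = sum((ref & cand).values())
    return overlap / max(sum(cand.values()), 1)

# Hypothetical example: an LLM draft vs. the message the physician actually sent
draft = "thank you for your message your lab results are normal"
final = "thank you for your message the lab results look normal"
print(rouge1_recall(draft, final))   # high score: most draft words were kept
print(bleu1_precision(draft, final))
```

A heavily edited or blank-start reply shares few unigrams with the draft and scores near the 16% similarity reported for blank replies, whereas a lightly edited draft scores near the 86% reported for high-match replies.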

Keywords: large language model (LLM), in-basket messages, adoption, time saving, LLM-generated draft responses, electronic health records

Received: 09 Apr 2025; Accepted: 26 May 2025.

Copyright: © 2025 Bootsma-Robroeks, Workum, Schuit, Mehri, Hoekman, Doornberg, van der Laan and Schoonbeek. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

* Correspondence: Charlotte M.H.H.T. Bootsma-Robroeks, University Medical Center Groningen, Groningen, Netherlands

Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.