ORIGINAL RESEARCH article
Front. Digit. Health
Sec. Health Informatics
Unlocking Electronic Health Records: A Hybrid Graph RAG Approach to Safe Clinical AI for Patient QA
Samuel Thio 1,2,3,4
Matthew Lewis 1
Spiros Denaxas 1,5,6,7
Richard James Butler Dobson 1,4,2,8,9,10
1. University College London, London, United Kingdom
2. King's College London Institute of Psychiatry, Psychology & Neuroscience, London, United Kingdom
3. UKRI Engineering and Physical Sciences Research Council DRIVE-Health CDT, London, United Kingdom
4. NIHR Maudsley Biomedical Research Centre, London, United Kingdom
5. IT:U Interdisciplinary Transformation University Austria, Linz, Austria
6. British Heart Foundation, London, United Kingdom
7. Ethniko kai Kapodistriako Panepistemio Athenon, Athens, Greece
8. NIHR University College London Hospitals Biomedical Research Centre, London, United Kingdom
9. Health Data Research UK, London, United Kingdom
10. CogStack Limited, London, United Kingdom
Abstract
Electronic health record (EHR) systems present clinicians with vast repositories of clinical information, creating a significant cognitive burden in which critical details are easily overlooked. While Large Language Models (LLMs) offer transformative potential for data processing, they face significant limitations in clinical settings, particularly regarding context grounding and hallucinations. Current solutions typically isolate retrieval methods, focusing either on structured data (SQL/Cypher) or on unstructured semantic search, and fail to integrate both. This work presents MediGRAF (Medical Graph Retrieval Augmented Framework), a novel hybrid Graph RAG system that bridges this gap. By combining Neo4j Text2Cypher capabilities for structured relationship traversal with vector embeddings for unstructured narrative retrieval, MediGRAF enables natural language querying of the complete patient journey. Using 10 patients from the MIMIC-IV dataset, we built a knowledge graph of 5,973 nodes and 5,963 relationships, sufficient for patient-level question answering (QA), and evaluated the architecture across varying query complexities. The system achieved 100% recall for factual queries, meaning that all relevant information was retrieved and included in the output, while complex inference tasks achieved a mean expert quality score of 4.25/5 with zero safety violations. These results demonstrate that hybrid graph-grounding significantly advances clinical information retrieval, offering a safer, more comprehensive alternative to standard LLM deployments.
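To make the hybrid retrieval idea concrete, the following is a minimal, illustrative sketch and not the MediGRAF implementation. It assumes a toy in-memory list of triples in place of a Neo4j graph queried via Text2Cypher, and a bag-of-words cosine similarity in place of learned vector embeddings. All patient identifiers, relations, notes, and function names (`structured_lookup`, `semantic_lookup`, `hybrid_context`, `recall`) are hypothetical and invented for illustration; none of the data comes from MIMIC-IV.

```python
import math
from collections import Counter

# Hypothetical structured side: (subject, relation, object) triples
# standing in for a patient graph. Invented data, not from MIMIC-IV.
TRIPLES = [
    ("patient_1", "HAS_DIAGNOSIS", "heart failure"),
    ("patient_1", "PRESCRIBED", "furosemide"),
    ("patient_2", "HAS_DIAGNOSIS", "pneumonia"),
]

# Hypothetical unstructured side: free-text notes keyed by patient.
NOTES = [
    ("patient_1", "Patient reports worsening shortness of breath and ankle swelling."),
    ("patient_1", "Tolerating diuretics well, weight stable."),
    ("patient_2", "Productive cough and fever for three days."),
]


def structured_lookup(patient, relation):
    """Exact traversal: return every object linked to the patient by the relation."""
    return [o for s, r, o in TRIPLES if s == patient and r == relation]


def _bow(text):
    """Bag-of-words vector; a crude stand-in for a learned embedding."""
    cleaned = text.lower().replace(".", " ").replace(",", " ")
    return Counter(cleaned.split())


def _cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def semantic_lookup(patient, question, k=1):
    """Rank the patient's notes by similarity to the question; keep the top k with a non-zero score."""
    query = _bow(question)
    scored = [(_cosine(query, _bow(text)), text) for p, text in NOTES if p == patient]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for score, text in scored[:k] if score > 0]


def hybrid_context(patient, relation, question, k=1):
    """Combine structured facts and narrative snippets into one context for an LLM prompt."""
    return {
        "facts": structured_lookup(patient, relation),
        "notes": semantic_lookup(patient, question, k),
    }


def recall(retrieved, relevant):
    """Fraction of relevant items that were retrieved (1.0 means nothing was missed)."""
    relevant = set(relevant)
    if not relevant:
        return 1.0
    return len(relevant & set(retrieved)) / len(relevant)


if __name__ == "__main__":
    context = hybrid_context("patient_1", "HAS_DIAGNOSIS", "shortness of breath")
    print(context)
    print(recall(context["facts"], ["heart failure"]))
```

In this sketch the structured path returns exact facts while the similarity path surfaces narrative evidence, and both are passed to the model together. The `recall` helper mirrors the abstract's definition: a value of 1.0 means every relevant item appears in the retrieved set.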
Summary
Keywords
Electronic Health Records, Knowledge graphs, Large language models, Natural Language Processing, Neo4j, Retrieval-Augmented Generation, Text2Cypher, Vector Embeddings
Received
04 January 2026
Accepted
17 February 2026
Copyright
© 2026 Thio, Lewis, Denaxas and Dobson. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
*Correspondence: Samuel Thio
Disclaimer
All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.