ORIGINAL RESEARCH article
Front. Artif. Intell.
Sec. Natural Language Processing
Volume 8 - 2025 | doi: 10.3389/frai.2025.1634774
This article is part of the Research Topic: Medical Knowledge-Assisted Machine Learning Technologies in Individualized Medicine Volume II
Named Entity Recognition for Chinese Electronic Medical Records by Integrating Knowledge Graph and ClinicalBERT
Provisionally accepted
Xuzhou Medical University, Xuzhou, China
General-purpose language models often struggle to identify domain-specific medical terminology accurately, which leads to suboptimal performance in named entity recognition tasks. This study proposes a method for named entity recognition in Chinese electronic medical records that combines ClinicalBERT, a language model pre-trained on clinical corpora, with structured knowledge from a medical knowledge graph. To enhance semantic understanding, entity representations derived using Translating Embeddings (TransE) are incorporated into the model. In addition, the method integrates multiple character-level features, including positional labels, contextual category clues, and semantic embeddings, which help improve boundary detection in Chinese texts that lack explicit word delimiters. The input texts are first annotated using the Begin, Inside, Outside, End, Single (BIOES) tagging scheme, then encoded by ClinicalBERT and passed through a bidirectional long short-term memory (BiLSTM) network followed by a conditional random field (CRF) layer for label prediction. Experimental results on public datasets show that the proposed method achieves an F1 score of 89.44%, outperforming existing baselines and demonstrating its effectiveness for clinical applications.
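Two of the building blocks named in the abstract can be illustrated with a minimal Python sketch: character-level BIOES tagging and the TransE plausibility score. This is not the authors' implementation. The function names `bioes_tags` and `transe_distance`, the example sentence, the span offsets, and the `DISEASE` and `SYMPTOM` labels are all assumptions made for illustration. The TransE function only reflects the standard formulation, in which a valid triple satisfies head + relation ≈ tail.

```python
import math
from typing import List, Sequence, Tuple


def bioes_tags(text: str, spans: List[Tuple[int, int, str]]) -> List[str]:
    """Assign one BIOES tag per character.

    Each span is (start, end_exclusive, label); characters outside
    every span are tagged "O". Labels here are hypothetical examples.
    """
    tags = ["O"] * len(text)
    for start, end, label in spans:
        if not (0 <= start < end <= len(text)):
            raise ValueError(f"span out of range: {(start, end, label)}")
        if end - start == 1:
            # A one-character entity gets the Single tag.
            tags[start] = f"S-{label}"
        else:
            tags[start] = f"B-{label}"
            for i in range(start + 1, end - 1):
                tags[i] = f"I-{label}"
            tags[end - 1] = f"E-{label}"
    return tags


def transe_distance(
    head: Sequence[float], relation: Sequence[float], tail: Sequence[float]
) -> float:
    """L2 norm of (head + relation - tail); smaller means more plausible."""
    return math.sqrt(
        sum((h + r - t) ** 2 for h, r, t in zip(head, relation, tail))
    )


# Illustrative sentence: "the patient has a history of diabetes".
# The entity occupies character indices 3 to 5 (end index 6 is exclusive).
text = "患者有糖尿病史"
tags = bioes_tags(text, [(3, 6, "DISEASE")])
for char, tag in zip(text, tags):
    print(char, tag)
```

Tagging per character rather than per word avoids a separate word segmentation step, which matters for Chinese text without explicit delimiters. The explicit End and Single tags give a downstream sequence labeler a direct signal about where an entity stops.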
Keywords: Named entity recognition, ClinicalBERT, Chinese electronic medical records, Knowledge graphs, BiLSTM, CRF
Received: 25 May 2025; Accepted: 18 Aug 2025.
Copyright: © 2025 Xu and Ma. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
* Correspondence: Kai Ma, Xuzhou Medical University, Xuzhou, China
Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.