
ORIGINAL RESEARCH article

Front. Neurosci., 12 January 2026

Sec. Neuroscience Methods and Techniques

Volume 19 - 2025 | https://doi.org/10.3389/fnins.2025.1726720

A knowledge-driven framework for surgical safety check integration using speech recognition and speaker verification

Wen Shi 1, Rui Fan 1, Jian Hu 1, Jingrong Wang 1, Mei Bian 2 and Wei Jiang 1*

  • 1. Air Force Medical University Tangdu Hospital, Xi’an, China

  • 2. Department of Gastroenterology, The First Affiliated Hospital of Xi'an Jiaotong University, Xi’an, China


Abstract

Background:

The WHO Surgical Safety Checklist reduces preventable errors, but real operating rooms are characterized by overlapping speech, high ambient noise, similar vocalizations, uncertain speaker identity, and occasional omission of checklist items. ASR-only systems that ignore identity constraints and semantic–temporal structure are therefore prone to misrecognition and incorrect verification.

Method:

We design an integrated verification framework that couples automatic speech recognition and speaker verification (ASR + SV) with a knowledge-driven rule engine derived from the WHO checklist. Multichannel audio is processed by a Conformer-based ASR module and an ECAPA-TDNN speaker verification model, after which a rule layer enforces consistency across the semantic content, speaker role, and checklist phase using an explicit ontology and conflict-resolution rules. The system generates real-time prompts in four states (“pass,” “fault,” “alarm,” “uncertain”). Performance is evaluated primarily in high-fidelity simulated operating-room scenarios with controlled noise levels, speaking distances, and multi-speaker interactions, using word error rate (WER), equal error rate (EER), checklist verification accuracy, and alarm rate. Three configurations are compared on the same held-out sessions—“ASR-only,” “ASR + SV,” and the full knowledge-driven method—and ablation experiments isolate the contribution of the rule layer.

Results:

Under medium-to-high noise and multi-speaker interference, and relative to the ASR-only baseline, the full framework reduced WER from 18.7% to approximately 13.5%, while the speaker verification component achieved an EER of about 3.1%. Checklist verification accuracy reached 93.8%, while the alarm rate decreased to roughly 2.7%. The knowledge layer corrected errors arising from homophones, accent drift, and role confusion by constraining “role–semantics–process” relations, and maintained robust performance at speaking distances up to 1.5 m and background noise of 60 dB. Residual failures were mainly associated with extreme speech overlap and unseen vocabulary, suggesting that lexicon adaptation and speech separation will be necessary for further gains.

Conclusion:

The proposed knowledge-driven ASR + SV framework jointly addresses semantic correctness and speaker identity while remaining interpretable, auditable, and suitable for embedded deployment. It provides a technical foundation for “time-out” and “operation review” functions in intelligent operating rooms. Because the present validation is based largely on simulated scenarios with limited real-world testing and no formal user or ethical evaluation, future work will focus on clinical pilot studies, integration with electronic medical records and multimodal OR data, and a deeper analysis of privacy, accountability, and workflow acceptance.

1 Introduction

Studies consistently show that the WHO Surgical Safety Checklist (WHO 19-item checklist) reduces perioperative complications and mortality, yet its implementation in the operating room (OR) remains fragile. Practical barriers include incomplete team participation, competing tasks, noisy environments, and variation in how protocols are applied. In a large multicenter trial, adoption of the checklist reduced in-hospital complication rates from 11.0 to 7.0% and mortality from 1.5 to 0.8% (Nakawala et al., 2017). With an estimated 234 million surgical procedures performed worldwide each year, even modest gains in checklist compliance translate into substantial population-level benefit. Nonetheless, recent sentinel event reports still attribute wrong-site and other preventable adverse events to issues such as improper checklist placement, incomplete or rushed time-out procedures, distracted attention, and lack of team consensus (Meng et al., 2025). The Joint Commission’s 2023 review noted that “improper surgical procedures” remained among the top three serious events, with a 26% increase over 2022, frequently citing inadequate time-outs, distraction, and unresolved disagreement as root causes. U.S. data suggest that wrong-site or wrong-procedure events occur between 0.09 and 4.5 per 10,000 cases annually (Pandithawatta et al., 2024).

From an environmental standpoint, the OR is a high-noise, multi-speaker setting. Systematic reviews report that equivalent continuous noise levels rarely fall below 50 dB and more commonly range between 51 and 79 dB, with most procedures taking place around 53–63 dB (Zhang et al., 2025). In specialties such as orthopedics, the use of saws, drills, and hammering can push peak levels to 95–100 dB or higher, substantially degrading speech intelligibility and team communication. Human studies on speech intelligibility in hospital-like acoustic environments confirm that characteristic hospital noise and concurrent speech markedly reduce correct comprehension of medical sentences, implying that purely verbal, manually enforced checklists are vulnerable to both ambient noise and overlapping speech (Hızıroğlu et al., 2022).

Against this backdrop, automatic speech recognition (ASR) has been explored for “hands-free” checklist verification and intraoperative interaction (Zhou et al., 2022). However, clinical speech differs sharply from the clean, single-speaker audio used to train commercial ASR: accents and code-switching are common, new or rare medical terms appear frequently, homophonous phrases occur in safety-critical slots, and background noise and overlapping talkers are the norm rather than the exception. These factors raise word error rates (WER) and uncertainty for generic ASR systems deployed in OR environments, while leaving the question of “who is speaking” largely unaddressed (Liu et al., 2020). Comparative evaluations already show substantial gaps between general-purpose and domain-specific ASR in both accuracy and fairness across diverse speaker populations, underscoring the need for clinical scenario–specific adaptation and robustness assessment (Madhavaraj et al., 2022).

Early attempts to use voice assistants or smart speakers as time-out aids have shown encouraging results. For example, deep learning–based voice systems for cataract surgery achieved high accuracy in identifying patient information and key procedural elements under both simulated and real conditions, suggesting that speech technology can support surgical safety checks (Feng, 2022; Li et al., 2021). Yet systems that rely only on ASR to convert “what was heard” into “verification passed” remain incomplete. Effective verification requires staged checkpoints (sign-in, time-out, sign-out), clearly assigned responsibilities for surgeons, anesthesiologists, and circulating nurses, and adherence to timing and sequence constraints (Toussaint et al., 2021). Everyday spoken communication adds further uncertainty through synonyms, routine omissions, slips of the tongue, and cross-linguistic expressions. These features call for verification not just at the transcript level, but through structured matching to knowledge rules, consistency checks across phases and roles, and targeted anomaly alerts (Hoebert et al., 2023). Analyses of wrong-site surgery repeatedly highlight “insufficient timeout protocols” and “inadequate team consensus” as central risk factors, rather than mere absence of spoken words.

Building on this body of evidence, we propose and evaluate a knowledge-driven framework for surgical safety checklist verification in noisy, multi-speaker OR environments. The system couples automatic speech recognition with speaker verification (ASR + SV) and a WHO-checklist–based knowledge graph and rule engine. It performs consistency reasoning, conflict resolution, and risk alerts across semantic, role, and procedural dimensions, explicitly modeling who is speaking, what they are saying, and at which checklist phase. The goal is not to replace clinical judgment, but to provide an interpretable, auditable, and robust verification layer that fits within existing team workflows. By reducing missed and erroneous checks without imposing additional manual steps, the framework is intended to strengthen checklist implementation and support a more reliable culture of surgical safety.

2 Methods

2.1 System overview

Figure 1 illustrates the overall architecture of the knowledge-driven framework for surgical safety verification that jointly uses speech recognition and speaker verification. The central design goal is to provide dual guarantees of semantic alignment and identity reliability for each checklist item.

Figure 1


Knowledge-driven framework for integrating speech recognition and speaker verification into surgical safety checks.

The left panel of Figure 1 depicts typical verification scenes in which surgeons, anesthesiologists, and circulating nurses verbally confirm the patient’s identity, surgical site, procedure, and critical risk items at the sign-in, time-out, and sign-out phases. The right panel shows the corresponding system pipeline. Multichannel audio from the operating room is captured by a microphone array and passed to the automatic speech recognition (ASR) module, which produces a time-stamped transcript with token-level confidence scores. In parallel, the same audio segments are fed into the speaker verification (SV) module to estimate the likelihood that each utterance was produced by a registered OR team member in a specific role.

The ASR output is then processed by a semantic understanding layer, which identifies checklist entities (e.g., body site, procedure name, patient identifier) and maps them into a structured representation. A knowledge reasoning engine built on the WHO Surgical Safety Checklist ontology validates whether the semantic content, the speaking role, and the checklist phase are mutually consistent in time. After multi-dimensional verification, a knowledge assistant issues audio-visual feedback to the OR team in one of three states: pass (“Checklist complete”), uncertain (recheck required), or alert (potential error). Because the entire process is driven by explicit rules and structured knowledge, every decision is interpretable and traceable, which is essential for deployment in intelligent operating rooms.

Figure 2 provides a complementary end-to-end view of the signal flow, decomposing the system into five core components: ASR, SV, semantic understanding, knowledge reasoning, and feedback/control. Speech inputs are first transcribed by the ASR module and authenticated by the SV module to ensure that the speaker matches a registered clinical role. A natural language processing (NLP) layer then extracts entities and events from the ASR output using pre-trained embeddings. The knowledge reasoning layer applies rule bases and ontology constraints to check temporal coherence and cross-role consistency. Finally, a verification and feedback layer aggregates decisions over time, logs all events, and drives the visual and spoken prompts to the clinical team.

Figure 2


End-to-end workflow of the speech-driven surgical safety check system.

2.2 Dataset

To support training, validation, and evaluation of both ASR and SV components in realistic operating-room conditions, we constructed a dedicated speech corpus for surgical safety checklist verification. The dataset combines simulated surgical dialogs, recorded teaching OR sessions, noise-augmented variants, and controlled overlapping-speech segments to approximate the acoustic and interaction patterns observed in real procedures.

All prompts and dialogs were derived from the WHO Surgical Safety Checklist, with utterances organized around the three canonical phases: sign-in, time-out, and sign-out. Within each phase, scripts covered patient identity confirmation, surgical site and procedure verification, anesthesia risk assessment, antibiotic prophylaxis, equipment and implant checks, and instrument counts. Directional microphones and multichannel audio recorders were used to capture speech while background noise (50–80 dB) from surgical devices and monitors was injected to emulate OR conditions. The corpus includes variations in speaking rate, loudness, and regional accent to improve robustness and generalization.
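For readers who wish to reproduce the noise-injection step, the following minimal Python sketch mixes a background-noise recording into a clean utterance at a target signal-to-noise ratio. The function name and the SNR parameterization are illustrative assumptions rather than the exact augmentation pipeline used for this corpus.

```python
import numpy as np

def mix_at_snr(speech: np.ndarray, noise: np.ndarray, snr_db: float) -> np.ndarray:
    """Mix background noise into a speech signal at a target SNR (in dB).

    Both inputs are 1-D float arrays at the same sampling rate; the noise is
    tiled or truncated to match the speech length before scaling.
    """
    if len(noise) < len(speech):
        noise = np.tile(noise, int(np.ceil(len(speech) / len(noise))))
    noise = noise[: len(speech)]

    speech_power = np.mean(speech ** 2) + 1e-12
    noise_power = np.mean(noise ** 2) + 1e-12
    # Scale the noise so that 10 * log10(speech_power / scaled_noise_power) == snr_db.
    scale = np.sqrt(speech_power / (noise_power * 10 ** (snr_db / 10)))
    return speech + scale * noise
```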

Each utterance was manually transcribed and annotated with a two-layer scheme. At the interaction layer, annotators labeled the speaker role (surgeon, anesthesiologist, or nurse), checklist phase, and semantic category (e.g., patient ID, site, procedure, instrument count). At the acoustic layer, they recorded the approximate noise level (low/medium/high) and an annotator confidence score for the transcript. Disagreements were resolved by consensus.

For model development, the data were split into training, validation, and test subsets in a 7:1.5:1.5 ratio, with disjoint speakers across splits to avoid identity leakage in both ASR and SV. The composition of the dataset is summarized in Table 1, which lists the main data types, word counts, and storage sizes for each source.
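A speaker-disjoint split of this kind can be implemented with grouped splitting, as in the sketch below; the 70/15/15 proportions follow the ratio above, while the variable names and the use of scikit-learn's GroupShuffleSplit are assumptions made for illustration.

```python
from sklearn.model_selection import GroupShuffleSplit

def speaker_disjoint_split(utterance_ids, speaker_ids, seed=0):
    """Split utterances 70/15/15 so that no speaker appears in more than one subset."""
    # Carve out the 70% training portion by speaker group.
    outer = GroupShuffleSplit(n_splits=1, train_size=0.7, random_state=seed)
    train_idx, rest_idx = next(outer.split(utterance_ids, groups=speaker_ids))

    # Split the remaining 30% of speakers evenly into validation and test sets.
    rest_speakers = [speaker_ids[i] for i in rest_idx]
    inner = GroupShuffleSplit(n_splits=1, train_size=0.5, random_state=seed)
    val_rel, test_rel = next(inner.split(rest_idx, groups=rest_speakers))

    return train_idx, [rest_idx[i] for i in val_rel], [rest_idx[i] for i in test_rel]
```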

Table 1

Data type | Words (N) | Total size (MB) | Source | Purpose
Simulated surgical dialog | 48,537 | 623.4 | Operating room simulation (multi-speaker) | Model training for ASR + SV
Teaching OR recordings | 21,384 | 312.6 | Recorded teaching sessions | Domain adaptation and fine-tuning
Multi-speaker overlap speech | 13,156 | 188.9 | Controlled overlapping dialogs | Validation under interference
Whisper and low-volume speech | 9,742 | 156.3 | Simulated low-energy utterances | Speech enhancement assessment
Noise-injected augmented data | 15,247 | 262.8 | Background noise mixing | Robustness improvement
Cross-accent speech samples | 10,958 | 178.4 | Regional accent recordings | Accent adaptation testing
Background noise library | – | 96.7 | Surgical device and monitor sounds | Data augmentation
Evaluation test set | 9,823 | 142.5 | Mixed OR speech samples | Final system benchmarking

Composition and labeling scheme of the surgical speech dataset.

2.3 Automatic speech recognition (ASR) module

The ASR module is responsible for real-time decoding of spoken checklist content into semantically structured text that can be consumed by the knowledge engine. We implemented a compact end-to-end convolutional neural network (CNN) architecture operating on Mel-spectrogram features and optimized it for deployment on resource-constrained edge devices.

In the preprocessing stage, raw waveforms are resampled, noise-reduced, amplitude-normalized, and segmented using voice activity detection to remove long silences. A short-time Fourier transform (STFT) with a 25 ms window and 10 ms stride is applied, and the resulting spectra are projected onto 40 Mel filter banks to obtain a 2D time–frequency representation. Each input segment is represented as a 40 × 98 × 1 “image” that feeds into the CNN.
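A minimal front end matching these settings (25 ms window, 10 ms stride, 40 Mel bands) can be written with librosa as below; the 16 kHz sampling rate, 512-point FFT, and fixed 98-frame crop are assumptions chosen to yield the 40 × 98 × 1 input described above, not the exact preprocessing code used in the study.

```python
import numpy as np
import librosa

def log_mel_features(wav: np.ndarray, sr: int = 16000, n_frames: int = 98) -> np.ndarray:
    """Compute a 40 x n_frames x 1 log-Mel 'image' for the CNN front end."""
    mel = librosa.feature.melspectrogram(
        y=wav, sr=sr, n_fft=512, win_length=400, hop_length=160, n_mels=40
    )                                                   # 25 ms window, 10 ms hop at 16 kHz
    log_mel = librosa.power_to_db(mel, ref=np.max)      # shape (40, T)
    # Pad or crop along time so every segment has a fixed 98-frame width.
    if log_mel.shape[1] < n_frames:
        log_mel = np.pad(log_mel, ((0, 0), (0, n_frames - log_mel.shape[1])))
    return log_mel[:, :n_frames, np.newaxis]
```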

The feature extractor consists of stacked convolutional layers with 3 × 3 kernels and batch normalization, interleaved with max-pooling layers that gradually reduce temporal and frequency resolution. The first several convolutional blocks focus on low-level spectral patterns, while deeper blocks capture longer-range coarticulation and prosodic cues. Rectified linear unit (ReLU) activations introduce nonlinearity, and a dropout rate of 0.2 is applied to mitigate overfitting. After the convolutional stack, feature maps are flattened and passed to fully connected layers with a final softmax output over a vocabulary of key phrases and entities relevant to the checklist.
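The sketch below shows one way to realize such a network in TensorFlow/Keras, consistent with the description above (3 × 3 convolutions, batch normalization, ReLU, max pooling, 0.2 dropout, softmax over a key-phrase vocabulary); the number of blocks and the filter counts are assumptions, not the exact published architecture.

```python
import tensorflow as tf

def build_asr_cnn(num_classes: int, input_shape=(40, 98, 1)) -> tf.keras.Model:
    """Compact CNN key-phrase classifier over log-Mel inputs."""
    inputs = tf.keras.Input(shape=input_shape)
    x = inputs
    for filters in (32, 64, 128):                      # three conv blocks; filter counts assumed
        x = tf.keras.layers.Conv2D(filters, 3, padding="same")(x)
        x = tf.keras.layers.BatchNormalization()(x)
        x = tf.keras.layers.ReLU()(x)
        x = tf.keras.layers.MaxPooling2D(pool_size=2)(x)
    x = tf.keras.layers.Flatten()(x)
    x = tf.keras.layers.Dropout(0.2)(x)                # dropout rate from the text
    x = tf.keras.layers.Dense(128, activation="relu")(x)
    outputs = tf.keras.layers.Dense(num_classes, activation="softmax")(x)
    model = tf.keras.Model(inputs, outputs)
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model
```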

To link ASR output to patient-specific information, recognized tokens are matched against entries from the hospital information system (HIS) and electronic medical record (EMR), such as patient identifiers, scheduled procedures, and planned surgical sites. The ASR module exposes both token sequences and confidence scores to the downstream knowledge reasoning layer, which uses them for semantic validation and decision fusion.
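Matching recognized entities to HIS/EMR entries can be done with simple fuzzy string similarity, as sketched below; the difflib-based scoring, the 0.8 threshold, and the example field values are illustrative assumptions rather than the interface of any specific hospital system.

```python
from difflib import SequenceMatcher

def match_to_emr(recognized: str, candidates: list[str], threshold: float = 0.8):
    """Return the best-matching EMR/HIS entry for a recognized token, or None.

    Used, for example, to map an ASR hypothesis such as 'left eye phaco'
    onto the scheduled procedure string stored for the patient.
    """
    best, best_score = None, 0.0
    for candidate in candidates:
        score = SequenceMatcher(None, recognized.lower(), candidate.lower()).ratio()
        if score > best_score:
            best, best_score = candidate, score
    return (best, best_score) if best_score >= threshold else (None, best_score)
```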

The ASR network was implemented in Python using TensorFlow and trained on the corpus described in Section 2.2. The final model contains fewer than 1.3 million parameters and runs in real time on NVIDIA Jetson Nano and Raspberry Pi 4B platforms, with an end-to-end latency below 200 ms for typical utterances. Under medium noise conditions (~70 dB), the system achieved an average word error rate (WER) of 7.4% and a checklist key-phrase accuracy of 91.2%, providing a reliable front end for subsequent knowledge inference and speaker verification (Figure 3).

Figure 3


Schematic structure of the ASR module.
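For reference, the WER figures reported above follow the standard definition (word-level edit distance between hypothesis and reference, divided by the number of reference words); a minimal implementation is sketched below.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference word count."""
    ref, hyp = reference.split(), hypothesis.split()
    # Dynamic-programming edit distance between the two word sequences.
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # substitution or match
    return d[len(ref)][len(hyp)] / max(len(ref), 1)
```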

2.4 Speaker verification module

Speaker verification is used to ensure that critical checklist statements are spoken by authorized personnel in the appropriate clinical role. We adopt an ECAPA-TDNN-based SV module, following standard practice for text-independent speaker recognition, and enroll each OR team member (e.g., primary surgeon, anesthesiologist, circulating nurse) prior to the validation sessions.

During enrollment, multiple utterances per speaker are recorded under OR-like conditions and passed through a front-end feature extractor to obtain log-Mel filterbank features. The ECAPA-TDNN network maps each utterance to a fixed-dimensional embedding (speaker vector), and a speaker classifier is trained on the training set to discriminate among enrolled speakers using an additive-margin softmax loss. At inference time, embeddings from enrollment utterances are averaged to form a template for each speaker.

For an incoming utterance, the SV module computes its embedding and measures similarity to each enrolled template using cosine similarity. A verification score is obtained by comparing the similarity to a speaker-specific threshold selected on the validation set to balance false accept and false reject rates. The module outputs both a predicted identity (speaker ID and role) and a confidence score, which are passed to the knowledge reasoning engine.
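A minimal scoring routine in this style is sketched below; it assumes the ECAPA-TDNN embeddings are produced by a separate extractor, and the dictionary-based templates and per-speaker thresholds are illustrative assumptions rather than the deployed implementation.

```python
import numpy as np

def enroll(embeddings: np.ndarray) -> np.ndarray:
    """Average L2-normalized enrollment embeddings into a speaker template."""
    normed = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    template = normed.mean(axis=0)
    return template / np.linalg.norm(template)

def verify(utterance_emb: np.ndarray, templates: dict, thresholds: dict):
    """Return (best_speaker, cosine_score, accepted) for one utterance embedding."""
    emb = utterance_emb / np.linalg.norm(utterance_emb)
    scores = {spk: float(np.dot(emb, tpl)) for spk, tpl in templates.items()}
    best = max(scores, key=scores.get)
    return best, scores[best], scores[best] >= thresholds[best]
```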

We evaluate SV performance using the equal error rate (EER), defined as the operating point at which the false acceptance rate equals the false rejection rate, as well as role classification accuracy for checklist utterances. In the integrated system, SV scores are not used in isolation but are combined with ASR confidence and knowledge-rule checks to form the final decision.
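EER can be estimated from the ROC curve at the operating point where the false-acceptance and false-rejection rates cross, for example as follows (a sketch assuming binary target/impostor labels and real-valued similarity scores).

```python
import numpy as np
from sklearn.metrics import roc_curve

def equal_error_rate(labels, scores) -> float:
    """labels: 1 for target (same-speaker) trials, 0 for impostor trials."""
    fpr, tpr, _ = roc_curve(labels, scores)
    fnr = 1.0 - tpr
    idx = np.nanargmin(np.abs(fnr - fpr))   # operating point where FAR and FRR cross
    return float((fpr[idx] + fnr[idx]) / 2.0)
```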

2.5 Knowledge reasoning and decision fusion

The knowledge reasoning module integrates outputs from ASR, SV, and hospital information systems into a unified decision about each checklist item. It is built on a knowledge base derived from the WHO Surgical Safety Checklist, encoded as an ontology of roles, tasks, and phases and a set of IF–THEN rules that define valid combinations of these elements.

Formally, each event is represented as a triple (role, task, phase) with associated metadata (timestamp, ASR confidence, SV confidence, planned procedure context). Rules specify, for example, which role is authorized to confirm a particular item, in which phase it must occur, and which semantic values are considered correct given the scheduled surgery. Table 2 shows representative rules for surgical-site confirmation and instrument counts.

Table 2

Rule ID | Input parameters | Decision logic | Example scenario | Output decision
R1 | ASR text = “Left eye,” SV = Surgeon | IF semantic = planned site AND speaker = correct role → pass | Surgeon confirms correct eye before incision | Checklist complete
R2 | ASR text = “Right eye,” SV = Surgeon | IF semantic ≠ planned site → alert | Wrong site detected at time-out | Alert
R3 | ASR text = “Left eye,” SV = Nurse | IF semantic correct BUT role ≠ authorized → uncertain | Nurse repeats surgical site confirmation | Uncertain
R4 | ASR text confidence < 0.75, SV confidence > 0.9 | IF text low-confidence AND role verified → recheck | Speech unclear due to background noise | Uncertain (reconfirm)
R5 | ASR text = “All sponges counted,” SV = Nurse | IF phase = sign-out AND semantic verified → pass | End-of-procedure instrument check | Checklist complete

Examples of decision-fusion rules for checklist verification.
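A minimal rule-engine sketch in the spirit of R1–R5 is given below; the event fields and the 0.75/0.9 confidence thresholds mirror Table 2, but the code structure itself is an illustrative assumption rather than the deployed implementation.

```python
from dataclasses import dataclass

@dataclass
class CheckEvent:
    phase: str           # "sign-in" | "time-out" | "sign-out"
    role: str            # speaker role predicted by the SV module
    semantic_value: str  # entity extracted from the ASR transcript
    asr_conf: float
    sv_conf: float

def decide(event: CheckEvent, planned_value: str, authorized_role: str) -> str:
    """Return 'pass', 'alert', or 'uncertain' for one checklist item (cf. R1-R5)."""
    if event.asr_conf < 0.75 and event.sv_conf > 0.9:
        return "uncertain"                # R4: unclear speech from a verified speaker
    if event.semantic_value != planned_value:
        return "alert"                    # R2: content conflicts with the planned procedure
    if event.role != authorized_role:
        return "uncertain"                # R3: correct content, unauthorized role
    return "pass"                         # R1 / R5: content and role both match
```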

When both ASR and SV outputs are available for an utterance, the decision fusion engine performs a multi-stage reasoning process:

  • Semantic validation: The ASR transcript is parsed to identify candidate entities (e.g., “left eye”) and mapped to the appropriate ontology slots (e.g., body site). If the semantic value conflicts with the planned procedure (e.g., “right eye” when the EMR lists “left eye”), an alert is raised.

  • Role validation: The SV module’s predicted identity is checked against the authorized role for the current task. If the content is semantically correct but the speaker is not the expected role (e.g., a nurse confirms a site that must be confirmed by the surgeon), the outcome is marked uncertain and a re-prompt is issued.

  • Phase and timing validation: The timestamp and phase label are compared to the expected temporal window for each checklist item. Statements that occur too early or too late can be flagged for confirmation.

  • Confidence fusion: ASR and SV confidence scores are combined using a simple confidence-fusion strategy (e.g., weighted aggregation) to obtain a final decision score. Low ASR confidence but high SV confidence may trigger an uncertain state with a request to repeat, whereas high-confidence semantic and identity matches produce a pass decision.

Returning to the rules in Table 2: Rule R1 encodes the case where a surgeon correctly confirms the planned left-eye procedure at time-out, leading to a “Checklist complete” decision. Rule R2 captures wrong-site declarations, which generate an alert. Rules R3 and R4 illustrate ambiguous cases in which either the speaking role or the ASR confidence is insufficient, prompting an uncertain decision and a recheck.
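The confidence-fusion step described above can be sketched as a weighted combination of ASR and SV scores mapped to the three decision states; the weights and thresholds below are assumptions that would in practice be tuned on the validation set, and semantic or role conflicts detected by the rule layer take precedence over this score.

```python
def fuse_confidence(asr_conf: float, sv_conf: float,
                    w_asr: float = 0.6, w_sv: float = 0.4,
                    pass_thr: float = 0.85, recheck_thr: float = 0.5) -> str:
    """Map a weighted ASR/SV confidence to a decision state (assumed thresholds)."""
    score = w_asr * asr_conf + w_sv * sv_conf
    if score >= pass_thr:
        return "pass"            # both content and identity are trusted
    if score >= recheck_thr:
        return "uncertain"       # re-prompt the team to repeat the item
    return "alert"               # confidence too low to verify the item at all
```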

Table 3

Task | AUC | Accuracy (%) | Sensitivity (%) | Specificity (%)
Problem 1: Identity verification (surgeon vs. nurse)
Deep learning (ASR + speaker embedding) | 0.993 | 97.2 | 96.5 | 97.8
Random forest | 0.982 | 93.6 | 92.1 | 94.3
SVM (RBF kernel) | 0.967 | 91.8 | 90.4 | 92.5
Problem 2: Procedure confirmation (cataract vs. vitrectomy)
Deep learning (ASR semantic model) | 0.991 | 96.8 | 95.9 | 97.2
Random forest | 0.978 | 92.4 | 91.6 | 93.5
SVM (RBF kernel) | 0.962 | 89.8 | 88.2 | 90.7
Problem 3: Time-out stage detection (time-out vs. sign-in)
Deep learning (fusion: ASR + knowledge reasoning) | 0.995 | 98.1 | 97.3 | 98.6
Random forest | 0.983 | 94.2 | 92.8 | 95.0
SVM (RBF kernel) | 0.975 | 90.7 | 89.5 | 91.2
Problem 4: Command validation (checklist complete vs. uncertain)
Deep learning (ASR + ontology constraint) | 0.989 | 96.2 | 95.4 | 96.9
Random forest | 0.977 | 93.5 | 92.0 | 94.3
SVM (RBF kernel) | 0.959 | 88.6 | 86.9 | 90.1

Comparative evaluation of models on four surgical safety-check tasks.

AUC, area under the receiver operating characteristic curve; ASR, automatic speech recognition; RBF, radial basis function; SVM, support vector machine.

Through this layered reasoning, the module closes the loop from raw speech to semantic understanding and behavioral confirmation. It ensures that automatic verification is both more accurate and more explainable than relying on ASR alone, and that all decisions can be audited post-hoc via structured logs.

2.6 Real-time validation protocol

To assess the effectiveness and real-time performance of the proposed knowledge-driven ASR + SV system in a clinically realistic setting, we designed a prospective validation protocol in a simulated operating room. The protocol evaluates performance under multi-speaker interaction, varying noise conditions, and transitions across checklist phases, focusing on speech recognition accuracy, identity verification accuracy, decision consistency, and latency.

Experiments were conducted in a simulation lab configured to approximate the acoustic characteristics of an OR, including a directional microphone array, a single-board computer running the ASR + SV system, and a voice assistant terminal providing feedback. Three surgeons, two anesthesiologists, and three nurses participated after receiving standardized training on the time-out protocol. Each session consisted of scripted sign-in, time-out, and sign-out sequences, repeated under three background-noise conditions: low (< 55 dB), medium (55–70 dB), and high (> 70 dB).

For each utterance, we recorded ASR correctness, SV correctness, the fused system decision (pass/uncertain/alert), and end-to-end response time from speech offset to feedback prompt. Across low- and medium-noise conditions, the system achieved an average response time of 173 ms, with ASR accuracy of 92.4%, SV accuracy of 94.1%, and a decision-level consensus rate of 73.7% with human adjudicators. In high-noise conditions, overall accuracy decreased to 88.6%, but remained higher than the 80.2% achieved by a baseline system without knowledge reasoning. Even under multi-speaker stress and elevated noise, the system maintained real-time responsiveness without noticeable delays or cumulative error propagation over the course of each checklist sequence.
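Session-level metrics of this kind can be aggregated directly from the per-utterance logs, as in the brief sketch below; the log field names are hypothetical.

```python
import statistics

def summarize_session(log: list[dict]) -> dict:
    """Aggregate per-utterance records into session-level validation metrics."""
    n = len(log)
    return {
        "asr_accuracy": sum(r["asr_correct"] for r in log) / n,
        "sv_accuracy": sum(r["sv_correct"] for r in log) / n,
        "consensus_rate": sum(r["decision"] == r["adjudicated"] for r in log) / n,
        "mean_latency_ms": statistics.mean(r["latency_ms"] for r in log),
    }
```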

3 Results

3.1 Quantitative results

Figure 4 summarizes the performance of the knowledge-driven fusion model under different noise levels and microphone distances, and compares it with the baseline ASR-only system. Overall, the integrated ASR + SV + knowledge framework maintains high recognition accuracy and a low alarm rate across the tested acoustic conditions, indicating that it is feasible and stable for use in complex operating-room environments.

Figure 4


System performance under varying noise levels and distances. (A) Checklist accuracy across noise conditions. (B) Equal error rate (EER) at different microphone distances. (C) Overall improvements of the fusion model relative to the ASR-only baseline.

Figure 4A shows checklist-recognition accuracy across noise levels from 50 to 80 dB. The fusion system achieves point accuracies close to 99%, whereas the ASR-only model plateaus around 96%. This gap widens as background noise increases, suggesting that the additional constraints from speaker identity and knowledge rules help recover correct decisions in acoustically challenging scenarios.

Figure 4B plots the equal error rate (EER) for identity verification at microphone distances between 1.0 and 1.5 m. Across all positions, the fusion model consistently attains lower EER values than the baseline, with typical reductions from approximately 3.7 to 3.2%. These results indicate that the integrated system is less sensitive to changes in speaking distance than a purely acoustic model.

Figure 4C aggregates four task-level metrics to illustrate the overall impact of knowledge-driven fusion. Relative to the ASR-only baseline, overall checklist recognition accuracy increased from 91.5 to 93.8%, time-out detection accuracy improved from 92.5 to 96.8%, surgical-site accuracy rose from 92.8 to 95.1%, and surgical-procedure accuracy increased from 95.7 to 97.3%. Together, these gains show that integrating semantic, identity, and workflow constraints yields consistent performance improvements beyond what can be achieved with ASR alone.

3.2 Comparative evaluation across models

To benchmark the proposed framework against more traditional classifiers, we compared three model families on four safety-check tasks: (i) identity verification (surgeon vs. nurse), (ii) procedure confirmation (cataract vs. vitrectomy), (iii) time-out phase detection (time-out vs. sign-in), and (iv) command validation (checklist complete vs. uncertain). Table 3 reports area under the ROC curve (AUC), accuracy, sensitivity, and specificity for a support vector machine with RBF kernel (SVM), a random forest (RF), and deep learning–based models that incorporate ASR features, speaker embeddings, and knowledge reasoning.

Across all four problems, the knowledge-driven deep learning models achieved the best performance. The average AUC of the fusion models was 0.992, compared with 0.980 for RF and 0.966 for SVM. In the time-out detection task, the ASR + knowledge model reached an AUC of 0.995, accuracy of 98.1%, sensitivity of 97.3%, and specificity of 98.6%, indicating strong sensitivity to both semantic content and phase boundaries. For identity verification, the ASR + speaker-embedding model accurately distinguished surgeons from nurses, with an AUC of 0.993 and accuracy above 97%. In the command-validation task, where semantic ambiguity and overlapping speech are common, adding ontology-based constraints reduced false detections while maintaining sensitivity above 95%.

These results show that combining semantic knowledge, speaker identity, and explicit reasoning yields a more discriminative and stable decision process than purely statistical classifiers, particularly in tasks that depend on correct role attribution and phase recognition.

3.3 Case demonstration and module-level analysis

To illustrate how the integrated system behaves in practice, Figure 5 shows a typical real-time visualization of a safety check sequence. The interface displays the current checklist phase, recognized semantic items (e.g., patient ID, surgical site, procedure), the predicted speaker role, and the system decision state (pass, uncertain, or alert). When an utterance satisfies both semantic and role constraints, the system highlights the item in green and plays a “Checklist complete” confirmation. If semantic content is plausible but the speaker is not the authorized role, the item is marked as uncertain and a recheck prompt is issued. When a discrepancy with the planned procedure is detected, the system raises an alert with both visual and audible warnings.

Figure 5


Example of real-time visualization for a surgical safety check sequence.

Figure 6 further decomposes performance at the module level by reporting F₁ scores and confusion-matrix patterns for ASR, SV, and the knowledge reasoning (KR) layer under different input conditions. When fed with automated ASR transcripts, the KR layer achieves an F₁ score of 0.95 by integrating both ASR and SV outputs, substantially higher than the standalone ASR (0.82) and SV (0.85) modules. The accompanying error analysis shows that ASR tends to produce more false negatives under high noise, missing some checklist items altogether, while SV is more prone to false positives in dense multi-speaker exchanges. By combining semantic reasoning with explicit rules, the KR layer filters out many of these errors and stabilizes decision-making at the checklist level.

Figure 6


Module-level performance and error patterns for ASR, speaker verification (SV), and knowledge reasoning (KR).

Taken together, the quantitative comparisons and case demonstrations indicate that the proposed fusion model generalizes well across speech conditions and tasks. By jointly leveraging semantic content, speaker identity attributes, and knowledge reasoning, the system delivers accurate, low-error, real-time decisions for surgical verification, supporting the development of interpretable and reliable intelligent surgical safety systems.

4 Discussion

The knowledge-driven ASR–SV fusion framework developed in this study offers an intelligent and interpretable approach to preoperative safety verification in noisy, multi-speaker operating rooms (Müller, 2010). Compared with purely text-based or fully manual checks, the system can detect checklist commands in real time, attribute them to specific speakers, and cross-validate them against an explicit knowledge base. By combining speech content, speaker identity, and checklist logic, it reduces the likelihood of missed or erroneous checks and alleviates part of the cognitive load on the surgical team.

At the algorithmic level, the deep learning–based modules outperformed traditional classifiers in both speaker verification and semantic decision tasks (Dong et al., 2026). The compact CNN ASR front end learns time–frequency patterns from Mel-spectrogram inputs and effectively suppresses high-frequency noise through multi-layer convolution and feature pooling (Liu G. et al., 2024; Liu Z. et al., 2024). Across diverse noise and distance conditions, the integrated framework maintained recognition accuracies above 95%, with relative gains of approximately 3–5 percentage points over the ASR-only baseline (Liu et al., 2025). Reductions in equal error rate (EER) for identity verification further suggest that adding knowledge constraints to fuzzy semantic decisions can correct systematic biases and support logically consistent, explainable outputs.

With respect to robustness, the experimental results indicate that the fusion framework tolerates both elevated noise levels and longer microphone distances. Adaptive acoustic preprocessing, together with rule-based matching, allows dynamic adjustment of decision thresholds and real-time semantic judgments. For example, during time-out confirmation of surgical site and procedure, the system not only considers acoustic confidence but also checks whether the spoken site and procedure are compatible with the planned anatomy recorded in the knowledge base (Medroa Inacio et al., 2025). This joint evaluation reduces the impact of homophonous phrases and overlapping conversations, aligning the system’s decisions more closely with the surgeon’s actual intent and improving semantic robustness in realistic OR conditions (Liu et al., 2025; Gillespie and Ziemba, 2024).

From a clinical applicability perspective, the framework is designed to be embedded into existing operating room information systems without major workflow disruption. Single-board computers and smart speakers can be integrated into the OR network, providing audio-visual prompts during sign-in, time-out, and sign-out. In our validation setting, the system delivered voice responses within 200 ms, which is faster than manual cross-checking by several seconds, and the visual interface made the sequence of verification steps transparent to the team (Turley et al., 2023). This combination of speed and visibility has the potential to support a more reliable safety culture, provided that the technology is introduced with appropriate training and change management.

Several limitations should be acknowledged. First, most of the data used here were obtained from simulated procedures and teaching operating room recordings rather than continuous capture from live surgeries. Although these scenarios mimic key acoustic characteristics, they do not cover the full diversity of dialects, speaking styles, and emergent terminology encountered in daily practice. Second, the current reasoning layer relies primarily on manually designed rules and ontology alignment, without leveraging trainable semantic models that could adapt to more complex contextual variation. Third, the present study did not include formal usability testing with surgeons, anesthesiologists, and nurses, nor did it systematically assess perceptions of trust, workload, or workflow fit. Finally, ethical and legal questions—such as audio-data privacy, long-term storage of voice logs, and responsibility for false alarms or missed alerts—have only been discussed conceptually and require further analysis in collaboration with clinical and institutional stakeholders.

Future work will therefore focus on enlarging and diversifying the corpus with multilingual, multi-accent, and multi-specialty data; integrating learning-based semantic models into the reasoning layer while retaining interpretability; and conducting prospective clinical pilot studies with mixed-method evaluations of user experience and workflow integration. In parallel, we plan to develop guidelines for secure data handling and to explore how audit logs from the system can be used in morbidity and mortality reviews and quality-improvement programs.

5 Conclusion

This study presents a knowledge-driven, voice-based verification framework for surgical safety checks that combines deep learning–powered ASR, speaker verification, and explicit medical knowledge reasoning. The system identifies and validates critical checklist items—such as verification commands, surgical sites, and procedures—under multi-noise, multi-distance, and multi-speaker conditions. Compared with baseline methods, the integrated model achieved higher accuracy and specificity, including 98.1% accuracy and 98.6% specificity during time-out verification, while maintaining real-time responsiveness suitable for embedded deployment.

By delivering synchronized voice responses and visual prompts, the framework provides traceable and reproducible evidence for each verification step and helps reduce the risk of human error. At the same time, its current validation is based largely on high-fidelity simulations and limited real-world testing, without formal user and ethical evaluation. The findings thus support the feasibility of AI-driven speech detection for surgical environment verification, while also highlighting the need for larger clinical studies, broader corpus development, and more comprehensive consideration of privacy, accountability, and workflow acceptance in future deployments of intelligent, knowledge-driven safety systems.

Statements

Data availability statement

The original contributions presented in the study are included in the article/supplementary material; further inquiries can be directed to the corresponding author.

Author contributions

WS: Formal analysis, Writing – original draft, Data curation, Methodology, Conceptualization. RF: Validation, Investigation, Data curation, Methodology, Writing – original draft. JH: Data curation, Resources, Funding acquisition, Writing – review & editing, Validation, Formal analysis. JW: Formal analysis, Investigation, Writing – review & editing, Methodology. MB: Data curation, Writing – review & editing, Supervision, Software. WJ: Validation, Visualization, Writing – review & editing, Supervision, Software, Investigation.

Funding

The author(s) declare that financial support was received for the research and/or publication of this article. This work was supported by National Health Commission Hospital Management Research Institute, Research on the Development of AI Voiceprint Recognition Software for Day Surgery Safety Verification (Project no. DSZ20251028).

Conflict of interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Generative AI statement

The authors declare that no Gen AI was used in the creation of this manuscript.


Publisher’s note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

References

1. Dong, K., Zhu, Y., Tian, Y., Hu, P., Wu, C., Li, X., et al. (2026). A knowledge-driven evidence fusion network for pancreatic tumor segmentation in CT images. Biomed. Signal Process. Control 111:108281. doi: 10.1016/j.bspc.2025.108281

2. Feng, K. (2022). Toward knowledge-driven speech-based models of depression: leveraging spectrotemporal variations in speech vowels. In 2022 IEEE-EMBS International Conference on Biomedical and Health Informatics (BHI), IEEE, pp. 1–7.

3. Gillespie, B. M., and Ziemba, J. B. (2024). Lost in translation: does measuring ‘adherence’ to the surgical safety checklist indicate true implementation fidelity? BMJ Qual. Saf. 33, 209–211. doi: 10.1136/bmjqs-2023-016617

4. Hızıroğlu, A., Pişirgen, A., Özcan, M., and İlter, H. K. (2022). Artificial intelligence in healthcare industry: a transformation from model-driven to knowledge-driven DSS. Artif. Intell. Theory Appl. 2, 41–58.

5. Hoebert, T., Lepuschitz, W., Vincze, M., Merdan, M., et al. (2023). Knowledge-driven framework for industrial robotic systems. J. Intell. Manuf. 34, 771–788. doi: 10.1007/s10845-021-01826-8

6. Li, D., Xu, Y., Zhao, M., Zhu, J., Zhang, S., et al. (2021). Knowledge-driven machine learning and applications in wireless communications. IEEE Trans. Cogn. Commun. Netw. 8, 454–467. doi: 10.1109/TCCN.2021.3128597

7. Liu, G., Tian, L., Wen, Y., Yu, W., and Zhou, W. (2024). Cosine convolutional neural network and its application for seizure detection. Neural Netw. 174:106267. doi: 10.1016/j.neunet.2024.106267

8. Liu, Z., Yang, J., Chen, K., Yang, T., Li, X., Lu, B., et al. (2024). TCM-KDIF: an information interaction framework driven by knowledge–data and its clinical application in traditional Chinese medicine. IEEE Internet Things J. 11, 20002–20014. doi: 10.1109/JIOT.2024.3368029

9. Liu, G., Zhang, R., Tian, L., and Zhou, W. (2025). Fine-grained spatial-frequency-time framework for motor imagery brain-computer interface. IEEE J. Biomed. Health Inform. 29, 4121–4133. doi: 10.1109/JBHI.2025.3536212

10. Liu, G., Zheng, Y., Tsang, M. H. L., Zhao, Y., and Hsiao, J. H. (2025). Understanding the role of eye movement pattern and consistency during face recognition through EEG decoding. NPJ Sci. Learn. 10:28. doi: 10.1038/s41539-025-00316-3

11. Liu, G., Zhou, W., and Geng, M. (2020). Automatic seizure detection based on S-transform and deep convolutional neural network. Int. J. Neural Syst. 30:1950024. doi: 10.1142/S0129065719500242

12. Madhavaraj, A., and Ganesan, R. A. (2022). Data and knowledge-driven approaches for multilingual training to improve the performance of speech recognition systems of Indian languages. arXiv preprint arXiv:2201.09494. doi: 10.48550/arXiv.2201.09494

13. Medroa Inacio, P. M., Saltan, M., and Denecke, K. (2025). “VoiceCheck: an intelligent assistant for enhancing surgical safety through guided checklist use,” in Healthcare of the Future 2025, ed. P. M. Medroa Inacio (Amsterdam, Netherlands: IOS Press), 27.

14. Meng, Y., Pan, Z., Lu, Y., Huang, R., Liao, Y., Yang, J., et al. (2025). CataractSurg-80K: knowledge-driven benchmarking for structured reasoning in ophthalmic surgery planning. arXiv preprint arXiv:2508.20014. doi: 10.48550/arXiv.2508.20014

15. Müller, J. P. (2010). A framework for integrated modeling using a knowledge-driven approach. International Congress on Environmental Modelling and Software, no. 57. Available online at: https://scholarsarchive.byu.edu/iemssconference/2010/all/57

16. Nakawala, H., Ferrigno, G., and De Momi, E. (2017). Toward a knowledge-driven context-aware system for surgical assistance. J. Med. Robot. Res. 2:1740007. doi: 10.1142/S2424905X17400074

17. Pandithawatta, S., Ahn, S., Rameezdeen, R., Chow, C. W., Gorjian, N., et al. (2024). Systematic literature review on knowledge-driven approaches for construction safety analysis and accident prevention. Buildings 14:3403. doi: 10.3390/buildings14113403

18. Toussaint, W., and Ding, A. Y. (2021). SVEva Fair: a framework for evaluating fairness in speaker verification. arXiv preprint arXiv:2107.12049. doi: 10.48550/arXiv.2107.12049

19. Turley, N., Elam, M., and Brindle, M. E. (2023). International perspectives on modifications to the surgical safety checklist. JAMA Netw. Open 6:e2317183. doi: 10.1001/jamanetworkopen.2023.17183

20. Zhang, J., Zhou, S., Wang, Y., Zhao, H., Ding, H., et al. (2025). Knowledge-driven framework for anatomical landmark annotation in laparoscopic surgery. IEEE Trans. Med. Imaging 44, 2218–2229. doi: 10.1109/TMI.2025.3529294

21. Zhou, X., Kouzel, M., and Alemzadeh, H. (2022). Robustness testing of data and knowledge driven anomaly detection in cyber-physical systems. In 2022 52nd Annual IEEE/IFIP International Conference on Dependable Systems and Networks Workshops (DSN-W), IEEE, pp. 44–51.


Keywords

surgical safety checklist, automatic speech recognition, speaker verification, knowledge-driven reasoning, operating room noise robustness

Citation

Shi W, Fan R, Hu J, Wang J, Bian M and Jiang W (2026) A knowledge-driven framework for surgical safety check integration using speech recognition and speaker verification. Front. Neurosci. 19:1726720. doi: 10.3389/fnins.2025.1726720

Received

16 October 2025

Revised

19 November 2025

Accepted

24 November 2025

Published

12 January 2026

Volume

19 - 2025

Edited by

Yueyuan Zheng, The Hong Kong University of Science and Technology, Hong Kong SAR, China

Reviewed by

Cristobal Langdon Montero, Hospital Clinic of Barcelona, Spain

Haowei Yang, University of Houston, United States


Copyright

*Correspondence: Wei Jiang,

