
ORIGINAL RESEARCH article

Front. Neurosci.

Sec. Neuroscience Methods and Techniques

Volume 19 - 2025 | doi: 10.3389/fnins.2025.1692122

This article is part of the Research Topic: Advances in Explainable Analysis Methods for Cognitive and Computational Neuroscience

Explainable AI for Forensic Speech Authentication within Cognitive and Computational Neuroscience

Provisionally accepted
Zhe Cheng1, Haitao Yang1, Yingzhuo Xiong1, Xuran Hu2*
  • 1Hunan Police Academy, Changsha, China
  • 2Xidian University School of Mechano-Electronic Engineering, Xi'an, China

The final, formatted version of the article will be published soon.

The proliferation of deepfake technologies presents serious challenges for forensic speech authentication. We propose a deep learning framework combining Convolutional Neural Networks (CNNs) and Long Short-Term Memory (LSTM) networks to improve the detection of manipulated audio. Leveraging the spectral feature extraction of CNNs and the temporal modeling of LSTMs, the model demonstrates superior accuracy and generalization across the ASVspoof2019 LA and WaveFake datasets. Linear Frequency Cepstral Coefficients (LFCCs) were employed as acoustic features and outperformed MFCC and GFCC representations. To enhance transparency and trustworthiness, explainable artificial intelligence (XAI) techniques, including Grad-CAM and SHAP, were applied, revealing that the model focuses on high-frequency artefacts and temporal inconsistencies. These interpretable analyses validate both the model's design and the forensic relevance of LFCC features. The proposed approach thus provides a robust, interpretable, and XAI-driven solution for forensic authenticity detection.
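The abstract credits LFCC features with outperforming MFCC and GFCC representations, in part because their linearly spaced filterbank retains resolution in the high-frequency bands where synthesis artefacts concentrate. A minimal NumPy sketch of LFCC extraction is given below; the frame size, hop, and filter counts are illustrative defaults, not the authors' reported configuration:

```python
import numpy as np

def lfcc(signal, n_fft=512, hop=160, n_filters=20, n_ceps=20):
    """Compute Linear Frequency Cepstral Coefficients (LFCCs).

    Unlike MFCCs, the triangular filterbank is spaced linearly in Hz,
    preserving detail in high-frequency regions where vocoder and
    TTS artefacts tend to appear.
    """
    # Frame the signal with a Hann window
    win = np.hanning(n_fft)
    frames = np.array([signal[s:s + n_fft] * win
                       for s in range(0, len(signal) - n_fft + 1, hop)])

    # Power spectrum of each frame
    spec = np.abs(np.fft.rfft(frames, axis=1)) ** 2   # (n_frames, n_fft//2 + 1)

    # Linearly spaced triangular filterbank over the FFT bins
    n_bins = spec.shape[1]
    edges = np.linspace(0, n_bins - 1, n_filters + 2)
    bins = np.arange(n_bins)
    fbank = np.zeros((n_filters, n_bins))
    for i in range(n_filters):
        lo, mid, hi = edges[i], edges[i + 1], edges[i + 2]
        fbank[i] = np.clip(np.minimum((bins - lo) / (mid - lo),
                                      (hi - bins) / (hi - mid)), 0, None)

    # Log filterbank energies, then a DCT-II to decorrelate
    log_e = np.log(spec @ fbank.T + 1e-10)            # (n_frames, n_filters)
    k = np.arange(n_filters)
    dct = np.cos(np.pi / n_filters * (k[:, None] + 0.5) * np.arange(n_ceps))
    return log_e @ dct                                # (n_frames, n_ceps)

# Example: one second of noise at 16 kHz as a stand-in for an utterance
x = np.random.randn(16000)
feats = lfcc(x)
print(feats.shape)  # (97, 20)
```

In the framework described above, a feature matrix of this shape would serve as the input that the CNN scans for spectral artefacts before the LSTM models its frame-to-frame dynamics.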

Keywords: Multimedia forensics, Digital speech processing, Authenticity detection, Explainable artificial intelligence, Cognitive Neuroscience

Received: 25 Aug 2025; Accepted: 15 Oct 2025.

Copyright: © 2025 Cheng, Yang, Xiong and Hu. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

* Correspondence: Xuran Hu, xuranhu@stu.xidian.edu.cn

Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.