
ORIGINAL RESEARCH article

Front. Neurol.

Sec. Movement Disorders

Volume 16 - 2025 | doi: 10.3389/fneur.2025.1706317

This article is part of the Research Topic: AI-powered Advances in Diagnosis and Management of Movement Disorders

Parkinson's disease detection using spectrogram-based multi-model feature fusion networks

Provisionally accepted
Wenna Chen1*, Rongfu Lv2, Xiaowei Du1, Xiangyu Chen1, Hao Wang1, Jincan Zhang2, Ganqin Du1*
  • 1The First Affiliated Hospital of Henan University of Science and Technology, Luoyang, China
  • 2College of Information Engineering, Henan University of Science and Technology, Luoyang, China

The final, formatted version of the article will be published soon.

Parkinson's disease (PD) is the world's second most common neurodegenerative disorder. Traditional diagnostics, which rely on clinical assessment and imaging, are often invasive, costly, and dependent on specialized medical personnel, creating significant barriers to early detection. Because approximately 90% of PD patients develop vocal impairments, vocal analysis has emerged as a promising non-invasive diagnostic tool. However, individual deep learning models are frequently limited by overfitting and poor generalizability, which constrain their diagnostic accuracy. To overcome these limitations, this study proposes a PD classification method that fuses spectrogram features extracted by pre-trained convolutional neural networks (CNNs). The dataset, sourced from the First Affiliated Hospital of Henan University of Science and Technology, includes voice recordings from 61 PD patients and 70 healthy controls (HC). Preprocessing the raw speech signals yielded 2,476 spectrograms. Three pre-trained models, DenseNet121, MobileNetV3-Large, and ShuffleNetV2, were used for feature extraction. To ensure dimensional alignment for fusion, the output of MobileNetV3-Large was adjusted to a dimension of (1024, 7, 7) using a 1×1 convolutional layer before features were fused via summation. Results from 5-fold cross-validation indicate that the feature-fusion models consistently surpass the individual models across all evaluation metrics. Specifically, the fusion of MobileNetV3-Large and ShuffleNetV2 achieved the highest accuracy of 95.56% and an AUC of 0.99. Comparative experiments with existing state-of-the-art methods confirm that the proposed approach achieves competitive classification performance. The fusion of multi-model features more effectively captures subtle pathological signatures in PD speech, offering a reliable, low-cost, and non-invasive tool for auxiliary PD diagnosis. The code is available at https://github.com/lvrongfu/pjs.
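The dimensional alignment and summation step described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation (see the linked repository for that). It assumes a MobileNetV3-Large feature map of shape (960, 7, 7), which is the channel count of the standard architecture but is not stated in the abstract, and it uses random arrays as stand-ins for both the backbone outputs and the learned 1×1 convolution parameters. The function names `conv1x1` and `fuse_by_summation` are hypothetical.

```python
import numpy as np


def conv1x1(features, weight, bias):
    """Apply a 1x1 convolution to a (C_in, H, W) feature map.

    A 1x1 convolution mixes channels independently at each spatial
    position, so it reduces to a matrix multiply over the channel axis.
    """
    # weight: (C_out, C_in), features: (C_in, H, W) -> (C_out, H, W)
    return np.einsum("oc,chw->ohw", weight, features) + bias[:, None, None]


def fuse_by_summation(feat_a, feat_b):
    """Element-wise sum of two feature maps with identical shapes."""
    if feat_a.shape != feat_b.shape:
        raise ValueError(f"shape mismatch: {feat_a.shape} vs {feat_b.shape}")
    return feat_a + feat_b


rng = np.random.default_rng(0)

# Random stand-ins for backbone outputs on a single spectrogram.
# ASSUMPTION: 960 input channels for MobileNetV3-Large (not given in the abstract).
mobilenet_feat = rng.standard_normal((960, 7, 7))
shufflenet_feat = rng.standard_normal((1024, 7, 7))

# Random stand-ins for the 1x1 convolution parameters, which would be
# learned during training in a real model.
weight = rng.standard_normal((1024, 960)) * 0.01
bias = np.zeros(1024)

aligned = conv1x1(mobilenet_feat, weight, bias)      # (960, 7, 7) -> (1024, 7, 7)
fused = fuse_by_summation(aligned, shufflenet_feat)  # element-wise sum

print(aligned.shape, fused.shape)
```

Summation keeps the fused map at (1024, 7, 7), so a downstream classifier head sees the same input size as it would for a single backbone; concatenation would instead double the channel count.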

Keywords: Parkinson's detection, deep learning, feature fusion, convolutional neural networks, transfer learning

Received: 16 Sep 2025; Accepted: 14 Oct 2025.

Copyright: © 2025 Chen, Lv, Du, Chen, Wang, Zhang and Du. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

* Correspondence:
Wenna Chen, chenwenna0408@163.com
Ganqin Du, dgq99@163.com

Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.