ORIGINAL RESEARCH article

Front. Aging Neurosci.

Sec. Parkinson’s Disease and Aging-related Movement Disorders

Volume 17 - 2025 | doi: 10.3389/fnagi.2025.1586273

This article is part of the Research Topic: Frontier Research on Artificial Intelligence and Radiomics in Neurodegenerative Diseases

Non-invasive detection of Parkinson's disease based on speech analysis and interpretable machine learning

Provisionally accepted
  • 1School of Medical Information Engineering, Anhui University of Chinese Medicine, Hefei, China
  • 2Jiangxi Medical College, Nanchang University, Nanchang, Jiangxi, China
  • 3State Key Laboratory of Organ Failure Research, Nanfang Hospital, Southern Medical University, Guangzhou, China

The final, formatted version of the article will be published soon.

Objective: Parkinson's disease (PD) is a progressive neurodegenerative disorder that significantly impairs motor function and speech. Early detection of PD through non-invasive methods such as speech analysis can improve treatment outcomes and patients' quality of life. This study aims to develop an interpretable machine learning model that predicts PD from acoustic features of speech recordings.

Methods: A dataset of speech recordings from individuals with and without PD was analyzed. The dataset includes features such as fundamental frequency (Fo), jitter, shimmer, noise-to-harmonics ratio (NHR), and nonlinear dynamic complexity measures. Exploratory data analysis (EDA) was conducted to identify patterns and relationships in the data. The dataset was split into 70% training and 30% testing sets. To address class imbalance, the synthetic minority oversampling technique (SMOTE) was applied. Several machine learning algorithms, including K-Nearest Neighbors (KNN), Support Vector Machine (SVM), Decision Trees, Random Forests, and Neural Networks, were implemented and evaluated. Model performance was assessed using accuracy, recall, F1 score, and area under the receiver operating characteristic curve (AUC-ROC). SHapley Additive exPlanations (SHAP) were used to explain the models and evaluate feature contributions.

Results: Features related to speech instability, such as jitter, shimmer, and NHR, were highly predictive of PD. Nonlinear metrics, including Recurrence Period Density Entropy (RPDE) and Pitch Period Entropy (PPE), also contributed substantially to the model's predictive power. Random Forest and Gradient Boosting models achieved the highest performance, with an AUC-ROC of 0.98 and a recall of 0.95, indicating few false negatives. SHAP values highlighted the importance of fundamental frequency variation and harmonic-to-noise ratio in distinguishing PD patients from healthy individuals.

Conclusion: The developed machine learning model accurately predicts Parkinson's disease from speech recordings, with the Random Forest and Gradient Boosting algorithms demonstrating superior performance. Key predictive features include jitter, shimmer, and nonlinear dynamic complexity measures. This study provides a reliable, non-invasive tool for early PD detection and underscores the potential of speech analysis in diagnosing neurodegenerative diseases.
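
The abstract does not describe how the oversampling and evaluation steps were implemented. As a rough illustration only, the sketch below shows the core idea of SMOTE (interpolating between a minority-class sample and one of its nearest minority-class neighbours) and how recall and F1 score follow from the confusion counts. It assumes oversampling is applied to the training portion only, which the abstract does not state. The function names, the two-feature toy vectors, and the toy labels are hypothetical and are not taken from the study.

```python
import math
import random


def smote_oversample(minority, n_new, k=5, seed=0):
    """Create n_new synthetic samples by interpolating between a minority
    sample and one of its k nearest minority-class neighbours."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by Euclidean distance to `base`.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


def recall_and_f1(y_true, y_pred, positive=1):
    """Recall and F1 score for the class labelled `positive`."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    recall = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return recall, f1


# Made-up minority-class training vectors (two hypothetical features each).
minority_train = [[0.20, 0.10], [0.30, 0.20], [0.25, 0.15]]
new_points = smote_oversample(minority_train, n_new=4, k=2)

# Made-up test labels and predictions (1 = PD, 0 = healthy).
recall, f1 = recall_and_f1([1, 1, 1, 0, 0], [1, 1, 0, 0, 1])
```

Generating synthetic samples only from training data keeps the held-out 30% free of interpolated points, so a recall figure computed on it reflects real recordings. In practice an established implementation would normally be preferred over a hand-written one.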

Keywords: Parkinson's disease, Speech analysis, machine learning, non-invasive, Early detection

Received: 02 Mar 2025; Accepted: 21 Apr 2025.

Copyright: © 2025 Xu, Xie, Pang, Li, Jin, Huang and Shao. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

* Correspondence:
Fangliang Huang, School of Medical Information Engineering, Anhui University of Chinese Medicine, Hefei, China
Xian Shao, State Key Laboratory of Organ Failure Research, Nanfang Hospital, Southern Medical University, Guangzhou, 510515, China

Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.