ORIGINAL RESEARCH article

Front. Neuroinform.

This article is part of the Research Topic: Neuroinformatics for Neuropsychology

Enhancing Dementia and Cognitive Decline Detection with Large Language Models and Speech Representation Learning

Provisionally accepted
  • 1Kozminski University, Warsaw, Poland
  • 2WarsawIQ, Warsaw, Poland
  • 3Uniwersytet Marii Curie-Sklodowskiej, Lublin, Poland

The final, formatted version of the article will be published soon.

Dementia poses a major challenge to individuals and public health systems. Detecting cognitive decline through spontaneous speech offers a promising, non-invasive avenue for diagnosing mild cognitive impairment (MCI) and dementia, enabling timely intervention and improved outcomes. This study describes our submission to the PROCESS Signal Processing Grand Challenge (ICASSP 2025), which tasked participants with predicting cognitive decline from speech samples. Our method combines eGeMAPS features from openSMILE, HuBERT (a self-supervised speech representation model), and GPT-4o, OpenAI's state-of-the-art large language model. These are integrated with custom LSTM and ResMLP neural networks and supported by scikit-learn regressors and classifiers, for both cognitive score regression and dementia classification. Our regression model based on LightGBM achieved an RMSE of 2.7775, placing us 10th out of 80 teams globally and surpassing the RoBERTa baseline by 7.5%. For the three-class classification task (Dementia / MCI / Control), our LSTM model obtained an F1-score of 0.5521, ranking 20th out of 106 and marginally outperforming the best baseline. We trained models on speech data from 157 study participants, with independent evaluation performed on a separate test set of 40 individuals. We found that integrating large language models with self-supervised speech representations enhances the detection of cognitive decline. The proposed approach offers a scalable, data-driven method for early cognitive screening and may support emerging applications in neuropsychological informatics.
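The two headline metrics above, RMSE for the cognitive score regression and the F1-score for the three-class task, can be illustrated with a minimal sketch. It is not the authors' evaluation code: the function names `rmse` and `macro_f1` are hypothetical, the toy labels and scores are made up for illustration, and macro averaging over the three classes is an assumption, since the abstract does not state which F1 averaging the challenge used.

```python
import math
from typing import Hashable, Sequence


def rmse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Root mean squared error between true and predicted scores."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def macro_f1(
    y_true: Sequence[Hashable],
    y_pred: Sequence[Hashable],
    labels: Sequence[Hashable],
) -> float:
    """Unweighted mean of per-class F1 scores (macro averaging is assumed)."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    per_class = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # A class with no true or predicted instances contributes 0.0.
        per_class.append(2 * tp / denom if denom else 0.0)
    return sum(per_class) / len(per_class)


# Toy data only; these are not values from the study.
toy_scores_true = [1.0, 2.0, 3.0]
toy_scores_pred = [1.0, 2.0, 5.0]
toy_labels_true = ["Dementia", "MCI", "Control", "MCI"]
toy_labels_pred = ["Dementia", "Control", "Control", "MCI"]
CLASSES = ["Dementia", "MCI", "Control"]

toy_rmse = rmse(toy_scores_true, toy_scores_pred)
toy_f1 = macro_f1(toy_labels_true, toy_labels_pred, CLASSES)
print(f"RMSE: {toy_rmse:.4f}  macro-F1: {toy_f1:.4f}")
```

Lower RMSE is better and higher F1 is better, which is why the regression result is reported as an improvement over the baseline while the classification result is reported as exceeding it.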

Keywords: Dementia detection, Natural Language Processing, Large language models, transformer, machine learning, speech-based biomarkers, Speech representation learning, Cognitive screening

Received: 04 Aug 2025; Accepted: 03 Nov 2025.

Copyright: © 2025 Chlasta, Struzik and Wójcik. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

* Correspondence: Karol Chlasta, karol@chlasta.pl

Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.