AUTHOR=Balagopalan Aparna, Eyre Benjamin, Robin Jessica, Rudzicz Frank, Novikova Jekaterina TITLE=Comparing Pre-trained and Feature-Based Models for Prediction of Alzheimer's Disease Based on Speech JOURNAL=Frontiers in Aging Neuroscience VOLUME=13 YEAR=2021 URL=https://www.frontiersin.org/journals/aging-neuroscience/articles/10.3389/fnagi.2021.635945 DOI=10.3389/fnagi.2021.635945 ISSN=1663-4365 ABSTRACT=Introduction: Research related to automatic detection of Alzheimer's disease (AD) is important, given the high prevalence of AD and the high cost of traditional diagnostic methods. Since AD significantly affects the content and acoustics of spontaneous speech, natural language processing and machine learning provide promising techniques for reliably detecting AD. There has been a recent proliferation of classification models for AD, but these vary in the datasets used, model types, and training and testing paradigms. In this study, we compare and contrast the performance of two common approaches to automatic AD detection from speech on the same, well-matched dataset, to determine the advantages of using domain knowledge versus pre-trained transfer models. Methods: Audio recordings, and the corresponding manually transcribed speech transcripts, of a picture description task administered to 156 demographically matched older adults (78 with AD and 78 healthy controls) were classified using machine learning and natural language processing. The audio was acoustically enhanced and post-processed to improve the quality of the speech recording and to control for variation due to recording conditions.
Two approaches were used to classify these speech samples: 1) domain knowledge: extracting an extensive set of clinically relevant linguistic and acoustic features from the speech and transcripts, and 2) transfer learning: using transcript representations derived automatically from state-of-the-art pre-trained language models, by fine-tuning Bidirectional Encoder Representations from Transformers (BERT)-based sequence classification models. Results: We compared the utility of speech transcript representations obtained from recent natural language processing models to that of more clinically interpretable feature-based methods. Both the feature-based approaches and the fine-tuned BERT models significantly outperformed the baseline linguistic model, which used a small set of linguistic features. Fine-tuned BERT models numerically outperformed the feature-based approaches on the AD detection task, but the difference was not statistically significant. Our main contribution is the observation that, when trained on the same, demographically balanced dataset and tested on independent, unseen data, both domain-knowledge and pre-trained language models perform well at detecting AD. Conclusion: These results support the value of linguistically focused processing techniques for detecting AD from speech and highlight the need to compare model performance on carefully balanced datasets.
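The feature-based approach described in the abstract can be sketched in miniature. The three lexical features below (utterance length, type-token ratio, mean word length) and the nearest-centroid classifier are illustrative simplifications of this example's own choosing, not the study's actual pipeline, which uses an extensive set of clinically relevant linguistic and acoustic features with standard machine-learning classifiers; the toy "transcripts" are likewise hypothetical stand-ins for the real picture-description data.

```python
import math

def lexical_features(transcript: str) -> list[float]:
    """A few simple lexical features from a transcript (illustrative only)."""
    tokens = transcript.lower().split()
    n = len(tokens)
    return [
        float(n),                          # utterance length in tokens
        len(set(tokens)) / n,              # type-token ratio (lexical diversity)
        sum(len(t) for t in tokens) / n,   # mean word length
    ]

def nearest_centroid_fit(X, y):
    """Per-class feature means; a toy stand-in for a real classifier."""
    centroids = {}
    for label in set(y):
        rows = [x for x, lab in zip(X, y) if lab == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return centroids

def nearest_centroid_predict(centroids, x):
    """Assign the class whose feature centroid is nearest to x."""
    return min(centroids, key=lambda lab: math.dist(x, centroids[lab]))

# Hypothetical toy samples — the real study uses transcripts of a
# picture-description task from 156 demographically matched adults.
transcripts = [
    "the boy is reaching for the cookie jar while the stool tips over",
    "boy um the the jar um the thing the thing is falling",
]
labels = ["hc", "ad"]
centroids = nearest_centroid_fit(
    [lexical_features(t) for t in transcripts], labels
)
```

In this sketch, extraction and classification are decoupled, mirroring the paper's two-stage design: any clinically interpretable feature set can be plugged into `lexical_features`, and any classifier can replace the nearest-centroid rule, whereas the fine-tuned BERT approach learns its representation and decision boundary jointly.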