Event Abstract

Acoustic markers of PPA variants using machine learning

  • 1 Johns Hopkins Medicine, Department of Neurology, United States
  • 2 University of South Carolina, Department of Communication Sciences and Disorders, Arnold School of Public Health, United States

Introduction. A speaker’s acoustic profile carries significant linguistic and non-linguistic information. Employed in clinical practice, it can provide behavioral markers for a quick assessment of primary progressive aphasia (PPA). PPA is a complex language syndrome in which different speech and language properties, such as prosody, lexical retrieval, and motor speech functioning, may be affected. It is classified into three main variants: nonfluent (nfvPPA), semantic (svPPA), and logopenic (lvPPA). Primary progressive apraxia of speech (PPAOS) is also distinguished (Duffy et al., 2017) but may fall into the category of nfvPPA (Gorno-Tempini et al., 2011). The present study aims to determine the contribution of the acoustic properties of vowels, prosody, and voice quality to the classification of PPA variants using machine learning models.

Methods. Oral samples from picture description tasks of 50 individuals with PPA (lvPPA: 17, svPPA: 14, nfvPPA: 11, PPAOS: 8) were automatically transcribed and segmented into vowels and consonants using the new acoustic analysis platform THEMIS. From the segmented vowels, we measured: i. vowel formants (F1…F5) (Den Ouden et al., 2017); ii. vowel duration (Duffy et al., 2017); iii. mean fundamental frequency (F0), min F0, and max F0 (Hillis, 2014); iv. pause duration (Mack et al., 2015); and v. the H1–H2, H1–A1, H1–A2, and H1–A3 measures of voice quality. We compared three machine learning models: support vector machines (SVM) (Cortes and Vapnik, 1995), random forests (RF) (Breiman, 2001), and decision trees (DT) (Hastie et al., 2009) in a one-against-all strategy, in which each variant was tested against all others. We ran all models with 3-fold group cross-validation to ensure that no speaker appeared in both the training and evaluation sets. The models were implemented in Python (Pedregosa et al., 2011).

Results. We report the mean cross-validated accuracy of the best-performing model from the model comparison: i. RF provided the highest classification accuracy for nfvPPA (mean 82%, SD 9%); ii. SVM had the highest accuracy for svPPA (mean 66%, SD 8%); iii. RF had the highest accuracy for lvPPA (mean 57%, SD 15%); and iv. RF provided the highest classification accuracy for PPAOS (mean 80%, SD 8%) (Figure 1). In all models, pause duration and the F0 measures were ranked higher than most other features (Figure 2).

Discussion. This study employed an innovative method for the classification of PPA variants, using automated speech transcription, segmentation, feature extraction, and modeling. Using acoustic features alone, the best models classified nfvPPA, svPPA, and PPAOS with high accuracy. However, acoustic features alone could not classify lvPPA as accurately; additional linguistic markers may be needed for a more accurate classification of lvPPA. Furthermore, we showed that prosody, as measured by fundamental frequency and pause duration, contributes more than any other factor to the classification of PPA variants, as suggested by previous research from our group and others (Hillis, 2014; Patel et al., 2018; Mack et al., 2015). Finally, the findings demonstrate the potential benefit of machine learning models in clinical practice for the subtyping of PPA variants.
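The evaluation protocol described above (one-against-all classification of each variant with 3-fold group cross-validation, comparing SVM, RF, and DT in scikit-learn) can be sketched as follows. This is a minimal illustration, not the study's code: the feature matrix is synthetic stand-in data, the number of tokens per speaker and the model hyperparameters are assumptions, and in the study the columns would be the acoustic measures (formants, durations, F0 statistics, pauses, and spectral-tilt measures).

```python
# One-against-all evaluation with grouped cross-validation (sketch).
# The `groups` array keeps all tokens from one speaker in the same fold,
# so no speaker appears in both the training and evaluation sets.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GroupKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
variants = ["lvPPA"] * 17 + ["svPPA"] * 14 + ["nfvPPA"] * 11 + ["PPAOS"] * 8

tokens_per_speaker = 20   # assumed: several vowel tokens per speaker
n_features = 12           # assumed: placeholder for the acoustic measures
X, y, groups = [], [], []
for speaker_id, variant in enumerate(variants):
    X.append(rng.normal(size=(tokens_per_speaker, n_features)))  # synthetic features
    y += [variant] * tokens_per_speaker
    groups += [speaker_id] * tokens_per_speaker
X = np.vstack(X)
y = np.array(y)
groups = np.array(groups)

models = {
    "SVM": make_pipeline(StandardScaler(), SVC()),
    "RF": RandomForestClassifier(n_estimators=200, random_state=0),
    "DT": DecisionTreeClassifier(random_state=0),
}

cv = GroupKFold(n_splits=3)
for variant in ["nfvPPA", "svPPA", "lvPPA", "PPAOS"]:
    y_binary = (y == variant).astype(int)  # one-against-all target
    for name, model in models.items():
        scores = cross_val_score(model, X, y_binary, groups=groups, cv=cv)
        print(f"{variant:7s} {name:3s} mean={scores.mean():.2f} sd={scores.std():.2f}")
```

With random features, the accuracies here simply reflect the base rates; with real acoustic features, the mean cross-validated accuracy per variant and model is what Figure 1 reports. Feature rankings such as those in Figure 2 could be obtained from the fitted random forest's feature importances.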

Figure 1
Figure 2

References

Breiman, L. (2001). Random forests. Machine Learning, 45(1), 5–32.
Cortes, C., and Vapnik, V. (1995). Support-vector networks. Machine Learning, 20(3), 273–297.
Den Ouden, D. B., Galkina, E., Basilakos, A., and Fridriksson, J. (2017). Vowel formant dispersion reflects severity of apraxia of speech. Aphasiology. doi: 10.1080/02687038.2017.1385050
Duffy, J. R., Hanley, H., Utianski, R., Clark, H., Strand, E., Josephs, K. A., and Whitwell, J. L. (2017). Temporal acoustic measures distinguish primary progressive apraxia of speech from primary progressive aphasia. Brain and Language, 168, 84–94.
Gorno-Tempini, M. L., Hillis, A. E., Weintraub, S., Kertesz, A., Mendez, M., Cappa, S. F., Ogar, J., Rohrer, J., Black, S., Boeve, B. F., Manes, F., Dronkers, N. F., Vandenberghe, R., Rascovsky, K., Patterson, K., Miller, B. L., Knopman, D. S., Hodges, J. R., Mesulam, M. M., and Grossman, M. (2011). Classification of primary progressive aphasia and its variants. Neurology, 76(11), 1006–1014.
Hastie, T., Tibshirani, R., and Friedman, J. (2009). The Elements of Statistical Learning: Data Mining, Inference, and Prediction, Second Edition. Springer Series in Statistics. New York: Springer.
Hillis, A. E. (2014). Inability to empathize: brain lesions that disrupt sharing and understanding another’s emotions. Brain, 137(4), 981–997. doi: 10.1093/brain/awt317
Mack, J. E., Chandler, S. D., Meltzer-Asscher, A., Rogalski, E., Weintraub, S., Mesulam, M.-M., and Thompson, C. K. (2015). What do pauses in narrative production reveal about the nature of word retrieval deficits in PPA? Neuropsychologia, 77, 211–222.
Patel, S., Oishi, K., Wright, A., Sutherland-Foggio, H., Saxena, S., Sheppard, S. M., and Hillis, A. E. (2018). Right hemisphere regions critical for expression of emotion through prosody. Frontiers in Neurology, 9, 224. doi: 10.3389/fneur.2018.00224
Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., Blondel, M., Prettenhofer, P., Weiss, R., Dubourg, V., Vanderplas, J., Passos, A., Cournapeau, D., Brucher, M., Perrot, M., and Duchesnay, E. (2011). Scikit-learn: machine learning in Python. Journal of Machine Learning Research, 12, 2825–2830.

Keywords: primary progressive aphasia, machine learning, Vowel acoustics, Prosody, Voice Quality

Conference: Academy of Aphasia 56th Annual Meeting, Montreal, Canada, 21 Oct - 23 Oct, 2018.

Presentation Type: oral presentation

Topic: not eligible for a student prize

Citation: Themistocleous C, Ficek B, Webster KT, Wendt H, Hillis AE, Den Ouden DB and Tsapkini K (2019). Acoustic markers of PPA variants using machine learning. Conference Abstract: Academy of Aphasia 56th Annual Meeting. doi: 10.3389/conf.fnhum.2018.228.00092

Copyright: The abstracts in this collection have not been subject to any Frontiers peer review or checks, and are not endorsed by Frontiers. They are made available through the Frontiers publishing platform as a service to conference organizers and presenters.

The copyright in the individual abstracts is owned by the author of each abstract or his/her employer unless otherwise stated.

Each abstract, as well as the collection of abstracts, are published under a Creative Commons CC-BY 4.0 (attribution) licence (https://creativecommons.org/licenses/by/4.0/) and may thus be reproduced, translated, adapted and be the subject of derivative works provided the authors and Frontiers are attributed.

For Frontiers’ terms and conditions please see https://www.frontiersin.org/legal/terms-and-conditions.

Received: 30 Apr 2018; Published Online: 22 Jan 2019.

* Correspondence: Dr. Charalambos Themistocleous, Johns Hopkins Medicine, Department of Neurology, Baltimore, MD, 21287, United States, charalampos.themistokleous@isp.uio.no