Event Abstract

Building a Cognitive Profile with a Non-Intrusive Sensor: How Speech and Sounds Map onto our Cognitive Worlds

  • 1 Lockheed Martin, United States
  • 2 Department of Psychology, University of Arizona, United States

In studies of human performance across varied situations, psychologists often attempt to assess individuals’ cognitive states with minimally invasive methods, both to limit the impact of data collection on the measurements themselves and to reduce participant burden. While convenient, studies confined to laboratory settings can miss the daily experiences that influence mood, stress level, and overall condition. Other studies have instead assessed how cognitive abilities change from day to day through behavioral or imaging techniques, with the goal of creating personalized models of cognitive function (Ayaz et al., 2012; Sliwinski et al., 2018). We posit that passive, unobtrusive technology can capture sequenced “snapshots” of a participant’s daily life that improve our understanding of how and why cognitive states change at the individual level over time, thereby improving the fidelity of cognitive models.

While many sensors could provide useful information, this research investigates a non-intrusive phone application, the Electronically Activated Recorder (EAR; Mehl, 2017), that collects ambient audio snapshots throughout the day from participants’ smartphones. The audio is obfuscated and encrypted for privacy, and signal processing combined with machine learning techniques provides features that can forecast cognitive states useful for daily personalization of behavioral and cognitive models. The EAR measures quantitative and qualitative aspects of an individual’s spoken interactions and also detects audio events such as singing, shouting, laughing, and crying. Gaussian Mixture Models (GMMs) characterize individual speakers for subsequent association, and Mozilla’s DeepSpeech (Hannun et al., 2014) converts speech to text for sentiment and Linguistic Inquiry and Word Count (LIWC) analyses. Once the audio is projected into this feature space, classification algorithms recover personal characteristics, including health and wellbeing measures and the Big Five personality traits (openness, conscientiousness, extraversion, agreeableness, and neuroticism).

This work details the experimental deployment of the EAR sensor to over 250 participants in an office setting and shows how it can be used to improve the accuracy of forecasts of participants’ cognitive states. We then discuss the signal processing approach and audio filtering techniques.

The research is based upon work supported by the Office of the Director of National Intelligence (ODNI), Intelligence Advanced Research Projects Activity (IARPA), via 2017-17042800004. The views and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies or endorsements, either expressed or implied, of the ODNI, IARPA, or the U.S. Government. The U.S. Government is authorized to reproduce and distribute reprints for Governmental purposes notwithstanding any copyright annotation thereon.
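To make the pipeline above concrete, the following minimal Python sketch illustrates the three stages it describes: fitting a GMM over MFCC features to characterize a speaker, transcribing an audio snapshot with a pretrained DeepSpeech model, and mapping derived features to a trait label. This is an illustration under assumed tooling, not the authors’ implementation: the choice of librosa and scikit-learn, the model file name, and all function names are assumptions made for the sketch.

    # Illustrative sketch only -- not the authors' implementation.
    # Assumes: librosa (feature extraction), scikit-learn (GMM and
    # classifier), and Mozilla's deepspeech Python bindings (speech-to-text).
    import numpy as np
    import librosa
    from sklearn.mixture import GaussianMixture
    from sklearn.linear_model import LogisticRegression
    from deepspeech import Model

    SAMPLE_RATE = 16000  # DeepSpeech expects 16 kHz, 16-bit, mono PCM

    def speaker_gmm(wav_path, n_components=8):
        """Fit a GMM over MFCC frames to characterize one speaker's voice,
        so later snapshots can be scored against it for speaker association."""
        audio, _ = librosa.load(wav_path, sr=SAMPLE_RATE, mono=True)
        mfcc = librosa.feature.mfcc(y=audio, sr=SAMPLE_RATE, n_mfcc=13).T
        return GaussianMixture(n_components=n_components,
                               covariance_type="diag").fit(mfcc)

    def transcribe(wav_path, model_path="deepspeech-0.9.3-models.pbmm"):
        """Convert an audio snapshot to text for downstream sentiment and
        LIWC-style analyses. The model file name is a placeholder."""
        audio, _ = librosa.load(wav_path, sr=SAMPLE_RATE, mono=True)
        pcm16 = (audio * np.iinfo(np.int16).max).astype(np.int16)
        return Model(model_path).stt(pcm16)

    def fit_trait_classifier(features, labels):
        """Map per-participant feature vectors (e.g., word-category counts,
        sentiment scores, detected-event rates) to a binary trait label,
        such as high vs. low extraversion."""
        return LogisticRegression(max_iter=1000).fit(features, labels)

In a deployment of the kind described, per-snapshot likelihoods under each participant’s speaker GMM would determine which transcripts are attributed to the wearer, and the resulting feature vectors would feed the trait and state classifiers.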

Acknowledgements

(Acknowledgement included in Abstract text due to character restriction)

References

Ayaz, H., Shewokis, P. A., Bunce, S., Izzetoglu, K., Willems, B., & Onaral, B. (2012). Optical brain monitoring for operator training and mental workload assessment. NeuroImage, 59(1), 36-47.

Hannun, A., Case, C., Casper, J., Catanzaro, B., Diamos, G., Elsen, E., ... & Ng, A. Y. (2014). Deep speech: Scaling up end-to-end speech recognition. arXiv preprint arXiv:1412.5567. Source code available from https://github.com/mozilla/DeepSpeech.

Mehl, M. R. (2017). The Electronically Activated Recorder (EAR): A method for the naturalistic observation of daily social behavior. Current Directions in Psychological Science, 26(2), 184-190.

Sliwinski, M. J., Mogle, J. A., Hyun, J., Munoz, E., Smyth, J. M., & Lipton, R. B. (2018). Reliability and validity of ambulatory cognitive assessments. Assessment, 25(1), 14-30.

Keywords: audio signal processing, personality assessment, cognitive state assessment, social interactions, linguistics

Conference: 2nd International Neuroergonomics Conference, Philadelphia, PA, United States, 27 Jun - 29 Jun, 2018.

Presentation Type: Oral Presentation

Topic: Neuroergonomics

Citation: Collins GJ, Poleski J, Mehl MR, Tackman A, Reyes RA, Kraft AE, Russo JC, Kenny DE, Bryan PB, Simons EA and Casebeer WD (2019). Building a Cognitive Profile with a Non-Intrusive Sensor: How Speech and Sounds Map onto our Cognitive Worlds. Conference Abstract: 2nd International Neuroergonomics Conference. doi: 10.3389/conf.fnhum.2018.227.00013

Copyright: The abstracts in this collection have not been subject to any Frontiers peer review or checks, and are not endorsed by Frontiers. They are made available through the Frontiers publishing platform as a service to conference organizers and presenters.

The copyright in the individual abstracts is owned by the author of each abstract or his/her employer unless otherwise stated.

Each abstract, as well as the collection of abstracts, are published under a Creative Commons CC-BY 4.0 (attribution) licence (https://creativecommons.org/licenses/by/4.0/) and may thus be reproduced, translated, adapted and be the subject of derivative works provided the authors and Frontiers are attributed.

For Frontiers’ terms and conditions please see https://www.frontiersin.org/legal/terms-and-conditions.

Received: 02 Apr 2018; Published Online: 27 Sep 2019.

* Correspondence: Mr. Jason Poleski, Lockheed Martin, Bethesda, United States, jason.poleski@lmco.com