This article is part of the Research Topic

Bio-inspired Audio Processing, Models and Systems

Original Research Article. Provisionally accepted; the full text will be published soon.

Front. Neurosci. | doi: 10.3389/fnins.2018.00262

Real-Time Tracking of Selective Auditory Attention from M/EEG: A Bayesian Filtering Approach

Sina Miran¹, Sahar Akram², Alireza Sheikhattar¹, Jonathan Z. Simon¹,³,⁴, Tao Zhang⁵ and Behtash Babadi¹,⁴*
  • ¹Department of Electrical and Computer Engineering, University of Maryland, College Park, United States
  • ²Facebook (United States), United States
  • ³Department of Biology, University of Maryland, College Park, United States
  • ⁴The Institute for Systems Research, University of Maryland, College Park, United States
  • ⁵Starkey Hearing Technologies, United States

Humans are able to identify and track a target speaker amid a cacophony of acoustic interference, an ability often referred to as the cocktail party phenomenon. Results from several decades of studying this phenomenon have culminated in recent years in various promising attempts to decode the attentional state of a listener in a competing-speaker environment from non-invasive neuroimaging recordings such as magnetoencephalography (MEG) and electroencephalography (EEG). To this end, most existing approaches compute correlation-based measures by either regressing the features of each speech stream to the M/EEG channels (the decoding approach) or vice versa (the encoding approach). To produce robust results, these procedures require multiple trials for training purposes. Also, their decoding accuracy drops significantly when operating at high temporal resolutions. Thus, they are not well suited for emerging real-time applications such as smart hearing aid devices or brain-computer interface systems, where training data might be limited and high temporal resolutions are desired. In this paper, we close this gap by developing an algorithmic pipeline for real-time decoding of the attentional state. Our proposed framework consists of three main modules: (1) real-time and robust estimation of encoding or decoding coefficients, achieved by sparse adaptive filtering; (2) extraction of reliable markers of the attentional state, which generalizes the widely used correlation-based measures; and (3) a near real-time state-space estimator that translates the noisy and variable attention markers to robust and statistically interpretable estimates of the attentional state with minimal delay. Our proposed algorithms integrate various techniques including forgetting factor-based adaptive filtering, ℓ1-regularization, forward-backward splitting algorithms, fixed-lag smoothing, and Expectation Maximization.
We validate the performance of our proposed framework using comprehensive simulations as well as application to experimentally acquired M/EEG data. Our results reveal that the proposed real-time algorithms perform nearly as accurately as the existing state-of-the-art offline techniques, while providing a significant degree of adaptivity, statistical robustness, and computational savings.
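
As a rough illustration of the first module, the sketch below combines a forgetting factor with ℓ1-regularization and solves the resulting problem at each time step by forward-backward splitting (a gradient step on the exponentially weighted least-squares term followed by soft thresholding). It is not the authors' implementation: the function names `soft_threshold` and `sparse_adaptive_filter`, the parameters `lam` (forgetting factor), `gamma` (ℓ1 weight) and `n_iter` (inner iterations), and the choice of step size are all assumptions made for this example.

```python
import numpy as np


def soft_threshold(v, tau):
    """Proximal operator of tau * ||.||_1 (element-wise shrinkage)."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)


def sparse_adaptive_filter(X, y, lam=0.95, gamma=0.1, n_iter=20):
    """Track sparse regression coefficients over time (illustrative sketch).

    At each time t, approximately minimizes
        0.5 * sum_k lam**(t - k) * (y[k] - X[k] @ theta)**2 + gamma * ||theta||_1
    using a few warm-started forward-backward (proximal gradient) iterations.

    X: (T, p) array of covariates, y: (T,) array of responses.
    Returns a (T, p) array holding the estimate after each sample.
    """
    T, p = X.shape
    A = np.zeros((p, p))  # exponentially weighted sum of x x^T
    b = np.zeros(p)       # exponentially weighted sum of x * y
    theta = np.zeros(p)
    estimates = np.zeros((T, p))
    for t in range(T):
        x = X[t]
        A = lam * A + np.outer(x, x)
        b = lam * b + x * y[t]
        # Largest eigenvalue of A is the Lipschitz constant of the gradient.
        lipschitz = np.linalg.eigvalsh(A)[-1]
        if lipschitz > 0:
            step = 1.0 / lipschitz
            for _ in range(n_iter):
                grad = A @ theta - b                     # forward (gradient) step
                theta = soft_threshold(theta - step * grad,
                                       step * gamma)     # backward (proximal) step
        estimates[t] = theta
    return estimates
```

A forgetting factor closer to 1 averages over a longer effective window (roughly 1 / (1 - lam) samples), which reduces variance but slows adaptation to changes in the coefficients; a larger `gamma` shrinks more coefficients toward zero.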

Keywords: attention, auditory, real-time, dynamic estimation, EEG, MEG, state-space models, Bayesian filtering

Received: 18 Nov 2017; Accepted: 05 Apr 2018.

Edited by:

Malcolm Slaney, Google (United States), United States

Reviewed by:

Edmund C. Lalor, University of Rochester, United States
Dan Zhang, Tsinghua University, China
Alain De Cheveigne, École Normale Supérieure, France  

Copyright: © 2018 Miran, Akram, Sheikhattar, Simon, Zhang and Babadi. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

* Correspondence: Prof. Behtash Babadi, University of Maryland, College Park, Department of Electrical and Computer Engineering, College Park, United States