AUTHOR=Castaño-Candamil Sebastián , Meinel Andreas , Tangermann Michael TITLE=Post-hoc Labeling of Arbitrary M/EEG Recordings for Data-Efficient Evaluation of Neural Decoding Methods JOURNAL=Frontiers in Neuroinformatics VOLUME=13 YEAR=2019 URL=https://www.frontiersin.org/journals/neuroinformatics/articles/10.3389/fninf.2019.00055 DOI=10.3389/fninf.2019.00055 ISSN=1662-5196 ABSTRACT=

Many cognitive, sensory and motor processes have correlates in oscillatory neural source activity, which is embedded as a subspace in the recorded brain signals. Decoding such processes from noisy magnetoencephalogram/electroencephalogram (M/EEG) signals usually requires data-driven analysis methods. The objective evaluation of such decoding algorithms on experimental raw signals, however, is a challenge: the amount of available M/EEG data is typically limited, labels can be unreliable, and raw signals are often contaminated with artifacts. To overcome some of these problems, simulation frameworks have been introduced that support the development and benchmarking of data-driven decoding algorithms. To generate artificial brain signals, however, most existing frameworks make strong and partially unrealistic assumptions about brain activity, which limits how well results observed in simulation generalize to real-world scenarios. In the present contribution, we show how to overcome several shortcomings of existing simulation frameworks. We propose a versatile alternative that allows for an objective evaluation and benchmarking of novel decoding algorithms using real neural signals. It makes it possible to generate comparatively large datasets whose labels are deterministically recoverable from arbitrary M/EEG recordings. A novel idea for generating these labels is central to the framework: we determine a subspace of the real M/EEG recordings and use it to derive novel labels. These labels contain realistic information about the oscillatory activity of some underlying neural sources. For two categories of subspace-defining methods, we showcase how such labels can be obtained: either by an exclusively data-driven approach (independent component analysis, ICA) or by a method exploiting additional anatomical constraints (minimum norm estimates, MNE). We term our framework post-hoc labeling of M/EEG recordings.
To support the adoption of the framework by practitioners, we exemplify its use by benchmarking three standard decoding methods, namely common spatial patterns (CSP), source power comodulation (SPoC), and convolutional neural networks (ConvNets), with respect to varied dataset sizes, label noise, and label variability. Source code and data are made available to the reader to facilitate the application of our post-hoc labeling framework.
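
The central idea, deriving labels from the oscillatory power of a source that lies in a subspace of the recording itself, can be illustrated with a minimal NumPy sketch. This is not the authors' released code. The function names `bandpass_fft` and `posthoc_labels`, the 8 to 12 Hz band, the synthetic white-noise epochs, and the random spatial filter `w` (a hypothetical stand-in for one row of an ICA unmixing matrix or an MNE inverse operator) are all assumptions made for illustration. The median split into two classes is likewise only one possible way to obtain class labels.

```python
import numpy as np


def bandpass_fft(x, sfreq, lo, hi):
    """Crude band-pass: zero all FFT bins outside [lo, hi] Hz (last axis)."""
    freqs = np.fft.rfftfreq(x.shape[-1], d=1.0 / sfreq)
    spectrum = np.fft.rfft(x, axis=-1)
    spectrum[..., (freqs < lo) | (freqs > hi)] = 0.0
    return np.fft.irfft(spectrum, n=x.shape[-1], axis=-1)


def posthoc_labels(epochs, w, sfreq, band=(8.0, 12.0)):
    """Return one continuous label per epoch.

    epochs: array of shape (n_epochs, n_channels, n_samples)
    w:      spatial filter of shape (n_channels,) defining a 1-D subspace
    The label is the log band power of the spatially filtered source.
    """
    source = np.einsum("c,ecs->es", w, epochs)   # project onto the subspace
    narrow = bandpass_fft(source, sfreq, *band)  # isolate the oscillation
    return np.log(np.var(narrow, axis=-1))       # log band power per epoch


# Synthetic placeholder data; a real application would use recorded M/EEG epochs.
rng = np.random.default_rng(0)
sfreq, n_epochs, n_channels, n_samples = 100.0, 20, 8, 200
epochs = rng.standard_normal((n_epochs, n_channels, n_samples))

# Hypothetical spatial filter; in practice it would come from ICA or MNE.
w = rng.standard_normal(n_channels)

labels = posthoc_labels(epochs, w, sfreq)           # continuous labels
binary = (labels > np.median(labels)).astype(int)   # assumed median split
```

Because the labels are a deterministic function of the recording and the chosen filter, calling `posthoc_labels` again with the same inputs reproduces them exactly. The continuous labels would suit a regression-type decoder, and the dichotomized ones a classifier.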