AUTHOR=Kostas Demetres, Aroca-Ouellette Stéphane, Rudzicz Frank
TITLE=BENDR: Using Transformers and a Contrastive Self-Supervised Learning Task to Learn From Massive Amounts of EEG Data
JOURNAL=Frontiers in Human Neuroscience
VOLUME=15
YEAR=2021
URL=https://www.frontiersin.org/journals/human-neuroscience/articles/10.3389/fnhum.2021.653659
DOI=10.3389/fnhum.2021.653659
ISSN=1662-5161
ABSTRACT=Deep neural networks (DNNs) used for brain-computer interface (BCI) classification are commonly expected to learn general features when trained across a variety of contexts, such that these features can be fine-tuned to specific contexts. While this approach has seen some success, we suggest that this interpretation is limited and that an alternative would better leverage the newly (publicly) available massive EEG datasets. We consider how to adapt techniques and architectures used for language modelling (LM), which appear capable of ingesting vast amounts of data, towards the development of encephalography modelling (EM) with DNNs in the same vein. We specifically adapt an approach used effectively for automatic speech recognition, which, like LMs, uses a self-supervised training objective to learn compressed representations of raw data signals. After adaptation to EEG, we find that a single pre-trained model is capable of modelling completely novel raw EEG sequences recorded with differing hardware and with different subjects performing different tasks. Furthermore, both the internal representations of this model and the entire architecture can be fine-tuned to a variety of downstream BCI and EEG classification tasks, outperforming prior work in task-specific (sleep stage classification) self-supervision.
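
To make the pre-training objective concrete, below is a minimal sketch of the kind of masked contrastive (InfoNCE-style) task the abstract describes: a convolutional encoder compresses raw EEG into latent vectors, some latent timesteps are masked, a transformer produces contextual outputs, and each masked output must identify its own latent target among the other latents in the sequence. This is an illustrative assumption in the spirit of the adapted speech-recognition approach, not the released BENDR code; the module names, shapes, masking rate, and the use of unquantized latents as targets are all hypothetical simplifications.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class TinyBENDRSketch(nn.Module):
        """Illustrative encoder + transformer; not the paper's released architecture."""
        def __init__(self, channels=20, dim=128):
            super().__init__()
            # Convolutional encoder: raw EEG (B, channels, T) -> latent sequence (B, T', dim)
            self.encoder = nn.Sequential(
                nn.Conv1d(channels, dim, kernel_size=3, stride=3),
                nn.GELU(),
                nn.Conv1d(dim, dim, kernel_size=2, stride=2),
                nn.GELU(),
            )
            layer = nn.TransformerEncoderLayer(d_model=dim, nhead=8, batch_first=True)
            self.transformer = nn.TransformerEncoder(layer, num_layers=2)
            self.mask_token = nn.Parameter(torch.randn(dim))

        def forward(self, x, mask):
            z = self.encoder(x).transpose(1, 2)    # (B, T', dim) latent targets
            corrupted = z.clone()
            corrupted[mask] = self.mask_token      # replace masked steps with a learned token
            c = self.transformer(corrupted)        # (B, T', dim) contextual outputs
            return z, c

    def contrastive_loss(z, c, mask, temperature=0.1):
        """InfoNCE: each masked context vector must pick out its own latent target,
        with the other latents from the same sequence acting as negatives."""
        losses = []
        for b in range(z.size(0)):
            idx = mask[b].nonzero(as_tuple=True)[0]
            if idx.numel() == 0:
                continue
            sim = F.cosine_similarity(c[b, idx].unsqueeze(1), z[b].unsqueeze(0), dim=-1)
            losses.append(F.cross_entropy(sim / temperature, idx))
        return torch.stack(losses).mean()

    # Toy usage: 4 recordings, 20 channels, 600 raw EEG samples each.
    x = torch.randn(4, 20, 600)
    model = TinyBENDRSketch()
    with torch.no_grad():
        t_prime = model.encoder(x).shape[-1]
    mask = torch.rand(4, t_prime) < 0.5            # mask roughly half the latent timesteps
    z, c = model(x, mask)
    loss = contrastive_loss(z, c, mask)
    loss.backward()

Because the loss depends only on the signal itself (no labels), a model like this can be pre-trained on large pooled EEG corpora and then fine-tuned, either from its internal representations or end to end, on downstream BCI and sleep-staging classification tasks as the abstract reports.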