Somatosensory imagery induces topographically specific activation patterns instrumental to fMRI-based Brain Computer Interfacing

For patients with ‘locked-in’ syndrome (LIS), brain-computer interfaces (BCIs) based on hemodynamics are a promising tool to regain communication. Visual and motor imagery are oftenused information-encoding strategies despite frequent visuo-motor impairments in LIS. However, evoking imagery not grounded in recent sensory experience is very challenging. Therefore, somatosensory imagery might be more suitable, as somatosensation is preserved even in late-stage LIS. Using 7T fMRI in healthy subjects, we find that somatosensory imagery is associated with topographic decoding-weight patterns in primary somatosensory cortex and show high-accuracy decoding with little pre-training. We immediately implement these novel findings to drive brain-based communication, providing a promising proof-of-concept for a novel type of communication BCI with high clinical potential. Our new BCI-control strategy paves the way for several future applications. Next to brain-based communication, it might be employed for neurofeedback-based training to restore somatotopy in stroke and chronic pain patients suffering somatosensory deficits.


Introduction
suitable for online BCI-based communication; (2) explore to what extent its classification is driven by somatotopic information. For this proof-of-concept study, we exploited the high signal to noise ratio of ultra-high field 7T functional magnetic resonance imaging.

Results
Ten healthy participants were instructed by auditory cues to perform alternating somatosensory imagery of the hand or foot (figure 1) while ultra-high field 7T fMRI data were collected. Each participant completed five (participant 7) to six somatosensory-imagery classifier-training runs. Seven participants also completed one or more answer-encoding runs in which they answered a binary question using imagined somatosensory stimulation to the right hand to encode "yes" and imagined somatosensory stimulation to the left foot to encode a "no" answer.

Suitability of somatosensory imagery for BCI
After establishing that all participants subjectively experienced somatosensory imagery (see methods section 4.2), we went on to test the suitability of this strategy to induce activation that can be reliably decoded for the purpose for BCI communication. We hypothesized that, using a machinelearning algorithm, the contents of somatosensory hand or foot imagery can be accurately decoded on the single-trial level with similar accuracy for each imagery class, with accuracy increasing when increasing numbers of runs are used to train the classifier (hypothesis 1a). Our second hypothesis (1b) was that somatosensory imagery would yield accurate decoding of answer runs in simulated real-time as well as during a proof-of-concept online brain-based communication session.
Testing the first hypothesis (1a), we found that significant above-chance classification accuracy (p<0.05, based on permutation testing) was achieved in eight out of ten subjects with one training run, and for all subjects for two or more trainings runs (figure 2). The average proportion correct classifications increased with the number of runs included for training (figure 3), up to 0.82 (sd= 0.12) when using five runs (four in one subject) for training and one run for testing. The effect of imagery class and the amount of training runs was tested in an imagery class (2, foot or hand) by training runs To test the second hypothesis (1b) and provide a proof-of-concept for brain-based communication, we used the answer runs that were available from seven subjects (17 runs in total, 7 'no', 10 'yes'; table 1). A simulated real-time analysis revealed no significant effect of the number of runs used for training (F(5, 30) = 1.470, p = 0.229, figure 4). Using all training runs, the average proportion correct for these seven subjects was 0.86 (sd = 0.17), mounting to 0.92 (sd =0.08) when disregarding one subject who, after the session, indicated he misunderstood the instructions for the answer runs. In the proof-of-concept online communication session with the last subject all answers to the questions were accurately decoded.

Assessment of somatotopy
Secondly, we wanted to assess whether classification success would be based on somatotopic information. We hypothesized (2a) that classification accuracy would be higher in primary somatosensory cortex, where somatotopy is most pronounced, compared to secondary somatosensory cortex and (2b) that the discriminative weights obtained from the support vector machine (SVM) would show a somatotopic distribution.
Next, we tested somatotopy of the discriminative SVM weights (hypothesis 2b). Visual inspection of discriminative SVM weight maps revealed a pattern that roughly corresponded to the expected somatotopy in four out of ten subjects (2, 8, 9 and 10; figure 6). A weaker somatotopic pattern was observed in one subject (4). A reversed pattern was noticed in two subjects (3 and 6), and a mixed pattern was seen in three final subjects (1, 5 and 7). In a more quantitative evaluation of the extent to which there was a somatotopic distribution in the discriminative SVM weights, we tested whether there was a distinction in the median coordinates of the voxels associated with the 200 largest positive (foot) weights and the largest negative (hand) weights. We focused on the x-coordinates only, because somatotopy is expected to be most pronounced on the medial-lateral axis of the brain. A Wilcoxon signed rank test revealed that there was a significant within-subject difference (z = 2.09, p = 0.0365) between the median x-coordinates of the voxels associated with the 200 largest positive (foot) weights and the median x-coordinate of the voxels associated with the 200 largest negative (hand) weights.
The median x-coordinate across subjects was 8.50 (sd = 14.5) for the 200 largest positive weights and -21.5 (sd = 18.4) for the 200 largest negative weights (figure 7). Also, we found a significant correlation between the difference in median x coordinates and the overall classification accuracy, i.e. subjects with a larger value for the foot-hand median x-coordinate difference (larger 'hand-foot distance') also showed a higher classification accuracy (Spearmann's rho .948, p = 0.000).

Discussion
The current study (1) investigated whether somatosensory imagery is a suitable encoding strategy for a hemodynamic (fMRI-based) BCI, aiming to provide a proof-of-concept of its suitability for online BCI-based communication and (2) tested to what extent its decoding is based on somatotopic information. In accordance with our first set of hypotheses, we found that the contents of somatosensory imagery could be decoded above chance in each of our participants with similar accuracy for hand or foot, and that decoding accuracy increased with an increasing number of training runs. We also found that answers to binary questions encoded by seven healthy participants using the somatosensory imagery BCI could be successfully decoded in simulated real-time as well as during a proof-of-concept online communication session tested in one subject. In accordance with our second set of hypotheses, we found that decoding accuracy was significantly higher in primary somatosensory cortex compared to secondary somatosensory cortex, and that discriminative weight patterns generated by healthy participants through somatosensory mental imagery showed a somatotopic distribution.

Suitability for use in fMRI Brain-Computer-interfacing
The average classification success was 82% for the offline leave-one-run out cross-validation procedure on the training runs, and 92% for the classification of the answer runs in simulated realtime (86% when including a subject who indicated afterwards that he misunderstood the instructions).
The fact that the classification success was higher for simulated real-time classification of the answer runs could be explained by the additional training run that was used for training the classifier in simulated real-time, whereas in the offline leave-one-run out procedure one run necessarily had to be left out for testing. However, the evaluation of the effect of additional training runs in the offline condition suggested that the beneficial effect of adding training runs levelled off after four runs.
Another possible explanation might lie in the slight differences in the analysis pipelines for the simulated real-time and offline analyses regarding slice-scan-time-correction and data transformation (Talairach transformation and downsampling to 2mm 3 offline versus smoothing with 4mm FWHM Gaussian kernel in native space in simulated-real-time). On the other hand, the increased accuracy for the answer run classification might also be related to enhanced imagery quality during the answer run, due to extended practice, focus on a single imagery strategy (in contrast to the alternating imagery strategies in the training runs) and possibly increased motivation because of the social interaction component of answering a genuine question.
Taken together, these findings inspire confidence that somatosensory imagery will be a viable additional control strategy for BCI communication procedures. Somatosensory imagery might be especially suited for locked-in patients with reduced visual or auditory faculties who lack muscle control. These patients might not be able to evoke stable imagery involving their most affected modalities, as they lack recent memories of sensory experiences which can increase the level of detail and vividness of mental representations generated by imagination (Hassabis & Maguire, 2009). The current study was performed at 7T, but given the relatively large distance between hand and foot representations in primary somatosensory cortex similar results could be expected at a field strength of 3T which is more commonly available in clinical settings. Our novel somatosensory imagery control strategy might also be applied in binary fNIRS-based communication BCIs. Of course, taking into account the considerably lower spatial resolution of fNIRS, optode placement and data analysis procedures would have to be optimized.

Topographic nature of somatosensory imagery
Our results indicate that decoding accuracy was higher within the primary than in the secondary somatosensory cortex mask, and that it is possible to induce somatotopic discriminative weight patterns by mental imagery. The median x-coordinates of the subjects with highest offline classification accuracies roughly correspond to values reported in the literature for finger (Kolasinski et al., 2016;Nelson & Chen, 2008;Pfannmoller, Greiner, Balasubramanian, & Lotze, 2016;Sanchez-Panchuelo, Francis, Bowtell, & Schluppeck, 2010;Schweizer, Voit, & Frahm, 2008) and foot representations (Akselrod et al., 2017; figure 7). The slightly more medial imagery-related weight coordinates could be due to the fact that, in the current study, imagery involved the whole hand whereas stimulation is usually delivered to the fingers only. For the foot, imagery coordinates were more inferior than reported stimulation coordinates. Similar observations of corresponding sites for imagery and stimulation in primary sensory cortices have also been made in previous studies in the somatosensory domain (de Borst & de Gelder, 2017;Newman et al., 2005;Schmidt et al., 2014;Wise et al., 2016;Yoo et al., 2003), however, to our knowledge, the current study is the first to report a somatotopic pattern of discriminative weights in the (primary) somatosensory cortex during imagery.
Previous studies have reported topographic recruitment of visual cortex during imagery (Klein et al., 2004;Slotnick, Thompson, & Kosslyn, 2005;Thirion et al., 2006) The presence of a somatotopic pattern in the discriminative weights, that is a more medial and right sided location for the highest positive weights corresponding to the imagery class 'foot' and a more lateral and left sided location for the highest negative weights associated with 'hand', was associated with higher classification success. This suggests that subjects might increase classification success by increasing the level of somatotopy in their imagery-induced brain activation patterns. If this finding can be extrapolated to subjects with maladaptive somatosensory representations, we speculate that the current BCI might be suitable to use for training the somatotopy of these representations. Somatosensory dysfunction and aberrant plasticity have been linked to pain severity in chronic pain (Di Pietro et al., 2013;Di Pietro, Stanton, Moseley, Lotze, & McAuley, 2015;Flor, Braun, Elbert, & Birbaumer, 1997;Kim, Kim, & Nabekura, 2017;Pleger et al., 2005;Wrigley et al., 2009), but see Makin et al. (2013) for an alternative view). Neurofeedback based on somatosensory mental imagery could be especially suitable for clinical interventions in patients for whom actual somatosensory stimulation is difficult or impossible due to their physical limitations or technical limitations of the scanner environment.
Somatotopic patterns could in theory also be evoked using motor imagery or even actual movement. However, our subjects were carefully instructed and trained in a mock scanner to exclusively use somatosensory imagery, and it was emphasized that they should not move during imagery runs. Subjects' compliance to these instructions is supported by the fact that we did not observe hand or foot movement during training or scanning. Also, during the training runs, subjects did not receive any feedback or reward based on their performance, so there was no incentive to try out non-somatosensory imagery strategies. In line with this, subjects' reports indeed indicated that they used purely somatosensory strategies (supplementary table 1).
Based on the current data, we cannot establish a link between classification accuracy and the use of a specific somatosensory imagery strategy: The two subjects with largest classification success used touch and warmth imagery, as did the two subjects with the lowest classification success. It would be interesting to investigate whether imagery targeting different somatosensory submodalities induces different patterns of activation in primary somatosensory cortex. However, such pattern differences might be most apparent within a limb-specific subregion, whereas the general somatotopy of hand and foot is most likely preserved across somatosensory submodalities (Sur, Wall, & Kaas, 1984).
We instructed our subjects to stick with the same strategy for hand and foot. Whereas some subjects only reported using imagery based on one somatosensory submodality (e.g. vibrotactile stimulation), others used imagery of more naturalistic somatosensory stimulation involving a combination of somatosensory submodalities. Such imagery, e.g., combining warmth and touch might increase vividness and evoke stronger activation. Work by Kosslyn and Thompson in the visual domain suggests that the level of vividness and detail is related to primary visual cortex activation during imagery (Kosslyn & Thompson, 2003). We therefore propose to leave the choice of strategy to the participant, and have them choose the strategy that subjectively evokes most vivid and detailed imagery and that they feel most comfortable with.
There are some limitations to this study. Somatosensory imagery involved only two contralateral body parts, which are relatively far apart in physical and somatotopic space, and whose representations reside on contralateral sides of the body. Thus, classification could have been based solely on less specific imagery related to body side. However, the fact that classification success was highest within the primary somatosensory cortex mask and linked to the degree of somatotopy in the discriminative weight maps indicates that subjects used a limb-specific somatosensory imagery strategy rather than a body-side related strategy. This is also in accordance with the introspective description of imagery strategies reported by the subjects. However, note that the question whether the brain signal is body-part or body-side specific is irrelevant in the context of the suggested twochoice communication BCI application, as long as it can be used at will.
A final limitation of this study is that only healthy subjects without neurological impairments participated in the current study. Patients might have shorter attentional spans leading to more 1 0 variable imagery quality. However, we found higher classification rates during answer runs than during training runs, possibly due to increased motivation. This effect might positively extrapolate to patients in a BCI communication setting.

Conclusion
Taken together, we conclude that somatosensory imagery is a successful informationencoding strategy for hemodynamic BCI-based communication with a high clinical potential. The current results pave the way for future applications, such as brain-based communication for patients with locked-in-syndrome, and neurofeedback-based training to support the rehabilitation of somatosensory representations in stroke and chronic pain patients suffering from somatosensory deficits.

Participants
Ten neurologically healthy participants (six males, mean age = 29 years, sd = 5 years, one left-handed) were recruited among students and staff members of the Faculty of Psychology and

Procedure and experimental design
Two to five days before the actual scanning took place, subjects participated in a training session in which they performed one or more imagery runs in a mock scanner. The purpose of this training session was to instruct them about the task and possible imagery strategies, and to encourage them to try several imagery strategies and select the one that worked best for them, evoking the most stable and vivid somatosensory experience, as assessed by introspection. Two imagery strategies were suggested as possible starting points. The first strategy was to imagine someone gently touching the inner surface of the hand or foot. The second strategy was to apply techniques from autogenic training, imagining a heavy and warm feeling in the hand or foot. The session lasted at least 30min or until the participant indicated (s)he had found a subjectively effective imagery strategy.
During the scanning session, the participants' heads were stabilized with foam padding to minimize head movement. They were instructed not to move and to keep their eyes closed during the 1 functional runs. They received auditory instructions indicating when they had to start imagining touch to the right hand ("hand") and the left foot ("foot"), and when they could stop ("rest"). Each participant completed five (participant 7) to six somatosensory-imagery classifier-training runs. Every classifier-training run contained a pseudorandom sequence of nine left-foot (LF) and nine right-hand (RH) somatosensory mental imagery trials of 18s duration, each followed by a 16, 18 or 20s rest period (figure 1). Seven participants also completed one or more answer-encoding runs in which they answered a binary question using imagined touch to the right hand to encode "yes" and imagined touch to the left foot to encode a "no" answer. Each answer-encoding run included five tactileimagery trials (encoding the same answer). Timing of the imagery trials and rest intervals was identical to the training runs. If possible, each question was followed by two answer runs. In the first answer run, subjects were asked to encode the actual answer to the question using the appropriate imagery strategy. In the second answer run, they were asked to encode the opposite answer (in order to obtain approximately the same number of "yes" and "no" answers-encoding data). The correct answers to the questions were unknown to the researcher responsible for data analysis until after completion of the offline analyses. In one subject, the answer runs were also analysed online and the results were reported back to the participant right after the run.
After scanning, participants were asked to fill out a questionnaire in which they had to rate the experienced quality of their somatosensory imagery (i.e., whether it was a robust mental experience) and the extent to which it evoked a tactile sensation (i.e., to what extent did they experience it as inducing tactile input coming from the target limb) on a ten-point scale (one being the worst/least applicable, ten being the best/most applicable). They also rated the percentage of time that they were successful at maintaining the imagery and described the particular imagery strategy they used.
All subjects reported successful somatosensory imagery. The type of strategies used ranged from imagining a warm feeling (four subjects), imagining vibratory stimulation to the finger or toe (two subjects), to imagining touch (massage, brush strokes, somatosensory sensation; four subjects; supplementary table 1). On a scale of 1-10, the average score subjects gave for the quality of their hand imagery was 6.9 (sd = 0.65) and of their foot imagery 6.6 (sd = 0.49). The average score in response to the question to what extent they experienced a tactile sensation during imagery was 5.3 (sd = 2.3) for the hand and 4.9 (sd = 1.9) for the foot. The reported average percentage of time that somatosensory imagery worked during the hand imagery epochs was 75% (sd = 4.5%, N=6) and 73% (sd = 8.2%, N=6) for the foot imagery epochs. Paired t-tests did not reveal any significant differences between foot-and hand-related ratings at p<0.05.

MRI data acquisition
Data were acquired at Maastricht Brain Imaging Centre (Maastricht University) on a 7T scanner (Siemens Healthcare, Germany) with a 32 receiver channel (Nova) head coil.
The scanning session included the acquisition of an anatomical scan (8min) and five to six classifier training runs (12min/run). In seven subjects, there was sufficient time to end the session with one to seven additional answer-encoding runs (1 answer run: 2 subjects, 2 answer runs: 4 subjects, 7 answer runs: 1 subject) in which the participant answered one or more binary questions using somatosensory imagery (4min/run).
In the functional runs, a gradient echo sequence was used with 78 slices of 1.5mm thickness, 1.5mm in-plane resolution, 198 mm x 198 mm field-of view, TR/TE = 2000/26.1ms, 60 degree flip angle, multiband acceleration factor 2 and GRAPPA acceleration factor 3 (340 volumes for the training runs, 96 volumes for the answer runs). In each session, five volumes were acquired in opposite phase encoding direction to be used for (offline) correction of EPI distortions.

Data analysis Preprocessing for offline multivariate analyses
Offline data analysis was performed using Brainvoyager QX 2.8.4 (BrainInnovation, Maastricht, the Netherlands). The anatomical scan was down-sampled to 1mm 3 . The first two volumes of each functional run were discarded. The remaining volumes were adjusted based on slicescan time, realigned to the first functional image of the session using rigid body transformations, temporally filtered to remove linear trends and nonlinear periodic signals with a frequency below 0.01Hz, corrected for EPI distortions, coregistered to the anatomical scan interpolating to 2mm 3 isotropic, and normalized to Talairach space. This procedure includes the definition of subject specific anatomical landmarks (AC [anterior commissure], PC [posterior commissure] and the borders of the cerebrum) that are used to rotate each brain in the AC-PC plane followed by piecewise, linear transformations to fit each brain in the common Talairach 'proportional grid' system (Talairach & Tournoux, 1988).
The multivariate analyses were limited to the voxels included in anatomical masks. Masks were obtained by creating anatomical regions of interest (ROIs) in Talairach space based on the Jülich probability maps for the primary (PSC3a, PSC3b, PSC2 and PSC2;Geyer, Schleicher, & Zilles, 1999;Geyer, Schormann, Mohlberg, & Zilles, 2000;Grefkes, Geyer, Schormann, Roland, & Zilles, 2001) and secondary somatosensory cortex (SIIOP1, SIIOP2,SIIOP3 and SIIOP4; Eickhoff, Amunts, Features (t-values per voxel) were extracted by linearly fitting each voxel within the mask with a design matrix consisting of a constant term (intercept) and a modelled hemodynamic response, within a temporal window of 30s, spanning from one volume (2s) before the auditory cue to 14 volumes (28s) after the auditory cue. The modelled hemodynamic response was obtained by convolving a canonical hemodynamic response function (hrf) with a box-car predictor whose value was one in the volumes corresponding to the imagery blocks, and zero otherwise.

Effect of the amount of training data and imagery class
To test the hypothesis that the amount of training data would increase decoding accuracy independent of the imagery class the number of runs used for SVM training was increased from one to five, each time testing on one of the remaining runs, using all possible combinations. Features were extracted from the largest, most inclusive anatomical masks of the primary and secondary somatosensory cortex. To test single-subject significance, permutation tests were performed. The edge corrected and logit transformed classification accuracies were subsequently entered in a two-way repeated-measures ANOVA in with imagery class (2) and number of training runs (5) as within-subject factors. To test whether accuracy increased with the number of training runs, within-subject planned polynomial contrasts were used.

Assessment in the context of BCI communication
To test the suitability of the current approach for (online) communication, a simulated realtime classification was performed using Turbo-BrainVoyager (version 3.2, BrainInnovation B.V., Maastricht, the Netherlands) to emulate the procedure that would be performed during the scanning session. In the last participant, a proof-of-concept online communication experiment was performed in which the answer runs were also decoded online during the scanning session using an SVM based on data from all available training runs.
The online and simulated real-time analyses were based on slice data in native resolution (1.5mm 3 ), which were 3D-motion corrected using the standard TBV incremental procedure spatially realigning each volume to the first recorded volume by rigid body transformation. The lowest probability (largest) S1 and S2 mask was back-transformed from Talairach space into native space.
After spatially smoothing the functional data with a 4 mm full -width -at -half -maximum (FWHM) kernel, t-values were computed by fitting a general linear model with the same predictors as used in the offline procedure, within the same temporal window of 30s, spanning from one volume (2s) before the auditory cue to 14 volumes (28s) after the auditory cue. T-values from the S1 and S2 mask were used as features for SVM training and testing using LIBSVM (2000-2009 Chih-Chung Chang andChih-Jen Lin).
In the simulated real-time analyses, the effect of the number of training runs was tested by using a classifier based on an increasing number of runs, starting from the run closest to the answer run to emulate the temporal adjacency that is characteristic for the online situation, each time adding a preceding run for each additional level for SVM training, and subsequently testing on the available answer run(s). The resulting edge-corrected and logit transformed accuracies were entered into a repeated-measures ANOVA with number of training runs (1-6) as within factor was used to statistically assess the effect of the amount of training data.

Assessment of somatotopy Comparing accuracy in primary and secondary somatosensory cortex masks
The effect on decoding accuracy of the anatomical region chosen for feature extraction was evaluated by repeating the cross-validation procedure with five runs (four for subject 7) for training and one run for testing with features extracted from the anatomical masks of the primary (PSC3a, PSC3b, PSC2 and PSC2) and secondary (SIIOP1, SIIOP2, SIIOP3 and SIIOP4) somatosensory cortex at different probability levels (Eickhoff, Amunts, et al., 2006;Eickhoff, Schleicher, et al., 2006;Geyer et al., 1999;Geyer et al., 2000;Grefkes et al., 2001). The probability level (a percentage) of a mask for a particular anatomical area indicates that all voxels in that mask were classified as part of that anatomical area in at least the corresponding percentage of subjects in the database used to construct the anatomical probability map. For example, an anatomical mask with a probability level of 30% contains voxels that were classified as belonging to that particular anatomical region in at least 30% of the subjects in the database used to construct these anatomical probability maps. The higher the probability level, the smaller the mask size (supplementary table 2). To test single-subject significance, permutation tests were performed.
Accuracy values were entered in a two-way repeated-measures ANOVA, with anatomical region (2, S1 or S2) and probability level (8, 10-80%) as within-subject factors, to test whether there was a difference between S1 and S2 in terms of decoding accuracy, and whether this was possibly affected by the probability level, which was further assessed using within-subject planned polynomial contrasts. For the ANOVA, only probability levels of 80% or lower were taken into account which corresponded to masks with more than 500 voxels, to ensure stable classification results.

Assessment of somatotopy in discriminative SVM weights
The extent to which the classification was based on somatotopically different weight patterns in primary somatosensory cortex was assessed using the average discriminative weights across the different training splits. Discriminative weights were extracted from single-subject maps (obtained when testing the overall single-subject classification accuracies) that represented the average weights across the different training splits, using SVM classifiers trained on features from the most inclusive anatomical mask, with all possible combinations of 4/5 runs for training. Because the representations of hand and foot are most clearly separated on x-axis (left-to right), we computed the subject-specific median x-coordinates for the 200 largest positive (foot imagery) and 200 largest negative weights (hand imagery), and compared them using a Wilcoxon signed rank test. Finally, the Spearman rank correlation was computed between the subject-specific differences in median x-coordinates for foot and hand and the leave-one-run-out classification accuracy revealed using phase-locked analysis and verified using multivoxel pattern analysis and functional connectivity. J Neurosci, 35 (7) Tables   Table 1: Data from simulated real-time analysis of answer runs. No answer-encoding data were obtained for subjects 4, 7 and 9 due to running short of MR scanning time.     Figure 5: Offline single-trial classification accuracies for S1 versus S2. Accuracies were obtained using a leave-one-run out cross-validation scheme based on spatial features extracted from separate masks of S1 (red line) and S2 (purple line) at different probability levels ranging from 10% to 80% . Higher probability levels were excluded because the masks included less than 500 voxels for the S2 mask. The shaded areas indicate the standard deviation for each case.