Rapid Communication with a “P300” Matrix Speller Using Electrocorticographic Signals (ECoG)

A brain–computer interface (BCI) can provide a non-muscular communication channel to severely disabled people. One particular realization of a BCI is the P300 matrix speller that was originally described by Farwell and Donchin (1988). This speller uses event-related potentials (ERPs) that include the P300 ERP. All previous online studies of the P300 matrix speller used scalp-recorded electroencephalography (EEG) and were limited in their communication performance to only a few characters per minute. In our study, we investigated the feasibility of using electrocorticographic (ECoG) signals for online operation of the matrix speller, and determined associated spelling rates. We used the matrix speller that is implemented in the BCI2000 system. This speller used ECoG signals that were recorded from frontal, parietal, and occipital areas in one subject. This subject spelled a total of 444 characters in online experiments. The results showed that the subject sustained a rate of 17 characters/min (i.e., 69 bits/min), and achieved a peak rate of 22 characters/min (i.e., 113 bits/min). Detailed analysis of the results suggests that ERPs over visual areas (i.e., visual evoked potentials) contribute significantly to the performance of the matrix speller BCI system. Our results also point to potential reasons for the apparent advantages in spelling performance of ECoG compared to EEG. Thus, with additional verification in more subjects, these results may further extend the communication options for people with serious neuromuscular disabilities.


Introduction
Many people affected by neurological or neuromuscular disorders such as amyotrophic lateral sclerosis (ALS), brainstem stroke, or spinal cord injury are impaired in their ability to communicate, or are unable to communicate at all. A brain-computer interface (BCI) uses brain signals to restore some of the lost function. A BCI approach that several groups have begun to test in clinical applications in humans (e.g., Sellers et al., 2010; Vaughan et al., 2006; Nijboer et al., 2008; see Donchin and Arbel, 2009 for a comprehensive review) is the matrix-based speller originally described by Farwell and Donchin (1988). This speller uses different event-related potentials (ERPs), including the P300 evoked response. In this system, the user attends to a character in a matrix while each row or column flashes rapidly and pseudo-randomly. The brain produces a response to the row or column that contains the intended character (i.e., the oddball); this response differs from the response to the other rows or columns. The BCI can detect the desired character by determining the row and column that produce the largest evoked response. Using this approach, recent electroencephalography (EEG)-based studies (Serby et al., 2005; Sellers et al., 2010; Lenhardt et al., 2008; Nijboer et al., 2008; Guger et al., 2009) reported real-time accuracies from 79 to 91% (6 × 6 matrix of 36 characters; 2.8% chance) at 13-42 s per selection.
Grid placement and the duration of ECoG monitoring were based solely on the requirements of the clinical evaluation, without any consideration of this study. Following placement of the subdural grid, postoperative CT imaging verified grid location (Talairach and Tournoux, 1988).

Data Collection
We recorded ECoG from the implanted electrodes using six g.USBamp amplifier/digitizer systems (g.tec, Graz, Austria) and the BCI software platform BCI2000 (Mellinger and Schalk, 2007; Schalk and Mellinger, 2010). Simultaneous clinical monitoring was implemented using a connector that split the cables coming from the patient into one set that was connected to the clinical monitoring system and another set that was connected to the g.USBamp devices. Thus, at no time was clinical care or clinical data collection compromised. Two electrocorticographically silent electrodes (i.e., locations that were not identified as eloquent cortex by electrocortical stimulation mapping) over inferior and superior posterior parietal cortex served as ground and reference, respectively. We used a grounding connection between the g.USBamp systems and the patient's skin to dissipate any electric currents generated by external electromagnetic fields and to block electromagnetic interference. The amplifiers sampled the signal at 512 Hz and used a high-pass filter at 0.1 Hz and a notch filter at 60 Hz.

Experimental Paradigm
The subject sat 60 cm in front of a flat-screen monitor. She was presented with a matrix of alphanumeric characters that was centered on the screen and arranged in a 6 × 6 configuration (see Figure 2). At this distance, the matrix subtended ±7.1° of the horizontal and vertical visual field.
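As a quick geometric check of the viewing setup, the following sketch converts the stated visual angle into a physical size on the screen. It assumes a flat, centered matrix viewed head-on; the function name is illustrative and does not come from the original study.

```python
import math


def half_extent_cm(distance_cm: float, half_angle_deg: float) -> float:
    """Distance from the screen center to the matrix edge for a given half-angle."""
    return distance_cm * math.tan(math.radians(half_angle_deg))


# Values from the text: 60 cm viewing distance, +/-7.1 degrees of visual field.
half = half_extent_cm(60.0, 7.1)
full = 2.0 * half
print(f"half extent: {half:.2f} cm, full extent: {full:.2f} cm")
```

Under these assumptions, the matrix spans roughly 15 cm horizontally and vertically.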
The subject participated in a recording session that consisted of offline and online experiments. In the offline (i.e., calibration) experiments, the BCI2000 matrix speller flashed each of the 12 rows or columns in a pseudo-random sequence. Flashes occurred at a rate of 16 Hz. Each flash lasted 1/64 s (16 ms) to 3/64 s (47 ms), followed by a 1/64 to 3/64-s inter-stimulus period. The intensity contrast between a flash and a non-flash was 3:1. Fifteen flash sequences comprised one trial.

Methods

Human Subject
The subject in this study was a 29-year-old right-handed woman with intractable epilepsy who underwent temporary placement of subdural electrode arrays (see Figure 1A) to localize seizure foci prior to surgical resection. The subject had corrected-to-normal vision and gave informed consent through a protocol reviewed and approved by the review board of Albany Medical College.
The subject had a total of 96 subdural electrode contacts (i.e., one 8 × 8 64-contact grid, one 23-contact grid, and two strips in 1 × 6 and 1 × 3 configuration, respectively). These grids/strips were placed over the left hemisphere in frontal, parietal, temporal, and occipital regions (see Figure 1B for details). The implants consisted of flat electrodes with an exposed diameter of 2.3 mm and an inter-electrode distance of 1 cm, and were implanted for 1 week.

Figure 2. The subject sat 60 cm in front of a flat-screen monitor that presented a centered 6 × 6 matrix containing alphanumeric characters as well as space (Sp) and backspace (Bs). The rows and columns in the matrix flashed rapidly and pseudo-randomly. The subject's task was to pay attention to the intended character. The computer determined the intended character from the subject's ECoG responses.

Farwell and Donchin (1988) proposed multiple approaches to determine the target-related response from data for which the intended selection is known (i.e., calibration data). These approaches included stepwise regression, peak picking, area under the curve measurements, and the covariance. In our study, we used a stepwise regression procedure that has been described in detail in Krusienski et al. (2006). In brief, we first filtered the brain signal from each channel between 0.1 and 20 Hz and downsampled it to 40 Hz. The downsampled ECoG signal of all 96 channels for 500 ms after stimulus onset comprised a total of 1920 potential signal features. A stepwise procedure then produced a linear model that predicted, given a subset of all features, whether the stimulus associated with these features was a target or a non-target. In this iterative procedure, each step added the most significant and/or removed the least significant feature based on the p-value of an F-statistic (p_add = 0.1, p_remove = 0.15; Jennrich, 1977).
To prevent overfitting, the stepwise procedure limited the number of features to 60 and terminated when a step did not further improve the regression model or when the maximum number of iterations (5000) was reached. In summary, this procedure reduced the 1920 potential ECoG features to a maximum of 60 features, and resulted in a linear model that was predictive of target or non-target. This linear model was applied to the ECoG response to each stimulus (i.e., row or column flash). The row and column with the highest model output defined the predicted character. Because there were 36 characters, chance accuracy was 2.8%.
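The selection step described above can be sketched as follows. This is a minimal illustration, not the BCI2000 implementation: the function names are hypothetical, the weights stand in for whatever the stepwise procedure selected, and summing the model output over repeated flashes of the same row or column is an assumption about how multiple flash sequences are combined.

```python
from typing import List, Sequence, Tuple

# One flash: ("row" or "col", index 0-5, feature vector for that stimulus).
Flash = Tuple[str, int, Sequence[float]]

CHANCE_ACCURACY = 1 / (6 * 6)  # 36 characters, about 2.8%


def model_output(weights: Sequence[float], features: Sequence[float]) -> float:
    """Linear model: weighted sum of the selected features."""
    return sum(w * x for w, x in zip(weights, features))


def predict_cell(flashes: List[Flash], weights: Sequence[float],
                 n_rows: int = 6, n_cols: int = 6) -> Tuple[int, int]:
    """Return the (row, column) whose flashes produced the highest summed output."""
    totals = {"row": [0.0] * n_rows, "col": [0.0] * n_cols}
    for kind, index, features in flashes:
        totals[kind][index] += model_output(weights, features)  # assumed: sum over sequences
    best_row = max(range(n_rows), key=totals["row"].__getitem__)
    best_col = max(range(n_cols), key=totals["col"].__getitem__)
    return best_row, best_col


# Toy data: only row 2 and column 4 carry a response on the first feature.
toy_weights = [1.0, -0.5]
toy_flashes: List[Flash] = (
    [("row", r, [1.0 if r == 2 else 0.0, 0.0]) for r in range(6)]
    + [("col", c, [1.0 if c == 4 else 0.0, 0.0]) for c in range(6)]
)
predicted = predict_cell(toy_flashes, toy_weights)
print(predicted)
```

The predicted character is the one at the intersection of the returned row and column.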

Online Experiments
For each online experiment, we used one of three different flash durations (i.e., 1/64, 2/64, 3/64 s). For each flash duration, we collected calibration data ("THE QUICK BROWN") and performed the offline analyses described above to establish a regression model. We then used this model to evaluate online system performance.
In these online experiments, we asked the subject to use the matrix speller BCI system to spell "THE QUICK BROWN FOX JUMPS OVER THE LAZY DOG." The BCI system provided feedback on the predicted characters as shown in Figure 2. The subject performed a "backspace" selection to correct for incorrect selections.

Optimization of System Performance
Over the course of online experimentation, we continually adjusted system parameters (i.e., the flash duration and number of flash sequences) so as to optimize the subject's information transfer rate. The results are shown in Figure 4 and Table 1. For one flash sequence, spelling accuracy reached a maximum of 81% (see Figure 4) at a flash duration of 3/64 s. We then used a flash duration of 3/64 s (i.e., 47 ms) and increased the number of flash sequences. The accuracy reached 98% at three flash sequences, while the actual information transfer rate (i.e., bit rate), which was calculated including stimulation and flight time, peaked at 60.5 bits/min at two flash sequences (i.e., a selection every 4.5 s). In a subsequent seventh 3.5 min run, we reduced the time between selections to 2 s. The subject achieved a selection every 3.5 s at 86.4% accuracy. This represents an information transfer rate of 69 bits/min or 17 characters/min.

In each calibration trial, the subject's task was to pay attention to the highlighted character in the words "THE QUICK BROWN," and to make a mental note (i.e., to count) each time the correct row/column flashed. A 3-s pause (i.e., "flight time") between characters gave the subject time to shift her attention onto the next character. We used the ECoG data collected in this calibration experiment to establish a classifier using the stepwise regression method reported in Krusienski et al. (2006). We then configured the BCI to use this classifier in online experiments.
During each of the seven online experiments, the subject copyspelled the sentence "THE QUICK BROWN FOX JUMPS OVER THE LAZY DOG." The BCI system provided feedback of the characters predicted from the ECoG signals. The subject selected "backspace" to correct incorrect selections. In the seven online experiments, the subject spelled a total of 301 characters (i.e., 444 characters including "backspace" and subsequent corrections) using different stimulation parameters that are described in more detail in the Section "Results."

Offline Analyses
In offline analyses of data from each of the calibration experiments, we first filtered the signal between 0.1 and 20 Hz and downsampled it to 40 Hz. We then extracted the stimulus response, i.e., the ECoG signals from all 96 channels for 500 ms after stimulus onset (see Figure 3). This yielded 20 features (i.e., 40 × 0.5 = 20) per channel or a total of 1920 features for all 96 channels. We define a sequence to be 12 flashes, i.e., flashes of six rows and six columns of the presented matrix. Of these 12 flashes, two (i.e., the row and column that contained the desired character) are expected to elicit a target evoked response (i.e., oddball ERP) and 10 are not. With 15 flash sequences in each trial, this yielded 30 target ERPs and 150 nontarget ERPs. As we recorded 13 trials (i.e., each character in "THE QUICK BROWN") during a calibration experiment, this resulted in a total of 390 target and 1950 non-target ERPs for calibration.
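The feature and epoch counts in this paragraph follow from simple arithmetic, which the short sketch below reproduces. All constants are taken from the text; the variable names are illustrative only.

```python
SAMPLE_RATE_HZ = 40          # after downsampling
WINDOW_S = 0.5               # 500 ms after stimulus onset
N_CHANNELS = 96
FLASHES_PER_SEQUENCE = 12    # six rows + six columns
TARGETS_PER_SEQUENCE = 2     # the row and the column containing the character
SEQUENCES_PER_TRIAL = 15
CALIBRATION_TEXT = "THE QUICK BROWN"

features_per_channel = int(SAMPLE_RATE_HZ * WINDOW_S)
total_features = features_per_channel * N_CHANNELS

# One trial per character; spaces were not spelled during calibration.
n_trials = len(CALIBRATION_TEXT.replace(" ", ""))

targets_per_trial = TARGETS_PER_SEQUENCE * SEQUENCES_PER_TRIAL
nontargets_per_trial = (FLASHES_PER_SEQUENCE - TARGETS_PER_SEQUENCE) * SEQUENCES_PER_TRIAL

n_target = targets_per_trial * n_trials
n_nontarget = nontargets_per_trial * n_trials

print(features_per_channel, total_features)  # 20 1920
print(n_trials, n_target, n_nontarget)       # 13 390 1950
```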

Stepwise Regression Model
In the matrix speller paradigm, the subject's selection is predicted by the intersection of the row and column that elicit the largest target-related response.

In the per-location analysis, each classifier was evaluated on the online data with the same flash duration and 1-3 flash sequences. Figure 5 shows the locations of all 96 subdural electrodes (blue dots) and the corresponding color-coded classification accuracies. Accuracy ranged from chance level (1/(6 × 6) = 2.8%) to 50% for the best electrode location.
Statistical comparisons (two-sample t-test, Bonferroni corrected for the number of features, i.e., 1920) of each extracted feature (ECoG amplitudes at a given time and location) between target and non-target conditions revealed statistically significant (p << 0.001) differences over wide-spread areas in secondary visual cortex (see locations marked with A, B, C, D in the brain plot in Figure 5), associative visual cortex (E), angular gyrus (F), and somatosensory association cortex (G). The traces below show the correlation of the ECoG signals following the flash with the type of the ERP (i.e., target vs. non-target). This correlation analysis for locations A-G showed dominant peaks between 125 and 175 ms after the flash. The polarities of these peaks were reversed between the neighboring electrodes C, D, and E. Furthermore, signals recorded from angular gyrus (F), but not other locations, were sensitive to the orientation (i.e., row or column) of the attended flash (p = 0.00003).

Optimizing Number of Electrodes
The results presented in the previous section show that, in this particular subject, ERPs recorded from electrodes over visual cortex contribute significantly to the performance of the matrix speller BCI system. This suggests that a similar level of performance may be achieved using recordings from only a few electrodes over a relatively small area, which is important for potential clinical application of this approach. Thus, we were interested in the relationship between the number of utilized electrodes over visual cortex and spelling performance.
To do this, in offline post hoc analyses, we evaluated spelling performance using 1-6 electrodes over visual cortex (i.e., locations A-F in Figure 5) and 1-3 flash sequences. In these analyses, we used the same calibration data as in the online experiment (i.e., "THE QUICK BROWN," 15 flash sequences, 3/64 s flash duration). We then established one classifier for each possible combination of the 1-6 electrodes over visual cortex. For each combination, we then applied the corresponding classifier to the data from the online experiments. The results in Figure 6 and Table 2 show the relationship between the best combinations of 1-6 electrodes and spelling performance, i.e., accuracy and bit rate, for 1-3 flash sequences. The results suggest that this particular subject could achieve a maximum of 100% classification accuracy at three flash sequences and four electrodes, and a maximum of 64 bits/min at two flash sequences and five electrodes. Furthermore, one bipolar derivation (between locations C and A) may already allow for 57 bits/min or 90% of the peak spelling performance supported by five electrodes (see Table 2).
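The exhaustive search over electrode combinations can be sketched as below. This is an illustration under stated assumptions: the `score` callable is a hypothetical stand-in for training a classifier on a subset and measuring its accuracy or bit rate on the online data, and the bipolar derivation mentioned in the text is not modeled.

```python
from itertools import combinations
from typing import Callable, Dict, List, Tuple

LOCATIONS = ("A", "B", "C", "D", "E", "F")  # electrode labels from Figure 5


def electrode_subsets(locations: Tuple[str, ...] = LOCATIONS) -> List[Tuple[str, ...]]:
    """All non-empty subsets, ordered by size (1 up to len(locations))."""
    return [subset
            for size in range(1, len(locations) + 1)
            for subset in combinations(locations, size)]


def best_subset_per_size(
    score: Callable[[Tuple[str, ...]], float]
) -> Dict[int, Tuple[str, ...]]:
    """For each subset size, keep the subset with the highest score."""
    best: Dict[int, Tuple[str, ...]] = {}
    for subset in electrode_subsets():
        size = len(subset)
        if size not in best or score(subset) > score(best[size]):
            best[size] = subset
    return best


subsets = electrode_subsets()
print(len(subsets))  # 2**6 - 1 = 63 candidate classifiers


# Hypothetical scoring function for demonstration only (not measured data).
def toy_score(subset: Tuple[str, ...]) -> float:
    return subset.count("C") + 0.5 * subset.count("A")


best = best_subset_per_size(toy_score)
print(best[1], best[2])
```

With six candidate locations, the search covers 63 subsets, so an exhaustive evaluation is inexpensive at this scale.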

Discussion
The results of this study show that ECoG can support matrix BCI spelling at a sustained rate of 17 characters/min (i.e., 69 bits/min) and a peak rate of 22 characters/min (i.e., 113 bits/min). In line with recently completed studies (Brunner et al., 2010a,b; Treder and Blankertz, 2010), our offline analyses show that visual areas provided important contributions to the subject's performance. The results also indicate that only one bipolar derivation over visual cortex could support almost the same level of performance.

In a final run, we further decreased the number of flash sequences to one. In this run, which is shown in Video 1 in Supplementary Material, the subject spelled the word "FLOWER" at a rate of 2.75 s/character (i.e., 22 characters/min or 113 bits/min).
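The timing and peak bit rate reported for this final run can be checked with a short calculation. The sketch below assumes the commonly used Wolpaw definition of bits per selection; the text does not state which formula the authors used, so results for imperfect accuracy may differ from the reported values. It also assumes that the "FLOWER" run was error-free, which is an inference from the reported 113 bits/min at 22 characters/min. Function names are illustrative.

```python
import math

FLASHES_PER_SEQUENCE = 12   # six rows + six columns
FLASH_RATE_HZ = 16.0        # flash onsets per second, as stated in the text
N_SYMBOLS = 36              # 6 x 6 matrix


def selection_time_s(n_sequences: int, flight_time_s: float) -> float:
    """Stimulation time for the given number of sequences plus flight time."""
    stimulation_s = n_sequences * FLASHES_PER_SEQUENCE / FLASH_RATE_HZ
    return stimulation_s + flight_time_s


def bits_per_selection(n_symbols: int, accuracy: float) -> float:
    """Wolpaw-style information per selection (assumed definition)."""
    if not 0.0 < accuracy <= 1.0:
        raise ValueError("accuracy must be in (0, 1]")
    bits = math.log2(n_symbols)
    if accuracy < 1.0:
        bits += accuracy * math.log2(accuracy)
        bits += (1.0 - accuracy) * math.log2((1.0 - accuracy) / (n_symbols - 1))
    return bits


def bits_per_min(n_symbols: int, accuracy: float, seconds_per_selection: float) -> float:
    return bits_per_selection(n_symbols, accuracy) * 60.0 / seconds_per_selection


# One flash sequence and a 2-s flight time, assumed error-free.
t_final = selection_time_s(n_sequences=1, flight_time_s=2.0)
chars_per_min = 60.0 / t_final
peak_bits_per_min = bits_per_min(N_SYMBOLS, 1.0, t_final)
print(f"{t_final:.2f} s/char, {chars_per_min:.1f} chars/min, {peak_bits_per_min:.1f} bits/min")
```

The same timing function gives 4.5 s per selection for two flash sequences with a 3-s flight time, and 3.5 s with a 2-s flight time, matching the selection intervals reported in the text.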

Cortical Locations with Significant Evoked Responses
The results presented in the previous section demonstrated that the BCI system successfully predicted the intended character online with an accuracy of 81% using only one flash of each row/column. We were interested in the physiological basis for this successful demonstration, i.e., in the cortical locations and ERP components that held significant information. To do this, we trained the classifier separately on each location using the calibration data with a flash duration of 3/64 s, and evaluated performance on the online data with the same flash duration and 1-3 flash sequences.

In conclusion, with verification of our results in more subjects, these findings may increase the BCI-based communication options for people with serious motor disabilities.
The spelling rate of the ECoG-based matrix speller BCI shown here is beginning to match or even exceed that of conventional assistive devices. These devices are often either intrusive (e.g., cheek or tongue-switch), cumbersome (e.g., letter board), or susceptible to fatigue (e.g., video-based eye-trackers using the corneal reflection). Thus, while invasive, the BCI method presented here may provide distinct advantages over those conventional assistive devices.
While the spelling rate shown here is very high, it is still at least one order of magnitude slower than conventional communication (e.g., 200-400 characters/min using keyboard or voice; Majaranta and Räihä, 2002; Schalk, 2008). Although the spelling rate of the matrix speller could be further improved, there are fundamental limitations to these potential improvements. These limitations are due to the required dwell time (i.e., the time during which the rows/columns are intensified) and the flight time (i.e., the time between two characters).

Generation of a VEP, however, depends on foveation of the target character. This is of critical relevance to clinical application of this BCI method, because eye movements are often impaired or lost in the target population. For example, although some people with ALS maintain residual eye movement for years (Cohen and Caroscio, 1983; Palmowski et al., 1995; Birbaumer and Cohen, 2007), others progress to near-complete or complete paralysis. The distance to foveation influences visual acuity and also VEP amplitude (Sherman, 1979; De Keyser et al., 1990) and thus would reduce the performance of any BCI that depends at least in part on VEPs.
An interesting finding was the polarity reversal of VEPs recorded from neighboring electrodes. While recording at the cortical surface (ECoG) can record these polarity-reversed VEPs, EEG recordings may only record the canceled superposition (Di Russo et al., 2002;Makeig et al., 2002). This cancellation effect may be one reason why the performance of EEG-based matrix speller systems, despite wider cortical coverage (e.g., 64 scalp locations of an extended 10-20 montage; Sharbrough et al., 1991), appears to be lower than that shown here.
While quite encouraging, the results shown here are based on only one subject who had coverage of large cortical areas including visual areas. Thus, it is currently unclear whether the results presented here will generalize to other subjects. Furthermore, while we were able to make general performance comparisons of this ECoG-based study with previously published EEG-based studies, we did not compare performance of ECoG and EEG within this subject.
The linear relationship between the flash duration and the accuracy, as well as the fact that only one electrode was sensitive to the orientation (i.e., row or column) of the attended flash, suggests that, in this particular subject, the magnitude of the ERP in response to visual stimulation was determined mostly by luminance. However, many previous studies have shown that the cortex performs neuronal processing of other features of visual stimuli, such as spatial frequency, orientation, motion, direction, speed, and many other spatiotemporal features (Hubel and Wiesel, 1959, 1962; Zeki et al., 1991). A recent study (Martens et al., 2009) showed that these properties of the visual system can be exploited to increase the amplitude of the EEG response, and thereby increase overall classification accuracy.

In our study, we used single-flash sequence presentation/classification (i.e., the smallest possible number) and a dwell time (i.e., the time the subject sustained eye-gaze/attention) of as little as 0.75 s. While this dwell time compares favorably to what is used in other assistive devices (e.g., 0.6-1.0 s for a modern eye-tracker; Majaranta and Räihä, 2002), these other devices tend to provide higher communication performance. This is because the matrix spelling paradigm used here also requires a flight time during which the subject produces brain responses, the computer evaluates the responses, and the subject shifts gaze/attention to the next character. It appears impractical to further substantially decrease either the 2-s flight time or the 0.75-s dwell time. Thus, the paradigm presented here is likely limited to a spelling rate that is only modestly higher than what we report here. This limitation appears to have two reasons. First, the current paradigm is synchronous, i.e., the subject has to synchronize his/her behavior with the timing of the BCI. This requires the subject to shift eye-gaze/attention onto the intended character within the 2-s flight time and to sustain eye-gaze/attention for the 0.75-s dwell time.
One potential solution to overcome this limitation is an asynchronous paradigm, i.e., a paradigm in which the subject does not have to synchronize behavior with the system. SSVEP-based BCIs often use such asynchronous paradigms. In such a paradigm, the subject performs a selection by focusing eye-gaze on the target character (i.e., one of multiple light sources flickering at different frequencies) while the BCI detects those frequencies in the EEG recorded over occipital cortex (Middendorf et al., 2000). These paradigms not only overcome the synchronization requirement, they also permit stimulating each potential target independently for the whole dwell time (i.e., by using individual frequencies for each potential target). Using such a paradigm, Bin et al. (2009) reported 18.7 characters/min for EEG. The use of this paradigm with ECoG may further increase performance.
The results suggest that ERPs over visual areas (VEPs) contribute significantly to the performance of the matrix speller BCI system. Recent studies (Bin et al., 2009; Martens et al., 2009) suggest that time-, frequency-, and code-based stimulation may elicit a wide range of VEPs while minimizing the flight time and obtrusive flickering that currently limit the utility of P300- and SSVEP-based BCIs.

Table 2. Relationship between the number of electrodes over visual cortex and the accuracy (left) or bit rate (right) that this subject can achieve with these electrodes at 1-3 flash sequences. The data in this table correspond to the traces in Figure 6; locations A-F correspond to the electrode locations and evoked responses in Figure 5. *Bipolar derivation.
At the same time, the clinical value of an ECoG-based matrix speller BCI remains unclear. Compared to non-invasive approaches, an ECoG-based approach entails additional costs and risks. More generally, despite some encouraging successes of noninvasive matrix spellers (Nijboer et al., 2008), it is still unclear to what extent matrix spellers can serve the needs of people with disabilities, in particular those in whom eye-gaze is compromised: two recent studies (Brunner et al., 2010b; Treder and Blankertz, 2010) demonstrated that the performance of the matrix speller depends substantially on the subject's ability to fixate the target character. It is also unclear whether similarly fast stimulation rates (i.e., 16 Hz) can be used in people with disabilities. Even if the high speed suggested by this study could be translated to clinical applications, it is unclear to what extent end users will find this increased spelling rate desirable. Furthermore, it is currently unknown whether the added benefit of increased robustness and/or increased spelling rate will outweigh the additional cost of surgical implantation. More generally, it is still debated whether people with complete paralysis can even achieve and maintain brain-based control, irrespective of whether EEG or ECoG is used (Hill et al., 2006; Kübler and Birbaumer, 2008). In summary, the results shown in this study demonstrate that ECoG supports spelling performance exceeding 20 characters/min. In consequence, with additional verification in more subjects, our results may further extend the communication options for people with severe motor disabilities.

Supplementary Material
Video 1 for this article can be found online at http://www.frontiersin.org/Neuroprosthetics/10.3389/fnins.2011.00005/abstract

This suggests that more extensive electrode coverage may yield higher performance.
While in this study we only recorded signals from electrodes over the left hemisphere, it is known that visual cortex has bilaterally symmetric retinotopic maps (Engel et al., 1994, 1997; Yoshor et al., 2007). Thus, some of the ERPs may only reflect right visual field stimulation (Daniel and Whitteridge, 1961) and therefore bilateral coverage might further increase performance. As a related point, the electrode placement in this study was based solely on the requirements of the clinical evaluation, without any consideration of this study. Pre-surgical mapping of visual cortex using functional magnetic resonance imaging (fMRI; Engel et al., 1994, 1997; Vansteensel et al., 2010) could be used to optimize electrode location.
In this study we used subdural electrodes (i.e., electrodes placed underneath the dura mater). This placement requires penetration of the skull and the outer meningeal covering, i.e., the dura. This is important for clinical application of this BCI method, because the penetration of the dura increases the risk of bacterial infection (Davson, 1976;Hamer et al., 2002;Fountas and Smith, 2007;Van Gompel et al., 2008;Wong et al., 2009). Epidural electrodes (i.e., electrodes placed on top of the dura mater) provide signals of approximately comparable fidelity (Torres Valderrama et al., 2010) and do not penetrate the dura. Thus, epidural placement may increase safety and thus clinical practicality of an ECoG-based matrix speller BCI.
Success of wide-spread clinical application of ECoG-based matrix speller BCI systems depends mainly on costs and risks (Higson, 2002;Raab and Parr, 2006). The results presented in this paper are of critical relevance to these issues, because they suggest that effective ECoG-based matrix speller BCI systems may be realized by using only one bipolar and possibly epidural electrode.
Our results provide encouraging evidence that ECoG can provide high spelling rates, and recent results (Chao et al., 2010; Schalk, 2010) suggest that ECoG has good long-term stability. Moreover, an ECoG-based system reduces the patient's dependence on a caregiver to set up EEG electrodes or other external conventional assistive devices.