Autonomous Parameter Adjustment for SSVEP-Based BCIs with a Novel BCI Wizard

Brain-Computer Interfaces (BCIs) transfer human brain activities into computer commands and enable a communication channel without requiring movement. Among other BCI approaches, steady-state visual evoked potential (SSVEP)-based BCIs have the potential to become accurate, assistive technologies for persons with severe disabilities. Those systems require customization of different kinds of parameters (e.g., stimulation frequencies). Calibration usually requires selecting predefined parameters by experienced/trained personnel, though in real-life scenarios an interface allowing people with no experience in programming to set up the BCI would be desirable. Another occurring problem regarding BCI performance is BCI illiteracy (also called BCI deficiency). Many articles reported that BCI control could not be achieved by a non-negligible number of users. In order to bypass those problems we developed a SSVEP-BCI wizard, a system that automatically determines user-dependent key-parameters to customize SSVEP-based BCI systems. This wizard was tested and evaluated with 61 healthy subjects. All subjects were asked to spell the phrase “RHINE WAAL UNIVERSITY” with a spelling application after key parameters were determined by the wizard. Results show that all subjects were able to control the spelling application. A mean (SD) accuracy of 97.14 (3.73)% was reached (all subjects reached an accuracy above 85% and 25 subjects even reached 100% accuracy).


INTRODUCTION
Brain-Computer Interfaces (BCIs) transfer electroencephalographic (EEG) brain signals collected by non-invasive electrodes and elicited by the user into computer commands, without using the brain's normal output pathways of peripheral nerves and muscles (Wolpaw et al., 2002). There are different control paradigms for such BCIs. Among the most common approaches are the steady-state visual evoked potential (SSVEP) paradigm (Müller-Putz et al., 2005; Bin et al., 2009; Gao et al., 2014), the event-related desynchronization/synchronization (ERD/ERS) paradigm (Blankertz et al., 2007), and the P300 event-related potential (ERP) paradigm (Townsend et al., 2010). Usability challenges have impeded BCI usage in everyday scenarios for a long time.
Recently this issue has been addressed by various research groups. Millán et al. (2010) predicted in their review that the time is ripe for developing practical BCI prototypes that will have a real impact in improving the quality of life of disabled people. Since then more effort has been made to conduct studies with the target population (Sellers et al., 2010; Holz et al., 2013; Kaufmann et al., 2013; Riccio et al., 2013; Daly et al., 2015; Kübler et al., 2015). For example, Sellers et al. (2010) tested a BCI used by a 51-year-old ALS patient at his home. The system was used successfully for over 2.5 years and restored the user's independence in social interactions. Recalibration was performed remotely (via the internet). Holz et al. (2013) also installed a BCI-controlled application at a locked-in ALS patient's home. Their study demonstrated expert-independent home use of a BCI, but it also reported varying performance and technical problems and stressed the importance of regular calibration.
BCIs are seeing considerable research interest, as there has been consistent growth in papers mentioning BCI from 2001 to 2012 (Thompson et al., 2014). PubMed search results with the search term "Brain-Computer Interface/Brain-Machine Interface" provide over 4200 results for articles in the last decade, and with the additional search term "SSVEP" there are still 185 results. The focus of this paper lies exclusively on SSVEP-based BCIs, which have become quite common in the meantime and represent a standard BCI paradigm. Visual stimuli (e.g., a set of boxes flickering at different constant frequencies) can be displayed simultaneously and independently on a computer monitor. By gazing at a particular box the user can select a desired command: while the user looks at a stimulus, brain signals are modulated with the corresponding frequency. Measured by EEG, the brain signals can then be classified in real time. Thus, various applications like spelling interfaces (Volosyak et al., 2011a) and control applications for a prosthesis (Müller-Putz and Pfurtscheller, 2008) or for navigation (Martinez et al., 2007) can be implemented with the SSVEP approach. As SSVEP-based BCIs depend on gaze shifting, these systems may not work for severely disabled people. However, SSVEP-based BCIs have also been tested with handicapped users (see Volosyak et al., 2009a; Müller et al., 2015). A SSVEP BCI system relies on a variety of different parameters which influence the BCI performance, such as stimulation frequencies and classification thresholds. For example, it has been observed that stimulation with lower stimulation frequencies yields larger amplitudes (Gao et al., 2003; Zhu et al., 2010; Inkaew et al., 2015) and that the SSVEP response is maximum at 15 Hz (Pastor et al., 2003). These parameters need to be adjusted precisely for each user.
Therefore, automated calibration methods are an essential step for BCIs to progress from laboratory demonstrators to real-life applications, as a precise parameter setup cannot be expected from users or their caregivers.
Calibration methods have already become the standard for ERP-based BCIs. Typically, in a supervised classifier, data is collected during a calibration phase in which the user is asked to perform specific tasks. The collected brain signal data is then analyzed and decoded to customize control parameters. Many research articles concentrate on shortening or omitting the calibration periods and on the development of so-called Zero Training BCIs (Krauledat et al., 2008; Grizou et al., 2013; Kindermans and Schrauwen, 2013; Kindermans et al., 2014). Kaufmann et al. (2012) also developed a user-centered ERP-BCI application that adjusts classifier weights and control parameters individually in the background and demonstrated the feasibility of auto-calibrating ERP-BCI use. A wizard that handles important parameters especially for the SSVEP paradigm was suggested by Volosyak et al. (2010b). Punsawad and Wongsawat (2012) proposed a SSVEP-based BCI system that requires less assistance from the caretaker, as it could be enabled or disabled by alpha band EEG, but to the best of our knowledge an automated calibration process for SSVEP-based BCIs has not yet been developed. The calibration software presented here was tested and evaluated with 61 subjects. We further explored BCI demographics based on the data of this large number of subjects. Previous BCI field studies have been made with other non-invasive BCI approaches like the P300 paradigm (e.g., Guger et al., 2009) and the motor imagery paradigm (e.g., Guger et al., 2003). But except for Guger et al. (2012), the field studies focusing on the SSVEP paradigm report subjects that were not able to gain satisfactory control over the BCI (e.g., Allison et al., 2010; Volosyak et al., 2011b).
Subjects are referred to as BCI-illiterates if the BCI software cannot detect their intentions accurately, or more precisely, if the classification accuracy cannot surpass a certain threshold (see e.g., Dornhege, 2007; Blankertz et al., 2010; Brunner et al., 2010; Fernandez-Vargas et al., 2013); e.g., the value of 70% is often used in the literature (Perelmouter and Birbaumer, 2000; Brunner et al., 2010). Accordingly, we define the BCI literacy rate as the percentage of users who are able to achieve control over the BCI and the BCI illiteracy rate analogously. It should be noted that those terms (e.g., both BCI illiteracy and the synonymously used BCI deficiency) have been criticized for being pejorative; those terms imply that it is the BCI end-user's "fault" that he or she cannot control the BCI. Yet, one could argue that it is not the fault of the user but of the BCI software that has not been able to perform proper classifications. Therefore, some authors prefer the use of the term "lack of BCI efficiency" instead (e.g., Vidaurre et al., 2011). However, the term BCI efficiency has already been defined in a different manner as the proportion of the minimum number of compulsory commands to the total number of detected commands (Volosyak et al., 2009b). Therefore, we decided to use the historically established term "BCI illiteracy." Though some improvements in the BCI software algorithms are clearly visible over the years, the "BCI illiteracy" phenomenon remains a recurring problem in SSVEP-BCI field studies. Allison et al. (2010) reported a BCI illiteracy rate of 24.52%, Volosyak et al. (2009b) mentioned 13.51%, and Volosyak et al. (2011b) an illiteracy rate of 2.33%.
The overall aim of this research was
• to investigate optimal stimuli selection for SSVEP-based BCIs through analysis of the wizard outputs,
• to show that the vast majority, if not all, BCI users are able to control a SSVEP-based BCI application, and
• to prove that generally higher classification accuracies can be achieved (through autonomous parameter adaptation by the wizard and changes in the signal classification algorithms).
The classification methods used are based on the algorithms developed in our previous studies (e.g., Volosyak et al., 2009b, 2011b; Allison et al., 2010). In these studies the topic of intersubject variability has been addressed as well. In order to compare the BCI performance, we further analyzed whether factors such as gender and sleep influence performance. For the analysis of such demographic factors all participants of the study went through the same questionnaires as in the mentioned publications.

Hardware
The subjects were seated in front of a LCD screen (BenQ XL2420T, resolution: 1920 × 1080 pixels, vertical refresh rate: 120 Hz) at a distance of about 60 cm. The computer system used operated on Microsoft Windows 7 Enterprise and was based on an Intel processor (Intel Core i7, 3.40 GHz). Standard Ag/AgCl electrodes were used to acquire the signals from the surface of the scalp. The ground electrode was placed over AFz, the reference electrode over Cz, and the eight signal electrodes were placed at predefined locations on the EEG cap marked with Pz, PO3, PO4, O1, O2, Oz, O9, and O10 according to the international system of EEG electrode placement (Oostenveld and Praamstra, 2001). Standard abrasive electrolytic electrode gel was applied between the electrodes and the scalp to help bring impedances below 5 kΩ. An EEG amplifier g.USBamp (Guger Technologies, Graz, Austria) was used. The sampling frequency was set to 128 Hz. During the EEG signal acquisition, an analog band pass filter between 2 and 30 Hz and a notch filter around 50 Hz were applied directly in the amplifier.

Wizard
The wizard ran the user through three phases in order to provide subject-specific stimulation frequencies (phase 1 and 2), classification thresholds, and time segment lengths (phase 3). The techniques used in each step were derived from several previous findings. The so called multi-target technique for the selection of individual subject-dependent stimulation frequencies, presented by Volosyak et al. (2010c), was based on the dual stimulation technique suggested by Mukesh et al. (2006) that used frequency combinations in order to increase the number of SSVEP targets. Figure 1 illustrates the entire calibration procedure for one subject.
The first task of the wizard was to select four optimal stimulus frequencies. One of the reasons for choosing a low-frequency band for SSVEP visual stimulation was the larger amplitudes it produces (Gao et al., 2003; Zhu et al., 2010; Inkaew et al., 2015). Because of this, the wizard took only frequencies between 6.32 and 20 Hz into consideration. Furthermore, the number of suitable frequencies on the LCD monitor is limited due to the vertical refresh rate of 120 Hz. The stimulating frequencies have to be subfrequencies of the monitor refresh rate (Volosyak et al., 2009a; Chen et al., 2014; Nakanishi et al., 2014). Therefore, the optimal frequencies were drawn from the set of fourteen possible target frequencies 6.32, 6.67, 7.06, 7.50, 8.00, 8.57, 9.23, 10.00, 10.91, 12.00, 13.33, 15.00, 17.14, and 20.00 Hz (obtained by dividing the refresh rate by integer dividers between 6 and 19, see Table 1).
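The set of candidate frequencies follows directly from the refresh rate, so it can be reproduced with a few lines of code. The sketch below is only an illustration: the function name `candidate_frequencies` and the rounding to two decimals are assumptions made here, not part of the wizard software.

```python
def candidate_frequencies(refresh_hz=120.0, f_min=6.32, f_max=20.0):
    """Return {divider: frequency} for all subfrequencies of the refresh
    rate that fall into the band [f_min, f_max] (rounded to 2 decimals)."""
    result = {}
    for divider in range(1, int(refresh_hz) + 1):
        freq = round(refresh_hz / divider, 2)
        if f_min <= freq <= f_max:
            result[divider] = freq
    return result


if __name__ == "__main__":
    table = candidate_frequencies()
    for divider, freq in sorted(table.items(), reverse=True):
        print(f"120 Hz / {divider:2d} = {freq:5.2f} Hz")
```

With the values stated in the text (120 Hz refresh rate, band from 6.32 to 20 Hz), the loop keeps exactly the dividers 6 to 19.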

Classification Method
For SSVEP signal classification we used a minimum energy combination method (MEC) introduced by Friman et al. (2007), as modified by Volosyak (2011). To detect a frequency in the spatially filtered signals the SSVEP power estimations for all $N_f$ frequencies were normalized into probabilities, where $P_i$ is the ith power estimation, $1 \le i \le N_f$. To increase robustness in the four target spelling application we considered three additional frequencies, selected as means between two target frequencies (see also Volosyak et al., 2010a).

FIGURE 1 | Illustration of the calibration procedure for one subject. After EEG-data were recorded (subject's eyes closed) 10.61 Hz had the strongest SSVEP response and was determined as alpha wave frequency. As this frequency did not belong to the set of possible target frequencies no frequency was filtered out. In phase 2 (multi-target stimulation) EEG-data were recorded while the subject faced two circles in sequence (each for 10 s). The first circle presented the target frequencies 6.32, 7.50, 8.00, 10.00, 10.91, 13.33, and 17.14 Hz. The second circle presented 6.67, 7.06, 8.57, 9.23, 12.00, 15.00, and 20.00 Hz. Results from the EEG recordings are displayed in the diagrams. In phase three, frequencies with the highest SSVEP-responses from both recordings were presented in series, and corresponding SSVEP key parameters were calculated.
The classifier output O was the index of the ith frequency if the following conditions held: (1) the ith frequency had the highest probability $p'_i$, (2) $p'_i$ exceeded a certain predefined threshold $\beta_i$, and (3) the detected frequency belonged to one of the stimulating frequencies. So for a BCI system with four targets the output was defined as

$$O = \begin{cases} i, & \text{if } p'_i = \max_{1 \le j \le N_f} p'_j,\ p'_i > \beta_i,\ \text{and } i \le 4,\\ 0, & \text{otherwise,} \end{cases} \tag{2}$$

where $1 \le i \le N_f$. If no frequency probability exceeded the corresponding threshold $\beta_i$ or if one of the additional frequencies had the highest probability, the output O was set to zero. The choice of the $\beta_i$ depended on the corresponding stimulation frequency (generally lower stimulation frequencies produce a higher SSVEP response, therefore thresholds can be higher) but also on user factors (the quality of the SSVEP signal differs between subjects). The values for the $\beta_i$ were determined in a calibration session with the wizard presented here. If an additional frequency (i > 4) had the highest probability, the classification was rejected, as the detected frequency did not belong to the set of stimulation frequencies.
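The decision rule for the classifier output can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes that the probabilities are obtained by dividing each power estimation by the sum of all power estimations (the exact normalization used in the study may differ), and the name `classifier_output` is hypothetical.

```python
def classifier_output(power_estimates, thresholds, n_targets=4):
    """Return the 1-based index of the detected target frequency, or 0.

    power_estimates: power estimations for all N_f frequencies; the first
        n_targets entries belong to the stimulation frequencies, the rest
        to the additional (non-target) frequencies.
    thresholds: one threshold beta_i per target frequency.
    Assumption: probabilities are the power estimates divided by their sum.
    """
    total = sum(power_estimates)
    probabilities = [p / total for p in power_estimates]
    best = max(range(len(probabilities)), key=probabilities.__getitem__)
    if best >= n_targets:
        return 0  # an additional frequency won: classification rejected
    if probabilities[best] <= thresholds[best]:
        return 0  # winning probability did not exceed its threshold
    return best + 1


if __name__ == "__main__":
    betas = [0.5, 0.5, 0.5, 0.5]
    print(classifier_output([8, 1, 1, 1, 1, 1, 1], betas))  # target 1 dominates
    print(classifier_output([1, 1, 1, 1, 8, 1, 1], betas))  # additional frequency dominates
```

Returning 0 for both rejection cases mirrors the text, where a zero output covers a sub-threshold probability as well as a winning additional frequency.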
EEG-data were processed by the computer in blocks of 13 samples (101.5625 ms with the used sampling rate of 128 Hz). Therefore, the time segment lengths displayed in Table 2 were used. The classification was performed with the sliding window of T s after receiving the new EEG data block.
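The relation between block counts and time segment lengths is simple arithmetic, shown here as a small worked example. The helper name `blocks_to_ms` is chosen for illustration only; the constants are the values given in the text.

```python
SAMPLING_RATE_HZ = 128   # sampling frequency of the amplifier
SAMPLES_PER_BLOCK = 13   # EEG data arrive in blocks of 13 samples


def blocks_to_ms(n_blocks):
    """Duration in milliseconds of n_blocks consecutive EEG data blocks."""
    return n_blocks * SAMPLES_PER_BLOCK / SAMPLING_RATE_HZ * 1000.0


if __name__ == "__main__":
    for n in (1, 8, 70, 160):
        print(f"{n:3d} blocks = {blocks_to_ms(n):9.4f} ms")
```

One block lasts 101.5625 ms, so the shortest segment length of eight blocks corresponds to 812.5 ms.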

Phase 1 (Alpha-Test)
The first phase of the wizard tested whether high alpha wave activity occurred, as the low frequency band overlaps with the alpha band (8-13 Hz; Zhu et al., 2010), which can cause false classifications (Zhu et al., 2010; Cao et al., 2014). To this end, the frequencies from the set of possible target frequencies (see Table 1) that belonged to the alpha band were checked for interference with each subject's alpha wave.
When the wizard program was started, the user was instructed by an audio instruction and a text message displayed on the screen to close his or her eyes. After 10 s a second audio command told the user to open his or her eyes again. During the closed eye period EEG data were recorded. The five stimulation frequencies $f_{st}$ = 8.57, 9.23, 10.00, 10.91, 12.00 Hz (possible target frequencies on the LCD monitor belonging to the alpha band) and ten neighboring frequencies $f_{st} \pm 0.3$ Hz were tested for SSVEP response. If after 10 s one of the possible target frequencies had the highest averaged probability and surpassed a certain threshold, it was excluded from further consideration. After this procedure, all of the remaining frequencies differed from the alpha wave by 0.15 Hz or more.
In all phases, signal-to-noise ratio (SNR) distributions for 10 different time segment lengths were calculated online. The criterion for frequency selection in phase 1 and phase 2 was based on the calculation of the integral value of SNR distribution of the different segment lengths over time.
Values for $p'_i$ were calculated for all used time segment lengths and for all frequencies, so after 812.5 ms (eight blocks), a value for $p'_{i,812.5\,\mathrm{ms}}$ was calculated. As we recorded 100 blocks of EEG-data, 93 values (100 − 8 + 1 blocks) were calculated for $p'_{i,T_1}$. After 10 s, those 93 values were summed and an average value $\bar{p}'_{i,T_1}$ was calculated. This was done for all time segment lengths, so after 10 s we had averaged probabilities $\bar{p}'_{i,T_s}$ for all ten used time segment lengths. Then the average over all $\bar{p}'_{i,T_s}$ was calculated:

$$\bar{p}'_i = \frac{1}{10} \sum_{s=1}^{10} \bar{p}'_{i,T_s}. \tag{3}$$

In phase 1 we considered 15 possible alpha frequencies ($N_f = 15$). The criteria for classifying the ith frequency as alpha frequency were (1) the ith frequency had the highest averaged probability $\bar{p}'_i$ and (2) $\bar{p}'_i > 0.1$.
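The window counting and the alpha criterion described above can be sketched as follows. The function names and the dictionary-based data layout are assumptions made for this illustration; only the numbers (100 blocks, eight-block window, threshold 0.1) come from the text.

```python
def n_windows(n_blocks, window_blocks):
    """Number of sliding-window positions when the window advances by one block."""
    return n_blocks - window_blocks + 1


def mean(values):
    """Arithmetic mean of a non-empty sequence."""
    values = list(values)
    return sum(values) / len(values)


def detect_alpha(avg_probs, threshold=0.1):
    """Return the frequency classified as alpha frequency, or None.

    avg_probs maps each tested frequency (Hz) to its probability averaged
    over all windows and all time segment lengths.
    """
    best = max(avg_probs, key=avg_probs.get)
    return best if avg_probs[best] > threshold else None


if __name__ == "__main__":
    print(n_windows(100, 8))                          # windows of eight blocks in 10 s
    print(detect_alpha({10.61: 0.20, 10.91: 0.05}))   # a neighboring frequency wins
```

If the detected alpha frequency is one of the neighboring frequencies rather than a possible target frequency, no target frequency needs to be excluded.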

Phase 2 (Multi-Target Stimulation)
In phase 2, multi-target stimulation was used to find optimal stimulation frequencies (frequencies with the strongest SSVEP response). The user faced a circle (radius 245 pixels) divided into 147 segments (seven rings, each containing 21 segments) representing seven stimulating frequencies at once. Each of the seven stimulating frequencies was presented by 21 segments which were scattered randomly around the circle. When subjects were instructed (by an audio message) to focus their gaze on the circle, the flickering started and EEG data were collected. After 10 s (100 blocks of EEG-data), the flickering paused for 2 s. Thereafter, the user faced a second circle, identical to the first one, but now it flickered with seven different frequencies. EEG data were recorded for another 10 s. Each circle contained higher and lower frequencies and, in order to avoid mutual influences between stimulating frequencies, each group of seven simultaneously flickering segments followed additional restriction rules (Equation 4; as, e.g., in Volosyak et al., 2010c). The stimulation frequencies for the first circle were 6.32, 7.50, 8.00, 10.00, 10.91, 13.33, and 17.14 Hz. For the second, they were 6.67, 7.06, 8.57, 9.23, 12.00, 15.00, and 20.00 Hz. If one of the 14 possible target frequencies interfered with the user's alpha wave, this frequency was excluded and one of the circles contained only six frequencies. The probabilities (Equation 3) of the seven stimulation frequencies ($N_f = 7$) for each circle were calculated after data had been recorded. After this, the 14 possible target frequencies were sorted from highest averaged probability to lowest and the top four frequencies were selected as optimal target frequencies.
However, the restriction rules Equation (4) were checked for each frequency from the set of those four selected frequencies in descending order and if they were violated, the corresponding frequency would be replaced with the frequency which had the highest averaged probability of the remaining frequencies. For example, if 6 Hz had the highest and 12 Hz the second highest averaged probability, the latter was replaced, because otherwise the restriction rules would be violated, as 12 Hz is a harmonic of 6 Hz.
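The selection with replacement can be sketched as a greedy pass over the ranked frequencies. Note the assumptions: the full restriction rules are not reproduced here, so the hypothetical predicate `is_harmonic` (first harmonic only, with a small tolerance for rounded frequencies) stands in for them, and the greedy loop is one plausible reading of the replacement procedure rather than the wizard's actual code.

```python
def is_harmonic(f_a, f_b, tol=0.02):
    """Stand-in restriction rule: True if one frequency is twice the other."""
    low, high = sorted((f_a, f_b))
    return abs(high - 2 * low) <= tol


def select_targets(avg_probs, n_targets=4, violates=is_harmonic):
    """Pick n_targets frequencies in descending order of averaged probability,
    skipping any candidate that violates the rule with an already chosen one."""
    ranked = sorted(avg_probs, key=avg_probs.get, reverse=True)
    selected = []
    for freq in ranked:
        if any(violates(freq, chosen) for chosen in selected):
            continue  # replaced by the next best remaining frequency
        selected.append(freq)
        if len(selected) == n_targets:
            break
    return selected


if __name__ == "__main__":
    probs = {7.5: 0.30, 15.0: 0.25, 6.67: 0.20, 8.0: 0.15, 10.0: 0.10}
    print(select_targets(probs))
```

In the usage example, 15.00 Hz ranks second but is skipped because it is a harmonic of the top-ranked 7.50 Hz, so the next best frequency takes its place.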

Phase 3 (Sequential Stimulation)
In order to find optimal thresholds for the SSVEP classification, a white circle (radius 150 pixels), flickering at the frequency which had the highest SSVEP response in phase 2, was displayed. It was necessary to simulate noise caused by peripheral vision when concentrating on the target object. Therefore, the white circle (the target object) was surrounded by a green ring (outer diameter 500 pixels, inner diameter 300 pixels), containing 144 segments. Each of the remaining three frequencies from the optimal frequency set was presented by 48 flickering segments which were scattered randomly around this ring. The user was instructed by an audio command to gaze at the white circle. The circle and the ring flickered for 10 s while EEG data were recorded. The flickering then paused for 2 s so that further recordings would not be influenced by the SSVEP responses that occurred during the first recording. Thereafter, the white circle flickered with the frequency that had the second highest SSVEP response in phase 2, while the ring flickered with the remaining three frequencies. This procedure was repeated until data for all four optimal frequencies were collected, so the total recording time for phase 3 was 40 s. Table 3 shows the blinking sequence of the four optimal frequencies during phase 3.
After this, the classifier outputs O (see Equation 2) were analyzed. Let $\beta = [\beta_1, \beta_2, \beta_3, \beta_4]$ be the vector of all four classification thresholds. Classification thresholds were chosen equal for all four frequencies ($\beta_i = \beta_j$ for i, j = 1, 2, 3, 4). Classifier outputs $O_{i,T_s,\beta}$ were determined for all $p'_{i,T_s}$ and for all thresholds β, with $\beta_j = 0.15, 0.16, \ldots, 0.99$ for j = 1, ..., 4. The output was then categorized into three classes:
• If the output $O_{i,T_s,\beta}$ was equal to the index of the stimulation frequency, the output was classified as "correct classification."
• If $O_{i,T_s,\beta}$ was equal to the index of one of the remaining three stimulation frequencies, the output was classified as "wrong classification."
• If $O_{i,T_s,\beta}$ was equal to zero (no frequency probability exceeded the thresholds $\beta_i$ or an additional frequency had the highest probability), the output was categorized as "zero classification."
Note that there were only four stimulation frequencies in each circle but three additional frequencies were considered ($N_f = 7$). Then the distributions of correct classifications, $p_{\mathrm{correct}}(i, \beta, T_s)$, false classifications, $p_{\mathrm{false}}(i, \beta, T_s)$, and zero classifications, $p_{\mathrm{zero}}(i, \beta, T_s)$, for each of the four stimulation frequencies (i = 1, 2, 3, 4) were calculated. The thresholds and minimal time segment lengths were selected in an iterative process: First all $O_{i,T_s,\beta}$ for the ith frequency were analyzed for the smallest time segment length $T_s$ = 812.5 ms. The largest value $\beta_j \in \{0.30, 0.31, \ldots, 0.70\}$, $\beta = [\beta_j, \beta_j, \beta_j, \beta_j]$, which satisfied the conditions

$$p_{\mathrm{correct}}(i, \beta, T_s) \ge 40 \quad \text{and} \quad p_{\mathrm{false}}(i, \beta, T_s) = 0 \tag{5}$$

was determined and, if such a value existed, the threshold $\beta_i$ corresponding to the ith frequency was set to this value. This was carried out for each frequency individually.
If values $\beta_j$ satisfying Equation (5) were found for all four stimulation frequencies, the minimal segment length $T_0$ was set to $T_s$ and the thresholds $\beta_i$ were chosen as classification thresholds. If such a $\beta_j$ did not exist for at least one of the four stimulation frequencies, $T_s$ was set to the next higher segment length and the conditions in Equation (5) were checked again. This procedure was repeated until values $\beta_j$ satisfying Equation (5) were found for all four frequencies. Table 4 shows an example of the threshold determination process.
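The iterative search over segment lengths and thresholds can be sketched as two nested loops. This is an illustration under stated assumptions: the dictionaries `p_correct` and `p_false`, keyed by (frequency index, threshold, segment length), are a hypothetical data layout, thresholds are given as integer percentages to avoid floating-point keys, and the limit of 40 is taken from the stated conditions.

```python
def select_parameters(p_correct, p_false, segment_lengths, thresholds,
                      n_targets=4, min_correct=40.0):
    """Return (minimal segment length, [threshold per frequency]) or None.

    For the shortest segment length, the largest threshold with enough
    correct and no false classifications is sought for every frequency.
    If any frequency has no such threshold, the next longer segment is tried.
    """
    for seg in sorted(segment_lengths):
        chosen = []
        for i in range(n_targets):
            valid = [b for b in thresholds
                     if p_correct[(i, b, seg)] >= min_correct
                     and p_false[(i, b, seg)] == 0]
            if not valid:
                break  # this segment length fails for frequency i
            chosen.append(max(valid))
        else:
            return seg, chosen  # every frequency has a valid threshold
    return None


if __name__ == "__main__":
    ths = list(range(30, 71))            # thresholds in percent (assumed encoding)
    segs = [812.5, 1015.625]
    pc = {(i, b, t): 90.0 - b for i in range(4) for b in ths for t in segs}
    pf = {(i, b, t): 0.0 for i in range(4) for b in ths for t in segs}
    for b in ths:
        pf[(3, b, 812.5)] = 5.0          # frequency 4 misclassifies at 812.5 ms
    print(select_parameters(pc, pf, segs, ths))
```

In the synthetic example the shortest segment length is rejected because one frequency always produces false classifications, so the search moves on to the next longer segment.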

Three-Step Spelling Application
The Three-step spelling application (Gembler et al., 2014) resembles an earlier developed graphical user interface (GUI) layout (Volosyak et al., 2011a; Kick and Volosyak, 2014). The initial screen is displayed in Figure 2A. Four commands were represented on the computer screen by flickering boxes of default sizes (125 × 125 pixels). The size of the boxes varied during the experiment as described by Volosyak (2011). After selecting a desired box, the position of the three boxes containing the alphabet changed from upper horizontal to left-hand vertical according to the first selection made (see Gembler et al., 2014). After the second selection the positions changed once more from left vertical to horizontal bottom position. Also, each box now contained a single letter. In the second and the third step, the far right box ("Del" in the first step) would contain the command "back," giving the user the option to switch to the previous view. An overview of the three steps necessary to choose a single letter is shown in Figure 2B.

Note to Table 4: In this example the threshold for the corresponding frequency was set to 51 (bold), as it was the largest threshold $\beta_j$ that satisfied Equation (5). For all $\beta_j$ > 51 (light gray) Equation (5) was not satisfied.
Every command classification was followed by an audio feedback in order to reduce the information load on the visual channel.
The SSVEP classification was performed on the basis of the adaptive time segment length of the acquired EEG data (Volosyak, 2011). If no classification could be made and the actual time t allowed the extension of $T_s$ to the next predefined value, this new value was used instead (see Figure 3). The fixed starting segment length was determined by the wizard software. We further included a segment length of 160 blocks (about 16 s, see also Gembler et al., 2015).

Subjects
All subjects (healthy adult volunteers) gave written informed consent in accordance with the Declaration of Helsinki. This research was carried out in accordance with best practice guidelines; ethical principles were taken into consideration while conducting all BCI experiments. Information needed for the analysis of the experiments was stored anonymously during the experiment; results cannot be traced back to the participant. Sixty-one subjects participated, with a mean (SD) age of 22.8 (3.89) years (range 17-49); 17 of the subjects were female. All subjects were students or employees of the Rhine-Waal University of Applied Sciences. The EEG recording took place in a standard laboratory room with low background noise and luminance. None of the subjects had neurological or visual disorders. Spectacles were worn when appropriate. Subjects did not receive any financial reward for participating in this study.

Procedure
After completing the consent form, each subject completed a brief pre-questionnaire, where questions regarding gender, the need for vision correction, tiredness, and BCI experience were answered. Thereafter, subjects were prepared for the EEG recording. At first subjects went through the steps of the wizard software and key parameters for BCI performance were determined. Once started by the experimenters, subjects were guided by audio and text feedback through the phases and used the wizard independently. The key parameters found by the wizard were transferred automatically to the Three-step spelling application, which was then started by the experimenters. Subjects participated in a familiarization run spelling the word "BRAIN" and a word of choice (e.g., their own first name). Next, each subject used the GUI to spell the phrase "RHINE WAAL UNIVERSITY." The spelling phase ended automatically when the phrase was spelled correctly. The experiment would have been stopped manually in case a subject had not been able to execute a desired classification within a certain time frame, had wished to end the experiment, or if unintentional repeated misclassifications had occurred. However, every subject was able to complete the spelling task. Spelling errors were corrected via the implemented delete button. Information needed for the analysis of the test was stored anonymously during the experiment. After the spelling phase the subjects completed a post-questionnaire, answering questions regarding tiredness and their subjective opinion about the BCI system.

FIGURE 2 | (A) Three-step GUI. (B) Choosing the letter E.

ITR calculation
The information transfer rate (ITR) represented the amount of information communicated per unit time and was calculated based on the following formula (Wolpaw et al., 2002):

$$B = \log_2 N + P \log_2 P + (1 - P) \log_2\left[\frac{1 - P}{N - 1}\right]$$

In the formula above, B represents the number of bits per trial. The accuracy P was calculated as the ratio between the number of correct selections and the total number of classified commands. The number of possible choices N was the number of flickering boxes available (N = 4). To obtain the ITR in bits per minute, B is multiplied by the number of command classifications per minute. ITR and accuracy were calculated online and displayed at the lower right corner of the GUI of the Three-step spelling application during the spelling task (see Figure 2A).
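The ITR calculation is easy to reproduce numerically. The following sketch uses the standard formula of Wolpaw et al. (2002); the function names and the explicit handling of the limiting cases P = 0 and P = 1 (where a term of the form 0 · log 0 is taken as zero) are choices made for this illustration.

```python
import math


def bits_per_trial(accuracy, n_choices=4):
    """Bits per selection for accuracy P and N possible choices."""
    p, n = accuracy, n_choices
    bits = math.log2(n)
    if p > 0.0:
        bits += p * math.log2(p)
    if p < 1.0:
        bits += (1.0 - p) * math.log2((1.0 - p) / (n - 1))
    return bits


def itr_bpm(accuracy, n_choices, selections_per_minute):
    """Information transfer rate in bits per minute."""
    return bits_per_trial(accuracy, n_choices) * selections_per_minute


if __name__ == "__main__":
    print(bits_per_trial(1.0, 4))    # error-free selection among four targets
    print(itr_bpm(1.0, 4, 10.0))     # ten error-free selections per minute
```

With four targets, an error-free selection carries 2 bits, while selecting at chance level (P = 0.25) carries no information.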

Spelling Performance
All 61 subjects were able to complete the spelling task; no subject reported any pain or discomfort during the experiment. Table 5 shows the overall spelling performance for the spelling task from 61 subjects. The analysis of the spelling performance reveals an overall mean (SD) ITR of 21.92 (7.63) bpm and a mean (SD) accuracy of 97.14 (3.73)%. All subjects reached 85-100% accuracy; 25 of the 61 subjects even completed the spelling task without errors, achieving an accuracy of 100%.

Wizard
The SNR distributions for the stimulation in two groups of seven frequencies (multi-target stimulation) were analyzed with the Minimum Energy Combination algorithm (Section 2.2.1) to find the best four stimulation frequencies. Figure 4A shows for how many subjects each frequency was selected (wizard phase 2).
The optimal time segment for each user was selected in a similar way. In phase three, the selected stimuli were presented in sequence and again the SNR distributions were analyzed with the Minimum Energy Combination algorithm and were used to calculate the accuracy for each segment length and classification threshold simultaneously. Based on these values, a minimal time segment length was chosen. Figure 4B shows how often each time segment length was chosen. The length of the time-window used for the Minimum Energy Combination method was 10 s for each calibration step of the wizard. Thus, the wizard returns adequate outcomes from data sets recorded during 70 s (10 s for phase one, 20 s for phase two, and 40 s for phase three).

Questionnaire Results
In the pre-questionnaire subjects answered questions regarding gender, the need for vision correction, tiredness, and BCI experience, as displayed in Table 6. Seventeen subjects (27.9%) were female [22.71 (4.29) years] and 72% were male subjects [22.83 (5.22) years]. Female subjects performed with an information transfer rate of 25.35 (6.54) bpm while males performed with a rate of 20.12 (7.34) bpm. A t-test revealed a significant difference between the mean ITR of female and male subjects, t (32) = 2.64, p < 0.05. The results of the post-questionnaire are given in Table 7.

DISCUSSION
The presented wizard for SSVEP-based BCI sets up key parameters in adequate time and allows inexperienced personnel to set up the BCI, as only one click is necessary for calibration. Overall, including pauses between steps, the calibration process took <2 min. However, a calibration that runs in the background, invisible to the user, as already developed for the P300 paradigm (see e.g., Kaufmann et al., 2012), is desirable for the SSVEP paradigm as well. Regarding user friendliness, almost all subjects stated that they would recommend the system, though nearly a third of the subjects stated that they were annoyed by the flickering. This is in line with observations from other studies (see e.g., Müller et al., 2015). In their review Zhu et al. (2010) also summed up the disadvantages of using lower frequencies. Subjects tend to be more annoyed by lower frequencies, visual fatigue occurs more easily, stimuli can provoke epileptic seizures, and the low frequency band covers the alpha band.

Note to Tables 6, 7: The numbers are represented as number of respondents or in form: mean value (SD), range. The level of tiredness was rated on a scale from 1 to 5: (1), not tired; (2), little tired; (3), moderately tired; (4), tired; and (5), very tired.
In the presented study, 14 subjects reported slightly increased tiredness (see Tables 6, 7). However, five subjects reported a decrease in tiredness. Thus, for the majority of subjects the calibration and spelling performance with the SSVEP-based BCI did not produce significant fatigue. As also reported by Volosyak et al. (2011b), neither the hours of sleep nor the subjective level of tiredness influenced BCI performance. However, in the presented study the effect of gender was significant. A trend that female subjects might perform better with lower frequencies was also observed in previous works (Volosyak et al., 2011b). This observation needs further investigation.
As has been discussed intensively throughout the BCI literature, a major challenge in SSVEP-based BCIs lies in finding a compromise between accuracy and speed. The choice of the time-window dedicated to the SSVEP response detection during EEG analysis has a high impact on the accuracy. While a short time-window results in classification errors, a longer time-window slows performance down (Volosyak et al., 2010a; da Cruz et al., 2015). As the commands corresponding to the stimulation frequencies are produced only if their probabilities are higher than certain predefined thresholds, classification thresholds are another important factor in balancing speed and accuracy. The wizard determined these variables specifically for each user. Figure 4B shows a wide variety of ideal minimal time segment lengths among the 61 subjects. The highest minimal segment length determined was 7109.4 ms, but in most cases the shortest possible minimal time segment length of 812.5 ms was selected. Although a longer minimal time segment length results in a lower ITR, it yields higher accuracies, and for some users a long time-window at the beginning was necessary to guarantee control over the BCI system. The system might have been unable to interpret the intentions of those users if fixed segment lengths had been used. For the majority of subjects, lower frequencies were selected (see Figure 4A). For nine subjects the lowest possible frequency set (6.32, 6.67, 7.06, and 7.50 Hz) was determined, and although the frequencies are separated by <0.5 Hz, this set worked well. This is consistent with the observation by Gao et al. (2003) that two flickering targets with a frequency difference as low as 0.2 Hz can be successfully distinguished in the SSVEP response. This value has recently been updated to 0.1 Hz (Hwang et al., 2012). Lower stimulation frequencies can even be distinguished with a difference of 0.05 Hz.
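The interplay of per-frequency thresholds and segment length described in this paragraph can be illustrated with a minimal sketch. It is not the wizard's implementation: the function names, the `probabilities_for` callback, and the intermediate segment length of 2031.25 ms are hypothetical, and the idea of retrying with progressively longer segments is an assumption made for illustration. Only the rule that a command is produced when its probability exceeds its own threshold, and the 812.5 ms and 7109.4 ms values, come from the text.

```python
from typing import Callable, Optional, Sequence, Tuple


def select_command(
    probabilities: Sequence[float], thresholds: Sequence[float]
) -> Optional[int]:
    """Return the index of the most probable stimulation frequency,
    or None if its probability does not exceed its own threshold."""
    best = max(range(len(probabilities)), key=probabilities.__getitem__)
    return best if probabilities[best] > thresholds[best] else None


def classify_with_growing_window(
    probabilities_for: Callable[[float], Sequence[float]],
    thresholds: Sequence[float],
    segment_lengths_ms: Sequence[float],
) -> Optional[Tuple[int, float]]:
    """Try increasing segment lengths (assumed strategy) until one
    frequency exceeds its threshold; return (command, length) or None."""
    for length_ms in segment_lengths_ms:
        command = select_command(probabilities_for(length_ms), thresholds)
        if command is not None:
            return command, length_ms
    return None


# Hypothetical classifier output: undecided for short segments,
# clear preference for the second frequency once more data is available.
def fake_probabilities(length_ms: float) -> Sequence[float]:
    return [0.25, 0.25, 0.25, 0.25] if length_ms < 2000 else [0.1, 0.7, 0.1, 0.1]


result = classify_with_growing_window(
    fake_probabilities, [0.5, 0.5, 0.5, 0.5], [812.5, 2031.25, 7109.4]
)
print(result)
```

A lower threshold or a shorter starting segment makes selections faster but raises the risk of misclassification, which is the trade-off the user-specific calibration is meant to balance.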
The most frequently selected frequency was 7.5 Hz, which might be explained by the fact that its second harmonic is 15 Hz, the stimulation frequency at which the SSVEP response is maximal according to Pastor et al. (2003). Stimulation frequencies overlapping with the alpha band (8-13 Hz) might cause false classifications (Zhu et al., 2010; Cao et al., 2014); for instance, simply closing the eyes might lead to a false classification. Therefore, the frequencies were checked for interference with each subject's alpha wave in the first phase of the wizard software.
In our previous SSVEP-based field-studies the BCI literacy rate could steadily be improved and now, after more than 5 years of research, 100% has been achieved. Allison et al. (2010) reported that 80 out of 106 subjects, and Volosyak et al. (2009b) that 32 out of 37 subjects, were able to perform a spelling task. Due to further modifications, the BCI illiteracy rate could be reduced to 2.33% (Volosyak et al., 2011b). Only Guger et al. (2012) showed that their SSVEP-BCI could provide effective communication for all 53 subjects. One simple cause for a high literacy rate in SSVEP-based BCIs is a low number of stimulation targets. Guger et al. (2012) as well as Volosyak et al. (2011b) also used four simultaneously displayed stimulation frequencies (see Table 8). It should be noted that BCI literacy among all participants was achieved in studies using other BCI approaches as well. Kaufmann et al. (2012) reported that all 19 subjects were able to complete a spelling task with a P300 speller with an average accuracy of 91.2% and an ITR of 15.1 bpm, and in a study with 99 subjects Guger et al. (2003) reported a BCI literacy rate of 100% as well. Guger et al. (2009) also achieved full BCI literacy with 81 subjects using the motor imagery paradigm.
We would further like to point out that BCIs can generally achieve higher ITRs than those in the presented study. For example, Spüler et al. (2012) reported an average ITR of 144 bpm and an accuracy of 96% with a BCI that used code-modulated visual evoked potentials and the detection of error-related potentials. Although the ITR of 21.92 (7.63) bpm in the presented study is considerably lower than these values, the results are promising because of the very high mean (SD) accuracy of 97.14 (3.73)%, which slightly surpasses the values from previous field-studies using four stimuli classes. In comparison, Volosyak et al. (2011b) reported that subjects reached an accuracy of 92.26 (7.82)%, but two subjects were unable to achieve control over the system. In that study, LEDs were used as the stimulation source and subjects navigated a miniature robot through a labyrinth. As in the presented study, four distinct commands were used. In a previous smaller study, the same graphical user interface as in the presented study was tested with six healthy subjects and a mean accuracy of 87.41 (6.74)% was reached (Gembler et al., 2014). Frequencies and time segment lengths were not selected user dependently. In comparison, the mean (SD) accuracy of 97.14 (3.73)% achieved in the presented study is significantly higher [t (5) = 3.44, p < 0.05], which supports our hypothesis that the number of misclassifications can be reduced through automated user-specific parameter selection and larger classification time-windows. Guger et al. (2012) also used a relatively large classification time-window and achieved a BCI literacy rate of 100% (see Table 8).
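To make the relation between accuracy, number of targets, and ITR concrete, the following minimal sketch uses the widely cited ITR definition associated with Wolpaw et al. (2002). This section does not state which ITR definition was applied, so the formula choice is an assumption; the function names and the example values (four targets, 5 s per selection) are purely illustrative and are not results of the study.

```python
import math


def bits_per_selection(accuracy: float, n_targets: int) -> float:
    """Bits per selection: log2(N) + P*log2(P) + (1-P)*log2((1-P)/(N-1))."""
    if n_targets < 2:
        raise ValueError("n_targets must be at least 2")
    if not 0.0 <= accuracy <= 1.0:
        raise ValueError("accuracy must lie in [0, 1]")
    bits = math.log2(n_targets)
    # The P*log2(P) terms tend to 0 as P -> 0 or P -> 1, so skip them there.
    if accuracy > 0.0:
        bits += accuracy * math.log2(accuracy)
    if accuracy < 1.0:
        bits += (1.0 - accuracy) * math.log2((1.0 - accuracy) / (n_targets - 1))
    return bits


def itr_bpm(accuracy: float, n_targets: int, seconds_per_selection: float) -> float:
    """Information transfer rate in bits per minute."""
    return bits_per_selection(accuracy, n_targets) * 60.0 / seconds_per_selection


# Illustrative values only: 4 targets, 97.14% accuracy, 5 s per selection.
print(f"{itr_bpm(0.9714, 4, 5.0):.2f} bpm")
```

Under this definition, a perfect four-target classifier conveys 2 bits per selection, so for a small number of targets the achievable ITR is bounded mainly by the time needed per selection. This is why longer classification time-windows raise accuracy at the cost of ITR.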
According to Perelmouter and Birbaumer (2000) a typical patient's estimation "almost absolutely reliable" is equivalent to a classification accuracy value between 90 and 95%. Almost all subjects (93.44%) in this study surpassed 90% classification accuracy and 77.05% of the subjects achieved accuracies above 95%.

Limitations
On rare occasions (four subjects) the determined classification threshold for one of the frequencies was too low and a test subject performed poorly during the familiarization run. This could be explained by diminishing concentration of the subject during the calibration process, or by the fact that the wizard did not take the spatial arrangement of the boxes into account, since it always concerned the corresponding box "del," which had a rather prominent position. In these cases, the calibration process was repeated. Therefore, further software improvements are necessary.
It should also be mentioned that for long-term use recalibration might be inevitable and that the SSVEP-based BCI presented here depends on the user's vision and control over eye movements. The wizard usually selected lower frequency sets, as the determination is based on the SSVEP response. However, visual stimulation with low frequencies is known to cause fatigue.
Furthermore, subjects in this study may not be representative of the general population; they tended to be young healthy men. Therefore, additional tests with older and physically impaired people are needed.

CONCLUSIONS
A wizard for an SSVEP-based BCI that automatically determines individual BCI parameters for each user and can be handled by inexperienced personnel has been tested with 61 healthy subjects. The presented study confirms that through careful user-specific choice of SSVEP BCI constants such as stimulation frequencies, classification thresholds, and segment lengths, high accuracies can be achieved by a broad population.
All subjects achieved reliable control over the BCI system, reaching accuracies above 85%. A comparison with previous field-studies shows that the introduced modifications are an essential step to broaden the literacy rates of BCI systems to all potential users. The main causes for the achieved literacy rate and high accuracies are:
• the number of simultaneously displayed targets is limited to four,
• subject-specific frequency and threshold selection through the presented wizard, and
• extended classification time-windows (>8 s) for poor performers.
Further research might also consider other BCI paradigms. Volosyak et al. (2010b) introduced the BCI wizard as a system that automatically identifies key parameters to customize the best BCI paradigm for each user. Tailoring a BCI, including the input signal, has also been suggested by Guger et al. (2012) and Kübler et al. (2014). Furthermore, the construction of the wizard software and its GUI allows an extension, so that key parameters could also be determined for BCIs with a higher number of stimuli. Because of the mentioned disadvantages of lower stimulation frequencies, the wizard could be modified so that higher frequencies are also considered. The methods of the wizard might be integrated directly into applications, such as the speller presented, so that the spatial proximity of targets is also taken into account when selecting key parameters.