Exploring the Brain Activity Related to Missing Penalty Kicks: An fNIRS Study

At vital moments in professional soccer matches, penalties are often missed. Psychological factors, such as anxiety and pressure, are among the critical causes of these mistakes, a phenomenon commonly known as choking under pressure. Nevertheless, these factors have not been fully explored. In this study, we used functional near-infrared spectroscopy (fNIRS) to investigate the influence of the brain on this process. An in-situ study was set up (N = 22), in which each participant took 15 penalties under three different pressure conditions: without a goalkeeper, with an amiable goalkeeper, and with a competitive goalkeeper. Both experienced and inexperienced soccer players were recruited, and brain activation was compared across groups. In addition, fNIRS activation was compared between sessions in which participants reported feeling anxious and sessions without reported anxiety, and between penalty-scoring and penalty-missing sessions. The results show that the task-relevant brain region, the motor cortex, was more activated when players were not experiencing performance anxiety. The activation of task-irrelevant areas, especially the prefrontal cortex (PFC), was shown to be related to players experiencing anxiety and missing penalties. More particularly, an overall higher activation of the PFC and an increase in PFC lateral asymmetry were related to anxious players and missed penalties, which may be caused by players' worries about the consequences of scoring or missing the penalty kicks. When experienced players were feeling anxious, their left temporal cortex activation increased, which could be an indication that experienced players overthink the situation and neglect their automated skills. In addition, left temporal cortex activation was higher when inexperienced players succeeded in scoring a penalty.
Overall, the results of this study are in line with the neural efficiency theory and demonstrate the feasibility and ecological validity to detect neurological clues relevant to anxiety and performance from fNIRS recordings in the field.


INTRODUCTION
Penalty kicks are highly important in soccer. Penalties are common and have a big influence on the outcome of a match. By taking the large amounts of money and number of fans into account, the importance of penalty kicks increases even more. In other words, missing a penalty in a crucial match can cause thousands of fans to be disappointed and the corresponding club to miss out on millions of euros. Many technical skills have an influence on the quality of the penalty kick (see Memmert et al., 2013 for review). Apart from technical skill, psychological factors seem to have a clear influence on the outcome of a penalty kick as well. It was found that only psychological factors had a large negative influence on the outcome of the penalty, where skill and fatigue did not (Jordet et al., 2007).
Many studies on the causes of missed penalties have convincingly shown that the kicker's anxiety and the mental pressure the kicker is under are the most common psychological factors. While resistance to mental pressure depends on player personality (Lin et al., 2017), the pressure often leads to distress, a negative factor that adversely influences the quality of the penalty kick and thereby hinders scoring, rather than to eustress, which is positive, gives a feeling of arousal, and thereby enhances the chance of scoring (Le Fevre et al., 2003). The degraded performance under pressure and anxiety is often referred to as choking, which is prevalent in the critical moments of big soccer matches (Chiappori et al., 2002; Arrondel et al., 2019). Anxiety, as a result of choking under pressure, was found to be related to poor direction of penalty kicks (Wilson et al., 2009), and this adverse effect is also present in other sports domains, such as weightlifting (Genakos and Pagliero, 2011), golf (Hickman and Metz, 2015), chess (González-Díaz and Palacios-Huerta, 2016), basketball (Fryer et al., 2018), and tennis (Cohen-Zada et al., 2017).
Choking under pressure is generally explained by the self-focus theory or the distraction theory. The self-focus theory posits that anxiety or pressure increases the level of self-consciousness, resulting in more conscious monitoring or control of skill execution, and choking as a result (Baumeister, 1984; Hill et al., 2010; Roberts et al., 2017). This means that excessive pressure undermines automatism, so that excessive attention is directed toward the execution of the skill. On the other hand, the distraction theory posits that anxiety or pressure occupies the working memory, causing a shift from task-relevant cues to task-irrelevant cues (Sarason, 1988; Hill et al., 2010; Gröpel and Mesagno, 2017; Roberts et al., 2017). Unlike in the self-focus theory, little attention is paid toward the execution of the skill, and the distractions can be either internal (e.g., worries) or external (e.g., distracting fans), which can be explained by the circles of attention (Eberspächer et al., 1990). The two theories are relevant to the neural efficiency theory, which posits that expert athletes show more efficient brain activity than non-athletes, meaning that task-relevant activities are increased and task-irrelevant activities are decreased. A task-irrelevant activity, such as planning or worrying about the consequences of missing penalties, can be a distracting factor that suppresses task-relevant activities, such as motor control. These two theories of choking can be connected via the fear circuit model (Hatfield and Kerick, 2007), which involves the prefrontal cortex (PFC), basal ganglia, thalamus, premotor cortex, motor cortex, limbic system, anterior cingulate cortex, left temporal cortex, and the corticospinal tract.
Many human neuroimaging studies have provided neural evidence of choking in the brain above and beyond behavioral analysis. First, increased PFC activation will act as a distraction in the brain leading to choking. The study of Korb (2010) suggested that an increase in PFC activation is associated with being distracted from a physical task and with being stressed, although the opposite trend was observed in the study of Alshargie et al. (2016) in a different task. Second, a decrease in PFC lateral asymmetry will lead to choking (Hatfield and Kerick, 2007), where improved performance was found to be associated with higher left compared to right PFC activation (Silveira et al., 2019). Third, the neural efficiency theory suggests that optimal performance can be achieved by activating task-relevant brain areas. This means an increase in motor cortex activity is associated with being less likely to choke when performing a sport-related exercise, and supportive evidence was reported in a study of Wolf et al. (2014) in expert table tennis players. Fourth, intensive connectivity between dorsolateral PFC (DLPFC) and the motor cortex was found necessary for maintaining the level of performance in motor-related tasks, thus preventing choking (Yoon et al., 2006; Clapp et al., 2009; Lee and Grafton, 2015). Last, a heightened left temporal cortex activation was found to be associated with choking in experts (Wolf et al., 2015) due to self-instruction, suppressed automated skill, and disturbing thoughts, similar to the phenomenon found in novice players. This is supported by the reported lower activation in this brain region of experts in shooting sports (Haufler et al., 2000, 2002; Allen et al., 2004; Kerick et al., 2004) due to lower cognitive demands.
Despite the neurological evidence of choking under pressure, the study of this phenomenon in the field is limited. In particular, there can be a huge difference between choking in a controlled laboratory setting, in which simple sensorimotor tasks [such as feet tapping, treadmill walking, and smartphone-based touch game (Udina et al., 2020)] are performed, and choking in a real-life naturalistic setting, where a wide range of external factors are involved. A possible reason for the lack of choking studies in the field is the susceptibility of neuroimaging techniques to movement artifacts. Recently, functional near-infrared spectroscopy (fNIRS) has often been used in in situ studies of brain activity due to its robustness to movement artifacts, as demonstrated in the study of Carius et al. (2020). In that study, brain activity was reliably measured during bouldering, a special form of climbing without a rope in which complex whole-body movements are involved, and it was demonstrated that fNIRS is capable of measuring sensorimotor activity during the execution of heavy and irregular movements. As bouldering involves more strenuous movements than kicking a ball, it should be possible to measure brain activity in the soccer domain as well, which has not been fully explored yet. To the best of our knowledge, there has been only one fNIRS study on penalty kicks in soccer, which compared the brain activity of experienced and inexperienced goalkeepers who were instructed to watch pre-recorded videos of penalty kicks from the perspective of goalkeepers (Kuriyama et al., 2015). However, no actual physical movement was involved and choking was not the focus. On the other hand, the detrimental effect of pressure on performance has been demonstrated in a previous work (Ito et al., 2011), but its working-memory task did not include physical activity. To date, the feasibility of capturing the choking effect in the field using fNIRS is still unclear.
In this study, we aimed to explore brain activity during the real situation of penalty kicking using fNIRS. Crucially, we sought to examine the left temporal cortex, motor cortex, PFC, and functional connectivity between DLPFC and motor cortex during choking to characterize brain activity that involves anxiety and impacts performance. The findings of this study may have implications for a wider range of tasks beyond the soccer/sports domain, such as surgery, where motor performance under high mental pressure is involved. This study provides insights into why people fail to perform under pressure and possibly paves a way toward tailored intervention to prevent choking by utilizing a closed-loop brain-computer interface.
The present study also aimed to investigate the correlation between the level of expertise and the capability to deal with pressure. It has been shown that the brain activity of sports professionals differs by level of expertise (Kuriyama et al., 2015; Wolf et al., 2015), where experts show more efficient brain activity or activate the correct areas of the brain for a certain activity when performing a skill. Under mental pressure, we predicted that experience in sports might also influence the way of coping with anxiety, leading to different patterns in brain activity between experts and novices during choking.
In general, previous works leave open the critical question of to what extent brain activity associated with choking under pressure in a penalty kick situation can be reflected by in-the-field fNIRS measurement. While theories of choking under pressure are under development, we focused more on anxiety and pressure (Yu, 2015), which are the associated psychological factors that can be explicitly measured and strongly induced using established methods in sports psychology. Specifically, we formulated our research questions as follows:
RQ1 (Performance): What is the difference in brain activity between performing success (scoring) and failure (missing) when taking a penalty kick?
RQ2 (Performing under pressure): What brain activity is associated with performing under pressure during a penalty kick situation?
RQ3 (Experienced and Inexperienced players): What is the general difference in brain activity between experienced and inexperienced soccer players when taking a penalty kick?
RQ4 (Anxiety and Experienced players): What brain activity is associated with experienced soccer players that experience (performance) anxiety when taking a penalty kick?
RQ5 (Anxiety and Inexperienced players): What brain activity is associated with inexperienced soccer players that experience (performance) anxiety when taking a penalty kick?

Participants
In total, 22 participants (10 females; age average: 22.9 years, standard deviation: 2.00 years) were recruited to participate in the experiment. Among these, ten participants were experienced soccer players who were in the first team of vv Drienerlo, the soccer association of the University of Twente, and who trained and played matches regularly. The other 12 participants were inexperienced: they had never played soccer or had only limited experience. The short form of the Edinburgh Handedness Inventory (Veale, 2013) was used to confirm that all participants were right-footed and right-handed, with an average Laterality Quotient of 77.27. The Sport Competition Anxiety Test (SCAT), consisting of 15 items with a total score in the range of 10-30, was used to indicate the level of performance anxiety of the participants (Martens, 1977; Martens et al., 1990; Wood, 2017a) before the experiment; the results indicate that eight, 11, and three participants were classified into the groups with low (score <17), medium (score 17-24), and high (score >24) performance anxiety, respectively, with a grand average score of 18.3. All participants provided written informed consent to participate in the study.
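The SCAT grouping described above can be sketched as a small helper. This is only an illustration of the stated cut-offs (low <17, medium 17-24, high >24); the function name `scat_category` is hypothetical and does not come from the study's analysis code.

```python
def scat_category(score: int) -> str:
    """Map a SCAT total score (10-30) to the anxiety group named in the text.

    Hypothetical helper: only the cut-offs (17 and 24) are taken from the paper.
    """
    if not 10 <= score <= 30:
        raise ValueError("SCAT total scores range from 10 to 30")
    if score < 17:
        return "low"
    if score <= 24:
        return "medium"
    return "high"
```

For example, a participant with the reported grand average score of 18.3 (rounded to 18) would fall into the medium group.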

Tasks and Procedure
The experiment has been approved by the ethics committee of the EEMCS faculty of the University of Twente (reference number: RP 2020-118). During the experiment, the participants were instructed to perform a penalty kicking task for three rounds, each of which consisted of five penalties (trials). While pressure induction level differs by round, the same rules applied for each penalty (trial); before the penalty can be taken, the player had to wait for the referee to blow the whistle. The goalkeeper had to stay on the goal line until the ball was struck. However, the goalkeeper was allowed to move horizontally on the goal line. The player was not allowed to pause (fully stand still) during the run-up but was allowed to slow down in order to trick the goalkeeper.
The experiment consisted of three rounds and the aim was to increase the pressure per round. In the first round, the lowest amount of pressure should be induced and in the last round, the highest pressure should be induced. The specifics of each round were set up in cooperation with a sports psychologist of the NOC*NSF (the Dutch overarching sports organization), who interacts directly with sports professionals. Based on the advice and expertise of the sports psychologist, the three rounds were set up as follows:
1. No goalkeeper: during the first round, no goalkeeper was present. The player was shooting at an empty goal and was informed that it was a practice round to familiarize the player with the experimental protocol. Therefore, low pressure was expected from this round.
2. Amiable goalkeeper: during the second round, a goalkeeper was present but was not allowed to distract the player, who was informed that it was a friendly competition between the player and the goalkeeper and that this round aimed to see how well the player could perform against a goalkeeper. Neither the goalkeeper nor the researcher was allowed to respond to the performance of the player. By introducing a non-interacting goalkeeper, the aim was to introduce the competitive element, without raising the pressure too much.
3. Competitive goalkeeper: during the last round, we aimed to maximize the mental pressure of the player. A goalkeeper was present and allowed to distract the player, who was informed that it was a competition in which only the best performing experienced and inexperienced players could win two 50-euro gift cards, assessed by the number of goals scored and the quality of penalty taking. To imitate a real-life professional penalty shoot-out, the player was instructed to start from the halfway line (about 40-50 m from the goal), walk with the ball toward the penalty spot, and place the ball on the right spot. This prolonged the time of being anxious, during which the researcher also contributed to pressure induction by providing unrealistically good statistics of previous participants or of the confronting goalkeeper. In addition, the goalkeeper tried to intimidate the player by awaiting the player at the penalty spot as the player approached from the halfway line, wasting time (by drinking water or retying their shoelaces), talking to the player when he/she tried to concentrate, repeatedly calling the player by the first name, stretching arms, jumping, and telling the player that he/she already knew the direction of the upcoming shot. The aim of these actions of the goalkeeper and researcher was to shift the attention of the players away from their task (Eberspächer et al., 1990).
When the participants had finished the SCAT questionnaire, the fNIRS headset was attached to the participants. Whilst the researcher was verifying the quality of each channel, the structure of the experiment was explained to the participants. An overview of the experimental protocol is depicted in Figure 2. For every round, a resting period of 30 s was recorded first. During this period, the participants were instructed to refrain from moving and speaking but to keep their eyes open and to look in one certain direction, preferably where they could see as little distracting external stimuli as possible. An explanation of the round followed. It was chosen to explain the details of each round after the resting period to ensure that the participants were not thinking/worrying about the upcoming round during the resting period. After the round was explained, the participants were briefly interviewed by asking how confident they were and how many goals they thought they would score. The interview is expected to help assess how anxious the players were. After placing the ball on the penalty spot and preparing for the run-up, the participants were instructed to wait for 5 s, until the researcher indicated that they could kick the ball. The researcher was tracking the time by using the built-in stopwatch of the OxySoft software (used for the fNIRS measurement). Using this software, markers were placed during the experiment to indicate the start and end of each 5-s waiting period. These 5-s periods were used for the data analysis, as the player was standing still, minimizing the chances of motion artifacts. The participants were also instructed to minimize body movement during this period. This 5-s waiting period was included before every kick. When all five penalties were taken, the participants were asked to fill out a small questionnaire. 
This questionnaire included two questions on a five-point Likert scale, regarding the satisfaction with the performance and the level of motivation during that round. Furthermore, the Sport Anxiety Scale (SAS) (Smith et al., 1990, 2006; Wood, 2017b), consisting of 21 questions on a four-point Likert scale, was included to determine the level of anxiety/pressure during the round. The results of this questionnaire were used to determine whether a player was anxious or not. The next round started when the participants finished filling in the questionnaire of the previous round. The fNIRS headset remained attached until all three rounds were completed. After the experiment was finished, the participants were debriefed and a structured post-interview on the experience concluded the experiment.
Whilst conducting the experiments, the majority of conditions were kept constant. An artificial soccer pitch was used to ensure the quality of the pitch was constant across experiments. Furthermore, all participants were right-handed/footed and of similar age. Every participant faced a goalkeeper of the same gender and all goalkeepers were of similar skill level, as they all played in the first team of vv Drienerlo. For all experiments, the same ball was used, namely, a Derbystar size 5, which is typically used in professional soccer matches. The air pressure of the ball was between 0.7 and 0.9 bar, following the professional soccer guidelines. As the experiment was conducted outdoors, there were also a few conditions that were variable, such as weather, temperature, which varied between 12 and 31 °C, and wind force, which varied from level 0 (calm) to 4 (moderate breeze) on the Beaufort scale. During three experiments, there was fog. The experiments were conducted before regular training sessions of the football club and scheduled between 4 and 8 p.m. (Central European summer time), meaning that some experiments were conducted after sunset. During these experiments, the light poles of the soccer pitch were lit. The lights were either off or on throughout an experiment; we ensured that they were never switched on or off during an experiment. Therefore, ambient light was assumed to be consistent within each experiment.

Data Acquisition
For fNIRS measurements, the Artinis Brite 24 was used to record oxygenated hemoglobin (O2Hb) and deoxygenated hemoglobin (HHb) in each channel at a sampling rate of 10 Hz. O2Hb is the form of hemoglobin with oxygen bound, whereas HHb is the form without bound oxygen. The Brite 24 is a portable and wireless device that allows flexibility in fNIRS optode placement with a total of 10 transmitter optodes and 8 receiver optodes. Numerous templates are available to arrange these optodes. OxySoft, which is proprietary software developed by Artinis, was used to record and transform fNIRS signals, which were then analyzed in Python. A maximum distance of 30 mm was used between each pair of optodes and a differential pathlength factor of 6 was used for all participants. During the experiment, fNIRS data were obtained from the left PFC, right PFC, left temporal cortex, motor cortex, left DLPFC, and right DLPFC, as these regions were found relevant to choking under pressure in the literature (see section 1). In order to measure all of these areas, the standard "4 × 4 + 2" template of Artinis was used; the corresponding optode placement can be found in Figure 1. Four channels were used to record each region of the left PFC, right PFC, left temporal cortex, and motor cortex. Two channels were used to cover left and right DLPFC.
FIGURE 1 | The layout of all fNIRS channels on the scalp. The yellow circles represent transmitter optodes and the blue circles represent receiver optodes. A channel lies between each transmitter-receiver pair. Channels 1-4 correspond to the motor cortex, channels 5-8 correspond to the right PFC, channels 9-12 correspond to the left PFC, and channels 13-16 correspond to the left temporal cortex. Channels 17 and 18 correspond to the right and left DLPFC, respectively. Certain electroencephalogram electrode positions, in accordance with the international 10-20 system, are included for reference.
FIGURE 2 | [...] placement of the equipment. The laptop close to the chair was used by the participants to fill in the questionnaires. The GoPro camera that was closest to the goal was aiming at the player and the other camera was aiming at the goal.
Frontiers in Computer Science | www.frontiersin.org
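As a small illustration, the channel layout given in the Figure 1 caption can be written down as a lookup table. The dictionary and function names below are hypothetical; only the channel numbers and region assignments are taken from the caption.

```python
# Channel-to-region mapping as stated in the Figure 1 caption.
# The names REGION_CHANNELS and region_of are illustrative assumptions.
REGION_CHANNELS = {
    "motor_cortex": [1, 2, 3, 4],
    "right_pfc": [5, 6, 7, 8],
    "left_pfc": [9, 10, 11, 12],
    "left_temporal": [13, 14, 15, 16],
    "right_dlpfc": [17],
    "left_dlpfc": [18],
}


def region_of(channel: int) -> str:
    """Return the region name for a channel number (1-18)."""
    for region, channels in REGION_CHANNELS.items():
        if channel in channels:
            return region
    raise KeyError(f"unknown channel: {channel}")
```

Keeping the mapping in one place makes it straightforward to select the channels of a region when computing region-wise features later on.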
One GoPro HERO7 camera was used to record the penalty kicking, such that the placement and power of the shot could be determined. The power of the shot was defined by the time it took for the ball to reach the goal. This was manually timed using a stopwatch and was expected to provide insights into kicking performance. Another GoPro camera was used to record videos of the player. We specifically investigated the duration for which players looked at the goalkeeper by comparing between rounds. A longer fixation on the goalkeeper can indicate that the goalkeeper is a distracting factor (Wilson et al., 2009; Wood and Wilson, 2010; Furley et al., 2017). For consistency, only the fixations during the 5-s waiting period were used. The videos of both cameras were recorded at 60 frames/s and had an image quality of 1080p.
In Figure 2, an overview of the set-up of the experiment is shown. The participant filled in the questionnaires on a laptop and the researcher monitored the fNIRS signals on a separate laptop. A Sena UD100 Bluetooth adapter was used, which allowed measurements up to a distance of 300 m. This means that the laptops could be placed at a safe distance from the goal. Furthermore, two GoPro cameras were used to record the player and the goal.

Signal Pre-processing
The acquired fNIRS signals were first preprocessed by applying a fifth-order Butterworth bandpass filter between 0.02 and 0.5 Hz to remove physiological noise and drift in the optical data (Kamran et al., 2018). The motion correction method Temporal Derivative Distribution Repair (TDDR; Fishburn et al., 2019) was used to reduce the impact of motion artifacts on the signals. This novel artifact correction method shows superior performance compared to other correction methods, such as Targeted Principal Component Analysis (tPCA) (Yücel et al., 2014), correlation-based signal improvement (CBSI) (Cui et al., 2010), Movement Artifact Reduction Algorithm (MARA) (Scholkmann et al., 2010), and wavelet-based methods (Molavi and Dumont, 2012; Chiarelli et al., 2015) (see Jahani et al., 2018 for a review on traditional artifact correction methods). Furthermore, the TDDR method requires no parameter tuning and only minimal assumptions need to be made on the fNIRS data, while other methods assume a normal distribution of the fNIRS data (Cui et al., 2010), require extensive parameter input from users (Scholkmann et al., 2010; Yücel et al., 2014), or suffer from baseline shifts of the signals (Molavi and Dumont, 2012; Chiarelli et al., 2015). The TDDR method was applied for each channel separately, using the following protocol. Given that $x_t$ represents a datapoint of the fNIRS channel at a certain timepoint ($t$), the temporal derivative, $y_t$, of the channel was first computed by subtracting the data of the previous timepoint from the current datapoint: $y_t = x_t - x_{t-1}$.
Then, a vector of observation weights ($w$) was initialized, $w_t = 1$, and the weighted mean of the fluctuations ($\mu$) was estimated by: $\mu = \frac{1}{\sum_t w_t} \sum_t w_t y_t$ (1). Afterwards, the absolute residuals ($r_t$) of the estimated mean were computed using: $r_t = |y_t - \mu|$. An estimate of the standard deviation ($\sigma$) of these residuals was computed. This was done by multiplying the median absolute residual by the appropriate constant for the normal distribution: $\sigma = 1.4826 \cdot \mathrm{median}(r)$.
For each observation, the scaled deviation ($d_t$) was computed. This was done by using the standard deviation of the residuals and the tuning constant that achieves 95% efficiency on normally distributed data: $d_t = \frac{r_t}{4.685\,\sigma}$ (2). Tukey's biweight function was used to compute new observation weights: $w_t = (1 - d_t^2)^2$ if $d_t < 1$, and $w_t = 0$ otherwise (3). The steps from Equations (1) to (3) were repeated until $\mu$ converged. This was considered the case when the difference between the current $\mu$ and the previous $\mu$ was smaller than $10^{-50}$. If this criterion was not satisfied after 1,000 loops (where one loop is one repetition of Equations 1-3), the process was stopped. On average, 98.75 loops were needed in this process. After $\mu$ had converged, the resulting robust weights were applied to the centered temporal derivative (subtracting the mean), in order to produce the corrected derivative ($y'_t$): $y'_t = w_t (y_t - \mu)$. Finally, the corrected temporal derivative was integrated in order to obtain the corrected signal ($x'_t$): $x'_t = \sum_{k \le t} y'_k$. After the motion artifacts were corrected, the channels were baselined. At the beginning of every round, a 30-s resting period was recorded to serve as a baseline. The average of the last 15 s of the resting period was subtracted from all datapoints of the signal. The baselining process was done for each channel separately. Two channels, namely channels 4 and 11 (related to the motor cortex and the left PFC, respectively), were removed due to their ultra-low fNIRS activities, possibly caused by bad optode connections. Despite the motion artifact correction, artifact effects may still have remained in the form of unreliable fNIRS waveforms. In normal situations where artifacts are absent, the direction of concentration changes of oxygenated hemoglobin (O2Hb signal) is opposite to that of deoxygenated hemoglobin (HHb signal), and therefore a negative correlation between O2Hb and HHb can be expected.
Nevertheless, motion artifacts in the signals can lead to concurrent changes of both signals in the same direction, leading to a more positive correlation. To measure the extent to which the fNIRS signals were affected by motion artifacts, we therefore calculated the correlation coefficient between both signals per trial and channel. It was suggested in the literature that large head movements can already increase the correlation coefficient to 0.1 (Cui et al., 2010) and that jumping artifacts can raise the coefficient to 0.4 (Lee et al., 2018). As the movement during the experiment is comparable to jumping, we opted to use the higher threshold, and the validity of the threshold has been confirmed by our empirical study on the threshold effect. Therefore, data for certain channels of certain trials were removed if the correlation coefficient between the O2Hb signal and the HHb signal was larger than a threshold of ρ = 0.4. The noisy data removal was done at the individual trial-channel level, i.e., only noisy channel data were removed per trial rather than discarding the whole trial. Consequently, ∼41% of all trial-channel data remained.
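The pre-processing steps after filtering can be sketched in plain Python as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the bandpass filter is omitted, the function names are hypothetical, the tuning constant 4.685 is the standard value for Tukey's biweight at 95% efficiency, and the integration starts at zero because the subsequent baseline subtraction removes any constant offset. The signals are assumed to be non-constant lists of floats for one channel.

```python
import math
from statistics import mean, median


def tddr(x, tune=4.685, tol=1e-50, max_iter=1000):
    """Sketch of TDDR for one channel (list of floats with at least 2 samples)."""
    y = [b - a for a, b in zip(x, x[1:])]      # temporal derivative y_t
    w = [1.0] * len(y)                         # initial observation weights
    mu = math.inf
    for _ in range(max_iter):
        mu_prev = mu
        mu = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)      # weighted mean
        r = [abs(yi - mu) for yi in y]                          # absolute residuals
        sigma = 1.4826 * median(r)                              # robust std estimate
        if sigma == 0:                                          # degenerate case: stop
            break
        d = [ri / (tune * sigma) for ri in r]                   # scaled deviations
        w = [(1 - di ** 2) ** 2 if di < 1 else 0.0 for di in d]  # Tukey's biweight
        if abs(mu - mu_prev) < tol:                             # convergence of mu
            break
    y_corr = [wi * (yi - mu) for wi, yi in zip(w, y)]           # corrected derivative
    corrected = [0.0]                          # assumption: integrate from zero
    for yc in y_corr:
        corrected.append(corrected[-1] + yc)
    return corrected


def baseline_correct(signal, rest, fs=10, baseline_seconds=15):
    """Subtract the mean of the last `baseline_seconds` of the resting period."""
    n = int(fs * baseline_seconds)
    offset = mean(rest[-n:])
    return [v - offset for v in signal]


def pearson(a, b):
    """Pearson correlation of two equally long, non-constant sequences."""
    ma, mb = mean(a), mean(b)
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    var_a = sum((u - ma) ** 2 for u in a)
    var_b = sum((v - mb) ** 2 for v in b)
    return cov / math.sqrt(var_a * var_b)


def keep_trial_channel(o2hb, hhb, threshold=0.4):
    """Keep trial-channel data unless O2Hb and HHb correlate above the threshold."""
    return pearson(o2hb, hhb) <= threshold
```

In this sketch a sudden step in the signal receives a weight of zero in the derivative domain and therefore disappears after integration, while small physiological fluctuations are kept almost unchanged.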

Feature Extraction
Afterwards, we extracted features from the cleaned fNIRS signals in valid trials and channels.
1. Motor cortex activation; three mean features were obtained from the three remaining channels (channels 1, 2, and 3) related to this cortex.
2. Left temporal cortex activation; four mean features were obtained from the four channels (channels 13, 14, 15, and 16) related to this cortex.
3. Averaged PFC activation; as there was no channel lying exactly in the middle between the right and left hemispheres, we calculated a feature to represent PFC activation by averaging the fNIRS signal from one representative channel in the left hemisphere and one from the right hemisphere. After bad-channel removal, there were three channels relevant to the left PFC (channels 9, 10, and 12) and four channels relevant to the right PFC (channels 5, 6, 7, and 8), generating 12 possible left-right combinations. We derived 12 mean features, one from each channel pair.
4. PFC asymmetry; similar to the averaged PFC feature, we calculated hemispheric asymmetry from all 12 combinations of left and right PFC channels by subtracting a left PFC channel from a right PFC channel and then calculating the mean of the result as a feature (Hatfield and Kerick, 2007; Silveira et al., 2019). Therefore, a positive value corresponds to a higher right PFC activation relative to left PFC activation, on average.
5. Connectivity between DLPFC and motor cortex; we calculated the connectivity index by following the method of Nguyen et al. (2018). First, Pearson correlation coefficients ($\rho$) between two fNIRS channels were calculated per trial by: $\rho_{X,Y} = \frac{\mathrm{cov}(X, Y)}{\sigma_X \sigma_Y}$, where $X$ and $Y$ denote channel data, $\sigma_X$ and $\sigma_Y$ refer to the standard deviations of channels $X$ and $Y$, respectively, and $\mathrm{cov}(X, Y)$ refers to the covariance between the two channels. In order to convert the sampling distribution of the Pearson correlation coefficients into the normal distribution, the obtained $\rho$ values were transformed to $z$ values using the Fisher z-transformation: $z = \frac{1}{2} \ln\left(\frac{1 + \rho}{1 - \rho}\right)$. The connectivity index was computed for all combinations of motor cortex channels (channels 1, 2, and 3) and DLPFC channels (channels 17 and 18). The number of significant connections was then determined by counting the number of connections that had an absolute z-value greater than our pre-defined threshold of 0.6, which corresponds to a correlation of ρ ≈ 0.54 and was motivated by the results of Nguyen et al. (2018). Apart from the feature derived from counting the number of significant connections, we also calculated the mean of the absolute z-values of all corresponding connections as another feature.
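Features 3 to 5 above can be sketched as follows. This is an illustrative sketch, not the study's code: the function names and the dictionary-based data layout (channel number mapped to the samples of one trial) are assumptions, and the Fisher z-transformation is computed with `math.atanh`, which is mathematically identical to $\frac{1}{2}\ln\frac{1+\rho}{1-\rho}$.

```python
import math
from itertools import product
from statistics import mean


def pearson(a, b):
    """Pearson correlation of two equally long, non-constant sequences."""
    ma, mb = mean(a), mean(b)
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    var_a = sum((u - ma) ** 2 for u in a)
    var_b = sum((v - mb) ** 2 for v in b)
    return cov / math.sqrt(var_a * var_b)


def pfc_pair_features(left, right):
    """Averaged PFC activation and asymmetry for every left-right channel pair.

    `left` and `right` map channel numbers to the samples of one trial
    (hypothetical layout). Asymmetry is right minus left, as in the text.
    """
    average, asymmetry = {}, {}
    for (lc, lx), (rc, rx) in product(left.items(), right.items()):
        average[(lc, rc)] = mean([(l + r) / 2 for l, r in zip(lx, rx)])
        asymmetry[(lc, rc)] = mean([r - l for l, r in zip(lx, rx)])
    return average, asymmetry


def connectivity_features(motor, dlpfc, z_threshold=0.6):
    """Number of significant connections and mean |z| over all channel pairs.

    Note: math.atanh raises ValueError for perfectly correlated channels.
    """
    z_abs = [abs(math.atanh(pearson(m, d)))
             for m, d in product(motor.values(), dlpfc.values())]
    n_significant = sum(z > z_threshold for z in z_abs)
    return n_significant, mean(z_abs)
```

With three left and four right PFC channels this yields the 12 channel pairs mentioned above, and with three motor and two DLPFC channels it yields six connections. The threshold of 0.6 on the z scale corresponds to tanh(0.6) ≈ 0.54 on the correlation scale.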
Negative feature values indicate that the feature was lower during the 5-s waiting time before task execution than during the resting period, and positive values indicate the opposite. Outliers, defined as values that deviated from the corresponding mean across all participants by more than three standard deviations, were then removed for each feature.
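As an illustration of the channel-pair features (items 3 and 4) and the outlier rule, the following sketch computes the averaged PFC activation and the right-minus-left asymmetry for all 12 pairs, and drops values lying more than three standard deviations from the mean. The data layout, the function names, and the use of the sample standard deviation are assumptions, not details reported in the study.

```python
from itertools import product
from statistics import mean, stdev

LEFT_PFC = [9, 10, 12]     # left PFC channels kept after bad-channel removal
RIGHT_PFC = [5, 6, 7, 8]   # right PFC channels kept after bad-channel removal


def pfc_pair_features(trial):
    """Compute per-pair PFC features for one trial.

    trial maps a channel number to its baseline-corrected O2Hb samples
    (hypothetical layout). Returns {(right, left): (averaged, asymmetry)}.
    """
    features = {}
    for right, left in product(RIGHT_PFC, LEFT_PFC):
        r, l = trial[right], trial[left]
        averaged = mean([(a + b) / 2 for a, b in zip(r, l)])
        asymmetry = mean([a - b for a, b in zip(r, l)])  # right minus left
        features[(right, left)] = (averaged, asymmetry)
    return features


def remove_outliers(values, n_sd=3.0):
    """Keep values within n_sd sample standard deviations of the mean."""
    m, s = mean(values), stdev(values)
    return [v for v in values if abs(v - m) <= n_sd * s]
```

A positive asymmetry value means the right PFC channel was, on average, more activated than the left one, matching the sign convention stated in item 4.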

Statistical Analysis
In order to test the hypotheses with the obtained features, permutation tests were used, as they make no assumption about the distribution of the data. A total of 100,000 permutations were used, meaning that the smallest possible p-value is 10^-5. The analysis was performed on the extracted features in three different studies, each of which compared two conditions and helps answer our research questions. The hypotheses followed previous findings in the literature.
1. Experienced vs. inexperienced players: it was hypothesized that motor cortex activation (three channels) is higher (Wolf et al., 2014) and left temporal cortex activation (four channels) is lower (Hatfield et al., 1982; Haufler et al., 2000; Wolf et al., 2015) in experienced players compared to inexperienced players. In total, seven statistical tests were made for the hypotheses on these two features.
2. Anxious vs. non-anxious players: it was hypothesized that motor cortex activation (three channels) is lower (Lee and Grafton, 2015), while the averaged PFC activation (12 channel-pairs) (Korb, 2010; Schweizer et al., 2013; Nosrati et al., 2016), PFC asymmetry (12 channel-pairs) (Hatfield and Kerick, 2007), and the connection between DLPFC and motor cortex (two indices) (Yoon et al., 2006; Clapp et al., 2009; Lee and Grafton, 2015) are higher in anxious players compared to non-anxious players. As the left temporal cortex was found to be related to the suppression of automated skills, which are possessed only by experienced players, different hypotheses were made for experienced and inexperienced players. It was hypothesized that experienced players have higher left temporal cortex activation (four channels) when anxious than when not anxious, as the automatic skill suppression does not function properly under anxiety (Zhu et al., 2011; Wolf et al., 2015). The opposite hypotheses were made for inexperienced players. In total, 3 + 12 + 12 + 2 + 4 + 4 = 37 statistical tests were made in this study of anxiety.
3. Scored vs. missed penalties: the number of statistical tests was identical to that in the study of anxiety. In particular, it was hypothesized that motor cortex activation (three channels) is higher (Lee and Grafton, 2015), the averaged PFC activation (12 channel-pairs) (Korb, 2010; Schweizer et al., 2013; Nosrati et al., 2016) and PFC asymmetry (12 channel-pairs) (Meyer et al., 2015; Silveira et al., 2019) are lower, and the connection between DLPFC and motor cortex (two indices) is higher (Yoon et al., 2006; Clapp et al., 2009; Lee and Grafton, 2015) when scoring penalties than when missing them. Again, the left temporal cortex was analyzed separately for experienced and inexperienced players; experienced players were hypothesized to exhibit lower left temporal cortex activation (four channels) when scoring (Wolf et al., 2015), and the opposite hypotheses were made for inexperienced players.
The total 7 + 37 + 37 = 81 statistical tests are also summarized in Table 2 that enumerates all features and all studies in this research. A multiple-testing correction was done using a false discovery rate (FDR) test as correction procedure (Singh and Dan, 2006), with significance threshold Q = 0.05 and the number of statistical tests m = 81. It is noteworthy that the connectivity analysis between DLPFC and motor cortex was done only on data from the last round of the penalty kick, which should involve the highest level of pressure, as the connectivity indices were found to be related to choking under pressure.
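The testing pipeline described in this section can be sketched as follows. The sketch makes several assumptions that are not stated in the text: the test statistic is the difference of group means, the test is one-sided, the p-value is floored at one over the number of permutations, and the FDR procedure is the Benjamini-Hochberg step-up rule. All function names are hypothetical.

```python
import random


def permutation_test(a, b, n_permutations=100_000, seed=0):
    """One-sided permutation p-value for the hypothesis mean(a) > mean(b)."""
    rng = random.Random(seed)
    observed = sum(a) / len(a) - sum(b) / len(b)
    pooled = list(a) + list(b)
    count = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        perm_a, perm_b = pooled[:len(a)], pooled[len(a):]
        diff = sum(perm_a) / len(perm_a) - sum(perm_b) / len(perm_b)
        if diff >= observed:
            count += 1
    # Assumption: floor at 1 / n_permutations, so that 100,000 permutations
    # give a smallest possible p-value of 1e-5.
    return max(count, 1) / n_permutations


def benjamini_hochberg(p_values, q=0.05):
    """Return booleans marking which tests survive FDR control at level q."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * q:
            cutoff_rank = rank  # largest rank meeting the step-up criterion
    keep = [False] * m
    for rank, idx in enumerate(order, start=1):
        keep[idx] = rank <= cutoff_rank
    return keep
```

In the setting of this study, `benjamini_hochberg` would receive the m = 81 p-values (7 + 37 + 37) together with Q = 0.05.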

Classifying Brain Data
In order to assess how well fNIRS data can be used to distinguish different levels of experience, anxiety, and success in penalty shooting, classification was done separately in each study. In each classification, a single type of feature (excluding the connectivity indices) was used, in order to investigate which feature is the most powerful for distinguishing the two classes. In addition to the mean feature used in the statistical analysis in the previous section, we also calculated the standard deviation, the minimum value, and the maximum value as additional features for each trial. Support Vector Machines (SVMs) with linear kernels were trained and tested on the feature data, where 80% of the data were randomly selected as training data and the remaining 20% were used as test data. The classification was implemented using the scikit-learn package for Python (https://scikit-learn.org/stable/) and evaluated by the accuracy and the area under the receiver operating characteristic (ROC) curve, which plots the true-positive rate against the false-positive rate. As the training/test split involved random shuffling, the classification was performed five times, and the grand average and standard deviation of the accuracy were reported.

RESULTS

Table 1 shows the performance of the players as the percentage of scored penalties in each round, the duration for which the players were looking at the goalkeeper, and the ratings of satisfaction and motivation to score at the end of each round. Wilcoxon rank-sum tests with Bonferroni correction were performed to compare experienced and inexperienced players. In addition, the Kruskal-Wallis one-way analysis of variance with Bonferroni correction was used to compare performances and scores between rounds 1, 2, and 3. The performance scores indicate that inexperienced players performed the worst in the last round, whereas experienced players had a similar performance in the second and the last round.
Overall, experienced players performed better than inexperienced players, except in the first round, in which experienced players scored less. Inexperienced players took considerably more risks in the later rounds. Figure 3 shows the placement (shot accuracy) of each penalty, demonstrating that inexperienced players shot more penalties over or wide during the last round. On average, inexperienced players shot their penalties higher and wider with each round. Interestingly, this is not the case for experienced players: although they also shot their penalties higher on average, their horizontal placement did not change between rounds. Furthermore, the shot power of both experienced and inexperienced players increased in the later rounds. Inexperienced players performed significantly worse in the last round than in the second round (see Table 1). For experienced players, this decrease in performance was not apparent; in fact, they performed slightly better in the last round than in the second round, although the difference was not significant.

Inexperienced players looked significantly longer at the goalkeeper during the last round, whereas experienced players looked at the goalkeeper for a slightly shorter time. The goalkeeper's aim in the last round was to distract the player, and this tactic seems to have been successful for inexperienced players but not for experienced players. A possible explanation is that experienced players are more familiar with these distracting methods of a goalkeeper and therefore know how to keep their concentration under these circumstances.

Behavioral Results
After each round was explained, the players were briefly asked how confident they were and how many goals they thought they would score. We observed that players were less confident in the last round, as the expected number of goals became lower in the later rounds, especially among the inexperienced group.
TABLE 1 | Behavioral results showing the percentage of the penalties that were scored for both experienced (Exp) and inexperienced (Inexp) players in each round, the averaged duration (out of the 5-s waiting period) that the players were looking at the goalkeeper, the average and standard deviation of the reported SAS scores enumerated by total (T), worry (W), disruption (D), and somatic (S) scores, and the satisfaction and motivation ratings at the end of each round; * indicates a significant difference between Exp and Inexp at p < 0.01 (corrected by Bonferroni correction); a, b, and c indicate significant differences at p < 0.05 (corrected by Bonferroni correction) between rounds 1-2, 1-3, and 2-3, respectively.

Also, the results of the SAS questionnaire show that the pressure was highest in the last round (see Table 1). Apart from the total anxiety score reported in the table, we also found that its component worry score and somatic anxiety score were lowest in the first round and highest in the last round. The concentration disruption score did not change between the rounds. For 12 out of the 22 participants, the total anxiety score increased per round. During the first round, six participants were considered to be at least somewhat anxious. A participant was considered at least somewhat anxious when at least one of the following four conditions was satisfied: the total score was 42 or above; the worry score was 14 or above; the disruption score was 10 or above; the somatic score was 18 or above. During the second round, this number increased to nine participants, and during the last round, it increased to twelve participants. In total, during 27 rounds (equivalent to 135 trials) out of the entire 66 rounds (namely 330 trials), the players reported being at least somewhat anxious, which corresponds to 40.9% of the trials. Anxious players also missed more penalties (around 58%) than non-anxious players (around 31%).
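The anxiety labelling rule and the reported proportion of anxious rounds can be checked with a short sketch. The dictionary keys and the function name are hypothetical; the thresholds and participant counts are the ones given in the text.

```python
# SAS thresholds from the text: a round counts as "at least somewhat anxious"
# if any one of these scores is reached or exceeded.
SAS_THRESHOLDS = {"total": 42, "worry": 14, "disruption": 10, "somatic": 18}


def is_anxious(scores):
    """Return True if at least one SAS score meets its threshold."""
    return any(scores[name] >= limit for name, limit in SAS_THRESHOLDS.items())


# Anxious participants per round as reported (rounds 1, 2, and 3).
anxious_per_round = [6, 9, 12]
total_rounds = 22 * 3              # 22 participants, three rounds each
trials_per_round = 5               # 15 penalties over three rounds

anxious_rounds = sum(anxious_per_round)
anxious_trials = anxious_rounds * trials_per_round
anxious_share = anxious_rounds / total_rounds
```

Here `anxious_rounds` is 27, `anxious_trials` is 135, and `anxious_share` rounds to 40.9%, in agreement with the figures reported above.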
Furthermore, Table 1 shows that the motivation rating was higher in the second round and significantly greater in the last round. Meanwhile, the satisfaction rating was lowest in the last round. As anxious players performed worse (more misses), it can be assumed that the results of the SAS questionnaire are trustworthy. Anxious players were more likely to miss (Wilson et al., 2009).

Figure 4 shows the correlation coefficients between the behavioral scores for all players, only experienced players, and only inexperienced players. Correlations were computed by Spearman's rank correlation method, and significant results at p < 0.05 (corrected by Bonferroni correction) are surrounded by red borders. Apart from the trivial correlations among the SAS scores, it can be observed that the percentage of goals scored is negatively correlated with the SAS total and SAS worry scores when the scores from all players are taken into account. This indicates that anxiety can adversely affect performance in general. For experienced players, the scoring percentage is positively correlated with satisfaction but negatively correlated with SAS worry scores. This suggests that successful performance is strongly associated with satisfaction with the penalty kick in this group, while anxiety can hinder performance. In contrast, SAS worry scores in inexperienced players are correlated with motivation. It can be inferred that the willingness to score can increase worry in novice players, or vice versa.
The fNIRS data of one participant were discarded from the subsequent analysis due to a technical failure in the recording. Statistical analysis and classification were done on the data from the remaining 21 participants.

FIGURE 5 | Averaged features across all trials from all participants where the difference between two conditions in a specific study, regarding experience, anxiety, and success/failure, is significant at p < 0.05 (uncorrected). Features are shown as their variation over the 5-s waiting period. Note that the studies of anxiety and success/failure in the left temporal cortex were done separately by experience group [experienced (Exp) and inexperienced (Inexp) players]. Results that remain statistically significant at p < 0.05 after FDR correction are shown with a gray background.

Statistical Analysis
As O2Hb concentration is directly related to the activation of a brain area, we focus only on the O2Hb concentration contrast with the 15-s baseline period preceding each round and discard HHb data from our analysis. Figure 5 shows all tests in which the feature values for the two conditions differ at the significance level p < 0.05 in a particular study. Tests that remained significant at p < 0.05 after FDR correction are marked with a gray background. A blank cell means that no significant result was found in the test. The mean and standard deviation of the mean feature values for each test are summarized in Table 2.
The trials of all rounds were included in the comparison between experienced and inexperienced players, but no significant results were found. Regarding anxiety, the results in Figure 5 show that in the motor cortex, the difference between anxious and non-anxious players was largest for channel 1, where the cortex was clearly less activated for anxious players. The averaged PFC activation was considerably higher in channel-pairs 5-9, 7-9, and 8-9 in anxious players. In anxious players, right PFC activation was higher than left PFC activation, as evidenced by greater PFC asymmetry in channel-pairs 6-12 and 8-12. For the left temporal cortex, experienced and inexperienced players were analyzed separately, as the hypotheses suggest that there could be a difference between the two. Anxious experienced players showed clearly higher activation in channel 15 than non-anxious experienced players, while no significant results were found for inexperienced players. Similarly, the DLPFC-motor cortex connectivity analysis on the last-round data did not indicate any significant difference between anxious and non-anxious players.
The contrast between a successful performance (scoring) and a failed performance (missing) was reflected mostly by the PFC asymmetry, especially in channel-pairs 5-10 and 7-10 (see Figure 5). Again, for the left temporal cortex, experienced and inexperienced players were analyzed separately. The results suggest that inexperienced players showed an increased left temporal cortex activation when scoring. As in the anxiety study, the DLPFC-motor cortex connectivity analysis of the last-round penalty kicks (with the highest pressure) did not indicate any significant difference between scoring and missing.

Classifier Results
Table 3 shows the classification results in each study using different features. The best result was obtained by using the motor cortex feature to distinguish between experienced and inexperienced players, achieving 66.7% accuracy and an area under the ROC curve of 0.6806. In general, anxiety and non-anxiety were classified most accurately by motor cortex data. The most informative feature for classifying scored against missed penalties is the averaged PFC activation feature based on accuracy and the left temporal cortex feature based on the area under the ROC curve.
Considering the assumed chance level of 50%, all classification results were above chance, but only by a small margin. This led us to analyze the data distribution and its impact on classification performance. Specifically, principal component analysis (PCA) was applied to some features in each classification, i.e., motor cortex data for experienced vs. inexperienced players classification, averaged PFC for anxious vs. non-anxious players classification, and PFC asymmetry for scored vs. missed penalties classification. Then, we visualized the distribution of the data projected onto the first and second principal components (PCs), as shown in Figure 6. The data points of the two classes clustered together rather than occupying separate regions of the projected space. It is therefore difficult for a linear classifier to achieve high performance in this classification task.
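The classification and projection procedure can be sketched with scikit-learn as follows. This is an illustrative sketch under stated assumptions, not the study's code: the function names are hypothetical, the splits are stratified by class (the text only states that the split was random), default SVM parameters are used, and the feature matrix `X` with binary labels `y` is assumed to be prepared beforehand.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.metrics import accuracy_score, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC


def evaluate_linear_svm(X, y, n_repeats=5, seed=0):
    """Repeat a random 80/20 split and report accuracy and ROC AUC.

    Returns (mean accuracy, std of accuracy, mean area under ROC).
    """
    accuracies, aucs = [], []
    for repeat in range(n_repeats):
        # Assumption: stratified split so both classes appear in the test set.
        X_train, X_test, y_train, y_test = train_test_split(
            X, y, test_size=0.2, random_state=seed + repeat, stratify=y)
        clf = SVC(kernel="linear").fit(X_train, y_train)
        accuracies.append(accuracy_score(y_test, clf.predict(X_test)))
        # The signed distance to the hyperplane serves as the ROC score.
        aucs.append(roc_auc_score(y_test, clf.decision_function(X_test)))
    return float(np.mean(accuracies)), float(np.std(accuracies)), float(np.mean(aucs))


def project_to_two_pcs(X):
    """Project the feature matrix onto its first two principal components."""
    return PCA(n_components=2).fit_transform(X)
```

Plotting the two columns returned by `project_to_two_pcs`, colored by class label, gives the kind of view shown in Figure 6; heavily overlapping classes in that view are consistent with accuracies close to chance for a linear classifier.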

DISCUSSION
In the present study, we demonstrated the feasibility of exploring brain activity in the field prior to executing a penalty kick, a strenuous physical activity of a kind that has been challenging for neuroimaging research (Carius et al., 2020). Our results show neurological evidence in fNIRS signals related to the level of experience in soccer, anxiety before task execution, and scoring success/failure.

Success in Pressure Induction
The results show poorer scoring performance in the second and last rounds, more shots over the bar and wider shots among the missed penalties in later rounds, and an increase in shot power per round. Together, these indicate that the players took more risks in the last round, which suggests that pressure was successfully induced. Inexperienced players in particular seem to have experienced a higher level of pressure. Other factors may also have been involved. In the last round, the players were competing for a prize, and to win it they had to not only score the most goals but also score the best-quality goals. We speculate that this incentive could have led them to take more risks in the last round, e.g., by trying to shoot the ball into the top corner.
The significantly poorer performance of inexperienced players could be an indication of heightened mental pressure. It could also be explained by the fact that the players had already taken five penalties against the goalkeeper. Based on these five penalties, the goalkeeper could potentially already know the shooting technique and favorite corner of the player. In contrast, experienced players performed comparably in the second and the last round. Some experienced players verbally reported that they needed a certain eustress in order to perform well, which could explain the slight increase in performance.
At the end of the experiments, the players were also asked how much pressure they experienced in each round and the majority indicated that they experienced the most pressure in the last round. Overall, it can be concluded that pressure was successfully induced as reflected by most indicators. The distribution between anxious and non-anxious players is also nicely balanced (41-59%).

Results in Line With Neural Efficiency Theory
Focusing on the anxiety analysis, our results are mostly in line with the neural efficiency theory. When players were anxious, the motor cortex (task-relevant area) was activated significantly less in one channel. The activation of task-irrelevant areas of the brain was more common when players were anxious. This was most prominently observable in the PFC, as a significant increase in averaged PFC activation in three channel-pairs was related to being more anxious. These results are in line with previous works (Korb, 2010; Schweizer et al., 2013; Nosrati et al., 2016) that reported an association between overactivation of the PFC and choking under pressure. According to Korb (2010), this overactivation would cause a distraction, decreasing one's focus on the task. The results of the present study agree with this theory, as an increase in PFC activation was paired with a decrease in motor cortex activation when players were anxious. The long-term thinking element of the PFC could be the source of this distraction, as players might think about the consequences of missing or scoring the penalty (Korb, 2010). Besides this increase in averaged PFC activation, the anxiety level of the player was also notable in the difference between left and right PFC activation. For two channel-pairs, the right PFC was more activated than the left PFC for anxious players; this is in line with previous findings (Hatfield and Kerick, 2007; Meyer et al., 2015; Silveira et al., 2019) that imbalanced PFC activation (caused by a stronger right PFC activation) leads to choking under pressure.

For experienced players, increased activation of the left temporal cortex was linked with being anxious; this is in line with previous evidence (Zhu et al., 2010; Wolf et al., 2015) that the left temporal cortex's relationship to self-instruction and self-reflection can cause a distraction for experienced players. Experienced players should trust their automated skills and therefore need to suppress self-instruction and self-reflection processes, which are essential in the early stages of learning a motor skill (Wolf et al., 2015). By activating the left temporal cortex more, experienced players neglect their automated skills and start to overthink the situation. This increase can be seen as a distracting factor.

An increase in left temporal cortex activation was expected to be related to neglecting automated skills and therefore to poorer performance (i.e., missing penalties) among experienced players. For inexperienced players, the opposite trend was expected and indeed observed in one channel in our results. For PFC asymmetry, however, the results were largely in line with the literature (Hatfield and Kerick, 2007; Meyer et al., 2015; Silveira et al., 2019). For channel-pairs 5-10 and 7-10, the right PFC was more activated than the left PFC when missing a penalty. It is noteworthy that when using an FDR correction with Q = 0.05 and m = 81, 3 out of the 10 significant results remain significant. The FDR correction is often applied to channel-wise fNIRS analyses similar to this study (Singh and Dan, 2006). These FDR-corrected results imply that most of the uncorrected significant results could be due to chance.

Table caption: Features were extracted from channels related to the motor cortex (MC) and the left temporal cortex (LTC), from channel-pairs demonstrating PFC asymmetry (Asym) and averaged PFC (Avg-PFC), and from DLPFC-MC connectivity showing the averaged number of significant connections (♯Con) and averaged z-scored connectivity index (z-val). Note that the studies of anxiety and success/failure in LTC were done separately by experience group [experienced (Exp) and inexperienced (Inexp) players]. Results that are statistically significant at p < 0.05 (uncorrected) are in bold, and those that are significant after FDR correction are marked with asterisks.
The only significant results that remain after the correction are: the left PFC is more activated than the right PFC when scoring a penalty, anxious players show a higher averaged PFC activation, and inexperienced players show an increased left temporal cortex activation when scoring a penalty. Although most results are not significant after FDR correction, they are still in line with previous findings in the literature. Therefore, although no direct conclusions can be drawn from the results of this study alone, the results can still be seen as support for the theory in the literature.

Limitations of the Study
The greatest challenge of this study is the same as in other in situ studies: motion artifacts, even though fNIRS technology is less susceptible to motion artifacts and electrical noise. Although the participants were instructed to minimize their movement during the 5-s waiting period before the whistle signal to start executing the kick, their intense eagerness to perform the task led to undesirable tiny movements in certain participants, resulting in the loss of 60% of the total data. A solution would be to prolong the waiting period to 10 s, which might help decrease the probability of motion artifacts in the signal. Prolonging the waiting period would also enable an alternative baselining method that uses the early period of the trial as a baseline, allowing a comparison with baselining by the 30-s resting period used in the current study.
Within the current study, scoring a penalty was seen as a successful performance, and missing a penalty is considered as a failed performance. However, this may not be the best measurement to use for this comparison. Scoring a penalty does not necessarily indicate that the penalty was taken well. For example, the goalkeeper can make a mistake, meaning that a badly taken penalty can still be a goal. In contrast, missing a penalty does not necessarily mean that a penalty was taken badly, as a goalkeeper can still save a penalty by correctly guessing the direction. It would therefore be recommended to, instead, look at the quality of the penalty. This can be done by, for example, looking at the shot-placement and shot-power.
Although mental pressure was successfully induced during this experiment, the levels of pressure were not the same as in professional soccer matches. The level of pressure during an important (professional) soccer match was not met and therefore it is uncertain if the pressure was sufficiently high to induce choking. A way to increase the level of pressure during an experiment is to recruit more spectators to witness penalty kicks.

Recommendations for Future Work
Future research should consider adding more trials per condition and prolonging the duration of each trial. This would allow more reliable statistical analyses and the calculation of heart rate variability, which can be captured from the cardiac cycles embedded in the fNIRS signals and was found to be a useful measure to detect stress and choking (Taelman et al., 2009). However, this has to be balanced against potential fatigue, which the participants reported as minimal in this experiment, where task execution lasted for about 25 min on average (std = 2.32 min). Besides, longer trials would enable the application of the sliding window technique, which was found to improve accuracy in detecting a mental state (mind-wandering) (Liu et al., 2020). Repeating the experiment with a protocol similar to that of this study would also allow the comparison of classification methods.
The statistical analyses of the fNIRS data and the classification performances are merely based on the mean of O2Hb features. However, it is known from previous fNIRS studies investigating mental states that alternative features, such as amplitude, slope, standard deviation, kurtosis, skewness, and signal peaks can provide insights and be used as discriminative features for classifying mental states. It is anticipated that alternative features, such as the maximum signal value, the time to peak, and the signal slope have the potential to improve the classification results.
Other factors that can influence mental states and affect the results are also worth investigation in future works, such as the interaction between penalty takers and goalkeepers, weather condition, comfortability of the fNIRS headset, amateur vs. professional players, the noise-sensitivity of the methods (Veale, 2013;Molavi et al., 2014), and the inter-subject variability in pressure induction.
It is noteworthy that the goal of this study is not to find the best classification model but to examine to what extent a simple linear classifier with minimal parameter tuning can classify different levels of experience, anxiety, and success in penalty shooting. In our case, SVMs with linear kernels were employed and achieved at most 66.7% accuracy and an area under the ROC curve of 0.6806 across the classification tasks. Future works can further improve the classification performance by applying more sophisticated algorithms; the results in this study can therefore only serve as a baseline. As we encourage other researchers to test other classification paradigms, we made the physiological data publicly available.

FIGURE 6 | Data distribution projected to first and second principal components for classifying: (A) experienced vs. inexperienced players using motor cortex data; (B) anxious vs. non-anxious players using averaged PFC data; (C) scored vs. missed penalties using the PFC asymmetry data.
We believe that neurofeedback regarding neural efficiency can have implications not limited to the soccer domain but also in other professions and tasks where physical performance under pressure is essential.

CONCLUSION
In the present study, a penalty-kick experiment in the field was set up, in which pressure was successfully induced. Our results provide supportive evidence for the neural efficiency theory, according to which the correct regions of the brain should be activated to successfully perform motor tasks under mental pressure. We demonstrated that brain activity associated with choking under pressure in a penalty kick situation can be reflected by in-the-field fNIRS measurement.
The results help answer our defined research questions. Regarding RQ1, which focused on performance, we related our findings to the neural efficiency theory, demonstrating that activation of the task-irrelevant PFC was related to missing penalties. This PFC activation showed itself as a higher right PFC activation compared to left PFC activation. The activation of the PFC can indicate a distraction, potentially caused by the long-term thinking ability of the PFC, as players might be concerned about the consequences of scoring or missing the penalty. We expected that connectivity between the motor cortex and the DLPFC during the last round of task execution (when mental pressure was highest) would provide insights into performing under high pressure, but no significant results were found. We therefore cannot answer this question. Similarly, we did not find a significant difference in brain activity between experienced and inexperienced soccer players when taking a penalty kick, which leaves RQ3 unanswered.
We found that experienced players showed a higher left temporal cortex activation when being anxious, answering our RQ4 that focuses on anxious experienced players. As the left temporal cortex is related to self-instruction and self-reflection, this increased left temporal cortex activation indicates that experienced players overthink the situation and neglect their automated skills.
Focusing on our RQ5 related to anxious inexperienced players, no significant results were found. However, when discarding level of expertise, we found that the averaged PFC activation was also related to players with anxiety. Similarly, an increased right PFC activation, as compared to left PFC activation, was shown to be related to anxious players, irrespective of the level of expertise. Also, the motor cortex tends to have lower activation when being anxious regardless of the experience group.

DATA AVAILABILITY STATEMENT
The original contributions presented in the study are publicly available. The data can be found here: https://doi.org/10.4121/14453556.

ETHICS STATEMENT
The studies involving human participants were reviewed and approved by Ethics Committee of the Faculty of Electrical Engineering, Mathematics and Computer Science, University of Twente (reference number: RP 2020-118). The patients/participants provided their written informed consent to participate in this study.

AUTHOR CONTRIBUTIONS
MS, NT, and MP conceived, planned, and designed the experiment. MS provided a critical review, carried out the experiment, data acquisition, analysis, processing interpretation, and wrote the full research report. NT wrote the manuscript with input from all the authors. MP supervised the research, provided the critical feedback, and proofread the manuscript. All authors discussed the results and contributed to the final manuscript.

FUNDING
This work was partially supported by the European Regional Development Fund's operationeel programma oost (OP-OOST EFRO PROJ-00900) and by the Netherlands Organization for Scientific Research (NWA Startimpuls 400.17.602).