Visual short-term memory load modulates the early attention and perception of task-irrelevant emotional faces

The ability to focus on task-relevant information, while suppressing distraction, is critical for human cognition and behavior. Using a delayed-match-to-sample (DMS) task, we investigated the effects of emotional face distractors (positive, negative, and neutral faces) on early and late phases of visual short-term memory (VSTM) maintenance intervals, using low and high VSTM loads. Behavioral results showed decreased accuracy and delayed reaction times (RTs) for high vs. low VSTM load. Event-related potentials (ERPs) showed enhanced frontal N1 and occipital P1 amplitudes for negative faces vs. neutral or positive faces, implying rapid attentional alerting effects and early perceptual processing of negative distractors. However, high VSTM load appeared to inhibit face processing in general, showing decreased N1 amplitudes and delayed P1 latencies. An inverse correlation between the N1 activation difference (high-load minus low-load) and RT costs (high-load minus low-load) was found at left frontal areas when viewing negative distractors, suggesting that the greater the inhibition the lower the RT cost for negative faces. Emotional interference effect was not found in the late VSTM-related parietal P300, frontal positive slow wave (PSW) and occipital negative slow wave (NSW) components. In general, our findings suggest that the VSTM load modulates the early attention and perception of emotional distractors.


Introduction
Visual short-term memory (VSTM) was proposed as a broadly-defined limited-capacity cognitive system for temporarily maintaining representations of external information to support human cognition and behavior (Baddeley, 2003). The central executive, which is the most important component of VSTM, is responsible for allocating attention to task-relevant information while suppressing task-irrelevant information (Norman and Shallice, 1986; Baddeley, 2003). This function is critical in the real world (e.g., when driving), where we need to focus our attention and devote our processing resources to a goal-related task while ignoring other distracting information. However, numerous studies have demonstrated that task-irrelevant information impairs the performance of VSTM and reduces its capacity (Dolcos and McCarthy, 2006; Erk et al., 2007; Anticevic et al., 2010; MacNamara et al., 2012). The active maintenance of task-relevant information and the inhibition of task-irrelevant information have been linked to the dorsolateral prefrontal cortex (DLPFC) and lateral parietal cortex (LPC; Chafee and Goldman-Rakic, 2000; Sakai et al., 2002; Jha et al., 2004; Dolcos and McCarthy, 2006). In addition, Iordan et al. (2013) demonstrated that the ventrolateral prefrontal cortex (VLPFC) is also linked to coping with emotional distraction.
Emotional stimuli are categorized by valence as positive, negative, or neutral. A number of studies have reported valence-related performance or activity for emotional stimuli (Smith et al., 2003; Erk et al., 2007; Luo et al., 2010; Kong et al., 2013). Emotional stimuli capture attention easily and induce a reallocation of resources, which enables rapid evaluation and decision making in the service of survival (Dolcos and McCarthy, 2006; Petroni et al., 2011). For example, negative emotional stimuli are processed rapidly and automatically by the amygdala and ventral striatum structures, and this may be valuable for detecting and recognizing potential threats and dangers (Pessoa, 2008; Stout et al., 2013). However, if the emotional stimuli serve as distractors, they may capture the limited attention or processing resources, and thus impair cognitive performance of functions such as VSTM. Several studies have found that emotional stimuli serving as distractors disturb VSTM function in all three phases of VSTM: encoding, maintaining, and retrieving relevant information (Dolcos et al., 2008; Clapp et al., 2010; Ziaei et al., 2014). However, encoding-distraction and maintaining-distraction tasks are different. The encoding-distraction task requires selective attention to and perception of the task-relevant stimuli while ignoring simultaneously presented distractors (Ziaei et al., 2014). In the maintaining-distraction task, by contrast, distractors appear suddenly after the task-relevant stimuli have been encoded, and they may reopen the perception gate and capture attention (McNab and Dolan, 2014). Using emotional faces as encoding-distractors, Stout et al. (2013) found that negative face distractors gained unnecessary access to VSTM and were filtered less efficiently than neutral face distractors.
On the other hand, when emotional faces are used as maintaining-distractors, negative distractors elicit increased activity in ventral regions and decreased activity in dorsal regions, resulting in impaired VSTM performance when compared with neutral distractors (Dolcos and McCarthy, 2006). Furthermore, McNab and Dolan (2014) compared VSTM capacity (VSTMC) across three conditions (no distraction, encoding distraction and delay distraction) and found dissociable results across the three conditions. That study used hierarchical regression analysis, in which the predicted variable was the VSTMC in the no-distractor condition and the regressors were the VSTMC in the encoding- and delay-distraction conditions. The authors found that the K values in the encoding- and delay-distraction conditions were each uniquely and positively associated with the predicted VSTMC, suggesting separate mechanisms of distractor filtering in the encoding and delay phases of VSTM (McNab and Dolan, 2014). In summary, negative distractors gained preferential processing and impaired task-related VSTM compared with neutral distractors, whether in the encoding or in the maintaining distraction task. In the present study, we examine whether emotional maintaining-distractors interfere with the task-relevant information maintained in VSTM, and whether the interference effect is valence-dependent.
Studies show that, due to the limited capacity of VSTM, emotional interference with a VSTM task is reduced as cognitive load increases. Behavioral studies have found that maintaining-distractors slowed VSTM reaction times (RTs) and increased error rates (Kim et al., 2005), and this effect was significantly larger for negative or positive distractors than for neutral distractors (Miendlarzewska et al., 2013). However, this interference was diminished after increases in task difficulty, suggesting that negative distractors may have the same interference effect as neutral distractors during VSTM maintenance in a high VSTM load task (Anticevic et al., 2010; Miendlarzewska et al., 2013). Moreover, the activity induced by emotional distractors in the amygdala and ventral striatum was reduced in the high-load task, implying insufficient VSTM capacity for emotional processing (Zald, 2003; Erk et al., 2007). An alternative interpretation could be that high cognitive load demands enhance top-down control, which in turn acts to inhibit emotional processing, resulting in decreased emotion-related interference (Clarke and Johnstone, 2013).
According to Lavie's load theory of attention, there is an "early and late" selective attention mechanism that may account for the interaction between emotional distraction and perceptual load. The early selection view suggests that as the perceptual load increases, there is insufficient capacity to perceive task-irrelevant distractors, resulting in better task-related performance and less distractibility, whereas task-irrelevant distractors are perceived when the perceptual load is low. In recent years, Konstantinou and Lavie (2013) have extended the theory to VSTM and proposed that VSTM load should have the same effect as perceptual load in reducing the visual-representation capacity available to perceive distractors. The late selection view suggests that even when the perceptual load is low and task-irrelevant distractors are perceived, a more active cognitive control mechanism, with functional links to working memory (WM) (and perhaps also VSTM), can modulate top-down control responsible for maintaining task-relevant information and suppressing the distractors in later stages. However, if cognitive control is loaded by a previous task (e.g., memory maintenance), performance on a subsequent task (e.g., visual search) will suffer from a worsened ability to inhibit task-irrelevant distractors (Lavie et al., 2004; Konstantinou and Lavie, 2013). In summary, early and late selection are associated with high and low perceptual load conditions, respectively. Lavie's load theory of attention resolves the early and late selection debate by treating early and late selection as two processing levels in a hybrid model of attention, instead of two separate mechanisms.
To investigate early and late selection, event-related potential (ERP) measurements with high time resolution can be used to track the time course of the interaction between emotional distractors and VSTM load. To this end, we used the distractor perception-related P1/N1 (early ERP components), and the VSTM-related P300 and the sustained slow wave (late ERP components). We examined how the VSTM load modulates emotional distraction during early and late selection. We also examined the valence-related interference effect from distractors on task-relevant information maintained in VSTM.
The ERP P1 component, which peaks around 80-150 ms over occipital areas, is associated with attention allocation (Clark and Hillyard, 1996). For example, P1 is larger for stimuli at an attended location than for stimuli at an unattended location (Hillyard et al., 1998). Combining dipole localization and MRI, Di Russo et al. (2002) argued that the P1 component is generated in the extrastriate cortex. Specifically, previous ERP studies have demonstrated that the P1 component is linked to early perceptual processing of emotional faces (Smith et al., 2003), showing enhanced amplitudes in the presence of negative faces (such as fearful and angry faces) vs. neutral or positive faces (Batty and Taylor, 2003; Palermo and Rhodes, 2007; Olofsson et al., 2008; Rellecke et al., 2012). It has been suggested that negative stimuli are processed rapidly and automatically via a fast magnocellular route (Vuilleumier, 2005). Recently, Valdés-Conroy et al. (2014) found that angry faces elicited larger P1 amplitudes than neutral faces, whether the face was task-relevant or task-irrelevant. In addition, Clapp and Gazzaley have suggested that the early P1 component (a marker of selective attention for faces) can be influenced by top-down modulation of visual processing, showing longer latencies and smaller P1 amplitudes for ignored compared to attended faces (Clapp et al., 2010). However, as the VSTM load increases (memorization of four items), the enhancement indices for faces decrease significantly, suggesting that increasing VSTM load exhausts the limited top-down attentional resources, thus resulting in diminished early activity modulation (Gazzaley et al., 2005, 2008; Gazzaley, 2011).
In addition, numerous ERP studies have found that the N1 component (80-150 ms post face onset) over frontal areas is related to facial expression processing, with larger N1 amplitudes for negative faces than for neutral faces (Eimer and Holmes, 2002; Holmes et al., 2003; Wessing et al., 2013) and happy faces (Luo et al., 2010). This suggests that the orbitofrontal cortex serves as a rapid detector of emotional stimuli (Rippon et al., 2001; Luo et al., 2010). Santos et al. (2008) further suggested that the orbitofrontal cortex could modulate extrastriate cortex activity through a top-down attentional alerting mechanism to generate rapid responses to potentially threatening stimuli. Thus, the P1 and N1 may be relevant measures for the study of early attention and perception of task-irrelevant emotional distractors.
Regarding late ERP components, we used the VSTM-related P300, which is a positive wave observed at central-parietal electrode sites approximately 300-650 ms post-stimulus onset. The P300 is thought to reflect the updating of VSTM, and its amplitude is associated with cognitive task demands (Donchin and Coles, 1988). Numerous studies have found that the P300 amplitude decreases when more items are maintained in VSTM (Kok, 2001; Watter et al., 2001; Busch and Herrmann, 2003). The P300 amplitude is thought to reflect the demands on "perceptual-central" resources. Thus, as more items are maintained in VSTM, fewer resources remain and the P300 amplitude decreases (Kramer and Spinks, 1991), suggesting that the P300 may be an index of VSTM demands. We examined whether the VSTM-related P300 component is influenced by emotional distraction. Specifically, we tested the hypothesis that emotional distractors, comprising negative and positive faces, are inefficiently filtered in the early stages of perception, gaining unnecessary access to VSTM. We thus expected the P300 amplitude to decrease for emotional faces compared with neutral faces.
In addition, we measured the sustained slow wave, which is a long-duration ERP component during the VSTM maintenance period (Ruchkin et al., 1995). Previous ERP studies have found a sustained negative slow wave (NSW) over the posterior area during VSTM maintenance, showing increased amplitudes with increased VSTM load (Ruchkin et al., 1992, 1995). Researchers proposed that the increased amplitude of the NSW may be associated either with more representations being maintained in VSTM, or with more processing resources being devoted to larger memory arrays (Bosch et al., 2001; Vogel and Machizawa, 2004; Zimmer, 2008). Rösler et al. (1997) further found that larger NSW amplitudes were associated with better VSTM performance. In recent years, Vogel et al. (2005) used a bilateral display of stimuli and asked participants to memorize the stimuli in a single hemifield. The study showed a contralateral and ipsilateral NSW according to the hemifield of the remembered stimulus. Furthermore, the contralateral NSW amplitude increased with increased VSTM load (Vogel et al., 2005; Fukuda et al., 2010). In addition, the frontal cortex is thought to play a key role in cognitive control, responsible for suppressing distractors and maintaining task-relevant information (Jha et al., 2004; Dolcos and McCarthy, 2006; Dolcos et al., 2008). McEvoy et al. (1998) found a sustained frontal positive slow wave (PSW) during VSTM maintenance, with enhanced amplitudes when either verbal or spatial VSTM load is increased. The authors suggested that these responses were sensitive to high-order attentional demands, with frontal areas serving as top-down attentional control for orienting sustained attention to task-relevant representations. Therefore, to measure the late suppression of task-irrelevant distractors, we used the NSW/PSW as the component of interest.
We examined whether emotional distractors, containing negative and positive faces, gained unnecessary access to VSTM, and whether this would result in an amplitude increase of the NSW/PSW compared with neutral faces.
In the present study, we manipulated the VSTM load and the valence of emotional distractors to measure the interference of emotional distraction with VSTM performance and neural activity. This work could serve to further clarify aspects related to the temporal dynamics of the interactions between cognitive load and emotional valence, and their relation with the predictions of the "load theory of attention and cognitive control" for high-level processing (Lavie et al., 2004). We employed event-related potentials (ERPs) and divided the ERPs during the maintenance interval into early and late components. The P1/N1 was taken to index the early perception of, and attention to, emotional distractors. Hence, we hypothesized that there would be enhanced P1/N1 activity for negative faces, based on the negativity attentional bias. If VSTM load modulates the neural correlates of emotional distraction in an "early" phase, we hypothesized that increasing VSTM load would decrease this early activity and reduce the negativity bias, based on Lavie's load theory of attention. The P300 is linked to the demands on "perceptual-central" resources; thus, we hypothesized that increases in VSTM load would result in P300 amplitude decreases. The NSW/PSW was taken to index late sustained attention to task-relevant representations and suppression of emotional distractors; thus, our hypothesis was that high VSTM load would increase NSW/PSW amplitudes. If VSTM load modulates the neural correlates of emotional distraction in the "late" phase, we hypothesized that emotional distractors would gain unnecessary access to VSTM in the low-load condition, resulting in decreased P300 amplitudes and increased NSW/PSW amplitudes for emotional faces compared with neutral faces. We expected this effect to be absent in the high-load condition.
To test the above hypotheses, we employed a delayed-match-to-sample (DMS) task using distinctly colored circles (Fukuda et al., 2010), in which encoding and reporting are trivially easy, so that imperfect performance represents the loss of VSTM maintenance. Emotional faces were presented after the encoding phase for 50 ms, which is a shorter duration than in previous studies (MacNamara et al., 2012). Participants were required to memorize the colors of one to six circles and ignore the emotional face distractors (Chinese Facial Affective Picture System, CFAPS; Lu et al., 2005) during the maintenance interval. Behavioral data (including accuracy, discrimination index and RTs) and ERP data were recorded and analyzed.

Participants
Fourteen right-handed undergraduate students (2 female) with normal or corrected-to-normal vision participated in the study. The mean age of the participants was 23.5 years, ranging from 21 to 25 years. One participant's data was excluded due to technical problems. Another participant's data was excluded due to excessive blinking and other artifacts. The study was approved by the University of Electronic Science and Technology of China Ethics Board. Written informed consent was signed by each participant in accordance with an experimental protocol approved by the University of Electronic Science and Technology of China Ethics Board before the experiment. The methods were carried out in accordance with the approved guidelines.

Materials
Seven distinct colored circles (red, green, blue, yellow, cyan, purple, and white) were chosen to measure VSTM capacity. The diameter of each circle was 3.2°. All colored circles were modified using Adobe Photoshop 7.0 (Adobe Systems, San Jose, CA, USA) to achieve uniform luminance, saturation and resolution. 70 positive (35 female), 70 negative (35 female) and 70 neutral faces (35 female) were picked from the Chinese Affective Picture System (Lu et al., 2005). Positive faces were selected from happy faces and negative faces were selected from angry faces. The viewing angle was 9.5° × 11°. All the stimuli were presented with EPRIME software on the monitor screen (32 × 24 cm²). Participants sat 60 cm in front of the computer screen.
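As a worked illustration of the stimulus geometry above, the following minimal sketch converts the reported visual angles into on-screen sizes at the stated 60 cm viewing distance. The function name `visual_angle_to_size_cm` is hypothetical and not part of the original study; the conversion is the standard relation size = 2 · d · tan(θ/2).

```python
import math

def visual_angle_to_size_cm(angle_deg: float, distance_cm: float) -> float:
    """On-screen extent (cm) that subtends `angle_deg` at `distance_cm`."""
    return 2.0 * distance_cm * math.tan(math.radians(angle_deg) / 2.0)

VIEWING_DISTANCE_CM = 60.0  # viewing distance reported in the text

# Angles reported in the text: circle diameter and face width/height.
circle_cm = visual_angle_to_size_cm(3.2, VIEWING_DISTANCE_CM)
face_w_cm = visual_angle_to_size_cm(9.5, VIEWING_DISTANCE_CM)
face_h_cm = visual_angle_to_size_cm(11.0, VIEWING_DISTANCE_CM)

print(f"circle diameter: {circle_cm:.2f} cm")
print(f"face: {face_w_cm:.2f} cm x {face_h_cm:.2f} cm")
```

Under these assumptions a 3.2° circle corresponds to roughly 3.35 cm on the screen, and the face stimulus fits comfortably within the 32 × 24 cm display.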

Procedure and Task
Figure 1 depicts a sample trial from the DMS task. Each trial began with a fixation (200 ms), during which time participants fixated on a white cross that was centrally presented against a black background. The memory array consisted of 1-6 colored circles, randomly selected without replacement from the set of seven colors, shown at random positions in a 3 × 3 matrix (13° × 13° viewing angle) at the center of the black background. The distance between circles was at least 3.8° (center to center). Participants were told to memorize the colors of one to six circles, which were presented for 200 ms. At the offset of the memory array, a task-irrelevant emotional face distractor was briefly flashed centrally for 50 ms. We selected 50 ms as the duration of the face distractor to prevent participants from engaging in additional processing of the faces, such as passing view, focusing on salient facial areas, or mental regulation of emotion (up- or down-regulation), which might produce different results (Wessing et al., 2013; Calvo et al., 2014). Furthermore, a long presentation duration of a face distractor during the VSTM maintenance interval would attenuate the memory of previous task-relevant information, no matter what kind of emotional distractors are used.
FIGURE 1 | Trial demonstration of delayed-match-to-sample (DMS) task. Participants were told to memorize the colors of one to six circles, which were presented for 200 ms (five color circles presented in this figure as an example). Following this, a positive, negative, or neutral face distractor was flashed for 50 ms at the location of the white box (the white box marker was invisible during the experiment). After an 800 ms maintenance interval, participants pressed a button to indicate whether the "Test Array" matched the "Memory Array" or not. The emotional face picture was picked from the Chinese Affective Picture System (Lu et al., 2005).
Previous studies have shown that happy and angry faces can be discriminated during the 50 ms presentation (Grimshaw et al., 2004). ERP studies used 50 ms to measure the early attention and fast detection of emotional faces, resulting in different P1 and N1 components for different emotional faces (Dennis and Chen, 2007;Eimer et al., 2008). In the present study, we examined how the VSTM load modulates the early perception of face distractors using P1 and N1 components. Participants were instructed to ignore the emotional face. After the maintenance interval of 800 ms, participants pressed a button to indicate whether the ''Test Array'' matched the ''Memory Array'' or not. Participants were told to respond using only one finger, pressing 1 for a match and 2 for no match, as quickly and accurately as possible. The color of one circle in the test array was different from the corresponding item in the memory array in 50% of the trials.
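The trial structure described above can be sketched as follows. This is a minimal illustration, not the authors' E-Prime script: the function `make_trial` and its return format are hypothetical. Two assumptions are ours: the changed test color is drawn from the colors not shown in the memory array, and match/non-match is decided by an independent coin flip per trial rather than by an exactly balanced list.

```python
import random

COLORS = ["red", "green", "blue", "yellow", "cyan", "purple", "white"]
GRID_CELLS = [(row, col) for row in range(3) for col in range(3)]  # 3 x 3 matrix

def make_trial(load, rng):
    """Build one hypothetical DMS trial for a memory load of 1-6 items."""
    colors = rng.sample(COLORS, load)     # colors drawn without replacement
    cells = rng.sample(GRID_CELLS, load)  # distinct positions in the grid
    memory = dict(zip(cells, colors))
    test = dict(memory)
    is_match = rng.random() < 0.5         # assumption: ~50% match trials
    if not is_match:
        # Change exactly one item to a color absent from the memory array
        # (assumption: the source does not say how the new color is chosen).
        cell = rng.choice(cells)
        unused = [c for c in COLORS if c not in colors]
        test[cell] = rng.choice(unused)
    return {"memory": memory, "test": test, "match": is_match}

rng = random.Random(0)
trial = make_trial(5, rng)
print(trial["match"], trial["memory"], trial["test"])
```

Because at most six of the seven colors are shown, at least one unused color is always available for the changed item.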
Each participant performed two practice blocks and 20 test blocks. Each block included 72 trials (a total of 1440 trials). Our experimental design was a 3 × 6 factorial with the within-subject factors being emotion (positive, negative and neutral) and VSTM load (one, two, three, four, five and six). Trial order was varied randomly within each block for each participant.

Data Recording and Processing
Participants were seated comfortably in front of the computer in a dark and silent room. Electroencephalography (EEG) signals were collected using a 128-channel EGI HydroCel GSN (EGI, Eugene, OR, USA). EEG data were sampled at 1000 Hz with an amplifier band-pass of 0.1-48 Hz and recorded using NetStation 4.1.2 (EGI Software: EGI, Eugene, OR, USA). Electrode impedances were kept under 50 kΩ at the beginning of the session. All channels were referenced to Cz (129th) during recording and re-referenced to an average reference off-line. EEG data were processed off-line using Net Station Waveform Tools: 1. Filter: an FIR 0.1-30 Hz bandpass filter was applied; 2. Segmentation: the data were segmented from 200 ms before the onset of the memory array to 1600 ms after the memory array onset; 3. Artifact Detection and Bad Channel Replacement: for each segment, channels whose amplitude exceeded 200 µV were marked as bad and replaced through the interpolation of neighboring electrodes; and 4. File Export: all of the data were exported in .mat format (data from one participant was excluded due to technical problems). MATLAB software was used to exclude the bad segments, i.e., those that contained significant eye movements, blinks, or muscle artifacts (potential exceeding ±100 µV), or an incorrect button press. We decomposed the data into 40 independent components (ICs) using the PCA method (EEGLAB toolbox). According to the scalp maps and time-courses, we identified and removed the eye movement artifacts during the maintenance interval. This rejection retained at least 84% of the original trials for each condition. One participant's data was excluded due to excessive blinking and other artifacts. The remaining segments were baseline corrected using the 200 ms before the onset of the memory array. Segments were then averaged across trials according to load (one to six) and emotional faces (positive, negative and neutral).
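The segment-level steps described above (amplitude-based rejection, baseline correction, and averaging) can be illustrated with a minimal single-channel sketch. It is not the authors' Net Station/MATLAB pipeline: the function names are hypothetical, the ICA-based ocular correction and behavioral exclusion are omitted, and the data are assumed to be lists of samples in µV at 1000 Hz, so that the 200 ms baseline spans the first 200 samples of each segment.

```python
def baseline_correct(segment, n_baseline=200):
    """Subtract the mean of the first `n_baseline` (pre-stimulus) samples."""
    baseline = sum(segment[:n_baseline]) / n_baseline
    return [v - baseline for v in segment]

def is_bad_segment(segment, threshold_uv=100.0):
    """Flag a segment whose absolute amplitude exceeds the threshold (in µV)."""
    return any(abs(v) > threshold_uv for v in segment)

def average_clean(segments, n_baseline=200, threshold_uv=100.0):
    """Reject bad segments, baseline-correct the rest, and average them.

    Returns the averaged waveform and the number of segments retained.
    """
    kept = [baseline_correct(s, n_baseline)
            for s in segments if not is_bad_segment(s, threshold_uv)]
    if not kept:
        raise ValueError("no segments survived artifact rejection")
    n = len(kept)
    erp = [sum(samples) / n for samples in zip(*kept)]
    return erp, n

# Toy example with a 4-sample baseline instead of 200 samples.
toy = [[1.0] * 4 + [3.0] * 4, [2.0] * 4 + [6.0] * 4, [0.0] * 4 + [150.0] * 4]
erp, n_kept = average_clean(toy, n_baseline=4)
print(n_kept, erp)
```

In the toy example the third segment exceeds ±100 µV and is dropped, so the average is computed over the two remaining segments.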

Behavioral Analysis
We calculated the VSTM capacity using the equation K = S × (Hit rate − False alarm rate)/(1 − False alarm rate) (Pashler, 1988), where S is the load (the number of color circles in the memory array), the hit rate is the conditional probability that participants accurately reported matches, and the false alarm rate is the conditional probability that participants reported a match when the two arrays did not match. It was easy for subjects to memorize one or two items (accuracy: 94 ± 1% for positive, 93 ± 1% for negative and 94 ± 0.8% for neutral faces). Therefore, we split the data for subsequent analyses, such that data with loads one and two were averaged into a low-load condition and data with loads 3-6 were averaged into a high-load condition. For the low-load condition, 1460, 1422 and 1445 trials remained in total across all subjects for positive, negative and neutral faces, respectively. For the high-load condition, 2374, 2281 and 2317 trials remained in total across all subjects for positive, negative and neutral faces, respectively. To balance the number of trials between the low-load and high-load conditions, we randomly selected equal numbers of low-load and high-load trials for each distractor type and each subject. An average of 121 (SD 24), 118 (SD 25) and 120 (SD 25) trials were left for positive, negative and neutral face distractors, respectively.
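The capacity estimate defined above can be written as a short function. This is a minimal sketch: the names `pashler_k` and `max_capacity` are hypothetical, the example rates are made up for illustration, and taking the maximum K across loads as the "maximum VSTM capacity" is our assumption about how that quantity is obtained.

```python
def pashler_k(set_size, hit_rate, false_alarm_rate):
    """K = S * (H - FA) / (1 - FA), following the equation in the text."""
    if false_alarm_rate >= 1.0:
        raise ValueError("false alarm rate must be below 1")
    return set_size * (hit_rate - false_alarm_rate) / (1.0 - false_alarm_rate)

def max_capacity(rates_by_load):
    """Largest K across loads; `rates_by_load` maps S -> (hit, false alarm)."""
    return max(pashler_k(s, h, fa) for s, (h, fa) in rates_by_load.items())

# Illustrative (made-up) rates: S = 4, H = 0.9, FA = 0.2 gives K = 3.5.
k_example = pashler_k(4, 0.9, 0.2)
print(k_example)
```

Note that K is bounded above by S, so an estimate of capacity requires loads that exceed the capacity limit, which is consistent with the use of up to six items here.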
Accuracy and mean RTs were analyzed across all participants using a 2 × 3 repeated-measures analysis of variance (ANOVA) with the repeated factors load (low-load and high-load) and emotional faces (positive, negative and neutral). Only trials with RTs less than 1200 ms were included in the accuracy analysis, and only correct responses were used to measure RTs. In addition, the discrimination index was defined as d′ = Z(hit rate) − Z(false alarm rate), where Z(P) is the z-score associated with probability P under the standard normal distribution.
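The discrimination index defined above can be computed with the inverse of the standard normal cumulative distribution function. The sketch below uses Python's standard-library `statistics.NormalDist`; the function name `d_prime` and the clipping parameter `eps` are our additions, since the source does not state how hit or false alarm rates of exactly 0 or 1 were handled.

```python
from statistics import NormalDist

def d_prime(hit_rate, false_alarm_rate, eps=1e-3):
    """d' = Z(hit rate) - Z(false alarm rate).

    Rates are clipped to [eps, 1 - eps] to avoid infinite z-scores
    (an assumption; the clipping rule is not given in the source).
    """
    z = NormalDist().inv_cdf  # inverse CDF of the standard normal
    h = min(max(hit_rate, eps), 1.0 - eps)
    f = min(max(false_alarm_rate, eps), 1.0 - eps)
    return z(h) - z(f)

print(round(d_prime(0.94, 0.10), 2))
```

Equal hit and false alarm rates give d′ = 0, i.e., no discrimination between matching and non-matching test arrays.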

ERP Analysis
ERPs were measured for the following regions of interest (ROIs): frontal (F3, FZ and F4), central (C3, CZ and C4), parietal (P3, PZ and P4) and occipital (PO7, OZ and PO8) regions. We defined the first peak during the maintenance interval as the early component, with the P1 at occipital sites (PO7, OZ and PO8) and the N1 at frontal sites (F3, FZ and F4). The peak values and corresponding latency of P1 and N1 in the 70-200 ms time-window post face onset were used in subsequent statistical analysis. The P300 was measured using the mean amplitude in the 300-500 ms time-window at parietal sites (P3, PZ and P4). Between 600-1050 ms, the late components were defined as the NSW at occipital sites and the PSW at frontal sites. The mean amplitudes for the 600-1050 ms time-window were used in subsequent statistical analysis.
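The two kinds of ERP measures described above, peak amplitude/latency within a window and mean amplitude within a window, can be sketched as follows. This is an illustrative single-waveform version, not the authors' MATLAB code: the function names are hypothetical, the waveform is assumed to be a list of samples at 1000 Hz, `onset_index` is the assumed sample index of the reference event, and the "peak" is simply the window maximum (P1) or minimum (N1).

```python
def window_indices(start_ms, end_ms, onset_index, srate_hz=1000):
    """Convert a time window (ms relative to onset) to sample indices."""
    lo = onset_index + round(start_ms * srate_hz / 1000.0)
    hi = onset_index + round(end_ms * srate_hz / 1000.0)
    return lo, hi

def peak_in_window(wave, start_ms, end_ms, onset_index, polarity, srate_hz=1000):
    """Return (peak amplitude, latency in ms) for a positive or negative peak."""
    lo, hi = window_indices(start_ms, end_ms, onset_index, srate_hz)
    window = wave[lo:hi + 1]
    if not window:
        raise ValueError("window lies outside the waveform")
    value = max(window) if polarity > 0 else min(window)
    latency_ms = (lo + window.index(value) - onset_index) * 1000.0 / srate_hz
    return value, latency_ms

def mean_amplitude(wave, start_ms, end_ms, onset_index, srate_hz=1000):
    """Mean amplitude over the window, e.g. 300-500 ms for the P300."""
    lo, hi = window_indices(start_ms, end_ms, onset_index, srate_hz)
    window = wave[lo:hi + 1]
    if not window:
        raise ValueError("window lies outside the waveform")
    return sum(window) / len(window)

# Toy waveform: a positive deflection at 120 ms and a negative one at 150 ms.
wave = [0.0] * 600
wave[100 + 120] = 5.0
wave[100 + 150] = -4.0
p1 = peak_in_window(wave, 70, 200, onset_index=100, polarity=+1)
n1 = peak_in_window(wave, 70, 200, onset_index=100, polarity=-1)
print(p1, n1)
```

In practice these functions would be applied to the averaged waveform of each electrode, load and emotion condition before entering the values into the ANOVAs.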

Regression Analysis
Linear regression models were used to evaluate the relationship between behavioral performance and ERP activity. Since it was easy for subjects to memorize one or two items (accuracy: 94% for positive, 93% for negative and 94% for neutral face), the ERP activity for the low-load condition served as the baseline and the difference score between high and low load was measured. The difference score of P1/N1 amplitude and latency was used to correlate with RT costs (high-load minus low-load) separately.

Statistical Analysis
Statistical analysis was performed using SPSS Statistics Release 19 (IBM, Somers, NY, USA) General Linear Model. Bonferroni corrections were performed for multiple comparisons and Greenhouse-Geisser epsilon corrections were performed for non-sphericity data where necessary. All regression analyses used Pearson's linear correlation coefficient.
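The brain-behavior analysis described above, correlating high-minus-low difference scores across participants, can be sketched with a plain implementation of Pearson's r. The function names and the numbers below are illustrative only; they are not data from the study, and the significance test reported by SPSS is not reproduced here.

```python
import math

def difference_scores(high, low):
    """Per-participant difference score: high-load minus low-load."""
    return [h - l for h, l in zip(high, low)]

def pearson_r(x, y):
    """Pearson's linear correlation coefficient between two samples."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (>= 2)")
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Made-up values for four hypothetical participants.
n1_high, n1_low = [-2.0, -1.5, -3.0, -2.5], [-3.0, -3.5, -3.2, -4.5]
rt_high, rt_low = [700.0, 650.0, 760.0, 640.0], [600.0, 610.0, 600.0, 620.0]
r = pearson_r(difference_scores(n1_high, n1_low),
              difference_scores(rt_high, rt_low))
print(round(r, 2))
```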

Behavioral Results
Accuracy (percent correct) and RTs for positive, negative and neutral faces for low-load and high-load conditions are shown in Figures 2A,B. For accuracy, there was a significant main effect of load (F (1,11) = 188.4, p < 0.001) with higher accuracy in the low-load than in the high-load condition. Neither main effect of emotion nor interactions between VSTM load and emotion were found. For RTs, there was a significant main effect of load (F (1,11) = 59.9, p < 0.001) with longer RTs in the high-load than the low-load condition. No main effect of emotion and no interactions between VSTM load and emotion were found. Two-way repeated ANOVA of accuracy and RTs with emotion (positive, negative, and neutral) and load (1, 2, 3, 4, 5, 6) as factors revealed a main effect of load (ACC: F (5,55) = 61.5, p < 0.001; RTs: F (5,55) = 25.1, p < 0.001), but there was no significant main effect for emotion nor a significant interaction. Similarly, there were no significant differences in maximum VSTM capacity for positive (4.6 ± 0.28 items), negative (4.8 ± 0.29 items) and neutral (4.4 ± 0.27 items) distraction (ACC: F (2,11) = 1.32, p = 0.3).
We also calculated the discrimination index (d′) for positive, negative and neutral faces for low-load and high-load conditions. This revealed only a significant main effect of VSTM load (F (1,11) = 205, p < 0.001) with a higher discrimination index in the low-load than in the high-load condition.

ERP Results
Figure 3 depicts the grand-averaged ERP waveforms of positive, negative and neutral faces for low and high loads at the ROIs: frontal (F3, FZ and F4), central (C3, CZ and C4), parietal (P3, PZ and P4) and occipital (PO7, OZ and PO8) regions. Emotional face distractors were associated with a relative negative shift over frontal sensors and a relative positive shift over occipital sensors around 150 ms post face onset. An overall 2 × 3 × 2 × 3 ANOVA with factors of site (frontal and occipital), hemisphere (left, middle and right), load (low-load and high-load) and emotional faces (positive, negative and neutral) revealed a significant interaction of site × emotion (F (2,10) = 5.96, p < 0.05), a significant interaction of hemisphere × load (F (2,10) = 4.87, p < 0.05), and a significant main effect of hemisphere (F (2,10) = 7.13, p < 0.05) with larger amplitudes for lateral vs. middle electrode sites. We then performed a 3 (left, middle and right) × 2 (low-load and high-load) × 3 (positive, negative and neutral) ANOVA for frontal N1 and occipital P1 separately, as shown below.
For the late sustained slow wave, VSTM load was associated with a relative positive shift of the slow wave over frontal sensors and a relative negative shift of the slow wave over occipital sensors. A 2 × 3 × 2 × 3 repeated ANOVA with site (frontal and occipital), hemisphere (left, middle and right), load (low-load and high-load) and emotional faces (positive, negative and neutral) revealed a significant interaction of site × load (F (1,11) = 6.41, p < 0.05). Hence, we performed 3 (left, middle and right) × 2 (low-load and high-load) × 3 (positive, negative and neutral) repeated ANOVAs for frontal PSW and occipital NSW separately, as shown below.

PSW and NSW Components
PSW components were observed at frontal regions (Figures 3, 5E,F), and NSW components were observed at occipital regions (Figures 3, 5G,H). A repeated 3 × 2 × 3 (hemisphere × load × emotion) ANOVA of frontal PSW amplitudes revealed a significant main effect of load (F (1,11) = 13, p < 0.01), with higher amplitudes for high vs. low loads. An identical ANOVA of occipital NSW amplitudes revealed a significant main effect of load (F (1,11) = 5.54, p < 0.05), with higher amplitudes for high vs. low loads. We observed neither a main effect of emotion nor an interaction.

Correlations Between ERP and Behavior
We performed correlation analyses to assess the relationship between early ERP activity and VSTM performance, separately for the three types of distractors. We found an inverse correlation between the N1 activation difference (high-load minus low-load) and RT costs (high-load minus low-load) when viewing negative face distractors at electrode site F3 (r = −0.76, p < 0.01; Figure 6). There was no significant correlation for the positive (r = 0.28, p = 0.38) or neutral (r = 0.02, p = 0.95) distractor conditions. In addition, there was no significant correlation between P1 latency changes and RT costs (all r < 0.4 and all p > 0.2).
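The brain-behavior analysis above correlates two per-participant difference scores. The following Python sketch shows how such load-related difference scores and a Pearson coefficient can be computed, and how r converts to a t statistic with n − 2 degrees of freedom. The function names are hypothetical, and the sample size n = 12 is inferred here from the error degrees of freedom (11) reported for the ANOVAs rather than stated in this section.

```python
import math


def load_cost(high, low):
    """Per-participant difference score: high-load minus low-load."""
    return [h - l for h, l in zip(high, low)]


def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def t_from_r(r, n):
    """t statistic (df = n - 2) for testing r against zero."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))


# r taken from the Figure 6 caption; n = 12 is an inferred assumption.
t = t_from_r(-0.759, 12)
T_CRIT_DF10_P01 = 3.169  # two-tailed critical t for df = 10, alpha = 0.01
print(round(t, 2), abs(t) > T_CRIT_DF10_P01)
```

Under these assumptions the statistic is about −3.69, which exceeds the two-tailed critical value for p < 0.01 and is therefore consistent with the significance level reported for the negative-face condition.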

Discussion
Due to the limited capacity of VSTM, more resources (including attentional and processing resources) are allocated to task-relevant information, resulting in fewer resources allocated to distractors and less distractibility. The aim of the present study was to investigate the time course of emotional distraction on maintenance of task-relevant information in VSTM, and the modulation of such interference by VSTM load.
In line with previous work, negative face distractors elicited larger N1 amplitudes compared with positive and neutral distractors, and these were maximal at frontal sites (Eimer and Holmes, 2002; Luo et al., 2010). Luo et al. (2010) suggested that this early frontal N1 represented the rapid detection of facial expression in the frontal cortex, especially for negative faces. Such a negativity bias is valuable for recognizing threatening information in the real world. Eimer et al. (2008) suggested that the frontal N1 represented rapid attentional alerting to emotional faces. Negative faces preferentially attract attention early in the information processing stream, as reflected by larger N1 amplitudes (Olofsson et al., 2008). In addition, the frontal N1 amplitude was reduced by VSTM load, which may indicate that the high VSTM load exhausted limited attentional resources, resulting in less attention allocated to distractors (Lavie et al., 2004). An alternative interpretation could be that high cognitive load demands enhanced top-down control, which in turn acts to inhibit emotional processing. In the current study, an inverse correlation was observed between the N1 activation difference (high-load minus low-load) and the RT difference (high-load minus low-load) when viewing negative face distractors at the F3 electrode (Figure 6), indicating that individuals with larger decreases in N1 activity showed smaller RT costs from low-load to high-load, and vice versa. This finding suggests that, in the high-load condition, VSTM performance improves when less early attention is allocated to negative distractors, since RTs in the high-load condition are then almost as fast as RTs in the low-load condition. However, there was no significant correlation between the N1 activation difference and the RT difference in the positive or neutral distractor conditions. This pattern suggests that frontal areas may be involved in resisting emotional interference in the early phase (Jha et al., 2004; Anticevic et al., 2010; Ziaei et al., 2014).
As expected, we found larger P1 activity in response to negative compared with positive distractors. Consistent with our findings, P1 amplitudes for fearful or angry faces at occipital sites have been shown to be larger than those for positive or neutral faces (Batty and Taylor, 2003; Rellecke et al., 2012), suggesting rapid and automatic emotional processing. Moreover, some researchers found such an effect with brief presentation of faces and suggested that it reflected a rapid extraction of emotional information before fine-grained perceptual processing begins (Pourtois et al., 2004; Dennis and Chen, 2007). In addition, previous studies have reported that P1 latency for faces that need to be ignored was significantly longer than for attended faces, suggesting that P1 was modulated by top-down attentional control (Clapp et al., 2010). In our study, we found significantly delayed P1 latency for negative vs. neutral distractors. As the P1 latency shift reflects the large neurons engaged in visual association cortex (VAC; Lopes da Silva, 1991), this may suggest that ignoring negative faces is harder than ignoring neutral faces and thus requires more processing resources. Moreover, we found that high VSTM load prolonged P1 latency compared with low VSTM load, suggesting that neural processing of ignored distractors was slowed down. This finding may imply that ignoring distractors becomes more difficult and less effective when VSTM capacity is insufficient. Previous studies using inverted and upright faces as stimuli found P1 latency to be delayed, suggesting that inverted faces undermine the holistic processing of a face (Taylor, 2002, 2004; Meeren et al., 2005). Thus, an alternative interpretation of the delayed P1 latency observed in the current study for the high-load condition could be that high VSTM load disrupts early facial expression processing.
Researchers have proposed that the P300 amplitude reflects demands on "perceptual-central" resources or processing capacity (Kramer and Spinks, 1991). Previous studies have shown that the P300 over parietal regions is related to VSTM load, and that its amplitude decreases with increasing VSTM load (Kok, 2001; Watter et al., 2001; Busch and Herrmann, 2003). Consistent with the above studies, we found that P300 amplitude decreased from low-load to high-load. This finding suggests that increasing VSTM load attenuates the perceptual-central resources and consumes the limited VSTM capacity, as reflected by decreased P300 amplitudes (Kok, 2001).
Late slow wave components were used to measure the late attentional control engaged in representational selection between task-relevant information and distractors. The current results revealed that VSTM load increased occipital NSW activity. This result is in line with previous studies reporting that the sustained activity during VSTM maintenance is related to the amount of information retained in VSTM (Ruchkin et al., 1992, 1995). Studies using bilateral stimulus displays have found that the contralateral NSW amplitude was strongly modulated by the number of colored objects held in VSTM (Vogel et al., 2005; Fukuda et al., 2010).
Studies have shown that the frontal cortex is associated with VSTM maintenance and manipulation, and that it plays a role in allocating attention toward or away from information, via top-down communication with sensory cortices (Curtis and D'Esposito, 2003; Dolcos and McCarthy, 2006; Ranganath, 2006). McEvoy et al. (1998) found that frontal slow wave activity was enhanced with increasing verbal and spatial VSTM load, suggesting that these responses were sensitive to high-order attentional demands in the task. Similar results have also been found in auditory VSTM tasks (Monfort and Pouthas, 2003). The present results revealed that the frontal PSW activity was higher in the high-load condition than in the low-load condition, which may be due to the greater working memory or attentional demands associated with the high-load condition.
FIGURE 6 | Regression between the visual short-term memory (VSTM) load-related change in N1 amplitude (High-Load minus Low-Load) and the VSTM load-related change in RT (High-Load minus Low-Load) for positive face (red), negative face (blue) and neutral face (green) conditions, respectively. The Pearson correlation coefficient (r = −0.759) was significant (p = 0.004) for negative faces only.
Our results showed no effect of emotional faces on the frontal PSW and occipital NSW. One possibility is that the very brief presentation duration of the faces (50 ms) in the present study decreased their emotional salience, so that they had no impact on the task-relevant information maintained in VSTM, which is in line with the behavioral results. Another possibility is that the brief duration of the faces prevented the participants from further processing them, including response selection and decision.
In summary, we found that high VSTM load modulates the early perception of emotional distractors (P1, N1). This is in line with Lavie et al.'s (2004) early selective attention theory, according to which perception of irrelevant distractors is reduced under high load because of insufficient capacity for distractor processing. Moreover, individuals with larger decreases in N1 activity showed smaller RT costs from low-load to high-load when viewing negative face distractors. However, emotional distractors had no effect on the late VSTM-related activities (P300, PSW and NSW) or on VSTM behavioral performance. This may be due to the brief presentation of the faces, which were easily ignored, so that participants did not need to engage further late cognitive control to suppress them.

Conclusion
The present study illustrates the modulation effects of VSTM on emotional processes. Negative faces elicited higher frontal N1 and occipital P1 activity, reflecting rapid attentional alerting and early perceptual processing of negative faces. In addition, increasing VSTM load reduced the N1 activity and delayed the P1 latency. These results are in line with Lavie et al.'s (2004) early selection view that perception of irrelevant distractors is reduced during the high-load condition due to insufficient capacity for distractor processing. An alternative interpretation could be that high cognitive load demands enhanced top-down control, which in turn acts to inhibit emotional processing. The late P300 response decreased with increasing VSTM load, suggesting that increasing VSTM load attenuates the perceptual-central resources and consumes the limited VSTM capacity. The late frontal PSW and occipital NSW responses were sensitive to increases in VSTM load, suggesting higher VSTM or attentional demands during the high-load vs. the low-load condition. No emotional interference effect was found in the late VSTM-related P300, PSW and NSW components. The present findings support the view that VSTM modulates emotional distraction in the early phase. Longer presentation of face distractors may help to clarify how VSTM modulates emotional distraction in the late phase.