Alpha oscillations and early stages of visual encoding
- 1 Department of Physiological Psychology, University of Salzburg, Salzburg, Austria
- 2 Center for Lifespan Psychology, Max Planck Institute for Human Development, Berlin, Germany
For a long time alpha oscillations have been functionally linked to the processing of visual information. Here we propose a new theory about the functional meaning of alpha. The central idea is that synchronized alpha reflects a basic processing mode that controls access to information stored in a complex long-term memory system, which we term knowledge system in order to emphasize that it comprises not only declarative memories but any kind of knowledge, including procedural information. Based on this theoretical background, we assume that during early stages of perception, alpha “directs the flow of information” to those neural structures which represent information that is relevant for encoding. The physiological function of alpha is interpreted in terms of inhibition. We assume that alpha enables access to stored information by inhibiting task-irrelevant neuronal structures and by timing cortical activity in task relevant neuronal structures. We discuss a variety of findings showing that evoked alpha and phase locking reflect successful encoding of global stimulus features in an early post-stimulus interval of about 0–150 ms.
We proceed from the assumption that alpha may reflect a processing mode that is closely associated with accessing and retrieving information from a long-term storage system (for reviews see Klimesch, 1999; Klimesch et al., 2007b). With respect to perception, the central idea discussed in this article is that alpha is associated with a certain type of encoding stage that enables access to stored information and thereby “extracts” the meaning of sensory information. The meaning of sensory information is not represented by a stimulus. It is represented in memory. The crucial point, thus, is that for the processing of sensory information access to “semantic” information stored in memory is an important stage during encoding. The term “semantic” must be understood in a very broad sense, comprising not only declarative information (such as the meaning of words and pictures) but also implicit information (such as the meaning of motor sequences). Our hypothesis is that these processes of accessing and retrieving “semantic” information are reflected by two types of alpha responses. Alpha desynchronization and synchronization reflect a processing mode that controls retrieval of information. Post-stimulus phase responses (such as, e.g., phase locking) probably reflect early access to stored information.
We will discuss these hypotheses on the basis of different findings. In Section “The Functional Meaning of Alpha Power During Perception,” we try to show that event-related alpha power changes (as measured by event-related desynchronization/synchronization or ERD/ERS; a method pioneered by Pfurtscheller and his colleagues, cf. Pfurtscheller and Aranibar, 1977; Pfurtscheller and Lopes Da Silva, 1999) are strictly task- and not stimulus-dependent. We argue that ERS reflects the suppression of irrelevant information, and hypothesize that the time period of maximal ERD reflects retrieval of semantic information and that this process is essential for stimulus recognition. Early research from our laboratory has demonstrated that alpha ERD is particularly responsive to semantic task demands, which require the retrieval and processing of semantic information (for reviews see Klimesch, 1997, 1999).
In Section “Alpha Phase and Encoding,” we report data showing that in response to a stimulus, alpha phase exhibits a complex response pattern, including phase locking, phase alignment, and (at least in some cases) the initiation of traveling waves. Considering the fact that in memory tasks the onset of ERD is comparatively late (at around 250 ms post-stimulus, depending on task demands) we assume that the alpha phase response which starts at around 100 ms reflects early access to memory. One important aspect of this assumption relates to the P1 component of the ERP and its function. As recently proposed in a theoretical paper (Klimesch, accepted) we assume that the P1 reflects a process – termed “early stimulus categorization” – that enables access to memory.
In Section “Alpha and the Flow of Information in the Brain: Physiological Considerations,” we discuss the physiological aspects of alpha and in Section “Discussion” we compare the proposed hypotheses with other theoretical concepts proposed in the literature. We emphasize that alpha may reflect a processing mode that controls the flow of information in the brain, either in a bottom-up or top-down “direction.” This processing mode is considered “specific” in the sense that it belongs to a specific memory system which we have termed knowledge system (KS; e.g., Klimesch et al., 2007b, 2010). The general conclusion is that alpha oscillations play an important role by directing the information flow to task relevant brain structures on the basis of two general mechanisms: inhibition and timing (cf. Klimesch et al., 2007b).
The Functional Meaning of Alpha Power During Perception
The traditional belief is that alpha desynchronizes in response to a visual stimulus. Research on ERD/ERS has shown that in many cases a visual stimulus leads to a strong ERD which reflects a decrease in amplitudes in relation to a pre-stimulus reference. Although this was (and still is) considered the “typical” event-related alpha response, it is now clear that this view must be extended: in response to a stimulus and/or task demands, alpha may exhibit either ERD (a decrease in amplitudes) or ERS (an increase in amplitudes).
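As a minimal illustration of how a power change relative to a pre-stimulus reference can be expressed, the following sketch computes percentage and z-scored changes from band power values. The function names, the sign convention (negative values denote ERD, positive values ERS), and all numbers are assumptions made for illustration; they are not data or code from the studies discussed here.

```python
from statistics import mean, stdev


def percent_power_change(reference_power, test_power):
    """Percentage change of each test value relative to the mean reference power.

    Assumed sign convention: negative = ERD (decrease), positive = ERS (increase).
    """
    ref = mean(reference_power)
    return [100.0 * (p - ref) / ref for p in test_power]


def z_power_change(reference_power, test_power):
    """Change of each test value in units of the reference interval's standard deviation."""
    ref_mean = mean(reference_power)
    ref_sd = stdev(reference_power)
    return [(p - ref_mean) / ref_sd for p in test_power]


# Hypothetical upper alpha band power (arbitrary units), one value per time bin.
reference = [10.0, 12.0, 11.0, 9.0, 8.0]  # pre-stimulus reference interval
post_stimulus = [12.0, 7.5, 5.0, 15.0]    # post-stimulus time bins

changes = percent_power_change(reference, post_stimulus)
z_values = z_power_change(reference, post_stimulus)
labels = ["ERS" if c > 0 else "ERD" if c < 0 else "no change" for c in changes]

for c, z, label in zip(changes, z_values, labels):
    print(f"{c:+6.1f} %  z = {z:+5.2f}  {label}")
```

Whether the change is reported as a percentage or as a z-value does not affect its direction; only the scaling differs.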
Event-Related Power Changes (ERD, ERS) are Not Strictly Stimulus Dependent
It is important to emphasize that the type of event-related alpha reactivity (ERD or ERS) as well as the time course of a power change (associated with ERD or ERS) is not strictly stimulus but rather task dependent. We will discuss four different examples in order to underline the validity of this statement.
Example 1: The “typical” ERD in response to a visual stimulus. Figure 1A depicts the time course of upper alpha ERD during word recognition (data are replotted from Klimesch et al., 2000). A set of 96 target words (which were presented in a preceding study session) and 96 distractor words were presented in a random sequence (exposure time: 250 ms). The interstimulus-interval was 4 s. Subjects responded by pressing a “yes” or a “no” response key. Mean response time for hits was 966 ms. Data are analyzed only for correctly remembered words. As Figure 1A shows, after a brief increase in amplitudes (ERS) reflecting the influence of early ERP components, a sharp drop in amplitudes (ERD) can be observed that reaches a maximum at around 600 ms. Considering the time needed for the manual response and the mean response time of about 970 ms, the ERD peak at around 600 ms most likely indicates the end of processing the stimulus. The interpretation of this finding is that an increasing ERD reflects an increase in excitation that is associated with different stages of stimulus processing comprising the retrieval and evaluation of a memory trace.
Figure 1. The event-related alpha power response is task, not stimulus dependent. Alpha power changes were determined relative to a pre-stimulus reference interval and were calculated as percentage or z-values. An increase in power is termed event-related synchronization (ERS), a decrease event-related desynchronization (ERD). All of the results shown here are for the upper alpha band (10–12 Hz) and from posterior recording sites [with the exception of the data shown in (D), which were recorded from temporal sites]. (A) In a word recognition task, a brief and transient increase in power can be observed in response to the onset of a correctly recognized word. Then, a large ERD develops, reaching a maximum at around 600 ms post-stimulus. Data are replotted from Klimesch et al. (2000). (B) In a visual oddball task a significant ERD can be observed already around 500 ms pre-stimulus. Data are replotted from Klimesch et al. (1998a). (C) An immediate onset of ERD can be observed in a task, in which light flashes were presented when alpha power exceeded a predetermined (individually determined) threshold. Data are replotted from Woertz et al. (2004). (D) In a memory scanning task, ERS can be observed during the presentation of the memory set (i.e., during encoding) but ERD during the processing of the probe item (i.e., during retrieval). Data are replotted from Klimesch et al. (1999).
Example 2: Early anticipatory ERD in a simple target search paradigm (data are from Klimesch et al., 1998a,b). As Figure 1B demonstrates, a significant drop in amplitudes (in relation to a reference interval of −4000 ms to −3000 ms pre-stimulus) can be observed already around 500 ms before the presentation of a target (a row of five X’s: XXXXX) or a non-target (five O’s: OOOOO). The task was to silently count the targets (presented with a probability, p = 0.30) and to press a response key with the right index finger. Non-targets (p = 0.70) had to be ignored. During the processing of the target, a pronounced ERD can be observed. Non-targets, however, elicited an increase in amplitudes (ERS; not shown in Figure 1B). ERD may again be interpreted to reflect excitatory processes that are associated with the activation of a memory trace. The anticipatory ERD might reflect the pre-activation of neural networks that represent knowledge about processing targets and non-targets. After the presentation of the target the large ERD, peaking at about 400 ms, may again reflect the end of processing a task relevant stimulus. This is in sharp contrast to non-targets, which elicit an ERS. In addition, several studies have shown that anticipatory attentional effects reduce alpha power (i.e., elicit ERD) over brain areas that are task relevant, whereas alpha power is increased over task-irrelevant/potentially competing brain areas (e.g., Freunberger et al., 2008a; Rihs et al., 2009).
Example 3: Immediate onset of ERD after visual stimulation in a “non-cognitive” task. In a study by Woertz et al. (2004) in which spontaneous alpha power changes were monitored in real time and visual stimuli (brief light flashes) were presented if power exceeded a predetermined threshold, an instantaneous decrease in power (triggered by the light flash) could be observed, as is illustrated in Figure 1C. Stimuli could not be anticipated and light flashes do not convey structured visual information. Thus, the observed ERD cannot be interpreted to reflect the processing of a stimulus. It probably reflects increased alertness that is elicited by the stimulus.
Example 4: “Paradoxical” ERS during the presentation and processing of visual information. In a study by Klimesch et al. (1999) in which visual stimuli (numbers and letters) were used in a memory scanning task, a decrease in amplitudes (ERD) was observed only in response to the test item (when subjects had to retrieve/compare the presented item with those stored in working memory). As illustrated in Figure 1D, no ERD was observed in response to the presentation of the memory set (i.e., to items that had to be encoded into working memory). Instead, an increase in power (reflected by ERS) was observed that lasted into the retention period. This finding appears paradoxical but has since been observed in a variety of studies (e.g., Jensen et al., 2002; Busch and Herrmann, 2003; Cooper et al., 2003; Herrmann et al., 2004; Sauseng et al., 2005, 2009). It most likely reflects the inhibition of task-irrelevant/interfering processes.
In summary, the reported findings clearly demonstrate that ERD is not an obligatory response to visual stimulation. The onset of ERD (i) may precede visual stimulation (cf. Example 2; Figure 1B), (ii) may occur immediately (cf. Example 3; Figure 1C), or (iii) may be delayed (cf. Example 1; Figure 1A). Most importantly, there are even cases where visual stimulation elicits ERS (cf. Example 4; Figure 1D).
For the interpretation of ERD and ERS we consider two levels, a more general, and a task or processing specific level. On a more general level, it appears safe to assume that ERD is an electrophysiological correlate of excitatory processes, whereas ERS is a correlate of inhibitory processes. This hypothesis is discussed in detail in Klimesch et al. (2007b). One central assumption is that alpha controls the flow of information by inhibiting the processing of task irrelevant and/or competing information and by disinhibiting the processing of task relevant information. An inhibitory function of alpha is also suggested by Jensen and Mazaheri (2010) but is not considered by other researchers in the field (Palva and Palva, 2007).
The waxing and waning of ongoing alpha, with periods of decreased and increased amplitudes, may be interpreted in terms of excitation (release from inhibition) and inhibition respectively. The findings by Woertz et al. (2004) are a good example to illustrate this interpretation. Because stimuli were presented when the waxing and waning ongoing alpha activity exceeded a certain threshold value, stimuli were “hitting” the brain always when it was moving into a more inhibitory state (note the sharp ERS in Figure 1C, preceding stimulation), thereby triggering a release of inhibition, moving the brain into a more excitatory state. Findings from memory scanning tasks (cf. Figure 1D) are another good example to illustrate the role of ERS and ERD for inhibition and disinhibition respectively. For this interpretation it is important to consider the typical task demands during encoding and retrieval. Particularly if the items of the memory set vary between trials (as was the case in Klimesch et al., 1999), subjects have to make sure to block retrieval of items from previous trials in order to avoid interference with the encoding and retention of new items. The assumption is that alpha ERS reflects the inhibitory top-down control to block retrieval of items from previous trials during the perception and encoding of new items. During the presentation of the probe item, however, subjects focus on the memory traces that were established during encoding. Finally, the finding that targets elicit ERD whereas non-targets elicit ERS in an oddball task is also well in line with the idea that ERD reflects task relevant, and ERS task-irrelevant processes.
On a more specific level of interpretation, we have to ask whether ERD and ERS reflect a wide variety of different or a certain type of specific processes. The fact that ERD and ERS are not obligatory responses to a stimulus suggests that event-related alpha power changes reflect a processing mode and not a stimulus specific response. This processing mode controls the encoding of a stimulus in a task specific way. As an example, if a stimulus has to be recognized, alpha responds with ERD. If the encoding of a stimulus (or certain aspects associated with this stimulus) must be suppressed, alpha will respond with ERS.
ERD and the Retrieval of Memory Traces
Event-related desynchronization may also be observed without the presentation of a stimulus, e.g., in a problem solving task (with closed eyes), when information must be retrieved from memory. During the processing of a stimulus, ERD may reflect retrieval of information from memory that is used for encoding. In contrast, ERS is associated with processes that suppress access to and retrieval from memory (cf. Klimesch et al., 2007b, 2008). If this is a valid interpretation, we have to assume that alpha is closely associated with memory performance. This is indeed the case as a variety of studies (reviewed in Klimesch, 1997, 1999) have shown. Good memory performance is characterized by a large ERS during a period preceding a task but by a large ERD during actual task performance. We have also shown (cf. Klimesch et al., 2003) that brief bursts of rTMS at individual alpha frequency, applied in a period preceding a task, facilitate task performance and at the same time increase pre-stimulus alpha power but decrease post-stimulus power (i.e., enhance ERS and ERD). This finding suggests that the functional meaning of alpha for memory can be considered causal and not only correlative. The increase in amplitudes preceding a memory task was interpreted to reflect a transient blocking of access to memory in order to enhance a focused retrieval during task performance. In other words, a pre-task inhibition avoids interference with the task-related activation process in memory.
It is very important to emphasize that the suggested meaning of ERD/ERS is also specific with respect to the type of memory. A variety of findings indicate that alpha band power changes are particularly sensitive to information accessed in and retrieved from long-term memory (LTM; for a review see, e.g., Klimesch, 1999; Khader and Rösler, 2009; Meeuwissen et al., 2011). In contrast, theta band power changes are particularly sensitive to the encoding and retrieval of new information. In standard memory tasks (such as encoding and recognition tasks), alpha ERD and theta ERS develop in about the same time window after the presentation of a memory relevant item. The interpretation of this finding is that the encoding or retrieval of new information (which is reflected by theta ERS) is associated with a decreasing influence of alpha (which is reflected by alpha ERD).
Dissociating sensory encoding from memory retrieval
If the onset of ERD signals the onset of retrieval, it should be possible to show that sensory encoding precedes ERD. In an attempt to dissociate these two processes, we designed a task in which stimuli were presented with progressively decreasing levels of distortion. In this study by Freunberger et al. (2008b), a pool of 150 pictures of familiar objects was used. In addition, 50 pictures were used as control items that were obtained by smearing real objects. Objects and control items were presented randomly. A trial consisted of a series of four pictures that were presented in four different levels of distortion (for an illustration see Figure 2A) starting with maximal distortion (level 4) down to minimal distortion (level 1). Pretesting ensured that objects could be reliably recognized within level 2. The task was designed to put a subject in a “top-down control mode” characterized by the expectancy to recognize an object (with a comparatively high probability of 66%) at a certain level of degradation (i.e., at level 2, as determined by pretesting).
Figure 2. Findings from an object recognition task show that the onset of ERD is not related to the onset of stimulation. Data are from Freunberger et al. (2008b). (A) Example of stimuli (upper panel). (B) A large ERD (indicated by an arrow) develops during the time window, in which an object is recognized. The upper panel shows time–frequency power plots for object- and control-items. The lower panel shows absolute power values for the lower and upper alpha band (Reprinted with permission).
The result, which is crucial for our hypothesis, is depicted in Figure 2B and shows that the onset of ERD is not associated with the presentation of the first stimulus (i.e., the “level-4” picture) but can be observed in a time window during the presentation of the “level-2” picture that is associated with the recognition of a real picture. Most importantly, the onset of ERD is clearly distinct from the onset of sensory processes that are induced already by the “level-1” item about 1500 ms earlier. This is empirical evidence for the hypothesis that ERD does not reflect sensory processes but rather the access to meaningful information, stored in LTM. Thus, we could replicate earlier findings (Klimesch et al., 1994; Vanni et al., 1997; Mima et al., 2001) showing that the extent of upper alpha ERD is specifically related to the processing of semantic information and not, e.g., to general task demands such as task difficulty. It should also be noted that the time course of ERD is different for the lower and upper alpha band. The former exhibits a tendency to desynchronize earlier. This may reflect the general expectancy to recognize an item because objects appeared with higher likelihood (66%) as compared to control items (33%). It is important to emphasize that only upper alpha exhibits a distinct ERD in response to object recognition.
Event-related desynchronization is not the only alpha response that is associated with successful retrieval. During the time window of ERD, we observed an increase in upper alpha phase synchronization for long-distance anterior–posterior electrode pairs. This synchronization pattern was significantly larger during the recognition of real objects as compared to control items. It is well accepted that memory must be understood as an extremely distributed system with long-term memory located primarily in posterior cortices and with access processes controlled via prefrontal regions. Thus, a distributed frontal–posterior alpha network most likely reflects the interplay between these two brain regions during retrieval from memory.
Another important finding is that the typical P1 of the event-related potential (ERP) – which usually can be observed in response to a visual stimulus – is absent for level 4 stimuli. In a similar way as ERD, the P1 emerges for level 2 stimuli. This finding is remarkable because it suggests that the onset of ERD and the appearance of a P1 may reflect the beginning of a recognition process which can be understood as the start of an access process to memory.
Alpha Phase and Encoding
As we have shown in the previous section, in memory tasks, ERD has a comparatively late onset. The question we ask in this section is whether alpha shows an early and immediate response to the presentation of a stimulus that may have a specific role for encoding which is distinct from ERD/ERS. We argue here that alpha exhibits at least three different but related phase responses during early stages of encoding: phase locking, phase alignment, and traveling waves. We also try to show that these alpha phase responses are closely associated with – or underlie the generation of – the P1 component.
Alpha Phase Locking, Phase Alignment, and the P1
There is good evidence that early ERP components – including the P1 – have a frequency characteristic with a strong focus on alpha. In a study by Gruber et al. (2005), we investigated whether early ERP components can be described by a phase alignment between frequencies that lie in the broad alpha range of about 7–13 Hz. This question was motivated by findings showing that oscillations in the theta and alpha frequency range exhibit a significant phase locking in response to the presentation of a stimulus (e.g., Jansen and Brandt, 1991; Tallon-Baudry et al., 1996; Brandt, 1997; Tesche and Karhu, 2000; Makeig et al., 2002; Schack and Klimesch, 2002; Klimesch et al., 2004). The idea here is that oscillations with different frequencies contribute to the generation of an ERP, and that these different oscillations must become synchronized (i.e., aligned) in absolute phase (within a certain time window post-stimulus) in order to generate components such as, e.g., the P1 or N1. If this were not the case, evoked oscillations with different frequencies might largely cancel each other due to a lack of alignment in absolute phase.
For the calculation of phase alignment between frequencies Gruber et al. (2005) developed a method, which was termed “instantaneous phase alignment” (IPA). The calculation – which was done for data of a memory retrieval task – is based on the following analysis steps. First, those frequencies were determined that showed a significant phase locking (as calculated by the phase locking index, PLI). The average PLI for a 120-ms pre-stimulus period (ranging from −840 to −720 ms pre-stimulus) for each frequency bin was computed. From these data confidence intervals were calculated. Only those frequencies and time periods post-stimulus were used for further analyses, in which the PLI exceeded the upper confidence limit (cf. Figure 3C). Second, based on these data, the phase angle was calculated for each frequency and time point (cf. Figure 3E). Note that at around 100 ms a vertical red band reflecting a positive peak followed by a blue band (reflecting a negative peak at about 150 ms) can be observed. By using circular statistics, frequencies with significant phase alignment were determined. The results are depicted in Figure 3D and demonstrate significant phase alignments at certain frequencies and latencies.
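The first analysis step can be sketched in a few lines. The code below computes a phase locking index as the length of the mean resultant vector of single-trial phases and compares a post-stimulus value with an upper limit derived from baseline values. It is a sketch under stated assumptions, not the procedure of Gruber et al. (2005): the phases are simulated, the function names are hypothetical, and the limit (baseline mean plus 1.96 standard deviations) is only one plausible way to form a confidence limit.

```python
import cmath
import math
import random
from statistics import mean, stdev


def phase_locking_index(phases):
    """Length of the mean resultant vector of single-trial phases (radians).

    Values near 0 indicate random phases across trials, values near 1 identical phases.
    """
    return abs(sum(cmath.exp(1j * p) for p in phases) / len(phases))


def upper_confidence_limit(baseline_pli, z=1.96):
    """Upper limit from baseline PLI values (assumed criterion: mean + z * SD)."""
    return mean(baseline_pli) + z * stdev(baseline_pli)


random.seed(0)
n_trials = 200

# Simulated pre-stimulus baseline: at each of 30 time points the phases
# are uniformly random across trials, so the PLI stays close to zero.
baseline_pli = [
    phase_locking_index([random.uniform(-math.pi, math.pi) for _ in range(n_trials)])
    for _ in range(30)
]
limit = upper_confidence_limit(baseline_pli)

# Simulated post-stimulus time point: phases cluster around 0 rad.
locked_phases = [random.gauss(0.0, 0.6) for _ in range(n_trials)]
post_pli = phase_locking_index(locked_phases)

print(f"upper limit = {limit:.3f}, post-stimulus PLI = {post_pli:.3f}")
print("significant phase locking:", post_pli > limit)
```

The same resultant-vector measure can in principle be applied across frequencies at a single time point instead of across trials, which is the kind of circular statistic that an alignment measure such as the IPA relies on.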
Figure 3. Instantaneous phase alignment (IPA) as analyzed by Gruber et al. (2005) for the data of a memory retrieval task. (A) Example of the ERP with the P1-component. (B) The frequency characteristic of the ERP is reflected by a measure termed evoked power. It represents the filtered ERP. (C) Significant phase locking can be observed in a broad frequency range. (D) Significant IPA can be observed particularly for the time window of the P1 and N1. The white line represents the ERP. For the P1, phase alignment is around the positive peak, for the N1 it is most pronounced for the positive going slope beyond the negative peak. Note that the IPA is significant only for the broad alpha frequency range. (E) Time–frequency representation of absolute phase. Note the vertical red and blue bands (around 100–200 ms post-stimulus) representing the positive and negative peak of aligned frequencies. These bands correspond to the positive and negative peaks of the ERP, i.e., to the P1 and N1 respectively (Reprinted with permission).
The findings, thus, indicate that a significant IPA could be observed particularly within the alpha and beta 1 band. Furthermore, absolute phase exhibits a significant alignment (indicated by straight vertical red and blue bands within the white rectangle) particularly in the time window of the P1–N1 complex (cf. Figures 3A,B for an illustration). It should also be noted that pronounced alpha activity can be observed in the upper alpha frequency range (between 10 and 13 Hz) during the pre-stimulus period. Despite a rather continuous decrease in power (ERD) starting already pre-stimulus and reaching a maximum at around 250 ms post-stimulus (cf. Figure 1G), alpha PLI and IPA reach significance during a time window centered around about 100 ms post-stimulus (cf. Figures 1D,E). These findings document that alpha phase reorganization is strongly associated with the generation of the P1 component.
The conclusion, thus, is that the P1 is generated at least in part by alpha oscillations. We want to emphasize that this assumption does not necessarily depend on phase reset. The controversy between the evoked and phase reset model for the generation of early ERP components has unnecessarily narrowed the potential influence of oscillations on ERPs by considering only one highly specific mechanism, namely phase reset. There are mechanisms other than phase reset that may have an important influence on the generation of ERPs (for a discussion see Klimesch et al., 2007c). It should also be emphasized that even in a case where alpha would be the only driving force for the generation of the P1, its amplitude may very well be influenced by stimulus evoked processes. On the other hand, however, as we have argued, the P1 cannot be considered to be solely generated by an evoked response to a stimulus (Klimesch et al., 2007c).
The P1 and Traveling Alpha Waves
It is well documented that the P1 exhibits topographical latency differences that are task dependent (e.g., Taylor et al., 2001). Most interestingly, recent evidence suggests that these latency differences can be interpreted as traveling alpha waves, as revealed by an analysis of topographical alpha phase differences observed in a Stroop task (Klimesch et al., 2007a). In this task subjects had to respond only to the color and to ignore the meaning of the presented words. Compared to P8, O2, and O1 (with P1 latencies of 108, 112, and 113 ms respectively), the P1 appeared considerably delayed – with a latency of 138 ms – at Pz. The analysis of traveling speed is based on the general idea that a systematic phase spread implies a certain propagation direction for each single trial. Thus, we first calculated instantaneous phase for each time point and frequency bin (between 2 and 20 Hz; width: 0.5 Hz) for each single trial and subject within a time window of ±1000 ms (with respect to stimulus onset). Then, cumulative phase was calculated for each trial. Based on these data, relative phase (i.e., phase difference) was determined for eight selected electrodes (P3, P4, P7, P8, PO3, PO4, O1, and O2) with respect to Pz as trailing site. The phase differences were transformed to latency differences in milliseconds. Finally, the distance (in millimeters) between each selected electrode site and Pz (as reference site) was divided by the respective latency difference to obtain travel speed in meters per second (m/s; cf. Figure 4A). These data were then averaged over the eight selected electrode pairs to obtain an estimate of traveling speed for each single trial. As illustrated in Figure 4B, depending on the direction, traveling speed is characterized by positive or negative values. Positive values indicate a posterior to anterior direction whereas negative values indicate an anterior to posterior direction.
If a systematic traveling direction is lacking, positive and negative values tend to cancel each other. On the other hand, in the case of a systematic traveling direction, averaging removes unsystematic noise and gives an estimate of travel speed. Travel speed was calculated for each single trial and was then averaged for each subject.
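The conversion from phase difference to travel speed described above can be sketched as follows. A phase difference Δφ (in radians) at frequency f corresponds to a latency difference of Δφ / (2πf), and dividing the electrode distance by this latency gives the speed. The function names, the electrode distances, the phase differences, and the sign convention (a positive phase difference means that Pz trails the other electrode) are assumptions for illustration and are not values from Klimesch et al. (2007a).

```python
import math


def phase_to_latency_ms(phase_diff_rad, freq_hz):
    """Convert a phase difference (radians) at a given frequency to a latency difference (ms)."""
    return 1000.0 * phase_diff_rad / (2.0 * math.pi * freq_hz)


def travel_speed(distance_mm, latency_diff_ms):
    """Speed in m/s; mm per ms is numerically equal to m per s."""
    if latency_diff_ms == 0:
        raise ValueError("zero latency difference: speed is undefined")
    return distance_mm / latency_diff_ms


def trial_speed(electrode_pairs, freq_hz):
    """Average speed over (distance_mm, phase_diff_rad) pairs for one trial."""
    speeds = [
        travel_speed(dist, phase_to_latency_ms(phase, freq_hz))
        for dist, phase in electrode_pairs
    ]
    return sum(speeds) / len(speeds)


FREQ_HZ = 10.0  # assumed alpha frequency

# Hypothetical trials: (distance to Pz in mm, phase difference in rad).
forward_trial = [(75.0, math.pi / 2), (90.0, 0.6 * math.pi)]     # Pz trails
backward_trial = [(75.0, -math.pi / 2), (90.0, -0.6 * math.pi)]  # Pz leads

forward = trial_speed(forward_trial, FREQ_HZ)    # positive: posterior to anterior
backward = trial_speed(backward_trial, FREQ_HZ)  # negative: anterior to posterior

# Averaging over trials: a consistent direction survives, mixed directions cancel.
consistent_mean = (forward + forward) / 2
mixed_mean = (forward + backward) / 2

print(f"forward = {forward:.2f} m/s, backward = {backward:.2f} m/s")
print(f"consistent mean = {consistent_mean:.2f} m/s, mixed mean = {mixed_mean:.2f} m/s")
```

At 10 Hz a phase difference of π/2 corresponds to 25 ms, so a distance of 75 mm yields 3 m/s in this made-up example.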
Figure 4. P1 latency differences can be described in terms of a traveling alpha wave. Data are from a Stroop task analyzed in Klimesch et al. (2007a). (A) A systematic and consistent travel speed was observed only for the extended alpha band and during the time window of the P1–N1 complex at around 0–200 ms post-stimulus. (B) The direction of the traveling wave can be determined by visual inspection of ERPs (Reprinted with permission).
The findings demonstrate that a systematic (posterior to anterior) traveling pattern appeared only in the broad time window of the P1–N1 complex and in the extended alpha frequency range. The observed traveling speed of about 3 m/s was calculated with respect to electrode distances on the scalp. In order to obtain an estimate for the cortical surface, a folding factor of about two must be considered (cf. Nunez et al., 2001), which means that actual traveling speed is around 6 m/s. This estimate is close to a value obtained by Burkitt et al. (2000), who analyzed phases of steady-state visual-evoked potentials and observed evoked traveling waves with a velocity ranging from 7 to 11 m/s. Furthermore, it is important to note that not only the P1, but also the N1 and an early negative component behave as traveling waves. The frequency characteristics of all three of these components lie in the alpha range. Thus, the P1 can be considered part of an evoked traveling alpha wave complex which emerges in a time window of about 50–250 ms (cf. Figure 4B).
The phenomenon of traveling alpha waves is not a new finding. It already was reported early in EEG research (e.g., Adrian and Yamagiwa, 1935; Adrian, 1936; Petsche and Marko, 1955) and is in line with physiological investigations indicating that from pulsating neurons periodic waves of activity spread over the cortex (Adrian and Matthews, 1934; Nadasdy, 2010). Traveling waves are also well documented for the ongoing EEG (Hughes, 1995; Nunez and Silberstein, 2000; Nunez et al., 2001) but with a few exceptions, they have not been investigated in the context of evoked activity. As an example, Lilly and Cherry (1954) found that auditory evoked potentials in cats spread in different directions from the focus of activity initiated by acoustical stimulation.
Finally, it should be emphasized that pre- and peri-stimulus alpha phase is related to perceptual performance (e.g., Hanslmayr et al., 2007; Busch et al., 2009; Mathewson et al., 2009). As an example, Hanslmayr et al. (2007) found that decreased pre-stimulus alpha phase coherence is associated with good performance in a perception (target detection) task. Furthermore, there is some evidence that alpha:gamma phase coupling may reflect a perception related activation of cortical networks that is driven by alpha (Voytek et al., 2010).
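To make the notion of phase coherence concrete, the following minimal sketch computes a phase-locking value between two channels across trials. This is one common way of quantifying phase coupling and is not necessarily the measure used in the studies cited above; the function name and the example phases are hypothetical.

```python
import cmath
import math


def phase_locking_value(phases_a, phases_b):
    """Return |mean(exp(i * (phase_a - phase_b)))| over trials.

    The value is 1.0 when the phase difference is identical on every
    trial and approaches 0.0 when the differences are spread evenly
    around the circle.
    """
    n = len(phases_a)
    total = sum(cmath.exp(1j * (a - b)) for a, b in zip(phases_a, phases_b))
    return abs(total) / n


# Made-up alpha phases (radians) at one pre-stimulus time point, four trials.
channel_a = [0.0, 1.0, 2.0, 3.0]

# Channel B always lags channel A by 0.5 rad: perfect coupling.
coupled_b = [a - 0.5 for a in channel_a]

# Phase differences of 0, pi/2, pi and 3*pi/2: no consistent relation.
uncoupled_b = [a - k * math.pi / 2 for k, a in enumerate(channel_a)]

plv_coupled = phase_locking_value(channel_a, coupled_b)
plv_uncoupled = phase_locking_value(channel_a, uncoupled_b)
print(plv_coupled, plv_uncoupled)
```

In practice the phases would first have to be extracted from band-pass filtered data (for example with a wavelet or Hilbert transform), a step that is omitted here.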
The Functional Role of Alpha Phase and the P1 Component for Encoding
In the preceding section, we have argued that the P1 is modulated (or possibly is generated) by alpha oscillations. As a consequence, the functional (physiological and cognitive) meaning of the P1 component should be similar to that of alpha oscillations. In a recent theoretical work (Klimesch, accepted) this idea was developed in detail by applying the assumptions of the inhibition–timing hypothesis (which was suggested for the interpretation of the physiological and cognitive function of alpha, Klimesch et al., 2007b) to the interpretation of the P1 component. According to the inhibition–timing hypothesis, alpha is considered an inhibitory oscillation (see also Jensen and Mazaheri, 2010). If the amplitude of an inhibitory oscillation is increased, the time window in which action potentials (APs) are elicited in target cells becomes increasingly smaller. Thus, with an increase in the amplitude of an inhibitory oscillation, the timing of excitatory activity becomes more precise. With respect to the cognitive function we have suggested that alpha reflects an inhibitory control process that enables selective access to meaningful information stored in the KS. The function of inhibition is seen in a process that blocks access to task-irrelevant information. Considering the huge amount of information stored in memory, blocking access to irrelevant neural structures may be considered an important aspect for the encoding of information. The basic idea here is that during a time window of up to 100 ms post-stimulus sensory information is processed that allows for an early categorization of the encoded stimulus. According to our considerations, the P1 and the alpha phase response at around 100 ms should be functionally related to early stimulus categorization and to the inhibition of irrelevant information. We focus here only on a few examples and refer to Klimesch (accepted) for a detailed review of this issue.
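The relation between oscillation amplitude and the width of the firing window can be illustrated with a deliberately simple toy model. It rests on assumptions that go beyond the text: inhibition is modeled as a cosine that swings between zero and a peak amplitude, target cells receive a constant excitation level, and firing is possible only while excitation exceeds inhibition. The function name and all values are hypothetical.

```python
import math


def firing_window_fraction(amplitude, excitation):
    """Fraction of one cycle in which excitation exceeds inhibition.

    Toy assumption: inhibition(phase) = amplitude * (1 + cos(phase)) / 2,
    i.e. it oscillates between 0 and `amplitude`, while `excitation`
    stays constant over the cycle.
    """
    if excitation <= 0:
        return 0.0  # no excitatory drive, no firing window
    if amplitude <= excitation:
        return 1.0  # inhibition never exceeds excitation
    # Firing is possible while cos(phase) < c.
    c = 2.0 * excitation / amplitude - 1.0
    return 1.0 - math.acos(c) / math.pi


excitation_level = 1.0
for amp in (1.0, 2.0, 4.0, 8.0):
    fraction = firing_window_fraction(amp, excitation_level)
    # At 10 Hz one cycle lasts 100 ms, so the fraction maps onto milliseconds.
    print(f"amplitude {amp:>4}: window = {fraction:.2f} cycle "
          f"(about {fraction * 100:.0f} ms at 10 Hz)")
```

Under these assumptions the window shrinks monotonically as the amplitude grows, which is the qualitative behavior that the inhibition–timing hypothesis describes: larger inhibitory amplitudes confine firing to a narrower phase range.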
Early categorization may be based on quite different types of stimulus information, comprising, e.g., the spatial location of an object, global stimulus features and, most importantly, the category-meaning of an object. There is good evidence that all of these types of information affect the size and/or latency of the P1 component. As an example, it is well established that stimuli flashed at attended locations elicit a larger P1 than stimuli flashed at unattended locations (for reviews cf. Hillyard and Anllo-Vento, 1998; Hillyard et al., 1998; Mangun, 2003). Object-based attention was also demonstrated in a variety of studies. As an example, by controlling for spatial attention effects, Handy et al. (2003) found P1 amplitude differences that were related to the category-meaning of objects. One of the most robust findings is that scrambled and/or inverted faces (as compared to upright faces) elicit a larger P1 (e.g., Linkenkaer-Hansen et al., 1998; Allison et al., 1999; Sagiv and Bentin, 2001; Itier and Taylor, 2004) that in addition tends to be longer in latency (Linkenkaer-Hansen et al., 1998; Sagiv and Bentin, 2001; Taylor et al., 2001).
Finally, it should also be mentioned that there is direct empirical evidence for the view that the P1 – in a very similar way as alpha – reflects an inhibitory process. As an example, let us consider the findings by Freunberger et al. (2008b; Experiment 2) who used a semantic (living/non-living) categorization task with meaningful and meaningless pictures. Meaningful pictures represented living and non-living objects. Meaningless pictures were obtained by distorting pictures of living and non-living objects. We assumed that the P1 would be larger for distorted pictures because they can be considered task-irrelevant with respect to semantic categorization. Distorted pictures (with no semantic meaning) may very early (on the basis of their sensory features) be categorized as semantically meaningless, which allows suppression of irrelevant “spreading activation processes” aiming at identifying the stimulus. The findings show that the P1 for meaningless pictures is delayed and significantly larger than for the “task- or processing-relevant” pictures denoting living and non-living objects. This and related findings (cf. Freunberger et al., 2009) suggest that the P1 reflects an inhibitory process that directs the flow of information to task-relevant neuronal structures by inhibiting the activation of task-irrelevant structures.
Alpha and the Flow of Information in the Brain: Physiological Considerations
Animal studies, in particular, have documented that the likelihood for the generation of action potentials (APs) is increased during the excitatory phases of oscillatory activity (e.g., Bollimunta et al., 2008; Buzsaki, 2006; Fries et al., 2001a,b; Fries, 2005; Lakatos et al., 2005). Bollimunta et al. (2008) is (to our knowledge) the first study that documented a similar function for alpha in a cognitive task. The authors analyzed the laminar distribution of cortical alpha oscillations and showed that under certain conditions, cortical alpha oscillations are indeed capable of modulating the generation of APs as measured by multiunit activity (MUA). Using intracortical recordings in macaques performing an auditory oddball task, Bollimunta et al. (2008) analyzed the relationship between current source density (CSD) and MUA at two occipital (V2 and V4) and one inferotemporal (IT) recording site. Alpha activity was observed at all recording sites, but the laminar distribution of the generators of alpha oscillations and their impact on MUA was different for supragranular and infragranular layers in the visual and inferotemporal cortex. In visual cortices (V2 and V4) a significant CSD–MUA coherence was observed that showed up only in the alpha frequency range and in infragranular layers. In supragranular layers, alpha activity had no impact on MUA. In the inferotemporal cortex, alpha activity and CSD–MUA coherence were most pronounced in supragranular layers. Most interestingly, a close association between the power (an estimate of amplitude size) of alpha and reaction time (RT) – in response to an auditory target stimulus in the oddball task – was observed that was different in direction in the visual and inferotemporal cortex. In V2 and V4 a negative correlation was observed (short RTs were associated with large power), but a positive correlation was obtained in the temporal cortex (here short RTs were associated with small power).
What is the functional meaning of these findings? We proceed here from the assumption that the reported results can be interpreted on the basis of the “inhibition–timing” hypothesis (Klimesch et al., 2007b) which was derived from findings of the human scalp EEG: Large amplitude alpha activity reflects the inhibitory influence of top-down control during the performance of a task. In an auditory oddball task, visual cortices must not be considered task relevant brain regions. Nonetheless, intersensory interconnectivity provides input via thalamocortical and possibly also cortical pathways. According to our hypothesis we assume that in the visual cortex infragranular alpha activity reflects top-down control to inhibit further processing of intersensory activity which is irrelevant for the performance of the auditory oddball task. The following facts support this interpretation: (i) Analyzing the directionality of laminar alpha activity by Granger causality revealed that granular and infragranular alpha drives supragranular alpha. (ii) Supragranular alpha exhibits a lack of CSD–MUA coherence that most likely is due to inhibitory dampening (Schröder et al., 1995; Bollimunta et al., 2008). As a consequence, the spread of neural activation via cortico-cortical connections to other parts of the visual cortex is suppressed. (iii) Large alpha power is associated with good performance (i.e., short RT’s), probably indicating that the extent of inhibiting task-irrelevant processes improves performance. The interpretation of the findings from the inferotemporal cortex is more difficult because it is less clear in which way this cortical region is involved in processing an auditory oddball task. As a “higher” association cortex, it may be co-involved in integrative processes, such as monitoring task performance, or establishing associations between target and standard stimuli. 
Importantly, these integrative processes are associated more specifically with pyramidal cells in layer 3, because their intracortical connectivity reflects a particularly high sampling capacity as can be judged from the size of their receptive fields. In addition, they show a high ability to generate spontaneous activity. These integrative processes are very likely under top-down control (as reflected by a high CSD–MUA coherence), but the important point here is that top-down control would not enhance task performance. The reason is that in an oddball task, target and standard stimuli are presented in a random sequence, so that integrative strategies (and their top-down control) such as looking for associations between stimuli and making specific predictions about the appearance of a target stimulus are not helpful. These considerations could explain why in the inferotemporal cortex short RTs (reflecting good performance) are associated with small alpha power (reflecting reduced top-down control).
These physiological findings about alpha document several important facts. They show that alpha phase is indeed capable of modulating neural firing, but they also show that this ability depends on specific preconditions. We assume that a lack of modulation reflects inhibition, whereas evidence of modulation reflects controlled timing of neuronal firing. Most importantly, inhibition appears to be related to increased alpha power, whereas active information processing is associated with decreased power. This interpretation is supported by recent findings of Bollimunta et al. (2011) who have found (in a very similar experimental design as that used in Bollimunta et al., 2008) that auditory information elicits larger alpha amplitudes in the visual cortex than visual information does.
We have tried to show that alpha reflects the controlled access to meaningful information (stored in the KS) by inhibiting task-irrelevant neuronal structures or brain areas. With respect to perception, we have argued that alpha controls those stages of encoding that are related to the processing of meaningful information. These stages are early categorization (which is reflected primarily by a phase response of alpha at around 100 ms) and retrieval of information from the KS (which is reflected primarily by ERD). In other words, the early phase response reflects access to the KS, and ERD reflects retrieval from it. With respect to physiology the assumption is that (synchronized) alpha as measured by ERS reflects inhibition. Thus, ERD reflects a gradual release from inhibition which is important to allow retrieval of information stored in the KS. Access is initiated by the perceptual analysis of more global features that make it possible to establish “access routes” to the KS (see also Klimesch et al., 2007b,c; Klimesch, accepted).
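For readers less familiar with the ERD/ERS measure, the following minimal sketch shows the conventional percentage formula, in which band power in a test interval is expressed relative to band power in a pre-stimulus reference interval. The function name and the power values are hypothetical, and the sign convention (positive values for desynchronization) is an assumption that should be checked against the convention of any particular study.

```python
def erd_percent(reference_power, test_power):
    """Percentage change of band power relative to a reference interval.

    Assumed convention: positive values indicate ERD (power decrease),
    negative values indicate ERS (power increase).
    """
    if reference_power <= 0:
        raise ValueError("reference power must be positive")
    return (reference_power - test_power) / reference_power * 100.0


# Made-up alpha band power values in arbitrary units.
reference = 10.0          # pre-stimulus reference interval
post_stimulus_low = 6.0   # power drops after the stimulus
post_stimulus_high = 12.5 # power rises after the stimulus

erd_value = erd_percent(reference, post_stimulus_low)
ers_value = erd_percent(reference, post_stimulus_high)
print(f"ERD: {erd_value:.0f}%  ERS: {ers_value:.0f}%")
```

Because the reference power appears in the denominator, the same absolute change in power yields a different percentage for different pre-stimulus levels, which is one reason why pre-stimulus alpha power matters when ERD and ERS are compared across conditions or subjects.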
An essential aspect of the proposed interpretation is based on the assumption that alpha reflects a processing mode and that all different types of (event-related) alpha responses (phase responses such as phase locking, and/or amplitude changes such as ERD/ERS) appear “embedded” in a longer lasting and ongoing modulation of alpha. This assumption is well supported by a rich body of evidence showing, e.g., that the extent of ERD and ERS depends on the magnitude of pre-stimulus alpha power (for a review see Klimesch, 1999). The ongoing and task-related nature of alpha is also nicely demonstrated by data reported by Bäuml et al. (2008). In a directed forgetting paradigm, Bäuml et al. (2008) investigated the possible role of unbinding (a decrease in phase coupling) for forgetting. Subjects had to learn two lists of semantically unrelated words in two experimental conditions (remember vs. forget) which differed with respect to a cue presented between list 1 and list 2. In the remember condition, subjects were instructed to continue to remember the words of list 1, whereas in the forget condition, participants were told to forget the words of list 1, as they would not be tested later. The interesting finding with respect to forgetting was that during the encoding of list 2 the number of electrode pairs with significant phase coupling varied as a function of the magnitude of forgetting of list 1. Most importantly, the decrease in phase coupling was observed over the entire trial, not only in the post- but in the pre-stimulus interval as well. Thus, this type of alpha response does not reflect an event-related phenomenon; it reflects a processing mode that most likely operates under top-down control, because forgetting was induced by a cue. The general interpretation is that alpha phase coupling reflects the topographical pattern of top-down processes that control access to and processing of stored memories (Klimesch et al., 2010). In a recent review Hanslmayr et al. (2011) also emphasize the function of alpha in terms of a processing mode. It is argued that a certain configuration of alpha parameters reflects an internally oriented processing mode, whereas a different configuration reflects an externally oriented processing mode.
These considerations are well in line with the general proposal that alpha controls the flow of information in the brain. But the question here is, whether alpha is associated with the processing of any or a specific type of information? Our argument is that alpha reflects processes that are related to already stored and meaningful information. Accordingly, alpha is not (directly) related to the processing of new (episodic) information (for an extensive review, see Klimesch, 1999). We suggest that synchronized alpha reflects a basic processing mode that controls access to any kind of meaningful information that is stored in the KS. This processing mode provides us with the basic ability to be continuously semantically orientated in our environment with respect to all kinds of meaningful information that represent our knowledge of that environment. Within the visual processing domain, the perception and transient representation of objects and their locations allows us to be continuously oriented in space and time. Early categorization establishes “access routes” to information stored in the KS and precedes stimulus recognition. These processes that control the flow of information into (the KS of) the brain establish transient mental representations but are not (directly) involved in the encoding of new (episodic) information. This distinction is important because access to the KS is considered a continuous process that may modify information stored in this system without creating new episodic memories.
The interpretation of alpha as a processing mode also allows for a simple explanation of why ongoing activity shows the typical waxing and waning of alpha. As suggested by Hanslmayr et al. (2011), periods of increased and decreased alpha activity may reflect periods of internally vs. externally oriented processing states. Or in terms of the interpretation suggested here, the waxing and waning reflects periods of increased “semantic awareness” (with a focus on information stored in the KS) and periods of decreased “semantic awareness” (with a focus on the processing of new information). The findings by Woertz et al. (2004) are a good example to show that even a meaningless light flash may change the internal processing state (reflected by large amplitudes, or ERS) to an externally oriented processing state (reflected by small amplitudes, or ERD).
Finally, let us consider the question whether alpha may be considered a phenomenon that reflects attention. We argue here that an answer depends mainly on a cognitive definition of what attention is. This is a difficult question because there probably is no conscious cognitive process without attention. But in trying to answer this question we refer to two suggestions that we have already proposed above. One is based on the distinction between the processing of (semantic) information stored in the KS and the processing of new (episodic) information. The other is that alpha reflects a certain type of processing mode. Our proposal here is that this processing mode reflects a specific type of attention that is related to the KS. It may operate either in a top-down or bottom-up mode. Visual and temporal attention may be considered top-down processes that specifically control the flow of information into and from the KS. They narrow down the relevant search area in memory and establish “access routes” to information stored in memory. Usually, this process (which we also termed early categorization) will operate under top-down control that is guided by specific expectations. In the absence of expectancy, however, early categorization may operate in a default-like mode that is guided by reflexive attention.
Sustained attention with a focus on the updating of new information is a different type of attentional process, which primarily is associated with theta (and the frontal midline theta in particular, see e.g., Ishii et al., 1999). Thus, our suggestion is that those functional aspects of alpha that reflect a processing mode are related to attention, but to a specific “subtype” that operates in a top-down mode to control processes related to access and retrieval from the KS.
Conflict of Interest Statement
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
This research was supported by the Austrian Science Foundation (FWF Project P21503-B18). Roman Freunberger is supported by the Max Planck Society.
Allison, T., Puce, A., Spencer, D. D., and McCarthy, G. (1999). Electrophysiological studies of human face perception. I: Potentials generated in occipitotemporal cortex by face and non-face stimuli. Cereb. Cortex 9, 415–430.
Cooper, N. R., Croft, R. J., Dominey, S. J. J., Burgess, A. P., and Gruzelier, J. H. (2003). Paradox lost? Exploring the role of alpha oscillations during externally vs. internally directed attention and the implications for idling and inhibition hypotheses. Int. J. Psychophysiol. 47, 65–74.
Freunberger, R., Fellinger, R., Sauseng, P., Gruber, W., and Klimesch, W. (2009). Dissociation between phase-locked and nonphase-locked alpha oscillations in a working memory task. Hum. Brain Mapp. 30, 3417–3425.
Freunberger, R., Höller, Y., Griesmayr, B., Gruber, W., Sauseng, P., and Klimesch, W. (2008a). Functional similarities between the P1 component and alpha oscillations. Eur. J. Neurosci. 27, 2330–2340.
Hanslmayr, S., Aslan, A., Staudigl, T., Klimesch, W., Herrmann, C. S., and Bäuml, K. H. (2007). Prestimulus oscillations predict visual perception performance between and within subjects. Neuroimage 37, 1465–1473.
Herrmann, C. S., Senkowski, D., and Röttger, S. (2004). Phase-locking and amplitude modulations of EEG alpha: two measures reflect different cognitive processes in a working memory task. Exp. Psychol. 51, 311–318.
Hillyard, S. A., Vogel, E. K., and Luck, S. J. (1998). Sensory gain control (amplification) as a mechanism of selective attention: electrophysiological and neuroimaging evidence. Philos. Trans. R. Soc. B Biol. Sci. 353, 1257–1270.
Ishii, R., Shinosaki, K., Ukai, S., Inouye, T., Ishihara, T., Yoshimine, T., Hirabuki, N., Asada, H., Kihara, T., Robinson, S., and Takeda, M. (1999). Medial prefrontal cortex generates frontal midline theta rhythm. Neuroreport 10, 675–679.
Jensen, O., Gelfand, J., Kounios, J., and Lisman, J. E. (2002). Oscillations in the alpha band (9–12 Hz) increase with memory load during retention in a short-term memory task. Cereb. Cortex 12, 877–882.
Khader, P., and Rösler, F. (2009). “Where memories are stored discloses how memories are formed – content specificity of long-term memory representations,” in Neuroimaging of Human Memory – Linking Cognitive Processes to Neural Systems, eds F. Rösler, C. Ranganath, B. Röder, and R. Kluwe (New York: Oxford University Press).
Klimesch, W., Doppelmayr, M., Rohm, D., Pollhuber, D., and Stadler, W. (2000). Simultaneous desynchronization and synchronization of different alpha responses in the human electroencephalograph: a neglected paradox? Neurosci. Lett. 284, 97–100.
Klimesch, W., Russegger, H., Doppelmayr, M., and Pachinger, T. (1998b). A method for the calculation of induced band power: Implications for the significance of brain oscillations. Electroencephalogr. Clin. Neurophysiol. 108, 123–130.
Klimesch, W., Freunberger, R., Sauseng, P., and Gruber, W. (2008). A short review of slow phase synchronization and memory: evidence for control processes in different memory systems? Brain Res. 1235, 31–44.
Klimesch, W., Sauseng, P., and Gerloff, C. (2003). Enhancing cognitive performance with repetitive transcranial magnetic stimulation at human individual alpha frequency. Eur. J. Neurosci. 17, 1129–1133.
Klimesch, W., Schack, B., Schabus, M., Doppelmayr, M., Gruber, W., and Sauseng, P. (2004). Phase-locked alpha and theta oscillations generate the P1-N1 complex and are related to memory performance. Brain Res. Cogn. Brain Res. 19, 302–316.
Lakatos, P., Shah, A. S., Knuth, K. H., Ulbert, I., Karmos, G., and Schroeder, C. E. (2005). An oscillatory hierarchy controlling neuronal excitability and stimulus processing in the auditory cortex. J. Neurophysiol. 94, 1904–1911.
Linkenkaer-Hansen, K., Palva, J. M., Sams, M., Hietanen, J. K., Aronen, H. J., and Ilmoniemi, R. J. (1998). Face-selective processing in human extrastriate cortex around 120 ms after stimulus onset revealed by magneto- and electroencephalography. Neurosci. Lett. 253, 147–150.
Meeuwissen, E. B., Takashima, A., Fernández, G., and Jensen, O. (2011). Increase in posterior alpha activity during rehearsal predicts successful long-term memory formation of word sequences. Hum. Brain Mapp. 32, doi: 10.1002/hbm.21167. [Epub ahead of print].
Nunez, P. L., Wingeier, B. M., and Silberstein, R. B. (2001). Spatial-temporal structures of human alpha rhythms: theory, microcurrent sources, multiscale measurements, and global binding of local networks. Hum. Brain Mapp. 13, 125–164.
Sauseng, P., Klimesch, W., Doppelmayr, M., Pecherstorfer, T., Freunberger, R., and Hanslmayr, S. (2005). EEG alpha synchronization and functional coupling during top-down processing in a working memory task. Hum. Brain Mapp. 26, 148–155.
Sauseng, P., Klimesch, W., Heise, K. F., Gruber, W. R., Holz, E., Karim, A. A., Glennon, M., Gerloff, C., Birbaumer, N., and Hummel, F. C. (2009). Brain oscillatory substrates of visual short-term memory capacity. Curr. Biol. 19, 1846–1852.
Schröder, C. E., Steinschneider, M., Javitt, D. C., Tenke, C. E., Givre, S. J., Mehta, A. D., Simpson, G. V., Arezzo, J. C., and Vaughan, H. G. Jr. (1995). Localization of ERP generators and identification of underlying neuronal processes. Electroencephalogr. Clin. Neurophysiol. Suppl. 44, 55–75.
Voytek, B., Canolty, R. T., Shestyuk, A., Crone, N., Parvizi, J., and Knight, R. T. (2010). Shifts in gamma phase-amplitude coupling frequency from theta to alpha over posterior cortex during visual tasks. Front. Hum. Neurosci. 4:191. doi: 10.3389/fnhum.2010.00191
Keywords: alpha, oscillations, perception, memory, knowledge system
Citation: Klimesch W, Fellinger R and Freunberger R (2011) Alpha oscillations and early stages of visual encoding. Front. Psychology 2:118. doi: 10.3389/fpsyg.2011.00118
Received: 11 February 2011;
Accepted: 20 May 2011;
Published online: 31 May 2011.
Edited by: Ole Jensen, Radboud University, Netherlands
Reviewed by: Michael X. Cohen, University of Amsterdam, Netherlands
Fernando Henrique Lopes da Silva, University of Amsterdam, Netherlands
Copyright: © 2011 Klimesch, Fellinger and Freunberger. This is an open-access article subject to a non-exclusive license between the authors and Frontiers Media SA, which permits use, distribution and reproduction in other forums, provided the original authors and source are credited and other Frontiers conditions are complied with.
*Correspondence: Wolfgang Klimesch, Department of Physiological Psychology, Institute of Psychology, University of Salzburg, Hellbrunnerstr. 34, A-5020 Salzburg, Austria. e-mail: firstname.lastname@example.org