What Klein’s “Semantic Gradient” Does and Does Not Really Show: Decomposing Stroop Interference into Task and Informational Conflict Components

Levin, Yulia; Tzelgov, Joseph

doi:10.3389/fpsyg.2016.00249

ORIGINAL RESEARCH article

Front. Psychol., 26 February 2016

Sec. Cognition

Volume 7 - 2016 | https://doi.org/10.3389/fpsyg.2016.00249

Yulia Levin¹

Joseph Tzelgov¹,²,³*

¹Automaticity Skill and Consciousness Lab, Department of Psychology, Ben-Gurion University of the Negev, Beer Sheva, Israel
²Department of Brain and Cognitive Sciences, Zlotowski Center for Neuroscience, Ben-Gurion University of the Negev, Beer Sheva, Israel
³Achva Academic College, Arugot, Israel

The present study suggests that the idea that Stroop interference originates from multiple components may gain theoretically from integrating two independent frameworks. The first framework is represented by the well-known notion of “semantic gradient” of interference and the second one is the distinction between two types of conflict – the task and the informational conflict – giving rise to the interference (MacLeod and MacDonald, 2000; Goldfarb and Henik, 2007). The proposed integration led to the conclusion that two (i.e., orthographic and lexical components) of the four theoretically distinct components represent task conflict, and the other two (i.e., indirect and direct informational conflict components) represent informational conflict. The four components were independently estimated in a series of experiments. The results confirmed the contribution of task conflict (estimated by a robust orthographic component) and of informational conflict (estimated by a strong direct informational conflict component) to Stroop interference. However, the performed critical review of the relevant literature (see General Discussion), as well as the results of the experiments reported, showed that the other two components expressing each type of conflict (i.e., the lexical component of task conflict and the indirect informational conflict) were small and unstable. The present analysis refines our knowledge of the origins of Stroop interference by providing evidence that each type of conflict has its major and minor contributions. The implications for cognitive control of an automatic reading process are also discussed.

Introduction

The Stroop task (Stroop, 1935) is rightly considered a landmark cognitive task in the field of automaticity research. In the classic variation, participants are required to name the color of the ink in which a word stimulus is presented. It usually takes more time for participants to name the color when it is incompatible with the meaning of the word (i.e., when the stimulus is incongruent, e.g., the word BLUE written in red ink) than when the meaning of the word is color-unrelated (e.g., the word DOG) or when the stimulus is meaningless (e.g., a letter string such as XXXX). This finding is known as the interference effect, and it is commonly believed to occur because of an incompatibility between the meaning of the word and the color of the ink in which the word is presented. However, as will be explained in more detail below, the observed incompatibility is only the visible part (the “tip of the iceberg”) and should not be confused with the primary origin of the Stroop interference effect. To preview the following discussion, it is our belief that Stroop interference should generally be viewed as a behavioral expression of the fact that stimulus words are being automatically read. Let us briefly discuss the two main points of this proposal, the key role of reading and the automaticity of reading, in turn. The key role of reading in the Stroop task is emphasized by the fact that only by means of reading can the meaning be extracted from a visual lexical symbol (i.e., a word). That is, even if one accepts a somewhat limited view of Stroop interference as representing an incompatibility effect (see further discussion below), it is clear that such incompatibility can only arise if the word stimulus has been read. However, why should the word stimuli be read at all if the required task is to name the color of the ink?
Based on the revised definition of automaticity proposed by Tzelgov (1997) and Perlman and Tzelgov (2006), according to which a process is automatic if it occurs even though it is not required for the successful performance of the task, the words are read automatically. Note that in contrast to early views of automaticity (Posner and Snyder, 1975; Shiffrin and Schneider, 1977; Hasher and Zacks, 1979), the definition proposed by Tzelgov, and by Perlman and Tzelgov, does not rely on other cognitive constructs such as attention or awareness. Instead, it emphasizes the ballistic feature of automatic processes: their inclination to run to full completion once they have been triggered by the stimulus they are highly associated with (Bargh, 1989). Hence, the Stroop situation is unique in that it provides tangible evidence of automaticity of the reading process that can be measured and explored.

Although it has been extensively investigated for almost 80 years, the behavior observed during performance of the Stroop task has yet to be fully understood. Findings of different studies, however, have led to an important recognition that the interference effect has multiple components. Currently, there are at least two different theoretical frameworks indicating multiple origins of the interference effect. Although these frameworks seem at first glance to be conceptually different, we believe their integration, as carried out in the experiments herein, is a crucial step toward our understanding of automaticity of reading in general, and of the Stroop phenomenon in particular. The first of these frameworks, the semantic gradient framework, centers on Klein’s (1964) pioneering work. Klein (1964) argued for the existence of what he called a semantic gradient within the interference effect. The name, however, is somewhat misleading because, as the results of the study showed, the nature of the observed gradient of interference was not exclusively semantic. Klein conducted a between-participants Stroop study, blocked by stimulus category, in which he used four colors as possible responses (red, blue, green, and yellow) and six stimulus categories: nonsense syllables (HJH, EVGJC, BHDR, GSXRQ); rare neutral words (i.e., color-unrelated; SOL, EFT, HELOT, ABJURE); frequent neutral words (PUT, HEART, TAKE, FRIEND); words associatively related to possible responses (LEMON, GRASS, FIRE, SKY); color words representing colors not available as a response (BLACK, GRAY, TAN, PURPLE); and color words representing possible responses (RED, BLUE, GREEN, YELLOW). Klein observed that the magnitude of the obtained interference became gradually stronger as (1) the stimuli became more readable (e.g., nonsense syllables vs. neutral words) and (2) their meaning became more closely related to (response-relevant) colors (e.g., neutral words vs. color-associated words vs. color words). While Klein was the first to discover that different features of stimuli affect the speed of the color-naming response, he did not explicitly propose specific components contributing to Stroop interference. Much later, Sharma and McKenna (1998) conceptualized the semantic gradient obtained by Klein (1964; see also Fox et al., 1971; Dalrymple-Alford, 1972; Majeres, 1974; Regan, 1978; Li and Bosman, 1996) as reflecting the contribution of various components to the interference effect. The strength of the relation between the meaning of a given stimulus and one of the colors in the experiment was assumed to be captured by a “semantic relatedness” component. The fact that readable stimuli interfered more than non-readable ones was attributed by the authors to a “lexical” component, since to be readable, the stimulus had to be represented in the lexical system.¹

In the domain of language processing, it is well documented that at least three processes underlie reading: orthographic, lexical, and semantic encoding (e.g., McClelland and Rumelhart, 1981). Orthographic information about the individual letters is represented at the orthographic level. During orthographic encoding some of these representations are activated, leading to letter identification. Knowledge as to whether these letters do or do not constitute a real word (i.e., word identification) becomes available through lexical encoding. Letter strings that form real words are also represented lexically, and lexical encoding involves activation of these representations after the word has been (visually) presented. In the case of real words, lexical encoding is usually complemented by semantic encoding, during which the meaning of the word is accessed.²

Apparently, the “lexical” and “semantic relatedness” components of Stroop interference proposed by Sharma and McKenna (1998) represent the automaticity of the respective reading sub-stages (i.e., lexical encoding and semantic encoding). Note, however, that this is not to say that neutral (i.e., color-unrelated) words that are used to estimate the “lexical” component are only encoded lexically, whereas color-related words that are used to estimate the “semantic relatedness” component are also encoded semantically. Obviously, reading every real word would result in lexical and semantic encoding. However, it is only possible to disentangle the two, and to expose the relative contribution of the automaticity of semantic encoding to the interference effect, by using incongruent color-related words (see a detailed discussion of this issue in the next two paragraphs). In addition, in Klein’s (1964) study, more interference was also observed for nonsense syllables than for colored asterisks (see also Monsell et al., 2001). Based on this finding, we propose that Stroop interference might also have an orthographic component, reflecting the automatic nature of the initial stage of the reading process, orthographic encoding.

As already mentioned, the semantic gradient framework is not the only one addressing the notion of multiple origins of Stroop interference. MacLeod and MacDonald (2000) as well as Goldfarb and Henik (2007) suggested a different perspective that we will refer to as the two-conflict framework. In this framework two types of conflict contribute to the Stroop interference. Task conflict represents the competition between two possible tasks: the relevant color-naming task and the irrelevant but automatically triggered (by the word stimulus) reading task. The existence of task conflict is supported by neuroimaging and behavioral data. Bench et al. (1993), for example, demonstrated that the anterior cingulate cortex (i.e., ACC) is activated more by incongruent but also congruent color words than by unreadable neutral stimuli (i.e., crosses). Since the ACC is assumed to be involved in conflict monitoring (Carter et al., 1998; Botvinick et al., 2001, 2004), its increased activation by congruent items implies that informationally compatible stimuli may also evoke some kind of conflict. Some researchers noted the ability of various stimuli to trigger the performance of the task they are closely associated with. Rogers and Monsell (1995) and Monsell et al. (2001) argued that lexical stimuli such as words, or word-like stimuli (e.g., pronounceable letter-strings), automatically evoke the reading task. In the Stroop task, such an automatic tendency that characterizes congruent but not neutral stimuli might produce a conflict because the stimuli are read instead of being color-named. That is, an increased ACC activation by congruent items is likely an expression of the (task) conflict caused by the automatically performed irrelevant reading task. However, at the behavioral level, reaction times (RTs) to congruent words are in most cases not slower than RTs to neutrals, although slower RTs are the pattern that one would expect to obtain according to the neuroimaging data.
As suggested by Goldfarb and Henik (2007), task conflict that arises in the congruent condition is usually not exposed by behavioral studies due to a very efficient control that operates quickly to eliminate the task conflict. In their study and in other studies (Kalanthroff et al., 2013a,b; Entel et al., 2014; Kalanthroff and Henik, 2014) that weakened control by various manipulations, slower RTs to congruent than to neutral stimuli emerged, supporting the notion of task conflict. Task conflict has also been demonstrated by studies employing Stroop-like task switching paradigms (Aarts et al., 2009; Steinhauser and Hübner, 2009).

When the meaning of the word is related to a color, informational conflict arises, enhancing the observed interference. In the color-naming task, the informational conflict can only follow the task conflict and cannot exist by itself because to retrieve the meaning of the word, one should initially start reading it (i.e., perform the irrelevant reading task) (see Levin and Tzelgov, 2014, for a detailed analysis of this issue). Task conflict, in contrast, can exist without informational conflict. When the stimulus is readable but color-unrelated, the extraction of its meaning does not produce informational conflict because color-unrelated meaning does not provide conflicting color information. For instance, the word DOG in red ink would produce task interference because it can be read. However, it would not produce informational interference because it does not belong to the conceptual category of colors (i.e., DOG cannot compete with RED for a response). This notion is critical with regard to the use of the Stroop task in the research on automaticity of reading and its controllability because it emphasizes the importance of task conflict as a marker of automaticity of reading. By contrast, informational conflict is an episodic effect stemming from the dimensional overlap between stimuli and responses (e.g., Kornblum et al., 1990; Zhang and Kornblum, 1998; Zhang et al., 1999). Note that the independence of task conflict from dimensional overlap makes it a “pure” measure (i.e., a marker) of automaticity of the reading process. With this notion in mind, let us introduce the integrated framework.

A Proposed Integrated Framework

In our view, the two frameworks refer to the same idea, which can be elaborated further by their integration. Notably, the integration we propose here is not only about suggesting a more consistent taxonomy with regard to the components of the semantic gradient reported by early studies, but about deepening a theoretical understanding of what these components represent. Thus, we believe the part of Stroop interference that expresses the automaticity of reading per se arises due to task conflict, and it can be estimated by the orthographic and lexical components of the semantic gradient. The contribution of the informational conflict, which is an episodic amplification of task interference, can be estimated by the semantic relatedness component (Sharma and McKenna, 1998). However, with respect to the latter, we suggest that in order to capture the whole idea of informational conflict, it can be split into two different components. The first component, which we refer to as the indirect informational conflict, measures the contribution of the informational conflict caused by color-associated words (e.g., TOMATO). The second component, which we call the direct informational conflict, reflects informational interference due to semantic encoding of the color-word stimulus (e.g., RED). The label “indirect” captures the idea that the irrelevant color concept that subsequently competes for response becomes initially activated through its association with a color-associated word, such as the word TOMATO. In contrast, when the stimulus is a color word, the activation of the competing color-concept is “direct,” meaning that it is an outcome of reading the stimulus itself. According to the semantic network model of Collins and Loftus (1975), indirect activation is weaker than the direct one because the activation fades out as it spreads out and is shared between more semantic links. 
Thus, a color-concept that has been activated indirectly, as in the case of color-associated stimuli, would constitute a weaker competitor in the Stroop task, causing less interference.

The integrated framework suggests a notion particularly important for the research on controllability of the automaticity of reading. This line of research uses modulations of Stroop interference by specific experimental manipulations (e.g., the congruency proportion effects) as an indication of control operation. However, the integrated framework demonstrates that Stroop interference can also be modulated by using a specific stimulus type. The observed interference can be reduced or enhanced depending on the stimulus type that is used in the color-naming task. Employing neutral (i.e., color-unrelated) words, for example, would “peel off” the amplification of the interference due to informational conflict, leaving only the contribution of the task conflict.³ In this case, the obtained interference effect would be smaller; however, it would be a more precise measure of the automaticity of reading (see the previous paragraph) than the interference effect including informational amplification. Therefore, when investigating the controllability of reading, one should be especially interested in selectively affecting the components reflecting task conflict, which according to the integrated framework are the orthographic and lexical components. It would be especially interesting to investigate whether the task conflict expressed by each of those components can be controlled. Such a study would shed light not only on whether reading can be controlled, but also on whether such control can be exerted on all reading sub-stages, even the earliest ones, such as orthographic encoding.

Importantly, in contrast to the previous studies in which the estimation of various components was carried out by performing intuitively more appealing multiple pair-wise comparisons between all stimulus categories used in the experiment, the integrated framework implements a different approach. The disadvantage of the analyses performed in previous studies is that they did not allow for correct estimation of each of the components because of using the same information multiple times. Since according to the integrated framework each of the components has a solid, distinguishable theoretical basis, their estimation should be unique and not contaminated by the information used to estimate other components. For that reason, we used a set of independent contrasts (see Table 1), which in our view allows the most adequate and clean estimation of the contribution of each of the four components to the semantic gradient pattern. Hence, within task conflict components, the orthographic component was estimated by contrasting unreadable shapes with the various readable stimuli, color words excluded. The lexical component, representing the modulation of task conflict magnitude by the lexical status of the word, was estimated by comparing minimally readable letter strings with real words, color words excluded. The direct informational conflict component was estimated by contrasting color-word stimuli with all remaining stimuli, whereas indirect informational conflict, representing the modulation of informational conflict by color-related meaning, was estimated by contrasting color-associated words with neutral words. Note, the proposed integrated framework that uses multiple stimulus types and employs a set of independent contrasts allows more stable estimation of some of the underlying interference components. 
Thus, for example, the estimation of the lexical component, which expresses a difference between readable and unreadable (or minimally readable) material, would be more realistic with regard to the true effect in the general population when calculated according to the proposed framework. This is because, in contrast to what has been usually done,⁴ the “readable” stimuli that are used for its calculation, instead of being represented by only one stimulus category, include a number of similarly readable, yet different stimulus types.

TABLE 1. Contrasts allowing for independent estimation of the semantic gradient components as suggested by the proposed integrated framework.

In addition to the estimation of the four main components, we were also interested in assessing the effect of lexical frequency on color-naming latencies. Klein’s (1964) observation, replicated later by Fox et al. (1971), was that the RTs produced by “common” neutral words were significantly slower than those produced by “rare” neutral words. The enlarged task interference obtained for high frequency words seems to be consistent with the faster visual recognition of high frequency words reported for word naming and lexical decision tasks (Forster and Chambers, 1973; Monsell et al., 2001). Faster visual recognition is usually attributed to more efficient lexical encoding/access (Monsell et al., 1989; Murray and Forster, 2004). Thus, it is not surprising that more efficient reading would express itself in larger task interference. Monsell et al. (2001), however, observed slightly shorter color-naming response latencies for the high frequency neutral words than for the low frequency neutral words across the three experiments. In the present study we tested which of these findings can be successfully replicated. Importantly, in addition to neutral words, we also tested the effect of lexical frequency with color-associated words in order to investigate whether the effect of lexical frequency can be observed in the domain of informational interference as well. The influence of lexical frequency on the magnitude of task and informational conflicts was assessed separately for neutral words and color-associated words⁵ by contrasting the high and low frequency items in each condition. Note that these comparisons were independent of each other as well as of the rest of the contrasts (see Table 1).
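The independence of the contrasts described above can be checked numerically. The following is a minimal sketch in Python, not the analysis code used in the study. The weights are reconstructed from the verbal description of the contrasts under the assumption of the seven stimulus categories used in Experiments 1 and 2, and all names (`CATEGORIES`, `CONTRASTS`, `is_orthogonal_set`) are hypothetical. With equal cell sizes, a set of contrasts is orthogonal when each contrast sums to zero and every pair of contrasts has a zero dot product.

```python
from itertools import combinations

# Assumed category order: color words, color-associated words (high, low
# frequency), neutral words (high, low frequency), letter strings, shapes.
CATEGORIES = ["CW", "CAW-H", "CAW-L", "NeW-H", "NeW-L", "LS", "SH"]

# Hypothetical integer weights reconstructed from the prose description.
CONTRASTS = {
    "direct informational":   [6, -1, -1, -1, -1, -1, -1],  # CW vs. all other stimuli
    "orthographic":           [0,  1,  1,  1,  1,  1, -5],  # shapes vs. readable, CW excluded
    "lexical":                [0,  1,  1,  1,  1, -4,  0],  # letter strings vs. real words
    "indirect informational": [0,  1,  1, -1, -1,  0,  0],  # CAW vs. neutral words
    "frequency (CAW)":        [0,  1, -1,  0,  0,  0,  0],  # high vs. low frequency CAW
    "frequency (NeW)":        [0,  0,  0,  1, -1,  0,  0],  # high vs. low frequency NeW
}


def dot(u, v):
    """Dot product of two equally long weight vectors."""
    return sum(a * b for a, b in zip(u, v))


def is_orthogonal_set(contrasts):
    """True if every contrast sums to zero and all pairs have zero dot product."""
    sums_ok = all(sum(w) == 0 for w in contrasts.values())
    pairs_ok = all(dot(u, v) == 0 for u, v in combinations(contrasts.values(), 2))
    return sums_ok and pairs_ok


print(is_orthogonal_set(CONTRASTS))  # True
```

Under these assumed weights, six mutually orthogonal contrasts exhaust the six degrees of freedom among seven categories, so no component estimate reuses information that enters another.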

To summarize, the goal of the present series of experiments was to put the semantic gradient pattern to another empirical test, while applying the integrated framework to analyze its components. It is our belief that the proposed integrated framework suggests a theoretical and statistical elaboration of the multiple origins of Stroop interference. Along with the empirical investigation reported below and critical review of the literature (see Discussion), it should allow obtaining a clearer sense of what contributes to Stroop interference.

Experiments 1 and 2

To the best of our knowledge, a semantic gradient has been reported only for English words. The aim of the first two experiments was to evaluate the generality of the semantic gradient pattern by testing its existence in Hebrew (Experiment 1) and Russian (Experiment 2). As previously mentioned, the data from these and subsequent experiments were analyzed by carrying out a set of independent contrasts that were aimed at providing independent estimates for each of the four components producing the semantic gradient.

Method

Participants

Twenty-seven undergraduate students (11 females and 16 males) of Ben-Gurion University of the Negev participated in Experiment 1 for course credit. All were native Hebrew speakers, with a mean age of 24.5 years (SD = 2.03). Eighteen undergraduate students (8 females and 10 males) of Ben-Gurion University of the Negev participated in Experiment 2 and were paid 20 NIS. All were native Russian speakers⁶ with a mean age of 26.9 years (SD = 3.45). All reported having normal or corrected-to-normal visual acuity, as well as normal color vision. All participants gave written informed consent in accordance with the Declaration of Helsinki. The experimental protocol was approved by the Ethical Committee of Ben-Gurion University of the Negev.

Materials

The stimuli used were Hebrew (Experiment 1) or Russian (Experiment 2) words of the following types⁷: color words (adom/krasnii-RED, kahol/sinii-BLUE, yarok/zelenii-GREEN, tzahov/jeltii-YELLOW); high frequency color-associated words (esh/ogon-FIRE, agam/nebo-LAKE/SKY, etz/trava-TREE/GRASS, shemesh/solntze-SUN); low frequency color-associated words (agvaniya/pomidor-TOMATO, shamaim/djinsi-SKY/JEANS, esev/lyagushka-GRASS/FROG, tiras/kukuruza-CORN); high frequency neutral words (rehov/oficer-STREET/OFFICER, regel/samolet-LEG/PLANE, mafteah/pis’mo-KEY/LETTER, kvish/sobaka-ROAD/DOG); low frequency neutral words (uga/zontik-CAKE/UMBRELLA, tzipor/golub’-BIRD/DOVE, buba/igla-DOLL/NEEDLE, yareah/koshelek-MOON/WALLET); letter strings (shshshsh/hhhh; ssss/ssss; pppp/oooo; rrrr/rrrr); and geometric shapes (rectangle, circle, triangle, and rhombus). The stimuli to be included in each stimulus category were selected based on the norms available in each language: the Russian Frequency Dictionary⁸ developed by Sharoff (2002), and the Word Frequency Database for Printed Hebrew⁹ developed by Frost and Plaut (2005). In addition, the selection was made so that the mean frequency of the two high frequency categories would match, as would the mean frequency of the two low frequency categories. Thus, in Experiment 1 the mean frequencies (appearances per million) were 52 and 45 for high frequency color-associated words and high frequency neutral words, respectively, and 5 and 9 for low frequency color-associated words and low frequency neutral words, respectively. In Experiment 2, the mean frequencies were 220 and 200 for high frequency color-associated words and neutral words, respectively, and 13 and 14 for low frequency color-associated and neutral words, respectively. In addition, an effort was made to equate all stimulus words for the number of letters as far as possible. 
As for shape stimuli, they were made up of the same number of pixels as the mean number of pixels of the words and letter strings.

The possible ink colors were red, blue, green, and yellow, with the following RGB values: (255, 0, 0) for red; (0, 0, 255) for blue; (0, 128, 0) for green; and (255, 255, 0) for yellow. In order to create only incongruent combinations of stimuli, color words and color-associated words were presented in three of the four possible colors, excluding the color matching their meaning. In contrast, neutral words, letter strings and geometric shapes appeared in all four possible colors. All stimuli appeared in a quasi-randomized order: consecutive trials did not repeat the same word as a word or as a color, and also did not repeat the same color. For example, if in a given trial the word RED appeared in green ink, then in the subsequent trial the stimulus could not be the word RED, GREEN, FIRE, TOMATO, GRASS or TREE, and it could not be printed in red or green ink color.
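The sequential constraints just described can be expressed as a simple rule on pairs of consecutive trials. The sketch below is a minimal Python illustration and not the software used to run the experiments. The English glosses of the Experiment 1 items, the reduced set of neutral items, the one-presentation-per-pairing pool, and the greedy shuffle with restarts are all assumptions made for the example, and every name in it is hypothetical.

```python
import random

COLORS = ["red", "blue", "green", "yellow"]

# Hypothetical English glosses of color-related items and the color each evokes.
CONCEPT = {
    "RED": "red", "BLUE": "blue", "GREEN": "green", "YELLOW": "yellow",
    "FIRE": "red", "TOMATO": "red", "SKY": "blue", "LAKE": "blue",
    "GRASS": "green", "TREE": "green", "SUN": "yellow", "CORN": "yellow",
}
# A reduced, illustrative set of color-unrelated items.
NEUTRAL = ["STREET", "LEG", "KEY", "ROAD", "ssss", "circle"]


def build_pool():
    """All item/ink pairs; color-related items appear only in incongruent inks."""
    pool = [(w, ink) for w, c in CONCEPT.items() for ink in COLORS if ink != c]
    pool += [(w, ink) for w in NEUTRAL for ink in COLORS]
    return pool


def is_allowed(prev, cur):
    """True if trial `cur` may directly follow trial `prev`."""
    (prev_word, prev_ink), (cur_word, cur_ink) = prev, cur
    # Colors 'used' by the previous trial: its ink and, if any, its color concept.
    blocked = {prev_ink, CONCEPT.get(prev_word)} - {None}
    return (cur_word != prev_word
            and cur_ink not in blocked
            and CONCEPT.get(cur_word) not in blocked)


def quasi_random_order(pool, rng, max_restarts=1000):
    """Greedy shuffle with restarts (an assumed strategy, for illustration only)."""
    for _ in range(max_restarts):
        remaining = pool[:]
        rng.shuffle(remaining)
        order = [remaining.pop()]
        while remaining:
            options = [t for t in remaining if is_allowed(order[-1], t)]
            if not options:
                break  # dead end: start over with a fresh shuffle
            pick = rng.choice(options)
            remaining.remove(pick)
            order.append(pick)
        else:
            return order  # every trial was placed
    raise RuntimeError("no valid order found")


order = quasi_random_order(build_pool(), random.Random(0))
print(all(is_allowed(a, b) for a, b in zip(order, order[1:])))
```

With the rule written this way, the example from the text follows directly: after RED in green ink, `is_allowed` rejects RED, GREEN, FIRE, TOMATO, GRASS, and TREE as the next item, as well as any item printed in red or green.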

Procedure

A Dell computer with an Intel Pentium Core 2 Duo processor and a 19-inch monitor with a resolution of 1024 × 768 pixels were used to present the stimuli. Participants sat approximately 60 cm from the computer screen. Responses were collected via a high-quality microphone attached to the computer keyboard through a “voice key” device, which allowed RT measurement. In addition, an experimenter coded all responses by typing them on the keyboard. Participants were told not to read the word, but to name its color as accurately and as fast as possible.

The experiment started with seven practice trials followed by two experimental blocks. A 5-min break was given between the blocks. Each of the seven stimulus types was shown 35 times during the block, resulting in total 245 trials per block.

At the beginning of each trial a fixation (white cross) was presented at the center of the screen. After 1,000 ms the fixation was replaced by the target, which remained visible until a response was made or for 3,000 ms. A trial ended with a blank, black display during which the experimenter coded the participant’s response. Trials with technical problems such as laughing or sneezing were coded as technical errors in order to distinguish them from the erroneous responses made by participants.

The design of Experiments 1 and 2 included one within-subject variable of stimulus category with the following levels: color words, frequent color-associated words, infrequent color-associated words, frequent neutral words, infrequent neutral words, letter strings, and shapes. Lexical frequency, which was relevant only for color-associated and neutral words, was analyzed at the second stage.

Results

Errors, not including technical ones, accounted on average for 1.11% of the trials in Experiment 1 and 2.37% in Experiment 2. All error trials (3.23% in Experiment 1 and 3.43% in Experiment 2) were excluded from the analysis as were the RT outliers (RTs > 2,500 ms and RTs < 300 ms). The mean RTs of correct responses for each participant in each condition were analyzed by a one-way repeated measures analysis of variance (ANOVA). Mean RTs of each experimental condition in the two experiments (as summarized in Table 2) are plotted in Figure 1. All effects were tested at the significance level (α) of 0.05.
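The exclusion rules above amount to a small preprocessing step before the ANOVA. The following Python sketch illustrates that step under stated assumptions: the trial records are hypothetical dictionaries with the keys `participant`, `category`, `rt` (in ms), and `correct`, and the function name `condition_means` is invented for the example. It is not the authors' analysis script.

```python
from collections import defaultdict
from statistics import mean


def condition_means(trials, lo=300, hi=2500):
    """Mean RT of correct, non-outlier trials per (participant, category) cell.

    Trials with RT < lo or RT > hi, and all error trials, are dropped.
    """
    cells = defaultdict(list)
    for t in trials:
        if t["correct"] and lo <= t["rt"] <= hi:
            cells[(t["participant"], t["category"])].append(t["rt"])
    return {cell: mean(rts) for cell, rts in cells.items()}


# Hypothetical data for one participant.
trials = [
    {"participant": 1, "category": "CW", "rt": 700, "correct": True},
    {"participant": 1, "category": "CW", "rt": 900, "correct": True},
    {"participant": 1, "category": "CW", "rt": 2600, "correct": True},   # too slow
    {"participant": 1, "category": "CW", "rt": 650, "correct": False},   # error
    {"participant": 1, "category": "SH", "rt": 250, "correct": True},    # too fast
    {"participant": 1, "category": "SH", "rt": 600, "correct": True},
]
print(condition_means(trials))  # {(1, 'CW'): 800, (1, 'SH'): 600}
```

The resulting cell means, one per participant and stimulus category, are the values that would enter a one-way repeated measures ANOVA.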

TABLE 2. Mean response times (in milliseconds) obtained in Experiments 1 and 2 for each stimulus category.

FIGURE 1. Reaction time obtained for each stimulus category in Experiments 1 and 2. CW, incongruent color words; CAW, incongruent color-associated words; NeW, neutral words; H, high frequency; L, low frequency.

The results of both experiments revealed a significant main effect of stimulus category, F(6,156) = 90.67, MSE = 464, $η_{p}^{2}$ = 0.77 (Experiment 1) and F(6,102) = 34.461, MSE = 1,014, $η_{p}^{2}$ = 0.7 (Experiment 2). The sum of squares due to the differences among stimulus categories was decomposed by carrying out four orthogonal contrasts (see Table 1). These planned comparisons allowed estimating two components of the semantic gradient representing task conflict (i.e., orthographic and lexical components), and two additional components representing informational conflict (i.e., indirect and direct informational conflict components).
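A planned comparison on a within-subject factor can be computed by collapsing each participant's condition means into a single contrast score and testing the mean score against zero; the squared one-sample t statistic is then an F ratio with 1 and n − 1 degrees of freedom. The Python sketch below illustrates this general technique with hypothetical data. Whether the reported contrasts were computed in exactly this way is an assumption, and the function name `contrast_F` is invented for the example.

```python
from math import sqrt
from statistics import mean, stdev


def contrast_F(data, weights):
    """F test of a within-subject contrast.

    data:    one list of condition means per participant
    weights: contrast weights, one per condition, summing to zero
    Returns (F, df_error), where F is the squared one-sample t statistic.
    """
    scores = [sum(w * x for w, x in zip(weights, row)) for row in data]
    n = len(scores)
    t = mean(scores) / (stdev(scores) / sqrt(n))
    return t * t, n - 1


# Hypothetical mean RTs (ms) of three participants in two conditions.
data = [[500, 450], [520, 480], [510, 440]]
F, df_error = contrast_F(data, [1, -1])
print(round(F, 2), df_error)  # 36.57 2
```

Because every contrast is tested on its own per-participant scores, each comparison carries its own error term.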

Planned Comparisons: Experiment 1

Markers of both types of conflict were revealed when Hebrew was used as the experimental language. Thus, the existence of the task conflict was supported by the significant, and quite large, orthographic component, F(1,26) = 155.06, MSE = 394.4, η² = 0.24, $η_{p}^{2}$ = 0.86. The existence of the informational conflict was confirmed by the direct informational conflict component, F(1,26) = 160.07, MSE = 1,175.6, η² = 0.74, $η_{p}^{2}$ = 0.86. However, whereas the magnitude of the task conflict was modulated by the lexical status of the stimuli, showing a significant though relatively small (as evidenced by the $η_{p}^{2}$ index) lexical component, F(1,26) = 4.83, MSE = 382.5, η² = 0.007, $η_{p}^{2}$ = 0.16, no indication of modulation of informational conflict by color-related meaning was obtained. That is, there was no significant indirect informational conflict component, F(1,26) = 2.66, MSE = 239.3, NS, η² = 0.003, $η_{p}^{2}$ = 0.09.
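The partial eta squared values reported with each test can be recovered from the F ratio and its degrees of freedom, since $η_{p}^{2}$ = SS_effect / (SS_effect + SS_error) = (F × df_effect) / (F × df_effect + df_error). The short Python sketch below applies this identity to two of the values reported above; the function name is hypothetical.

```python
def partial_eta_squared(F, df_effect, df_error):
    """Partial eta squared recovered from an F ratio and its degrees of freedom."""
    return (F * df_effect) / (F * df_effect + df_error)


# Orthographic and lexical components, Experiment 1
print(round(partial_eta_squared(155.06, 1, 26), 2))  # 0.86
print(round(partial_eta_squared(4.83, 1, 26), 2))    # 0.16
```

For single-degree-of-freedom contrasts the expression reduces to F / (F + df_error), which is why a modest F such as 4.83 can still yield a partial eta squared of 0.16 with 26 error degrees of freedom.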

To complete the analysis, the effect of lexical frequency was assessed. In our study lexical frequency was only manipulated for color-associated and neutral words. Since, according to the proposed integrated framework, interference produced by neutral words is only contributed to by task conflict, whereas color-associated words interfere also because of the informational conflict, the effect of frequency produced by both stimulus types was estimated independently. Hence, interference created by frequent neutral words was not significantly different from the interference produced by infrequent neutral words, F(1,26) = 2.44, MSE = 228.6, NS, η² = 0.003, $η_{p}^{2}$ = 0.09, implying the task conflict is not enhanced by lexical frequency. The same comparison was conducted for color-associated words, revealing similar results, F < 1. Thus, the informational conflict seems not to be affected by lexical frequency.

Planned Comparisons: Experiment 2

When Russian was the language of the experiment, the marker of the task conflict (i.e., the orthographic component) was successfully replicated, F(1,17) = 37.08, MSE = 1,133.4, η² = 0.2, $η_{p}^{2}$ = 0.69, as was the marker of informational conflict (i.e., the direct informational conflict), F(1,17) = 62.99, MSE = 2,571.3, η² = 0.77, $η_{p}^{2}$ = 0.79. Responses to color-associated words were not slower than the responses to neutral words, F(1,17) = 1.46, MSE = 694.4, NS, η² = 0.005, $η_{p}^{2}$ = 0.08, again indicating no modulation of the magnitude of informational conflict by color-related meaning. That is, replicating the result of Experiment 1, no indirect informational conflict component was exposed. However, contrary to the previous results, modulation of the task conflict magnitude, as expressed by the lexical component, was not obtained, F(1,17) = 1.6, MSE = 837.5, NS, η² = 0.007, $η_{p}^{2}$ = 0.09, in the present experiment. In addition, the effect of lexical frequency was estimated by contrasting high frequency and low frequency conditions. Whereas RTs for neutral words were not affected by lexical frequency, F < 1, RTs for color-associated words were, F(1,17) = 4.77, MSE = 672.5, η² = 0.015, $η_{p}^{2}$ = 0.22.

Discussion of Experiments 1 and 2

The data from Experiments 1 and 2 indicate that there are two robust components consistently contributing to, and almost entirely constituting, the interference effect.¹⁰ These are the markers of the task and informational conflicts (i.e., the orthographic and the direct informational conflict component, respectively), which were easily replicated in two experiments employing different languages. However, the other two components, representing modulation of the magnitude of these conflicts by variables such as the stimulus’ lexical status and semantic distance (i.e., the lexical and indirect informational conflict component, respectively), seem either to appear inconsistently or to be hard to obtain. In support of this conclusion, neither experiment revealed the indirect informational conflict component, which was nonsignificant and small ( $η_{p}^{2}$ = 0.09 and $η_{p}^{2}$ = 0.08 in Experiments 1 and 2, respectively). The same was true for the lexical component ( $η_{p}^{2}$ = 0.16 and $η_{p}^{2}$ = 0.09 in Experiments 1 and 2, respectively), except that it was significant in Experiment 1. Thus, it seems that while these components might occasionally reach significance, they are likely to be fragile and to contribute inconsistently to the general interference.
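
The claim that two components make up almost all of the interference effect can be illustrated with simple arithmetic on the η² values reported for the four planned comparisons (0.24, 0.74, 0.007 and 0.003 in Experiment 1; 0.2, 0.77, 0.007 and 0.005 in Experiment 2). The sketch below assumes that η² expresses each contrast's share of the stimulus-category sum of squares; that reading is an inference from the values, not something stated in this section. The names `eta_squared`, `ROBUST`, `robust_share` and `minor_share` are illustrative.

```python
# Reported eta squared per component (assumed reading: share of the
# stimulus-category sum of squares accounted for by each contrast).
eta_squared = {
    "Experiment 1": {
        "orthographic": 0.24,
        "direct informational": 0.74,
        "lexical": 0.007,
        "indirect informational": 0.003,
    },
    "Experiment 2": {
        "orthographic": 0.2,
        "direct informational": 0.77,
        "lexical": 0.007,
        "indirect informational": 0.005,
    },
}

# The two components described in the text as robust markers of conflict.
ROBUST = ("orthographic", "direct informational")


def robust_share(components):
    """Summed eta squared of the two robust components."""
    return round(sum(components[name] for name in ROBUST), 2)


def minor_share(components):
    """Summed eta squared of the remaining (lexical, indirect) components."""
    return round(sum(v for name, v in components.items() if name not in ROBUST), 3)


shares = {exp: (robust_share(c), minor_share(c)) for exp, c in eta_squared.items()}

for exp, (robust, minor) in shares.items():
    print(f"{exp}: robust components {robust:.2f}, minor components {minor:.3f}")
```

On these numbers the orthographic and direct informational conflict components together account for roughly 0.97 to 0.98 in each experiment, and the lexical and indirect components for about 0.01.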

The effect of lexical frequency was found only in the second experiment and only for color-associated words. However, contrary to the findings of Klein (1964) and Fox et al. (1971), performing the color-naming task elicited slower responses to low-frequency rather than to high-frequency color-associated words. We will further discuss the lexical frequency effect in the Section “General Discussion.”

Yet, one can argue that some differences between the results of the two experiments (e.g., the significance of the lexical component) might partially be due to the fact that Hebrew and Russian belong to different language types. According to the depth-of-orthography hypothesis (Frost, 2006), Hebrew differs from languages like English (and Russian) in that words are written almost solely with consonants, omitting the vowels. Thus, reading in Hebrew might proceed differently from reading in Russian, or require additional processes (e.g., completing vowel information that is not presented). Hence, it is possible, for example, that in Hebrew all words are initially perceived as letter strings, and are treated as words only after the vowel information is completed by additional cognitive processing. If so, this should lead to a hypothesis inconsistent with the present results. Specifically, the “deep orthography” of the Hebrew language should eliminate or diminish the lexical component in that language, since words are initially perceived as letter strings. However, the two experiments presented reveal the opposite pattern: a significant lexical component in Hebrew, and a nonsignificant one in Russian. Hence, the differences between the languages used in the present experiments do not seem to be responsible for the obtained pattern.

Experiment 3

Looking further for the reason why some of the semantic gradient components are not easy to replicate, we reconsidered the methodological details of early studies that reported successful results. In these studies (Klein, 1964; Fox et al., 1971; Sharma and McKenna, 1998), stimuli were presented in a blocked rather than mixed format. Blocking, however, may affect the results by, for example, strengthening the semantic activation of the concepts corresponding to the stimulus words, since each word is repeated in very close temporal proximity. Such temporally proximal repetition of the same words may lead to an accumulation of activation within the orthographic, lexical and/or semantic representations of these words, which in turn can make the relevant interference components visible. Unfortunately, studies that used a mixed presentation format cannot shed light on this issue, since they focused only on particular components (Langer and Rosenberg, 1966; Proctor, 1978; Schmidt and Cheesman, 2005; Risko et al., 2006; Goldfarb and Tzelgov, 2007; Brown, 2011). Therefore, the possibility that the mixed presentation format in our experiments was responsible for the absence of the informational and lexical components from the gradient pattern was tested in Experiment 3. This experiment replicated Sharma and McKenna’s (1998) experimental protocol using Hebrew as the language of the experiment.