Associative and Identity Words Promote the Speed of Visual Categorization: A Hierarchical Drift Diffusion Account

Words can either boost or hinder the processing of visual information, which can lead to facilitation or interference of the behavioral response. We investigated the stage (response execution or target processing) of verbal interference/facilitation in the response priming paradigm with a gender categorization task. Participants in our study were asked to judge whether the presented stimulus was a female or male face that was briefly preceded by a gender word either congruent (prime: “man,” target: “man”), incongruent (prime: “woman,” target: “man”) or neutral (prime: “day,” target: “man”) with respect to the face stimulus. We investigated whether related word-picture pairs resulted in faster reaction times in comparison to the neutral word-picture pairs (facilitation) and whether unrelated word-picture pairs resulted in slower reaction times in comparison to neutral word-picture pairs (interference). We further examined whether these effects (if any) map onto response conflict or aspects of target processing. In addition, identity (“man,” “woman”) and associative (“tie,” “dress”) primes were introduced to investigate the cognitive mechanisms of semantic and Stroop-like effects in response priming (introduced respectively by associations and identity words). We analyzed responses and reaction times using the drift diffusion model to examine the effect of facilitation and/or interference as a function of the prime type. We found that, regardless of prime type, words introduce a facilitatory effect, which maps to the processes of visual attention and response execution.


INTRODUCTION
Words facilitate visual decisions in a variety of tasks. For example, the brief presentation of a word before a semantically related picture has been shown to facilitate the processing of the picture in a number of tasks such as detection of motion direction (Meteyard et al., 2007), word-picture matching (Boutonnet and Lupyan, 2015), recognition of ambiguous "Mooney" images (Samaha et al., 2018), and familiarity judgments (Amado et al., 2018). On the other hand, a word presented right before the picture can also lead to semantic interference, i.e., a semantically unrelated word interferes with the judgment of an immediately following target compared to when the word is semantically related. In terms of behavioral performance, this effect translates to longer reaction times and/or more errors (Wentura and Degner, 2010), as has been shown in a variety of language tasks such as Stroop task(s), word-picture matching, and spoken-to-written word matching (Jefferies et al., 2008; Campanella et al., 2013; Faria et al., 2015). While facilitation effects have been interpreted as cognitive [with the locus of their influence being rooted either in the lexico-semantic system (Francken et al., 2015) or in the visual system (Boutonnet and Lupyan, 2015)], some of the interference effects have been associated with response processing. For example, in the Stroop task, the increased response latencies for naming the ink color of a conflicting color word (word: red, ink: green) in comparison to the non-conflicting (congruent) one (word: red, ink: red) are interpreted as a result of response conflict as opposed to informational conflict (Duncan-Johnson and Kopell, 1981; Cohen et al., 1990). Whether semantic facilitation and interference effects operate at the level of target and/or response processing is not yet entirely clear.
A paradigm that is structurally similar to the Stroop task, but less studied in terms of cognitive processes, is the response priming paradigm (Wentura and Degner, 2010). While Stroop-like effects appear at the response processing stage, response priming paradigms may tap into both target and response processing. In semantic priming paradigms, two stimuli are presented successively and participants are instructed to perform a task on the second stimulus (target), while the first stimulus (prime) is deemed to be non-relevant to the task (for example, participants might have to classify letter strings as words or nonwords, with prime-target pairs being either semantically related or unrelated). In contrast, in response priming paradigms, primes are either congruent or incongruent with the response that has to be given to the target (prime: skirt, target: woman, response: woman, task: categorization). In the Stroop task, interference occurs when the ink color maps onto one response and the semantic meaning of the incongruent word to the other. In other words, interference arises due to the conflict between the responses. In response priming, similarly, interference occurs when the semantic meaning of the prime maps to one response and the semantic meaning of the target maps to the other response, leading to response conflict (De Houwer et al., 2002; Musch and Klauer, 2003). The fact that response priming and Stroop tasks are structurally similar does not exclude the possibility that interference effects could be explained by either one or both processes of response competition and target processing. It is therefore of theoretical importance to investigate semantic facilitation and interference effects in a response priming paradigm.
In this study, we focused precisely on this aspect and investigated in a response priming paradigm the effects of semantic facilitation and interference as reflected by changes in behavioral performance. We further explored whether facilitation and/or interference can be accounted for by mechanisms of response execution and/or target processing, which we formalized using the drift-diffusion model (DDM) (see further details in the next section). Participants in our study were asked to judge whether the presented (target) stimulus was a female or male face when the target was briefly preceded by a gender word (prime) either congruent (prime: "man," target: "man"), incongruent (prime: "woman," target: "man") or neutral (prime: "day," target: "man") with respect to the face stimulus. Participants were instructed to decide about the gender of the target while ignoring the prime. We then looked at whether related word-picture pairs resulted in shorter reaction times in comparison to neutral word-picture pairs (facilitation) and whether unrelated word-picture pairs resulted in longer reaction times in comparison to neutral word-picture pairs (interference). We further examined whether these effects (if any) map onto response conflict or target processing. In order to shed light on the cognitive mechanisms underlying semantic facilitation and interference effects, we used a well-established approach from the cognitive modeling literature, the DDM (Usher and McClelland, 2001; Smith and Ratcliff, 2004; Ratcliff and McKoon, 2008).

Drift-Diffusion Model
In the DDM approach, the process of making a decision about the gender of a face is described as the stochastic accumulation of sensory evidence over time toward one of two decision boundaries (male or female response, for instance). Once enough evidence is accumulated and one of the two decision boundaries is reached, the associated response is produced (for example, female). In the DDM, a total of four parameters describe the processing components underlying the decision-making process (see Figure 1): the rate at which evidence accumulates over time (drift rate, v), the amount of evidence that is necessary to produce a response (boundary separation, A), an optional a priori bias for a specific response (bias, z), and finally the time required to complete non-decision processes, such as motor preparation and/or stimulus encoding (non-decision time, Ter).
The DDM has been successfully applied to choice reaction time data in various experimental tasks (see, for reviews, Ratcliff and McKoon, 2008; Mulder et al., 2014), and the parameters recovered by the model have been shown to be well characterized in terms of cognitive processes (Voss et al., 2004). The drift rate or the speed of evidence accumulation is "determined by the quality of information extracted from the stimulus" (Ratcliff and McKoon, 2008). For example, in a color discrimination task, trials with less visible colors resulted in a lower drift rate as opposed to trials with the more visible colors (Voss et al., 2004). Crucially, the drift rate can be modulated by factors that are not only related to the stimulus being processed but are also related to contextual information associated with the study episode. For example, in a memory recognition task, a word that was studied three times had a higher drift rate than a word that was studied only once (Ratcliff and McKoon, 2008). Similarly, in an associative priming task, word primes that were associatively related to the target words resulted in a higher drift rate in comparison to word primes that were not associatively related. It has been suggested that the drift rate in these contexts could represent "the quality of the match between a test word and memory" (Ratcliff and McKoon, 2008). The accumulation of evidence stops when either one of the two decision boundaries has been reached. The amount of evidence that is needed to make a decision is characterized by the boundary separation, that is, the distance of the boundaries from the starting point of the accumulation process, and it has been shown to be modulated by changes in task strategy (e.g., response caution).
For example, when participants were instructed to prioritize response accuracy over response speed, changes in behavioral performance due to the adoption of a new response criterion by the decision maker were explained in the DDM model in terms of a higher value for the boundary separation parameter. The higher boundary separation translates to a longer period of information accumulation and, as a result of this, fewer errors being made by the decision maker (Bogacz et al., 2010). The starting point of the accumulation process instead reflects potential biases participants might have, which result in certain responses being "a priori more likely" (Mulder et al., 2014). For example, participants might, a priori, favor a "word" response in a lexical decision task (Wagenmakers et al., 2007). In addition, it was also shown that in a color discrimination task, a higher reward for a certain response resulted in participants adopting a starting point (z) closer to the decision boundary for the response with the higher value reward (Voss et al., 2004). Finally, the model component that does not account for decision processes is referred to as non-decision time. It reflects either stimulus encoding (which may not necessarily be perceptual encoding but rather access to memory in a memory task or lexical access in a lexical decision task, Ratcliff and McKoon, 2008) or the time required to execute a motor response.
These contributions are combined in one parameter (Ter), which does not by itself allow encoding to be separated from response execution but rather allows decision processes to be separated from non-decision processes. Studies that interpret Ter as an encoding or execution parameter use auxiliary methods such as neuroimaging to facilitate interpretations. For example, in an fMRI study investigating age-related performance in a visual search task, changes in non-decision time were associated with targets accompanied by response-incompatible distractors in the elderly group. Ter was correlated with the frontal eye fields (FEF) and dorsal fronto-parietal regions, which suggested a major contribution from the visual encoding process (Madden et al., 2019). An electroencephalography (EEG) study of figure-ground segregation instead found a correlation between N200 latency and the non-decision component, suggesting that N200 tracks the completion of visual encoding (Nunez et al., 2019).
In the DDM, the parameters are combined nonlinearly to enable inferences on the complete distribution of reaction time data. Technically, this is done by computing for each accuracy interval or bin (e.g., five intervals of 20% accuracy performance increments) the relative RT distribution and then fitting a (Gaussian) random walk model to each of the quantiles of the RT distribution. An intuitive way to think about how the parameters of the DDM combine to produce RT distributions is the following. As an example, assume a hypothetical classification task where subjects have to classify a face as female or male and we assign the female face response to the upper boundary (Response alternative 1 in Figure 1) and the male response to the lower boundary (Response alternative 2 in Figure 1). Assuming there is no bias for either of the two responses and the same boundary separation for both male and female face targets (same amount of evidence to be accumulated) and no difference in non-decision time, a drift rate toward the female-response boundary (i.e., positive drift) would indicate faster correct responses for face-related judgments of female faces. By contrast, differences in threshold or non-decision time would suggest that overall RTs are either longer or shorter in female-related judgments compared to male-related judgments (depending on the directionality of the difference), regardless of the correctness of the response. Similarly, a higher boundary separation would predict on average slower responses (since more information has to be accumulated), and thus a longer mean RT for a particular condition would be predicted. Whether a response would be correct or not, however, would be mainly driven by the drift rate.
"Mainly" is used here because boundary separation (A), drift rate (v), and bias (z) together contribute to the generation of choice RT data, specifically to the differentiation of fast and slow responses in both correct and incorrect responses. For comprehension purposes, we point the reader to a simple qualitative taxonomy of the type of responses predicted by the DDM given a particular combination of A (boundary separation) and v (drift rate) parameters. If A is high and v is toward the correct response boundary (i.e., positive for upper boundary and negative for lower boundary), then the responses are predicted to be on average slow and correct. If, instead, A is high but v is toward the incorrect response boundary, then the responses are predicted to be on average slow and incorrect. If there is a change in the boundary separation parameter and, for example, A is low and v is toward the correct response boundary, then the model predicts on average fast correct responses. If instead A is low and v is toward the incorrect response boundary, then the model predicts on average fast errors.
To sum up, in this study, we capitalize on the DDM since it provides an ideal analytical tool for disentangling in the response priming task the cognitive processes involved in semantic facilitation and interference. Importantly, previous drift-diffusion studies revealed intriguing cognitive mechanisms underlying behavioral performance in language-related tasks that use the response priming paradigm. In the primed Stroop task, word distractors were presented ahead of colored symbols that had to be categorized, and the facilitation effect was explained in terms of a change in boundary separation among congruent and incongruent conditions (Kinoshita et al., 2017). This result, however, could be explained in terms of stimulus probability, since a word prime was predictive of its matching response color (Kinoshita et al., 2017, p. 833). It has also been shown that predictive cue information results in boundary separation modulation (not drift rate) in a task where participants have to identify a house or face masked by noise, preceded by either a house or face cue with different degrees of reliability (Dunovan et al., 2014). The study of Kinoshita et al. (2017) is one of the recent studies on response priming that investigates language influences on target and response processing from the computational modeling perspective. Results are, however, not conclusive due to the predictability confounds described above. In this study, language primes were not predictive of the upcoming target, which allowed us to investigate the influence of language on facilitation and interference without introducing probabilistic confounds. Another interesting property that seems to affect participants' performance in the response priming task, and which is crucial for this study, is the type of semantic relationship between prime and target, which will be discussed further in the following section.
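The qualitative taxonomy described earlier in this section (how boundary separation A and drift rate v jointly shape the speed and accuracy of responses) can be illustrated with a small simulation. The following is a minimal sketch and not the hierarchical model fitted in this study: it assumes a simple Euler approximation of the diffusion process with unit noise, a relative starting point z = 0.5 (no bias), and purely illustrative parameter values. The function names simulate_ddm and summarize are hypothetical.

```python
import random
import statistics


def simulate_ddm(v, a, z=0.5, t_er=0.3, s=1.0, dt=0.001, max_t=5.0, rng=None):
    """Simulate one trial of a two-boundary diffusion process.

    v: drift rate, a: boundary separation, z: relative starting point (0..1),
    t_er: non-decision time in seconds, s: noise scale (assumed to be 1).
    Returns (hit_upper, rt), or (None, None) if no boundary is reached in max_t.
    """
    rng = rng or random
    x = z * a                 # evidence starts between the boundaries 0 and a
    t = 0.0
    step_sd = s * dt ** 0.5   # standard deviation of the noise per time step
    while t < max_t:
        x += v * dt + rng.gauss(0.0, step_sd)
        t += dt
        if x >= a:
            return True, t + t_er
        if x <= 0.0:
            return False, t + t_er
    return None, None


def summarize(v, a, n=200, seed=0):
    """Return (proportion of upper-boundary responses, mean RT) over n trials."""
    rng = random.Random(seed)
    trials = [simulate_ddm(v, a, rng=rng) for _ in range(n)]
    done = [(hit, rt) for hit, rt in trials if hit is not None]
    p_upper = sum(hit for hit, _ in done) / len(done)
    mean_rt = statistics.mean(rt for _, rt in done)
    return p_upper, mean_rt


if __name__ == "__main__":
    # Treat the upper boundary as the correct response (illustrative values).
    for label, v, a in [
        ("high A, v toward correct", 1.5, 2.0),
        ("high A, v toward incorrect", -1.5, 2.0),
        ("low A, v toward correct", 1.5, 1.0),
        ("low A, v toward incorrect", -1.5, 1.0),
    ]:
        p_upper, mean_rt = summarize(v, a)
        print(f"{label}: P(upper) = {p_upper:.2f}, mean RT = {mean_rt:.3f} s")
```

Under these assumptions, the sign of v mainly determines which boundary is reached, whereas A mainly determines how long the accumulation takes, which mirrors the slow/fast and correct/incorrect combinations listed in the taxonomy.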

Type of Semantic Relationship
The type of semantic relationship between the target and the prime is another variable to be considered when investigating priming effects. For example, in the response priming paradigm, it was shown that associative and categorical primes involve different cognitive processes (Voss et al., 2013). While primes that belonged to the same category as the targets (i.e., categorical primes such as prime: lion, target: tiger) mapped onto the response execution stage, primes that were semantically associated with the targets (i.e., associative primes such as prime: king, target: crown) mapped onto the target processing stage. Specifically, the drift-diffusion analysis revealed that associative priming effects mapped onto the drift rate parameter, which indicated increased informational uptake for associative (prime: king, target: crown) word pairs in comparison to categorical ones (prime: lion, target: tiger). Furthermore, categorical primes mapped onto the non-decisional component, with congruent word-target pairs leading to a facilitation of the non-decisional processes, and incongruent ones resulting in interference. The authors explained categorical priming effects in terms of response competition and associative priming effects in terms of spreading activation (Collins and Loftus, 1975). Crucially, categorical congruency effects were associated with response competition processes regardless of the relevance of the congruency dimensions (i.e., whether the task was lexical decision or semantic categorization did not affect the results). On the contrary, Gomez et al. (2013) showed that categorical prime-target pairs mediated both the non-decision and the drift rate components. In the study of Gomez et al. (2013), however, the authors used a variation of categorical primes involving identity primes (prime: house, target: house) instead of different words being related to each other categorically (such as prime: lion, target: tiger).
Furthermore, they used a lexical decision task instead of the semantic categorization task used by Voss et al. (2013), which altogether might have led to differences in the experimental results. Other evidence from the word production literature shows a dissociation between associative and identity primes. For example, in picture-word interference, where participants name a picture and ignore a distractor word, picture naming is slower when the target image and distractor word are related in comparison to when they are unrelated (Glaser and Düngelhoff, 1984; Piai et al., 2013; Piai and Knight, 2018). Interestingly, other types of semantic relations, such as associations, hypernym-hyponym, part-whole, or nouns-verbs (Lupker, 1979; Mahon et al., 2007; Kuipers and La Heij, 2008), result in facilitation or no modulation. Brain imaging studies have also shown a dissociation between associative and categorical relationships in terms of their neural bases (for a review, see Mirman et al., 2017). In sum, existing experimental evidence suggests that associations and categorical relations might have a differential effect on facilitation and interference and therefore should be properly accounted for in tasks that involve response and attention control. In this study, we specifically address the question of whether prime-target pairs that are related to each other either categorically (e.g., prime: "man," target: man) or associatively (e.g., prime: "tie," target: man) result in facilitation or interference effects and, if so, whether such effects occur at the level of response processing, target processing, or both.

The Present Study
Our primary interest was verbal interference/facilitation in the response priming paradigm, either at the level of response execution or of target processing, when manipulating the type of semantic relationship linking prime and target. The facial features of the target pictures were morphed from male to female parametrically, and the task included ambiguous faces solely for participants' engagement purposes. The primes were either associations ("tie," "dress") or identity primes ("man," "woman"). In addition, primes were either congruent (prime: "man," target: man), incongruent (prime: "woman," target: man) or neutral (prime: "day," target: man) with respect to the face. Participants were asked to decide about the gender of the target while ignoring the prime. Participants' behavioral performance (responses and reaction times) was analyzed with the DDM approach to examine the effect of facilitation and/or interference as a function of the prime type.
The DDM approach focused specifically on testing two hypotheses. First, we investigated whether congruency effects in response priming tap into the cognitive mechanisms associated with the processing of the response and/or target. Under the response competition account, congruency effects (e.g., the prime maps onto a "female" response and the face is a female face) would map onto the speed of the non-decisional processes (i.e., slower for incongruent and faster for congruent). Under the target processing account, congruency effects would map onto the speed of processing of the target picture (drift rate). In this case, we expected the drift rate (v) to be higher in congruent vs. incongruent word-picture pairs. Second, we investigated whether the type of semantic relationship (associative or identity) would have an influence on the direction of the effect, i.e., facilitation or interference. For this purpose, we tested whether identity words ("man," "woman") and associative words ("beard," "dress") tap into response conflict processes (Ter) and/or perceptual uptake (v). We furthermore tested whether identity words would enhance the rate of evidence accumulation as opposed to associative words and whether identity words would lead to faster motor preparation/execution in comparison to the associative ones.
To sum up, with this work, we aimed to investigate whether associative and identity words relating to gender categories affect cognitive processing, specifically the stages of response execution and/or information evaluation of the target during gender categorization.

Subjects
The study was approved by the local ethics committee (CMO Arnhem-Nijmegen, Radboud University Medical Center) under the general ethics approval. All participants provided written informed consent approved by Radboud University, Nijmegen. A total of 47 volunteers (23 females) recruited via the Radboud Research Participation System (native Dutch speakers, right-handed, age range: 19-35 years, mean age = 24.74, SD = 3.53) took part in the study. We performed two additional studies to develop and pre-test the materials. All participants reported no neurological disease and had normal or corrected-to-normal vision. All of the participants received monetary compensation for their participation.

Target Pictures
A set of realistic 3D gender-morphed faces was created with the use of FaceGen Modeller 3.5 (Singular Inversions). The technical details of the computation method used by the software are discussed in Blanz and Vetter (1999). We created 81 Western face identities, for which we gradually morphed gender features from extremely female to extremely male in 10 equal steps (Figure 2A). Face stimuli were cropped to remove hair and ears and presented frontally. We controlled for luminance using the SHINE toolbox for MATLAB (Willenbockel et al., 2010).

Target Picture Evaluation and Selection
In order to obtain subjective perceptual ratings of the face gender, we conducted a stimulus evaluation experiment where subjects performed a forced two-choice task on the gender of each of the faces of the experimental set. A separate pool of 47 volunteers (24 males, native Dutch speakers, age range 20-32 years, mean age = 23.83, SD = 2.88) participated in a separate experiment to evaluate the face stimuli. Another pool of volunteers (24 females, native Dutch speakers, right-handed, age range 20-53 years, mean age = 26.75, SD = 6.65) performed the semantic ratings of priming words. Each trial in the evaluation experiment started with a fixation cross that stayed on the screen for 1500 ms, after which the target was visually presented for 500 ms. Participants were able to deliver a response for an additional 2000 ms after the picture was removed from the screen. Based on the responses, we identified the morphing step of the faces that was perceived ambiguously. We included five morphing steps in the main experiment: the most ambiguous face in the middle of the continuum, plus two steps in either direction away from the middle point (Figure 2B). We define faces of morphing step 6 as ambiguous faces, and all the others as less ambiguous, with morphing step 4 as the most male face and morphing step 8 as the most female face. The percentage of faces evaluated as males for ambiguous faces (step 6 in Figure 2B) was 47.87% (SD = 16.45), for the selected extremely male faces (step 4),

Prime Word Evaluation and Selection
The experiment consisted of two conditions. In the first (identity) condition, the labels "man" (English: man) and "vrouw" (English: woman) were used as primes. In the second (associative) condition, a set of gender-associated words such as "mascara" and "tie" were used as primes (see Supplementary Table S1.2). The words for this condition were preselected using the database from the "small world of words project" (De Deyne et al., 2013) and subsequently rated by naïve participants. In the rating experiment, the participants had to indicate for each word how related that word was to the words "male" or "female" on a 7-point scale (-3 = related to male; +3 = related to female). For half of the participants, female and male axes were swapped. Based on the rating outcomes, a selection of 40 words was made, which included 20 male-related and 20 female-related words. In addition, 20 words from the semantic category "furniture" (the neutral condition) and 20 catch words from diverse semantic categories were selected. The furniture and filler words were associated with neither the male nor the female categories according to the ratings. Male- and female-associated words were matched for word length, frequency per million, and concreteness (all p > 0.30). The catch trials were excluded from subsequent analyses. Frequencies for all words were extracted using the Subtlex corpus (Keuleers et al., 2010). The mean frequency, concreteness, and length of the materials used are indicated in Table 1.

Semantic Similarity
It could be argued that the proposed associative words in the stimulus list we used contain both associations ("tie") and identity-like words ("brother," "father"). Therefore, we sorted the initial associative set into associations and labels (see Supplementary Table S1.2). Further, to control for the semantic similarity between the prime words (identity: "man," labels: "father," associative: "tie," "dress") and target concept ("male" introduced by a male face or "female" introduced by a female face), we used the snaut tool (Mandera et al., 2017), which is based on word2vec representations (Mikolov et al., 2013). The word2vec model represents words' semantics as a vector of features, and the semantics of a certain word can be characterized by comparing the vector representations. The measure of semantic similarity we report here is cosine similarity, which has particular advantages over other measures such as Euclidean and Manhattan distance in cases where the vector magnitude does not matter. First, we calculated the semantic distance between each of the primes and the target concepts (associations: "beard" - "man," labels: "father" - "man," identity words: "man" - "man") in terms of cosine distance (see Supplementary Table S1.2). Next, we tested whether the proposed word sets (associations, labels, identity words) differed in semantic measures using Bayesian ANOVA, which accounts for the non-equal number of words per word group. The null hypothesis states that there is no difference between the conditions of interest, whereas the alternative posits that the conditions of interest are different. A Bayes factor (BF) is defined as the ratio between the evidence in favor of the alternative hypothesis (H1) over the evidence in favor of the null hypothesis (H0), denoted by the subscript 10 in the Bayes factor abbreviation BF10.
BFs estimate graded evidence in favor of or against the alternative hypothesis (Wagenmakers et al., 2018) and can be interpreted as follows: BF10 = 1-3 indicates "anecdotal" evidence for H1 compared to H0; BF10 = 3-10 indicates "moderate" evidence for H1 compared to H0; BF10 = 10-30 indicates "strong" evidence for H1 compared to H0; BF10 = 30-100 indicates "very strong" evidence for H1 compared to H0; BF10 > 100 indicates "extreme" evidence for H1 compared to H0. Bayesian ANOVA was carried out using JASP (JASP Team, 2018).
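Two pieces of the procedure above lend themselves to a short illustration: the cosine measure between word vectors and the verbal interpretation of BF10. The following is a minimal sketch under stated assumptions. The vectors are made-up toy values (not word2vec or snaut output), the function names are hypothetical, and the handling of values that fall exactly on a category edge (e.g., BF10 = 3) is an assumption, since the ranges in the text share their end points.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def cosine_distance(u, v):
    """Cosine distance, defined here as 1 minus the cosine similarity."""
    return 1.0 - cosine_similarity(u, v)


def euclidean_distance(u, v):
    """Euclidean distance, shown only for comparison with the cosine measure."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def bf10_label(bf10):
    """Verbal label for BF10, following the ranges listed in the text.

    Assumption: a value on a shared edge is assigned to the higher category,
    except 100, which stays in "very strong" because "extreme" requires > 100.
    """
    if bf10 < 1:
        return "evidence favors H0"
    if bf10 < 3:
        return "anecdotal"
    if bf10 < 10:
        return "moderate"
    if bf10 < 30:
        return "strong"
    if bf10 <= 100:
        return "very strong"
    return "extreme"


if __name__ == "__main__":
    prime = [0.2, 0.4, 0.1]            # toy vector, illustrative only
    scaled = [2 * x for x in prime]    # same direction, twice the magnitude
    print(cosine_distance(prime, scaled))     # approximately 0
    print(euclidean_distance(prime, scaled))  # clearly greater than 0
    print(bf10_label(12.5))
```

The comparison between the two distance functions shows the property that motivates the cosine measure: rescaling a vector leaves the cosine distance essentially unchanged, whereas the Euclidean distance grows with the difference in magnitude.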
The results of the analysis are presented in Figure 3. Words of different types, fully overlapping with the picture
FIGURE 3 | Semantic distance between prime words (labels, associations, and identity words) and target concept introduced by the face ("female"/"male") color-coded for type of prime word: red, labels ("father"/"daughter"); green, associations ("beard"/"skirt"); black, identity words ("man"/"woman").
Therefore, for the first set of analyses (see section "Congruency Analysis"), we collapsed labels and associations (the non-identity condition), especially in light of the fact that in the current paper we did not focus on the various types of similarity between the words but rather focused on the contrast between identity words and all other words that point to the same concept but have larger semantic distances.
However, since the difference between the labels and associations is statistically significant, we provide additional analyses excluding the labeling primes and thus only comparing associative and identity primes. These additional analyses are explained further below (see the "Behavioral Analysis" section, and section "Secondary Congruency Analysis": associative versus identity primes).

Procedure
Participants were seated in a comfortable chair in front of a computer screen in a sound-protected room. On each trial, a prime word was presented for 250 ms, after which a fixation cross remained on the screen for 300 ms. The visual target was presented for 500 ms, followed by a jittered inter-trial interval of 1500-3000 ms (Figure 4). The total number of trials was 800. The experiment was divided into two blocks (400 trials per block) with an optional break between the blocks. Overall, for the associative prime condition, there were 20 male-related, 20 female-related, and 20 neutral words. For identity words, we had the word "woman"/"man" presented 20 times each, and 20 neutral words (the same as we used in the associative condition). The words were presented with each morphing step (face identity was shuffled with no repetition), and each prime word was repeated five times. The trial order was randomized. The total duration of the experiment was approximately 90 min.
FIGURE 4 | Experimental design. A prime (label or word associated with the target concept) was presented for 250 ms, followed by a fixation cross (300 ms), after which the target picture was presented, followed by a jittered fixation cross for 1500-3000 ms. Participants had to indicate their decision on the gender of the presented face by button press.

Task
Participants were instructed to decide upon the gender (male or female) of the face based on the image presented and to respond with a keyboard button press (middle/index finger; the mapping of the response buttons was counterbalanced across participants). Participants had up to 2 s to respond after the onset of the picture and were instructed to skip the trials on which the prime belonged to the category "furniture." This "go/no-go" task ensured that participants read the prime word.

Behavioral Analysis
We investigated priming effects by separately analyzing measures of behavioral performance: reaction times (RT) and choice responses. We performed a 2 by 3 repeated-measures analysis of variance (rm ANOVA) with prime type (identity or associations) and congruency (congruent, incongruent, neutral) as factors. We preselected congruent (target: male face, prime "man"; target: female face, prime "female"), incongruent (target: male face, prime "female"; target: female face, prime "man"), and neutral (target: male face, prime "day"; target: female face, prime "day") word-target pairs, collapsing across very- and less-gendered morphing steps (steps 4, 5; steps 7, 8). An arcsin transformation was applied to the percentage of correct responses before entering it into the ANOVA. We performed post hoc comparisons for main effects using Holm correction. Morphing step 6 was excluded from all analyses since ambiguous faces can require a different configuration of the decision process in comparison to male- and female-gendered faces. Whereas faces of step 6 are marked by high uncertainty, faces of steps 4, 5, 7, and 8 are marked by low uncertainty. Given this potential difference and our focus on the potential differences among types of word primes, we excluded step 6 items to avoid potential confounds in terms of uncertainty. It could be argued that the associative condition introduced in this study may be viewed as a conjunction of "labels" and "associations," which would make the comparison between the two groups confounded (see section "Semantic Similarity"). Thus, in a secondary set of analyses, we repeated the 2 by 3 rm ANOVA on the associative words, omitting potentially confounded words in the non-identity condition [i.e., the 15 "label primes" (e.g., father), which left 25 associative words; see Figure 3 and Supplementary Table S1.2], using the Bayesian approach, which accounts for the unequal number of trials.
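The two preprocessing and correction steps named above can be illustrated with a minimal sketch. It assumes the common arcsine square-root form of the transformation (the text does not state which variant was used) and the standard Holm step-down procedure; the function names and the p-values in the usage lines are hypothetical and are not taken from the study or from JASP.

```python
import math


def arcsine_sqrt(proportion):
    """Arcsine square-root transform asin(sqrt(p)) for a proportion in [0, 1].

    Assumption: the exact variant used in the paper is not reported.
    """
    if not 0.0 <= proportion <= 1.0:
        raise ValueError("proportion must lie in [0, 1]")
    return math.asin(math.sqrt(proportion))


def holm_adjust(p_values):
    """Return Holm step-down adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The k-th smallest p-value is multiplied by (m - k + 1), capped at 1.
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Enforce monotonicity: adjusted values never decrease along the order.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


# Hypothetical usage: 92% correct responses and three illustrative p-values.
print(round(arcsine_sqrt(0.92), 3))
print(holm_adjust([0.01, 0.04, 0.03]))
```

The transform is applied per participant and condition before the ANOVA, whereas the Holm adjustment is applied to the family of post hoc comparisons afterwards.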
Moreover, it could be argued that facilitation/interference effects can be modulated by repetition of the materials (i.e., the identity primes are repeated more often than associative primes); therefore, we also performed an analysis of repetitions. First, we calculated the number of repetitions for the conditions of interest (congruent, incongruent, and neutral) separately for associative and identity words. We then calculated the average effect of interference (unrelated minus neutral) and facilitation (related minus neutral) for associations and identity words. Since the number of repetitions was unequal per condition for identity words (∼80 repetitions) and for associative words (∼50 repetitions), we calculated the facilitation/interference effect across the whole session in steps of 10 trials in order to better illustrate the differences between the identity and associative conditions. Further, we performed four ANOVAs (separately for interference and facilitation per associations and identity words) with the number of trial repetitions as the within-subject factor. Additionally, to investigate the effect of prime type and prime gender on face gender categorization, we performed a rm ANOVA (both RTs and responses) with target face (male or female), prime gender (male-related, female-related, or neutral), and prime type (association or identity) as within-subject factors. Finally, to investigate the role of the subjects' sex in gender categorization, we ran a rm ANOVA with target face (male or female), prime gender (male-related, female-related, or neutral), and prime type (association or identity) as within-subject factors and subject's sex (female, male) as a between-subjects factor. The analyses were performed using JASP (JASP Team, 2018).
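The repetition analysis can be sketched as follows. This is only an illustration under stated assumptions: trials are grouped into non-overlapping bins of 10 repetitions (the text does not say whether the steps were cumulative), the function name and the trial tuple layout are hypothetical, and the RT values in the usage lines are invented for demonstration.

```python
from statistics import mean


def priming_effects_by_bin(trials, bin_size=10):
    """Compute facilitation and interference per repetition bin.

    trials: iterable of (repetition_index, condition, rt_seconds), where
    repetition_index counts from 1 within each condition and condition is
    "congruent", "incongruent", or "neutral".
    Returns {bin_upper_edge: (facilitation, interference)} with
    facilitation = mean RT(congruent) - mean RT(neutral) and
    interference = mean RT(incongruent) - mean RT(neutral).
    """
    bins = {}
    for repetition, condition, rt in trials:
        edge = ((repetition - 1) // bin_size + 1) * bin_size
        bins.setdefault(edge, {}).setdefault(condition, []).append(rt)

    effects = {}
    for edge in sorted(bins):
        cells = bins[edge]
        # Skip bins that lack one of the three conditions.
        if not all(c in cells for c in ("congruent", "incongruent", "neutral")):
            continue
        neutral = mean(cells["neutral"])
        effects[edge] = (
            mean(cells["congruent"]) - neutral,
            mean(cells["incongruent"]) - neutral,
        )
    return effects


# Hypothetical data: interference shrinks after the first 10 repetitions.
demo = []
for rep in range(1, 21):
    early = rep <= 10
    demo.append((rep, "congruent", 0.50 if early else 0.52))
    demo.append((rep, "neutral", 0.55))
    demo.append((rep, "incongruent", 0.60 if early else 0.56))
print(priming_effects_by_bin(demo))
```

Negative facilitation values indicate faster responses after related primes than after neutral primes; positive interference values indicate slower responses after unrelated primes.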

Hierarchical Drift-Diffusion Model
In order to gain insights into the processing components underlying categorization in the identity and associative conditions, we analyzed choice reaction time data with the hierarchical DDM. The analysis was implemented in the Python toolbox HDDM 0.6.0 (Wiecki et al., 2013). One of the main advantages of the hierarchical Bayesian framework is that the simultaneous estimation of the model parameters at both the single-subject and group levels enhances statistical power since fewer trials are required to recover the parameters and the estimates are less susceptible to outliers (Wiecki et al., 2013), making it an appropriate analytic approach for the present study. Models with different combinations of free parameters were fitted to the data via Markov Chain Monte Carlo (MCMC) fitting routines. We used an accuracy coding scheme for the responses for congruent/incongruent/neutral prime-target pairs with the upper boundary reflecting a correct face categorization and the lower boundary an incorrect one. We defined the model space by allowing the parameters to vary freely over the factors of interest (congruency, prime type) of the experimental design (see Supplementary Table S2).
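To make the roles of the three parameters concrete, the following sketch simulates accuracy-coded drift-diffusion trials with a simple Euler scheme. It is not the HDDM fitting procedure used in the study; it only illustrates how drift rate (v), boundary separation (a), and non-decision time (t_er) shape accuracy and RT. The function names, the unbiased starting point, the unit noise scale, and all parameter values are assumptions chosen for illustration.

```python
import random


def simulate_ddm_trial(v, a, t_er, rng, z=0.5, dt=0.001, max_time=5.0):
    """Simulate one accuracy-coded trial.

    Evidence starts at z * a and drifts at rate v with unit-variance noise.
    Reaching the upper boundary (a) is a correct response, reaching the lower
    boundary (0) is an error. Returns (is_correct, rt) or None if neither
    boundary is reached within max_time seconds of decision time.
    """
    evidence = z * a
    elapsed = 0.0
    noise_sd = dt ** 0.5  # within-trial noise scaled to s = 1 (assumption)
    while elapsed < max_time:
        evidence += v * dt + rng.gauss(0.0, noise_sd)
        elapsed += dt
        if evidence >= a:
            return True, elapsed + t_er
        if evidence <= 0.0:
            return False, elapsed + t_er
    return None


def summarize(v, a, t_er, n_trials=300, seed=1):
    """Return (accuracy, mean RT) over simulated trials that finished."""
    rng = random.Random(seed)
    outcomes = [simulate_ddm_trial(v, a, t_er, rng) for _ in range(n_trials)]
    finished = [o for o in outcomes if o is not None]
    accuracy = sum(1 for correct, _ in finished if correct) / len(finished)
    mean_rt = sum(rt for _, rt in finished) / len(finished)
    return accuracy, mean_rt


# Hypothetical parameter values: a higher drift rate versus a lower one.
print(summarize(v=2.0, a=1.5, t_er=0.3))
print(summarize(v=0.5, a=1.5, t_er=0.3))
```

In this toy setting a higher drift rate yields both more correct responses and shorter RTs, whereas a change in t_er shifts RTs without affecting accuracy, which is why the two parameters can dissociate target-processing effects from non-decisional ones.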
For each model, we evaluated the rate of convergence of the numerical fitting routines and then the ability of the model to capture the observed RT distributions. Models that failed to reach convergence or failed to capture the observed RT distributions were excluded from further analyses. Finally, the remaining models were compared against each other by computing the relative Deviance Information Criterion (DIC), which is a measure of the goodness of the model fit to the data that penalizes for the complexity of the model (Schwarz, 1978). A rm ANOVA was then used to test for significant differences in the parameter estimates of the best-fitting model and to quantify the evidence in support of a given hypothesis.
Finally, to investigate whether the sex of subjects could modulate the cognitive processes related to gender categorization (e.g., female subjects could have had a general inclination toward categorizing a face as a female), we stratified the bias structure (z parameter) according to the subject sex in a separate set of models, this time using a stimulus coding scheme. We reasoned that females could be more accurate in face categorization in comparison to males, which would be reflected in different amounts of a priori information being available before each decision (the starting point could be closer to the female response boundary). To test for this hypothesis, we compared the following HDDM models: a model with no bias included (model 33), a model testing whether males and females have different a priori information guiding the categorical decisions (model 32), and the current winning model (model 30), which included bias with no gender differentiation. We report the results in section 2.0.
We further found that participants' performance varied depending on the type of the prime [main effect of prime type: F (1, 46) = 4.70, p = 0.035]. On average, identity words resulted in longer RTs in comparison to associative words (associative vs. identity: t = −2.16, p = 0.035, SE = 0.007, Mean difference = −0.014).

Analysis of Repetitions
Since there was intrinsically a different number of trials for associative vs. identity words, we performed a repetition analysis with the purpose of investigating the facilitation and interference effects as a function of the number of trials. The results are presented in Figure 6 for associative (Figure 6A) and identity (Figure 6B) words. To reiterate, we defined the facilitation effect as the difference between congruent and neutral prime-target pairs. The interference effect was defined as the difference between incongruent and neutral pairs. There were, on average, 80 repetitions for identity primes (for each of the congruent and incongruent conditions) and 50 repetitions for associative primes (for each of the congruent and incongruent conditions) per subject.
Within each of the conditions (associative or identity), we tested the effect of repetition separately for interference and facilitation effects. For the identity condition, we did not find a repetition effect for either facilitation (F (7, 322) = 0.96, p = 0.46) or interference (F (7, 322) = 0.91, p = 0.49). For the associative condition, repetition did not modulate the facilitation effect (F (4, 184) = 0.71, p = 0.58). However, the interference effect was affected by repetition (F (4, 184) = 4.39, p = 0.002). This was mainly driven by a larger interference effect for the first 10 trials. Specifically, analysis of repeated contrasts showed a significant difference only for 10 vs. 20 trials (t = 2.69, p = 0.008, SE = 0.006, Estimate = 0.016), but not for 20 vs. 30 trials (t = 0.79, p = 0.42, SE = 0.006, Estimate = 0.005) or for any other repeated contrast.
To sum up, the analyses investigating the effects of congruency and prime type showed consistent results, even when accounting for the potential confounds in the prime type. In particular, both identity words and associations resulted in a facilitation effect, but only identity words produced an interference effect. However, as shown in Figure 6A, the repetition of items did affect the interference effect for associative words, which was reduced after 20 trials.

Analysis of Prime Effects on Gender Categorization
To investigate the effect of prime type and prime gender on face gender categorization, we performed a 2 by 3 by 2 rm ANOVA (both RTs and responses) with target face (male or female), prime gender (male-related, female-related, or neutral), and prime type (association or identity) as within-subject factors. The results of this analysis are presented in Figure 7. For the RTs (Figures 7A,B), we found a main effect of target gender (male vs. female faces) [F (1, 46) = 7.27, p = 0.01], a main effect of semantic relatedness of the prime with the target gender (male/female/neutral prime) [F (2, 92) = 6.13, p = 0.003], and a main effect of prime type (identity or associative word) [F (1, 46) = 5.05, p = 0.02]. Most importantly, we found a three-way interaction between target face, prime gender, and prime type [F (2, 92) = 6.32, p = 0.003].
As for the male target faces, in the case of identity primes, participants were faster when the face was preceded by the male prime in comparison to the female prime but not in comparison to the neutral primes [identity (female vs. male): t (46) = 3.62, p < 0.001; identity (neutral vs. male): t (46) = 1.68, p = 0.05]. For male faces, we therefore did not find enough evidence in favor of facilitation or interference.
We present the results of the analysis focusing on the responses in Figure 7C. As for responses, we found a significant interaction between target face and prime gender [F (2, 92) = 18.92, p < 0.001]. In particular, participants were more accurate for female faces when preceded by female words in comparison to male words [female > male: t (46) = 4.45, p < 0.001] and neutral words [t (46) = 3.47, p = 0.001]. With regard to the male faces, participants were more accurate in classifying them when preceded by male words in comparison to female words [t (46) = 4.24, p < 0.001] and neutral words [t (46) = 2.99, p = 0.004].

Hierarchical Drift-Diffusion Modeling
Next, we conducted a drift-diffusion analysis using RTs and choice responses from the associative and identity conditions (following the analysis described in section "Secondary Congruency Analysis").

Model Convergence and Model Fit
FIGURE 6 | Interference and facilitation effects across trials for (A) associative words ("beard"/"skirt") and (B) identity ("man"/"woman") words. Error bars represent 95% confidence interval (CI); RT, response time.
FIGURE 7 | Mean reaction times for gender categorization of the target as a function of identity words ("man"/"woman") (A) and associations ("beard"/"skirt") (B). Percentage of correct responses (C). Error bars represent 95% confidence interval (CI); RT, response time.
For all of the analyses reported, the MCMC (Gelman and Rubin, 1992) fitting routines were run for 25,000 iterations with a burn-in period of 10,000 iterations and a thinning of 1. Model convergence was assessed by examination of the posterior samples and of the R-hat statistic, which is a measure of convergence among multiple MCMC chains (three for the present study). Posterior density estimates, which are stable over multiple samples, indicate that the fitting routines have converged to a fixed estimate. An R-hat statistic below 1.1 indicates that chains with different starting values have converged to the same posterior estimate. Successful convergence was confirmed by an MCMC error for all of the parameters smaller than 0.01. We further performed a comparison between observed and recovered RT distributions produced by the model (see Supplementary Figure S1). After assessing convergence, we carried out a quantitative comparison of alternative models by computing the associated DIC score for each model. DIC is a measure of the goodness of fit of the model to the data that is penalized for the complexity of the model, and therefore a model with a lower DIC score is to be preferred over an alternative model with a higher DIC score as the most parsimonious explanation of the data. Models that did not reach convergence were discarded and not included in the DIC comparisons.
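The R-hat criterion described above can be illustrated with a minimal sketch of the classic Gelman-Rubin potential scale reduction factor for a single parameter. The function name is hypothetical, the chains in the usage lines are simulated rather than taken from the fitted models, and the toolbox used in the study may compute the statistic with refinements not shown here.

```python
import random
from statistics import mean, variance


def gelman_rubin_rhat(chains):
    """Potential scale reduction factor (R-hat) for one parameter.

    chains: list of at least two equal-length lists of posterior samples.
    """
    n = len(chains[0])
    if len(chains) < 2 or any(len(chain) != n for chain in chains):
        raise ValueError("need at least two chains of equal length")
    chain_means = [mean(chain) for chain in chains]
    # W: average within-chain sample variance.
    within = mean([variance(chain) for chain in chains])
    # B: between-chain variance, scaled by the chain length.
    between = n * variance(chain_means)
    # Pooled estimate of the posterior variance.
    pooled = (n - 1) / n * within + between / n
    return (pooled / within) ** 0.5


# Simulated illustration: three chains sampling the same distribution
# versus three chains stuck around different values.
rng = random.Random(0)
mixed = [[rng.gauss(0.0, 1.0) for _ in range(2000)] for _ in range(3)]
stuck = [[rng.gauss(shift, 1.0) for _ in range(2000)] for shift in (0.0, 1.0, 2.0)]
print(gelman_rubin_rhat(mixed) < 1.1)
print(gelman_rubin_rhat(stuck) < 1.1)
```

Values close to 1 indicate that the between-chain variability is small relative to the within-chain variability, which is the sense in which the 1.1 cutoff is applied.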
Below, we report modeling results using behavioral data from congruency analysis (see section "Secondary Congruency Analysis") and analysis of prime effects on face categorization (from section "Analysis of Prime Effects on Gender Categorization"). For the congruency analysis, the model that best described the data (i.e., the model with the lowest DIC score) was the model with the following parameters estimated per subject: drift rate (v) free over congruency and prime type, threshold (A) free over prime type, and non-decision time (Ter) free to vary across congruency and prime type (see Supplementary Table S2 for details). Conventionally, a DIC difference of more than 10 indicates that the evidence in favor of the winning model is substantial (Burnham et al., 2002). Because the difference between the winning model (model 6, DIC −16601.7) and the second-best model (model 12, DIC −16588.0) exceeds 10 (13.7), we consider this evidence sufficient to select model 6 as the most parsimonious account of the data, and, therefore, further analyses focus on this model. For the analysis of prime effects on face categorization, the model that best described the data had the following parameters estimated per subject: drift rate (v) free over prime gender and target face, boundary separation (A) free over prime type and target face, and non-decision time (Ter) free over prime gender, prime type, and target face (see Supplementary Table S3 for details). Since the difference between the winning model (model 30, DIC −17677.7) and the second-best model (model 24, DIC −17627.3) exceeds 10, we consider this evidence satisfactory for selecting model 30 as the most parsimonious account of the data, and, therefore, further analyses focus on this model. Finally, to investigate whether the sex of subjects could modulate cognitive processes related to gender categorization, we stratified the bias structure according to the subject sex.
We reasoned that females could be more accurate in face categorization in comparison to males, which would be reflected in information available a priori. We found that both models with bias included (model 30A, DIC −17674.72, and model 30, DIC −17677.69) explain the data better in comparison to the model without bias included (model 30B, DIC −17382.4268). However, we did not find enough evidence to postulate that there is a difference in male and female bias settings (the DIC difference between model 30 and model 30A does not exceed 10). See a short summary in Supplementary Table S4.
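The model-selection rule applied above (lowest DIC wins, and a difference of more than 10 counts as substantial) amounts to simple arithmetic, sketched below with the DIC values reported in the text. The function name and the dictionary layout are hypothetical conveniences, not part of the original analysis pipeline.

```python
def compare_dic(dic_by_model, threshold=10.0):
    """Pick the model with the lowest DIC and report differences to the rest.

    Returns (best_model, deltas, decisive), where deltas maps every other
    model to its DIC minus the best DIC, and decisive is True only if all
    of those differences exceed the threshold.
    """
    best = min(dic_by_model, key=dic_by_model.get)
    deltas = {
        name: dic - dic_by_model[best]
        for name, dic in dic_by_model.items()
        if name != best
    }
    decisive = all(delta > threshold for delta in deltas.values())
    return best, deltas, decisive


# DIC values as reported in the text.
congruency = {"model 6": -16601.7, "model 12": -16588.0}
sex_bias = {"model 30": -17677.69, "model 30A": -17674.72, "model 30B": -17382.4268}

for label, models in (("congruency", congruency), ("sex bias", sex_bias)):
    best, deltas, decisive = compare_dic(models)
    rounded = {name: round(delta, 2) for name, delta in deltas.items()}
    print(label, best, rounded, decisive)
```

For the sex-stratified comparison the rule is not decisive, because the model with a sex-specific bias lies within 10 DIC points of the winning model even though both clearly outperform the model without bias.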

Congruency Analysis for HDDM Parameters
The results of the modeling analysis are summarized graphically in Figure 8.

Boundary separation
We did not find a congruency effect [F (1, 46) = 2.63, p = 0.11]. To sum up, we found a congruency effect for the drift rate (v) and the non-decisional parameter (Ter) but not for the decision boundary (A).

HDDM Representation of Priming Effects on Face Gender
The results of the modeling analysis are summarized graphically in Figure 9.

Drift rate
For drift rate, we found main effects of target face [F (1, 46) = 4.55, p = 0.038] and of prime gender [F (2, 92) = 12.70, p < 0.001]. We also found an interaction between target face and prime gender [F (2, 92) = 22.17, p < 0.001]. Results showed a higher drift rate for female faces preceded by female words in comparison to male words [t (46) = 5.96, p < 0.001] and in comparison to neutral words [t (46) = 6.03, p < 0.001]. As for the male faces, the results showed an increased drift rate for male faces with male words in comparison to male faces with neutral words [t (46) = 4.36, p < 0.001] and female words [t (46) = 4.18, p < 0.001].
We did not find any effects when including sex as a between-subjects factor.

Boundary separation
We did not find any significant effects. Inclusion of sex as a between-subjects factor did not lead to any significant effects either.

DISCUSSION
In this study, we investigated whether the different types of semantic relationships (Wentura and Degner, 2010; Mirman et al., 2017) (associations or identity words) result in facilitation and/or interference in response priming of a gender categorization task. Participants had to decide about the gender of presented faces after having seen a word prime. From the analysis of reaction times, we found that both identity (e.g., prime "man") and associative (e.g., prime "beard"/"father") words resulted in a facilitation effect (congruent vs. neutral), whereas only identity words resulted in interference (incongruent > neutral). We further combined RTs and choice responses within the analytical framework of the DDM with the purpose of investigating the cognitive processes underlying facilitation and/or interference. We found a facilitation effect in both associative and identity words that translated to modulations in drift rate and non-decisional time.

Congruency With the Target Words Facilitates Information Processing of the Target Picture
Words are one of the top-down factors (such as reward or task strategy) that influence perceptual decisions. Indeed, it has been shown that a larger reward for one of the response options or an increased likelihood of occurrence of one of two events shifts the starting point of evidence accumulation toward that particular response (Mulder et al., 2012). It has recently been proposed that language affects perception by setting predictive priors that sharpen perceptual representations (Simanova et al., 2016). Kinoshita et al. (2017) indeed showed that the brief presentation of a color word followed by the presentation of a color sign to be categorized resulted in a facilitation effect that translated to boundary separation and starting-point modulations. This prompted the idea that words do indeed affect perceptual decisions in a predictive fashion.
However, this could be attributable to the internal statistics of the experiment: words were predictive of the upcoming color of the target. In our experiment, we unambiguously show that when words are non-predictive of upcoming features of the target stimuli, modulations in neither threshold nor starting point are manifested. Instead, we found that words result in an increased drift rate, which we interpret in terms of faster target processing speed.
Our finding supports well-established facilitation effects of language on perceptual decisions. For example, it has been shown that language can speed up recognition of visually presented objects and increase perceptual sensitivity (d′), as has been demonstrated for the identification of facial expression (Carroll and Young, 2005) and for the detection of motion direction (Meteyard et al., 2007). Effects of language on visual perception have been demonstrated across different tasks and perceptual domains, including color categorization (Gilbert et al., 2006, 2008; Winawer et al., 2007) and face recognition (Landau et al., 2010; Anderson et al., 2014). Experimental studies in the language domain are prone to interpret perceptual sensitivity results in terms of perceptual advantage, i.e., language tapping into the low-level representations (Meteyard et al., 2007). However, the theoretical premises of the drift rate describe it as a post-encoding measure that does not reflect the low-level encoding (of a target picture) but rather reflects an intermediate processing stage between stimulus encoding and response execution.
Usually, in priming studies, both prime and target are words, and the modulations in drift rate are therefore interpreted in terms of increased spreading activation of a lexico-semantic nature (Voss et al., 2013). However, in cross-domain priming (prime: word, target: picture), the properties of the visual stimulus might change the nature of the process reflected by the drift rate. For example, it has been shown that the physical strength of the stimulus (i.e., intensity) was captured by a late event-related EEG potential, the centro-parietal positivity (CPP), that also tracked subjective perceptual experience (above the physically presented evidence) (Tagliabue et al., 2019). Another study showed that the rate of evidence accumulation is correlated with the P300 component, which scaled with target detection difficulty (Twomey et al., 2015) and indexed the duration of stimulus evaluation processes (Kutas et al., 1977; Duncan-Johnson, 1981).
Together, these studies suggest that the modulation of evidence accumulation can reflect a separate meta-process. This could be tested by examining, for example with EEG, whether the advantage in processing speed is due to lexico-semantic, visual, or meta-processing facilitation. One would expect a modulation of the N400 (see, for a review, Kutas and Federmeier, 2011) in case of a semantic advantage, a P300/CPP modulation in case of an attention advantage, or a P1 modulation in case of an early visual advantage (Boutonnet and Lupyan, 2015). Regardless of the exact nature of the drift modulation, we show that the priming effects modulate the informational processing of the picture. Clarifying which type of information is needed and, as a consequence, is reflected in the drift rate requires a combination of mathematical modeling and neuroimaging tools, which may be of use for future studies.

On the Processing of Associations and Identity Words
We found that associative and identity words differ in the magnitude of RTs: longer for identity words in comparison to associative ones (no difference was found in accuracy data). It is known that upon repeated presentations of an item, the chance of an error can be both diminished (repetition priming) and increased, resulting in cumulative semantic interference (Oppenheim et al., 2007). For example, in a continuous naming paradigm, subjects name pictures that belong to different categories, and the naming times increase linearly with the number of pictures belonging to that category. Interestingly, the repetition of an item produced the same cumulative interference effect as additional novel exemplars in the category (Navarrete et al., 2010). We show that in a semantic categorization task with cross-domain priming, the repetition of an identity prime-target pair resulted in longer RTs in comparison to the repetition of an associative prime-target pair, which is suggestive of cumulative semantic interference.
In spite of showing the differential effect in RTs for association and identity words, we showed that this difference cannot be attributed to the speed of evidence accumulation. Traditionally, semantic priming effects have been explained in terms of memory aliasing, a process that helps to integrate contextual linguistic information from the prime with the visual target (see the spreading-activation theory of semantic processing, Collins and Loftus, 1975, and the compound-cue account, Ratcliff et al., 1988). According to the spreading-activation theory, semantic memory can be seen as a network of interconnected nodes. If two nodes share semantic features, they are connected, and the semantic distance determines the strength of this connection. This theory predicts that identity primes (i.e., the words "man" and "woman") would lead to a greater accessibility of the target in memory in comparison to the associative primes. Here, however, we found that the drift rate does not change as a function of semantic distance, which indicates that the drift rate does not necessarily reflect lexico-semantic memory effects but rather reflects meta-cognitive processes (see the discussion of drift rate and meta-cognitive processes in section "Congruency With the Target Words Facilitates Information Processing of the Target Picture"). Previous experimental evidence suggests that briefly presented primes influence behavior via meta-cognitive fluency heuristics Williams, 1998, 2000). Fluency, or the meta-cognitive experience of the ease with which we process information, affects a wide variety of decisions (categorization: Oppenheimer and Frank, 2008; familiarity: Monin, 2003; and lexical decisions: Potter et al., 2018). In a general sense, decisions can be made not only on the basis of the content but also on the basis of the feeling of how easy it is to make a decision; in this sense, fluency operates as a heuristic that facilitates decision making (Schwarz, 2004).
In summary, we show that both identity and associative words increase the processing speed of the visual target. The fact that processing speed does not reflect differences in RT attributable to a semantic cumulative effect suggests that drift rate does not reflect lexico-semantic processes in this case.

Effects of Words on Processing Speed of Gendered Faces
The facilitation of informational target processing could reflect target-specific effects related to gender given the social nature of the stimuli. To investigate this aspect, we performed an additional analysis considering the effect of words separately for male and female faces. While a facilitation effect (neutral > female) for female faces was evident from both RTs and percentage of correct responses (in both identity and associative words), we found neither a facilitation nor an interference effect for male faces (RTs for identity words showed a trend toward facilitation, but this did not reach significance). In terms of cognitive modeling, we found increased drift rates indicating facilitation for both male (male > neutral) and female (female > neutral) targets. As for the non-decisional parameter Ter, we found facilitation for female targets. However, we found neither facilitation nor interference for male primes, only a general difference between male and female primes. The facilitation effect that we found in congruent vs. neutral prime-target pairs was not modulated by the gender of the target face. This suggested that both male- and female-related words introduced the same amount of "fluency" toward deciding on the target category of man or woman.
In social psychology, words or bigger chunks of verbal material are typically used to study social group effects in decision making (Bodenhausen, 1988; Kunda and Sherman-Williams, 1993). For example, when reading a description of a court case that described a defendant either as Latino or White, the Latino was considered more guilty in comparison to the White by the participants (Bodenhausen, 1988). The words with which we communicate refer to various "schemata" that people use in their perception (Bartlett, 1932). By "schemata" we mean a set of features/beliefs that subjectively reflect the social category of gender and subsequently influence our decisions. Put differently, language can introduce a top-down bias that can influence both visual perception (Simanova et al., 2016) and social decision making (Kunda and Sherman-Williams, 1993).
Studies investigating the effects of social category on perception propose two different mechanisms that can explain this bias: perceptual and executive accounts. In a weapon identification task (WIT), participants have been found to more often judge an object as a gun when the object is preceded by the face of a black person as opposed to the face of a white person. In the study by Payne et al. (2005), participants who had a chance to correct their response did almost perfectly on the task. It was proposed that the bias in performance was therefore due to the participants' incorrect initial response. If they had actually misperceived an object as a gun, then regardless of time limits, their answer would have stayed constant. This account was supported by a study by Amodio et al. (2004), who used the WIT to study stereotypic responses by examining the error-related negativity (ERN), an event-related brain potential known to reflect response conflict. They found an increased ERN response on the trials where black faces were followed by an object perceived as a gun, which suggested that participants were partially aware of the fact that they had made a mistake. Overall, on this account, the fact that people misidentify the object as a gun is explained by a failure of the executive control mechanism. On the other hand, Correll et al. (2015) showed that stereotypes can affect the speed of visual identification of the object. They used a first-person shooter task where participants saw black or white men holding either innocuous objects (e.g., a wallet) or guns and showed that the information about a gun was accumulated faster if a black man held it. To sum up, the "misperception" might have occurred for two different reasons: either because of the fluency of visual processing or because of the mechanisms of control.
In the field of language, a similar issue has been debated. Previous studies investigated whether semantic effects in response priming paradigms (e.g., the words "A" or "B" are used as primes, and the participants have to make an "A" or "B" decision on the target) can be explained by response facilitation at the motor stage and/or target processing facilitation (Voss et al., 2013). In other words, it has been proposed that a prime can pre-activate a certain motor response option, bypassing target processing (see the interference-facilitation account of linguistic priming effects in De Houwer et al., 2002; Musch and Klauer, 2003). In this view, the prime can either facilitate motor execution if the upcoming target is coherent with the prime or interfere with the execution if there is a prime-target mismatch. In the current study, we found that both the speed of evidence accumulation and the non-decisional time were affected by the primes. This suggests that the primes do not bypass the evaluation of the target but rather exert their influence on the target, thus ruling out the interference-facilitation account.
In conclusion, in this study, we found that both associations and identity words in response priming led to a facilitation of face gender categorization. This effect was mapped to both response and target processing: when related to the target, both prime types resulted in increased processing speed and faster motor response preparation. This result highlights the multidimensionality of the cognitive processes affected by language.

DATA AVAILABILITY STATEMENT
All datasets generated for this study are included in the article/Supplementary Material.

ETHICS STATEMENT
This study was carried out in accordance with the recommendations of CMO Arnhem-Nijmegen, Radboud University Medical Center ethical committee with written informed consent from all subjects. All subjects gave written informed consent in accordance with the Declaration of Helsinki. The protocol was approved by the CMO Arnhem-Nijmegen, Radboud University Medical Center.