ORIGINAL RESEARCH article

Front. Lang. Sci., 29 October 2025

Sec. Language Processing

Volume 4 - 2025 | https://doi.org/10.3389/flang.2025.1625213

What drives response time and accuracy in image naming? Moderators in the relationship between number of phonological neighbors and image naming performance


Naomi Hashimoto1*, Sabine Heuer2, Chi C. Cho3
  • 1Communication Sciences & Disorders Program, Eastern Michigan University, Ypsilanti, MI, United States
  • 2Program of Communication Sciences & Disorders, University of Wisconsin-Milwaukee, Milwaukee, WI, United States
  • 3Zilber College of Public Health, University of Wisconsin-Milwaukee, Milwaukee, WI, United States

Insights into phonological activation patterns during lexical retrieval have been gained from simple image naming and picture-word interference paradigm (PWIP) studies. Simple image naming studies allow for the manipulation of phonological variables, such as the number of phonological neighbors (NPN). PWIP studies allow for the manipulation of the relationship between a target and distractor, considering the effects of lexical co-activation. PWIP studies have reported a phonological facilitation effect when phonologically related stimuli are introduced during certain time-frames. We conducted a series of experiments in young, neurotypical adults using images that were validated across a number of measures known to affect naming performance. A simple image naming experiment (Experiment 1) was followed by two PWIP experiments, where the SOAs were set at +300 ms (Experiment 2) and +150 ms (Experiment 3) and images were paired with phonologically related or unrelated distractors. Across all experiments, we found that the effect of NPN was modulated by other variables such as age of acquisition and image familiarity. While a main effect of distractor type was obtained for the PWIP experiments, there was no interaction between NPN and distractor type. The findings highlight the complex nature of NPN and the subtle influences that NPN has on the picture naming process.

Introduction

Naming an object is an essential act of speech production that may appear to be relatively simple, but actually requires several different processes (Dell, 1986; Indefrey and Levelt, 2004; Levelt et al., 1999; Rapp and Goldrick, 2000). Initially, the speaker must visually recognize the image of the object to be named, such as a picture of a dog. Next, during the semantic processing stage, the speaker must retrieve semantic information, which might include features such as “furry”, “canine”, and so forth. After that, during phonological encoding, the phonological properties for the target, [dOg], are retrieved. The speaker must spell out the entry into individual phonemes ([d], [O], [g]), prepare a motor plan for articulation and, finally, pronounce the word. One of the key insights of speech production research is that words do not activate in isolation. Rather, in the course of producing a particular target word, other words in the lexicon activate too (e.g., Meyer and Schvaneveldt, 1971; Neely, 1977; Oberle and James, 2013). At the phonological stages of naming, activation of the word form corresponding to the meaning we wish to convey will result in co-activation of similar word forms. Therefore, activation of one word spreads to its related words.

The picture-word interference paradigm (PWIP) is a well-established tool designed to study co-activation patterns of lexical items during single word production (see Arrigoni et al., 2025; Korko et al., 2024, for recent reviews). In the PWIP, participants are told to ignore the word and concentrate on naming the image. Although primarily used with young adults in chronometric studies (e.g., Bürki and Madec, 2022; Damian and Martin, 1999; Rayner and Springer, 1986; Schriefers et al., 1990; Starreveld and La Heij, 1995, 1996) or neuromapping studies (e.g., Abel et al., 2009, 2012; De Zubicaray and Mcmahon, 2009; Diaz et al., 2014; Rizio et al., 2017; Sakreida et al., 2019), other populations have been examined, including older adults (Taylor and Burke, 2002), bilinguals (Roelofs et al., 2016; Sá-Leite et al., 2021), or adults and children with language impairments (Hashimoto and Thompson, 2010; Seiger-Gardner and Schwartz, 2008). Additionally, this effect has been reported across different languages, including Chinese (Bi et al., 2009; Qu et al., 2021), Dutch (Meyer, 1991; Meyer and Schriefers, 1991; Starreveld, 2000), English (De Zubicaray et al., 2002; Lupker, 1982), German (Jeschniak and Schriefers, 2001), and Italian (Pisoni et al., 2017). An important element of the PWIP is that it allows us to examine certain aspects of the cognitive architecture of the language production system, namely, the time course of language production processes. This is accomplished by manipulating when the word is presented relative to when the image is presented. Known as stimulus-onset-asynchrony (SOA), this manipulation makes it possible to track the effects that occur over the course of the word retrieval process. The type of distractor word paired with the image elicits different effects: naming response times are slowed if the word is semantically (categorically) related to the picture (e.g., web – NET) compared to if the word is unrelated to the picture (e.g., rabbit – NET). 
This effect is known as the semantic interference effect (see Arrigoni et al., 2025; Bürki et al., 2020; Korko et al., 2024, for recent reviews). Of relevance to this study is the emergence of another effect, the phonological facilitation effect (PFE), wherein participants name images faster when phonologically related segment-image pairs (e.g., /nε-/ – NET) are presented relative to phonologically unrelated segment-image pairs (e.g., /pi/ – NET). Typically, PFEs occur when the word is presented after the image (see Indefrey, 2011; Strijkers and Costa, 2011, for reviews).

The emergence of PFEs at late, positive SOAs is taken as an indication that the phonological segments facilitated the preparation of the target name at the right time. If the phonological segments were to be presented at much earlier, negative SOAs (e.g., SOA = −300 ms), there would be no activation because phonological encoding of the segments would have decayed by the time of image presentation. Thus, if the distractor occurs at a late SOA, such as +300 ms, when lexical selection is already complete, the activation of phonemes may facilitate the final stage of production, namely articulatory-motor planning. Alternatively, if the distractor occurs at an earlier SOA, such as +150 ms, activation of phonemes could facilitate the process of phonological encoding. Both SOA conditions should therefore produce PFEs since phonological encoding processes are occurring at a time when phonological processing is active, thereby leading to stronger activation of the target word compared to unrelated words (see Indefrey, 2011; Strijkers and Costa, 2011, for reviews). The manipulation of SOAs in the PWIP therefore allows us to examine how phonological processes unfold over time and perhaps pinpoint the time-frame for the activation of PFEs.

The phonologically based distractors used in picture-word interference studies share phonological segments with the image name. As an example, an image such as net would be paired with the segment /nε-/. The presented distractor, /nε-/, would activate a cohort of similar begin-related words (e.g., neck, nectar, nest), including the target, net, since all of these words share the distractor's component phonemes. While the PFE is well established when initial segments overlap between the distractor word and target picture name (e.g., Bi et al., 2009; De Zubicaray et al., 2002; Jeschniak and Schriefers, 2001; Lupker, 1982; Meyer, 1991; Meyer and Schriefers, 1991; Pisoni et al., 2017; Qu et al., 2021; Starreveld, 2000), no studies to date have examined this effect when using other similar word forms. One metric by which to characterize word form similarity is the number of phonological neighbors a target has. A phonological neighbor is a word that differs from the target by the substitution (and in some instances, the addition or deletion) of a single phoneme (Luce and Pisoni, 1998). As an example, the word net has neighbors such as pet, not, and neck, while the word judge has neighbors such as fudge and jut. Words differ from each other in the number of neighbors they have; thus, net has a relatively high phonological neighborhood density value (23 neighbors) compared to judge, which has a relatively low phonological neighborhood density value (four neighbors).
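The one-phoneme definition of a neighbor given above can be sketched in a few lines of Python. This is a minimal illustration, not the procedure used by CLEARPOND: the function names, the toy lexicon, and the phoneme transcriptions are all assumptions made for the example.

```python
def is_phonological_neighbor(a, b):
    """True if phoneme sequences a and b differ by exactly one
    substitution, addition, or deletion of a single phoneme."""
    a, b = list(a), list(b)
    if a == b:
        return False
    if len(a) == len(b):
        # Same length: exactly one position may differ (substitution).
        return sum(x != y for x, y in zip(a, b)) == 1
    if abs(len(a) - len(b)) != 1:
        return False
    longer, shorter = (a, b) if len(a) > len(b) else (b, a)
    # Lengths differ by one: dropping one phoneme from the longer
    # sequence must yield the shorter one (addition or deletion).
    return any(longer[:i] + longer[i + 1:] == shorter
               for i in range(len(longer)))


# Hypothetical toy lexicon; the transcriptions are illustrative only.
TOY_LEXICON = {
    "net": ("n", "ɛ", "t"),
    "pet": ("p", "ɛ", "t"),
    "not": ("n", "ɑ", "t"),
    "neck": ("n", "ɛ", "k"),
    "nest": ("n", "ɛ", "s", "t"),
    "judge": ("dʒ", "ʌ", "dʒ"),
    "fudge": ("f", "ʌ", "dʒ"),
    "jut": ("dʒ", "ʌ", "t"),
    "dog": ("d", "ɔ", "g"),
}


def neighbors(word, lexicon=TOY_LEXICON):
    """Return the sorted list of lexicon entries that are neighbors of word."""
    target = lexicon[word]
    return sorted(w for w, phones in lexicon.items()
                  if is_phonological_neighbor(target, phones))


print(neighbors("net"))    # neighbors of "net" within the toy lexicon
print(neighbors("judge"))  # neighbors of "judge" within the toy lexicon
```

NPN for a word is then simply the length of this list when the lexicon is a full phonemically transcribed corpus rather than the nine-word toy set used here.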

Studies that have explored phonological neighborhood density effects in English picture naming paradigms have not reported consistent results in terms of the presence and direction of an effect: studies have either reported facilitative effects of phonological neighborhood density for naming response times (RTs) and naming accuracy (Newman and Bernstein Ratner, 2007), facilitative effects for naming RTs but not for naming accuracy (Vitevitch, 2002), inhibitory effects on both naming RTs and naming accuracy (Vitevitch and Stamer, 2006), or inhibitory effects on naming accuracy only (Newman and German, 2005). Additionally, some studies have reported no effects (Gordon and Kurczek, 2014; Vitevitch et al., 2004). Therefore, the effects of phonological neighborhood on word retrieval in language production are still not clear. The PWIP may be able to help resolve some of these conflicting results by allowing us to simulate the word selection process while manipulating the relationship between picture name and phonologically similar distractors.

The range of effects reported for phonological neighborhood density in simple image naming tasks also highlights the complexity of phonological processes and their interaction with other variables such as image agreement, name agreement, age of acquisition (AoA), lexical frequency, and conceptual familiarity. Image agreement, which refers to how well the mental image of a concept aligns with the presented image, is thought to index early stages in the picture naming process, specifically the pre-linguistic object recognition stage (Alario et al., 2004; Perret and Bonin, 2019). Image agreement has been reported as a significant predictor of picture naming response time, indicating that pictures with higher agreement ratings are named faster than those with lower ratings (Alario et al., 2004; Barry et al., 1997; Bonin et al., 2002). A related variable, name agreement, refers to the degree to which participants agree on the name for an image and is measured by the number of different names elicited for a given image. This construct is localized at a pre-linguistic object recognition stage or the post-semantic stage, depending on whether the name represents an error or an alternate correct name, respectively (Barry et al., 1997; Vitkovitch and Tyrrell, 1995). Name agreement is a robust predictor of naming performance; images with high name agreement are named more quickly and accurately than images with low name agreement (e.g., Alario et al., 2004; Barry et al., 1997; Dell'Acqua et al., 2000; Lachman et al., 1974; Paivio et al., 1989; Snodgrass and Yuditsky, 1996; Vitkovitch and Tyrrell, 1995). Another important variable is AoA, which refers to the age at which a particular concept has been acquired. The AoA effect refers to the finding that earlier acquired words are named more quickly (Carroll and White, 1973). 
Subsequent reviews (Elsherif et al., 2023; Juhasz, 2005; Perret and Bonin, 2019) have consistently reported robust AoA effects in picture naming studies; specifically, naming RTs are significantly faster for earlier acquired words than later acquired ones. While several theories exist to explain the AoA effect, an integrated account of the AoA effect proposes a hybrid of these theories (Elsherif et al., 2023). According to the integrated account, the AoA effect is found because early acquired concepts have richer representations and connections with other concepts in the network compared to later acquired concepts. Later-acquired concepts must further fit into a network that has already well-established early-acquired concepts. Thus, the consolidation for the later-acquired concepts is not as strong as early-acquired concepts. This AoA effect becomes even more pronounced during tasks of arbitrary mapping between semantics and phonology, such as picture naming. Another frequently mentioned variable is conceptual familiarity, which refers to the degree to which a depicted concept (picture or drawing) is familiar to a participant. This variable influences the semantic processing stages, indicating the ease with which a conceptual representation is accessed. Conceptual familiarity effects refer to the fact that higher familiarity ratings will result in faster naming response times. Familiarity effects on naming performance are mixed; some studies report faster naming RTs for highly familiar concepts, while others have found no significant effects (see Alario et al., 2004, for a review). Finally, there is lexical frequency, which indicates how frequently a word is used in a given language. The word frequency effect refers to the fact that naming RTs will be faster for more frequent names. 
While a recent Bayesian meta-analysis (Perret and Bonin, 2019) found that the effects of lexical frequency were inconclusive, the effect has typically been found to be both reliable and replicable (e.g., Alario et al., 2004; Bates et al., 2003) and has broad support as an important determinant of naming response times and naming accuracy. When found, the lexical frequency effect is assumed to influence the phonological processing stages of naming (Barry et al., 1997).

The impact of these variables on naming performance has long been recognized when creating sets of pictures in English (Snodgrass and Vanderwart, 1980) and other languages (Duñabeitia et al., 2018). More recently, a Bayesian meta-analysis (Perret and Bonin, 2019) revealed that image agreement, name agreement, imageability, age of acquisition, and conceptual familiarity all had strong influences on naming response times. Moreover, subsequent studies have found interaction effects between phonological neighborhood density and phonological frequency (Hameau et al., 2021) as well as between name agreement, age of acquisition, and phonological neighborhood density (Karimi and Diaz, 2020).

Another source for the variability in results may stem from a simple, but often overlooked, issue: image standardization. While some studies have used images from a single standardized source (Hameau et al., 2021; Pisoni et al., 2017), others have not provided any description of their images (Laganaro et al., 2013), included images amassed from more than one source (Chan and Vitevitch, 2009; Middleton and Schwartz, 2010; Newman and Bernstein Ratner, 2007; Vitevitch, 2002; Vitevitch et al., 2004), or included a mixture of line drawings and photographs (Newman and Bernstein Ratner, 2007). Although some studies conducted off-line analyses of factors such as naming agreement or image complexity (Chan and Vitevitch, 2009; Laganaro et al., 2013; Pisoni et al., 2017; Vitevitch et al., 2004), these analyses do not necessarily capture the full range of visual variables that can affect image naming. This is of particular importance for studies that rely on reaction time data where influences could be subtle and more difficult to detect compared to a binary measure such as naming accuracy.

Current study

The aim of the study was to examine the PFE associated with number of phonological neighbors (NPN)1 on picture naming accuracy and RTs in young neurotypical adults. The first experiment examined the effects of NPN in a simple naming experiment. As a step toward resolving the mixed results reported for simple naming paradigms, some of which we believe are due to a lack of image standardization, we created a new set of images developed from a single source. Because controlling for all of the numerous factors known to influence naming performance would have severely constrained the list of stimuli, our strategy was to obtain ratings for these factors, which would then be considered later in the statistical analysis. We hypothesized a significant main effect of NPN, wherein better naming performance would be found for images with denser neighbors compared to images with sparser neighbors. Interaction effects were also expected between NPN and other variables known to influence naming RTs and naming accuracy; specifically, earlier acquired words, highly familiar items, and words with higher name and image agreements should produce faster naming RTs and higher naming accuracy (Karimi and Diaz, 2020; Perret and Bonin, 2019). The second and third experiments were PWIP studies in which we manipulated word distractor types, presented at SOAs of +300 ms (Experiment 2) and +150 ms (Experiment 3). The PWIP was used because it allowed us to manipulate distractor word types, and consequently, to examine the time-course of PFEs during lexical retrieval processing. Since no study to date has manipulated phonological neighbors in the context of the PWIP, we chose SOAs that have consistently elicited PFEs when using begin-related phonological segments (e.g., Bi et al., 2009; De Zubicaray et al., 2002; Jeschniak and Schriefers, 2001; Lupker, 1982; Meyer, 1991; Meyer and Schriefers, 1991; Pisoni et al., 2017; Qu et al., 2021; Starreveld, 2000). 
As is the case with existing literature on PFEs, we also hypothesized PFEs when manipulating NPN. Two SOAs were chosen to ensure that the time-frame of phonological activation was adequately covered. Specifically, we hypothesized a main effect of NPN wherein better naming performance in terms of naming RTs and naming accuracy rates would be found for images with denser neighbors compared to images with sparser neighbors. We also hypothesized a significant main effect for distractor type in that phonologically related distractors should lead to significantly faster naming RTs compared to unrelated distractors. Lastly, we hypothesized an interaction effect between NPN and distractor type whereby images with higher NPN paired with related distractors should be named significantly faster and more accurately than either higher NPN images paired with unrelated distractors or lower NPN images paired with related or unrelated distractors.

Method

Word stimuli

We selected 96 monosyllabic words to use across all experiments. Using the CLEARPOND database (Marian et al., 2012), the words were described in terms of NPN. While only monosyllabic words were used, the number of phonemes differed. Therefore, phoneme complexity, or the number of phonemes of each word, was also calculated. Length effects are indicative of phonological processes (Barry et al., 1997; Perret and Bonin, 2019). Some studies have reported that an increased number of phonemes leads to longer naming RTs in young adults while other studies have reported no effects (see Alario et al., 2004, for review). Word frequency indicates how frequently a word is used in a given language. The word frequency effect refers to the fact that naming RTs will be faster for more frequent names. While a recent Bayesian meta-analysis (Perret and Bonin, 2019) found that the effects of lexical frequency were inconclusive, the effect has typically been found to be both reliable and replicable (e.g., Alario et al., 2004; Bates et al., 2003) and has broad support as an important determinant of naming RTs and naming accuracy. Therefore, we included this variable using a language-specific corpus (Brysbaert and New, 2009). When found, the lexical frequency effect is assumed to influence the phonological processing stages of naming (Barry et al., 1997).

Image stimuli

Three graphic design artists were directed to create 96 black-and-white line drawings, one for each target word. Each artist was asked to create 32 images. The following parameters were predetermined and incorporated equally across all images: size; line weight; (absence of) color; orientation; viewpoint; depth cues and shading; luminance; and visual complexity. Size, orientation and level of detail across the depictions of these images were edited for the same level of visual complexity and consistency, as judged by one of the authors and the artists. Images were edited until consensus was achieved. Thus, the physical aspects of the images were consistently created. Figure 1 provides an example.

Figure 1
Illustration of a fishing net with a long handle and a diamond-patterned netting. The net is depicted in black and white. The handle is straight and extends to the left of the net.

Figure 1. An example of an image used across the three experiments.

Given the robust effects of AoA, name agreement, image agreement, familiarity, and visual complexity on picture naming, we included AoA ratings (Kuperman et al., 2012) and obtained normative ratings for the other variables (see Perret and Bonin, 2019, for review). This was a particularly important step given that we were using newly created images for our naming experiments. The participants, procedures, instructions, and results of normative data collection are detailed in the Supplementary material - Validation of Images. Table 1 displays a summary of ratings across the various variables. Supplementary material – Image Stimuli provides the lists of images used in Experiment 1 and image-distractor word pairings used in Experiments 2 and 3.

Table 1. Descriptive statistics–normative ratings of images.

Experiment 1 – image naming

Experiment 1 was a simple naming experiment. Its primary purpose was to test a newly developed set of standardized images and to replicate previous studies using phonological neighbors as a variable. On each trial, participants saw a single image and named it as quickly as possible. The variable of interest was the NPN of the image label, which ranged from a high number of neighbors (e.g., net has 24 neighbors) to a low number of neighbors (e.g., judge has 4 neighbors).

Participants

Forty-eight participants (age: M = 21.91 years; SD = 2.17) provided informed consent to take part in the study (project number 18.051, granted 11.03.2017). They met the following inclusionary criteria: (a) age between 18 and 35 years; (b) English as a native or primary language; (c) normal or corrected-to-normal vision with contacts or glasses; (d) adequate hearing acuity by self-report. Exclusionary criteria included past or current language or cognitive impairments.

Procedure

Experiment 1 consisted of two phases. During the familiarization phase, participants were seated in front of a computer screen to view all images while the experimenter named each image. This procedure ensured that participants recognized each image and knew its label. During the testing phase, images were presented in a randomized order and participants were asked to name each image as accurately and as quickly as possible. Each participant saw all 96 images. E-Prime 2.0 was used to present the stimuli. Chronos, a voice-activated response recording device (Psychology Software Tools) was used to record participants' naming RTs. All responses were provided in English. The entire experimental session took approximately 15–25 min.

Data analyses

The outcome of interest for each trial of an experiment was the response time to correctly identify the image, commonly referred to as a “time-to-event” outcome, which requires a specific type of statistical method called survival analysis (Altman and Bland, 1998). In our experiments, we were interested in accuracy as well as naming RTs. Since participants may make errors in identifying the image, simply taking the response time as the outcome would have been inappropriate because it was possible to have a fast RT with an erroneous response. Conversely, excluding erroneous responses and using only the RTs from correct responses would have reduced the sample size and potentially led to selection bias. A survival analysis method called Cox's proportional hazards (PH) model was used to analyze the time needed to correctly name an image, and hazard ratios (HRs) were estimated from the model to summarize the difference in the risk of the event (i.e., correct identification of the image) between groups (Harre et al., 1988; Machin et al., 2006). In addition, Kaplan-Meier curves were created to illustrate the probability of a correct response over time, with greater separation between curves (i.e., lines) indicating a greater difference in the probability of correct identification of the image (Rich et al., 2010).
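To make the time-to-event framing concrete, the following is a minimal pure-Python sketch of a Kaplan-Meier estimate of the cumulative probability of a correct response. It assumes that an erroneous response is treated as a censored observation at its RT; the text does not state how errors entered the model, so this is an assumption, as are the function name and the toy trial data.

```python
from collections import Counter


def cumulative_correct_curve(trials):
    """Kaplan-Meier estimate of P(correct response by time t).

    trials: list of (rt_ms, correct) pairs. A correct response is the
    event; an incorrect response is treated as censored at its RT
    (an assumption made for this sketch).
    Returns a list of (rt_ms, cumulative_probability) steps.
    """
    events = Counter(rt for rt, correct in trials if correct)
    censored = Counter(rt for rt, correct in trials if not correct)
    at_risk = len(trials)
    survival = 1.0  # probability that no correct response has occurred yet
    curve = []
    for t in sorted(set(events) | set(censored)):
        d = events.get(t, 0)
        if d:
            survival *= 1 - d / at_risk
            curve.append((t, 1 - survival))
        # Both events and censored trials leave the risk set after time t.
        at_risk -= d + censored.get(t, 0)
    return curve


# Hypothetical trials: (RT in ms, whether the name was correct).
toy_trials = [(600, True), (700, True), (700, False),
              (900, True), (1200, False)]

for rt, prob in cumulative_correct_curve(toy_trials):
    print(f"{rt} ms: {prob:.2f}")
```

Under this treatment every trial contributes to the risk set up to its RT, so fast errors are neither counted as fast successes nor discarded, which is the trade-off described above.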

A total of 4,688 trials were included in the analyses. Data were excluded due to equipment errors (3.69%) and RTs that were less than 500 ms (1.13%), resulting in a total of 4.82% of data that were excluded.

The independent factors considered for Experiment 1 included: NPN (range: 3–39); word frequency (natural-log transformed range: 0.77–4.97); phoneme complexity, indexed as number of phonemes (range: 2–5); image familiarity, indexed as ratings from the normative sample (range: 2.59–5.0); and AoA (range: 2.5–12.5). Data were collected at two sites, and site was used as a control variable. To test the hypotheses for Experiment 1, the main effects and the two-way interactions between NPN and the other factors were examined in the model. All analyses were completed using SAS 9.4 (Cary, NC).
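For readers less familiar with survival models, a Cox PH model with these factors can be written out as follows. This is a sketch of the general form implied by the description above, not the exact specification fitted in SAS; the coefficient labels are introduced here for illustration, and any handling of repeated measures within participants is not shown because it is not described in the text.

```latex
h(t \mid \mathbf{x}) = h_0(t)\,\exp\!\Big(
    \beta_1\,\mathrm{NPN} + \beta_2\,\mathrm{Freq} + \beta_3\,\mathrm{Phon}
  + \beta_4\,\mathrm{Fam} + \beta_5\,\mathrm{AoA} + \beta_6\,\mathrm{Site}
  + \gamma_1\,\mathrm{NPN}\times\mathrm{Freq}
  + \gamma_2\,\mathrm{NPN}\times\mathrm{Phon}
  + \gamma_3\,\mathrm{NPN}\times\mathrm{Fam}
  + \gamma_4\,\mathrm{NPN}\times\mathrm{AoA}
\Big)
```

Here $h_0(t)$ is the unspecified baseline hazard and the event is a correct naming response, so a hazard ratio above 1 corresponds to correct names being produced sooner. With an interaction in the model, the hazard ratio for one additional neighbor is no longer a single number: holding frequency, phoneme count, and AoA at zero (or with the variables centered), it is $\exp(\beta_1 + \gamma_3\,\mathrm{Fam})$ at a given familiarity rating, which is why the NPN effect can be negligible for highly familiar images and pronounced for less familiar ones.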

Results

Overall, 91.88% of images were correctly identified (M = 1,019.03 ms; SD = 579.11 ms). See Table 2 for the descriptive statistics. A significant NPN × Image Familiarity interaction (p < 0.0001) was found. While the time needed to correctly name an image generally decreased with greater image familiarity, NPN contributed significantly to how fast an image was correctly identified. Specifically, for images with high familiarity ratings (i.e., ratings greater than 4), there was no significant difference in the time needed to correctly name an image across levels of NPN. However, images with relatively low familiarity ratings (i.e., ratings less than 4) took significantly longer to be named correctly when they had higher NPN. See Figure 2 for an illustration of the NPN × Image Familiarity interaction effect. The results also revealed a significant NPN × AoA interaction (p = 0.0278). Generally, images whose concepts were acquired later in childhood took significantly longer to be named correctly. The significant interaction effect, however, indicated that for words acquired in early childhood, NPN had minimal effects, while for words acquired in later childhood, NPN had a significant effect: images took significantly longer to be named correctly when NPN was higher. See Figure 3 for the NPN × AoA interaction effect. Finally, there was a significant phoneme complexity effect (p < 0.0001); words with fewer phonemes took significantly less time to be named correctly.

Table 2. Descriptive statistics – response time (RT) in milliseconds and accuracy data (%) for experiments 1, 2, and 3.

Figure 2
Three-panel line graph showing cumulative correct response percentage versus reaction time in milliseconds. Panels represent image familiarity levels 3, 4, and 5. Four lines in each panel indicate phonological neighbors: blue line for 4, red dashed for 14, green dashed for 24, and black for 34. Reaction time varies across familiarity levels, with response times significantly longer for image names with higher NPN.

Figure 2. The panel headings indicate image familiarity ratings from low (3) to high (5). The x-axis indicates response time in milliseconds for each panel. The y-axis indicates the cumulative percentage of correct responses. The curves within each panel indicate the response times for images of varying number of phonological neighbors (NPN). In general, time to correctly name an image decreased with increased image familiarity. For highly familiar images, there was no significant NPN effect on response time. The less familiar the images, the greater the differences in response time became for images of varying NPN, and response times were significantly longer for image names with higher NPN (p < 0.0001).

Figure 3
Four cumulative distribution graphs show the cumulative correct response percentage over reaction time in milliseconds for acquisition ages of 3, 6, 9, and 12. Each graph displays different lines representing the number of phonological neighbors: 4, 14, 24, and 34. Each panel suggests that older acquisition ages and more phonological neighbors correspond to slightly longer reaction times.

Figure 3. The panel headings indicate age of acquisition (AoA) for the image name from early-acquired age (3) to later-acquired age (12). The x-axis indicates response time in milliseconds for each panel. The y-axis indicates the cumulative percentage of correct responses. The curves within each panel indicate the response times for images of varying number of phonological neighbors (NPN). In general, time to correctly name an image increased with increased AoA. For images with early AoA, there was no significant NPN effect on response time. For image names with late AoA, NPN had a significant effect on response time: response times were significantly longer for image names with higher NPN (p = 0.0278). Moreover, the later the AoA, the greater the impact of NPN.

Experiments 2 and 3: naming with an auditory distractor at SOAs of +300 and +150 ms

The visual stimuli were the same images used in Experiment 1. However, each image was accompanied by a spoken distractor that began 300 ms (Experiment 2) or 150 ms (Experiment 3) after image onset. The distractor was either phonologically related (e.g., neck – net) or unrelated to the target (e.g., wedge – net).

Participants

For Experiments 2 and 3, 41 college-age participants (M = 21.85; SD = 3.22) and 46 college-age participants (M = 21.85; SD = 3.22) were recruited, respectively. They met the same inclusionary criteria described for Experiment 1 and provided informed consent to be part of the study (project number UHSRC-FY-19-20-39, granted 08.21.19). None had participated in any of the other experiments.

Stimuli

In addition to the same 96 images, we selected 192 distractor words (96 target images x 2 distractor types). That is, for each target word, we selected two monosyllabic distractor words, one that was phonologically and semantically unrelated to the target (e.g., wedge for target net), and one that was a phonological neighbor of the target (e.g., neck for target net). The 96 phonologically-related distractors were evenly distributed among C1 substitutions (e.g., shop for target mop), vowel substitutions (e.g., bolt for target belt), and C2 substitutions (e.g., neck for target net). Additionally, semantic relatedness (e.g., categorical, subordinate, superordinate, associative, coordinate) between related and unrelated distractors was avoided. See Table 3 for descriptive data by distractor type.

Table 3. Descriptive statistics – distractor words.

Lists

Two lists were developed. In List A, half of the 96 target words were paired with their unrelated distractor. In List B, the other half were paired with their phonologically-related distractor.

Procedures

All participants were randomly assigned to either List A or List B. During the familiarization phase, participants were seated in front of a computer screen to view all images while the experimenter named each image. During the testing phase, images were paired with auditory distractors. Each trial consisted of the following: first, a fixation point was displayed in the center of the screen; second, the target image was presented; third, the auditory distractor was presented 300 or 150 ms after image presentation. Participants were asked to name each image as accurately and as quickly as possible. E-Prime 2.0 was used to present the stimuli, and Chronos (Psychology Software Tools) was used to record participants' responses and RTs. All responses were provided in English. The entire experimental session took approximately 15–25 min.
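The trial sequence described above can be laid out as a simple event schedule relative to image onset. The sketch below is illustrative only: the function name is invented for the example, and the fixation duration is a hypothetical placeholder because the text does not report it. Only the two SOA values come from the experiments.

```python
def trial_timeline(soa_ms, fixation_ms=500):
    """Return (onset_ms, event) pairs for one trial, with image onset at 0.

    soa_ms: distractor onset relative to image onset (positive = after).
    fixation_ms: hypothetical fixation duration; not reported in the text.
    """
    events = [
        (-fixation_ms, "fixation point"),
        (0, "target image"),
        (soa_ms, "auditory distractor"),
    ]
    return sorted(events)


# SOAs used in Experiment 2 (+300 ms) and Experiment 3 (+150 ms).
for label, soa in (("Experiment 2", 300), ("Experiment 3", 150)):
    print(label, trial_timeline(soa))
```

Naming RT is measured from image onset (time 0 in this schedule), so a distractor at a positive SOA arrives while the naming response is already being prepared.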

Data analyses for Experiments 2 and 3

The analyses for Experiments 2 and 3 were conducted using a Cox's PH model similar to the one used in Experiment 1. However, to address the specific hypotheses for Experiments 2 and 3, the main effect of NPN was examined along with Distractor Type (Neighbor/Unrelated). In addition, the model explicitly tested the NPN × Distractor Type and NPN × AoA interaction effects.
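As a rough sketch of what such a model looks like, the proportional hazards form can be written as follows. The symbols are assumptions introduced here for illustration: D is an indicator for a phonologically related distractor, and z collects the remaining covariates (e.g., image familiarity, frequency, number of phonemes) with coefficient vector γ. The exact covariate set and coding used by the authors are not specified in this section.

```latex
h(t \mid \mathbf{x}) = h_0(t)\,
  \exp\!\bigl(
    \beta_1\,\mathrm{NPN}
    + \beta_2\,D
    + \beta_3\,(\mathrm{NPN} \times D)
    + \beta_4\,\mathrm{AoA}
    + \beta_5\,(\mathrm{NPN} \times \mathrm{AoA})
    + \boldsymbol{\gamma}^{\top}\mathbf{z}
  \bigr)
```

Here the event is a correct naming response at time t and h_0(t) is the unspecified baseline hazard. Under this reading, exp(β) greater than 1 corresponds to a higher instantaneous rate of correct naming (i.e., faster correct responses), and a test of β_3 corresponds to the NPN × Distractor Type interaction.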

A total of 3,937 trials were included for Experiment 2. Data were excluded due to equipment error (17.96%) and RTs of less than 500 ms (1.75%), resulting in a total of 19.72% of the data being excluded. A total of 4,416 trials were included for Experiment 3. Data were excluded due to equipment error (1.15%) and RTs of less than 500 ms (2.58%), resulting in a total of 3.74% of the data being excluded.
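The exclusion bookkeeping described above can be sketched in a few lines. This is an illustrative sketch only: the `Trial` record, its `equipment_error` flag, and the function name `exclusion_summary` are hypothetical, since the article does not describe how trials were logged. Only the 500 ms cutoff comes from the text.

```python
from dataclasses import dataclass

RT_FLOOR_MS = 500  # lower RT cutoff reported in the article


@dataclass
class Trial:
    rt_ms: float
    equipment_error: bool = False  # hypothetical flag, assumed for illustration


def exclusion_summary(trials):
    """Return exclusion percentages and the trials kept for analysis."""
    total = len(trials)
    equipment = [t for t in trials if t.equipment_error]
    # Count a trial as "too fast" only if it was not already an equipment error.
    too_fast = [t for t in trials
                if not t.equipment_error and t.rt_ms < RT_FLOOR_MS]
    kept = [t for t in trials
            if not t.equipment_error and t.rt_ms >= RT_FLOOR_MS]

    def pct(n):
        return round(100 * n / total, 2) if total else 0.0

    return {
        "equipment_pct": pct(len(equipment)),
        "too_fast_pct": pct(len(too_fast)),
        # Computed from the raw count, not by summing rounded percentages.
        "excluded_pct": pct(len(equipment) + len(too_fast)),
        "kept": kept,
    }
```

Because the total is computed from raw counts, it can differ in the last digit from the sum of the two rounded component percentages.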

Results – Experiment 2

Overall, 91.46% of images in Experiment 2 were correctly identified (M = 889.12 ms; SD = 259.55 ms). See Table 2 for the descriptive statistics. The NPN × Distractor Type interaction was not significant (p = 0.4586). However, a significant main effect of Distractor Type was obtained (p = 0.0211): A higher percentage of images paired with related distractors (92.53%) were correctly named relative to images paired with unrelated distractors (90.30%). Correspondingly, the time needed to correctly name images paired with related distractors was significantly shorter than for images paired with unrelated distractors. A significant NPN × Image Familiarity interaction effect (p = 0.0002) was also obtained, replicating the results described in Experiment 1: Images with lower familiarity ratings and high NPN required significantly longer times to be named. Please refer to Figure 4 for a graphical illustration of this interaction effect. The NPN × AoA interaction effect was also significant (p = 0.0487) and replicated the effects described for Experiment 1: NPN had minimal effects on words acquired in early childhood, whereas for words acquired in later childhood, the time to correctly name an image was significantly longer when NPN was higher. Please refer to Figure 5 for a graphical illustration of this interaction effect. Additionally, significant frequency effects (p = 0.0393) and phoneme complexity effects (p = 0.0030) were obtained, such that more frequent words and words with fewer phonemes resulted in faster times to correctly name an image.

Figure 4. Cumulative correct response curves by image familiarity. The panel headings indicate image familiarity ratings from low (3) to high (5). The x-axis indicates response time in milliseconds for each panel. The y-axis indicates the cumulative percentage of correct responses. The curves within each panel indicate the response times for images with 4 (blue), 14 (red), 24 (green), and 34 (black) phonological neighbors (NPN). In general, time to correctly name an image decreased with increased image familiarity. For highly familiar images, there was no significant NPN effect on response time. The less familiar the images, the greater the differences in response time became for images of varying NPN, and response times were significantly longer for image names with higher NPN (p = 0.0002).

Figure 5. Cumulative correct response curves by age of acquisition. The panel headings indicate age of acquisition (AoA) for the image name from early-acquired age (3) to later-acquired age (12), with panels at 3, 6, 9, and 12. The x-axis indicates response time in milliseconds for each panel. The y-axis indicates the cumulative percentage of correct responses. The curves within each panel indicate the response times for images with 4 (blue), 14 (red), 24 (green), and 34 (black) phonological neighbors (NPN). In general, time to correctly name an image increased with increased AoA. For images with early AoA, there was no significant NPN effect on response time. For image names with late AoA, NPN had a significant effect on response time: response times were significantly longer for image names with higher NPN (p = 0.0487).

Results – Experiment 3

Overall, 93.56% of images in Experiment 3 were correctly identified (M = 1,294.94 ms; SD = 810.76 ms). See Table 2 for the descriptive statistics. Findings from Experiment 3 were similar to those found in Experiment 2. An NPN × Distractor Type interaction effect was not obtained (p = 0.5002). Rather, a significant main effect of Distractor Type was found (p = 0.0006). Specifically, a higher percentage of images paired with related distractors (94.86%) were correctly named relative to images paired with unrelated distractors (92.27%). Correspondingly, the time to correctly name images paired with related distractors was significantly shorter than for images paired with unrelated distractors. A significant NPN × Image Familiarity interaction effect (p = 0.0002) and a significant NPN × AoA interaction effect (p = 0.0015) were found again. See Figures 6 and 7 for graphical illustrations of these effects. Finally, a significant phoneme complexity effect was again found (p < 0.0001).

Figure 6. Cumulative correct response curves by image familiarity. The panel headings indicate image familiarity ratings from low (3) to high (5). The x-axis indicates response time in milliseconds for each panel. The y-axis indicates the cumulative percentage of correct responses. The curves within each panel indicate the response times for images with 4 (blue), 14 (red), 24 (green), and 34 (black) phonological neighbors (NPN). In general, time to correctly name an image decreased with increased image familiarity. For highly familiar images, there was no significant NPN effect on response time. The less familiar the images, the greater the differences in response time became for images of varying NPN, and response times were significantly longer for image names with higher NPN (p = 0.0002).

Figure 7. Cumulative correct response curves by age of acquisition. The panel headings indicate age of acquisition (AoA) for the image name from early-acquired age (3) to later-acquired age (12), with panels at 3, 6, 9, and 12. The x-axis indicates response time in milliseconds for each panel. The y-axis indicates the cumulative percentage of correct responses. The curves within each panel indicate the response times for images with 4, 14, 24, and 34 phonological neighbors (NPN). In general, time to correctly name an image increased with increased AoA. For images with early AoA, there was no significant NPN effect on response time. For image names with late AoA, NPN had a significant effect on response time: response times were significantly longer for image names with higher NPN (p = 0.0015). Moreover, the later the AoA, the greater the impact of NPN.

Discussion

The aim of the study was to examine the PFEs associated with PWIPs when we manipulated the NPN during picture naming in young neurotypical adults. While the first experiment involved a simple naming experiment, the second and third experiments utilized a PWIP in which word distractor types, either phonologically related or unrelated to the image, were presented at 300 ms (Experiment 2) or 150 ms (Experiment 3) after image presentation.

In both PWIP experiments, we observed a PFE, which confirmed our prediction that images paired with phonologically related distractors would be named significantly faster than images paired with unrelated distractors. Our finding of whole-word facilitation adds to the PWIP literature, which reports PFEs for phonological segments (e.g., Bi et al., 2009; De Zubicaray et al., 2002; Jeschniak and Schriefers, 2001; Lupker, 1982; Meyer, 1991; Meyer and Schriefers, 1991; Pisoni et al., 2017; Qu et al., 2021; Starreveld, 2000). The significant distractor type effect suggests that words co-activate other phonologically related words, thus allowing for stronger convergence onto the targeted image name. The PFE therefore appears to be a robust phenomenon that occurs not only at the phonological segment level, as reported in the studies cited above, but at the lexical (word) level as well.

The hypothesized significant interaction between NPN and distractor type was not, however, found in Experiments 2 and 3. Findings from English picture naming paradigms have reported facilitative effects of phonological neighborhood density for naming RTs and/or naming accuracy (Newman and Bernstein Ratner, 2007; Vitevitch, 2002), inhibitory effects on naming RTs and/or naming accuracy (Newman and German, 2005; Vitevitch and Stamer, 2006), or no effects (Gordon and Kurczek, 2014; Vitevitch et al., 2004). These mixed results had been thought to be due to variables known to influence naming performance (e.g., AoA, name agreement, lexical frequency) but which had not been carefully controlled (e.g., Hameau et al., 2021; Karimi and Diaz, 2020; Perret and Bonin, 2019). However, those variables were carefully considered in this study, so the lack of a significant main effect of NPN or of a significant interaction between NPN and distractor type is not completely understood. These findings underscore the fact that NPN is a construct which exerts subtle yet complex influences on the picture naming process.

The emergence of an NPN effect only in the context of other variables is exemplified by the significant interaction effects we obtained across the three experiments, a finding which has also been reported by others (e.g., Karimi and Diaz, 2020). In our study, the time to correctly name an image was shorter for highly familiar items compared to low familiar items (as indexed by familiarity ratings). This was presumably because highly familiar items activated a strong semantic network structure (i.e., a highly inter-connected network with strong representations) relative to low familiar items (Alario et al., 2004; Ellis and Morrison, 1998; Snodgrass and Yuditsky, 1996). However, the more interesting finding was the significant interaction obtained between NPN and familiarity: the time needed to correctly name highly familiar images was not influenced by NPN, while the time needed to correctly name low familiar images was significantly influenced by NPN, with longer RTs for images with higher NPN. In other words, an NPN effect appeared only for images with low familiarity ratings and the effect was opposite to the hypothesized direction. Thus, it appears that highly familiar target images possessed stable semantic network activation such that the co-activation of the image's phonological neighbors did not impact naming performance. In contrast, for low familiar images, the semantic network activation was not as strong or as stable as for highly familiar images; therefore, it could be that co-activation of many phonological neighbors created competitive processes that negatively impacted naming performance.

Across both PWIP experiments, we also observed a robust AoA effect. According to the integrated theory (see Elsherif et al., 2023, for review), AoA effects occur because early acquired concepts establish stronger, richer connections with other concepts in the network compared to later acquired concepts. These AoA effects become even more apparent when using tasks such as picture naming because the arbitrary mappings between semantics and phonology produce even greater naming latencies compared to other tasks such as word naming or lexical decision tasks. In the current study, images whose concepts were acquired later in childhood required a longer time to be correctly named than images whose concepts were acquired earlier in childhood. These findings were consistent with the AoA literature (see Elsherif et al., 2023; Juhasz, 2005; Perret and Bonin, 2019, for reviews). The significant interaction effect between NPN and AoA was in line with our predictions that NPN could interact with other variables such as AoA (Karimi and Diaz, 2020). The time needed to name images whose concepts were acquired early in childhood was not affected by the image's NPN, whereas images whose concepts were acquired later in childhood took longer to be named correctly if the image's NPN was higher. This inhibitory effect of NPN in the context of later-acquired AoA concepts was also observed in a meta-analysis of simple image naming studies by Karimi and Diaz (2020) who, like us, observed no main effect of phonological neighborhood density but did observe a significant interaction effect with AoA. In our study specifically, later-acquired AoA concepts were associated with longer naming RTs, which may have been due not only to a weaker network structure but also to competitive processes induced by the activation of a greater number of phonological neighbors. Thus, two factors, the network's structure and competition, negatively impacted naming performance for later acquired concepts whose images had higher NPN. However, there was no NPN effect for early acquired concepts, presumably because the stronger semantic and phonological network structures characteristic of early AoA concepts were stable enough not to be influenced by the co-activation of the image's phonological neighbors when a related distractor was presented.

To summarize, our results highlight the dynamic processes that can occur between NPN and image naming such that naming performance can be influenced by other variables such as AoA and image familiarity (Hameau et al., 2021; Karimi and Diaz, 2020). More specifically, NPN can inhibit word retrieval processes when using later-acquired and low familiarity stimuli but has no effect on earlier-acquired and highly familiar stimuli. Thus, the manipulation of NPN in naming studies must take into consideration the influence of factors, such as AoA and familiarity, on picture naming RTs and accuracy. These results also align with the phonological neighborhood density literature, which reports a range of effects in English picture naming paradigms (see Hameau et al., 2021, for a review).

The phoneme complexity effect observed across all three experiments indicated that words with fewer phonemes resulted in faster naming times. These findings were in keeping with studies which report that more phonemes in a word lead to longer naming RTs (Snodgrass and Yuditsky, 1996). A frequency effect, which was only found in Experiment 2, indicated that high frequency words resulted in faster naming RTs. This finding was consistent with several previous studies (e.g., Alario et al., 2004; Bates et al., 2003). However, the fact that this effect was only found in one of three experiments attests to its tenuous status, which is consistent with reviews that examine the effects of both lexical frequency and AoA in picture naming. Such reviews typically find significant AoA effects without a corresponding lexical frequency effect (e.g., Elsherif et al., 2023; Juhasz, 2005; Perret and Bonin, 2019).

The other aspect of the study involved a manipulation of SOA. The results in Experiments 2 and 3 differed only marginally. We observed a significant frequency effect at SOA +300 ms, but not at SOA +150 ms. Further, the significance level for the AoA by NPN interaction was different at the +300 ms SOA (p = 0.0487) compared to the +150 ms SOA (p = 0.0015), indicating subtle differences in underlying activation processes at the two time points. Nevertheless, the overall findings were replicated at both SOAs, indicating consistency in activation patterns and a robust time frame in which these processes are observed, at least in young, language-normal adults. Further, the interaction effects between NPN, familiarity, and AoA suggested a degree of interactivity between the semantic and phonological processing levels at both SOAs. It should be noted, however, that the specific nature and direction of activation was not explored in this study.

Another finding that has methodological implications relates to image use. Variables known to affect naming RTs and naming accuracy (e.g., name agreement, familiarity, visual complexity) are image-set specific and should be carefully controlled; otherwise, factors that influence naming performance can exert unknown influences on participants' performance. Since we accounted for these variables in our study, we feel that the significant interactions found between NPN, AoA, and image familiarity were valid. The interactions between the variables described in the study highlight the complex dynamics of underlying cognitive-linguistic processes that affect naming RTs and accuracy rates, even when they are not the target of experimental manipulations. Researchers and clinicians should therefore consider image norms for characterizing their visual stimuli in order to control for potential confounding effects in research and clinical applications.

Our study had the following limitations. Although the Cox's PH model used in this study assumed that responses for different images were independent, this was probably not the case (since each participant provided responses for a number of the same images). Nevertheless, this analysis was still preferred over an analysis of only correct responses (i.e., linear regression) since the observed reaction time for incorrect responses was included as censored data. With regard to the interpretation of the results using linear regression vs. survival analysis, for linear regression, the results are interpreted as the relationship between the independent variables and the dependent variable, which for our study would be correct RTs. The focus of linear regression would be on how well the set of independent variables explains the variation in the outcome of correct reaction time. In contrast, for survival analysis, the results are interpreted in terms of the probability of responding correctly (as opposed to responding incorrectly) throughout the range of reaction times. The focus for survival analysis is on the timing of events (correct/incorrect) and the impact of the independent variables on the probability of being correct/incorrect. Furthermore, it should be noted that both linear regression and survival analysis are statistical methods for identifying associations; they do not address causation and thus cannot provide answers as to why people succeed. Related to this, we also acknowledge that by not imposing a time limit on a participant's response, we may have captured responses that reflected cognitive processes other than those associated with the intended automatic, rapid naming responses (e.g., participants might have focused more on accuracy than response speed in our experimental paradigm). A final limitation was our decision to remove outliers below 500 ms. While this choice was based on review studies (Indefrey, 2011; Strijkers and Costa, 2011) that provide the average timeline for verbal production, we also recognize that eliminating these outliers could have resulted in eliminating some valid responses.
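To make the idea of treating incorrect responses as censored data concrete, the following minimal sketch computes a cumulative correct response curve with the Kaplan-Meier estimator. This is a nonparametric illustration, not the Cox's PH model the authors fitted, and it ignores the within-participant dependence noted above. The function name `cumulative_correct_curve` and the toy data in the usage note are assumptions made for illustration.

```python
def cumulative_correct_curve(trials):
    """Kaplan-Meier estimate of the cumulative proportion of correct responses.

    trials: iterable of (rt_ms, correct) pairs. A correct response is the
    event; an incorrect response is censored at its observed RT.
    Returns a list of (rt_ms, cumulative_proportion_correct) points, one
    per distinct RT at which at least one correct response occurred.
    """
    ordered = sorted(trials, key=lambda t: t[0])
    at_risk = len(ordered)      # trials not yet answered or censored
    not_yet_correct = 1.0       # Kaplan-Meier "survival" probability
    curve = []
    i = 0
    while i < len(ordered):
        rt = ordered[i][0]
        events = 0              # correct responses at this RT
        n_here = 0              # all responses (correct or not) at this RT
        while i < len(ordered) and ordered[i][0] == rt:
            events += 1 if ordered[i][1] else 0
            n_here += 1
            i += 1
        if events:
            not_yet_correct *= 1 - events / at_risk
            curve.append((rt, 1 - not_yet_correct))
        # Both events and censored trials leave the risk set.
        at_risk -= n_here
    return curve
```

For example, with toy trials `[(600, True), (700, False), (800, True), (900, True)]`, the incorrect response at 700 ms contributes no step of its own but shrinks the risk set, so the correct response at 800 ms is weighted against the two trials still at risk rather than the original four.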

A direction for further research is the application of this novel image naming experiment to relevant populations who experience prominent naming difficulties since the PWIP mirrors what is typically done in clinical practice: cues are provided to the person with word finding difficulties after the picture has been presented, when it is apparent that some form of clinician support is needed to facilitate the naming process. Thus, the application of the paradigm to individuals who experience word retrieval difficulties as a result of neurological conditions (e.g., aphasia, TBI, dementia) or aging processes (e.g., older adults) may provide further insights into the role of NPN on naming performance.

Data availability statement

The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.

Ethics statement

The studies involving humans were approved by Eastern Michigan University Institutional Review Board and the University of Wisconsin-Milwaukee Institutional Review Board. The studies were conducted in accordance with the local legislation and institutional requirements. The participants provided their written informed consent to participate in this study. Written informed consent was obtained from the individual(s) for the publication of any potentially identifiable images or data included in this article.

Author contributions

NH: Conceptualization, Investigation, Methodology, Writing – original draft, Writing – review & editing. SH: Conceptualization, Investigation, Methodology, Writing – original draft, Writing – review & editing. CC: Formal analysis, Methodology, Writing – original draft, Writing – review & editing.

Funding

The author(s) declare that no financial support was received for the research and/or publication of this article.

Conflict of interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Generative AI statement

The author(s) declare that no Gen AI was used in the creation of this manuscript.

Any alternative text (alt text) provided alongside figures in this article has been generated by Frontiers with the support of artificial intelligence and reasonable efforts have been made to ensure accuracy, including review by the authors wherever possible. If you identify any issues, please contact us.

Publisher's note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

Supplementary material

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/flang.2025.1625213/full#supplementary-material

Footnotes

1. We chose to use the term “number of phonological neighbors” (NPN) since this variable was treated as a continuum and not artificially separated into low, medium, or high PND items.

References

Abel, S., Dressel, K., Bitzer, R., Kümmerer, D., Mader, I., Weiller, C., et al. (2009). The separation of processing stages in a lexical interference fMRI-paradigm. NeuroImage 44, 1113–1124. doi: 10.1016/j.neuroimage.10.018

Abel, S., Dressel, K., Weiller, C., and Huber, W. (2012). Enhancement and suppression in a lexical interference fMRI-paradigm. Brain Behav. 2, 109–127. doi: 10.1002/brb3.31

Alario, F. X., Ferrand, L., Laganaro, M., New, B., Frauenfelder, U. H., Segui, J., et al. (2004). Predictors of picture naming speed. Behav. Res. Methods Instrum. Comp. 36, 140–155. doi: 10.3758/BF03195559

Altman, D. G., and Bland, J. M. (1998). Time to event (survival) data. BMJ 317, 468–469. doi: 10.1136/bmj.317.7156.468

Arrigoni, E., Rappo, E., Papagno, C., Lauro, L. J. R., and Pisoni, A. (2025). Neural correlates of semantic interference and phonological facilitation in picture naming: a systematic review and coordinate-based meta-analysis. Neuropsychol. Rev. 35, 35–53. doi: 10.1007/s11065-024-09631-9

Barry, C., Morrison, C. M., and Ellis, A. W. (1997). Naming the Snodgrass and Vanderwart pictures: effects of age of acquisition, frequency, and name agreement. Q. J. Exp. Psychol. 50A, 560–585. doi: 10.1080/783663595

Bates, E., D'Amico, S., Jacobsen, T., Székely, A., Andonova, E., Devescovi, A., et al. (2003). Timed picture naming in seven languages. Psychonomic Bull. Rev. 10, 344–380. doi: 10.3758/BF03196494

Bi, Y., Xu, Y., and Caramazza, A. (2009). Orthographic and phonological effects in the picture-word interference paradigm: evidence from a logographic language. Appl. Psycholing. 30, 637–658. doi: 10.1017/S0142716409990051

Bonin, P., Chalard, M., Méot, A., and Fayol, M. (2002). The determinants of spoken and written picture naming latencies. Br. J. Psychol. 93, 89–114. doi: 10.1348/000712602162463

Brysbaert, M., and New, B. (2009). Moving beyond Kučera and Francis: a critical evaluation of current word frequency norms and the introduction of a new and improved word frequency measure for American English. Behav. Res. Methods 41, 977–990. doi: 10.3758/BRM.41.4.977

Bürki, A., Elbuy, S., Madec, S., and Vasishth, S. (2020). What did we learn from forty years of research on semantic interference? A Bayesian meta-analysis. J. Mem. Lang. 114:104125. doi: 10.1016/j.jml.2020.104125

Bürki, A., and Madec, S. (2022). Picture-word interference in language production studies: exploring the roles of attention and processing times. J. Exp. Psychol. Learn. Memory Cognit. 48, 1019–1046. doi: 10.1037/xlm0001098

Carroll, J. B., and White, M. N. (1973). Word frequency and age of acquisition as determiners of picture-naming latency. Q. J. Exp. Psychol. 25, 85–95. doi: 10.1080/14640747308400325

Chan, K. Y., and Vitevitch, M. S. (2009). The influence of the phonological neighborhood clustering coefficient on spoken word recognition. J. Exp. Psychol. Hum. Percept. Perform. 35, 1934–1949. doi: 10.1037/a0016902

Damian, M. F., and Martin, R. C. (1999). Semantic and phonological codes interact in single word production. J. Exp. Psychol. Learn. Memory Cognit. 25, 345–361. doi: 10.1037//0278-7393.25.2.345

De Zubicaray, G. I., and McMahon, K. L. (2009). Auditory context effects in picture naming investigated with event-related fMRI. Cognit. Affect. Behav. Neurosci. 9, 260–269. doi: 10.3758/CABN.9.3.260

De Zubicaray, G. I., McMahon, K. L., Eastburn, M. M., and Wilson, S. J. (2002). Orthographic/phonological facilitation of naming responses in the picture-word task: an event-related fMRI study using overt vocal responding. NeuroImage 16, 1084–1093. doi: 10.1006/nimg.2002.1135

Dell, G. S. (1986). A spreading-activation theory of retrieval in sentence production. Psychol. Rev. 93, 283–321. doi: 10.1037/0033-295X.93.3.283

Dell'Acqua, R., Lotto, L., and Job, R. (2000). Naming times and standardized norms for the Italian PD/DPSS set of 266 pictures: direct comparisons with American, English, French, and Spanish published databases. Behav. Res. Methods Instrum. Comp. 32, 588–615. doi: 10.3758/BF03200832

Diaz, M. T., Hogstrom, L. J., Zhuang, J., Voyvodic, J. T., Johnson, M. A., Camblin, C. C., et al. (2014). Written distractor words influence brain activity during overt picture naming. Front. Hum. Neurosci. 8:167. doi: 10.3389/fnhum.2014.00167

Duñabeitia, J. A., Crepaldi, D., Meyer, A. S., New, B., Pliatsikas, C., Smolka, E., et al. (2018). MultiPic: a standardized set of 750 drawings with norms for six European languages. Q. J. Exp. Psychol. 71, 808–816. doi: 10.1080/17470218.2017.1310261

Ellis, A. W., and Morrison, C. M. (1998). Real age-of-acquisition effects in lexical retrieval. J. Exp. Psychol. Learn. Memory Cognit. 24, 515–523. doi: 10.1037//0278-7393.24.2.515

Elsherif, M. M., Preece, E., and Catling, J. C. (2023). Age-of-acquisition effects: a literature review. J. Exp. Psychol. Learn. Memory. Cognit. 49, 812–847. doi: 10.1037/xlm0001215

Gordon, J. K., and Kurczek, J. C. (2014). The ageing neighbourhood: phonological density in naming. Lang. Cognit. Neurosci. 29, 326–344. doi: 10.1080/01690965.2013.837495

Hameau, S., Biedermann, B., Robidoux, S., and Nickels, L. (2021). Effects of phonological neighbourhood density and frequency in picture naming. J. Mem. Lang. 120:104248. doi: 10.1016/j.jml.2021.104248

Harrell Jr., F. E., Lee, K. L., and Pollock, B. G. (1988). Regression models in clinical studies: determining relationships between predictors and response. J. Natl. Cancer Inst. 80, 1198–1202. doi: 10.1093/jnci/80.15.1198

Hashimoto, N., and Thompson, C. K. (2010). The use of the picture-word interference paradigm to examine naming abilities in aphasic individuals. Aphasiology 24, 580–611. doi: 10.1080/02687030902777567

Indefrey, P. (2011). The spatial and temporal signatures of word production components: a critical update. Front. Psychol. 2:255. doi: 10.3389/fpsyg.2011.00255

Indefrey, P., and Levelt, W. J. M. (2004). The spatial and temporal signatures of word production components. Cognition 92, 101–144. doi: 10.1016/j.cognition.06,001.

Jeschniak, J. D., and Schriefers, H. (2001). Priming effects from phonologically related distractors in picture-word interference. Q. J. Exp. Psychol. Section A 54, 371–382. doi: 10.1080/713755981

Juhasz, B. J. (2005). Age-of-acquisition effects in word and picture identification. Psychol. Bull. 131, 684–712. doi: 10.1037/0033-2909.131.5.684

Karimi, H., and Diaz, M. (2020). When phonological neighborhood density both facilitates and impedes: age of acquisition and name agreement interact with phonological neighborhood during word production. Mem. Cognit. 48, 1061–1072. doi: 10.3758/s13421-020-01042-4

Korko, M., Bose, A., Jones, A., Coulson, M., and de Mornay Davies, P. (2024). Do words compete as we speak? A systematic review of picture-word interference (PWI) studies investigating the nature of lexical selection. Psychol. Lang. Commun. 28, 262–321. doi: 10.58734/plc-2024-0011

Kuperman, V., Stadthagen-Gonzalez, H., and Brysbaert, M. (2012). Age-of-acquisition ratings for 30,000 English words. Behav. Res. Methods 44, 978–990. doi: 10.3758/s13428-012-0210-4

Lachman, R., Shaffer, J. P., and Hennrikus, D. (1974). Language and cognition: effects of stimulus codability, name-word frequency, and age of acquisition on lexical reaction time. J. Verbal Learn. Verbal Behav. 13, 613–625. doi: 10.1016/S0022-5371(74)80049-6

Laganaro, M., Chetelat-Mabillard, D., and Frauenfelder, U. H. (2013). Facilitatory and interfering effects of neighbourhood density on speech production: evidence from aphasic errors. Cognit. Neuropsychol. 30, 127–146. doi: 10.1080/02643294.2013.831818

Levelt, W. J. M., Roelofs, A., and Meyer, A. S. (1999). A theory of lexical access in speech production. Behav. Brain Sci. 22, 1–38. doi: 10.1017/S0140525X99001776

Luce, P. A., and Pisoni, D. B. (1998). Recognizing spoken words: the neighborhood activation model. Ear Hear. 19, 1–36. doi: 10.1097/00003446-199802000-00001

Lupker, S. J. (1982). The role of phonetic and orthographic similarity in picture-word interference. Can. J. Psychol. 36, 349–367. doi: 10.1037/h0080652

Machin, D., Cheung, Y. B., and Parmar, M. (2006). Survival Analysis: A Practical Approach. West Sussex: John Wiley & Sons. doi: 10.1002/0470034572

Marian, V., Bartolotti, J., Chabal, S., and Shook, A. (2012). CLEARPOND: cross-linguistic easy-access resource for phonological and orthographic neighborhood densities. PLoS ONE 7:e43230. doi: 10.1371/journal.pone.0043230

Meyer, A. S. (1991). The time course of phonological encoding in language production: phonological encoding inside a syllable. J. Mem. Lang. 30, 69–89. doi: 10.1016/0749-596X(91)90011-8

Meyer, A. S., and Schriefers, H. (1991). Phonological facilitation in picture-word interference experiments: effects of stimulus onset asynchrony and types of interfering stimuli. J. Exp. Psychol. Learn. Mem. Cognit. 17, 1146–1160. doi: 10.1037//0278-7393.17.6.1146

Meyer, D. E., and Schvaneveldt, R. W. (1971). Facilitation in recognizing pairs of words: evidence of a dependence between retrieval operations. J. Exp. Psychol. 90, 227–234. doi: 10.1037/h0031564

Middleton, E. L., and Schwartz, M. F. (2010). Density pervades: an analysis of phonological neighbourhood density effects in aphasic speakers with different types of naming impairment. Cognit. Neuropsychol. 27, 401–427. doi: 10.1080/02643294.2011.570325

Neely, J. H. (1977). Semantic priming and retrieval from lexical memory: roles of inhibitionless spreading activation and limited-capacity attention. J. Exp. Psychol. Gen. 106, 226–254. doi: 10.1037/0096-3445.106.3.226

Newman, R., and Bernstein Ratner, N. (2007). The role of selected lexical factors on confrontation naming accuracy, speed, and fluency in adults who do and do not stutter. J. Speech Lang. Hear. Res. 50, 196–213. doi: 10.1044/1092-4388(2007/016)

Newman, R. S., and German, D. J. (2005). Life span effects of lexical factors on oral naming. Lang. Speech 48, 123–156. doi: 10.1177/00238309050480020101

Oberle, S., and James, L. E. (2013). Semantically- and phonologically-related primes improve name retrieval in young and older adults. Lang. Cognit. Process. 28, 1378–1393. doi: 10.1080/01690965.2012.685481

Paivio, A., Clark, J. M., Digdon, N., and Bons, T. (1989). Referential processing: reciprocity and correlates of naming and imaging. Mem. Cognit. 17, 163–174. doi: 10.3758/BF03197066

Perret, C., and Bonin, P. (2019). Which variables should be controlled for to investigate picture naming in adults? A Bayesian meta-analysis. Behav. Res. Methods 51, 2533–2545. doi: 10.3758/s13428-018-1100-1

Pisoni, A., Cerciello, M., Cattaneo, Z., and Papagno, C. (2017). Phonological facilitation in picture naming: when and where? A tDCS study. Neuroscience 352, 106–121. doi: 10.1016/j.neuroscience.2017.03.043

Qu, Q., Feng, C., and Damian, M. F. (2021). Interference effects of phonological similarity in word production arise from competitive incremental learning. Cognition 212:104738. doi: 10.1016/j.cognition.2021.104738

Rapp, B., and Goldrick, M. (2000). Discreteness and interactivity in spoken word production. Psychol. Rev. 107, 460–499. doi: 10.1037/0033-295X.107.3.460

Rayner, K., and Springer, C. J. (1986). Graphemic and semantic similarity effects in the picture-word interference task. Br. J. Psychol. 77, 207–222. doi: 10.1111/j.2044-8295.1986.tb01995.x

Rich, J. T., Neely, J. G., Paniello, R. C., Voelker, C. C., Nussenbaum, B., Wang, E. W., et al. (2010). A practical guide to understanding Kaplan-Meier curves. Otolaryngology Head Neck Surg. 143, 331–336. doi: 10.1016/j.otohns.2010.05.007

Rizio, A. A., Moyer, K. J., and Diaz, M. T. (2017). Neural evidence for phonologically based language production deficits in older adults: an fMRI investigation of age-related differences in picture-word interference. Brain Behav. Cognit. Neurosci. Perspect. 7, 1–19. doi: 10.1002/brb3.660

Roelofs, A., Piai, V., Rodriguez, G. G., and Chwilla, D. J. (2016). Electrophysiology of cross-language interference and facilitation in picture naming. Cortex 76, 1–16. doi: 10.1016/j.cortex.2015.12.003

Sakreida, K., Blume-Schnitzler, J., Heim, S., Willmes, K., Clusmann, H., and Neuloh, G. (2019). Phonological picture-word interference in language mapping with transcranial magnetic stimulation: an objective approach for functional parcellation of Broca's region. Brain Struct. Funct. 224, 2027–2044. doi: 10.1007/s00429-019-01891-z

Sá-Leite, A., Haro, J., Comesaña, M., and Fraga, I. (2021). Of beavers and tables: the role of animacy in the processing of grammatical gender within a picture-word interference task. Front. Psychol. 12:661175. doi: 10.3389/fpsyg.2021.661175

Schriefers, H., Meyer, A. S., and Levelt, W. J. M. (1990). Exploring the time course of lexical access in language production: picture-word interference studies. J. Mem. Lang. 29, 86–102. doi: 10.1016/0749-596X(90)90011-N

Seiger-Gardner, L., and Schwartz, R. G. (2008). Lexical access in children with and without specific language impairment: a cross-modal picture-word interference study. Int. J. Lang. Commun. Disord. 43, 528–551. doi: 10.1080/13682820701768581

Snodgrass, J. G., and Vanderwart, M. (1980). A standardized set of 260 pictures: norms for name agreement, image agreement, familiarity, and visual complexity. J. Exp. Psychol. Hum. Learn. Memory 6, 174–215. doi: 10.1037//0278-7393.6.2.174

Snodgrass, J. G., and Yuditsky, T. (1996). Naming times for the Snodgrass and Vanderwart pictures. Behav. Res. Methods Instrum. Comp. 28, 516–536. doi: 10.3758/BF03200540

Starreveld, P. A. (2000). On the interpretation of onsets of auditory context effects in word production. J. Mem. Lang. 42, 497–525. doi: 10.1006/jmla.1999.2693

Starreveld, P. A., and La Heij, W. (1995). Semantic interference, orthographic facilitation and their interaction in naming tasks. J. Exp. Psychol. Learn. Mem. Cognit. 21, 686–698. doi: 10.1037//0278-7393.21.3.686

Starreveld, P. A., and La Heij, W. (1996). Time course analysis of semantic and orthographic context effects in picture naming. J. Exp. Psychol. Learn. Mem. Cognit. 22, 896–918. doi: 10.1037//0278-7393.22.4.896

Strijkers, K., and Costa, A. (2011). Riding the lexical speedway: a critical review on the time course of lexical selection in speech production. Front. Psychol. 2:356. doi: 10.3389/fpsyg.2011.00356

Taylor, J. K., and Burke, D. M. (2002). Asymmetric aging effects on semantic and phonological processes: naming in the picture-word interference task. Psychol. Aging 17, 662–676. doi: 10.1037/0882-7974.17.4.662

Vitevitch, M. S. (2002). The influence of phonological similarity neighborhoods on speech production. J. Exp. Psychol. Learn. Mem. Cognit. 28, 735–747. doi: 10.1037//0278-7393.28.4.735

Vitevitch, M. S., Armbrüster, J., and Chu, S. (2004). Sublexical and lexical representations in speech production: effects of phonotactic probability and onset density. J. Exp. Psychol. Learn. Mem. Cognit. 30, 514–529. doi: 10.1037/0278-7393.30.2.514

Vitevitch, M. S., and Stamer, M. K. (2006). The curious case of competition in Spanish speech production. Lang. Cognit. Process. 21, 760–770. doi: 10.1080/01690960500287196

Vitkovitch, M., and Tyrrell, L. (1995). Sources of disagreement in object naming. Q. J. Exp. Psychol. Section A 48, 822–848. doi: 10.1080/14640749508401419

Keywords: picture naming, phonological neighbors, picture word interference paradigm, image familiarity, age-of-acquisition

Citation: Hashimoto N, Heuer S and Cho CC (2025) What drives response time and accuracy in image naming? Moderators in the relationship between number of phonological neighbors and image naming performance. Front. Lang. Sci. 4:1625213. doi: 10.3389/flang.2025.1625213

Received: 08 May 2025; Accepted: 06 October 2025;
Published: 29 October 2025.

Edited by:

Lucia Colombo, University of Padua, Italy

Reviewed by:

Emilie Dujardin, University of Poitiers, France
Coline Gregoire, UMR7295 Centre de recherches sur la cognition et l'apprentissage (CeRCA), France

Copyright © 2025 Hashimoto, Heuer and Cho. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Naomi Hashimoto, nhashimo@emich.edu

Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.