Richer concepts are better remembered: number of features effects in free recall

Hargreaves, Ian Scott; Pexman, Penny M; Johnson, Jeremy S; Zdrazilova, Lenka

doi:10.3389/fnhum.2012.00073

ORIGINAL RESEARCH article

Front. Hum. Neurosci., 03 April 2012

Sec. Cognitive Neuroscience

Volume 6 - 2012 | https://doi.org/10.3389/fnhum.2012.00073

This article is part of the Research Topic "Meaning in mind: Semantic richness effects in language processing"

Ian S. Hargreaves*

Penny M. Pexman*

Jeremy C. Johnson and Lenka Zdrazilova

Department of Psychology, Language Processing Lab, University of Calgary, Calgary, AB, Canada

Many models of memory build in a term for encoding variability, the observation that there can be variability in the richness or extensiveness of processing at encoding, and that this variability has consequences for retrieval. In four experiments, we tested the expectation that encoding variability could be driven by the properties of the to-be-remembered item. Specifically, we predicted that concepts associated with more semantic features would be better remembered than concepts associated with fewer semantic features. Using feature listing norms, we selected sets of items for which people tend to list higher numbers of features (high NoF) and items for which people tend to list lower numbers of features (low NoF). Results showed more accurate free recall for high NoF concepts than for low NoF concepts in expected memory tasks (Experiments 1–3) and also in an unexpected memory task (Experiment 4). This effect was not the result of associative chaining between study items (Experiment 3), and can be attributed to the amount of item-specific processing that occurs at study (Experiment 4). These results provide evidence that stimulus-specific differences in processing at encoding have consequences for explicit memory retrieval.

Words vary on a large number of lexical dimensions that characterize factors such as their frequency of usage, or that refer to structural characteristics such as shape (orthography) and sound (phonology). Words, rather helpfully, also vary in meaning, and this variability can be captured by numerous semantic dimensions that influence the speed with which words can be recognized or categorized (Pexman et al., 2008). A vast word recognition literature has sought to characterize how orthographic, phonologic, and semantic dimensions interactively contribute to our ability to read. Consistently, researchers have shown that the variability of a given word along any or all of these dimensions is an important determinant in how that word is processed, manifesting in differences in reading times and accuracy (Yap and Balota, 2009). Words are also convenient stimuli for experiments, and are often utilized in memory research as they offer a well-defined minimal unit that can easily serve in recognition and free recall memory paradigms. This raises an interesting question: we know that there are many characteristics of individual words that shape how those words are processed, but do these item-specific differences influence subsequent memory when words are used as stimuli?

One approach to characterizing these effects is also one of the most influential frameworks in human memory research. The levels of processing framework proposed by Craik and Lockhart (1972) provided a number of important ideas, including the assertion that deeper processing at encoding leads to more accurate recollection at retrieval. In later work the framework was refined in a number of ways, and depth of processing was distinguished from another important type of encoding: elaboration. While depth of processing refers to the fact that some domains of processing typically involve richer or more extensive processing than others, elaboration has been characterized as “richness or extensiveness of processing within each qualitative type (of processing)” (Lockhart and Craik, 1990, p. 100). That is, within a particular type or domain of processing (e.g., semantic processing) there is variability in processing richness and this variability has consequences for memory. Numerous studies of semantic elaboration showed that free recall could be influenced by manipulating the encoding conditions applied to the to-be-remembered items (e.g., Craik and Tulving, 1975; Klein and Saltz, 1976; Ritchey and Beal, 1980; Ross, 1981; Hashtroudi, 1983) and importantly for the present discussion, by the variability in semantic elaboration prompted by the characteristics of the to-be-remembered items themselves (Seamon and Murray, 1976). This revised emphasis on elaboration helped to shift the levels of processing framework away from a focus on the depth of processing per se and toward a focus on how qualitatively distinct encoding operations influence memory. This shift was important, as the levels of processing framework was criticized for being underspecified (Morris et al., 1977) or worse, inherently circular (Nelson, 1977). 
However, despite this advancing construal of levels of processing, researchers continued to struggle with implementing the framework within a computational model (Eich, 1985; cf. Craik and Lockhart, 1986).

Researchers still show great interest in characterizing how variability in processing during encoding can influence subsequent memory. Indeed, the primary assertion of semantic elaboration (that the relative amount of processing within a single domain should predict subsequent memory) finds a more clearly specified counterpart in the construct of encoding variability¹. Similar to elaboration, encoding variability captures the idea that variability in how items are processed will lead to differences in memory strength across items. This intuitive assumption has been implemented in models of recognition memory in order to account for the observation that studied items vary more in memory strength than new items (Hintzman, 1986; cf. Koen and Yonelinas, 2010). It has also been used to interpret the observation that brain-based changes at encoding predict the subsequent recall of items; for example, item-wise variability in hippocampal gamma oscillations predicts the likelihood of successful free recall (Sederberg et al., 2007). Encoding variability can also be implemented in models of free recall (Sederberg et al., 2008), offering a level of specification that the elaboration account lacks.

Both semantic elaboration and encoding variability literatures make the prediction that processing differences at encoding will lead to subsequent effects in free recall. However, neither has given much attention to potential differences in processing that are spontaneously elicited by the lexico-semantic characteristics of to-be-studied items. This is an important point; words are known to vary on a large number of lexico-semantic dimensions, and to the extent that this variability automatically shapes the processing of these items, both semantic elaboration and encoding variability accounts would predict subsequent effects in free recall.

In related research, Nelson and colleagues have investigated how the associative relationships between words can influence memory performance for individual words. In natural language usage words are produced in structured sentences that lead them to become entangled with one another. Nelson and colleagues captured these associative relationships by asking a large number of participants to list the first word that comes to mind in relation to a presented target word (Nelson et al., 1998). Using this database, Nelson and colleagues documented effects of words' Number of Associates (NoA; also known as associative set size) in a variety of memory tasks. Compared to words with many associates, words with fewer associates are more likely to be successfully retrieved during cued recall; however, manipulating NoA did not influence free recall performance (Nelson and Schreiber, 1992). That NoA influences cued but not free recall suggests that the influence of lexico-semantic variables on memory performance is likely task-specific. The concreteness variable shows a different pattern across tasks: relative to abstract words (e.g., VIRTUE), concrete words (e.g., CAT) show more accurate performance in cued recall, free recall, and recognition memory tasks (Paivio and Csapo, 1973; Nelson and Schreiber, 1992; Hamilton and Rajaram, 2001). In visual word recognition, there have been repeated demonstrations that the effects of item-specific relative semantic richness are multidimensional, leading variables like NoA and concreteness to dissociate across different visual word recognition tasks (Pexman et al., 2008; Yap et al., 2011). While it is not surprising to observe similar dissociations in a task as unconstrained as free recall, the potential for lexico-semantic variables to selectively influence different memory tasks highlights the importance of properly balanced stimulus sets.
Seamon and Murray (1976) manipulated subjectively rated meaningfulness, which uses a Likert-type scale to measure the extent to which participants feel that a word arouses other associated words (with more words leading to higher values; Toglia and Battig, 1978). Unfortunately, it is unclear what information participants use when placing words on a dimension of meaningfulness, and this variable shows significant correlations with other subjectively rated variables such as familiarity, imageability, and concreteness. Indeed, in the Seamon and Murray study the high meaningful words were also high on ratings of imagery and concreteness. Because of the difficulty in operationalizing meaningfulness, it is unknown whether this manipulation is fine-grained enough to test theories of elaboration, since high meaningful words may differ on any number of dimensions from low meaningful words. The goal of the current study was to investigate item-specific encoding variability in a more precise fashion than in previous studies, by investigating number-of-features (NoF) effects in free recall.

NoF refers to the number of semantic features that participants list for different concepts in a feature-listing task (Pexman et al., 2002). The features listed for different concepts are considered “verbal proxies for packets of knowledge” (McRae, 2005, p. 42), rather than veridical descriptions of semantic memory. As they generate features, participants access representations derived from their experience with the concepts. McRae and colleagues (McRae et al., 2005) published feature norms for 541 concrete concepts. For instance, for the concept cow the normative features include perceptual characteristics like has four legs, has an udder, and is smelly. Other features describe behaviors, like eats grass, and moos. Some of the features describe the concept's function, like produces milk, or its context, as in lives on farms. There is variability in the number of features listed for different concepts (e.g., 20 for couch, 23 for cougar, 11 for table, 9 for leopard) and this variability is related to responding in word recognition tasks (lexical decision, semantic categorization), such that responses are faster and more accurate for words with many features than for words with few features, even when other variables, like word length, frequency, typicality, and concreteness, are controlled (Pexman et al., 2002, 2003, 2008; Grondin et al., 2009; Yap et al., 2011). The processing advantage observed for high NoF words has been attributed to greater semantic activation for high NoF concepts (Pexman et al., 2003).

NoF effects have only been examined in visual word recognition tasks. In the present study we investigated whether NoF effects can be observed in free recall. Compared to past investigations that manipulated meaningfulness and concreteness, the relative transparency with which the NoF variable is defined allowed us to test for fine-grained effects of item-specific encoding variability in memory performance. Given the nature of these effects as outlined above, one would expect that the enriched encoding afforded by high NoF words would lead to more accurate recall. Of course, given the narrow definition of semantic richness captured by NoF, it was also possible that the difference between high and low NoF words would be too subtle to influence memory accuracy. To investigate these possibilities we chose free recall because an extensive literature shows that this task produces effects of another stimulus-specific property: concreteness (Dukes and Bastian, 1966; Paivio and Csapo, 1973; Nelson and Schreiber, 1992; Paivio et al., 1994; Ruiz-Vargas et al., 1996; Hamilton and Rajaram, 2001; ter Doest and Semin, 2005), and we modeled our procedure after the most recent of these studies. To be clear, however, we investigated NoF effects for sets of items for which concreteness, word frequency, familiarity, and contextual diversity were controlled, so any memory effects observed for NoF could be interpreted as incremental to those of each of these other factors. In Experiments 1 and 2 we tested for fine-grained effects of item-specific encoding variability by investigating whether NoF effects can be observed in free recall.
In Experiments 3 and 4 we further explored the mechanisms for those effects by investigating whether NoF effects are the result of associative chaining among items rather than superior recall for individual items (Experiment 3) and by investigating whether NoF effects emerge during the incidental encoding of to-be-remembered items in a lexical decision task (LDT; Experiment 4).

Experiment 1

Method

Participants

Participants in Experiment 1 were 30 undergraduate students at the University of Calgary. In all of the experiments reported in this paper, participants reported that English was their first language, had normal or corrected-to-normal vision, and received course credit for participation.

Materials

The stimuli for Experiment 1 were 30 low NoF words and 30 high NoF words selected from the McRae et al. (2005) norms (Table A1). The selected word sets differed significantly in NoF (p < 0.001) but were matched for printed frequency, contextual diversity (Brysbaert and New, 2009), familiarity, printed length, orthographic neighborhood size (Coltheart et al., 1977), and concreteness (see Table 1). As a result of this matching, differences between the low NoF and high NoF words on each of these dimensions were non-significant at p > 0.10². We obtained concreteness values for 55 of the items from the MRC database (Wilson, 1988), and collected concreteness ratings for the five remaining items from a separate group of 31 participants.

Table 1. Mean stimulus characteristics (standard deviations in parentheses).

Procedure

There were three components in a testing session: (1) a study phase, (2) a distraction phase, and (3) a recall phase. On each trial in the study phase, a word was presented in the center of a 17″ monitor controlled by a Macintosh G3 computer using PsyScope (Cohen et al., 1993). Each word was presented for 2 s, followed by 3 s of blank screen before presentation of the next word (ter Doest and Semin, 2005). A total of 60 words were presented for study, in a different random order to each participant. Participants were asked to memorize the words for a later recall test. In the distraction phase, participants were asked to complete two unrelated tasks on the computer: a semantic categorization task and a ratings task, both with word stimuli. The time taken to complete the distraction tasks was 9 min. In the recall phase, participants were presented with a blank computer screen and were asked to try to remember the studied words, typing in each word they recalled. Participants were given 4 min to complete the recall phase but could request more time (Hamilton and Rajaram, 2001). None of the participants requested additional time.

Coding procedures for recall responses were adopted from those used in previous studies (e.g., ter Doest and Semin, 2005). Responses were judged correct if they were identical to, or were inflectional or misspelled variants of words on the study list (e.g., we accepted shelf for shelves, and plyers for pliers). Responses were judged incorrect if they did not appear on the study list or were synonyms of a studied word (e.g., we did not accept cabinet for cupboard).
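The coding rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `score_recall`, the three-word study list, and the hand-built table of accepted variants are hypothetical, and in the study itself the judgment of what counted as an inflectional or misspelled variant was made by the coders, not by a lookup table.

```python
def score_recall(responses, study_list, accepted_variants):
    """Return the set of studied words credited as recalled.

    accepted_variants maps a judged-acceptable variant (inflection or
    misspelling) onto its studied form; it is a hypothetical stand-in
    for the coders' judgments. Synonyms are simply absent from the
    table, so they earn no credit.
    """
    studied = {word.lower() for word in study_list}
    credited = set()
    for response in responses:
        word = response.strip().lower()
        # Map an approved variant onto the studied word it stands for.
        word = accepted_variants.get(word, word)
        if word in studied:
            # A set ensures repeated recalls of one word count once.
            credited.add(word)
    return credited


# Illustrative inputs built from the examples given in the text.
study_list = ["shelves", "pliers", "cupboard"]
accepted_variants = {"shelf": "shelves", "plyers": "pliers"}
responses = ["Shelf", "plyers", "cabinet", "pliers"]

credited = score_recall(responses, study_list, accepted_variants)
proportion = len(credited) / len(study_list)
```

Here `shelf` and `plyers` are credited to `shelves` and `pliers`, the repeated `pliers` adds nothing further, and `cabinet` earns no credit for `cupboard`.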

Results and Discussion

The mean proportions of low NoF and high NoF words recalled are presented in Table 2. In addition to the studied items, participants recalled an average of 2.80 words (SD = 3.08) that were not in the studied list. t-tests were conducted with subjects (t₁) and, separately, items (t₂) as random factors to compare correct recall for low and high NoF words. Results showed a significant NoF effect (t₁(29) = 3.65, p < 0.001, SE = 0.02; t₂(58) = 2.91, p < 0.01, SE = 0.03): recall was better for high NoF words than for low NoF words. This was, to our knowledge, the first report of a NoF effect in memory and we sought to replicate it with a different set of items in Experiment 2.
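The by-subjects and by-items analyses can be illustrated with a minimal sketch. The reported degrees of freedom are consistent with a paired test across the 30 participants (df = 29) and an independent-samples test with pooled variance across the 60 items (df = 58), and the sketch assumes those two forms. The function names and the small data vectors are hypothetical and are not the study's data.

```python
import math
from statistics import mean, stdev, variance


def paired_t(high, low):
    """By-subjects test: each participant contributes a high and a low score."""
    diffs = [h - l for h, l in zip(high, low)]
    se = stdev(diffs) / math.sqrt(len(diffs))  # standard error of the mean difference
    return mean(diffs) / se, len(diffs) - 1, se


def independent_t(group_a, group_b):
    """By-items test: each word belongs to exactly one NoF condition."""
    na, nb = len(group_a), len(group_b)
    # Pooled variance assumes equal population variances in both groups.
    pooled = ((na - 1) * variance(group_a) + (nb - 1) * variance(group_b)) / (na + nb - 2)
    se = math.sqrt(pooled * (1 / na + 1 / nb))
    return (mean(group_a) - mean(group_b)) / se, na + nb - 2, se


# Made-up proportions recalled, one pair per participant (illustration only).
subj_high = [0.40, 0.50, 0.30, 0.60]
subj_low = [0.30, 0.45, 0.30, 0.45]
t1, df1, se1 = paired_t(subj_high, subj_low)

# Made-up proportions of participants recalling each item (illustration only).
items_high = [0.5, 0.4, 0.6]
items_low = [0.3, 0.4, 0.2]
t2, df2, se2 = independent_t(items_high, items_low)
```

In both functions the statistic is the mean difference divided by its standard error, so a reported t and SE together imply the size of the mean difference between conditions.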

Table 2. Mean proportion of words correctly recalled.