Preschool Children’s Memory for Word Forms Remains Stable Over Several Days, but Gradually Decreases after 6 Months

Gordon, Katherine R.; McGregor, Karla K.; Waldier, Brigitte; Curran, Maura K.; Gomez, Rebecca L.; Samuelson, Larissa K.

doi:10.3389/fpsyg.2016.01439

ORIGINAL RESEARCH article

Front. Psychol., 27 September 2016

Sec. Psychology of Language

Volume 7 - 2016 | https://doi.org/10.3389/fpsyg.2016.01439

Katherine R. Gordon¹*

Karla K. McGregor¹

Brigitte Waldier¹

Maura K. Curran¹

Rebecca L. Gomez²

Larissa K. Samuelson³

¹DeLTA Center and Department of Communication Sciences and Disorders, University of Iowa, Iowa City, IA, USA
²Department of Psychology, The University of Arizona, Tucson, AZ, USA
³School of Psychology, University of East Anglia, Norwich, UK

Research on word learning has focused on children’s ability to identify a target object when given the word form after a minimal number of exposures to novel word-object pairings. However, relatively little research has focused on children’s ability to retrieve the word form when given the target object. The exceptions involve asking children to recall and produce forms, and children typically perform near floor on these measures. In the current study, 3- to 5-year-old children were administered a novel test of word form that allowed for recognition memory and manual responses. Specifically, when asked to label a previously trained object, children were given three forms to choose from: the target, a minimally different form, and a maximally different form. Children demonstrated memory for word forms at three post-training delays: 10 mins (short-term), 2–3 days (long-term), and 6 months to 1 year (very long-term). However, children performed worse at the very long-term delay than the other time points, and the length of the very long-term delay was negatively related to performance. When in error, children were no more likely to select the minimally different form than the maximally different form at all time points. Overall, these results suggest that children remember word forms that are linked to objects over extended post-training intervals, but that their memory for the forms gradually decreases over time without further exposures. Furthermore, memory traces for word forms do not become less phonologically specific over time; rather children either identify the correct form, or they perform at chance.

Introduction

In a typical word-learning task, children see an unknown object, either alone or with some familiar objects, and hear its name several times (e.g., see the dax). Afterward, they are presented with an array of objects and asked to point to the target when given the word form (e.g., Which one is the dax?) (see Carey, 2010; Swingley, 2010 for reviews). The majority of research on word learning has focused on these referent selection tests. However, word learning encompasses a variety of other abilities including the ability to retrieve the word form when given the target referent (i.e., form tests). Although far less commonly used than referent tests, typical form tests include asking children to name a trained object (e.g., What is this one called?) (see Dollaghan, 1985; Gray, 2003; Booth et al., 2008; Hoover et al., 2010 for examples). When tested immediately after training of word-referent pairs, children perform well on referent tests but poorly on form tests (Kiernan and Gray, 1998; Gray, 2003; Booth et al., 2008; Munro et al., 2012). In fact, performance on form tests that require naming is often at floor; clearly such tests are not sensitive to children’s word learning. Critically, given the relatively infrequent use of form tests, and the lack of sensitivity of these tests, our understanding of children’s ability to encode and retrieve forms is limited. An additional gap in the literature is that children’s memory for words is typically only assessed immediately after training. Thus, we lack a detailed understanding of how children’s memory for word-referent pairs, particularly their memory for forms, changes across post-training delays. A related unexplored question is how children’s phonological representations of forms change across post-training delays. In the current work, we begin to address these gaps by assessing 3- to 5-year-old children’s memory for forms across short and long delays using a more sensitive word form test.

Measures of Word Form

Children typically perform much better in referent than form tests when tested immediately after training, yet these tests differ in task demands. Referent tests allow for manual responses and recognition memory in that children respond by pointing to one of the objects presented to them. In contrast, form tests require verbal responses and recall memory in that children must produce forms and are not given a variety of forms to choose from. To address the difference between test outcomes, Gordon and McGregor (2014), following Storkel (2001) and Nash and Donaldson (2005), developed a form test with task demands similar to referent tests. In this test, which we will call the dot test, the trained target object and a paper with three large dots on it are placed in front of the child. The experimenter gives the child three forms to choose from by pointing to one of the dots while producing each form. The tested forms are the target; a minimally different form that differs from the target in either the initial, medial, or final consonant; and a maximally different form that differs from the target in number of syllables and the majority of phonemes represented. For example, if the trained form was “dorb,” the experimenter would ask, “What is this one called?” while pointing to the target object; “Is it a dorb?” while pointing to the first dot; “Is it a vorb?” while pointing to the second dot; “Or is it a zinnip?” while pointing to the third dot. Children respond by pointing to the dot that corresponds with one of the forms, stating a form, or doing both.

In Gordon and McGregor (2014), 4- to 6-year-old children received five exposures each to 12 word-object links. A week later, children’s memory for the links was tested through the dot test and a traditional referent test. Children’s performance on the dot test was well above chance, much better than performance on typical form tests that are administered immediately after training. Given that the dot test is a more sensitive measure of children’s memory for forms than typical tests, we can utilize this test to assess children’s ability to retrieve word forms across various delays.

Long-Term Memory for Word Forms

The majority of research on children’s word learning has focused on assessing their memory for word-referent links immediately after training. Although less common, there is some work investigating children’s memory for words across various post-training delays. For example, after a minimal number of exposures, 2-year-old children are good at selecting the referent immediately after training, yet they are poor after delays as short as 5 mins (Horst and Samuelson, 2008). In contrast, preschool-age children can maintain a word over a delay of several days (Rice et al., 1994), 1 week (Carey and Bartlett, 1978; Markson and Bloom, 1997; Waxman and Booth, 2000; Holland et al., 2015), 1 month (Markson and Bloom, 1997), and several months (Kan, 2014) when their memory is assessed through referent tests. These differences are consistent with broader work on children’s memory development, namely the length of time that children can retain a memory is positively correlated with age (see Bauer, 2015a).

Although assessing children’s long-term retention of word-referent links through referent tests is rare in the literature, assessing children’s retention of word forms over delays is even rarer. This is likely due, in part, to a lack of sensitive measures for children’s memory of forms. For example, Kan and Kohnert (2012) and Kan (2014) assessed bilingual preschool children’s memory for words through both referent and form tests immediately after training and after a 1-week and 4-month delay. As children performed near floor in the form test at all time points, not much information was gathered on children’s memory for forms. Munro et al. (2012) provide more information about children’s long-term memory for forms. In this study, 2- to 3-year-old children were given six exposures to novel word-referent pairs then tested 1 min, 5 mins, and several days later. At each test point, they were first asked, “What is this one called?”, but were given the first syllable of the target word as a memory cue if they did not produce a form. The researchers found that children’s memory for the forms sharply decreased during the 5-min interval, but did not differ between tests given 5 mins and several days after training. Some children who were unable to demonstrate memory for the forms during free recall were able to do so after cuing. However, their performance remained near floor. After several days children produced, on average, 4% of the forms through free recall, and 7% of the forms after cueing.

Do children remember more about newly learned word forms than production tests, even cued production tests, reveal? Recognition/manual versions of form tests, such as the dot test, offer a more sensitive measure to address this question. In addition to addressing this primary question, the influence of factors both internal and external to the child on long-term retention can be explored through this more sensitive measure. An internal factor that may affect children’s ability to encode and retrieve forms is their current vocabulary level. Research that has explored the relationship between vocabulary and word learning, as measured by referent tests, has yielded mixed results; some researchers have found evidence of a relationship (Gathercole et al., 1997; Masoura and Gathercole, 2005; Munro et al., 2012; Law and Edwards, 2015) and others have not (Gray, 2003; Tan and Schafer, 2005). These mixed results are likely due to variations in the child’s age, cognitive abilities, and specific demands of the word-learning task.

Critically, although research with infants has shown that vocabulary size is related to their ability to learn word forms that are linked to objects (Werker et al., 2002), little work has explored this question in older children. Furthermore, the question of how vocabulary is related to children’s ability to maintain trained words across delays remains largely unexplored. Munro et al. (2012) is an exception that addresses both of these limitations in that they found a relationship between 2- and 3-year-old children’s receptive vocabulary and their ability to retrieve trained words after a 1-min, 5-min, and several-day delay. However, we are unaware of any work that has looked at how vocabulary size is related to children’s ability to retain words across delays longer than a week.

An external factor that has been shown to affect retrieval is context cues, such as the similarity of the décor and location of the room where learning and testing occurs, or whether the same person administers the training and test (Smith and Vela, 2001; Hupbach et al., 2008; Hupbach et al., 2011; Vlach and Sandhofer, 2011; Goldenberg and Sandhofer, 2013). There is some work demonstrating that consistency of context cues positively influences children’s ability to learn (Horst et al., 2011), retrieve (Goldenberg and Sandhofer, 2013), and generalize (Vlach and Sandhofer, 2011) novel word-referent pairs (also see Horst, 2013). One critical finding in the adult literature is that the longer the delay, the more likely context cues are to affect retrieval (Smith and Vela, 2001). However, currently, it is not understood how context cues affect preschool-age children’s ability to retrieve word-referent links after longer delays.

Phonological Specificity of Word Forms

In addition to providing more information about children’s memory for forms across delays, the dot test can offer new information about how the phonological specificity of children’s representations of forms changes post-training. Phonological specificity is typically assessed by measuring children’s tendency to select or visually focus on the target object when they are given correct pronunciations vs. mispronunciations of the target form (see Vihman, 2014 for a review). The vast majority of this work has focused on infants ages 2 years or younger. Children’s responses vary with age (Vihman, 2014), vocabulary level (Werker et al., 2002), manner and degree of phonological variation from the target (Halle and de Boysson-Bardies, 1996), familiarity with the form (Halle and de Boysson-Bardies, 1994, 1996; Fennell and Werker, 2003; Ballem and Plunkett, 2005), sentence context (Cole and Perfetti, 1980), language (Vihman et al., 2004), and number of exposures to the form (Kay-Raining Bird and Chapman, 1998). Despite this variation, the general conclusion is that children encode fairly specific representations of forms after training, but that performance varies with the particulars of the task.

Because children are selecting among objects in these tests, the familiarity and similarity of objects, and whether objects were previously named can affect performance (Oviatt, 1980; Horst et al., 2011; Kucker and Samuelson, 2012). This limitation has been addressed in two ways. First, researchers have assessed children’s memory for forms in the absence of an object, which does improve performance (see Werker et al., 2002 for a review). However, a limitation of this methodology is that it is testing children’s recognition of the trained form, not their recognition of a specific form linked to a specific object. Second, researchers have assessed children’s representations of word forms through form tests (i.e., asking children “What is this one called?”), which eliminate the need to select among objects. A limitation of this method is its heavy recall demands without a supportive cue, which contributes to near floor performance. Additionally, when children do attempt the forms, it is unclear whether mispronunciations reflect children’s phonological representations or their production difficulties (Swingley and Aslin, 2000).

Although not broadly used, there are two innovative methodologies to measure children’s phonological representations of forms that do not require verbal productions or object selection. Namely, children are asked to indicate through yes/no responses whether individual forms apply to a specific object, or they are asked to identify the correct form for an object when there are several forms to choose from, similar to the dot test (see Weismer and Hesketh, 1996, 1998; Alt et al., 2004; Alt and Plante, 2006). Similar to measures typically used with very young children, these assessments reveal that preschool- and school-age children encode fairly specific representations of forms when assessed immediately after training.

Given that the dot test includes a target and a foil that comprise a minimal pair, it can reveal how the specificity of forms changes across various post-training intervals. One possibility is that with adequate memory supports during training, the specificity of encoded forms will remain stable over longer delays. Vlach and Sandhofer (2012) found that children could recognize the target object when given the form (i.e., a referent test) over the delay of a month when given adequate memory supports during training. Thus, given adequate memory supports, children may maintain the ability to retrieve the form via a form test after a longer delay. A second possibility is that forms will become more specific over time due to consolidation. There is some evidence that this happens over shorter time scales (e.g., 24 hrs) (Henderson et al., 2012). However, research on forgetting curves suggests that this is unlikely to happen over longer time scales without further training (Henderson et al., 2012; Murre and Dros, 2015). A third possibility is that phonological representations will become weaker over time. It is possible that it is harder to retain the ability to identify the form when given the target object than the reverse. Thus, the ability to recognize forms may show a typical forgetting curve over time even when given memory supports during training.

If representations of forms become weaker over time, the dot test could offer insights into how they change. If the representations become less phonologically specific, children should gradually decrease selections of the target form while increasing selections of the minimally different form. Alternatively, children’s representations of forms could remain fairly specific, but children could lose the ability to retrieve the memory trace, or the memory trace could disappear completely. In this case, when children no longer reliably select the target form, they should respond at chance when choosing between the target, a minimally different form, and a maximally different form. Gordon and McGregor (2014) found that children were no more likely to select the minimally different form than the maximally different form 1 week after training, providing support for the second hypothesis. However, as children were not tested directly after training, or at time points longer than a week, it is unknown how the phonological memory for forms changes across various delays.

The Current Study

The primary questions of the current study are twofold: (1) How does the number of word forms that children correctly identify change based on the length of the delay?, and (2) How do children’s phonological representations of word forms change based on the length of the delay? To address these questions, in Experiment 1 we trained children on word-object links in a similar manner to Gordon and McGregor and assessed children’s memory for the forms 10 mins after training (short-term delay) and 2 to 3 days after training (long-term delay). For Experiment 2, a subset of children who participated in Experiment 1 were tested on their retention of forms 6 months to 1 year after training (very long-term delay). Using the dot test as the assessment measure at each time point provided insight into whether phonological representations become less specific over time. Finally, to address the question of whether children’s selections in the dot test simply reflect their ability to identify a form they heard during training, or their ability to identify the specific form that goes with a specific object, we administered a 4-dot version of the test after the final testing session. In this test, children are given the target form for the object being asked about and a minimal pair of the target, plus an alternate trained form that is linked to another object and a minimal pair of that form. We report this methodology and results as Experiment 3.

As we wanted to focus on differences in retrieval at various time points, we decided to maximize encoding by including optimal memory supports during training. Thus, training included: multiple spaced presentations of word-referent links (McGregor et al., 2007; Vlach et al., 2008), ostensive naming (Horst and Samuelson, 2008; Axelsson et al., 2012), and the opportunity to handle and manipulate the objects (Scofield et al., 2009).

We also investigated how individual and contextual factors affected performance after various delays. Thus, we assessed children’s vocabulary level and general language comprehension and production abilities. Additionally, to explore whether younger children reliably show retrieval of forms after various delays and how retrieval of forms varies between age groups, we assessed 3-year-old children’s in addition to 4- and 5-year-old children’s performance on the dot test. With regards to environmental factors, we assessed the effect of room environment (i.e., location and décor) and experimenter on children’s ability to retrieve previously learned word-object links.

Based on past findings, we predicted that given memory supports during training, children would maintain a reliable memory for forms over shorter (2 to 3 days) and longer (6 months to 1 year) delays. Additionally, children should maintain fairly specific representations of forms across short retention intervals. However, given the lack of research we do not make a firm prediction about how phonological representations change across longer post-training delays. Age and language abilities should be positively related to children’s ability to encode and retain forms. Furthermore, context cues should aid retrieval after longer delays, but are less likely to do so after shorter delays.

Experiment 1

Materials and Methods

Participants

Participants included sixteen 3-year-old children (mean age = 41.13 months, range = 36–46 months, females = 9) and sixteen 4- to 5-year-old children (mean age = 58 months, range = 49–67 months, females = 8). According to parental report, none of the children had a history of speech or language problems. Children were recruited via mass email to university faculty, staff, and students. The data from five additional participants were excluded: two due to experimenter error and three because the children refused to continue.

According to self-report, mothers of the participants completed a mean of 18.11, SD = 3.04, years of education (one participant did not provide this information). Fathers of the participants completed a mean of 18.31, SD = 3.60, years of education (two participants did not provide this information). Children’s racial/ethnic backgrounds were as follows: white/non-Hispanic = 22, white/Hispanic = 2, Hispanic (without race provided) = 2, black/non-Hispanic = 2, biracial = 2, not available = 2.

All reported experiments received approval from the Institutional Review Board of the university of the lead author. Parents and/or guardians of all participants gave written informed consent for their child to participate.

Stimuli

The objects presented to the children were similar to the objects in Gordon and McGregor (2014). There were 12 referent categories: each included 1 referent exemplar (the prototype) and 1 referent exemplar that varied from the other one in color, size, or both (the non-prototype). Two referent exemplars for each category were used during training as some similarity and some variation within categories promotes both learning and generalization (Rost, 2011; Twomey et al., 2013). There was one generalization exemplar for each referent category used during testing that varied from the other two exemplars in color, size, or both. Stimuli also consisted of 12 foil categories that each included 1 foil exemplar (the prototype) and 1 foil exemplar that varied from the other one in color, size, or both (the non-prototype).

A novel word was assigned to each of the 12 referent categories. Six were monosyllabic and six were disyllabic; all had low lexical neighborhood densities and similar phonotactic probabilities. The words were designed to maximize learning in that they were composed of early acquired sounds, and none shared the same initial syllable (Creel et al., 2006). The words were divided into two sets that each contained 3 monosyllabic and 3 disyllabic words. Place and manner characteristics varied within each word set, but the two sets were similar in place and manner features represented. Twenty-four additional novel words served as foils in the dot test. Twelve foils, the minimally different forms, varied from the targets in either onset, medial, or final consonant, with the position of change counterbalanced across foils. The other 12 foils, the maximally different forms, varied from the target in number of syllables and in the majority of phonemes.

Procedure

The majority of children passed a pure-tone audiometric screening administered in a non-soundproofed laboratory room at 1, 2, and 4 kHz at 20 dB and 0.5 kHz at 25 dB. Two children responded at 30 dB in the right ear and 25 dB in the left ear at 0.5 and 4 kHz, respectively. One child responded at 30 dB in the right ear and 40 dB in the left ear at 0.5 kHz. As all three children passed at 25 dB at all other levels, their data were retained.

Children participated for two consecutive weeks. Each week included one training session and one testing session conducted 48–72 hrs later. The first training session occurred in a room on the lowest floor of a building on campus that was decorated like a living room with a couch and a colorful rug. The second training session occurred in a room on the top floor of the same building decorated like an office with two chairs on either side of a desk and a gray rug. Each testing session was conducted in the training context used that week or in a novel testing context (order counterbalanced across children). The novel testing context was a room that resembled a hospital room in the building next door.

During training sessions, the experimenter presented the objects from one of the sets on a tray on her lap while she sat on the couch or presented the objects on the desk. Set order was counterbalanced across children. Children were shown a prototype referent exemplar (e.g., the red dorb) that was named two times (e.g., Look at this dorb. See a dorb), followed by a prototype foil that was not named (e.g., Look at this. See this.), until all 6 referent exemplars and 6 referent foils had been presented. Children were encouraged to handle the objects. As spacing of presentations has been shown to aid retention, children were given a 3-min stretch break after all the objects were presented once (Vlach et al., 2008). After the first break, the experimenter presented a non-prototype referent exemplar (e.g., the blue dorb) that was named (e.g., Look at this dorb), followed by a non-prototype foil (e.g., Look at this) until all 6 target and 6 foil categories had been shown, followed by a 10-min break that involved riding the elevator in the building. After the second break, children were shown the prototype referent exemplar again (e.g., the red dorb) named one time (e.g., We’ve got this dorb), and then shown the prototype foil (e.g., We’ve got this). After the prototype exemplar from all the referent categories and foil object categories had been shown a second time, the child took another break on the elevator before the short-term test. The unnamed foils were presented during training so that the training protocol would match previous work, but they were not used during testing in the current experiment.

For the short-term test, the experimenter placed a paper with three large dots on it on the tray or desk. The experimenter showed a prototype referent exemplar (e.g., the red dorb) and asked the child, “What is this one called?” The experimenter then produced three forms (i.e., the target, minimally different form, and maximally different form), and pointed to one of the dots on the paper as she produced each form. The order in which the three choices were produced was counterbalanced across test trials. Children could state a form, point to a dot, or do both. If the child pointed to a dot and stated a form that did not match that dot, the stated form took precedence. If a child stated a form that was not exactly like one of the options given, the child was asked to repeat his/her answer and the response was coded as the form that it was most similar to. To challenge children’s memory, the referent exemplars were tested in a different order than they were presented during training. Namely, the 2nd, 4th, and 6th referent exemplars trained were tested first, followed by the 1st, 3rd, and 5th exemplars. Children received two practice trials before the short-term test that included familiar objects (i.e., a book and a car) and were given three forms to choose from (e.g., “What is this one called? Is this a car, a lar, or a daxim?”).
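The response-coding rules described above can be sketched in code. This is a minimal illustration rather than the authors' scoring procedure: the function name `code_response`, its parameters, and the use of a character-level similarity ratio on spellings (Python's `difflib.SequenceMatcher`) as a stand-in for judged phonological similarity are all assumptions introduced here. The study does not report how similarity was determined.

```python
from difflib import SequenceMatcher


def code_response(options, pointed=None, stated=None):
    """Return the offered form that a dot-test response is coded as.

    options: the three forms offered on the trial
             (target, minimally different, maximally different).
    pointed: the form attached to the dot the child pointed to, if any.
    stated:  the form the child said aloud, if any.
    Returns None when the child gave no response.
    """
    if stated is not None:
        # A stated form takes precedence over a conflicting point.
        if stated in options:
            return stated
        # ASSUMPTION: "most similar" is approximated by spelling overlap;
        # the original coding relied on the coder's judgment of the spoken form.
        return max(
            options,
            key=lambda form: SequenceMatcher(None, stated, form).ratio(),
        )
    return pointed


# Hypothetical trial using the example forms from the text.
options = ("dorb", "vorb", "zinnip")
target = "dorb"

coded = code_response(options, pointed="vorb", stated="dorb")
print(coded, coded == target)
```

Under this sketch a child who points to the second dot but says "dorb" is scored as choosing the target, which mirrors the precedence rule in the procedure.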

Two to three days later, children’s long-term memory of forms was tested using the dot test and generalization exemplars of each referent category. Generalization exemplars were used to sufficiently challenge children’s long-term memory such that any context effects would be revealed. The objects were tested in the same order as the short-term test (i.e., 2nd, 4th, 6th, 1st, 3rd, 5th). However, the order of the presentation of the forms (target, minimally different form, maximally different form) varied between the short- and long-term tests.

When tested in the familiar context, at the end of the session children were brought outside of the room and the door was closed. The children were asked, “Do you remember the room that we were just in? What did it look like?” They were shown a paper with photos of four versions of the room: the same décor and the same arrangement of furniture, different décor and the same arrangement of furniture, the same décor and a different arrangement of furniture, and different décor and a different arrangement of furniture than the actual room. When tested in the novel context, they were shown the four photos at the end of testing and asked, “Do you remember the room where we first played with these things? What did it look like?”

Children also completed a variety of standardized tests: a receptive vocabulary test, PPVT-4 (Dunn and Dunn, 2007), and the comprehension and expression tests from the Preschool Language Scale, PLS-4 (Zimmerman et al., 2002). All standardized testing was conducted while seated on the rug of each room to encourage the specific training context (i.e., on the couch or at the desk) to be uniquely associated with the object sets.

Results

A series of one-sample t-tests was conducted to compare the number of times children selected the target form to chance (2 correct responses out of 6 three-alternative forced-choice [3AFC] trials). Because we performed four t-tests for each age group, a Bonferroni correction was used to determine significance, in this case 0.05/4 = 0.0125. Both age groups performed above chance at all time points (see Table 1). A t-test was conducted to assess whether performance at the long-term test differed from performance at the short-term test. Children’s performance did not significantly differ between these two time points, t(31) = 0.474, Cohen’s d = 0.128.
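The comparison to chance can be illustrated with a short sketch. The chance level (6 trials × 1/3 = 2 correct) and the Bonferroni threshold (0.05/4 = 0.0125) come from the text; the scores below are invented for illustration and are not the study's data. The helper names (`t_pdf`, `two_sided_p`, `one_sample_t`) are hypothetical, and the p-value is obtained by numerically integrating the t density with the standard library, not with the software the authors used.

```python
import math
from statistics import mean, stdev

N_TRIALS = 6
N_OPTIONS = 3
CHANCE = N_TRIALS / N_OPTIONS   # 2 correct out of 6 expected by guessing
ALPHA = 0.05 / 4                # Bonferroni-corrected threshold (0.0125)


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def two_sided_p(t, df, steps=2000):
    """Two-sided p-value via Simpson's rule on the t density over [0, |t|]."""
    upper = abs(t)
    if upper == 0:
        return 1.0
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3            # probability mass between 0 and |t|
    return max(0.0, 1 - 2 * area)


def one_sample_t(scores, mu):
    """Return (t, df, two-sided p, Cohen's d) for scores against mu."""
    n = len(scores)
    sd = stdev(scores)
    diff = mean(scores) - mu
    t = diff / (sd / math.sqrt(n))
    return t, n - 1, two_sided_p(t, n - 1), diff / sd


# HYPOTHETICAL number-correct scores for 16 children (not the study's data).
scores = [4, 5, 3, 6, 4, 2, 5, 4, 3, 5, 6, 4, 3, 5, 4, 2]
t, df, p, d = one_sample_t(scores, CHANCE)
print(f"t({df}) = {t:.2f}, p = {p:.5f}, d = {d:.2f}, above chance: {p < ALPHA and t > 0}")
```

A group is treated as above chance only when the test is significant at the corrected threshold and the mean lies above 2.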

TABLE 1. Comparisons of children’s performance at each time point in Experiment 1 to chance.

To assess the effects of the various factors on short- and long-term memory for forms, a mixed effects logistic regression was conducted in R using the lme4 package. Models predicted log odds of a correct response on the dot test. See Table 2 for the predictors that were explored for the short-term and long-term test. Age was coded categorically as “Three” or “Four/Five” on the basis of age at the initial testing session, and was included as a covariate in all models.

TABLE 2. Covariates tested for each model.

For the model looking at short-term performance, the maximal random effects structure supported by the data included random intercepts for participant and item. Preliminary testing found no reliable differences in performance associated with object set A or B (p > 0.10). Model fit did not drop when omitting this predictor. Thus, all data were collapsed across training set for the remainder of the analyses.
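For readers less familiar with this model class, a logistic regression with random intercepts for participant and item can be written in the following general form. The notation is introduced here for illustration and does not appear in the original analysis: p_ij is the probability that child i responds correctly on item j, x_ij collects the fixed-effect predictors under consideration, and u_i and w_j are the participant and item intercepts, assumed to be normally distributed.

```latex
\operatorname{logit}(p_{ij})
  = \ln\frac{p_{ij}}{1 - p_{ij}}
  = \beta_0 + \boldsymbol{\beta}^{\top}\mathbf{x}_{ij} + u_i + w_j,
\qquad
u_i \sim \mathcal{N}(0, \sigma_u^2), \quad
w_j \sim \mathcal{N}(0, \sigma_w^2)
```

A positive coefficient in this form raises the log odds of a correct response, and a negative coefficient lowers them.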

The final model included main effects for week and age (see Table 3). A significant effect of week emerged, such that log odds of a correct response were lower during week 2 than week 1 (z = -4.93, p < 0.01). A reliable main effect of age emerged, such that log odds of a correct response were higher for the 4- and 5-year-old group than the 3-year-old group (z = 1.99, p < 0.05). There were no significant effects of PLS comprehension raw score (z = 0.73), PLS expressive raw score (z = -0.38) or PPVT-IV raw score (z = -0.43).
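The z values reported above can be related to their p-values with a short sketch. It assumes that the statistics are Wald z values evaluated against a standard normal distribution, which the text does not state explicitly; the function name `wald_p` is hypothetical.

```python
import math


def wald_p(z):
    """Two-sided p-value for a z statistic under a standard normal reference."""
    return math.erfc(abs(z) / math.sqrt(2))


# z values as reported for the short-term model.
reported = {
    "week": -4.93,
    "age": 1.99,
    "PLS comprehension": 0.73,
    "PLS expressive": -0.38,
    "PPVT-IV": -0.43,
}

for name, z in reported.items():
    print(f"{name}: z = {z:+.2f}, two-sided p = {wald_p(z):.4f}")
```

Under this assumption the week and age effects fall below the conventional 0.05 threshold, while the three language scores do not.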

TABLE 3. Final model for the short-term test.

For the model looking at long-term performance, the maximal random effects structure supported by the data included random intercepts for participant and item. We systematically tested the same predictors as those included in the short-term test model with two additions: testing context (familiar or novel testing room) and accuracy at the short-term test. Preliminary testing found no reliable differences in performance associated with object set or context of the test (all p > 0.10). Model fit did not drop when omitting these predictors. Thus, all data were collapsed across training set and context for the remainder of the analyses.

The final model included main effects for week, age, PPVT-IV raw score, and short-term accuracy, and an interaction between age and short-term accuracy (see Table 4). A significant main effect of week emerged, such that log odds of a correct response were lower for items taught during the second week (z = -3.78, p < 0.01). Also, a reliable, positive main effect of PPVT-IV raw score emerged, such that a higher PPVT-IV raw score predicted higher log odds of a correct response at the long-term test (z = 2.04, p < 0.04). No significant main effect of PLS Comprehension or Expressive scores emerged (z = 1.47, n.s. and z = 0.65, n.s., respectively). A significant interaction between age and prior accuracy emerged (z = 2.22, p = 0.03). Three-year-olds tended to show little effect of prior accuracy on a specific item (e.g., dorb) on performance on that item at the long-term test. The 4- and 5-year-old participants demonstrated increased log odds of a correct response at the long-term test in association with a correct short-term response, and decreased log odds of a correct response at the long-term test in association with an incorrect short-term response (see Figure 1).

TABLE 4. Final model for the long-term test.

FIGURE 1. Likelihood of a correct response on the long-term test plotted by response accuracy on the short-term test and age. Short-term accuracy is coded as 0 (incorrect) or 1 (correct) above.

Response Type

Four coders rated whether children said a form, pointed to a dot, or did both for each response in the tests (regardless of whether the child indicated the correct form or not). At least two coders independently rated each video and inter-rater reliability was 99%. In cases where there was a disagreement, the primary investigator watched the video to resolve the discrepancy. Five children’s data were excluded from this analysis because videos of the sessions did not record successfully. Thus, the following analyses included thirteen 3-year-old children, and ten 4- and 5-year-old children. There was one trial in which one 3-year-old refused to respond, but his responses to all other trials were included.

Three-year-old children: out of all 312 trials (13 children, 24 trials each), children just pointed to a dot for 131 or 42% of the trials (mean 10.07 trials per child, SD = 9.10), just stated a form for 97 or 31% of the trials (mean 7.46 trials per child, SD = 7.36), and both pointed to a dot and stated a form for 83 or 27% of the trials (mean 6.38 trials per child, SD = 6.48). Four- and 5-year-old children: out of all 240 trials (10 children, 24 trials each), children just pointed to a dot for 118 or 49% of the trials (mean 11.80 trials per child, SD = 9.03), just stated a form for 82 or 34% of the trials (mean 8.20 trials per child, SD = 9.84), and both pointed to a dot and stated a form for 40 or 17% of the trials (mean 4.00 trials per child, SD = 5.14).
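The percentages above can be re-derived from the reported counts. The Python sketch below is a check on that arithmetic only; the dictionary keys and the helper name are hypothetical labels, while the counts and trial totals are those reported in the text.

```python
# Response-type counts reported in the text, by age group.
counts = {
    "three": {"point": 131, "say": 97, "both": 83},
    "four_five": {"point": 118, "say": 82, "both": 40},
}

# Total trials per age group, as reported in the text.
total_trials = {"three": 312, "four_five": 240}


def percentages(group):
    """Share of all trials for each response type, rounded to whole percent."""
    return {
        response: round(100 * n / total_trials[group])
        for response, n in counts[group].items()
    }


three_pct = percentages("three")
four_five_pct = percentages("four_five")

# Trials with no coded response: total trials minus the coded responses.
uncoded_three = total_trials["three"] - sum(counts["three"].values())
uncoded_four_five = total_trials["four_five"] - sum(counts["four_five"].values())
```

The single uncoded trial in the 3-year-old group is consistent with the one refusal to respond noted earlier in this section.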

Children were fairly consistent in the type of responses they gave. For the 3-year-olds, children used the same response type an average of 72% (range 37.5–100%) of the time. Five children pointed most often, five children said a form most often, and three children did both most often. For the 4- and 5-year-olds, children used the same response type an average of 79% of the time (45.8–100%). Five children pointed most often, three children said a form most often, and two children did both most often.

To assess whether type of response (i.e., point, say, both) was related to accuracy on short- and long-term memory tests for forms, a mixed effects logistic regression in an R environment, using the lme4 package, was conducted. Random effects for participant and item were included. Age was coded categorically as “Three” or “Four/Five” on the basis of age at the initial testing session, and was included as a covariate. This analysis revealed no main effect for type of response (p > 0.10).

Word-Level Factors

Only the number of choices of the target form has been analyzed thus far. To compare children’s choices of the target, the minimally different form, and the maximally different form at the short- and long-term tests, we conducted a series of t-tests (Bonferroni correction, 0.05/6 = 0.0083). At both time points, children were more likely to select the target than the minimally different form [short-term t(31) = 8.330, p < 0.0001, Cohen’s d = 4.253, long-term t(31) = 7.715, p < 0.0001, Cohen’s d = 3.942] and more likely to select the target than the maximally different form [short-term t(31) = 8.580, p < 0.0001, Cohen’s d = 4.027, long-term t(31) = 8.091, p < 0.0001, Cohen’s d = 3.849], but the number of selections of the minimally different and maximally different forms did not differ [short-term t(31) = 0.084, p = 0.934, Cohen’s d = 0.015, long-term t(31) = 0.685, p = 0.498, Cohen’s d = 0.383].

Children’s selections of the target, the minimally different form, and the maximally different form were also compared across the short- and long-term tests (Bonferroni correction, 0.05/3 = 0.0167). Children’s selections did not differ across the two tests: target, t(31) = 0.725, p = 0.474, Cohen’s d = 0.128; minimally different, t(31) = 1.052, p = 0.301, Cohen’s d = 0.186; maximally different, t(31) = 0.109, p = 0.914, Cohen’s d = 0.019.

We analyzed whether children’s likelihood of selecting the minimally different form depended on the position of the change from the target (initial, medial, final), using their combined responses to the short- and long-term tests. We conducted a repeated measures ANOVA with age (3-year-old, 4- and 5-year-old) as the between-subjects factor, position of change (initial, medial, and final) as the within-subjects factor, and the number of selections of the minimally different form as the dependent variable. There was not a significant main effect for position of change, F(2,29) = 2.967, p = 0.067, $η_{p}^{2}$ = 0.170, or age, F(1,30) = 3.946, p = 0.056, $η_{p}^{2}$ = 0.116, and there was not a significant age by position of change interaction, F(2,29) = 0.819, p = 0.451, $η_{p}^{2}$ = 0.053.

The number of one-syllable words and the number of two-syllable words that children correctly identified across the short- and long-term tests were compared through a mixed-model ANOVA with age (3-year-olds, 4- and 5-year-olds) as the between-subjects factor and number of syllables (one, two) as the within-subjects factor. Children correctly identified more two-syllable words than one-syllable words, F(1,30) = 4.416, p = 0.044, $η_{p}^{2}$ = 0.128, but there was no age by number of syllables interaction.
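The partial eta squared values reported for these ANOVAs can be recovered from each F statistic and its degrees of freedom, using the standard identity $η_{p}^{2}$ = (F × df1) / (F × df1 + df2). The Python sketch below applies it to three of the reported effects; the function name is hypothetical, and this is a consistency check rather than the authors' own computation.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared implied by an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Values reported in the text: (F, df_effect, df_error).
position_of_change = partial_eta_squared(2.967, 2, 29)   # reported as 0.170
age_by_position = partial_eta_squared(0.819, 2, 29)      # reported as 0.053
number_of_syllables = partial_eta_squared(4.416, 1, 30)  # reported as 0.128
```

Each computed value agrees with the reported effect size to within rounding.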

Memory for Context

One child’s data was excluded from the results of this test, as he was not shown the photos of the rooms after being tested in the familiar room. Chi-square analyses revealed that there was not a significant relationship between photo selection and age group, $χ^{2}$(3) = 5.560, p = 0.135, nor a significant relationship between photo selection and the novel or familiar testing condition, $χ^{2}$(3) = 3.159, p = 0.368. Thus, all children’s responses in both the novel and familiar context were analyzed together. Children’s selections of the training room at the long-term test were compared to chance through a series of t-tests (Bonferroni correction, 0.05/4 = 0.0125). Children selected the correct photo significantly above chance, t(61) = 4.159, p < 0.0001, Cohen’s d = 0.528. Their selections of the photo with the same décor but a different arrangement, t(61) = 1.484, p = 0.653, Cohen’s d = 0.057, and the photo with different décor and the same arrangement, t(30) = 0.177, p = 0.143, Cohen’s d = 0.188, did not differ from chance. Their selections of the photo with different décor and a different arrangement, t(30) = 4.858, p < 0.0001, Cohen’s d = 0.617, were significantly below chance.

Summary of Results

In responding to the dot test, both younger (3-year-olds) and older (4- and 5-year-olds) children demonstrated memory for forms 10 mins and 2 to 3 days after training, and their performance at these two time points did not differ. This finding is in marked contrast to past results in which children performed near floor on form tests when they were asked to recall and produce the forms immediately after training. Age and week of training were both reliably related to performance at the short- and long-term tests, with older children performing better than younger children and children performing better during week 1 than week 2. Additionally, for older children, performance on a specific item at the short-term test was related to performance on that item at the long-term test, but this was not true of younger children. With regard to the specificity of the representation of word forms, children selected the target form reliably more than the other two forms at both the short- and the long-term test, and the number of selections of the target form did not differ based on delay.

Experiment 2

There were two main purposes for Experiment 2: (1) to assess whether children could retrieve forms months after training (very long-term delay test), and whether their short- and long-term performance was related to their very long-term performance, and (2) to assess how delay length was related to children’s ability to retrieve the target forms. Additionally, to tap memory traces not detectable via the dot test, we retrained children on the word-object links and retested them. This allowed us to assess whether any residual memory trace would aid in relearning the forms.