Spatio-Temporal Structure, Path Characteristics, and Perceptual Grouping in Immediate Serial Spatial Recall

De Lillo, Carlo; Kirby, Melissa; Poole, Daniel

doi:10.3389/fpsyg.2016.01686

ORIGINAL RESEARCH article

Front. Psychol., 11 November 2016

Sec. Cognitive Science

Volume 7 - 2016 | https://doi.org/10.3389/fpsyg.2016.01686

This article is part of the Research Topic: What Next - The Cognition of Sequences

Carlo De Lillo^*

Melissa Kirby

Daniel Poole^†

Department of Neuroscience, Psychology and Behaviour, University of Leicester, Leicester, UK

Immediate serial spatial recall measures the ability to retain sequences of locations in short-term memory and is considered the spatial equivalent of digit span. It is tested by requiring participants to reproduce sequences of movements performed by an experimenter or displayed on a monitor. Different organizational factors dramatically affect serial spatial recall but they are often confounded or underspecified. Untangling them is crucial for the characterization of working-memory models and for establishing the contribution of structure and memory capacity to spatial span. We report five experiments assessing the relative role and independence of factors that have been reported in the literature. Experiment 1 disentangled the effects of spatial clustering and path-length by manipulating the distance of items displayed on a touchscreen monitor. Long-path sequences segregated by spatial clusters were compared with short-path sequences not segregated by clusters. Recall was more accurate for sequences segregated by clusters, independently of path-length. Experiment 2 featured conditions where temporal pauses were introduced between or within cluster boundaries during the presentation of sequences with the same paths. Thus, the temporal structure of the sequences was either consistent or inconsistent with a hierarchical representation based on segmentation by spatial clusters, but the effect of structure could not be confounded with effects of path characteristics. Pauses at cluster boundaries yielded more accurate recall, as predicted by a hierarchical model. In Experiment 3, the systematic manipulation of sequence structure, path-length, and presence of path-crossings showed that structure explained most of the variance, followed by the presence/absence of path-crossings, and path-length. Experiments 4 and 5 replicated the results of the previous experiments in immersive virtual reality navigation tasks where the viewpoint of the observer changed dynamically during encoding and recall. This suggested that the effects of structure in spatial span are not dependent on perceptual grouping processes induced by the aerial view of the stimulus array typically afforded by spatial recall tasks. These results demonstrate the independence of coding strategies based on structure from effects of path characteristics and perceptual grouping in immediate serial spatial recall.

Introduction

One of the most enduring problems in psychology and the neurosciences is the characterization of the mechanisms supporting the representation of serial order information (Lashley, 1951; Rosenbaum et al., 2007; Hurlstone et al., 2014). Serial Spatial Recall (SSR) refers to the ability to temporarily retain a sequence of spatial locations in a prescribed order and is one of the most common instantiations of the problem of serial order in short-term and working memory. The assessment of SSR is of central importance in several areas of psychological research. It has been used to evaluate the extent to which the processing of serial order in the verbal and visuo-spatial domain rests on similar mechanisms (Baddeley, 1992; Smyth and Scholey, 1992; Jones et al., 1995; Hurlstone et al., 2014), a crucial issue for the characterization of human cognitive architecture. SSR is one of the most widespread neuropsychological measures (Berch et al., 1998; Kessels et al., 2000) and is included as a test in widely used batteries (e.g., WAIS-R, Kaplan et al., 1991; Wechsler, 1997a; Wechsler Memory Scale, WMS-III, Wechsler, 1997b; Cantab, Cambridge Cognition, 2006). SSR has been extensively employed in the study of individual differences in working-memory (Cornoldi and Vecchi, 2003) and as a predictor of scholastic achievement (Jarvis and Gathercole, 2003; St Clair-Thompson, 2007). Because of its non-verbal nature, the assessment of SSR has been used for the comparison of memory skills in monkeys and humans, with important implications for the evaluation of primate models of human memory (Botvinick et al., 2009; Fagot and De Lillo, 2011).

Despite the popularity of SSR as a psychological measure and its suitability for addressing the problem of serial order from a cognitive, comparative, and neuropsychological perspective, its cognitive bases are still poorly understood and, as argued below, a number of central constructs for its description are often confounded. One of the most important issues to address in relation to SSR, as identified by a recent eminent review (Hurlstone et al., 2014) and as further elaborated below, is the characterization of the organizational factors that can contribute to accurate SSR (e.g., Kemps, 1999, 2001; Bor et al., 2003; De Lillo, 2004; Busch et al., 2005; Parmentier et al., 2005; Rossi-Arnaud et al., 2005; Parmentier and Andrés, 2006; Parmentier et al., 2006; Ridgeway, 2006; Imbo et al., 2009; De Lillo and Lesk, 2010).

SSR is typically measured by assessing spatial span with the Corsi test (Milner, 1971; Corsi, 1972), allegedly the most widely used non-verbal neuropsychological test (Berch et al., 1998; Kessels et al., 2000). In the Corsi test participants observe a sequence of spatial items, such as a series of finger tapping movements across an array of wooden blocks, or a series of flashing icons presented on a touch-screen. Then, they are required to reproduce the series by tapping the items in the same order. Because the items are all identical in shape and color, they need to be identified by their spatial position. For this reason, the Corsi test is considered one of the purest measures of spatial memory span (see Baddeley, 2001, for a review).

Traditionally, the Corsi test has featured irregular arrays of items and random sequences as recall material (Milner, 1971). However, it was soon realized that not all random sequences are recalled at the same level of accuracy (Smirni et al., 1983), and attempts to standardize the test ensued, with important applied implications for the use of these tests in clinical diagnosis (Kessels et al., 2000; Busch et al., 2005).

The complexity of Corsi sequences has been manipulated in order to assess the relative autonomy of short and long-term memory structures (Kemps, 1999). The results of ingenious experiments have clarified that items in spatial working-memory are coded configurationally, using allocentric frames of reference (Avons, 2007; Avons and Oswald, 2008; Boduroglu and Shah, 2014), thus highlighting the role of relational properties of items in the display in recall.

Some studies (De Lillo, 2004, 2012; De Lillo and Lesk, 2010) have emphasized the notion that the understanding of the effects of organizational factors in SSR is of interest apart from the assessment of memory span per se. They proposed that with the irregular spatial arrangement of the items in Corsi-type tasks and randomly selected sequences of block tapping (see Berch et al., 1998 for examples of Corsi displays and criteria for selecting sequences that have been used in the literature), it is impossible to isolate the effect of particular organizational factors and interpret them in relation to the memory representation that they afford.

In order to assess the contribution of a specific type of organizational factor to spatial span, De Lillo (2004) used a Corsi display, presented on a touch-screen, where 9 squares were arranged spatially to form 3 clusters of 3 items each, so that the separation of the items within clusters was smaller than that between clusters. The use of a configuration of items grouped in spatial clusters was motivated by different considerations. It seemed the appropriate way to convey in a Corsi task the fact that space can be divided into different sub-regions. It provided a spatial analogy of forms of semantic clustering and chunking observed in non-spatial domains. Finally, a configuration of items grouped in spatial clusters resembles a “patchy” foraging environment that, according to foraging theories of cognitive evolution, provided the pressures for the emergence of large brains and working memory skills in humans and other primates (e.g., Milton, 1993). Importantly, the use of a clustered Corsi display enables the manipulation of the serial organization of sequences so that they can be made either compatible or not with chunking by spatial proximity. De Lillo (2004) used different types of sequences. Some sequences were segregated by clusters, so that consecutive items were always in the same cluster and a transition to a different cluster occurred only after all the items within a cluster had been selected. Clustered sequences were deemed to afford a hierarchical representation because the order of the clusters into which the sequence was segregated could be stored independently from the order of the items within a given cluster. Other sequences were designed to be incompatible with such hierarchical organization because consecutive items were always in different clusters.

When recall for the two types of sequences was compared, a beneficial “clustering effect” emerged: sequences segregated by clusters were reported at a higher level of accuracy. Consistent with a hierarchical model, in sequences segregated by clusters, longer Response Times (RTs) emerged at cluster boundaries. This suggested that the retention of spatially clustered sequences could be supported by a hierarchical representation similar to that observed for chunking in non-spatial domains (Miller, 1956; Klahr et al., 1983). By contrast, non-clustered sequences showed longer RTs for the items at intermediate ordinal positions within the sequences. This pattern of RTs resembles the serial position curve typically observed for lists of unrelated items in other domains, such as nonsense words, where items at intermediate ordinal positions are the most difficult to recall.

The recall of Corsi sequences typically shows a long initial RT, which is indicative of the processing of serial order just before recall (see Fischer, 2001). Further evidence for the hierarchical representation of clustered Corsi sequences has been provided by showing that a component of this initial RT is proportional to the number of clusters into which sequences are segregated and that RTs at cluster boundaries are proportional to the number of items within each cluster (De Lillo and Lesk, 2010).

The neural correlates of the processing of organizational factors in SSR have been highlighted in an fMRI study (Bor et al., 2003) where participants faced an array of items arranged as a square matrix and were presented with “structured” and “unstructured” sequences. “Structured” sequences were operationally defined as those sequences where consecutive items were within the same row, column or diagonal. “Unstructured” sequences were defined as those violating this constraint (i.e., with consecutive items never within the same row, column, or diagonal). Behavioral results confirmed that structured sequences were reported at a higher level of accuracy than unstructured sequences. Moreover, fMRI data indicated a higher activation of the dorsolateral prefrontal cortex (DLPFC) during the encoding of structured sequences than during the encoding of unstructured sequences.

Other important factors have been reported to affect the reproduction of spatial sequences. These include the length of the path of the trajectory necessary to connect all the items in the sequence and the number of times the path crosses itself (Orsini et al., 2001, 2004; Parmentier et al., 2005, 2006). These factors sometimes confound the effects of the structure of the representation underpinning performance. For example, it has been proposed that the clustering effect as observed by De Lillo (2004) could be explained by the fact that clustered sequences can have on average a shorter path length than non-clustered sequences (Parmentier et al., 2006). Similarly, the effect of structure observed by Bor et al. (2003) could be due to the fact that unstructured sequences can contain more crossings.

The RT patterns reported for structured sequences (Bor et al., 2003; De Lillo, 2004; De Lillo and Lesk, 2010) and the fMRI results of Bor et al. (2003) suggest that the detection and use of structure in Corsi sequences determines the formation of specific forms of hierarchical representation that contribute to efficient recall quite apart from other effects of path characteristics. Nevertheless, considering the possible contribution of all these factors, it is important to assess their relative role in SSR. We attempted to do so in the present study. The approach we took in the first experiment was to dissociate path-length and organization in a clustered array. With this experiment we tested the notion proposed by Parmentier et al. (2006) that path-length can be the sole explanation of the clustering effect in spatial span. We manipulated display size so that clustered sequences in a large display had a longer path-length than non-clustered sequences in a small display. Thus, if the benefits of clustering are explained by the shorter path that is normally associated with clustered sequences, then we should expect more accurate recall for the non-clustered sequences with a short path than for the structured sequences with a longer path. In the second experiment, we manipulated the timing structure of the sequence, leaving its path-length and any other characteristics of the sequences unchanged. Using the same clustered sequences, we imposed pauses in the sequence presentation either at transitions between items within a cluster or at cluster boundaries. An effect of timing in this experiment would indicate that the clustering effect is more likely to be related to the way in which the sequence is represented, rather than to mere effects of path characteristics, such as path-length or number of path-crossings.

In a third experiment we used a square matrix of locations which allowed a fully factorial manipulation of path length, presence of crossings, and structure as defined by Bor et al. (2003). By doing so we aimed to disentangle the effects of path length, presence of crossings, and structure. We then determined which of these factors explained most of the variance in the recall score of the participants.

The aim of the fourth and fifth experiments was to evaluate the importance of perceptual grouping for the emergence of beneficial effects of structure in SSR. In fact, terms such as perceptual grouping, perceptual organization and gestalt principles are so widespread in the literature as explanations of the benefits of organizational factors in SSR, and so often used interchangeably with efficient memory coding (e.g., Kemps, 1999; Bor et al., 2003; Rossi-Arnaud et al., 2005; Ridgeway, 2006; Bor, 2012; Hurlstone et al., 2014), as to warrant an explicit assessment of the extent to which perceptual grouping is actually required for the benefits of organization in SSR to emerge.

In Experiments 4 and 5 we used immersive virtual reality to implement a navigational version of the Corsi task. In this task the order in which the sequence items had to be reproduced could not be apprehended from a single viewpoint. Having identified the first item in the sequence, participants were required to move toward it and select it. Only then would the next item be presented at a different location within the environment. The presentation of the sequence was a lengthy process that involved continuous changes of direction and viewpoint. We reasoned that observing beneficial effects of structure in these conditions would make implausible the hypothesis that perceptual grouping processes are necessary for their emergence.

Experiment 1

In SSR, path length refers to the length of the trajectory that connects the items that need to be reported in the prescribed order. It can be manipulated independently from other characteristics of the spatial sequences by altering the size of the item display so that the relative distance between the items is different in the two displays (Smyth and Scholey, 1994). The critical variable in this experiment was the relative distance of the items in the small and the large display. The experiment was designed so that clustered sequences, presented in the large set, had a longer path-length than non-clustered sequences presented in the small set. If the short path that typically accompanies clustered sequences is the sole explanation of the clustering effect, as proposed by Parmentier et al. (2006), then non-clustered sequences with a shorter path should be recalled more accurately than clustered sequences with a longer path.

Methods

Participants

Twenty-five volunteers (10 males and 15 females), with a mean age of 26 years (SD = 7.76), were recruited from a participant panel at the University of Leicester and paid a small fee to take part in the experiment. They all reported normal or corrected-to-normal vision.

Materials, Design, and Procedure

The experiment was presented using a PC equipped with a 17″ Elo, IntelliTouch sensitive monitor (1024 × 768 pixels). A visual display consisting of a black background and nine identical gray squares arranged in three spatial clusters of three icons each was presented on each trial. Two displays were used: a large display, composed of squares 120 pixels wide (Figures 1A,B); and a small display, composed of squares 40 pixels wide (Figures 1C,D). In the small display, there was a 6 pixel-wide invisible active border area surrounding each square to ensure that the touch of a square was accurately registered even by participants with larger finger tips.

Figure 1. The layout of the display used for Experiment 1 and examples of sequence paths for conditions: (A) Long structured; (B) Long unstructured; (C) Short structured; and (D) Short unstructured. The filled circle indicates the start of the sequence and the arrowhead indicates the ending position. The lines indicate the order in which the icons “blinked” during the presentation phase and were not actually displayed. See text for explanation.

In the large display, squares within a cluster were separated by a distance of between 76 and 103 pixels, whereas the distance between clusters was 152–189 pixels. In the small display, squares within a cluster were separated by a distance of between 19 and 27 pixels and the distance between clusters was 38–53 pixels.

In each trial, participants were first presented with the full display of 9 squares for 700 ms. One square then turned black for 500 ms, before the full display was presented again for another 700 ms. This produced the impression that the icon “blinked.” Another square would then turn black for 500 ms, and so on until a sequence of 9 items had been presented in this way. After the ninth square had blinked, the screen turned black for 1 s. The full display was then presented again and participants were required to reproduce the sequence that they had previously observed. To confirm that the touch had been registered, each square turned black for 50 ms when touched.
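The timing of the presentation phase described above can be laid out as a simple event schedule. The following is only an illustrative sketch, not the software used in the experiment: the function name and event labels are hypothetical, and it assumes that the 1 s blank screen followed the ninth blink directly, with no intervening 700 ms full display.

```python
def presentation_schedule(n_items=9, display_ms=700, blink_ms=500, blank_ms=1000):
    """Return (event, duration_ms) pairs for one presentation phase.

    Durations are taken from the procedure; the event names are hypothetical.
    """
    events = [("full_display", display_ms)]
    for i in range(1, n_items + 1):
        events.append((f"blink_item_{i}", blink_ms))
        # Assumption: no full display is shown after the final blink;
        # the blank screen follows it directly.
        if i < n_items:
            events.append(("full_display", display_ms))
    events.append(("blank_screen", blank_ms))
    return events


schedule = presentation_schedule()
total_ms = sum(duration for _, duration in schedule)
print(f"{len(schedule)} events, {total_ms} ms before recall begins")
```

Under this assumption the presentation phase of a nine-item sequence lasts 700 + 9 × 500 + 8 × 700 + 1000 = 11,800 ms.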

The design manipulated sequence type, which could be either “Structured” or “Unstructured,” and path-length, which could be “Long” or “Short.”

“Structured” sequences were segregated by spatial clusters, so that all the items of each cluster were presented before the sequence moved to a different cluster. By contrast, “Unstructured” sequences were not segregated by clusters, so that consecutive items were always presented in different clusters. “Long” sequences were presented in the large display and “Short” sequences were presented in the small display. The average path-length of long sequences was 3093.36 pixels (SE = 170.37) and that of short sequences was 883.69 pixels (SE = 50.14).
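The two sequence types and the path-length measure can be made concrete with a short sketch. Everything specific here is hypothetical: the item labels 0–8, the cluster assignment, the coordinates, and the scale factor of 3 are illustrative choices, not the layout used in the experiment. The sketch shows why structure and path-length are normally confounded, and how enlarging the display can reverse the usual path-length ordering.

```python
import math


def is_structured(sequence, cluster_of):
    """True if every cluster is exhausted before the sequence moves on."""
    visited = []
    for item in sequence:
        cluster = cluster_of[item]
        if not visited or visited[-1] != cluster:
            if cluster in visited:
                return False  # the sequence returned to a cluster it had left
            visited.append(cluster)
    return True


def is_unstructured(sequence, cluster_of):
    """True if consecutive items are always in different clusters."""
    return all(cluster_of[a] != cluster_of[b]
               for a, b in zip(sequence, sequence[1:]))


def path_length(sequence, coords):
    """Sum of Euclidean distances between consecutive items."""
    return sum(math.dist(coords[a], coords[b])
               for a, b in zip(sequence, sequence[1:]))


def scale(coords, k):
    """Enlarge (or shrink) the whole display by a factor k."""
    return {item: (x * k, y * k) for item, (x, y) in coords.items()}


# Hypothetical layout: items 0-2, 3-5, and 6-8 form three clusters.
cluster_of = {item: item // 3 for item in range(9)}
small = {0: (0, 0), 1: (100, 0), 2: (0, 100),
         3: (400, 0), 4: (500, 0), 5: (400, 100),
         6: (200, 400), 7: (300, 400), 8: (200, 500)}
large = scale(small, 3)  # hypothetical scale factor

structured = [0, 1, 2, 3, 4, 5, 6, 7, 8]
unstructured = [0, 3, 6, 1, 4, 7, 2, 5, 8]

print(path_length(structured, small), path_length(unstructured, small))
print(path_length(structured, large), path_length(unstructured, small))
```

Within a single display the structured sequence has the shorter path, but the structured sequence in the enlarged display has a longer path than the unstructured sequence in the small one, which is the comparison the design relies on.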

Four experimental conditions were obtained combining these two factors in a 2 (structure) × 2 (path-length) repeated measures design: Long Structured (L-S), Long Unstructured (L-U), Short Structured (S-S), and Short Unstructured (S-U). Importantly, the path-length of L-S sequences (mean = 2381.32; SE = 40.09) was significantly longer than that of S-U sequences (mean = 1096.32; SE = 17.42). Examples of each sequence type are also provided in Figure 1. Ten sequences of each type were used and presented in random order within a testing session of 40 trials.

Participants were tested in a quiet laboratory with dim lighting. The height of their chair was adjusted so their eyes were at the same level as the center of the screen and they could comfortably touch any point of the display with the index finger of their dominant hand. Participants were informed that they had to use that finger when selecting the squares during the experiment, which took about 15 min to complete.

This experiment and all the other experiments reported in this article were carried out in accordance with the Code of Ethics and Conduct of the British Psychological Society and approved by the University of Leicester Ethics Committee for research involving human participants (Psychology sub-committee). All subjects gave written informed consent in accordance with the Declaration of Helsinki.

Results

Accuracy

An item recalled correctly was defined as a square touched in the correct serial position. Accuracy scores were the frequency of items correctly recalled by each participant in each condition. The mean accuracy score of each of the four conditions is presented in Figure 2A.
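The scoring rule can be illustrated with a minimal sketch. The function names, the item labels 0–8, and the two example trials are hypothetical; only the rule itself (an item counts as correct when it is touched in the correct serial position) is taken from the text.

```python
def items_correct(target, response):
    """Count items reproduced in their correct serial position."""
    return sum(1 for t, r in zip(target, response) if t == r)


def proportion_correct_by_position(trials):
    """Proportion of trials on which each serial position was correct."""
    n_positions = len(trials[0][0])
    hits = [0] * n_positions
    for target, response in trials:
        for position, (t, r) in enumerate(zip(target, response)):
            hits[position] += (t == r)
    return [h / len(trials) for h in hits]


# Two hypothetical trials: one perfect, one with items 4 and 5 transposed.
target = [0, 1, 2, 3, 4, 5, 6, 7, 8]
response = [0, 1, 2, 4, 3, 5, 6, 7, 8]
trials = [(target, target), (target, response)]

score = sum(items_correct(t, r) for t, r in trials)
print(score)
print(proportion_correct_by_position(trials))
```

Under this rule a transposition of two adjacent items costs two points, since both items end up in the wrong serial position. The per-position proportions correspond to the kind of measure plotted in a serial position curve.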

Figure 2. (A) Frequency of items correctly recalled for the different conditions (Long Structured; Short Structured; Long Unstructured and Short Unstructured) of Experiment 1; (B) Proportion of correct items recalled at each serial position for the four different conditions of Experiment 1: LS, Long Structured; LU, Long Unstructured; SS, Short Structured; SU, Short Unstructured.

A 2 (structure: structured/unstructured) × 2 (path-length: long/short) repeated measures ANOVA was carried out on the frequency of correct items reported in the different conditions. It revealed a significant main effect of structure [F_{(1, 24)} = 198.965, p < 0.001, $η_{p}^{2}$ = 0.892], with a higher level of recall for structured sequences, and of path-length [F_{(1, 24)} = 8.747, p < 0.01, $η_{p}^{2}$ = 0.267], with a higher level of recall for long sequences. No interaction between path-length and structure was found.
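The effect sizes reported for the two main effects can be recovered from the F ratios and their degrees of freedom using the standard identity for partial eta squared, η²p = (F × df_effect) / (F × df_effect + df_error). The sketch below is offered only as a consistency check; the function name is hypothetical.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared computed from an F ratio and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# The two main-effect F ratios reported above, each with (1, 24) df.
for f_value in (8.747, 198.965):
    print(f_value, round(partial_eta_squared(f_value, 1, 24), 3))
```

Both values round to the reported effect sizes (0.267 and 0.892).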

Paired sample t-tests with Bonferroni correction (alpha of 0.05 corrected to 0.01 and alpha of 0.01 corrected to 0.002) were carried out to further clarify these results. Long structured sequences produced a significantly higher level of recall accuracy than short unstructured sequences [t₍₂₄₎ = 12.404, p < 0.01], demonstrating that structured sequences were recalled at a higher level of accuracy than unstructured sequences even when they had a longer path-length. Moreover, the effect of structure was very robust, as it was maintained in both short [t₍₂₄₎ = 11.681, p < 0.01] and long sequences [t₍₂₄₎ = 11.208, p < 0.01]. The effect of path-length proved less robust, as it did not emerge when sequences with long and short path-length were compared within the structured and unstructured conditions separately.
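The corrected thresholds follow from the usual Bonferroni rule of dividing the nominal alpha by the number of comparisons m. The value m = 5 is not stated in the text; it is inferred here from the corrected thresholds and matches the five pairwise comparisons described (long structured vs. short unstructured, structure within short and within long sequences, and path-length within structured and within unstructured sequences).

```latex
\alpha_{\text{corrected}} = \frac{\alpha}{m}, \qquad
\frac{0.05}{5} = 0.01, \qquad
\frac{0.01}{5} = 0.002
```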

Serial Position Analysis

Serial position effects were observed in each condition, as can be seen from the serial position curves presented in Figure 2B. A 2 (path-length: long/short) × 2 (structure: structured/unstructured) × 9 (serial position: 1/2/3/4/5/6/7/8/9) ANOVA for repeated measures was carried out. A significant main effect emerged for path-length [F_{(1, 24)} = 8.747, p < 0.01, $η_{p}^{2}$ = 0.267], structure [F_{(1, 24)} = 198.965, p < 0.001, $η_{p}^{2}$ = 0.892], and serial position [F_{(8, 192)} = 26.326, p < 0.001, $η_{p}^{2}$ = 0.523], confirming the main results reported above. A particularly strong interaction emerged between structure and serial position [F_{(8, 192)} = 13.082, p < 0.001, $η_{p}^{2}$ = 0.345]. As can be observed from Figure 2B, this can be easily accounted for by the different shape of the curves of the structured sequences on the one hand, and the unstructured sequences on the other. The latter curves resemble typical serial position curves for unstructured material, with some indications of possible primacy and recency effects (see Crowder, 1969). Such effects are absent in the structured sequences. A significant interaction between path-length and serial position was also found [F_{(8, 192)} = 5.510, p < 0.01, $η_{p}^{2}$ = 0.114]. This was not as conspicuous as the interaction between serial position and structure. It is likely to be explained by small differences occurring at different serial positions, which are more difficult to pinpoint. The three-way interaction between path-length, structure and serial position was not significant.

Discussion

In this experiment we observed beneficial effects of clustering similar to those observed in other studies (De Lillo, 2004; De Lillo and Lesk, 2010). We found that clustering had a beneficial effect in sequences with both a long and a short path. This suggests that path-length alone is unlikely to explain the effects of clustering in SSR. It has previously been suggested that clustered sequences afford a hierarchical coding of the sequence, with spatial clusters forming the superordinate level and the items within each cluster forming the subordinate level (e.g., De Lillo, 2004; De Lillo and Lesk, 2010). However, Experiment 1 was not designed to assess this possibility. An attempt at gaining better insight into the type of memory coding supported by clustering was made in Experiment 2 by manipulating the temporal pattern of the presentation of clustered sequences.

Experiment 2

Experiment 2 aimed to provide additional support for the independence of the effects of structure and path-length in SSR and some indication concerning the nature of the representation underlying sequences segregated by spatial clusters. In order to do so, we used an approach previously used in the study of chunking and hierarchical representation in recall in the spatial (Bor et al., 2003) and other domains (Farrell and Lelievre, 2012). The temporal structure of the presentation of the sequences was manipulated by inserting temporal pauses during the presentation of clustered sequences. Only clustered sequences were used in this experiment. For some sequences the pause was inserted at transitions between items within a cluster. For other sequences, the pause was inserted at transitions between clusters. As such, sequences could be either consistent or inconsistent with a hierarchical representation based on segmentation by spatial clusters. Other path characteristics of the sequences remained the same so that effects of the temporal structure of the sequence could not be confounded with other effects of path characteristics.