Event Abstract

A Computational Model for Cued Infant Learning

  • 1 CNRS and Aix-Marseille University, Laboratoire de Psychologie Cognitive, France
  • 2 Birkbeck, University of London, Department of Psychological Sciences, United Kingdom
  • 3 Japan Advanced Institute of Science and Technology, School of Knowledge Science, Japan
  • 4 Indiana University, Department of Psychological and Brain Sciences, United States

I. INTRODUCTION
In a busy multimodal world, infants must parse useful information from a swirl of perceptual events. One way to accomplish this is to rely on attention cues that guide them to relevant learning events. Many cues can capture infants’ attention, but which ones help infants learn what to learn? Wu and Kirkham [1] (hereafter W&K) showed that, by 8 months of age, social cues (e.g., a turning face that used infant-directed speech) produce better spatial learning of audio-visual events than non-social cues (i.e., flashing squares that shift attention to the target location). With non-social cues (flashing squares), 8-month-olds learned only the cued locations, regardless of the multimodal information. With no cues, infants had difficulty remembering the locations of the appropriate objects. W&K measured infants’ gaze behavior while they were presented with dynamic audio-visual events (i.e., cats moving to a bloop sound and dogs moving to a boing sound) in white frames in the corners of a black background (Figure 1). An object’s appearance in a spatial location consistently predicted a location-specific sound. On every familiarization trial, infants were shown identical audio-visual events in two diagonally opposite corners of the screen (i.e., two valid binding locations). To test the effects of attentional cueing on audio-visual learning, either a social (i.e., a real face) or non-social (i.e., colorful flashes) cue shifted infants’ attention to one of the two identical events on every trial. For the social cue, a face appeared, spoke to the infant, and turned towards one of the lower corners containing an object. For the non-social cue, a red flashing square wrapped around the target frame appeared and disappeared at a regular interval (i.e., flashed continuously) throughout the familiarization trial. During the test trials, only the four blank frames were displayed on the screen while one of the sounds played.
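
To make the trial structure concrete, the following minimal Python sketch generates one familiarization trial and one test trial as we read them from W&K's design. The corner labels, the assignment of each object-sound pairing to one diagonal, and all function names are illustrative assumptions, not W&K's actual stimulus lists or code.

    import random

    # Minimal sketch of the W&K trial structure (illustrative assumptions only):
    # each object-sound pairing occupies one diagonal of the display; a cue points
    # to one of the two identical events; at test, only a sound is played over
    # blank frames.
    DIAGONALS = {
        ("top_left", "bottom_right"): ("cat", "bloop"),
        ("top_right", "bottom_left"): ("dog", "boing"),
    }

    def familiarization_trial(cue_type):
        """One familiarization trial: two identical audio-visual events plus a cue."""
        diagonal, (obj, sound) = random.choice(list(DIAGONALS.items()))
        events = [{"location": loc, "object": obj, "sound": sound} for loc in diagonal]
        cued_location = random.choice(diagonal)   # the cue shifts attention to one event
        return {"cue": cue_type, "events": events, "cued_location": cued_location}

    def test_trial():
        """One test trial: blank frames and one sound; the correct locations are
        that sound's diagonal from familiarization."""
        diagonal, (_, sound) = random.choice(list(DIAGONALS.items()))
        return {"sound": sound, "correct_locations": list(diagonal)}

    print(familiarization_trial("social"))
    print(test_trial())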

II. THE MODEL
The purpose of this study is to characterise the neural mechanisms at work in infants performing this task. To do so, the model operates in a closed loop: its outputs (where it is going to “look”) determine its next inputs (what it will “see” next). The model (illustrated in Figure 2) is essentially an adaptation of Sirois and Mareschal's architecture for infant habituation (HAB: Habituation, Autoassociation, and Brain) [2], combined with Mozer and Sitton's model of visual attention [3]. However, the model departs from the former in that it is capable of multimodal learning among distractors, and from the latter in that the winner-take-all (WTA) network is taken to model overt rather than covert attentional shifts. One novel and critical feature of the model is this feedback wiring, whereby the last output determines the current inputs. In this way, we can attempt to simulate the processes taking place in the infant's brain as the sequence of visual and auditory events unfolds, during training and test trials. Two auto-associator networks are trained to store (left network, Hopfield Network [HN]) or suppress (right network, Novelty Detector [ND]) the activation pattern elicited by the attended part of a multimodal input event (filtered input level). The states to which these networks converge are fed into a winner-take-all network of location units (WTA network, upper network). The winning unit determines the model's next saccade: which part of the multimodal event will be attended to and which parts will be filtered.
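
The following is a minimal sketch, in Python with numpy, of this closed perception-action loop: a filtered multimodal input feeds a Hebbian autoassociator (HN) that stores patterns and an anti-Hebbian novelty detector (ND) that suppresses familiar patterns; their outputs drive a winner-take-all layer over locations, and the winning location becomes the next attended (and thus unfiltered) part of the input. Network sizes, learning rules, and the filtering gain below are illustrative assumptions, not the actual model's parameters.

    import numpy as np

    N_LOC, DIM = 4, 16                 # four screen locations, pattern dimensionality
    rng = np.random.default_rng(0)
    W_hn = np.zeros((DIM, DIM))        # Hopfield-style weights (store patterns)
    W_nd = np.zeros((DIM, DIM))        # novelty-detector weights (suppress familiar patterns)

    def filter_input(event, attended, gain=0.2):
        """Keep the attended location's pattern; attenuate the other locations."""
        return {loc: (p if loc == attended else gain * p) for loc, p in event.items()}

    def step(event, attended, lr=0.05):
        """One loop iteration: filter, read out HN and ND, update weights, pick the next look."""
        global W_hn, W_nd
        filtered = filter_input(event, attended)
        activations = np.zeros(N_LOC)
        for loc, pattern in filtered.items():
            stored = np.tanh(W_hn @ pattern)              # HN: recall of a stored pattern
            novelty = pattern - np.tanh(W_nd @ pattern)   # ND: what is not yet familiar
            activations[loc] = np.linalg.norm(stored) + np.linalg.norm(novelty)
        p = filtered[attended]
        W_hn += lr * np.outer(p, p)                       # Hebbian storage of the attended pattern
        W_nd += lr * np.outer(p, p)                       # ND habituates to the attended pattern
        return int(np.argmax(activations))                # WTA: the next saccade target

    # One familiarization-like event: identical patterns in two opposite corners.
    pattern = rng.normal(size=DIM)
    event = {0: pattern, 1: np.zeros(DIM), 2: np.zeros(DIM), 3: pattern.copy()}
    attended = 0                                          # the cue directs the first look
    for _ in range(5):
        attended = step(event, attended)                  # the output becomes the next input focus
        print("next look:", attended)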

III. RESULTS
Over the four blocks in W&K’s No Cue condition, infants were equally likely to look at all locations when presented with the auditory cue. This finding is mirrored in our simulations. When multimodal training events were cued in W&K’s study, infants looked significantly more at cued locations (in the Square condition) or at cued correct locations (in the Social condition, last two blocks) during test trials. The middle right and bottom right graphs in Figure 3 show the same advantage in the model for cued locations over non-cued locations. The main finding from W&K was that different cues produced different types of learning. What we might call “shallow learning” was observed in the Square condition, where infants looked preferentially at locations that had been cued during training (in Figure 3, middle, the black bars are higher than the white bars) but without associating a location with a sound (the black bars are of equal height). By contrast, “deep learning” was observed in the Social condition, but only in the last two blocks, where infants looked significantly more at the correct cued location than at any other peripheral location (bottom, the correct black bar is clearly higher than both the incorrect black bar and the white bars).

IV. CONCLUSIONS AND FURTHER RESEARCH
This study presents a neuro-computational model that builds on two successful predecessors from different fields of cognitive science. The model can account for new infant data involving cued multimodal learning in the presence of distractors. In particular, we have identified a candidate mechanism that might underlie widely observed differences between social and non-social cues in infancy. This mechanism holds that infants make use of more stringent attentional filters when they are exposed to social cues than when they are exposed to non-social cues.
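
As one way of making this proposal concrete, the snippet below (reusing the filtering function from the model sketch above) differs between conditions only in how strongly unattended locations are attenuated; the specific gain values are illustrative assumptions, not fitted parameters.

    # Illustrative filter stringency per cue condition (assumed values, not fitted):
    # a stringent filter lets the cued binding dominate learning; a lax filter lets
    # distractor locations be stored alongside it; no filtering yields no advantage.
    FILTER_GAIN = {"social": 0.05, "square": 0.5, "none": 1.0}

    def filter_input(event, attended, cue_type):
        gain = FILTER_GAIN[cue_type]
        return {loc: (p if loc == attended else gain * p) for loc, p in event.items()}
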
We are now conducting two streams of research to better understand how this network behaves. First, we are tracing through the network how the final saccades are generated. Second, we are investigating the trajectory of proportional looking times as training unfolds and comparing these data to the infant data [4]. These two streams of research will further ground our understanding of how infants learn from attention cues, a critical component for rapid learning.
In the long term, the model could also be improved by strengthening its links to the brain. For instance, Sirois and Mareschal related the HN and ND to the cortex and the hippocampus, respectively, and the model might be improved by reinstating the interaction that was originally present between these two systems in HAB. More generally, the cortex, hippocampus, and superior colliculus all perform more than one function that might well be relevant to this model, for instance coding for auditory maps in the case of the colliculus [5], or input recoding [6] and interleaved learning [7] in the case of the hippocampus. A model that could recode input patterns for better storage and present them repeatedly to the infant during less active periods could offer new perspectives on how infants so universally succeed in learning what to learn.

Figure 1
Figure 2
Figure 3

Acknowledgements

We thank Jochen Triesch, Denis Mareschal, Sylvain Sirois, Dan Yurovsky, Bruno Laeng, and Nadja Althaus for their help and useful remarks on this work.

References

[1] R. Wu and N. Z. Kirkham. (2010). No two cues are alike: Depth of learning during infancy is dependent on what orients attention. Journal of Experimental Child Psychology, 107, 118-136.
[2] S. Sirois and D. Mareschal. (2004). An interacting systems model of infant habituation. Journal of Cognitive Neuroscience, 16, 1352-1362.
[3] M. C. Mozer and M. Sitton. (1998). Computational modeling of spatial attention. In H. Pashler (Ed.), Attention (pp. 341-393). East Sussex: Psychology Press Ltd.
[4] C. Yu and L. B. Smith. (in press). What you learn is what you see: Using eye movements to study infant cross-situational word learning. Developmental Science.
[5] A. J. King, J. W. H. Schnupp, S. Carlile, A. L. Smith, and L. D. Thompson. (1996). The development of topographically aligned maps of visual and auditory space in the superior colliculus. In B. E. Stein, M. Narita, and T. Bando (Eds.), Extrageniculostriate Mechanisms of Visually Guided Orientation Behavior, Progress in Brain Research, 112, pp. 335-350.
[6] W. B. Levy, A. B. Hocking, and X. Wu. (2005). Interpreting hippocampal function as recoding and forecasting. Neural Networks, 18, 1242-1264.
[7] J. L. McClelland, B. L. McNaughton, and R. C. O'Reilly. (1995). Why there are complementary learning systems in the hippocampus and neocortex: Insights from the successes and failures of connectionist models of learning and memory. Psychological Review, 102, 419-457.

Keywords: attentional cueing, Connectionist Modeling, Eye-tracking, Infancy, Learning

Conference: IEEE ICDL-EPIROB 2011, Frankfurt, Germany, 24 Aug - 27 Aug, 2011.

Presentation Type: Poster Presentation

Topic: Social development

Citation: Hannagan T, Wu R, Hidaka S and Yu C (2011). A Computational Model for Cued Infant Learning. Front. Comput. Neurosci. Conference Abstract: IEEE ICDL-EPIROB 2011. doi: 10.3389/conf.fncom.2011.52.00018


Received: 10 Apr 2011; Published Online: 12 Jul 2011.

* Correspondence: Dr. Rachel Wu, Birkbeck, University of London, Department of Psychological Sciences, London, WC1E 7HX, United Kingdom, rachelwu2006@gmail.com