Event Abstract

Does the visual system use natural experience to construct size invariant object representations?

  • 1 MIT, McGovern Inst, Dept of Brain & Cog Sci, United States

Object recognition is challenging because each object produces myriad retinal images. Responses of neurons at the top of the ventral visual stream (inferior temporal cortex, IT) exhibit object selectivity that is unaffected by these image changes. How do IT neurons attain this tolerance ("invariance")? One powerful idea is that the temporal contiguity of natural visual experience can instruct tolerance (e.g. Foldiak, Neural Computation, 1991): because objects remain present for many seconds, whereas object or viewer motion causes changes in each object’s retinal image over shorter time intervals, the ventral stream could construct tolerance by learning to associate neuronal representations that occur closely in time. We recently found a neuronal signature of such learning in IT: temporally contiguous experience with different object images at different retinal positions can robustly reshape ("break") IT position tolerance, producing a tendency for IT neurons to confuse the identities of those temporally coupled objects across their manipulated positions (Li & DiCarlo, Science, 2008). A similar manipulation can induce the same pattern of confusion in the position tolerance of human object perception (Cox, Meier, Oertelt & DiCarlo, Nat Neurosci, 2005). Does this IT neuronal learning reflect a canonical unsupervised learning algorithm that the ventral stream relies on to achieve tolerance to all types of image variation (e.g. object size and pose changes)? To begin to answer this question, we here extend our position tolerance paradigm to object size changes. Without supervision, non-human primates were exposed to an altered visual world in which we temporally coupled the experience of two object images of different sizes at each animal’s center of gaze: for example, a small image of one object (P, the neuronally preferred object) was consistently followed by a large image of a second object (N), rendering the small image of P temporally contiguous with the large image of N.
We made IT neuronal selectivity measurements before and after the animals received ~2 hours of experience in the unsupervised, altered visual world. Consistent with our results on position tolerance, we found that this size experience manipulation robustly reshapes IT size tolerance over a period of hours. Specifically, we found a change in neuronal selectivity (P-N) across the manipulated objects and their manipulated sizes that was not seen in experienced controls, producing a tendency to confuse those object identities across those sizes. This change in size tolerance was specific to the manipulated objects and grew gradually stronger with increasing experience, at a rate similar to that of position tolerance learning (~5 spikes/s per hour of exposure). Finally, in a separate experiment, we examined how the temporal direction of the experience affects the learning: do temporally early images teach temporally later ones, or vice versa? We found greater learning for the temporally later images, suggesting a Hebbian-like learning mechanism (e.g. Sprekeler & Gerstner, COSYNE, 2009; Wallis & Rolls, Prog Neurobiol, 1997). We speculate that these converging results on IT position and size tolerance plasticity reflect an underlying unsupervised cortical learning mechanism by which the ventral visual stream acquires and maintains its tolerant object representations.
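As an illustration only, the temporal-contiguity idea described above can be sketched with a trace learning rule of the kind proposed by Foldiak (1991). This is a toy model, not the analysis or model used in this study. The single linear unit, the one-hot image inputs, the initial weights, the parameters alpha and delta, the trace reset between sequences, and the second swapped pair (small N followed by large P) are all assumptions made for the example.

```python
import numpy as np

# Hypothetical input channels: one per (object, size) image.
P_SMALL, P_LARGE, N_SMALL, N_LARGE = range(4)


def expose(w, sequences, n_repeats=20, alpha=0.02, delta=0.5):
    """Apply a trace-rule update over repeated two-image sequences.

    w         : initial weights of one linear unit (one weight per image)
    sequences : list of (earlier_image, later_image) index pairs
    alpha     : learning rate (assumed value)
    delta     : fraction of the trace replaced by current activity (assumed)
    """
    w = np.array(w, dtype=float)  # work on a copy
    for _ in range(n_repeats):
        for pair in sequences:
            trace = 0.0  # assumption: activity has decayed between sequences
            for idx in pair:
                x = np.zeros_like(w)
                x[idx] = 1.0                               # one-hot image input
                y = float(w @ x)                           # unit response
                trace = (1.0 - delta) * trace + delta * y  # running average of activity
                w += alpha * trace * (x - w)               # trace rule update
    return w


def selectivity_large(w):
    """Toy analogue of the P-N selectivity at the large (manipulated) size."""
    return w[P_LARGE] - w[N_LARGE]


# The unit initially prefers object P at both sizes.
w_init = np.array([0.4, 0.4, 0.1, 0.1])

# Normal world: each small image is followed by the same object at large size.
w_normal = expose(w_init, [(P_SMALL, P_LARGE), (N_SMALL, N_LARGE)])

# Altered world: small P is followed by large N (and, by assumption, small N by large P).
w_swap = expose(w_init, [(P_SMALL, N_LARGE), (N_SMALL, P_LARGE)])

print("P-N at large size, initial:", selectivity_large(w_init))
print("P-N at large size, normal exposure:", selectivity_large(w_normal))
print("P-N at large size, swapped exposure:", selectivity_large(w_swap))
```

In this toy setting the swapped pairing raises the unit's weight for the later image (large N), because the trace carries the response to the earlier image (small P) forward in time, and the P-N difference at the large size shrinks relative to normal exposure.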

Conference: Computational and Systems Neuroscience 2010, Salt Lake City, UT, United States, 25 Feb - 2 Mar, 2010.

Presentation Type: Poster Presentation

Topic: Poster session II

Citation: Li N and DiCarlo J (2010). Does the visual system use natural experience to construct size invariant object representations? Front. Neurosci. Conference Abstract: Computational and Systems Neuroscience 2010. doi: 10.3389/conf.fnins.2010.03.00326

Copyright: The abstracts in this collection have not been subject to any Frontiers peer review or checks, and are not endorsed by Frontiers. They are made available through the Frontiers publishing platform as a service to conference organizers and presenters.

The copyright in the individual abstracts is owned by the author of each abstract or his/her employer unless otherwise stated.

Each abstract, as well as the collection of abstracts, is published under a Creative Commons CC-BY 4.0 (attribution) licence (https://creativecommons.org/licenses/by/4.0/) and may thus be reproduced, translated, adapted and be the subject of derivative works provided the authors and Frontiers are attributed.

For Frontiers’ terms and conditions please see https://www.frontiersin.org/legal/terms-and-conditions.

Received: 08 Mar 2010; Published Online: 08 Mar 2010.

* Correspondence: Nuo Li, MIT, McGovern Inst, Dept of Brain & Cog Sci, United States, linuo@mit.edu