Event Abstract

Prefrontal neurons solve the temporal credit assignment problem during reinforcement learning

  • 1 Harvard Medical School, Massachusetts General Hospital, United States

Animals can learn new behaviors by forming associations between stimuli and actions under the guidance of appropriate reinforcement. The association of stimuli, actions and reinforcement is relatively straightforward when they overlap temporally. However, a dilemma is posed by reinforcers that are delayed with respect to their antecedent causes. This problem arises because the assignment of credit for a particular reward to a preceding event is less certain; many stimuli and several actions may have led up to that reward, so which is responsible? How does a learning system attribute these rewards to the causal event? This is the temporal credit assignment problem. To solve this problem, mechanisms must be in place to represent information about the relevant preceding events at the time of reinforcement. Therefore, we designed a task that created a temporal credit assignment problem, and sought to determine whether, during reinforcement, neurons in the lateral prefrontal cortex (PFC) could selectively represent an earlier, reward-predicting stimulus. Monkeys learned, by trial and error, which of four simultaneously presented cues was associated with later reward. They indicated a choice by executing a saccade, after a blank delay, to the former location of one of those cues. If the correct cue had indeed appeared there, a generic reinforcer (a green circle) signaled a correct selection, followed by reward. Critically, the green circle did not reveal which cue was correct. Thus, for learning to take place, the occurrence of this feedback must be linked in some fashion to the one particular cue, out of the four shown earlier, that predicted it. Because cue arrangement varied randomly on every trial, the location of the response could not predict reward; rather, this was an "object learning" task. As a control, we interleaved a "spatial learning" task in which reward was determined by spatial location, irrespective of which cue had appeared there.
This task was identical in all sensory and motor respects to the object task and differed only in the rule. Here, there was no temporal credit assignment problem because the reinforcement overlapped in time with the predictive feature, the selected location. We found that 1) individual PFC neurons selectively represented the correct object at the time of reinforcement, and over the entire, unscreened population of PFC neurons there was more feedback-period object selectivity in the object-learning than in the spatial-learning task; 2) some neurons had feedback-period object selectivity even if they had no visually evoked response at the time of actual cue presentation, earlier in the trial; 3) in neurons that had both cue-period object selectivity and feedback-period object selectivity, the rank-ordering of this selectivity was rarely (only 11% of the time) identical. Therefore, PFC neurons did indeed actively generate a representation of the relevant stimulus during reinforcement, specifically when there was a temporal credit assignment problem. However, the neuronal re-representation of this information during feedback differed from that observed during the actual presentation of the rewarded stimulus, arguing that population activity states - not individual neurons as "feature detectors" - are the objects of reinforcement.
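
The logic of the two task rules can be illustrated with a toy simulation. The sketch below is not the authors' model or analysis; it is a minimal illustration under stated assumptions: an epsilon-greedy learner with a simple delta-rule update, cue 0 (object rule) or location 0 (spatial rule) arbitrarily designated as correct, and hypothetical names and parameter values throughout (run_task, alpha, epsilon). The only point it makes is that when cue arrangement is shuffled on every trial, a learner that credits the chosen location stays at chance in the object task, whereas a learner that still holds the chosen cue's identity at feedback time can learn.

```python
import random


def run_task(rule, credit, n_trials=2000, n_cues=4,
             alpha=0.2, epsilon=0.1, seed=0):
    """Simulate a simplified four-cue task (illustrative assumptions only).

    rule:   "object"   -> cue 0 is rewarded wherever it appears
            "spatial"  -> location 0 is rewarded whatever appears there
    credit: "object"   -> feedback is credited to the chosen cue's identity
            "location" -> feedback is credited to the chosen location
    Returns the fraction of rewarded trials in the second half of the session.
    """
    rng = random.Random(seed)
    values = [0.0] * n_cues          # learned value per cue or per location
    first_scored = n_trials // 2
    rewarded = 0.0

    for t in range(n_trials):
        # Cue arrangement varies randomly on every trial:
        # layout[loc] is the cue shown at location loc.
        layout = list(range(n_cues))
        rng.shuffle(layout)

        # The feature the learner evaluates at each location.
        keys = layout if credit == "object" else list(range(n_cues))

        # Epsilon-greedy choice of a location, ties broken at random.
        if rng.random() < epsilon:
            loc = rng.randrange(n_cues)
        else:
            best = max(values[k] for k in keys)
            loc = rng.choice(
                [l for l in range(n_cues) if values[keys[l]] == best])

        # After the blank delay the cues are gone; only what the learner
        # still represents at feedback time can receive credit.
        remembered = keys[loc]

        if rule == "object":
            reward = 1.0 if layout[loc] == 0 else 0.0
        else:
            reward = 1.0 if loc == 0 else 0.0

        # Delta-rule update applied to the remembered feature.
        values[remembered] += alpha * (reward - values[remembered])

        if t >= first_scored:
            rewarded += reward

    return rewarded / (n_trials - first_scored)


if __name__ == "__main__":
    for rule, credit in [("object", "object"),
                         ("object", "location"),
                         ("spatial", "location")]:
        print(rule, credit, round(run_task(rule, credit), 3))
```

In this sketch, crediting the location is sufficient in the spatial task because the predictive feature is still present at feedback, while the object task requires the cue identity to be carried across the delay, which is the situation the abstract describes.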

Conference: Computational and Systems Neuroscience 2010, Salt Lake City, UT, United States, 25 Feb - 2 Mar, 2010.

Presentation Type: Poster Presentation

Topic: Poster session I

Citation: Asaad WF and Eskandar EN (2010). Prefrontal neurons solve the temporal credit assignment problem during reinforcement learning. Front. Neurosci. Conference Abstract: Computational and Systems Neuroscience 2010. doi: 10.3389/conf.fnins.2010.03.00089

Copyright: The abstracts in this collection have not been subject to any Frontiers peer review or checks, and are not endorsed by Frontiers. They are made available through the Frontiers publishing platform as a service to conference organizers and presenters.

The copyright in the individual abstracts is owned by the author of each abstract or his/her employer unless otherwise stated.

Each abstract, as well as the collection of abstracts, is published under a Creative Commons CC-BY 4.0 (attribution) licence (https://creativecommons.org/licenses/by/4.0/) and may thus be reproduced, translated, adapted and be the subject of derivative works provided the authors and Frontiers are attributed.

For Frontiers’ terms and conditions please see https://www.frontiersin.org/legal/terms-and-conditions.

Received: 20 Feb 2010; Published Online: 20 Feb 2010.

* Correspondence: Wael F Asaad, Harvard Medical School, Massachusetts General Hospital, Boston, United States, wfasaad@mit.edu