AUTHOR=Balcarras Matthew, Womelsdorf Thilo

TITLE=A Flexible Mechanism of Rule Selection Enables Rapid Feature-Based Reinforcement Learning

JOURNAL=Frontiers in Neuroscience

VOLUME=Volume 10 - 2016

YEAR=2016

URL=https://www.frontiersin.org/journals/neuroscience/articles/10.3389/fnins.2016.00125

DOI=10.3389/fnins.2016.00125

ISSN=1662-453X

ABSTRACT=Learning in a new environment is influenced by prior learning and experience. Correctly applying a rule that maps a context to stimuli, actions, and outcomes enables faster learning and better outcomes than relying on learning strategies that are ignorant of task structure. However, it is often difficult to know when and how to apply learned rules in new contexts. In our study we explored how subjects employ different strategies for learning the relationship between stimulus features and positive outcomes in a probabilistic task context. We tested the hypothesis that task-naive subjects will show enhanced learning of feature-specific reward associations by switching to the use of an abstract rule that associates stimuli by feature type and restricts selections to that dimension. To test this hypothesis we designed a decision-making task in which subjects receive probabilistic feedback following choices between pairs of stimuli. In the task, trials are grouped by blocks into two contexts: in one type of block there is no unique relationship between a specific feature dimension (stimulus shape or colour) and positive outcomes, and, following an un-cued transition, alternating blocks have outcomes that are linked to either stimulus shape or colour. Two-thirds of subjects (n=22/32) exhibited behaviour that was best fit by a hierarchical feature-rule model. Supporting the prediction of the model mechanism, these subjects showed significantly enhanced performance in feature-reward blocks and rapidly switched their choice strategy to using abstract feature rules when reward contingencies changed. Choice behaviour of the other subjects (n=10/32) was fit by a range of alternative reinforcement learning models representing strategies that do not benefit from applying previously learned rules. In summary, these results show that untrained subjects are capable of flexibly shifting between behavioural rules by leveraging simple model-free reinforcement learning and context-specific selections to drive responses.