Probabilistic modeling of novelty-based exploration and operant conditioning.
1 TU Berlin, Germany
2 Charité University Medicine, Germany
One of the basic adaptation mechanisms of living beings is the exploration of novel, as yet unknown environments. Exploring a novel option may even be advantageous from an evolutionary point of view, since the new option may prove to be more rewarding than the options exploited so far. It is thus not surprising that, in humans, novel stimuli tend to be associated with stronger explorative behavior in reward-based learning paradigms (1). Stimulus novelty enhances these exploratory choices through engagement of neural reward systems, such as the ventral striatum (VS) and the ventral tegmental area (VTA), which have been shown to be activated by both novelty and reward (2). VS and VTA are mesolimbic dopaminergic structures that are involved in operant conditioning and closely linked to reward-related learning (3). From a computational perspective, it has been postulated that novel, unexpected stimuli are intrinsically rewarding, equivalent to a “novelty bonus” (4).
The present study investigated novelty-based explorative behavior and its influence on operant conditioning in a reward-motivated decision-making task. The paradigm consisted of a first phase, in which subjects were familiarised with different categories, followed by the actual testing phase, in which subjects had to choose between two categories. One of the categories was more rewarding than the other, and subjects had to learn which of the two categories was “best”. Depending on the condition, novel stimuli belonging to the corresponding category were presented either in the more rewarding or in the less rewarding category, which accelerated or decelerated learning, respectively, as compared to the control condition.
Additional fMRI analysis revealed a prediction error activation in the VS and VTA, which was stronger in novel rewarded trials than in standard rewarded trials.
In order to investigate subjects' behavior quantitatively, we developed a probabilistic hidden Markov model and introduced a parameter to allow for explorative behavior towards novel stimuli. This novelty-driven exploration bias was implemented by biasing the probability of an action towards the category presented with the novel stimulus. Individual variation in the novelty response was characterized by finding the best-fitting values of the metaparameters for each subject.
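The abstract does not give the model equations, so the following is only a minimal sketch of how such a model could look, under stated assumptions: two hidden states (which category is currently "best"), hypothetical reward probabilities of 0.8 and 0.2, a logistic choice rule with inverse temperature `beta`, and a `novelty_bias` term that shifts the choice towards the category showing a novel stimulus. All function names, parameters and values are illustrative and not taken from the study. Fitting is shown as a simple grid search over the bias that minimises the negative log-likelihood of one subject's choices.

```python
import math

def belief_update(p_a_best, chose_a, rewarded, p_hi=0.8, p_lo=0.2, switch=0.0):
    """One forward step of a two-state HMM; hidden state = which category is best.
    p_hi, p_lo and switch are assumed values, not taken from the abstract."""
    # Transition step: the hidden state may flip with probability `switch`.
    prior = p_a_best * (1 - switch) + (1 - p_a_best) * switch
    # Reward probability of the chosen category under each hidden state.
    p_r_if_a = p_hi if chose_a else p_lo
    p_r_if_b = p_lo if chose_a else p_hi
    like_a = p_r_if_a if rewarded else 1 - p_r_if_a
    like_b = p_r_if_b if rewarded else 1 - p_r_if_b
    # Bayes rule over the two hidden states.
    post = like_a * prior
    return post / (post + like_b * (1 - prior))

def choice_prob_a(p_a_best, novel_in, beta=3.0, novelty_bias=0.0):
    """P(choose A): logistic in the belief, shifted towards the category
    that currently shows a novel stimulus (novel_in is 'A', 'B' or None)."""
    drive = beta * (2 * p_a_best - 1)
    if novel_in == "A":
        drive += novelty_bias
    elif novel_in == "B":
        drive -= novelty_bias
    return 1 / (1 + math.exp(-drive))

def neg_log_likelihood(trials, beta, novelty_bias):
    """trials: list of (chose_a, rewarded, novel_in) tuples for one subject."""
    p, nll = 0.5, 0.0
    for chose_a, rewarded, novel_in in trials:
        pa = choice_prob_a(p, novel_in, beta, novelty_bias)
        nll -= math.log(pa if chose_a else 1 - pa)
        p = belief_update(p, chose_a, rewarded)
    return nll

def fit_novelty_bias(trials, beta=3.0, grid=None):
    """Grid search for the bias that best explains one subject's choices."""
    grid = grid or [i / 10 for i in range(-20, 21)]
    return min(grid, key=lambda b: neg_log_likelihood(trials, beta, b))

# Hypothetical subject who always picks the category with the novel stimulus.
trials = [(True, True, "A"), (False, False, "B"),
          (True, True, "A"), (False, True, "B")]
print(fit_novelty_bias(trials))
```

In this sketch a positive fitted bias indicates a subject drawn towards novel stimuli, a bias near zero indicates choices driven by the learned reward belief alone, and a negative bias indicates novelty avoidance.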
Altogether, we have shown not only that novelty enhances neural responses underlying reward anticipation in decision-making, but also that novel stimuli have a direct influence on operant conditioning. These results, as well as individual differences, could be reproduced by a probabilistic computational model.
References
1. Wittmann et al. (2008) Neuron 58, 967-973
2. Reed et al. (1996) Animal Learning & Behavior 24, 38-45
3. Schultz et al. (1997) Science 275, 1593-1599
4. Kakade & Dayan (2002) Neural Networks 15, 549-559
Keywords: decision-making, hidden Markov model, novelty, reward-motivated learning
Conference: BC11: Computational Neuroscience & Neurotechnology Bernstein Conference & Neurex Annual Meeting 2011, Freiburg, Germany, 4 Oct - 6 Oct, 2011.
Presentation Type: Poster
Topic: learning and plasticity
Citation: Houillon A, Lorenz R, Wüstenberg T, Gallinat J, Heinz A and Obermayer KH (2011). Probabilistic modeling of novelty-based exploration and operand conditioning. Front. Comput. Neurosci. Conference Abstract: BC11: Computational Neuroscience & Neurotechnology Bernstein Conference & Neurex Annual Meeting 2011. doi: 10.3389/conf.fncom.2011.53.00091
Copyright:
The abstracts in this collection have not been subject to any Frontiers peer review or checks, and are not endorsed by Frontiers.
They are made available through the Frontiers publishing platform as a service to conference organizers and presenters.
The copyright in the individual abstracts is owned by the author of each abstract or his/her employer unless otherwise stated.
Each abstract, as well as the collection of abstracts, are published under a Creative Commons CC-BY 4.0 (attribution) licence (https://creativecommons.org/licenses/by/4.0/) and may thus be reproduced, translated, adapted and be the subject of derivative works provided the authors and Frontiers are attributed.
For Frontiers’ terms and conditions please see https://www.frontiersin.org/legal/terms-and-conditions.
Received: 23 Aug 2011; Published Online: 04 Oct 2011.
* Correspondence: Dr. Audrey Houillon, TU Berlin, Berlin, Germany, audrey.houillon@tu-berlin.de