How do we approach intrinsic motivation computationally?

Weber, Cornelius

doi:10.3389/neuro.12.001.2008

GENERAL COMMENTARY article

Front. Neurorobot., 22 May 2008

Volume 2 - 2008 | https://doi.org/10.3389/neuro.12.001.2008

Cornelius Weber*

Johann Wolfgang Goethe University, Germany

A commentary on
What is intrinsic motivation? A typology of computational approaches

by Pierre-Yves Oudeyer and Frederic Kaplan

What is the energy function guiding behavior and learning? Representation-based approaches like maximum entropy, generative models, sparse coding, or slowness principles can account for unsupervised learning of biologically observed structure in sensory systems from raw sensory data. However, they do not relate to behavior. Behavior-based approaches like reinforcement learning explain animal behavior in well-described situations. However, they rely on high-level representations which they cannot extract from raw sensory data. A combination of multiple goal functions seems the methodology of choice to understand the complexity of the brain. But what is the set of possible goals?

Focusing on the reinforcement learning framework, this question is addressed in the article “What is intrinsic motivation? A typology of computational approaches” by Pierre-Yves Oudeyer and Frederic Kaplan. It lists and classifies equations which extend the traditional concept of a “reward function”. Our behavior is driven not only by external rewards such as food, but also by a variety of intrinsic motivations. Some are aimed at exploration and so ensure delivery of rich sensory data, aiding unsupervised learning by active data acquisition, where the learning progress of the sensory system becomes the goal.

A novice reader may first want to become familiar with an example of a motivation function implemented in a model and applied in some scenario. A fun example is Schmidhuber (2006), which would be classified as “Learning Progress Motivation” (LPM) in the article of Oudeyer and Kaplan. The model consists of a predictor and a controller, also known as critic and actor, respectively. The critic is a sensory system that rewards the actor whenever its own learning progresses. The actor hence learns to act in such a way that the critic is presented with data that lead to the critic's learning progress. This can explain the learning of the actor's parameters by a reinforcement learning algorithm. The structure, parameters, and learning paradigm of the critic are not specified, but unsupervised learning, such as learning to predict, would be suitable.
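The critic/actor loop described above can be sketched in a few lines of Python. This is only a minimal illustration under strong assumptions, not the model of Schmidhuber (2006) and not an equation taken from Oudeyer and Kaplan: the names Predictor, Actor and run, the two constant sensory channels, the running-mean predictor, the epsilon-greedy controller, and the definition of learning progress as "squared prediction error before the update minus squared prediction error after it" are all hypothetical choices made for this sketch.

```python
import random


class Predictor:
    """Critic (hypothetical toy model): one running-mean prediction per action."""

    def __init__(self, n_actions, lr=0.5):
        self.pred = [0.0] * n_actions
        self.lr = lr

    def update(self, action, obs):
        """Learn from one observation and return the learning progress:
        squared error before the update minus squared error after it."""
        err_before = (obs - self.pred[action]) ** 2
        self.pred[action] += self.lr * (obs - self.pred[action])
        err_after = (obs - self.pred[action]) ** 2
        return err_before - err_after


class Actor:
    """Controller (hypothetical): epsilon-greedy over estimated intrinsic reward."""

    def __init__(self, n_actions, lr=0.5, epsilon=0.1, rng=None):
        self.value = [0.0] * n_actions
        self.lr = lr
        self.epsilon = epsilon
        self.rng = rng if rng is not None else random.Random(0)

    def act(self):
        # Explore with probability epsilon, otherwise pick the best-valued action.
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.value))
        return max(range(len(self.value)), key=lambda a: self.value[a])

    def learn(self, action, reward):
        # Move the value estimate of the chosen action toward the received reward.
        self.value[action] += self.lr * (reward - self.value[action])


def run(steps=50, seed=0):
    # Assumed environment: channel 0 is already predicted correctly (nothing
    # to learn), channel 1 is a constant signal the critic has not yet learned.
    signals = [0.0, 1.0]
    critic = Predictor(len(signals))
    actor = Actor(len(signals), rng=random.Random(seed))
    rewards = []
    for _ in range(steps):
        action = actor.act()
        reward = critic.update(action, signals[action])  # intrinsic reward only
        actor.learn(action, reward)
        rewards.append(reward)
    return critic, actor, rewards


if __name__ == "__main__":
    critic, actor, rewards = run()
    print("predictions:", critic.pred)
    print("action values:", actor.value)
    print("total intrinsic reward:", sum(rewards))
```

In this sketch the actor receives reward only while the critic is still improving on channel 1, and the reward shrinks toward zero as the prediction converges. With the simple one-sample measure used here, progress is never negative, so an unpredictable noisy channel would keep generating reward; the sketch therefore uses only deterministic channels.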

The broad overview of intrinsic motivation functions offered by Oudeyer and Kaplan leads to novel ways of conceptualizing and gaining new insights into the variety of computational mechanisms driving behavior and learning. A possible extension of the typology could include goal functions of unsupervised learning. Then an assessment of the relations between all relevant goal functions may provide a well-founded systems view of the brain.

References

Schmidhuber, J. (2006). Developmental Robotics, Optimal Artificial Curiosity, Creativity, Music, and the Fine Arts. Connection Science, 18(2):173–187.

Citation: Weber C (2008) How do we approach intrinsic motivation computationally? Front. Neurorobot. 2:1. doi:10.3389/neuro.12.001.2008

Published online: 22 May 2008.

Copyright: © 2008 Weber. This is an open-access article subject to an exclusive license agreement between the authors and the Frontiers Research Foundation, which permits unrestricted use, distribution, and reproduction in any medium, provided the original authors and source are credited.

*Correspondence: Cornelius Weber, Frankfurt Institute for Advanced Studies, Johann Wolfgang Goethe University, Ruth-Moufang-Straße 1, 60438 Frankfurt am Main, Germany. e-mail: c.weber@fias.uni-frankfurt.de
