Event Abstract

Deep Belief Nets as Function Approximators for Reinforcement Learning

Farnaz Abtahi 1* and Ian Fasel 1

  • 1 The University of Arizona, Department of Computer Science, United States

Real-world tasks often require learning methods to deal with continuous state and action spaces. In such applications, function approximation is useful for building a compact representation of the value function. One popular framework for this kind of function approximation is Neural Fitted Q-Iteration [1]. However, that algorithm relies solely on the observed value returns and makes no use of explicit structural information from the state space. In this paper, we propose a new reinforcement learning approach that uses Deep Belief Networks [2] to learn a generative model of the state space and to find an accurate approximation of the value function based on both the value returns and the learned model. Deep networks have been very successful on pattern recognition problems, but have not yet been used for learning agent controllers. The unsupervised pre-training phase of a Deep Belief Network initializes the parameters of the network in a region of parameter space that is more likely to contain good solutions, given the available data [3]. Reinforcement learning problems, however, are prone to data imbalance. This implies that, in order to take advantage of pre-training in reinforcement learning, the network should be provided with balanced initial data that covers the interesting regions of the state space. Our experiments confirm this and show that our approach significantly improves learning efficiency, especially when the initial data is collected carefully.
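To make the two-stage recipe concrete, the following minimal NumPy sketch pre-trains a small two-layer Deep Belief Network with one-step contrastive divergence [2] on state observations and then runs a fitted-Q regression loop in the spirit of NFQ [1] on top of the pre-trained features. All dimensions, hyperparameters, and the toy transition data are illustrative assumptions, not the authors' implementation; for brevity the fine-tuning step updates only the output layer, whereas the approach described above would adapt the whole network.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class RBM:
    """Restricted Boltzmann Machine trained with 1-step contrastive divergence (CD-1)."""
    def __init__(self, n_vis, n_hid):
        self.W = rng.normal(0.0, 0.01, (n_vis, n_hid))
        self.b_vis = np.zeros(n_vis)
        self.b_hid = np.zeros(n_hid)

    def hidden_probs(self, v):
        return sigmoid(v @ self.W + self.b_hid)

    def cd1_update(self, v, lr=0.05):
        # Positive phase: hidden activations driven by the data.
        h_prob = self.hidden_probs(v)
        h_samp = (rng.random(h_prob.shape) < h_prob).astype(float)
        # Negative phase: one Gibbs reconstruction step.
        v_rec = sigmoid(h_samp @ self.W.T + self.b_vis)
        h_rec = self.hidden_probs(v_rec)
        n = v.shape[0]
        self.W += lr * (v.T @ h_prob - v_rec.T @ h_rec) / n
        self.b_vis += lr * (v - v_rec).mean(axis=0)
        self.b_hid += lr * (h_prob - h_rec).mean(axis=0)

# --- Unsupervised pre-training on a batch of state observations ---------
# `states` stands in for initial data collected to cover the interesting
# regions of the state space (the balance issue raised in the abstract).
states = rng.random((500, 4))            # 500 observations of a 4-d state
rbm1, rbm2 = RBM(4, 16), RBM(16, 16)     # a two-layer DBN (sizes are illustrative)
for _ in range(50):
    rbm1.cd1_update(states)
h1 = rbm1.hidden_probs(states)
for _ in range(50):
    rbm2.cd1_update(h1)

def features(s):
    # Forward pass through the pre-trained DBN layers.
    return sigmoid(sigmoid(s @ rbm1.W + rbm1.b_hid) @ rbm2.W + rbm2.b_hid)

# --- Fitted-Q regression on top of the pre-trained features -------------
n_actions = 2
W_out = rng.normal(0.0, 0.01, (16, n_actions))  # fresh, untrained output layer
gamma, lr = 0.95, 0.1

# Toy transitions (s, a, r, s') just to make the sketch runnable.
s, s_next = states[:-1], states[1:]
a = rng.integers(0, n_actions, len(s))
r = rng.random(len(s))

for _ in range(100):
    # Bellman targets computed with the current Q estimate.
    targets = r + gamma * (features(s_next) @ W_out).max(axis=1)
    phi = features(s)
    pred = phi @ W_out
    # Gradient step on the chosen actions only (output layer only, for
    # brevity; full NFQ-style training would backpropagate through the
    # DBN layers as well).
    err = np.zeros_like(pred)
    idx = np.arange(len(a))
    err[idx, a] = targets - pred[idx, a]
    W_out += lr * phi.T @ err / len(s)
```

The key design point the sketch illustrates is the separation of phases: CD-1 shapes the hidden layers from state data alone, so the subsequent Bellman regression starts from features that already reflect the structure of the state space rather than from a random initialization.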

Acknowledgements

This research was supported by ONR "Science of Autonomy" contract N00014-09-1-065 and DARPA contract N10AP20008. The authors would also like to thank Tom Walsh for his contributions to this work.

References

[1] M. Riedmiller, Neural fitted Q-iteration - first experiences with a data efficient neural reinforcement learning method, In Proceedings of ECML 2005, 317–328. Porto, Portugal, 2005.
[2] G. E. Hinton, S. Osindero, and Y. Teh, A fast learning algorithm for deep belief nets, Neural Computation, 18: 1527–1554, 2006.
[3] D. Erhan, Y. Bengio, A. Courville, P. Manzagol, P. Vincent, and S. Bengio, Why does unsupervised pre-training help deep learning?, In Proceedings of AISTATS 2010, 201–208. Chia Laguna, Sardinia, Italy, 2010.

Keywords: Data Imbalance, Deep Belief Networks, Function Approximation, Reinforcement Learning

Conference: IEEE ICDL-EPIROB 2011, Frankfurt, Germany, 24 Aug - 27 Aug, 2011.

Presentation Type: Poster Presentation

Topic: Architectures

Citation: Abtahi F and Fasel I (2011). Deep Belief Nets as Function Approximators for Reinforcement Learning. Front. Comput. Neurosci. Conference Abstract: IEEE ICDL-EPIROB 2011. doi: 10.3389/conf.fncom.2011.52.00029

Copyright: The abstracts in this collection have not been subject to any Frontiers peer review or checks, and are not endorsed by Frontiers. They are made available through the Frontiers publishing platform as a service to conference organizers and presenters.

The copyright in the individual abstracts is owned by the author of each abstract or his/her employer unless otherwise stated.

Each abstract, as well as the collection of abstracts, are published under a Creative Commons CC-BY 4.0 (attribution) licence (https://creativecommons.org/licenses/by/4.0/) and may thus be reproduced, translated, adapted and be the subject of derivative works provided the authors and Frontiers are attributed.

For Frontiers’ terms and conditions please see https://www.frontiersin.org/legal/terms-and-conditions.

Received: 12 Apr 2011; Published Online: 12 Jul 2011.

* Correspondence: Ms. Farnaz Abtahi, The University of Arizona, Department of Computer Science, Tucson, Arizona, 85721-0077, United States, farnaza@cs.arizona.edu