PILCO: A Model-Based and Data-Efficient Approach to Policy Search

Deisenroth, MP; Rasmussen, CE


oai:spiral.imperial.ac.uk:10044/1/11585

Publication date: 7 October 2011
Publisher: IMLS

Abstract

In this paper, we introduce PILCO, a practical, data-efficient model-based policy search method. PILCO reduces model bias, one of the key problems of model-based reinforcement learning, in a principled way. By learning a probabilistic dynamics model and explicitly incorporating model uncertainty into long-term planning, PILCO can cope with very little data and facilitates learning from scratch in only a few trials. Policy evaluation is performed in closed form using state-of-the-art approximate inference. Furthermore, policy gradients are computed analytically for policy improvement. We report unprecedented learning efficiency on challenging and high-dimensional control tasks.

Copyright 2011 by the author(s)/owner(s).
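
As a rough illustration of what "a probabilistic dynamics model" with explicit model uncertainty can look like, the sketch below fits a small Gaussian process to state-action pairs from a toy system and reports the predictive variance near and far from the training data. It is not the authors' implementation: the toy dynamics, the fixed kernel hyperparameters, and the names `se_kernel` and `gp_predict` are assumptions made for this example only, and the closed-form policy evaluation and analytic policy gradients mentioned in the abstract are not reproduced here.

```python
import numpy as np


def se_kernel(a, b, lengthscale=1.0, signal_var=1.0):
    """Squared-exponential kernel between inputs a (n, d) and b (m, d)."""
    sq = np.sum(a**2, axis=1)[:, None] + np.sum(b**2, axis=1)[None, :] - 2.0 * a @ b.T
    return signal_var * np.exp(-0.5 * np.maximum(sq, 0.0) / lengthscale**2)


def gp_predict(X, y, Xs, noise_var=1e-2):
    """GP posterior mean and latent variance at test inputs Xs.

    Hyperparameters are fixed by hand here (an assumption of this sketch);
    in practice they would be learned from the data.
    """
    K = se_kernel(X, X) + noise_var * np.eye(len(X))
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    Ks = se_kernel(X, Xs)
    mean = Ks.T @ alpha
    v = np.linalg.solve(L, Ks)
    var = np.diag(se_kernel(Xs, Xs)) - np.sum(v**2, axis=0)
    return mean, var


# Hypothetical toy data: inputs are (state, action) pairs and targets are
# state differences, observed with a small amount of noise.
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(20, 2))
y = 0.1 * np.sin(X[:, 0]) + 0.1 * X[:, 1] + 0.01 * rng.standard_normal(20)

# One query at a training input, one far outside the observed region.
Xs = np.vstack([X[0], [5.0, 5.0]])
mean, var = gp_predict(X, y, Xs)
print("predictive variance near data:", var[0])
print("predictive variance far from data:", var[1])
```

The variance is small where data has been observed and reverts to the prior variance far from it. A planner that takes this variance into account, rather than trusting the mean prediction alone, has a way to avoid over-committing to regions where the model is only guessing, which is the kind of model bias the abstract refers to.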

Conference Paper


Last updated on 21/10/2013.

This paper was published in Spiral - Imperial College Digital Repository.
