Using inaccurate models in reinforcement learning

In the model-based policy search approach to reinforcement learning (RL), policies are found using a model (or "simulator") of the Markov decision process. However, for high-dimensional continuous-state tasks, it can be extremely difficult to build an accurate model, so the algorithm often returns a policy that works in simulation but not in real life. The other extreme, model-free RL, tends to require infeasibly large numbers of real-life trials. In this paper, we present a hybrid algorithm that requires only an approximate model and only a small number of real-life trials. The key idea is to successively "ground" the policy evaluations using real-life trials, but to rely on the approximate model to suggest local changes. Our theoretical results show that this algorithm achieves near-optimal performance in the real system, even when the model is only approximate. Empirical results also demonstrate that--when given only a crude model and a small number of rea...
Pieter Abbeel, Morgan Quigley, Andrew Y. Ng
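The abstract only sketches the key idea, so below is a minimal, hedged illustration of one way the "grounding" step could look. Everything here (the names f_approx, real_system, policy, reward, and the finite-difference gradient) is an illustrative assumption, not the paper's code: the sketch corrects the approximate model with a single real trajectory so it reproduces that trajectory exactly, then uses the corrected model only to suggest local policy changes.

# A sketch under stated assumptions: deterministic dynamics, a real system we can
# roll out a few times per iteration, and a policy parameterized by a numpy vector.
import numpy as np

def rollout(dynamics, policy, theta, s0, T):
    # Simulate T steps of `dynamics` under policy(theta); return states and actions.
    states, actions, s = [s0], [], s0
    for t in range(T):
        a = policy(theta, s, t)
        s = dynamics(s, a, t)
        states.append(s)
        actions.append(a)
    return states, actions

def grounded_model(f_approx, real_states, real_actions):
    # Time-varying bias correction: the corrected model reproduces the observed
    # real trajectory exactly under the current policy ("grounding" the evaluation).
    def corrected(s, a, t):
        bias = real_states[t + 1] - f_approx(real_states[t], real_actions[t])
        return f_approx(s, a) + bias
    return corrected

def improve_policy(theta, f_approx, real_system, policy, reward, s0, T,
                   n_iters=20, step=0.05, fd_eps=1e-3):
    for _ in range(n_iters):
        # 1. One real-life trial per iteration grounds the current policy's evaluation.
        real_s, real_a = rollout(lambda s, a, t: real_system(s, a), policy, theta, s0, T)
        model = grounded_model(f_approx, real_s, real_a)

        # 2. The (possibly inaccurate) model is only asked for local changes:
        #    estimate the gradient of the return in the corrected model.
        def model_return(th):
            ss, aa = rollout(model, policy, th, s0, T)
            return sum(reward(s, a) for s, a in zip(ss[:-1], aa))

        grad = np.zeros(len(theta))
        base = model_return(theta)
        for i in range(len(theta)):
            th = np.array(theta, dtype=float)
            th[i] += fd_eps
            grad[i] = (model_return(th) - base) / fd_eps

        theta = theta + step * grad
    return theta

In this sketch, the single real trial keeps the policy evaluation honest, while all of the (cheap) gradient queries go to the corrected approximate model.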
Type: Conference
Year: 2006
Where: ICML
Authors: Pieter Abbeel, Morgan Quigley, Andrew Y. Ng