Probabilistic Policy Reuse for inter-task transfer learning

Policy Reuse is a reinforcement learning technique that efficiently learns a new policy by using similar policies learned in the past. The Policy Reuse learner improves its exploration by probabilistically exploiting those past policies. Policy Reuse was previously introduced and shown to be effective in problems with different reward functions over the same state and action spaces. In this article, we contribute Policy Reuse as a transfer learning method among different domains. We introduce extended MDPs to include domains and tasks, where domains have different state and action spaces, and tasks are problems with different rewards within a domain. We show how Policy Reuse can be applied among domains by defining and using a mapping between their state and action spaces. We use several domains, as versions of a simulated RoboCup Keepaway problem, where we show that Policy Reuse can be used as a mechanism of transfer learning significantly outperforming a basic policy...
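The action-selection idea described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function name `reuse_action`, the parameters `psi` (probability of exploiting the past policy) and `epsilon` (random exploration rate), the dictionary-based Q-table, and the toy mappings are all assumptions introduced for this example. The only elements taken from the abstract are the probabilistic exploitation of a past policy and the mapping between the state and action spaces of two domains.

```python
import random


def reuse_action(state, q_new, past_policy, map_state, map_action,
                 actions, psi, epsilon, rng=random):
    """Choose an action in the target domain (hypothetical sketch).

    With probability `psi`, exploit the past policy through the inter-domain
    mappings; otherwise act epsilon-greedily on the new policy's Q-values.
    """
    if rng.random() < psi:
        # Translate the target state into the source domain, ask the past
        # policy for an action, then translate that action back.
        return map_action(past_policy(map_state(state)))
    if rng.random() < epsilon:
        # Plain random exploration in the target domain.
        return rng.choice(actions)
    # Greedy choice under the policy being learned (unseen pairs count as 0).
    return max(actions, key=lambda a: q_new.get((state, a), 0.0))


# Toy mappings, assumed for illustration only: the target domain has one
# extra state feature, and both domains share the same action names.
def drop_extra_feature(target_state):
    return target_state[:2]


def same_action(source_action):
    return source_action


def past_policy(source_state):
    # Hypothetical source-domain policy: pass when the first feature is small.
    return "pass" if source_state[0] < 5 else "hold"


if __name__ == "__main__":
    q_new = {((3, 7, 9), "hold"): 2.0}
    choice = reuse_action((3, 7, 9), q_new, past_policy, drop_extra_feature,
                          same_action, ["hold", "pass"], psi=0.5, epsilon=0.1)
    print(choice)
```

In a full learner, `psi` would typically shrink over the course of an episode so that the agent relies less on the past policy as it gathers its own experience; the decay schedule is left out here because the abstract does not specify one.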

Action Spaces | Policy Reuse | Policy Reuse Learner | RAS 2010 |

Post Info

Added	30 Jan 2011
Updated	30 Jan 2011
Type	Journal
Year	2010
Where	RAS
Authors	Fernando Fernández, Javier García, Manuela M. Veloso
