Policy Reuse is a method that improves a reinforcement learning agent's ability to solve multiple tasks by building on past problem-solving experience, as accumulated in a Policy Library. Given a new task, a Policy Reuse learner uses the past policies in the library as a probabilistic bias in its new learning process. We show how the effectiveness of each reuse episode indicates the novelty of the new task with respect to the previously solved ones in the policy library. In this paper, we review Policy Reuse and introduce theoretical results demonstrating that: (i) a Policy Library can be built selectively and incrementally while learning different problems; (ii) the Policy Library can be understood as a basis of the domain that represents its structure through a set of core policies; and (iii) given the basis of a domain, we can define a lower bound on its reuse gain.
Fernando Fernández, Manuela M. Veloso
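The probabilistic bias described in the abstract can be illustrated with a minimal sketch: at each step the learner follows the past policy with some reuse probability, and otherwise acts epsilon-greedily on its current value estimates, with the reuse probability decaying over time. The function and parameter names below (`psi`, `epsilon`, the dictionary-based Q-table) are illustrative assumptions, not the authors' implementation.

```python
import random

def reuse_biased_action(state, q_values, past_policy, psi, epsilon, actions, rng):
    """Select an action, biasing exploration toward a past policy.

    With probability psi, follow the past policy's action for this state;
    otherwise act epsilon-greedily on the current Q-values.
    (Illustrative sketch of the probabilistic-bias idea, not the paper's code.)
    """
    if rng.random() < psi:
        return past_policy[state]          # reuse the old policy as a bias
    if rng.random() < epsilon:
        return rng.choice(actions)         # explore randomly
    # exploit: greedy action under the current Q estimates
    return max(actions, key=lambda a: q_values.get((state, a), 0.0))

# Toy usage: one step with a decaying reuse probability.
rng = random.Random(0)
actions = ["up", "down", "left", "right"]
past_policy = {"s0": "right"}              # hypothetical policy from the library
q = {("s0", "up"): 0.2, ("s0", "right"): 0.1}
psi, decay = 1.0, 0.95                     # psi shrinks each step: psi <- psi * decay
a = reuse_biased_action("s0", q, past_policy, psi, 0.1, actions, rng)
psi *= decay
```

As `psi` decays toward zero, the learner's behavior transitions from imitating the library policy to standard epsilon-greedy learning on the new task, which is what makes the per-episode reuse gain informative about how similar the new task is to the old one.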