This paper describes Icarus, an agent architecture that embeds a hierarchical reinforcement learning algorithm within a language for specifying agent behavior. An Icarus program expresses an approximately correct theory about how to behave, with options at varying levels of detail, while the Icarus agent determines the best options by learning from experience. We describe Icarus and its learning algorithm, then report on two experiments in a vehicle control domain. The first examines the benefit of new distinctions about state, whereas the second explores the impact of added plan structure. We show that background knowledge increases the learning rate and asymptotic performance, and decreases plan size by three orders of magnitude, relative to the typical formulation of the learning problem in our test domain.
Daniel G. Shapiro, Pat Langley, Ross D. Shachter