

Learning to Plan Using Harmonic Analysis of Diffusion Models

This paper summarizes research on an emerging framework for learning to plan using the Markov decision process (MDP) model. In this paradigm, two approaches to learning to plan have traditionally been studied: the indirect, model-based approach infers the state transition matrix and reward function from samples and then solves the Bellman equation to find the optimal (action) value function; the direct, model-free approach, most notably Q-learning, estimates the action value function directly. This paper describes a new harmonic analysis framework for planning based on estimating a diffusion model that captures information flow on a graph (discrete state space) or a manifold (continuous state space) using the Laplace heat equation. Diffusion models are significantly easier to learn than transition models, yet provide similar speedups in performance over model-free methods. Two methods for constructing novel plan representations from diffusion models are described: Fourier methods ...
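
The "Fourier" route mentioned in the abstract builds plan representations from the eigenvectors of a graph Laplacian estimated from sampled states. The Python sketch below is a minimal illustration of that idea under illustrative assumptions (a unit-square continuous state space, an epsilon-ball neighborhood graph, synthetic value targets, 20 basis functions); it is not the paper's algorithm, only a hand-rolled example of Laplacian eigenfunctions used as a basis for value-function approximation.

# Minimal sketch: estimate a diffusion model (graph Laplacian) from sampled
# states and use its smoothest eigenvectors as global basis functions.
# All numeric choices (300 samples, eps = 0.15, k = 20, the synthetic
# target function) are illustrative assumptions, not from the paper.
import numpy as np

rng = np.random.default_rng(0)

# Sample states from a 2-D continuous state space (assumed unit square).
states = rng.uniform(0.0, 1.0, size=(300, 2))

# Symmetric adjacency: connect states within radius eps, Gaussian weights.
eps = 0.15
d2 = ((states[:, None, :] - states[None, :, :]) ** 2).sum(-1)
W = np.exp(-d2 / (2 * 0.05 ** 2)) * (d2 < eps ** 2)
np.fill_diagonal(W, 0.0)

# Normalized graph Laplacian  L = I - D^{-1/2} W D^{-1/2}.
deg = W.sum(axis=1)
deg[deg == 0] = 1.0                      # guard isolated nodes
D_inv_sqrt = np.diag(1.0 / np.sqrt(deg))
L = np.eye(len(states)) - D_inv_sqrt @ W @ D_inv_sqrt

# The smoothest (lowest-eigenvalue) eigenvectors of L act as "Fourier"
# basis functions over the sampled state space.
eigvals, eigvecs = np.linalg.eigh(L)
k = 20
Phi = eigvecs[:, :k]                     # n_states x k basis matrix

# Fit a value function V ~= Phi w to targets by least squares (a stand-in
# for the policy-evaluation step; targets here are purely synthetic).
targets = np.exp(-((states - np.array([0.9, 0.9])) ** 2).sum(-1) / 0.1)
w, *_ = np.linalg.lstsq(Phi, targets, rcond=None)
V_hat = Phi @ w
print("mean absolute approximation error:", np.abs(V_hat - targets).mean())

Because the Laplacian is estimated purely from observed state transitions (the graph's connectivity), this basis can be learned without inferring a full transition matrix and reward function, which is the contrast the abstract draws with the indirect, model-based approach.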
Type: Conference
Year: 2007
Where: AIPS
Authors: Sridhar Mahadevan, Sarah Osentoski, Jeffrey Johns, Kimberly Ferguson, Chang Wang