
AAAI
2006

Targeting Specific Distributions of Trajectories in MDPs

We define TTD-MDPs, a novel class of Markov decision processes where the traditional goal of an agent is changed from finding an optimal trajectory through a state space to realizing a specified distribution of trajectories through the space. After motivating this formulation, we show how to convert a traditional MDP into a TTD-MDP. We derive an algorithm for finding non-deterministic policies by constructing a trajectory tree that allows us to compute locally-consistent policies. We specify the necessary conditions for solving the problem exactly and present a heuristic algorithm for constructing policies when an exact answer is impossible or impractical. We present empirical results for our algorithm in two domains: a synthetic grid world and stories in an interactive drama or game.
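To make the trajectory-tree idea concrete, here is a minimal sketch, not the authors' algorithm. It assumes a hypothetical target distribution over complete trajectories (tuples of state names) and assumes deterministic transitions, so that picking the next state is the same as picking an action. The names `prefix_mass`, `local_policy`, and `target` are illustrative only. Under those assumptions, the local choice at each tree node is the probability mass of the child subtree divided by the mass of the node.

```python
from collections import defaultdict


def prefix_mass(target):
    """Sum the target probability over every prefix of every complete trajectory."""
    mass = defaultdict(float)
    for traj, p in target.items():
        for i in range(len(traj) + 1):
            mass[traj[:i]] += p
    return mass


def local_policy(target):
    """For each tree node (trajectory prefix), return the desired
    distribution over its one-step extensions."""
    mass = prefix_mass(target)
    policy = defaultdict(dict)
    for prefix, m in mass.items():
        if prefix and m > 0:
            parent = prefix[:-1]
            # Child subtree mass divided by parent mass.
            policy[parent][prefix[-1]] = m / mass[parent]
    return dict(policy)


# Hypothetical target distribution over complete trajectories (assumption).
target = {
    ("s0", "a", "x"): 0.5,
    ("s0", "a", "y"): 0.25,
    ("s0", "b", "z"): 0.25,
}
policy = local_policy(target)
print(policy[("s0",)])
```

With stochastic transitions, the per-node values above describe the desired next-state distribution rather than the action probabilities themselves; one would then need action probabilities whose induced mixture over next states matches those values, which may not always exist.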
Added 30 Oct 2010
Updated 30 Oct 2010
Type Conference
Year 2006
Where AAAI
Authors David L. Roberts, Mark J. Nelson, Charles Lee Isbell Jr., Michael Mateas, Michael L. Littman