Optimal Rewards versus Leaf-Evaluation Heuristics in Planning Agents

Planning agents often lack the computational resources needed to build full planning trees for their environments and must instead plan to a limited depth. Agent designers commonly compensate for this finite-horizon approximation by applying an evaluation function at the leaf states of the planning tree. Recent work has proposed an alternative approach to overcoming computational constraints on agent design: modifying the reward function. In this work, we compare this reward-design approach to the common leaf-evaluation heuristic approach for improving planning agents. We show that for many agents, the reward-design approach strictly subsumes the leaf-evaluation approach: for every leaf-evaluation heuristic there exists a reward function that leads to equivalent behavior, but the converse is not true. We demonstrate that this added generality leads to improved performance when an agent makes approximations in addition to the finite-horizon approximation. As part of our contribution, we extend PGRD, an online reward design algorithm...
Jonathan Sorg, Satinder P. Singh, Richard L. Lewis
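One direction of the subsumption claim (a reward function exists for every leaf-evaluation heuristic that yields equivalent behavior) can be illustrated with potential-based reward shaping, a standard construction that uses the heuristic as the shaping potential. The sketch below is not taken from the paper; it uses a made-up toy MDP, and the names P, R, h, and plan are illustrative. It shows that depth-limited planning with heuristic h at the leaves selects the same greedy actions as planning with the shaped reward R'(s,a,s') = R(s,a,s') + gamma*h(s') - h(s) and zero leaf values.

```python
import numpy as np

# Toy tabular MDP; all quantities are randomly generated for illustration only.
rng = np.random.default_rng(0)
n_states, n_actions, gamma, depth = 5, 2, 0.95, 3

P = rng.dirichlet(np.ones(n_states), size=(n_states, n_actions))  # P[s, a, s']
R = rng.normal(size=(n_states, n_actions, n_states))               # R[s, a, s']
h = rng.normal(size=n_states)                                      # leaf-evaluation heuristic h(s)

def plan(P, R, leaf_value, depth):
    """Depth-limited expectimax: returns Q_depth[s, a], using V_0 = leaf_value."""
    V = leaf_value.copy()
    for _ in range(depth):
        Q = np.einsum('ijk,ijk->ij', P, R + gamma * V[None, None, :])
        V = Q.max(axis=1)
    return Q

# (a) Leaf-evaluation approach: original reward R, heuristic h applied at the leaves.
Q_leaf = plan(P, R, leaf_value=h, depth=depth)

# (b) Reward-design approach: potential-based shaped reward with potential h,
#     and zero value at the leaves.
R_shaped = R + gamma * h[None, None, :] - h[:, None, None]
Q_shaped = plan(P, R_shaped, leaf_value=np.zeros(n_states), depth=depth)

# The two Q functions differ only by the state-dependent constant h(s),
# so the greedy action in every state is identical.
assert np.allclose(Q_shaped, Q_leaf - h[:, None])
assert (Q_shaped.argmax(axis=1) == Q_leaf.argmax(axis=1)).all()
print("Greedy actions agree:", Q_leaf.argmax(axis=1))
```

The converse direction of the paper's claim, that reward design can express behaviors no leaf-evaluation heuristic can, is not captured by this construction; the sketch only demonstrates the containment stated in the abstract.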
Added: 12 Dec 2011
Updated: 12 Dec 2011
Type: Conference
Year: 2011
Where: AAAI
Authors: Jonathan Sorg, Satinder P. Singh, Richard L. Lewis