

Heuristic Search for Generalized Stochastic Shortest Path MDPs

Research in efficient methods for solving infinite-horizon MDPs has so far concentrated primarily on discounted MDPs and the more general stochastic shortest path problems (SSPs). These are MDPs with 1) an optimal value function V* that is the unique solution of the Bellman equation and 2) optimal policies that are the greedy policies w.r.t. V*. This paper’s main contribution is the description of a new class of MDPs that have well-defined optimal solutions that do not comply with either 1 or 2 above. We call our new class Generalized Stochastic Shortest Path (GSSP) problems. GSSP allows a more general reward structure than SSP and subsumes several established MDP types, including SSP, positive-bounded, negative, and discounted-reward models. While existing efficient heuristic search algorithms like LAO* and LRTDP are not guaranteed to converge to the optimal value function for GSSPs, we present a new heuristic-search-based family of algorithms, FRET (Find, Revise, Eliminate ...
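Properties 1 and 2 above can be illustrated on an ordinary SSP with a minimal Python sketch. Everything in it is an assumption made for illustration: the three-state MDP, its action names, and the cost-minimization convention (the abstract speaks of rewards) are hypothetical and not taken from the paper. The sketch runs plain value iteration on the Bellman equation and then reads off the greedy policy; it is not FRET and does not handle the GSSP cases where these two properties fail.

```python
# Hypothetical 3-state SSP: states 0 and 1, plus an absorbing goal state 2.
# transitions[s][a] is a list of (probability, next_state, cost) outcomes.
transitions = {
    0: {
        "safe": [(1.0, 1, 2.0)],
        "risky": [(0.5, 2, 1.0), (0.5, 0, 1.0)],
    },
    1: {"go": [(1.0, 2, 1.0)]},
}
GOAL = 2


def q_value(V, s, a):
    """Expected cost of taking action a in state s, then following V."""
    return sum(p * (c + V[s2]) for p, s2, c in transitions[s][a])


def value_iteration(tol=1e-9, max_iters=10_000):
    """Apply Bellman backups in place until the largest change is below tol."""
    V = {0: 0.0, 1: 0.0, GOAL: 0.0}  # the goal's value stays fixed at 0
    for _ in range(max_iters):
        delta = 0.0
        for s in transitions:
            best = min(q_value(V, s, a) for a in transitions[s])
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < tol:
            break
    return V


def greedy_policy(V):
    """Pick, in every non-goal state, the action with the lowest Q-value."""
    return {
        s: min(transitions[s], key=lambda a: q_value(V, s, a))
        for s in transitions
    }


V = value_iteration()
policy = greedy_policy(V)
```

In this toy problem the fixed point of the Bellman equation is unique and the greedy policy with respect to it is optimal, which is exactly what the abstract says can break down for GSSPs.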
Added 24 Aug 2011
Updated 24 Aug 2011
Type Journal
Year 2011
Where AIPS
Authors Andrey Kolobov, Mausam, Daniel S. Weld, Hector Geffner