Heuristic functions make MDP solvers practical by reducing their time and memory requirements. Some of the most effective heuristics (e.g., the FF heuristic function) first determinize the MDP and then solve a relaxation of the resulting classical planning problem (e.g., by ignoring delete effects). While these heuristic functions are fast to compute, they frequently yield overly optimistic value estimates. It is natural to wonder, then, whether the improved estimates obtained by running a full classical planner on the (non-relaxed) determinized domain provide enough gains to compensate for the vastly increased cost of computation. This paper shows that the answer is "No and Yes". If one uses a full classical planner in the obvious way, the cost of computing the heuristic function outweighs the benefits. However, we show that one can make the idea practical by generalizing the results of classical planning successes and failures. Specifically, we introduce a novel heurist...
Andrey Kolobov, Mausam, Daniel S. Weld