This paper addresses the search control problem of selecting which plan to refine next for decision-theoretic planners, a choice point common to the decision-theoretic planners created to date. Such planners can make use of a utility function to calculate bounds on the expected utility of an abstract plan. Three strategies for using these bounds to select the next plan to refine have been proposed in the literature. We examine the rationale for each strategy and prove that the optimistic strategy of always selecting a plan with the highest upper bound on expected utility expands the fewest number of plans, when looking for all plans with the highest expected utility. When looking for a single plan with the highest expected utility, we prove that the optimistic strategy has the best possible worst case performance and that other strategies can fail to terminate. To demonstrate the effect of plan selection strategies on performance, we give results using the DRIPS planner that show that the optimistic strategy can prod...
Richard Goodwin, Reid G. Simmons
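
As a minimal illustration of the optimistic strategy described in the abstract, the sketch below always refines the plan with the highest upper bound on expected utility and stops when that plan is fully concrete. The `Plan` class, the `optimistic_search` function, the plan names, and all utility bounds are hypothetical and chosen only for illustration; the sketch also assumes that a concrete plan has an exact expected utility (lower bound equal to upper bound). It is not the planner used in the paper.

```python
import heapq
from dataclasses import dataclass, field


@dataclass
class Plan:
    """Hypothetical plan node with bounds on expected utility."""
    name: str
    lo: float  # lower bound on expected utility
    hi: float  # upper bound on expected utility
    children: list = field(default_factory=list)  # empty list => concrete plan


def optimistic_search(root):
    """Always refine the plan with the highest upper bound.

    Returns the first concrete plan selected, plus the names of the
    abstract plans that were expanded along the way.
    """
    counter = 0  # tie-breaker so the heap never compares Plan objects
    frontier = [(-root.hi, counter, root)]  # max-heap via negated upper bound
    expanded = []
    while frontier:
        _, _, plan = heapq.heappop(frontier)
        if not plan.children:
            # Assumption: a concrete plan has lo == hi, so no remaining plan
            # (each has an upper bound no higher than this one) can beat it.
            return plan, expanded
        expanded.append(plan.name)
        for child in plan.children:
            counter += 1
            heapq.heappush(frontier, (-child.hi, counter, child))
    return None, expanded


# Illustrative abstraction hierarchy (all numbers are made up).
a = Plan("A", 2.0, 8.0, [Plan("A1", 7.0, 7.0), Plan("A2", 3.0, 3.0)])
b = Plan("B", 5.0, 6.0, [Plan("B1", 5.5, 5.5), Plan("B2", 6.0, 6.0)])
root = Plan("root", 0.0, 10.0, [a, b])

best, expanded = optimistic_search(root)
print(best.name, expanded)
```

In this toy hierarchy the abstract plan `B` is never expanded: its upper bound (6.0) is below the exact utility of the concrete plan `A1` (7.0), so selecting by upper bound alone reaches `A1` first.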