AAAI
2015

Better Be Lucky than Good: Exceeding Expectations in MDP Evaluation

We introduce the MDP-Evaluation Stopping Problem, the optimization problem faced by participants of the International Probabilistic Planning Competition 2014 who focus on their own performance. It can be modeled as a meta-MDP in which each action corresponds to the application of a policy on a base-MDP, but solving this meta-MDP is intractable in practice. Our theoretical analysis reveals tractable special cases in which the problem reduces to an optimal stopping problem. We derive approximate strategies of high quality by relaxing the general problem to an optimal stopping problem, and show both theoretically and experimentally that it not only pays off to pursue luck in the execution of the optimal policy, but that there are even cases where it is better to be lucky than good, because the execution of a suboptimal base policy is part of an optimal strategy in the meta-MDP.
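The "lucky versus good" effect can be illustrated with a toy optimal stopping calculation. The sketch below is not the paper's algorithm: it assumes a much simpler setting in which each run of a policy yields an independent reward from a known discrete distribution, only the final run counts, and a run can be discarded as long as runs remain. The function name `expected_values` and the two reward distributions `good` and `lucky` are hypothetical and chosen only for illustration.

```python
def expected_values(outcomes, max_runs):
    """Expected final reward under optimal stopping, for 1..max_runs allowed runs.

    outcomes: list of (reward, probability) pairs for a single run
              (assumed i.i.d. across runs; a simplifying assumption).
    Returns [V_1, ..., V_max_runs], where V_t is the expected reward when at
    most t runs may be performed and only the last performed run counts.
    """
    values = []
    continuation = None  # value of discarding the current run and continuing
    for _ in range(max_runs):
        if continuation is None:
            # Last allowed run: its result must be accepted.
            value = sum(p * r for r, p in outcomes)
        else:
            # Keep the run if it beats the value of trying again.
            value = sum(p * max(r, continuation) for r, p in outcomes)
        values.append(value)
        continuation = value
    return values


# Hypothetical toy policies (illustrative numbers, not from the paper):
good = [(6.0, 1.0)]                 # always earns 6, expected reward 6
lucky = [(0.0, 0.5), (10.0, 0.5)]   # earns 0 or 10, expected reward 5

good_values = expected_values(good, 3)
lucky_values = expected_values(lucky, 3)
print(good_values)
print(lucky_values)
```

Under these assumptions, the policy with the lower single-run expectation becomes the better choice once more than one run is allowed, because an unlucky run can be discarded. The threshold rule is to stop as soon as a run's reward is at least the value of continuing.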
Thomas Keller, Florian Geißer
Added 27 Mar 2016
Updated 27 Mar 2016
Type Journal
Year 2015
Where AAAI