Better Be Lucky than Good: Exceeding Expectations in MDP Evaluation

Download gki.informatik.uni-freiburg.de

We introduce the MDP-Evaluation Stopping Problem, the optimization problem faced by participants of the International Probabilistic Planning Competition 2014 that focus on their own performance. It can be constructed as a meta-MDP where actions correspond to the application of a policy on a base-MDP, which is intractable in practice. Our theoretical analysis reveals that there are tractable special cases where the problem can be reduced to an optimal stopping problem. We derive approximate strategies of high quality by relaxing the general problem to an optimal stopping problem, and show both theoretically and experimentally that it not only pays off to pursue luck in the execution of the optimal policy, but that there are even cases where it is better to be lucky than good as the execution of a suboptimal base policy is part of an optimal strategy in the meta-MDP.
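To make the notion of an optimal stopping problem concrete, here is a minimal generic sketch. It is not the paper's model or the competition's rules: it assumes independent draws from a known reward distribution, where each observed reward must be either accepted (stopping) or discarded for a fresh draw. The function name `optimal_stopping_values` and the fair-die distribution are hypothetical, chosen only for illustration.

```python
from fractions import Fraction


def optimal_stopping_values(outcomes, max_draws):
    """Backward induction for a finite-horizon optimal stopping problem.

    outcomes: list of (reward, probability) pairs for one i.i.d. draw.
    Returns a list v where v[k] is the expected reward of an optimal
    stop-or-continue rule when k draws remain (v[0] is unused, set to None).
    """
    if max_draws < 1:
        raise ValueError("need at least one draw")
    # With a single draw left, the result must be accepted.
    values = [None, sum(r * p for r, p in outcomes)]
    for _ in range(2, max_draws + 1):
        keep_going = values[-1]  # expected reward of rejecting the current draw
        # Stop whenever the observed reward beats the continuation value.
        values.append(sum(max(r, keep_going) * p for r, p in outcomes))
    return values


# Hypothetical reward distribution (an assumption): a fair six-sided die.
die = [(Fraction(face), Fraction(1, 6)) for face in range(1, 7)]
values = optimal_stopping_values(die, 3)
for k in range(1, 4):
    print(k, values[k])
```

In this toy setting the expected reward rises from 7/2 with one draw to 17/4 with two and 14/3 with three, so the option to discard an unlucky result and try again has positive value. That is the sense in which a stopping rule can exceed the plain expectation of a single execution.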

Thomas Keller, Florian Geißer

AAAI 2015 | Intelligent Agents

Post Info

Added	27 Mar 2016
Updated	27 Mar 2016
Type	Conference
Year	2015
Where	AAAI
Authors	Thomas Keller, Florian Geißer
