Parametric regret in uncertain Markov decision processes


Download www.cim.mcgill.ca

We consider decision making in a Markovian setup where the reward parameters are not known in advance. Our performance criterion is the gap between the performance of the best strategy that is chosen after the true parameter realization is revealed and the performance of the strategy that is chosen before the parameter realization is revealed. We call this gap the parametric regret. We consider two related problems: minimax regret and mean-variance tradeoff of the regret. The minimax regret strategy minimizes the worst-case regret under the most adversarial possible realization. We show that the problem of computing the minimax regret strategy is NP-hard and propose algorithms to efficiently solve it under favorable conditions. The mean-variance tradeoff formulation requires a probabilistic model of the uncertain parameters and looks for a strategy that minimizes a convex combination of the mean and the variance of the regret. We prove that computing such a strategy can be done nu...
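To make the definition of parametric regret concrete, here is a minimal brute-force sketch in Python. It is not the authors' algorithm: it assumes a toy two-state discounted MDP with known transitions, a finite set of two candidate reward realizations, and a search restricted to deterministic stationary policies (randomized strategies may achieve lower worst-case regret). All names and numbers (`P`, `REWARDS`, `GAMMA`, `policy_value`, `minimax_regret`) are hypothetical and chosen only for illustration.

```python
from itertools import product

GAMMA = 0.9                 # discount factor (assumed for this toy example)
N_STATES, N_ACTIONS = 2, 2

# Known transitions: P[s][a][t] = probability of moving from s to t under a.
# Here action 0 always leads to state 0 and action 1 always leads to state 1.
P = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[1.0, 0.0], [0.0, 1.0]],
]

# Uncertain rewards: each entry is one possible realization R[s][a].
REWARDS = [
    [[1.0, 0.0], [0.0, 0.0]],  # realization A: staying in state 0 pays 1
    [[0.0, 0.0], [0.0, 1.0]],  # realization B: staying in state 1 pays 1
]


def policy_value(policy, R, start=0, iters=500):
    """Discounted value of a deterministic stationary policy from `start`,
    computed by iterating the policy's Bellman equation."""
    v = [0.0] * N_STATES
    for _ in range(iters):
        v = [
            R[s][policy[s]]
            + GAMMA * sum(P[s][policy[s]][t] * v[t] for t in range(N_STATES))
            for s in range(N_STATES)
        ]
    return v[start]


def minimax_regret():
    """Enumerate deterministic policies and return the one whose worst-case
    regret over the reward realizations is smallest."""
    policies = list(product(range(N_ACTIONS), repeat=N_STATES))
    values = {pi: [policy_value(pi, R) for R in REWARDS] for pi in policies}
    # Best achievable value once the realization is revealed (hindsight).
    best = [max(values[pi][k] for pi in policies) for k in range(len(REWARDS))]
    # Regret of pi under realization k is best[k] - values[pi][k].
    worst = {
        pi: max(best[k] - values[pi][k] for k in range(len(REWARDS)))
        for pi in policies
    }
    best_pi = min(worst, key=worst.get)
    return best_pi, worst[best_pi], worst


if __name__ == "__main__":
    pi, regret, table = minimax_regret()
    print("minimax-regret policy:", pi, "worst-case regret:", regret)
```

In this toy instance the hindsight-optimal values are about 10 under realization A and about 9 under realization B, so a policy that stays in state 0 has a worst-case regret of roughly 9, while one that moves to state 1 risks a regret of roughly 10. Enumeration grows exponentially with the number of states, which is consistent with the hardness result stated in the abstract but does not illustrate the paper's own algorithms.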

Huan Xu, Shie Mannor


CDC 2009 | Control Systems | Minimax Regret | Minimax Regret Strategy | Parameter Realization |


Post Info

Added	21 Jul 2010
Updated	21 Jul 2010
Type	Conference
Year	2009
Where	CDC
Authors	Huan Xu, Shie Mannor
