STACS
2007
Springer

Pure Stationary Optimal Strategies in Markov Decision Processes

Markov decision processes (MDPs) are controllable discrete event systems with stochastic transitions. The performance of an MDP is evaluated by a payoff function. The controller of the MDP seeks to optimize that performance using optimal strategies. There are various ways of measuring performance, that is, various classes of payoff functions. For example, average performance can be evaluated by a mean-payoff function, peak performance by a limsup payoff function, and the parity payoff function can be used to encode logical specifications. Surprisingly, all MDPs equipped with mean, limsup or parity payoff functions share a common non-trivial property: they admit pure stationary optimal strategies. In this paper, we introduce the class of prefix-independent and submixing payoff functions, and we prove that any MDP equipped with such a payoff function admits pure stationary optimal strategies. This result unifies and simplifies several existing proofs. Moreover, it is a...
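The payoff functions named in the abstract can be written out as follows. This is a sketch using common textbook conventions, not notation taken from the paper: $r_0 r_1 \cdots$ is an assumed infinite sequence of real-valued rewards, $c_0 c_1 \cdots$ an assumed infinite sequence of integer priorities, and the even/odd convention chosen for parity is only one of several in use. The last two lines state prefix-independence and the submixing property as they are usually formulated (with $p, u_i, v_i$ finite words and $u$ an infinite word); the paper's exact definitions may differ in detail.

```latex
\begin{align*}
% mean payoff: long-run average reward
\phi_{\mathrm{mean}}(r_0 r_1 \cdots) &= \limsup_{n \to \infty} \frac{1}{n+1} \sum_{i=0}^{n} r_i \\
% limsup payoff: peak reward seen infinitely often
\phi_{\mathrm{limsup}}(r_0 r_1 \cdots) &= \limsup_{n \to \infty} r_n \\
% parity payoff (assumed convention: even limsup priority wins)
\phi_{\mathrm{parity}}(c_0 c_1 \cdots) &=
  \begin{cases}
    1 & \text{if } \limsup_{n \to \infty} c_n \text{ is even} \\
    0 & \text{otherwise}
  \end{cases} \\
% prefix-independence: a finite prefix does not change the payoff
\phi(p\,u) &= \phi(u) \\
% submixing: an interleaving is no better than the better of its two parts
\phi(u_0 v_0 u_1 v_1 \cdots) &\le \max\bigl\{ \phi(u_0 u_1 \cdots),\ \phi(v_0 v_1 \cdots) \bigr\}
\end{align*}
```

Under these formulations, each of the three example functions is unchanged by adding or removing a finite prefix, which is the intuition behind prefix-independence.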
Hugo Gimbert
Added: 09 Jun 2010
Updated: 09 Jun 2010
Type: Conference
Year: 2007
Where: STACS
Authors: Hugo Gimbert