Sciweavers

CORR
2011
Springer

Mean-Variance Optimization in Markov Decision Processes

We consider finite-horizon Markov decision processes under performance measures that involve both the mean and the variance of the cumulative reward. We show that either randomized or history-based policies can improve performance. We prove that computing a policy that maximizes the mean reward under a variance constraint is NP-hard in some cases and strongly NP-hard in others. Finally, we offer pseudopolynomial exact and approximation algorithms.
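
The claim that randomization can help under a variance constraint can be illustrated with a small sketch. The example below is not taken from the paper: it assumes a hypothetical one-step problem with a "safe" action (reward 0) and a "risky" action (reward 0 or 2 with equal probability), and a variance cap of 0.5 chosen purely for illustration. The names `mixed_policy`, `best_feasible`, and `VARIANCE_CAP` are likewise made up for this sketch.

```python
def mean_and_variance(outcomes):
    """Mean and variance of a reward distribution.

    outcomes: list of (probability, cumulative_reward) pairs.
    """
    mean = sum(p * w for p, w in outcomes)
    second_moment = sum(p * w * w for p, w in outcomes)
    return mean, second_moment - mean ** 2


def mixed_policy(p):
    """Reward distribution when the risky action is chosen with probability p.

    Hypothetical toy problem: the safe action always pays 0, the risky
    action pays 0 or 2 with probability 1/2 each.
    """
    return [(1 - p, 0.0), (p / 2, 0.0), (p / 2, 2.0)]


VARIANCE_CAP = 0.5  # assumed constraint level, for illustration only


def best_feasible(probabilities):
    """Return (p, mean, variance) with the highest mean among the candidate
    mixing probabilities whose variance respects the cap, or None."""
    best = None
    for p in probabilities:
        mean, var = mean_and_variance(mixed_policy(p))
        if var <= VARIANCE_CAP and (best is None or mean > best[1]):
            best = (p, mean, var)
    return best


# Deterministic policies: always safe (p = 0) or always risky (p = 1).
deterministic = best_feasible([0.0, 1.0])

# Randomized policies: search a grid of mixing probabilities.
randomized = best_feasible([i / 1000 for i in range(1001)])

print("best deterministic (p, mean, var):", deterministic)
print("best randomized    (p, mean, var):", randomized)
```

In this toy setting the always-risky policy violates the cap, so the only feasible deterministic policy earns mean reward 0, while mixing in the risky action with a small probability stays within the cap and earns a strictly positive mean.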
Shie Mannor, John N. Tsitsiklis
Added 13 May 2011
Updated 13 May 2011
Type Journal
Year 2011
Where CORR
Authors Shie Mannor, John N. Tsitsiklis