
ECML
2006
Springer

Reinforcement Learning for MDPs with Constraints

In this article, I consider Markov Decision Processes with two criteria, each defined as the expected value of an infinite-horizon cumulative return. The second criterion is either itself subject to an inequality constraint, or there is a maximum allowable probability that the individual returns violate the constraint. I describe and discuss three new reinforcement learning approaches for solving such control problems.
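The two problem variants described in the abstract can be written out roughly as follows. This is only a sketch of the general setting, not the paper's own notation: the symbols $\pi$, $r_t$, $c_t$, $\alpha$, $\omega$, the discount factor $\gamma$, and the direction of the inequality are all assumptions introduced here for illustration.

```latex
% Assumed notation: policy \pi, per-step rewards r_t (first criterion),
% per-step costs c_t (second criterion), assumed discount factor \gamma \in (0,1).
\begin{align*}
  V^{\pi} &= \mathbb{E}_{\pi}\!\left[\sum_{t=0}^{\infty} \gamma^{t} r_t\right],
  &
  C^{\pi} &= \mathbb{E}_{\pi}\!\left[\sum_{t=0}^{\infty} \gamma^{t} c_t\right].
\end{align*}

% Variant 1: the second criterion itself is constrained (threshold \alpha assumed).
\begin{equation*}
  \max_{\pi} \; V^{\pi}
  \quad \text{subject to} \quad
  C^{\pi} \le \alpha .
\end{equation*}

% Variant 2: the probability that an individual return violates the
% constraint is bounded by an assumed tolerance \omega \in [0,1].
\begin{equation*}
  \max_{\pi} \; V^{\pi}
  \quad \text{subject to} \quad
  \Pr_{\pi}\!\left(\sum_{t=0}^{\infty} \gamma^{t} c_t > \alpha\right) \le \omega .
\end{equation*}
```

In the first variant only the mean of the second return matters, whereas the second variant restricts the tail of its distribution, so a policy can satisfy one constraint while violating the other.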
Peter Geibel
Added 13 Oct 2010
Updated 13 Oct 2010
Type Conference
Year 2006
Where ECML
Authors Peter Geibel