Symmetry in Markov Decision Processes and its Implications for Single Agent and Multiagent Learning

This paper examines the notion of symmetry in Markov decision processes (MDPs). We define symmetry for an MDP and show how it can be exploited for more effective learning in single agent systems as well as multiagent systems and multirobot systems. We prove that if an MDP possesses a symmetry, then the optimal value function and Q function are similarly symmetric and there exists a symmetric optimal policy. If an MDP is known to possess a symmetry, this knowledge can be applied to decrease the number of training examples needed for algorithms like Q learning and value iteration. It can also be used to directly restrict the hypothesis space.
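The sample-saving idea for Q learning can be illustrated with a minimal sketch. The abstract does not give the paper's formal definition, so the code assumes a symmetry is a pair of maps, one on states and one on actions, that preserves transitions and rewards. The corridor environment and the names `mirror_state`, `mirror_action`, and `train` are hypothetical and chosen only for illustration; this is not the authors' implementation.

```python
import random
from collections import defaultdict

# Assumed toy MDP (not from the paper): a corridor of N cells with the goal
# in the middle. Reflecting the corridor and swapping left/right leaves the
# dynamics and rewards unchanged, so the reflection is a symmetry.
N = 7
GOAL = N // 2
ACTIONS = (-1, +1)  # step left, step right

def step(state, action):
    """Deterministic dynamics; reward 1 when the goal cell is reached."""
    nxt = min(max(state + action, 0), N - 1)
    return nxt, (1.0 if nxt == GOAL else 0.0), nxt == GOAL

def mirror_state(state):
    """Hypothetical state map of the symmetry: reflect about the center."""
    return N - 1 - state

def mirror_action(action):
    """Hypothetical action map of the symmetry: swap left and right."""
    return -action

def q_update(Q, s, a, r, s2, done, alpha=0.5, gamma=0.9):
    """Standard one-step Q-learning update."""
    target = r if done else r + gamma * max(Q[(s2, b)] for b in ACTIONS)
    Q[(s, a)] += alpha * (target - Q[(s, a)])

def train(episodes, use_symmetry, seed=0):
    rng = random.Random(seed)
    Q = defaultdict(float)
    for _ in range(episodes):
        s = rng.choice([c for c in range(N) if c != GOAL])
        done = False
        while not done:
            a = rng.choice(ACTIONS)  # purely exploratory behaviour policy
            s2, r, done = step(s, a)
            q_update(Q, s, a, r, s2, done)
            if use_symmetry:
                # Reuse the same experience for the mirrored transition.
                q_update(Q, mirror_state(s), mirror_action(a),
                         r, mirror_state(s2), done)
            s = s2
    return Q
```

Calling `train(50, use_symmetry=True)` updates two table entries per real transition, so each observed step does the work of two, and the learned Q table is itself symmetric under the assumed maps, in line with the abstract's claim that the optimal Q function shares the symmetry of the MDP.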

Martin Zinkevich, Tucker R. Balch

ICML 2001 | Machine Learning | Optimal Value Function | Single Agent Systems | Symmetric Optimal Policy

Post Info

Added	17 Nov 2009
Updated	17 Nov 2009
Type	Conference
Year	2001
Where	ICML
Authors	Martin Zinkevich, Tucker R. Balch
