Sciweavers

102 search results - page 6 / 21
» MDPs with Non-Deterministic Policies
ATAL
2007
Springer
14 years 1 month ago
Commitment-driven distributed joint policy search
Decentralized MDPs provide powerful models of interactions in multi-agent environments, but are often very difficult or even computationally infeasible to solve optimally. Here we...
Stefan J. Witwicki, Edmund H. Durfee
EXACT
2008
13 years 9 months ago
Explaining recommendations generated by MDPs
There has been little work in explaining recommendations generated by Markov Decision Processes (MDPs). We analyze the difficulty of explaining policies computed automatically and id...
Omar Zia Khan, Pascal Poupart, James P. Black
ICMLA
2009
13 years 5 months ago
Automatic Feature Selection for Model-Based Reinforcement Learning in Factored MDPs
Feature selection is an important challenge in machine learning. Unfortunately, most methods for automating feature selection are designed for supervised learning tasks a...
Mark Kroon, Shimon Whiteson
ICML
2008
IEEE
14 years 8 months ago
Apprenticeship learning using linear programming
In apprenticeship learning, the goal is to learn a policy in a Markov decision process that is at least as good as a policy demonstrated by an expert. The difficulty arises in tha...
Umar Syed, Michael H. Bowling, Robert E. Schapire
NIPS
2007
13 years 9 months ago
Optimistic Linear Programming gives Logarithmic Regret for Irreducible MDPs
We present an algorithm called Optimistic Linear Programming (OLP) for learning to optimize average reward in an irreducible but otherwise unknown Markov decision process (MDP). O...
Ambuj Tewari, Peter L. Bartlett