Decentralized MDPs provide powerful models of interactions in multi-agent environments, but are often very difficult or even computationally infeasible to solve optimally. Here we...
There has been little work in explaining recommendations generated by Markov Decision Processes (MDPs). We analyze the difculty of explaining policies computed automatically and id...
Abstract--Feature selection is an important challenge in machine learning. Unfortunately, most methods for automating feature selection are designed for supervised learning tasks a...
In apprenticeship learning, the goal is to learn a policy in a Markov decision process that is at least as good as a policy demonstrated by an expert. The difficulty arises in tha...
We present an algorithm called Optimistic Linear Programming (OLP) for learning to optimize average reward in an irreducible but otherwise unknown Markov decision process (MDP). O...