Sciweavers

60 search results - page 8 / 12
» A Simulation-based Approach for Solving Generalized Semi-Mar...
ICML
2008
IEEE
14 years 8 months ago
Apprenticeship learning using linear programming
In apprenticeship learning, the goal is to learn a policy in a Markov decision process that is at least as good as a policy demonstrated by an expert. The difficulty arises in tha...
Umar Syed, Michael H. Bowling, Robert E. Schapire
JAIR
2006
13 years 7 months ago
Resource Allocation Among Agents with MDP-Induced Preferences
Allocating scarce resources among agents to maximize global utility is, in general, computationally challenging. We focus on problems where resources enable agents to execute acti...
Dmitri A. Dolgov, Edmund H. Durfee
ECML
2007
Springer
14 years 1 month ago
Policy Gradient Critics
We present Policy Gradient Actor-Critic (PGAC), a new model-free Reinforcement Learning (RL) method for creating limited-memory stochastic policies for Partially Observable Markov ...
Daan Wierstra, Jürgen Schmidhuber
3DIM
2007
IEEE
14 years 1 month ago
A Bayesian Framework for Simultaneous Matting and 3D Reconstruction
Conventional approaches to 3D scene reconstruction often treat matting and reconstruction as two separate problems, with matting a prerequisite to reconstruction. The problem with...
Jean-Yves Guillemaut, Adrian Hilton, Jonathan Star...
ATAL
2007
Springer
14 years 1 month ago
A globally optimal algorithm for TTD-MDPs
In this paper, we discuss the use of Targeted Trajectory Distribution Markov Decision Processes (TTD-MDPs)—a variant of MDPs in which the goal is to realize a specified distrib...
Sooraj Bhat, David L. Roberts, Mark J. Nelson, Cha...