60 search results - page 8 / 12

» A Simulation-based Approach for Solving Generalized Semi-Mar...

ICML
2008
IEEE

147 views, Machine Learning

Apprenticeship learning using linear programming

16 years 7 months ago

Download www.cs.ualberta.ca

In apprenticeship learning, the goal is to learn a policy in a Markov decision process that is at least as good as a policy demonstrated by an expert. The difficulty arises in tha...

Umar Syed, Michael H. Bowling, Robert E. Schapire

JAIR
2006

101 views

Resource Allocation Among Agents with MDP-Induced Preferences

15 years 7 months ago

Download www.jair.org

Allocating scarce resources among agents to maximize global utility is, in general, computationally challenging. We focus on problems where resources enable agents to execute acti...

Dmitri A. Dolgov, Edmund H. Durfee

ECML
2007
Springer

192 views, Machine Learning

Policy Gradient Critics

16 years 1 month ago

Download www.idsia.ch

We present Policy Gradient Actor-Critic (PGAC), a new model-free Reinforcement Learning (RL) method for creating limited-memory stochastic policies for Partially Observable Markov ...

Daan Wierstra, Jürgen Schmidhuber

3DIM
2007
IEEE

135 views, Image Processing

A Bayesian Framework for Simultaneous Matting and 3D Reconstruction

16 years 1 month ago

Download www.ee.surrey.ac.uk

Conventional approaches to 3D scene reconstruction often treat matting and reconstruction as two separate problems, with matting a prerequisite to reconstruction. The problem with...

Jean-Yves Guillemaut, Adrian Hilton, Jonathan Star...

ATAL
2007
Springer

112 views, Intelligent Agents

A globally optimal algorithm for TTD-MDPs

16 years 1 month ago

Download www.cc.gatech.edu

In this paper, we discuss the use of Targeted Trajectory Distribution Markov Decision Processes (TTD-MDPs), a variant of MDPs in which the goal is to realize a specified distrib...

Sooraj Bhat, David L. Roberts, Mark J. Nelson, Cha...
