Search Sciweavers | Sciweavers

47

ILP
2007
Springer

283views Automated Reasoning» more ILP 2007»

Building Relational World Models for Reinforcement Learning

14 years 1 months ago

Abstract. Many reinforcement learning domains are highly relational. While traditional temporal-difference methods can be applied to these domains, they are limited in their capaci...

Trevor Walker, Lisa Torrey, Jude W. Shavlik, Richa...

claim paper

Read More »

28

click to vote

ICML
1994
IEEE

151views Machine Learning» more ICML 1994»

Learning Without State-Estimation in Partially Observable Markovian Decision Processes

13 years 11 months ago

Download www.eecs.umich.edu

Reinforcement learning (RL) algorithms provide a sound theoretical basis for building learning control architectures for embedded agents. Unfortunately all of the theory and much ...

Satinder P. Singh, Tommi Jaakkola, Michael I. Jord...

claim paper

Read More »

23

click to vote

ATAL
2008
Springer

138views Intelligent Agents» more ATAL 2008»

Reinforcement learning for DEC-MDPs with changing action sets and partially ordered dependencies

13 years 9 months ago

Download ml.informatik.uni-freiburg.de

Decentralized Markov decision processes are frequently used to model cooperative multi-agent systems. In this paper, we identify a subclass of general DEC-MDPs that features regul...

Thomas Gabel, Martin A. Riedmiller

claim paper

Read More »

25

click to vote

ATAL
2005
Springer

101views Intelligent Agents» more ATAL 2005»

Automated resource-driven mission phasing techniques for constrained agents

13 years 9 months ago

Download www.cs.huji.ac.il

A constrained agent is limited in the actions that it can take at any given time, and a challenging problem is to design policies for such agents to do the best they can despite t...

Jianhui Wu, Edmund H. Durfee

claim paper

Read More »

26

click to vote

ATAL
2007
Springer

142views Intelligent Agents» more ATAL 2007»

Q-value functions for decentralized POMDPs

14 years 1 months ago

Download www.science.uva.nl

Planning in single-agent models like MDPs and POMDPs can be carried out by resorting to Q-value functions: a (near-) optimal Q-value function is computed in a recursive manner by ...

Frans A. Oliehoek, Nikos A. Vlassis

claim paper

Read More »

Sciweavers

Explore & Download

Productivity Tools

Document Tools

Image Tools

Sciweavers