Search Sciweavers | Sciweavers

102 search results - page 7 / 21

» MDPs with Non-Deterministic Policies

NIPS
2003

Envelope-based Planning in Relational MDPs

15 years 7 months ago

Download books.nips.cc

A mobile robot acting in the world is faced with a large amount of sensory data and uncertainty in its action outcomes. Indeed, almost all interesting sequential decision-making d...

Natalia Hernandez-Gardiol, Leslie Pack Kaelbling

JMLR
2006

Causal Graph Based Decomposition of Factored MDPs

15 years 5 months ago

Download www-anw.cs.umass.edu

We present Variable Influence Structure Analysis, or VISA, an algorithm that performs hierarchical decomposition of factored Markov decision processes. VISA uses a dynamic Bayesia...

Anders Jonsson, Andrew G. Barto

ORL
2007

Linear dependence of stationary distributions in ergodic Markov decision processes

15 years 5 months ago

Download personal.unileoben.ac.at

In ergodic MDPs we consider stationary distributions of policies that coincide in all but n states, in which one of two possible actions is chosen. We give conditions and formulas...

Ronald Ortner

ICML
2003
IEEE

The Influence of Reward on the Speed of Reinforcement Learning: An Analysis of Shaping

15 years 11 months ago

Download www.hpl.hp.com

Shaping can be an effective method for improving the learning rate in reinforcement systems. Previously, shaping has been heuristically motivated and implemented. We provide a for...

Adam Laud, Gerald DeJong

ATAL
2009
Springer

Online exploration in least-squares policy iteration

16 years 14 days ago

Download www.aamas-conference.org

One of the key problems in reinforcement learning is balancing exploration and exploitation. Another is learning and acting in large or even continuous Markov decision processes (...

Lihong Li, Michael L. Littman, Christopher R. Mans...
