Sciweavers

» MDPs with Non-Deterministic Policies
NIPS
2003
13 years 8 months ago
Envelope-based Planning in Relational MDPs
A mobile robot acting in the world is faced with a large amount of sensory data and uncertainty in its action outcomes. Indeed, almost all interesting sequential decision-making d...
Natalia Hernandez-Gardiol, Leslie Pack Kaelbling
JMLR
2006
13 years 7 months ago
Causal Graph Based Decomposition of Factored MDPs
We present Variable Influence Structure Analysis, or VISA, an algorithm that performs hierarchical decomposition of factored Markov decision processes. VISA uses a dynamic Bayesia...
Anders Jonsson, Andrew G. Barto
ORL
2007
13 years 7 months ago
Linear dependence of stationary distributions in ergodic Markov decision processes
In ergodic MDPs we consider stationary distributions of policies that coincide in all but n states, in which one of two possible actions is chosen. We give conditions and formulas...
Ronald Ortner
ICML
2003
IEEE
14 years 22 days ago
The Influence of Reward on the Speed of Reinforcement Learning: An Analysis of Shaping
Shaping can be an effective method for improving the learning rate in reinforcement systems. Previously, shaping has been heuristically motivated and implemented. We provide a for...
Adam Laud, Gerald DeJong
ATAL
2009
Springer
14 years 2 months ago
Online exploration in least-squares policy iteration
One of the key problems in reinforcement learning is balancing exploration and exploitation. Another is learning and acting in large or even continuous Markov decision processes (...
Lihong Li, Michael L. Littman, Christopher R. Mans...