Sciweavers

102 search results - page 14 / 21
» MDPs with Non-Deterministic Policies
ICML
2005
IEEE
14 years 8 months ago
Bounded real-time dynamic programming: RTDP with monotone upper bounds and performance guarantees
MDPs are an attractive formalization for planning, but realistic problems often have intractably large state spaces. When we only need a partial policy to get from a fixed start s...
H. Brendan McMahan, Maxim Likhachev, Geoffrey J. G...
ICML
2001
IEEE
14 years 8 months ago
Symmetry in Markov Decision Processes and its Implications for Single Agent and Multiagent Learning
This paper examines the notion of symmetry in Markov decision processes (MDPs). We define symmetry for an MDP and show how it can be exploited for more effective learning in singl...
Martin Zinkevich, Tucker R. Balch
AAAI
2010
13 years 9 months ago
Trial-Based Dynamic Programming for Multi-Agent Planning
Trial-based approaches offer an efficient way to solve single-agent MDPs and POMDPs. These approaches allow agents to focus their computations on regions of the environment they en...
Feng Wu, Shlomo Zilberstein, Xiaoping Chen
ICML
2005
IEEE
14 years 8 months ago
Exploration and apprenticeship learning in reinforcement learning
We consider reinforcement learning in systems with unknown dynamics. Algorithms such as E3 (Kearns and Singh, 2002) learn near-optimal policies by using "exploration policies...
Pieter Abbeel, Andrew Y. Ng
ICML
2007
IEEE
14 years 8 months ago
Multi-task reinforcement learning: a hierarchical Bayesian approach
We consider the problem of multi-task reinforcement learning, where the agent needs to solve a sequence of Markov Decision Processes (MDPs) chosen randomly from a fixed but unknow...
Aaron Wilson, Alan Fern, Soumya Ray, Prasad Tadepa...