Sciweavers

IJCAI
2001
R-MAX - A General Polynomial Time Algorithm for Near-Optimal Reinforcement Learning
R-max is a very simple model-based reinforcement learning algorithm which can attain near-optimal average reward in polynomial time. In R-max, the agent always maintains a complet...
Ronen I. Brafman, Moshe Tennenholtz