Sciweavers

ICML
2005
IEEE
15 years 1 months ago
A theoretical analysis of Model-Based Interval Estimation
Several algorithms for learning near-optimal policies in Markov Decision Processes have been analyzed and proven efficient. Empirical results have suggested that Model-based Inter...
Alexander L. Strehl, Michael L. Littman
ICML
2005
IEEE
15 years 1 months ago
Exploration and apprenticeship learning in reinforcement learning
We consider reinforcement learning in systems with unknown dynamics. Algorithms such as E3 (Kearns and Singh, 2002) learn near-optimal policies by using "exploration policies...
Pieter Abbeel, Andrew Y. Ng