Sciweavers

165 search results - page 27 / 33
» Exploration and apprenticeship learning in reinforcement lea...
ICML
2010
IEEE
13 years 8 months ago
Toward Off-Policy Learning Control with Function Approximation
We present the first temporal-difference learning algorithm for off-policy control with unrestricted linear function approximation whose per-time-step complexity is linear in the ...
Hamid Reza Maei, Csaba Szepesvári, Shalabh ...
JMLR
2010
13 years 2 months ago
Pinview: Implicit Feedback in Content-Based Image Retrieval
This paper describes Pinview, a content-based image retrieval system that exploits implicit relevance feedback during a search session. Pinview contains several novel methods that...
Peter Auer, Zakria Hussain, Samuel Kaski, Arto Kla...
IJCAI
2007
13 years 9 months ago
Using Linear Programming for Bayesian Exploration in Markov Decision Processes
A key problem in reinforcement learning is finding a good balance between the need to explore the environment and the need to gain rewards by exploiting existing knowledge. Much ...
Pablo Samuel Castro, Doina Precup
EWCBR
2008
Springer
13 years 9 months ago
Forgetting Reinforced Cases
To meet time constraints, a CBR system must control the time spent searching in the case base for a solution. In this paper, we present the results of a case study comparing the p...
Houcine Romdhane, Luc Lamontagne
JAIR
2008
13 years 7 months ago
Learning Partially Observable Deterministic Action Models
We present exact algorithms for identifying deterministic-actions' effects and preconditions in dynamic partially observable domains. They apply when one does not know the ac...
Eyal Amir, Allen Chang