Search Sciweavers | Sciweavers

165 search results - page 27 / 33

» Exploration and apprenticeship learning in reinforcement lea...


ICML
2010
IEEE

231 views · Machine Learning

Toward Off-Policy Learning Control with Function Approximation

13 years 8 months ago

Download www.sztaki.hu

We present the first temporal-difference learning algorithm for off-policy control with unrestricted linear function approximation whose per-time-step complexity is linear in the ...

Hamid Reza Maei, Csaba Szepesvári, Shalabh ...


JMLR
2010

141 views

Pinview: Implicit Feedback in Content-Based Image Retrieval

13 years 2 months ago

Download jmlr.csail.mit.edu

This paper describes Pinview, a content-based image retrieval system that exploits implicit relevance feedback during a search session. Pinview contains several novel methods that...

Peter Auer, Zakria Hussain, Samuel Kaski, Arto Kla...


IJCAI
2007

201 views · Artificial Intelligence

Using Linear Programming for Bayesian Exploration in Markov Decision Processes

13 years 9 months ago

Download www.cs.mcgill.ca

A key problem in reinforcement learning is finding a good balance between the need to explore the environment and the need to gain rewards by exploiting existing knowledge. Much ...

Pablo Samuel Castro, Doina Precup


EWCBR
2008
Springer

206 views · Automated Reasoning

Forgetting Reinforced Cases

13 years 9 months ago

Download agora.ulaval.ca

To meet time constraints, a CBR system must control the time spent searching in the case base for a solution. In this paper, we present the results of a case study comparing the p...

Houcine Romdhane, Luc Lamontagne


JAIR
2008

148 views

Learning Partially Observable Deterministic Action Models

13 years 7 months ago

Download www.jair.org

We present exact algorithms for identifying deterministic-actions' effects and preconditions in dynamic partially observable domains. They apply when one does not know the ac...

Eyal Amir, Allen Chang
