Search Sciweavers | Sciweavers

286 search results - page 28 / 58

» Using inaccurate models in reinforcement learning

click to vote

SARA
2005
Springer

114views Artificial Intelligence» more SARA 2005»

The Cruncher: Automatic Concept Formation Using Minimum Description Length

14 years 1 months ago

Download www.coral-lab.org

Abstract. We present The Cruncher, a simple representation framework and algorithm based on minimum description length for automatically forming an ontology of concepts from attrib...

Marc Pickett, Tim Oates

claim paper

Read More »

click to vote

FLAIRS
2006

109views Artificial Intelligence» more FLAIRS 2006»

Refining Human Behavior Models in a Context-based Architecture

13 years 9 months ago

Download www.aaai.org

This paper describes an investigation into the refinement of context-based human behavior models through the use of experiential learning. Specifically, a tactical agent was endow...

David Aihe, Avelino J. Gonzalez

claim paper

Read More »

click to vote

ICML
2010
IEEE

222views Machine Learning» more ICML 2010»

Temporal Difference Bayesian Model Averaging: A Bayesian Perspective on Adapting Lambda

13 years 5 months ago

Download www.icml2010.org

Temporal difference (TD) algorithms are attractive for reinforcement learning due to their ease-of-implementation and use of "bootstrapped" return estimates to make effi...

Carlton Downey, Scott Sanner

claim paper

Read More »

click to vote

HT
2009
ACM

146views Internet Technology» more HT 2009»

Improving recommender systems with adaptive conversational strategies

14 years 2 months ago

Download www.inf.unibz.it

Conversational recommender systems (CRSs) assist online users in their information-seeking and decision making tasks by supporting an interactive process. Although these processes...

Tariq Mahmood, Francesco Ricci

claim paper

Read More »

click to vote

ATAL
2008
Springer

123views Intelligent Agents» more ATAL 2008»

Sigma point policy iteration

13 years 9 months ago

Download web.mit.edu

In reinforcement learning, least-squares temporal difference methods (e.g., LSTD and LSPI) are effective, data-efficient techniques for policy evaluation and control with linear v...

Michael H. Bowling, Alborz Geramifard, David Winga...

claim paper

Read More »

« Prev « First page 28 / 58 Last » Next »

Sciweavers

Explore & Download

Productivity Tools

Document Tools

Image Tools

Sciweavers