Sciweavers

286 search results - page 28 / 58
» Using inaccurate models in reinforcement learning
SARA
2005
Springer
14 years 1 month ago
The Cruncher: Automatic Concept Formation Using Minimum Description Length
Abstract. We present The Cruncher, a simple representation framework and algorithm based on minimum description length for automatically forming an ontology of concepts from attrib...
Marc Pickett, Tim Oates
FLAIRS
2006
13 years 9 months ago
Refining Human Behavior Models in a Context-based Architecture
This paper describes an investigation into the refinement of context-based human behavior models through the use of experiential learning. Specifically, a tactical agent was endow...
David Aihe, Avelino J. Gonzalez
ICML
2010
IEEE
13 years 5 months ago
Temporal Difference Bayesian Model Averaging: A Bayesian Perspective on Adapting Lambda
Temporal difference (TD) algorithms are attractive for reinforcement learning due to their ease-of-implementation and use of "bootstrapped" return estimates to make effi...
Carlton Downey, Scott Sanner
HT
2009
ACM
14 years 2 months ago
Improving recommender systems with adaptive conversational strategies
Conversational recommender systems (CRSs) assist online users in their information-seeking and decision making tasks by supporting an interactive process. Although these processes...
Tariq Mahmood, Francesco Ricci
ATAL
2008
Springer
13 years 9 months ago
Sigma point policy iteration
In reinforcement learning, least-squares temporal difference methods (e.g., LSTD and LSPI) are effective, data-efficient techniques for policy evaluation and control with linear v...
Michael H. Bowling, Alborz Geramifard, David Winga...