Sciweavers

286 search results - page 28 / 58
» Using inaccurate models in reinforcement learning
SARA
2005
Springer
15 years 9 months ago
The Cruncher: Automatic Concept Formation Using Minimum Description Length
We present The Cruncher, a simple representation framework and algorithm based on minimum description length for automatically forming an ontology of concepts from attrib...
Marc Pickett, Tim Oates
FLAIRS
2006
15 years 5 months ago
Refining Human Behavior Models in a Context-based Architecture
This paper describes an investigation into the refinement of context-based human behavior models through the use of experiential learning. Specifically, a tactical agent was endow...
David Aihe, Avelino J. Gonzalez
ICML
2010
IEEE
15 years 2 months ago
Temporal Difference Bayesian Model Averaging: A Bayesian Perspective on Adapting Lambda
Temporal difference (TD) algorithms are attractive for reinforcement learning due to their ease-of-implementation and use of "bootstrapped" return estimates to make effi...
Carlton Downey, Scott Sanner
HT
2009
ACM
15 years 10 months ago
Improving recommender systems with adaptive conversational strategies
Conversational recommender systems (CRSs) assist online users in their information-seeking and decision making tasks by supporting an interactive process. Although these processes...
Tariq Mahmood, Francesco Ricci
ATAL
2008
Springer
15 years 6 months ago
Sigma point policy iteration
In reinforcement learning, least-squares temporal difference methods (e.g., LSTD and LSPI) are effective, data-efficient techniques for policy evaluation and control with linear v...
Michael H. Bowling, Alborz Geramifard, David Winga...