Sciweavers

350 search results - page 15 / 70
» Incremental profile learning based on a reinforcement method
CG
2006
Springer
13 years 9 months ago
Feature Construction for Reinforcement Learning in Hearts
Temporal difference (TD) learning has been used to learn strong evaluation functions in a variety of two-player games. TD-Gammon illustrated how the combination of game tree search...
Nathan R. Sturtevant, Adam M. White
NN
2010
Springer
13 years 6 months ago
Parameter-exploring policy gradients
We present a model-free reinforcement learning method for partially observable Markov decision problems. Our method estimates a likelihood gradient by sampling directly in paramet...
Frank Sehnke, Christian Osendorfer, Thomas Rü...
IWLCS
2005
Springer
14 years 1 month ago
Counter Example for Q-Bucket-Brigade Under Prediction Problem
Aiming to clarify the convergence or divergence conditions for Learning Classifier System (LCS), this paper explores: (1) an extreme condition where the reinforcement process of ...
Atsushi Wada, Keiki Takadama, Katsunori Shimohara
NN
2002
Springer
13 years 7 months ago
Control of exploitation-exploration meta-parameter in reinforcement learning
In reinforcement learning (RL), the duality between exploitation and exploration has long been an important issue. This paper presents a new method that controls the balance betwe...
Shin Ishii, Wako Yoshida, Junichiro Yoshimoto
AIMSA
2006
Springer
13 years 11 months ago
Machine Learning for Spoken Dialogue Management: An Experiment with Speech-Based Database Querying
Although speech and language processing techniques achieved a relative maturity during the last decade, designing a spoken dialogue system is still a tailoring task because of the ...
Olivier Pietquin