Sciweavers

ICML
2003
IEEE
14 years 10 months ago
Exploration in Metric State Spaces
We present metric-E3, a provably near-optimal algorithm for reinforcement learning in Markov decision processes in which there is a natural metric on the state space that allows t...
Sham Kakade, Michael J. Kearns, John Langford
ICML
2009
IEEE
14 years 10 months ago
A majorization-minimization algorithm for (multiple) hyperparameter learning
We present a general Bayesian framework for hyperparameter tuning in L2-regularized supervised learning models. Paradoxically, our algorithm works by first analytically integratin...
Chuan-Sheng Foo, Chuong B. Do, Andrew Y. Ng
COLT
2010
Springer
13 years 7 months ago
An Asymptotically Optimal Bandit Algorithm for Bounded Support Models
The multiarmed bandit problem is a typical example of a dilemma between exploration and exploitation in reinforcement learning. This problem is expressed as a model of a gambler playi...
Junya Honda, Akimichi Takemura
IJCAI
2007
13 years 10 months ago
Transfer Learning in Real-Time Strategy Games Using Hybrid CBR/RL
The goal of transfer learning is to use the knowledge acquired in a set of source tasks to improve performance in a related but previously unseen target task. In this paper, we pr...
Manu Sharma, Michael P. Holmes, Juan Carlos Santam...
EPIA
2003
Springer
14 years 2 months ago
Adaptation to Drifting Concepts
Most supervised learning algorithms assume the stability of the target concept over time. Nevertheless, in many real-user modeling systems, where the data is collected over an ex...
Gladys Castillo, João Gama, Pedro Medas