Search Sciweavers | Sciweavers

ICML
2003
IEEE

Exploration in Metric State Spaces

Download www.cis.upenn.edu

We present metric-E³, a provably near-optimal algorithm for reinforcement learning in Markov decision processes in which there is a natural metric on the state space that allows t...

Sham Kakade, Michael J. Kearns, John Langford

ICML
2009
IEEE

A majorization-minimization algorithm for (multiple) hyperparameter learning

Download ai.stanford.edu

We present a general Bayesian framework for hyperparameter tuning in L2-regularized supervised learning models. Paradoxically, our algorithm works by first analytically integratin...

Chuan-Sheng Foo, Chuong B. Do, Andrew Y. Ng

COLT
2010
Springer

An Asymptotically Optimal Bandit Algorithm for Bounded Support Models

Download www.colt2010.org

The multiarmed bandit problem is a typical example of a dilemma between exploration and exploitation in reinforcement learning. This problem is expressed as a model of a gambler playi...

Junya Honda, Akimichi Takemura

IJCAI
2007

Transfer Learning in Real-Time Strategy Games Using Hybrid CBR/RL

Download www.cc.gatech.edu

The goal of transfer learning is to use the knowledge acquired in a set of source tasks to improve performance in a related but previously unseen target task. In this paper, we pr...

Manu Sharma, Michael P. Holmes, Juan Carlos Santam...

EPIA
2003
Springer

Adaptation to Drifting Concepts

Download www2.mat.ua.pt

Most supervised learning algorithms assume the stability of the target concept over time. Nevertheless, in many real-user modeling systems, where the data is collected over an ex...

Gladys Castillo, João Gama, Pedro Medas
