Search Sciweavers | Sciweavers

1262 search results - page 183 / 253

» Reinforcement Learning: An Introduction

ICML
2009
IEEE

123 views | Machine Learning

Constraint relaxation in approximate linear programs

16 years 4 months ago

Download anytime.cs.umass.edu

Approximate Linear Programming (ALP) is a reinforcement learning technique with nice theoretical properties, but it often performs poorly in practice. We identify some reasons for...

Marek Petrik, Shlomo Zilberstein

ICML
2004
IEEE

163 views | Machine Learning

Multi-task feature and kernel selection for SVMs

16 years 4 months ago

Download www1.cs.columbia.edu

We compute a common feature selection or kernel selection configuration for multiple support vector machines (SVMs) trained on different yet inter-related datasets. The method is ...

Tony Jebara

ICML
2003
IEEE

124 views | Machine Learning

Exploration in Metric State Spaces

16 years 4 months ago

Download www.cis.upenn.edu

We present metric-E³, a provably near-optimal algorithm for reinforcement learning in Markov decision processes in which there is a natural metric on the state space that allows t...

Sham Kakade, Michael J. Kearns, John Langford

ICML
2003
IEEE

146 views | Machine Learning

TD(0) Converges Provably Faster than the Residual Gradient Algorithm

16 years 4 months ago

Download www.hpl.hp.com

In Reinforcement Learning (RL) there has been some experimental evidence that the residual gradient algorithm converges slower than the TD(0) algorithm. In this paper, we use the ...

Ralf Schoknecht, Artur Merke

IJCNN
2006
IEEE

111 views | Neural Networks

Training Coordination Proxy Agents

15 years 9 months ago

Download cs.itd.nrl.navy.mil

Delegating the coordination role to proxy agents can improve the overall outcome of the task at the expense of cognitive overload due to switching subtasks. Stability and commi...

Myriam Abramson, William Chao, Ranjeev Mittu
