Search Sciweavers | Sciweavers

1630 search results - page 201 / 326

» Coordinated Reinforcement Learning

117

click to vote

ICML
2009
IEEE

123views Machine Learning» more ICML 2009»

Constraint relaxation in approximate linear programs

16 years 4 months ago

Download anytime.cs.umass.edu

Approximate Linear Programming (ALP) is a reinforcement learning technique with nice theoretical properties, but it often performs poorly in practice. We identify some reasons for...

Marek Petrik, Shlomo Zilberstein

claim paper

Read More »

142

click to vote

ICML
2004
IEEE

163views Machine Learning» more ICML 2004»

Multi-task feature and kernel selection for SVMs

16 years 4 months ago

Download www1.cs.columbia.edu

We compute a common feature selection or kernel selection configuration for multiple support vector machines (SVMs) trained on different yet inter-related datasets. The method is ...

Tony Jebara

claim paper

Read More »

132

click to vote

ICML
2003
IEEE

124views Machine Learning» more ICML 2003»

Exploration in Metric State Spaces

16 years 4 months ago

Download www.cis.upenn.edu

We present metric?? , a provably near-optimal algorithm for reinforcement learning in Markov decision processes in which there is a natural metric on the state space that allows t...

Sham Kakade, Michael J. Kearns, John Langford

claim paper

Read More »

119

click to vote

ICML
2003
IEEE

146views Machine Learning» more ICML 2003»

TD(0) Converges Provably Faster than the Residual Gradient Algorithm

16 years 4 months ago

Download www.hpl.hp.com

In Reinforcement Learning (RL) there has been some experimental evidence that the residual gradient algorithm converges slower than the TD(0) algorithm. In this paper, we use the ...

Ralf Schoknecht, Artur Merke

claim paper

Read More »

130

click to vote

ISCAS
2006
IEEE

103views Hardware» more ISCAS 2006»

Towards autonomous adaptive behavior in a bio-inspired CNN-controlled robot

15 years 10 months ago

Download web.mit.edu

— This paper describes a general approach for the unsupervised learning of behaviors in a behavior-based robot. The key idea is to formalize a behavior produced by a Motor Map dr...

Paolo Arena, Luigi Fortuna, Mattia Frasca, Luca Pa...

claim paper

Read More »

« Prev « First page 201 / 326 Last » Next »

Sciweavers

Explore & Download

Productivity Tools

Sciweavers