Search Sciweavers | Sciweavers

» Opposition-Based Reinforcement Learning

JMLR
2008

Accelerated Neural Evolution through Cooperatively Coevolved Synapses

13 years 10 months ago

Download www.idsia.ch

Many complex control problems require sophisticated solutions that are not amenable to traditional controller design. Not only is it difficult to model real world systems, but oft...

Faustino J. Gomez, Jürgen Schmidhuber, Risto ...

ICML
2009
IEEE

Constraint relaxation in approximate linear programs

14 years 10 months ago

Download anytime.cs.umass.edu

Approximate Linear Programming (ALP) is a reinforcement learning technique with nice theoretical properties, but it often performs poorly in practice. We identify some reasons for...

Marek Petrik, Shlomo Zilberstein

ICML
2004
IEEE

Multi-task feature and kernel selection for SVMs

14 years 10 months ago

Download www1.cs.columbia.edu

We compute a common feature selection or kernel selection configuration for multiple support vector machines (SVMs) trained on different yet inter-related datasets. The method is ...

Tony Jebara

ICML
2003
IEEE

Exploration in Metric State Spaces

14 years 10 months ago

Download www.cis.upenn.edu

We present metric-E³, a provably near-optimal algorithm for reinforcement learning in Markov decision processes in which there is a natural metric on the state space that allows t...

Sham Kakade, Michael J. Kearns, John Langford

ICML
2003
IEEE

TD(0) Converges Provably Faster than the Residual Gradient Algorithm

14 years 10 months ago

Download www.hpl.hp.com

In Reinforcement Learning (RL) there has been some experimental evidence that the residual gradient algorithm converges slower than the TD(0) algorithm. In this paper, we use the ...

Ralf Schoknecht, Artur Merke
