Search Sciweavers | Sciweavers

1235 search results - page 178 / 247

» Reinforcement learning in a nutshell

JMLR
2010

A Convergent Online Single Time Scale Actor Critic Algorithm

14 years 10 months ago

Download jmlr.csail.mit.edu

Actor-Critic based approaches were among the first to address reinforcement learning in a general setting. Recently, these algorithms have gained renewed interest due to their gen...

Dotan Di Castro, Ron Meir

ICML
2009
IEEE

Constraint relaxation in approximate linear programs

16 years 4 months ago

Download anytime.cs.umass.edu

Approximate Linear Programming (ALP) is a reinforcement learning technique with nice theoretical properties, but it often performs poorly in practice. We identify some reasons for...

Marek Petrik, Shlomo Zilberstein

ICML
2004
IEEE

Multi-task feature and kernel selection for SVMs

16 years 4 months ago

Download www1.cs.columbia.edu

We compute a common feature selection or kernel selection configuration for multiple support vector machines (SVMs) trained on different yet inter-related datasets. The method is ...

Tony Jebara

ICML
2003
IEEE

Exploration in Metric State Spaces

16 years 4 months ago

Download www.cis.upenn.edu

We present metric-E³, a provably near-optimal algorithm for reinforcement learning in Markov decision processes in which there is a natural metric on the state space that allows t...

Sham Kakade, Michael J. Kearns, John Langford

ICML
2003
IEEE

TD(0) Converges Provably Faster than the Residual Gradient Algorithm

16 years 4 months ago

Download www.hpl.hp.com

In Reinforcement Learning (RL) there has been some experimental evidence that the residual gradient algorithm converges slower than the TD(0) algorithm. In this paper, we use the ...

Ralf Schoknecht, Artur Merke
