Search Sciweavers | Sciweavers

226 search results - page 20 / 46

» A Convergent Reinforcement Learning Algorithm in the Continu...

click to vote

ML
2002
ACM

121views Machine Learning» more ML 2002»

Near-Optimal Reinforcement Learning in Polynomial Time

13 years 6 months ago

Download www.cis.upenn.edu

We present new algorithms for reinforcement learning, and prove that they have polynomial bounds on the resources required to achieve near-optimal return in general Markov decisio...

Michael J. Kearns, Satinder P. Singh

claim paper

Read More »

click to vote

JMLR
2010

119views more JMLR 2010»

A Convergent Online Single Time Scale Actor Critic Algorithm

13 years 1 months ago

Download jmlr.csail.mit.edu

Actor-Critic based approaches were among the first to address reinforcement learning in a general setting. Recently, these algorithms have gained renewed interest due to their gen...

Dotan Di Castro, Ron Meir

claim paper

Read More »

click to vote

ECML
2007
Springer

170views Machine Learning» more ECML 2007»

Sequence Labeling with Reinforcement Learning and Ranking Algorithms

13 years 8 months ago

Download nieme.lip6.fr

Many problems in areas such as Natural Language Processing, Information Retrieval, or Bioinformatic involve the generic task of sequence labeling. In many cases, the aim is to assi...

Francis Maes, Ludovic Denoyer, Patrick Gallinari

claim paper

Read More »

click to vote

ICML
2003
IEEE

146views Machine Learning» more ICML 2003»

TD(0) Converges Provably Faster than the Residual Gradient Algorithm

14 years 7 months ago

Download www.hpl.hp.com

In Reinforcement Learning (RL) there has been some experimental evidence that the residual gradient algorithm converges slower than the TD(0) algorithm. In this paper, we use the ...

Ralf Schoknecht, Artur Merke

claim paper

Read More »

click to vote

GECCO
2008
Springer

182views Optimization» more GECCO 2008»

Scaling ant colony optimization with hierarchical reinforcement learning partitioning

13 years 7 months ago

Download www.cs.bham.ac.uk

This paper merges hierarchical reinforcement learning (HRL) with ant colony optimization (ACO) to produce a HRL ACO algorithm capable of generating solutions for large domains. Th...

Erik J. Dries, Gilbert L. Peterson

claim paper

Read More »

« Prev « First page 20 / 46 Last » Next »

Sciweavers

Explore & Download

Productivity Tools

Document Tools

Image Tools

Sciweavers