Search Sciweavers | Sciweavers

77 search results - page 14 / 16

» Value Function Approximation in Reinforcement Learning Using...

ICML
1999
IEEE

Least-Squares Temporal Difference Learning

16 years 3 months ago

Download www.research.rutgers.edu

Excerpted from: Boyan, Justin. Learning Evaluation Functions for Global Optimization. Ph.D. thesis, Carnegie Mellon University, August 1998. (Available as Technical Report CMU-CS-...

Justin A. Boyan

Publication

Algorithms and Bounds for Rollout Sampling Approximate Policy Iteration

16 years 10 hour ago

Download arxiv.org

Abstract: Several approximate policy iteration schemes without value functions, which focus on policy representation using classifiers and address policy learning as a supervis...

Christos Dimitrakakis, Michail G. Lagoudakis

posted by olethros

JMLR
2010

A Convergent Online Single Time Scale Actor Critic Algorithm

14 years 9 months ago

Download jmlr.csail.mit.edu

Actor-Critic based approaches were among the first to address reinforcement learning in a general setting. Recently, these algorithms have gained renewed interest due to their gen...

Dotan Di Castro, Ron Meir

ICML
2003
IEEE

TD(0) Converges Provably Faster than the Residual Gradient Algorithm

16 years 3 months ago

Download www.hpl.hp.com

In Reinforcement Learning (RL) there has been some experimental evidence that the residual gradient algorithm converges slower than the TD(0) algorithm. In this paper, we use the ...

Ralf Schoknecht, Artur Merke

STOC
2012
ACM

Nearly optimal solutions for the Chow parameters problem and low-weight approximation of halfspaces

13 years 5 months ago

Download www.cs.berkeley.edu

The Chow parameters of a Boolean function f : {−1, 1}^n → {−1, 1} are its n + 1 degree-0 and degree-1 Fourier coefficients. It has been known since 1961 [Cho61, Tan61] that ...

Anindya De, Ilias Diakonikolas, Vitaly Feldman, Ro...
