Search Sciweavers | Sciweavers

61 search results - page 4 / 13

» Convergence of synchronous reinforcement learning with linea...

223

click to vote

ICML
2006
IEEE

256views Machine Learning» more ICML 2006»

Automatic basis function construction for approximate dynamic programming and reinforcement learning

16 years 1 months ago

Download www.ece.mcgill.ca

We address the problem of automatically constructing basis functions for linear approximation of the value function of a Markov Decision Process (MDP). Our work builds on results ...

Philipp W. Keller, Shie Mannor, Doina Precup

claim paper

Read More »

230

click to vote

IWLCS
2005
Springer

161views Machine Learning» more IWLCS 2005»

Counter Example for Q-Bucket-Brigade Under Prediction Problem

16 years 16 days ago

Download www.cs.bham.ac.uk

Aiming to clarify the convergence or divergence conditions for Learning Classiﬁer System (LCS), this paper explores: (1) an extreme condition where the reinforcement process of ...

Atsushi Wada, Keiki Takadama, Katsunori Shimohara

claim paper

Read More »

264

click to vote

AI
1998
Springer

177views Artificial Intelligence» more AI 1998»

Model-Based Average Reward Reinforcement Learning

15 years 6 months ago

Download web.engr.oregonstate.edu

Reinforcement Learning (RL) is the study of programs that improve their performance by receiving rewards and punishments from the environment. Most RL methods optimize the discoun...

Prasad Tadepalli, DoKyeong Ok

claim paper

Read More »

171

click to vote

AIPS
2008

95views Artificial Intelligence» more AIPS 2008»

Learning Heuristic Functions through Approximate Linear Programming

15 years 9 months ago

Download anytime.cs.umass.edu

Planning problems are often formulated as heuristic search. The choice of the heuristic function plays a significant role in the performance of planning systems, but a good heuris...

Marek Petrik, Shlomo Zilberstein

claim paper

Read More »

181

click to vote

ICRA
2009
IEEE

143views Robotics» more ICRA 2009»

Least absolute policy iteration for robust value function approximation

16 years 1 months ago

Download sugiyama-www.cs.titech.ac.jp

Abstract— Least-squares policy iteration is a useful reinforcement learning method in robotics due to its computational efﬁciency. However, it tends to be sensitive to outliers...

Masashi Sugiyama, Hirotaka Hachiya, Hisashi Kashim...

claim paper

Read More »

« Prev « First page 4 / 13 Last » Next »

Sciweavers

Explore & Download

Productivity Tools

Sciweavers