Search Sciweavers | Sciweavers

56 search results - page 4 / 12

» Reinforcement Learning for Average Reward Zero-Sum Games

ESANN
2008

278 views · Neural Networks

Learning to play Tetris applying reinforcement learning methods

15 years 7 months ago

Download www.dice.ucl.ac.be

In this paper the application of reinforcement learning to Tetris is investigated; in particular, the idea of temporal difference learning is applied to estimate the state value funct...

Alexander Groß, Jan Friedland, Friedhelm Sch...

GECCO
2006
Springer

198 views · Optimization

Reward allotment in an event-driven hybrid learning classifier system for online soccer games

15 years 9 months ago

Download www.cs.bham.ac.uk

This paper describes our study into the concept of using rewards in a classifier system applied to the acquisition of decision-making algorithms for agents in a soccer game. Our a...

Yuji Sato, Yosuke Akatsuka, Takenori Nishizono

ICML
2000
IEEE

126 views · Machine Learning

Reinforcement Learning in POMDP's via Direct Gradient Ascent

16 years 6 months ago

Download reference.kfupm.edu.sa

This paper discusses theoretical and experimental aspects of gradient-based approaches to the direct optimization of policy performance in controlled POMDPs. We introduce GPOMDP, a...

Jonathan Baxter, Peter L. Bartlett

NECO
2010

97 views

Derivatives of Logarithmic Stationary Distributions for Policy Gradient Reinforcement Learning

15 years 4 months ago

Download www.kyb.tuebingen.mpg.de

Most conventional Policy Gradient Reinforcement Learning (PGRL) algorithms neglect (or do not explicitly make use of) a term in the average reward gradient with respect to the pol...

Tetsuro Morimura, Eiji Uchibe, Junichiro Yoshimoto...

COLT
2003
Springer

141 views · Machine Learning

On-Line Learning with Imperfect Monitoring

15 years 11 months ago

Download www.ece.mcgill.ca

We study on-line play of repeated matrix games in which the observations of past actions of the other player and the obtained reward are partial and stochastic. We define the Part...

Shie Mannor, Nahum Shimkin
