Search Sciweavers | Sciweavers

171 search results - page 14 / 35

» Principled Methods for Advising Reinforcement Learning Agent...

160

click to vote

IROS
2007
IEEE

157views Robotics» more IROS 2007»

Autonomous blimp control using model-free reinforcement learning in a continuous state and action space

16 years 9 days ago

Download www.informatik.uni-freiburg.de

— In this paper, we present an approach that applies the reinforcement learning principle to the problem of learning height control policies for aerial blimps. In contrast to pre...

Axel Rottmann, Christian Plagemann, Peter Hilgers,...

claim paper

Read More »

135

click to vote

PRIMA
2009
Springer

102views Intelligent Agents» more PRIMA 2009»

Recursive Adaptation of Stepsize Parameter for Non-stationary Environments

16 years 16 days ago

Download teamcore.usc.edu

In this article, we propose a method to adapt stepsize parameters used in reinforcement learning for dynamic environments. In general reinforcement learning situations, a stepsize...

Itsuki Noda

claim paper

Read More »

171

click to vote

IAT
2008
IEEE

161views Intelligent Agents» more IAT 2008»

Scaling Up Multi-agent Reinforcement Learning in Complex Domains

15 years 6 months ago

Download www3.ntu.edu.sg

TD-FALCON (Temporal Difference - Fusion Architecture for Learning, COgnition, and Navigation) is a class of self-organizing neural networks that incorporates Temporal Difference (...

Dan Xiao, Ah-Hwee Tan

claim paper

Read More »

204

click to vote

NN
2007
Springer

105views Neural Networks» more NN 2007»

Guiding exploration by pre-existing knowledge without modifying reward

15 years 5 months ago

Download www.cs.hut.fi

Reinforcement learning is based on exploration of the environment and receiving reward that indicates which actions taken by the agent are good and which ones are bad. In many app...

Kary Främling

claim paper

Read More »

167

click to vote

ISDA
2009
IEEE

144views Operating System» more ISDA 2009»

Postponed Updates for Temporal-Difference Reinforcement Learning

16 years 19 days ago

Download www.science.uva.nl

This paper presents postponed updates, a new strategy for TD methods that can improve sample efﬁciency without incurring the computational and space requirements of model-based ...

Harm van Seijen, Shimon Whiteson

claim paper

Read More »

« Prev « First page 14 / 35 Last » Next »

Sciweavers

Explore & Download

Productivity Tools

Sciweavers