Sciweavers

451 search results - page 4 / 91
» Performance evaluation with temporal rewards
IJCAI
2007
13 years 9 months ago
Direct Code Access in Self-Organizing Neural Networks for Reinforcement Learning
TD-FALCON is a self-organizing neural network that incorporates Temporal Difference (TD) methods for reinforcement learning. Despite the advantages of fast and stable learning, TD...
Ah-Hwee Tan
APN
1999
Springer
13 years 12 months ago
Parallel Approaches to the Numerical Transient Analysis of Stochastic Reward Nets
Abstract. This paper presents parallel approaches to the complete transient numerical analysis of stochastic reward nets (SRNs) for both shared and distributed-memory machines. Par...
Susann C. Allmaier, David Kreische

Publication
233 views
12 years 6 months ago
Sparse reward processes
We introduce a class of learning problems where the agent is presented with a series of tasks. Intuitively, if there is a relation among those tasks, then the information gained duri...
Christos Dimitrakakis
ATAL
2008
Springer
13 years 9 months ago
Sigma point policy iteration
In reinforcement learning, least-squares temporal difference methods (e.g., LSTD and LSPI) are effective, data-efficient techniques for policy evaluation and control with linear v...
Michael H. Bowling, Alborz Geramifard, David Winga...
ATAL
2009
Springer
14 years 2 months ago
Reward shaping for valuing communications during multi-agent coordination
Decentralised coordination in multi-agent systems is typically achieved using communication. However, in many cases, communication is expensive to utilise because there is limited...
Simon A. Williamson, Enrico H. Gerding, Nicholas R...