TD-FALCON is a self-organizing neural network that incorporates Temporal Difference (TD) methods for reinforcement learning. Despite the advantages of fast and stable learning, TD...
Abstract. This paper presents parallel approaches to the complete transient numerical analysis of stochastic reward nets (SRNs) for both shared and distributed-memory machines. Par...
We introduce a class of learning problems where the agent is presented with a series of tasks. Intuitively, if there is relation among those tasks, then the information gained duri...
In reinforcement learning, least-squares temporal difference methods (e.g., LSTD and LSPI) are effective, data-efficient techniques for policy evaluation and control with linear v...
Michael H. Bowling, Alborz Geramifard, David Winga...
Decentralised coordination in multi-agent systems is typically achieved using communication. However, in many cases, communication is expensive to utilise because there is limited...
Simon A. Williamson, Enrico H. Gerding, Nicholas R...