Search Sciweavers | Sciweavers

451 search results - page 4 / 91

» Performance evaluation with temporal rewards

IJCAI 2007

143 views, Artificial Intelligence

Direct Code Access in Self-Organizing Neural Networks for Reinforcement Learning

15 years 8 months ago

Download www.aaai.org

TD-FALCON is a self-organizing neural network that incorporates Temporal Difference (TD) methods for reinforcement learning. Despite the advantages of fast and stable learning, TD...

Ah-Hwee Tan

APN 1999, Springer

107 views, Artificial Intelligence

Parallel Approaches to the Numerical Transient Analysis of Stochastic Reward Nets

15 years 11 months ago

Download www3.informatik.uni-erlangen.de

Abstract. This paper presents parallel approaches to the complete transient numerical analysis of stochastic reward nets (SRNs) for both shared and distributed-memory machines. Par...

Susann C. Allmaier, David Kreische

Publication

233 views

Sparse reward processes

14 years 6 months ago

Download arxiv.org

We introduce a class of learning problems where the agent is presented with a series of tasks. Intuitively, if there is a relation among those tasks, then the information gained duri...

Christos Dimitrakakis

posted by olethros

ATAL 2008, Springer

123 views, Intelligent Agents

Sigma point policy iteration

15 years 9 months ago

Download web.mit.edu

In reinforcement learning, least-squares temporal difference methods (e.g., LSTD and LSPI) are effective, data-efficient techniques for policy evaluation and control with linear v...

Michael H. Bowling, Alborz Geramifard, David Winga...

ATAL 2009, Springer

109 views, Intelligent Agents

Reward shaping for valuing communications during multi-agent coordination

16 years 2 months ago

Download eprints.ecs.soton.ac.uk

Decentralised coordination in multi-agent systems is typically achieved using communication. However, in many cases, communication is expensive to utilise because there is limited...

Simon A. Williamson, Enrico H. Gerding, Nicholas R...

