Sciweavers

81 search results - page 16 / 17
» An extended policy gradient algorithm for robot task learnin...
AI
1998
Springer
13 years 8 months ago
Utility-Based On-Line Exploration for Repeated Navigation in an Embedded Graph
In this paper, we address the tradeoff between exploration and exploitation for agents which need to learn more about the structure of their environment in order to perform more effe...
Shlomo Argamon-Engelson, Sarit Kraus, Sigalit Sina
ATAL
2010
Springer
13 years 9 months ago
Learning multi-agent state space representations
This paper describes an algorithm, called CQ-learning, which learns to adapt the state representation for multi-agent systems in order to coordinate with other agents. We propose ...
Yann-Michaël De Hauwere, Peter Vrancx, Ann No...
AGENTS
1999
Springer
14 years 27 days ago
Team-Partitioned, Opaque-Transition Reinforcement Learning
In this paper, we present a novel multi-agent learning paradigm called team-partitioned, opaque-transition reinforcement learning (TPOT-RL). TPOT-RL introduces the concept of usin...
Peter Stone, Manuela M. Veloso
ATAL
2010
Springer
13 years 9 months ago
Linear options
Learning, planning, and representing knowledge in large state spaces at multiple levels of temporal abstraction are key, long-standing challenges for building flexible autonomous agents. ...
Jonathan Sorg, Satinder P. Singh
CORR
2010
Springer
13 years 8 months ago
Neuroevolutionary optimization
Temporal difference methods are theoretically grounded and empirically effective methods for addressing reinforcement learning problems. In most real-world reinforcement learning ...
Eva Volná