Sciweavers

4544 search results - page 180 / 909
» Reinforcement Learning with Time
ICONIP
2007
13 years 11 months ago
Practical Recurrent Learning (PRL) in the Discrete Time Domain
One of the authors has proposed a simple learning algorithm for recurrent neural networks, which requires computational cost and memory capacity of practical order O(n^2) [1]. The a...
Mohamad Faizal Bin Samsudin, Takeshi Hirose, Katsu...
AGENTS
2000
Springer
14 years 2 months ago
Adaptivity in agent-based routing for data networks
Adaptivity, both of the individual agents and of the interaction structure among the agents, seems indispensable for scaling up multi-agent systems (MASs) in noisy environme...
David Wolpert, Sergey Kirshner, Christopher J. Mer...
ROBOCUP
2000
Springer
14 years 1 month ago
Essex Wizards 2000 Team Description
This article gives an overview of the Essex Wizards 2000 team, which participated in the RoboCup 2000 simulator league. A brief description of the agent architecture for the team is int...
Huosheng Hu, Kostas Kostiadis, Matthew Hunter, Kos...
ESANN
2008
13 years 11 months ago
Similarities and differences between policy gradient methods and evolution strategies
Natural policy gradient methods and the covariance matrix adaptation evolution strategy, two variable metric methods proposed for solving reinforcement learning tasks, are contrast...
Verena Heidrich-Meisner, Christian Igel
NIPS
2007
13 years 11 months ago
Stable Dual Dynamic Programming
Recently, we have introduced a novel approach to dynamic programming and reinforcement learning that is based on maintaining explicit representations of stationary distributions i...
Tao Wang, Daniel J. Lizotte, Michael H. Bowling, D...