Sciweavers

» ABC Reinforcement Learning
NIPS
1996
14 years 14 days ago
Why did TD-Gammon Work?
Although TD-Gammon is one of the major successes in machine learning, it has not led to similar impressive breakthroughs in temporal difference learning for other applications or ...
Jordan B. Pollack, Alan D. Blair
GECCO
2008
Springer
14 years 7 days ago
Self-adaptive constructivism in Neural XCS and XCSF
For artificial entities to achieve high degrees of autonomy they will need to display appropriate adaptability. In this sense adaptability includes representational flexibility gu...
Gerard David Howard, Larry Bull, Pier Luca Lanzi
ICML
2010
IEEE
14 years 7 days ago
Finite-Sample Analysis of LSTD
In this paper we consider the problem of policy evaluation in reinforcement learning, i.e., learning the value function of a fixed policy, using the least-squares temporal-differe...
Alessandro Lazaric, Mohammad Ghavamzadeh, Ré...
ROBOCUP
2000
Springer
14 years 2 months ago
Essex Wizards 2000 Team Description
This article gives an overview of the Essex Wizards 2000 team, which participated in the RoboCup 2000 simulator league. A brief description of the agent architecture for the team is int...
Huosheng Hu, Kostas Kostiadis, Matthew Hunter, Kos...
ESANN
2008
14 years 18 days ago
Similarities and differences between policy gradient methods and evolution strategies
Natural policy gradient methods and the covariance matrix adaptation evolution strategy, two variable metric methods proposed for solving reinforcement learning tasks, are contrast...
Verena Heidrich-Meisner, Christian Igel