Sciweavers

» Residual Algorithms: Reinforcement Learning with Function Ap...
ATAL
2008
Springer
13 years 9 months ago
Sequential decision making with untrustworthy service providers
In this paper, we deal with the sequential decision making problem of agents operating in computational economies, where there is uncertainty regarding the trustworthiness of serv...
W. T. Luke Teacy, Georgios Chalkiadakis, Alex Roge...
ICMLA
2010
13 years 5 months ago
Multimodal Parameter-exploring Policy Gradients
Policy Gradients with Parameter-based Exploration (PGPE) is a novel model-free reinforcement learning method that alleviates the problem of high-variance gradient estima...
Frank Sehnke, Alex Graves, Christian Osendorfer, J...
IJCAI
2003
13 years 9 months ago
Simultaneous Adversarial Multi-Robot Learning
Multi-robot learning faces all of the challenges of robot learning with all of the challenges of multiagent learning. There has been a great deal of recent research on multiagent ...
Michael H. Bowling, Manuela M. Veloso
ICDM
2007
IEEE
14 years 2 months ago
High-Speed Function Approximation
We address a new learning problem where the goal is to build a predictive model that minimizes prediction time (the time taken to make a prediction) subject to a constraint on mod...
Biswanath Panda, Mirek Riedewald, Johannes Gehrke,...
ICML
2010
IEEE
13 years 8 months ago
Finite-Sample Analysis of LSTD
In this paper we consider the problem of policy evaluation in reinforcement learning, i.e., learning the value function of a fixed policy, using the least-squares temporal-differe...
Alessandro Lazaric, Mohammad Ghavamzadeh, Ré...