Sciweavers

536 search results - page 48 / 108
» Residual Algorithms: Reinforcement Learning with Function Ap...
ATAL
2008
Springer
15 years 4 months ago
Sequential decision making with untrustworthy service providers
In this paper, we deal with the sequential decision making problem of agents operating in computational economies, where there is uncertainty regarding the trustworthiness of serv...
W. T. Luke Teacy, Georgios Chalkiadakis, Alex Roge...
ICMLA
2010
15 years 22 days ago
Multimodal Parameter-exploring Policy Gradients
Policy Gradients with Parameter-based Exploration (PGPE) is a novel model-free reinforcement learning method that alleviates the problem of high-variance gradient estima...
Frank Sehnke, Alex Graves, Christian Osendorfer, J...
IJCAI
2003
15 years 4 months ago
Simultaneous Adversarial Multi-Robot Learning
Multi-robot learning faces all of the challenges of robot learning with all of the challenges of multiagent learning. There has been a great deal of recent research on multiagent ...
Michael H. Bowling, Manuela M. Veloso
ICDM
2007
IEEE
15 years 9 months ago
High-Speed Function Approximation
We address a new learning problem where the goal is to build a predictive model that minimizes prediction time (the time taken to make a prediction) subject to a constraint on mod...
Biswanath Panda, Mirek Riedewald, Johannes Gehrke,...
ICML
2010
IEEE
15 years 3 months ago
Finite-Sample Analysis of LSTD
In this paper we consider the problem of policy evaluation in reinforcement learning, i.e., learning the value function of a fixed policy, using the least-squares temporal-differe...
Alessandro Lazaric, Mohammad Ghavamzadeh, Ré...