Search Sciweavers | Sciweavers

536 search results - page 48 / 108

» Residual Algorithms: Reinforcement Learning with Function Ap...

ATAL
2008
Springer

184 views | Intelligent Agents

Sequential decision making with untrustworthy service providers

15 years 9 months ago

Download www.aamas-conference.org

In this paper, we deal with the sequential decision making problem of agents operating in computational economies, where there is uncertainty regarding the trustworthiness of serv...

W. T. Luke Teacy, Georgios Chalkiadakis, Alex Roge...

ICMLA
2010

203 views | Machine Learning

Multimodal Parameter-exploring Policy Gradients

15 years 5 months ago

Download www6.in.tum.de

Policy Gradients with Parameter-based Exploration (PGPE) is a novel model-free reinforcement learning method that alleviates the problem of high-variance gradient estima...

Frank Sehnke, Alex Graves, Christian Osendorfer, J...

IJCAI
2003

118 views | Artificial Intelligence

Simultaneous Adversarial Multi-Robot Learning

15 years 8 months ago

Download www.cs.cmu.edu

Multi-robot learning faces all of the challenges of robot learning with all of the challenges of multiagent learning. There has been a great deal of recent research on multiagent ...

Michael H. Bowling, Manuela M. Veloso

ICDM
2007
IEEE

106 views | Data Mining

High-Speed Function Approximation

16 years 1 month ago

Download www.ccs.neu.edu

We address a new learning problem where the goal is to build a predictive model that minimizes prediction time (the time taken to make a prediction) subject to a constraint on mod...

Biswanath Panda, Mirek Riedewald, Johannes Gehrke,...

ICML
2010
IEEE

167 views | Machine Learning

Finite-Sample Analysis of LSTD

15 years 8 months ago

Download hal.inria.fr

In this paper we consider the problem of policy evaluation in reinforcement learning, i.e., learning the value function of a fixed policy, using the least-squares temporal-differe...

Alessandro Lazaric, Mohammad Ghavamzadeh, Ré...
