Search Sciweavers | Sciweavers

226 search results - page 39 / 46

» A Convergent Reinforcement Learning Algorithm in the Continu...

UAI
2008

242 views, Artificial Intelligence

Dyna-Style Planning with Linear Function Approximation and Prioritized Sweeping

13 years 8 months ago

Download uai2008.cs.helsinki.fi

We consider the problem of efficiently learning optimal control policies and value functions over large state spaces in an online setting in which estimates must be available afte...

Richard S. Sutton, Csaba Szepesvári, Alborz...

ATAL
2008
Springer

145 views, Intelligent Agents

Artificial agents learning human fairness

13 years 9 months ago

Download www.sce.carleton.ca

Recent advances in technology allow multi-agent systems to be deployed in cooperation with or as a service for humans. Typically, those systems are designed assuming individually ...

Steven de Jong, Karl Tuyls, Katja Verbeeck

ICML
2005
IEEE

121 views, Machine Learning

Combining model-based and instance-based learning for first order regression

14 years 8 months ago

Download www.cs.kuleuven.ac.be

(Extended abstract) Kurt Driessens (a), Saso Dzeroski (b). (a) Department of Computer Science, University of Waikato, Hamilton, New Zealand (kurtd@waikato.ac.nz); (b) Departm...

Kurt Driessens, Saso Dzeroski

ECML
2007
Springer

192 views, Machine Learning

Policy Gradient Critics

14 years 1 month ago

Download www.idsia.ch

We present Policy Gradient Actor-Critic (PGAC), a new model-free Reinforcement Learning (RL) method for creating limited-memory stochastic policies for Partially Observable Markov ...

Daan Wierstra, Jürgen Schmidhuber

NIPS
2007

146 views, Information Technology

Receding Horizon Differential Dynamic Programming

13 years 8 months ago

Download books.nips.cc

The control of high-dimensional, continuous, non-linear dynamical systems is a key problem in reinforcement learning and control. Local, trajectory-based methods, using techniques...

Yuval Tassa, Tom Erez, William D. Smart
