Search Sciweavers | Sciweavers

ATAL
2009
Springer

Intelligent Agents

Online exploration in least-squares policy iteration

16 years 1 month ago

One of the key problems in reinforcement learning is balancing exploration and exploitation. Another is learning and acting in large or even continuous Markov decision processes (...

Lihong Li, Michael L. Littman, Christopher R. Mans...

VALUETOOLS
2006
ACM

Hardware

How to solve large scale deterministic games with mean payoff by policy iteration

16 years 18 days ago

Download minimal.inria.fr

Min-max functions are dynamic programming operators of zero-sum deterministic games with finite state and action spaces. The problem of computing the linear growth rate of the or...

Vishesh Dhingra, Stephane Gaubert

Publication

Algorithms and Bounds for Rollout Sampling Approximate Policy Iteration

16 years 3 months ago

Download arxiv.org

Abstract: Several approximate policy iteration schemes without value functions, which focus on policy representation using classifiers and address policy learning as a supervis...

Christos Dimitrakakis, Michail G. Lagoudakis

posted by olethros

ISCC
2000
IEEE

Communications

Dynamic Routing and Wavelength Assignment Using First Policy Iteration

15 years 11 months ago

Download www.netlab.tkk.fi

With standard assumptions the routing and wavelength assignment problem (RWA) can be viewed as a Markov Decision Process (MDP). The problem, however, defies an exact solution bec...

Esa Hyytiä, Jorma T. Virtamo

ECML
2006
Springer

Machine Learning

Approximate Policy Iteration for Closed-Loop Learning of Visual Tasks

15 years 10 months ago

Download www.montefiore.ulg.ac.be

Abstract. Approximate Policy Iteration (API) is a reinforcement learning paradigm that is able to solve high-dimensional, continuous control problems. We propose to exploit API for...

Sébastien Jodogne, Cyril Briquet, Justus H....
