Search Sciweavers | Sciweavers

31 search results - page 1 / 7

Publication

222 views

Algorithms and Bounds for Rollout Sampling Approximate Policy Iteration

16 years 3 months ago

Download arxiv.org

Abstract: Several approximate policy iteration schemes without value functions, which focus on policy representation using classifiers and address policy learning as a supervis...

Christos Dimitrakakis, Michail G. Lagoudakis

posted by olethros

Publication

334 views

Rollout Sampling Approximate Policy Iteration

16 years 3 months ago

Download www.springerlink.com

Several researchers have recently investigated the connection between reinforcement learning and classification. We are motivated by proposals of approximate policy iteration schem...

Christos Dimitrakakis, Michail G. Lagoudakis

posted by olethros

IJCAI 2003

147 views, Artificial Intelligence

Approximate Policy Iteration using Large-Margin Classifiers

15 years 7 months ago

Download ijcai.org

We present an approximate policy iteration algorithm that uses rollouts to estimate the value of each action under a given policy in a subset of states and a classifier to general...

Michail G. Lagoudakis, Ronald Parr

ML 2008, ACM

152 views, Machine Learning

Learning near-optimal policies with Bellman-residual minimization based fitted policy iteration and a single sample path

15 years 6 months ago

Download hal.inria.fr

Abstract. We consider batch reinforcement learning problems in continuous space, expected total discounted-reward Markovian Decision Problems. As opposed to previous theoretical wo...

András Antos, Csaba Szepesvári, Ré...

NIPS 2003

180 views, Information Technology

Bounded Finite State Controllers

15 years 7 months ago

Download books.nips.cc

We describe a new approximation algorithm for solving partially observable MDPs. Our bounded policy iteration approach searches through the space of bounded-size, stochastic fini...

Pascal Poupart, Craig Boutilier
