Search Sciweavers | Sciweavers

567 search results - page 7 / 114

» Regularized Policy Iteration

AAAI
2006

146 views, Intelligent Agents

Incremental Least Squares Policy Iteration for POMDPs

15 years 8 months ago

Download www.aaai.org

We present a new algorithm, called incremental least squares policy iteration (ILSPI), for finding the infinite-horizon stationary policy for partially observable Markov decision ...

Hui Li, Xuejun Liao, Lawrence Carin

ESOP
2007
Springer

152 views, Programming Languages

Static Analysis by Policy Iteration on Relational Domains

16 years 25 days ago

Download minimal.inria.fr

We give a new practical algorithm to compute, in finite time, a fixpoint (and often the least fixpoint) of a system of equations in the abstract numerical domains of zones and t...

Stephane Gaubert, Eric Goubault, Ankur Taly, Sarah...

JMLR
2002

100 views

On the Convergence of Optimistic Policy Iteration

15 years 6 months ago

Download www.mit.edu

We consider a finite-state Markov decision problem and establish the convergence of a special case of optimistic policy iteration that involves Monte Carlo estimation of Q-values,...

John N. Tsitsiklis

ICRA
2009
IEEE

143 views, Robotics

Least absolute policy iteration for robust value function approximation

16 years 1 month ago

Download sugiyama-www.cs.titech.ac.jp

Least-squares policy iteration is a useful reinforcement learning method in robotics due to its computational efficiency. However, it tends to be sensitive to outliers...

Masashi Sugiyama, Hirotaka Hachiya, Hisashi Kashim...

NIPS
2001

206 views, Information Technology

Model-Free Least-Squares Policy Iteration

15 years 8 months ago

Download www.cs.duke.edu

We propose a new approach to reinforcement learning which combines least squares function approximation with policy iteration. Our method is model-free and completely off policy. ...

Michail G. Lagoudakis, Ronald Parr
