Search Sciweavers | Sciweavers

85 search results - page 11 / 17

» Approximate Policy Iteration with a Policy Language Bias

CORR
2012
Springer

An Incremental Sampling-based Algorithm for Stochastic Optimal Control

12 years 3 months ago

Download www.mit.edu

Abstract: In this paper, we consider a class of continuous-time, continuous-space stochastic optimal control problems. Building upon recent advances in Markov chain approximation ...

Vu Anh Huynh, Sertac Karaman, Emilio Frazzoli

CORR
2010
Springer

Global Optimization for Value Function Approximation

13 years 7 months ago

Download www.cs.umass.edu

Existing value function approximation methods have been successfully used in many applications, but they often lack useful a priori error bounds. We propose a new approximate bili...

Marek Petrik, Shlomo Zilberstein

IAT
2008
IEEE

Introducing Communication in Dis-POMDPs with Locality of Interaction

14 years 1 months ago

Download teamcore.usc.edu

Networked Distributed POMDPs (ND-POMDPs) can model multiagent systems in uncertain domains and have begun to scale up the number of agents. However, prior work in ND-POMDPs has ...

Makoto Tasaki, Yuichi Yabu, Yuki Iwanari, Makoto Y...

ICML
1999
IEEE

Least-Squares Temporal Difference Learning

14 years 8 months ago

Download www.research.rutgers.edu

Excerpted from: Boyan, Justin. Learning Evaluation Functions for Global Optimization. Ph.D. thesis, Carnegie Mellon University, August 1998. (Available as Technical Report CMU-CS-...

Justin A. Boyan

NIPS
2007

Incremental Natural Actor-Critic Algorithms

13 years 8 months ago

Download books.nips.cc

We present four new reinforcement learning algorithms based on actor-critic and natural-gradient ideas, and provide their convergence proofs. Actor-critic reinforcement learning m...

Shalabh Bhatnagar, Richard S. Sutton, Mohammad Gha...
