Search Sciweavers | Sciweavers

63 search results - page 11 / 13

» Mean field for Markov Decision Processes: from Discrete to C...

click to vote

CORR
2006
Springer

113views Education» more CORR 2006»

A Unified View of TD Algorithms; Introducing Full-Gradient TD and Equi-Gradient Descent TD

13 years 7 months ago

Download hal.inria.fr

This paper addresses the issue of policy evaluation in Markov Decision Processes, using linear function approximation. It provides a unified view of algorithms such as TD(), LSTD()...

Manuel Loth, Philippe Preux

claim paper

Read More »

click to vote

IJRR
2008

101views more IJRR 2008»

Motion Planning Under Uncertainty for Image-guided Medical Needle Steering

13 years 7 months ago

Download dora.cwru.edu

We develop a new motion planning algorithm for a variant of a Dubins car with binary left/right steering and apply it to steerable needles, a new class of flexible beveltip medica...

Ron Alterovitz, Michael S. Branicky, Kenneth Y. Go...

claim paper

Read More »

click to vote

HT
2009
ACM

146views Internet Technology» more HT 2009»

Improving recommender systems with adaptive conversational strategies

14 years 2 months ago

Download www.inf.unibz.it

Conversational recommender systems (CRSs) assist online users in their information-seeking and decision making tasks by supporting an interactive process. Although these processes...

Tariq Mahmood, Francesco Ricci

claim paper

Read More »

click to vote

NIPS
1996

192views Information Technology» more NIPS 1996»

Multidimensional Triangulation and Interpolation for Reinforcement Learning

13 years 9 months ago

Download www.cs.cmu.edu

Dynamic Programming, Q-learning and other discrete Markov Decision Process solvers can be applied to continuous d-dimensional state-spaces by quantizing the state space into an arr...

Scott Davies

claim paper

Read More »

click to vote

ECML
2007
Springer

192views Machine Learning» more ECML 2007»

Policy Gradient Critics

14 years 1 months ago

Download www.idsia.ch

We present Policy Gradient Actor-Critic (PGAC), a new model-free Reinforcement Learning (RL) method for creating limited-memory stochastic policies for Partially Observable Markov ...

Daan Wierstra, Jürgen Schmidhuber

claim paper

Read More »

« Prev « First page 11 / 13 Last » Next »

Sciweavers

Explore & Download

Productivity Tools

Document Tools

Image Tools

Sciweavers