Sciweavers

63 search results - page 11 / 13
CORR
2006
Springer
13 years 7 months ago
A Unified View of TD Algorithms; Introducing Full-Gradient TD and Equi-Gradient Descent TD
This paper addresses the issue of policy evaluation in Markov Decision Processes, using linear function approximation. It provides a unified view of algorithms such as TD(λ), LSTD(λ)...
Manuel Loth, Philippe Preux
IJRR
2008
13 years 7 months ago
Motion Planning Under Uncertainty for Image-guided Medical Needle Steering
We develop a new motion planning algorithm for a variant of a Dubins car with binary left/right steering and apply it to steerable needles, a new class of flexible bevel-tip medica...
Ron Alterovitz, Michael S. Branicky, Kenneth Y. Go...
HT
2009
ACM
14 years 2 months ago
Improving recommender systems with adaptive conversational strategies
Conversational recommender systems (CRSs) assist online users in their information-seeking and decision-making tasks by supporting an interactive process. Although these processes...
Tariq Mahmood, Francesco Ricci
NIPS
1996
13 years 9 months ago
Multidimensional Triangulation and Interpolation for Reinforcement Learning
Dynamic Programming, Q-learning and other discrete Markov Decision Process solvers can be applied to continuous d-dimensional state-spaces by quantizing the state space into an arr...
Scott Davies
ECML
2007
Springer
14 years 1 months ago
Policy Gradient Critics
We present Policy Gradient Actor-Critic (PGAC), a new model-free Reinforcement Learning (RL) method for creating limited-memory stochastic policies for Partially Observable Markov ...
Daan Wierstra, Jürgen Schmidhuber