147 search results - page 1 / 30

» Policy Gradient in Continuous Time

JMLR
2006

Policy Gradient in Continuous Time

15 years 6 months ago

Download hal.inria.fr

Policy search is a method for approximately solving an optimal control problem by performing a parametric optimization search in a given class of parameterized policies. In order ...

Rémi Munos

ICANNGA
2007
Springer

Reinforcement Learning in Fine Time Discretization

16 years 23 days ago

Download staff.elka.pw.edu.pl

Reinforcement Learning (RL) is analyzed here as a tool for control system optimization. State and action spaces are assumed to be continuous. Time is assumed to be discrete, yet th...

Pawel Wawrzynski

ICML
2008
IEEE

Non-parametric policy gradients: a unified treatment of propositional and relational domains

16 years 7 months ago

Download www-kd.iai.uni-bonn.de

Policy gradient approaches are a powerful instrument for learning how to interact with the environment. Existing approaches have focused on propositional and continuous domains on...

Kristian Kersting, Kurt Driessens

CORR
2006
Springer

A Unified View of TD Algorithms; Introducing Full-Gradient TD and Equi-Gradient Descent TD

15 years 6 months ago

Download hal.inria.fr

This paper addresses the issue of policy evaluation in Markov Decision Processes, using linear function approximation. It provides a unified view of algorithms such as TD(λ), LSTD(λ)...

Manuel Loth, Philippe Preux

SIAMCO
2008

A Knowledge-Gradient Policy for Sequential Information Collection

15 years 6 months ago

Download www.castlelab.princeton.edu

In a sequential Bayesian ranking and selection problem with independent normal populations and common known variance, we study a previously introduced measurement policy which we ...

Peter Frazier, Warren B. Powell, Savas Dayanik
