Search Sciweavers | Sciweavers

147 search results - page 3 / 30

» Policy Gradient in Continuous Time

AAAI
2011

Intelligent Agents

Policy Gradient Planning for Environmental Decision Making with Existing Simulators

14 years 6 months ago

Download www.cs.ubc.ca

In environmental and natural resource planning domains, actions are taken at a large number of locations over multiple time periods. These problems have enormous state and action s...

Mark Crowley, David Poole

ICANN
2007
Springer

Neural Networks

Solving Deep Memory POMDPs with Recurrent Policy Gradients

16 years 25 days ago

Download www.idsia.ch

This paper presents Recurrent Policy Gradients, a model-free reinforcement learning (RL) method creating limited-memory stochastic policies for partially observable Markov...

Daan Wierstra, Alexander Förster, Jan Peters,...

EWRL
2008

Machine Learning

Policy Learning - A Unified Perspective with Applications in Robotics

15 years 8 months ago

Download www.kyb.tuebingen.mpg.de

Policy Learning approaches are among the best suited methods for high-dimensional, continuous control systems such as anthropomorphic robot arms and humanoid robots. In this paper,...

Jan Peters, Jens Kober, Duy Nguyen-Tuong

EOR
2011

Continuous time mean variance asset allocation: A time-consistent strategy

15 years 1 months ago

Download www.cs.uwaterloo.ca

We develop a numerical scheme for determining the optimal asset allocation strategy for time-consistent, continuous time, mean variance optimization. Any type of constraint can be...

J. Wang, P. A. Forsyth

DEDS
2010

On Regression-Based Stopping Times

15 years 6 months ago

Download www.stanford.edu

We study approaches that fit a linear combination of basis functions to the continuation value function of an optimal stopping problem and then employ a greedy policy based on the...

Benjamin Van Roy
