Sciweavers

» Policy Gradient in Continuous Time
AAAI
2011
12 years 9 months ago
Policy Gradient Planning for Environmental Decision Making with Existing Simulators
In environmental and natural resource planning domains, actions are taken at a large number of locations over multiple time periods. These problems have enormous state and action s...
Mark Crowley, David Poole
ICANN
2007
Springer
14 years 3 months ago
Solving Deep Memory POMDPs with Recurrent Policy Gradients
Abstract. This paper presents Recurrent Policy Gradients, a model-free reinforcement learning (RL) method creating limited-memory stochastic policies for partially observable Markov...
Daan Wierstra, Alexander Förster, Jan Peters,...
EWRL
2008
13 years 11 months ago
Policy Learning - A Unified Perspective with Applications in Robotics
Policy Learning approaches are among the best suited methods for high-dimensional, continuous control systems such as anthropomorphic robot arms and humanoid robots. In this paper,...
Jan Peters, Jens Kober, Duy Nguyen-Tuong
EOR
2011
13 years 4 months ago
Continuous time mean variance asset allocation: A time-consistent strategy
We develop a numerical scheme for determining the optimal asset allocation strategy for time-consistent, continuous time, mean variance optimization. Any type of constraint can be...
J. Wang, P. A. Forsyth
DEDS
2010
13 years 9 months ago
On Regression-Based Stopping Times
We study approaches that fit a linear combination of basis functions to the continuation value function of an optimal stopping problem and then employ a greedy policy based on the...
Benjamin Van Roy