Sciweavers

200 search results - page 24 / 40
» Point-Based Policy Iteration
IJCAI
2001
13 years 9 months ago
Symbolic Dynamic Programming for First-Order MDPs
We present a dynamic programming approach for the solution of first-order Markov decision processes. This technique uses an MDP whose dynamics is represented in a variant of the ...
Craig Boutilier, Raymond Reiter, Bob Price
AUTOMATICA
2005
13 years 7 months ago
Robust optimal control of regular languages
This paper presents an algorithm for robust optimal control of regular languages under specified uncertainty bounds on the event cost parameters of the language measure that has b...
Constantino M. Lagoa, Jinbo Fu, Asok Ray
ICONIP
2009
13 years 5 months ago
Tracking in Reinforcement Learning
Reinforcement learning induces non-stationarity at several levels. Adaptation to non-stationary environments is of course a desired feature of a fair RL algorithm. Yet, even if the...
Matthieu Geist, Olivier Pietquin, Gabriel Fricout
ICRA
2010
IEEE
13 years 6 months ago
A simple learning strategy for high-speed quadrocopter multi-flips
We describe a simple and intuitive policy gradient method for improving parametrized quadrocopter multi-flips by combining iterative experiments with information from a first...
Sergei Lupashin, Angela Schöllig, Michael She...
NIPS
2004
13 years 9 months ago
Solitaire: Man Versus Machine
In this paper, we use the rollout method for policy improvement to analyze a version of Klondike solitaire. This version, sometimes called thoughtful solitaire, has all cards reve...
Xiang Yan, Persi Diaconis, Paat Rusmevichientong, ...