Sciweavers

Ranking policies in discrete Markov decision processes
NIPS
2001
13 years 9 months ago
Variance Reduction Techniques for Gradient Estimates in Reinforcement Learning
Policy gradient methods for reinforcement learning avoid some of the undesirable properties of the value function approaches, such as policy degradation (Baxter and Bartlett, 2001...
Evan Greensmith, Peter L. Bartlett, Jonathan Baxte...
IJCAI
2001
13 years 9 months ago
Complexity of Probabilistic Planning under Average Rewards
A general and expressive model of sequential decision making under uncertainty is provided by the Markov decision processes (MDPs) framework. Complex applications with very large ...
Jussi Rintanen
IROS
2006
IEEE
14 years 2 months ago
Planning and Acting in Uncertain Environments using Probabilistic Inference
An important problem in robotics is planning and selecting actions for goal-directed behavior in noisy uncertain environments. The problem is typically addressed within the fra...
Deepak Verma, Rajesh P. N. Rao
AIPS
2008
13 years 10 months ago
Multiagent Planning Under Uncertainty with Stochastic Communication Delays
We consider the problem of cooperative multiagent planning under uncertainty, formalized as a decentralized partially observable Markov decision process (Dec-POMDP). Unfortunately...
Matthijs T. J. Spaan, Frans A. Oliehoek, Nikos A. ...
EOR
2006
13 years 8 months ago
Optimal dynamic assignment of a flexible worker on an open production line with specialists
This paper models and analyzes serial production lines with specialists at each station and a single, cross-trained floating worker who can work at any station. We formulate Marko...
Linn I. Sennott, Mark P. Van Oyen, Seyed M. R. Ira...