Sciweavers

» Ranking policies in discrete Markov decision processes
ICML
2008
IEEE
14 years 8 months ago
Apprenticeship learning using linear programming
In apprenticeship learning, the goal is to learn a policy in a Markov decision process that is at least as good as a policy demonstrated by an expert. The difficulty arises in tha...
Umar Syed, Michael H. Bowling, Robert E. Schapire
NN
2010
Springer
13 years 2 months ago
Efficient exploration through active learning for value function approximation in reinforcement learning
Appropriately designing sampling policies is highly important for obtaining better control policies in reinforcement learning. In this paper, we first show that the least-squares ...
Takayuki Akiyama, Hirotaka Hachiya, Masashi Sugiya...
ICML
2010
IEEE
13 years 9 months ago
Convergence of Least Squares Temporal Difference Methods Under General Conditions
We consider approximate policy evaluation for finite state and action Markov decision processes (MDP) in the off-policy learning context and with the simulation-based least square...
Huizhen Yu
AAAI
2006
13 years 9 months ago
Factored MDP Elicitation and Plan Display
The software suite we will demonstrate at AAAI '06 was designed around planning with factored Markov decision processes (MDPs). It is a user-friendly suite that facilitates d...
Krol Kevin Mathias, Casey Lengacher, Derek William...
IPCO
2008
13 years 9 months ago
The Stochastic Machine Replenishment Problem
We study the stochastic machine replenishment problem, which is a canonical special case of closed multiclass queuing systems in Markov decision theory. The problem models the sche...
Kamesh Munagala, Peng Shi