Sciweavers

231 search results - page 7 / 47
» Active Learning in Partially Observable Markov Decision Proc...
COLT
2000
Springer
14 years 28 days ago
Estimation and Approximation Bounds for Gradient-Based Reinforcement Learning
We model reinforcement learning as the problem of learning to control a Partially Observable Markov Decision Process (POMDP), and focus on gradient ascent approache...
Peter L. Bartlett, Jonathan Baxter
ANOR
2010
13 years 8 months ago
Inventory management with partially observed nonstationary demand
Abstract. We consider a continuous-time model for inventory management with Markov modulated non-stationary demands. We introduce active learning by assuming that the state of the ...
Erhan Bayraktar, Michael Ludkovski
DATE
2007
IEEE
14 years 2 months ago
Stochastic modeling and optimization for robust power management in a partially observable system
As the hardware and software complexity grows, it is unlikely for the power management hardware/software to have a full observation of the entire system status. In this paper, we ...
Qinru Qiu, Ying Tan, Qing Wu
SARA
2007
Springer
14 years 2 months ago
Active Learning of Dynamic Bayesian Networks in Markov Decision Processes
Several recent techniques for solving Markov decision processes use dynamic Bayesian networks to compactly represent tasks. The dynamic Bayesian network representation may not be g...
Anders Jonsson, Andrew G. Barto
ATAL
2007
Springer
14 years 18 days ago
Modeling plan coordination in multiagent decision processes
In multiagent planning, it is often convenient to view a problem as two subproblems: agent local planning and coordination. Thus, we can classify agent activities into two categor...
Ping Xuan