Sciweavers

» Value-Function Approximations for Partially Observable Marko...
COLT
2000
Springer
15 years 7 months ago
Estimation and Approximation Bounds for Gradient-Based Reinforcement Learning
We model reinforcement learning as the problem of learning to control a Partially Observable Markov Decision Process (POMDP), and focus on gradient ascent approache...
Peter L. Bartlett, Jonathan Baxter
NIPS
2007
15 years 4 months ago
Bayes-Adaptive POMDPs
Bayesian Reinforcement Learning has generated substantial interest recently, as it provides an elegant solution to the exploration-exploitation trade-off in reinforcement learning...
Stéphane Ross, Brahim Chaib-draa, Joelle Pi...
AMAI
2004
Springer
15 years 8 months ago
A Framework for Sequential Planning in Multi-Agent Settings
This paper extends the framework of partially observable Markov decision processes (POMDPs) to multi-agent settings by incorporating the notion of agent models into the state spac...
Piotr J. Gmytrasiewicz, Prashant Doshi
ICML
1999
IEEE
16 years 4 months ago
Monte Carlo Hidden Markov Models: Learning Non-Parametric Models of Partially Observable Stochastic Processes
We present a learning algorithm for non-parametric hidden Markov models with continuous state and observation spaces. All necessary probability densities are approximated using sa...
Sebastian Thrun, John Langford, Dieter Fox
NIPS
2004
15 years 4 months ago
VDCBPI: an Approximate Scalable Algorithm for Large POMDPs
Existing algorithms for discrete partially observable Markov decision processes can at best solve problems of a few thousand states due to two important sources of intractability:...
Pascal Poupart, Craig Boutilier