Sciweavers

COLT
2000
Springer
13 years 11 months ago
Estimation and Approximation Bounds for Gradient-Based Reinforcement Learning
We model reinforcement learning as the problem of learning to control a Partially Observable Markov Decision Process (POMDP), and focus on gradient ascent approache...
Peter L. Bartlett, Jonathan Baxter
ATAL
2010
Springer
13 years 8 months ago
Quasi deterministic POMDPs and DecPOMDPs
In this paper, we study a particular subclass of partially observable models, called quasi-deterministic partially observable Markov decision processes (QDET-POMDPs), characterize...
Camille Besse, Brahim Chaib-draa
IJCAI
2003
13 years 8 months ago
A Planning Algorithm for Predictive State Representations
We address the problem of optimally controlling stochastic environments that are partially observable. The standard method for tackling such problems is to define and solve a Part...
Masoumeh T. Izadi, Doina Precup
PERCOM
2007
ACM
14 years 7 months ago
Sensor Scheduling for Optimal Observability Using Estimation Entropy
We consider sensor scheduling as the optimal observability problem for partially observable Markov decision processes (POMDP). This model fits to the cases where a Markov process ...
Mohammad Rezaeian
FOCS
2007
IEEE
14 years 1 month ago
Approximation Algorithms for Partial-Information Based Stochastic Control with Markovian Rewards
We consider a variant of the classic multi-armed bandit problem (MAB), which we call FEEDBACK MAB, where the reward obtained by playing each of n independent arms varies according...
Sudipto Guha, Kamesh Munagala