Sciweavers

135 search results - page 12 / 27
» Bounded Parameter Markov Decision Processes
ICML
2006
IEEE
14 years 8 months ago
PAC model-free reinforcement learning
For a Markov Decision Process with finite state (size S) and action spaces (size A per state), we propose a new algorithm, Delayed Q-Learning. We prove it is PAC, achieving near o...
Alexander L. Strehl, Lihong Li, Eric Wiewiora, Joh...
JMLR
2006
13 years 7 months ago
Geometric Variance Reduction in Markov Chains: Application to Value Function and Gradient Estimation
We study a sequential variance reduction technique for Monte Carlo estimation of functionals in Markov Chains. The method is based on designing sequential control variates using s...
Rémi Munos
ICASSP
2008
IEEE
14 years 2 months ago
Link throughput of multi-channel opportunistic access with limited sensing
We aim to characterize the maximum link throughput of a multi-channel opportunistic communication system. The states of these channels evolve as independent and identically dist...
Keqin Liu, Qing Zhao
PKDD
2010
Springer
13 years 6 months ago
Smarter Sampling in Model-Based Bayesian Reinforcement Learning
Bayesian reinforcement learning (RL) is aimed at making more efficient use of data samples, but typically uses significantly more computation. For discrete Markov Decis...
Pablo Samuel Castro, Doina Precup
CORR
2008
Springer
13 years 7 months ago
A Spectral Algorithm for Learning Hidden Markov Models
Hidden Markov Models (HMMs) are one of the most fundamental and widely used statistical tools for modeling discrete time series. In general, learning HMMs from data is computation...
Daniel Hsu, Sham M. Kakade, Tong Zhang