For a Markov Decision Process with finite state (size S) and action spaces (size A per state), we propose a new algorithm--Delayed Q-Learning. We prove it is PAC, achieving near o...
Alexander L. Strehl, Lihong Li, Eric Wiewiora, Joh...
We study a sequential variance reduction technique for Monte Carlo estimation of functionals in Markov Chains. The method is based on designing sequential control variates using s...
—We aim to characterize the maximum link throughput of a multi-channel opportunistic communication system. The states of these channels evolve as independent and identically dist...
Abstract. Bayesian reinforcement learning (RL) is aimed at making more efficient use of data samples, but typically uses significantly more computation. For discrete Markov Decis...
Hidden Markov Models (HMMs) are one of the most fundamental and widely used statistical tools for modeling discrete time series. In general, learning HMMs from data is computation...