For a Markov Decision Process with finite state (size S) and action spaces (size A per state), we propose a new algorithm--Delayed Q-Learning. We prove it is PAC, achieving near o...
Alexander L. Strehl, Lihong Li, Eric Wiewiora, Joh...
How should a reinforcement learning agent act if its sole purpose is to efficiently learn an optimal policy for later use? In other words, how should it explore, to be able to exp...
Systems and Networks on Chips (NoCs) are a prime design focus of many hardware manufacturers. In addition to functional verification, which is a difficult necessity, the chip desi...
Nicolas Coste, Holger Hermanns, Etienne Lantreibec...
In this paper, we propose a Bayesian model and a Monte Carlo Markov chain (MCMC) algorithm for reconstructing images that consist of only few non-zero pixels. An appropriate distr...
Nicolas Dobigeon, Alfred O. Hero, Jean-Yves Tourne...
We propose a new optimization algorithm called Generalized Baum Welch (GBW) algorithm for discriminative training on hidden Markov model (HMM). GBW is based on Lagrange relaxation...