Appropriately designing sampling policies is highly important for obtaining better control policies in reinforcement learning. In this paper, we first show that the least-squares ...
This paper considers consensus problems with delayed noisy measurements, and stochastic approximation is used to achieve mean square consensus. For stochastic approximation based c...
Off-policy reinforcement learning is aimed at efficiently reusing data samples gathered in the past, which is an essential problem for physically grounded AI as experiments are us...
As shown in [7], optimal control problems with either ODE or PDE dynamics can be solved efficiently using a setting of consistent approximations obtained by numerical discretizati...
We study a sequential variance reduction technique for Monte Carlo estimation of functionals in Markov Chains. The method is based on designing sequential control variates using s...