In this paper, we study an online learning algorithm in Reproducing Kernel Hilbert Spaces (RKHS) and general Hilbert spaces. We present a general form of the stochastic gradient m...
We study a sequential variance reduction technique for Monte Carlo estimation of functionals in Markov Chains. The method is based on designing sequential control variates using s...
— Many deterministic algorithms in the context of constrained optimization require the first-order derivatives, or the gradient vectors, of the objective and constraint function...
Stephanus Daniel Handoko, Chee Keong Kwoh, Yew-Soo...
As Field Programmable Gate Arrays (FPGAs) have reached capacities beyond millions of equivalent gates, it becomes possible to accelerate floating-point scientific computing applica...
We consider the problem of estimating the policy gradient in Partially Observable Markov Decision Processes (POMDPs) with a special class of policies that are based on Predictive ...