In this paper, we propose a policy gradient reinforcement learning algorithm to address transition-independent Dec-POMDPs. This approach aims at implicitly exploiting the locality...
This paper considers the least-square online gradient descent algorithm in a reproducing kernel Hilbert space (RKHS) without explicit regularization. We present a novel capacity i...
We study the problem of learning high dimensional regression models regularized by a structured-sparsity-inducing penalty that encodes prior structural information on either input...
Xi Chen, Qihang Lin, Seyoung Kim, Jaime G. Carbone...
We consider regularized stochastic learning and online optimization problems, where the objective function is the sum of two convex terms: one is the loss function of the learning...
We consider the problem of estimating the policy gradient in Partially Observable Markov Decision Processes (POMDPs) with a special class of policies that are based on Predictive ...