We present Policy Gradient Actor-Critic (PGAC), a new model-free Reinforcement Learning (RL) method for creating limited-memory stochastic policies for Partially Observable Markov ...
Pseudo-independent (PI) models are a special class of probabilistic domain model (PDM) where a set of marginally independent domain variables shows collective dependency, a specia...
Sparse coding--that is, modelling data vectors as sparse linear combinations of basis elements--is widely used in machine learning, neuroscience, signal processing, and statistics...
Julien Mairal, Francis Bach, Jean Ponce, Guillermo...
Exponential Family PSR (EFPSR) models capture stochastic dynamical systems by representing state as the parameters of an exponential family distribution over a shortterm window of...
Gaussian Process Temporal Difference (GPTD) learning offers a Bayesian solution to the policy evaluation problem of reinforcement learning. In this paper we extend the GPTD framew...