We consider the problem of energy-efficient point-to-point transmission of delay-sensitive data (e.g., multimedia data) over a fading channel. We propose a rigorous and unified fra...
We present Policy Gradient Actor-Critic (PGAC), a new model-free Reinforcement Learning (RL) method for creating limited-memory stochastic policies for Partially Observable Markov ...
This paper extends the framework of dynamic influence diagrams (DIDs) to the multi-agent setting. DIDs are computational representations of the Partially Observable Markov Decisio...
Open-ended spoken interactions are typically characterised by both structural complexity and high levels of uncertainty, making dialogue management in such settings a particularly...
Predictive state representation (PSR) models for controlled dynamical systems have recently been proposed as an alternative to traditional models such as partially observable Mark...
Michael R. James, Satinder P. Singh, Michael L. Li...