This paper proposes a stochastic dynamic thermal management (DTM) technique in high-performance VLSI system with especial attention to the uncertainty in temperature observation. ...
We present a general machine learning framework for modelling the phenomenon of missing information in data. We propose a masking process model to capture the stochastic nature of...
The problem of deriving joint policies for a group of agents that maximize some joint reward function can be modeled as a decentralized partially observable Markov decision proces...
Ranjit Nair, Milind Tambe, Makoto Yokoo, David V. ...
The Partially Observable Markov Decision Process has long been recognized as a rich framework for real-world planning and control problems, especially in robotics. However exact s...
Joelle Pineau, Geoffrey J. Gordon, Sebastian Thrun
Reinforcement techniques have been successfully used to maximise the expected cumulative reward of statistical dialogue systems. Typically, reinforcement learning is used to estim...