We introduce the first algorithm for off-policy temporal-difference learning that is stable with linear function approximation. Off-policy learning is of interest because it forms...
This paper investigates a relatively new direction in Multiagent Reinforcement Learning. Most multiagent learning techniques focus on Nash equilibria as elements of both the learn...
Reinforcement learning (RL) algorithms provide a sound theoretical basis for building learning control architectures for embedded agents. Unfortunately all of the theory and much ...
Satinder P. Singh, Tommi Jaakkola, Michael I. Jord...
This paper is about Reinforcement Learning (RL) applied to online parameter tuning in Stochastic Local Search (SLS) methods. In particular a novel application of RL is considered i...
Conversational recommender systems (CRSs) assist online users in their information-seeking and decision making tasks by supporting an interactive process. Although these processes...