We present a model-free reinforcement learning method for partially observable Markov decision problems. Our method estimates a likelihood gradient by sampling directly in paramet...
This paper presents a spoken dialogue framework that helps users in making decisions. Users often do not have a definite goal or criteria for selecting from a list of alternatives...
To avoid the curse of dimensionality, function approximators are used in reinforcement learning to learn value functions for individual states. In order to make better use of comp...
We consider the problem of efficiently learning optimal control policies and value functions over large state spaces in an online setting in which estimates must be available afte...
In this paper we use a reinforcement learning algorithm with the aim to increase the autonomous lifetime of a Wireless Sensor Network (WSN) and decrease latency in a decentralized...