In this paper, we propose a policy gradient reinforcement learning algorithm to address transition-independent Dec-POMDPs. This approach aims at implicitly exploiting the locality...
We present a taxonomy for local distance functions where most existing algorithms can be regarded as approximations of the geodesic distance defined by a metric tensor. We categor...
We consider the classical finite-state discounted Markovian decision problem, and we introduce a new policy iteration-like algorithm for finding the optimal state costs or Q-facto...
We are interested in transferring control policies for arbitrary tasks from a human to a robot. Using interactive demonstration via teleoperation as our transfer scenario, we ca...
This paper addresses the problem of production and delivery lot-sizing and scheduling of a set of items in a two-echelon supply chain over a finite planning horizon. A single ...