Consider a robot whose task is to pick up some colored balls from a grid, taking the red balls to a red spot, the blue balls to a blue spot and so on, one by one, without knowing e...
We present Policy Gradient Actor-Critic (PGAC), a new model-free Reinforcement Learning (RL) method for creating limited-memory stochastic policies for Partially Observable Markov ...
We design a representation based on the situation calculus to facilitate development, maintenance and elaboration of very large taxonomies of actions. This representation leads to...
Multiagent environments are often highly dynamic and only partially observable which makes deliberative action planning computationally hard. In many such environments, however, a...
This paper describes an approach to automatically learn planning operators by observing expert solution traces and to further refine the operators through practice in a learning-b...