—TD learning and its refinements are powerful tools for approximating the solution to dynamic programming problems. However, the techniques provide the approximate solution only...
Wei Chen, Dayu Huang, Ankur A. Kulkarni, Jayakrish...
This work describes an approach for the interpretation and explanation of human behavior in image sequences, within the context of a Cognitive Vision System. The information source...
Previous algorithms for learning lexicographic preference models (LPMs) produce a "best guess" LPM that is consistent with the observations. Our approach is more democra...
Fusun Yaman, Thomas J. Walsh, Michael L. Littman, ...
Abstract. Previous work of the authors has studied a notion of implication between sets of sequences based on the conceptual structure of a Galois lattice, and also a way of repres...
We address two open theoretical questions in Policy Gradient Reinforcement Learning. The first concerns the efficacy of using function approximation to represent the state action ...