A channel machine consists of a finite controller together with several fifo channels; the controller can read messages from the head of a channel and write messages to the tail of...
Previous algorithms for learning lexicographic preference models (LPMs) produce a "best guess" LPM that is consistent with the observations. Our approach is more democra...
Fusun Yaman, Thomas J. Walsh, Michael L. Littman, ...
For large-scale classification problems, the training samples can be clustered beforehand as a downsampling pre-process, and then only the obtained clusters are used for training....
This paper presents a new approach to hierarchical reinforcement learning based on the MAXQ decomposition of the value function. The MAXQ decomposition has both a procedural seman...
This paper introduces the RL-TOPs architecture for robot learning, a hybrid system combining teleo-reactive planning and reinforcement learning techniques. The aim of this system ...