In reinforcement learning, least-squares temporal difference methods (e.g., LSTD and LSPI) are effective, data-efficient techniques for policy evaluation and control with linear v...
Michael H. Bowling, Alborz Geramifard, David Winga...
In this paper, we propose a serving system consisting intelligent agents processing society information in a multi-user domain. The agents use the similarity information on the us...
In order to claim fully general intelligence in an autonomous agent, the ability to learn is one of the most central capabilities. Classical machine learning techniques have had ma...
Autonomous agents that learn about their environment can be divided into two broad classes. One class of existing learners, reinforcement learners, typically employ weak learning ...
Recommender systems are widely used to cope with the problem of information overload and, consequently, many recommendation methods have been developed. However, no one technique i...