In this paper we present TDLeaf(λ), a variation on the TD(λ) algorithm that enables it to be used in conjunction with game-tree search. We present some experiments in which our che...
While a user’s preference is directly reflected in the interactive choice process between her and the recommender, this wealth of information was not fully exploited for learni...
Shuang-Hong Yang, Bo Long, Alexander J. Smola, Hon...
We propose a fast batch learning method for linear-chain Conditional Random Fields (CRFs) based on Newton-CG methods. Newton-CG methods are a variant of Newton's method for high-dime...
Yuta Tsuboi, Yuya Unno, Hisashi Kashima, Naoaki Ok...
In this paper, we present an approach that allows a robot to observe, generalize, and reproduce tasks observed from multiple demonstrations. Motion capture data is recorded in ...
We consider the problem of optimal control in continuous and partially observable environments when the parameters of the model are not known exactly. Partially Observable Mark...