Sciweavers

» Online Experiments: Lessons Learned
ML
2000
ACM
13 years 7 months ago
Learning to Play Chess Using Temporal Differences
In this paper we present TDLeaf(λ), a variation on the TD(λ) algorithm that enables it to be used in conjunction with game-tree search. We present some experiments in which our che...
Jonathan Baxter, Andrew Tridgell, Lex Weaver
SIGIR
2011
ACM
12 years 10 months ago
Collaborative competitive filtering: learning recommender using context of user choice
While a user’s preference is directly reflected in the interactive choice process between her and the recommender, this wealth of information was not fully exploited for learni...
Shuang-Hong Yang, Bo Long, Alexander J. Smola, Hon...
AAAI
2011
12 years 7 months ago
Fast Newton-CG Method for Batch Learning of Conditional Random Fields
We propose a fast batch learning method for linear-chain Conditional Random Fields (CRFs) based on Newton-CG methods. Newton-CG methods are a variant of Newton's method for high-dime...
Yuta Tsuboi, Yuya Unno, Hisashi Kashima, Naoaki Ok...
ICRA
2009
IEEE
14 years 2 months ago
Imitation learning with generalized task descriptions
In this paper, we present an approach that allows a robot to observe, generalize, and reproduce tasks observed from multiple demonstrations. Motion capture data is recorded in ...
Clemens Eppner, Jürgen Sturm, Maren Bennewitz...
ICRA
2008
IEEE
14 years 2 months ago
Bayesian reinforcement learning in continuous POMDPs with application to robot navigation
We consider the problem of optimal control in continuous and partially observable environments when the parameters of the model are not known exactly. Partially Observable Mark...
Stéphane Ross, Brahim Chaib-draa, Joelle Pi...