Sciweavers

1236 search results - page 186 / 248
» Opposition-Based Reinforcement Learning
Sort
View
KDD
2010
ACM
289views Data Mining» more  KDD 2010»
13 years 7 months ago
Exploitation and exploration in a performance based contextual advertising system
The dynamic marketplace in online advertising calls for ranking systems that are optimized to consistently promote and capitalize better performing ads. The streaming nature of on...
Wei Li 0010, Xuerui Wang, Ruofei Zhang, Ying Cui, ...
EMNLP
2011
12 years 9 months ago
Watermarking the Outputs of Structured Prediction with an application in Statistical Machine Translation
We propose a general method to watermark and probabilistically identify the structured outputs of machine learning algorithms. Our method is robust to local editing operations and...
Ashish Venugopal, Jakob Uszkoreit, David Talbot, F...
ICML
2008
IEEE
14 years 10 months ago
Automatic discovery and transfer of MAXQ hierarchies
We present an algorithm, HI-MAT (Hierarchy Induction via Models And Trajectories), that discovers MAXQ task hierarchies by applying dynamic Bayesian network models to a successful...
Neville Mehta, Soumya Ray, Prasad Tadepalli, Thoma...
ICML
2001
IEEE
14 years 10 months ago
Expectation Maximization for Weakly Labeled Data
We call data weakly labeled if it has no exact label but rather a numerical indication of correctness of the label "guessed" by the learning algorithm - a situation comm...
Yuri A. Ivanov, Bruce Blumberg, Alex Pentland
ATAL
2005
Springer
14 years 3 months ago
An integrated framework for adaptive reasoning about conversation patterns
We present an integrated approach for reasoning about and learning conversation patterns in multiagent communication. The approach is based on the assumption that information abou...
Michael Rovatsos, Felix A. Fischer, Gerhard Wei&sz...