Sciweavers

ICML
2006
IEEE
Quadratic programming relaxations for metric labeling and Markov random field MAP estimation
Quadratic program relaxations are proposed as an alternative to linear program relaxations and tree reweighted belief propagation for the metric labeling or MAP estimation problem...
Pradeep D. Ravikumar, John D. Lafferty
ICML
2006
IEEE
PAC model-free reinforcement learning
For a Markov Decision Process with finite state (size S) and action spaces (size A per state), we propose a new algorithm, Delayed Q-Learning. We prove it is PAC, achieving near o...
Alexander L. Strehl, Lihong Li, Eric Wiewiora, Joh...
ICML
2006
IEEE
Experience-efficient learning in associative bandit problems
We formalize the associative bandit problem framework introduced by Kaelbling as a learning-theory problem. The learning environment is modeled as a k-armed bandit where arm payof...
Alexander L. Strehl, Chris Mesterharm, Michael L. ...
ICML
2006
IEEE
Statistical debugging: simultaneous identification of multiple bugs
We describe a statistical approach to software debugging in the presence of multiple bugs. Due to sparse sampling issues and complex interaction between program predicates, many g...
Alice X. Zheng, Michael I. Jordan, Ben Liblit, May...
ICML
2006
IEEE
Discriminative unsupervised learning of structured predictors
We present a new unsupervised algorithm for training structured predictors that is discriminative, convex, and avoids the use of EM. The idea is to formulate an unsupervised versi...
Linli Xu, Dana F. Wilkinson, Finnegan Southey, Dal...