Quadratic program relaxations are proposed as an alternative to linear program relaxations and tree reweighted belief propagation for the metric labeling or MAP estimation problem...
For a Markov Decision Process with finite state (size S) and action spaces (size A per state), we propose a new algorithm--Delayed Q-Learning. We prove it is PAC, achieving near o...
Alexander L. Strehl, Lihong Li, Eric Wiewiora, Joh...
We formalize the associative bandit problem framework introduced by Kaelbling as a learning-theory problem. The learning environment is modeled as a k-armed bandit where arm payof...
Alexander L. Strehl, Chris Mesterharm, Michael L. ...
We describe a statistical approach to software debugging in the presence of multiple bugs. Due to sparse sampling issues and complex interaction between program predicates, many g...
Alice X. Zheng, Michael I. Jordan, Ben Liblit, May...
We present a new unsupervised algorithm for training structured predictors that is discriminative, convex, and avoids the use of EM. The idea is to formulate an unsupervised versi...
Linli Xu, Dana F. Wilkinson, Finnegan Southey, Dal...