Sciweavers

ICML
2006
IEEE
15 years 16 days ago
Using inaccurate models in reinforcement learning
In the model-based policy search approach to reinforcement learning (RL), policies are found using a model (or "simulator") of the Markov decision process. However, for ...
Pieter Abbeel, Morgan Quigley, Andrew Y. Ng
ICML
2006
IEEE
15 years 16 days ago
On a theory of learning with similarity functions
Kernel functions have become an extremely popular tool in machine learning, with an attractive theory as well. This theory views a kernel as implicitly mapping data points into a ...
Maria-Florina Balcan, Avrim Blum
ICML
2008
IEEE
15 years 16 days ago
Sample-based learning and search with permanent and transient memories
We present a reinforcement learning architecture, Dyna-2, that encompasses both sample-based learning and sample-based search, and that generalises across states during both learni...
David Silver, Martin Müller, Richard S. ...
ICML
2008
IEEE
15 years 16 days ago
Graph transduction via alternating minimization
Graph transduction methods label input data by learning a classification function that is regularized to exhibit smoothness along a graph over labeled and unlabeled samples. In pr...
Jun Wang, Tony Jebara, Shih-Fu Chang
ICML
2008
IEEE
15 years 16 days ago
On-line discovery of temporal-difference networks
We present an algorithm for on-line, incremental discovery of temporal-difference (TD) networks. The key contribution is the establishment of three criteria to expand a node in TD...
Takaki Makino, Toshihisa Takagi
ICML
2008
IEEE
15 years 16 days ago
Learning to learn implicit queries from gaze patterns
In the absence of explicit queries, an alternative is to try to infer users' interests from implicit feedback signals, such as clickstreams or eye tracking. The interests, fo...
Antti Ajanki, Kai Puolamäki, Samuel Kaski
ICML
2008
IEEE
15 years 16 days ago
On partial optimality in multi-label MRFs
We consider the problem of optimizing multi-label MRFs, which is in general NP-hard and ubiquitous in low-level computer vision. One approach for its solution is to formulate it as...
Pushmeet Kohli, Alexander Shekhovtsov, Carsten Rot...
ICML
2008
IEEE
15 years 16 days ago
Confidence-weighted linear classification
We introduce confidence-weighted linear classifiers, which add parameter confidence information to linear classifiers. Online learners in this setting update both classifier param...
Mark Dredze, Koby Crammer, Fernando Pereira