Sciweavers

ICML
2009
IEEE
15 years 10 days ago
Predictive representations for policy gradient in POMDPs
We consider the problem of estimating the policy gradient in Partially Observable Markov Decision Processes (POMDPs) with a special class of policies that are based on Predictive ...
Abdeslam Boularias, Brahim Chaib-draa
ICML
2009
IEEE
15 years 10 days ago
BoltzRank: learning to maximize expected ranking gain
Ranking a set of retrieved documents according to their relevance to a query is a popular problem in information retrieval. Methods that learn ranking functions are difficult to o...
Maksims Volkovs, Richard S. Zemel
ICML
2009
IEEE
15 years 10 days ago
SimpleNPKL: simple non-parametric kernel learning
Previous studies of Non-Parametric Kernel (NPK) learning usually reduce to solving some Semi-Definite Programming (SDP) problem by a standard SDP solver. However, time complexity ...
Jinfeng Zhuang, Ivor W. Tsang, Steven C. H. Hoi
ICML
2009
IEEE
15 years 10 days ago
Monte-Carlo simulation balancing
In this paper we introduce the first algorithms for efficiently learning a simulation policy for Monte-Carlo search. Our main idea is to optimise the balance of a simulation polic...
David Silver, Gerald Tesauro
ICML
2009
IEEE
15 years 10 days ago
Constraint relaxation in approximate linear programs
Approximate Linear Programming (ALP) is a reinforcement learning technique with nice theoretical properties, but it often performs poorly in practice. We identify some reasons for...
Marek Petrik, Shlomo Zilberstein
ICML
2009
IEEE
15 years 10 days ago
Multi-class image segmentation using conditional random fields and global classification
A key aspect of semantic image segmentation is to integrate local and global features for the prediction of local segment labels. We present an approach to multi-class segmentatio...
Nils Plath, Marc Toussaint, Shinichi Nakajima
ICML
2009
IEEE
15 years 10 days ago
A stochastic memoizer for sequence data
We propose an unbounded-depth, hierarchical, Bayesian nonparametric model for discrete sequence data. This model can be estimated from a single training sequence, yet shares stati...
Frank Wood, Cédric Archambeau, Jan Gasthaus...
ICML
2009
IEEE
15 years 10 days ago
Unsupervised hierarchical modeling of locomotion styles
This paper describes an unsupervised learning technique for modeling human locomotion styles, such as distinct related activities (e.g. running and striding) or variations of the ...
Wei Pan, Lorenzo Torresani
ICML
2009
IEEE
15 years 10 days ago
Matrix updates for perceptron training of continuous density hidden Markov models
In this paper, we investigate a simple, mistake-driven learning algorithm for discriminative training of continuous density hidden Markov models (CD-HMMs). Most CD-HMMs for automat...
Chih-Chieh Cheng, Fei Sha, Lawrence K. Saul
ICML
2009
IEEE
15 years 10 days ago
Learning structurally consistent undirected probabilistic graphical models
In many real-world domains, undirected graphical models such as Markov random fields provide a more natural representation of the dependency structure than directed graphical mode...
Sushmita Roy, Terran Lane, Margaret Werner-Washbur...