Sciweavers

ICML
2002
IEEE
15 years 15 days ago
Discovering Hierarchy in Reinforcement Learning with HEXQ
An open problem in reinforcement learning is discovering hierarchical structure. HEXQ, an algorithm which automatically attempts to decompose and solve a model-free factored MDP h...
Bernhard Hengst
ICML
2002
IEEE
15 years 15 days ago
Algorithm-Directed Exploration for Model-Based Reinforcement Learning in Factored MDPs
One of the central challenges in reinforcement learning is to balance the exploration/exploitation tradeoff while scaling up to large problems. Although model-based reinforcement ...
Carlos Guestrin, Relu Patrascu, Dale Schuurmans
ICML
2002
IEEE
15 years 15 days ago
Coordinated Reinforcement Learning
We present several new algorithms for multiagent reinforcement learning. A common feature of these algorithms is a parameterized, structured representation of a policy or value fu...
Carlos Guestrin, Michail G. Lagoudakis, Ronald Par...
ICML
2002
IEEE
15 years 15 days ago
Hierarchically Optimal Average Reward Reinforcement Learning
Two notions of optimality have been explored in previous work on hierarchical reinforcement learning (HRL): hierarchical optimality, or the optimal policy in the space defined by ...
Mohammad Ghavamzadeh, Sridhar Mahadevan
ICML
2002
IEEE
15 years 15 days ago
Combining Labeled and Unlabeled Data for MultiClass Text Categorization
Supervised learning techniques for text classification often require a large number of labeled examples to learn accurately. One way to reduce the amount of labeled data required is t...
Rayid Ghani
ICML
2002
IEEE
15 years 15 days ago
Multi-Instance Kernels
Learning from structured data is becoming increasingly important. However, most prior work on kernel methods has focused on learning from attribute-value data. Only recently, rese...
Adam Kowalczyk, Alex J. Smola, Peter A. Flach, Tho...
ICML
2002
IEEE
15 years 15 days ago
On generalization bounds, projection profile, and margin distribution
We study generalization properties of linear learning algorithms and develop a data dependent approach that is used to derive generalization bounds that depend on the margin distr...
Ashutosh Garg, Sariel Har-Peled, Dan Roth
ICML
2002
IEEE
15 years 15 days ago
Univariate Polynomial Inference by Monte Carlo Message Length Approximation
We apply the Message from Monte Carlo (MMC) algorithm to inference of univariate polynomials. MMC is an algorithm for point estimation from a Bayesian posterior sample. It partiti...
Leigh J. Fitzgibbon, David L. Dowe, Lloyd Allison
ICML
2002
IEEE
15 years 15 days ago
Learning Decision Trees Using the Area Under the ROC Curve
ROC analysis is increasingly being recognised as an important tool for evaluation and comparison of classifiers when the operating characteristics (i.e. class distribution and cos...
César Ferri, José Hernández-O...