Sciweavers

ICML
2000
IEEE
15 years 15 days ago
Reinforcement Learning in POMDP's via Direct Gradient Ascent
This paper discusses theoretical and experimental aspects of gradient-based approaches to the direct optimization of policy performance in controlled POMDPs. We introduce GPOMDP, a...
Jonathan Baxter, Peter L. Bartlett
ICML
2001
IEEE
15 years 15 days ago
Symmetry in Markov Decision Processes and its Implications for Single Agent and Multiagent Learning
This paper examines the notion of symmetry in Markov decision processes (MDPs). We define symmetry for an MDP and show how it can be exploited for more effective learning in singl...
Martin Zinkevich, Tucker R. Balch
ICML
2001
IEEE
15 years 15 days ago
Obtaining calibrated probability estimates from decision trees and naive Bayesian classifiers
Accurate, well-calibrated estimates of class membership probabilities are needed in many supervised learning applications, in particular when a cost-sensitive decision must be mad...
Bianca Zadrozny, Charles Elkan
ICML
2001
IEEE
15 years 15 days ago
Feature selection for high-dimensional genomic microarray data
We report on the successful application of feature selection methods to a classification problem in molecular biology involving only 72 data points in a 7130 dimensional space. Ou...
Eric P. Xing, Michael I. Jordan, Richard M. Karp
ICML
2001
IEEE
15 years 15 days ago
Constrained K-means Clustering with Background Knowledge
Clustering is traditionally viewed as an unsupervised method for data analysis. However, in some cases information about the problem domain is available in addition to the data in...
Kiri Wagstaff, Claire Cardie, Seth Rogers, Stefan ...
ICML
2001
IEEE
15 years 15 days ago
Learning to Generate Fast Signal Processing Implementations
A single signal processing algorithm can be represented by many mathematically equivalent formulas. However, when these formulas are implemented in code and run on real machines, ...
Bryan Singer, Manuela M. Veloso
ICML
2001
IEEE
15 years 15 days ago
Direct Policy Search using Paired Statistical Tests
Direct policy search is a practical way to solve reinforcement learning problems involving continuous state and action spaces. The goal becomes finding policy parameters that maxi...
Malcolm J. A. Strens, Andrew W. Moore
ICML
2001
IEEE
15 years 15 days ago
Smoothed Bootstrap and Statistical Data Cloning for Classifier Evaluation
This work is concerned with the estimation of a classifier's accuracy. We first review some existing methods for error estimation, focusing on cross-validation and bootstrap,...
Gregory Shakhnarovich, Ran El-Yaniv, Yoram Baram
ICML
2001
IEEE
15 years 15 days ago
Scaling Reinforcement Learning toward RoboCup Soccer
Peter Stone, Richard S. Sutton