ICML 2008 | Sciweavers


ICML
2008
IEEE


Strategy evaluation in extensive games with importance sampling

16 years 7 months ago

Typically agent evaluation is done through Monte Carlo estimation. However, stochastic agent decisions and stochastic outcomes can make this approach inefficient, requiring many s...

Michael H. Bowling, Michael Johanson, Neil Burch, ...


Automatic discovery and transfer of MAXQ hierarchies

16 years 7 months ago

Download pages.cs.wisc.edu

We present an algorithm, HI-MAT (Hierarchy Induction via Models And Trajectories), that discovers MAXQ task hierarchies by applying dynamic Bayesian network models to a successful...

Neville Mehta, Soumya Ray, Prasad Tadepalli, Thoma...


Tailoring density estimation via reproducing kernel moment matching

16 years 7 months ago

Download eprints.pascal-network.org

Moment matching is a popular means of parametric density estimation. We extend this technique to nonparametric estimation of mixture models. Our approach works by embedding distri...

Alex J. Smola, Arthur Gretton, Bernhard Schöl...


Beam sampling for the infinite hidden Markov model

16 years 7 months ago

Download mlg.eng.cam.ac.uk

The infinite hidden Markov model is a nonparametric extension of the widely used hidden Markov model. Our paper introduces a new inference algorithm for the infinite Hidden Markov...

Jurgen Van Gael, Yunus Saatci, Yee Whye Teh, Zoubi...


Learning all optimal policies with multiple criteria

16 years 7 months ago

Download leon.barrettnexus.com

We describe an algorithm for learning in the presence of multiple criteria. Our technique generalizes previous approaches in that it can learn optimal policies for all linear pref...

Leon Barrett, Srini Narayanan


A worst-case comparison between temporal difference and residual gradient with linear function approximation

16 years 7 months ago

Download www.research.rutgers.edu

Residual gradient (RG) was proposed as an alternative to TD(0) for policy evaluation when function approximation is used, but there exists little formal analysis comparing them ex...

Lihong Li
