Search Sciweavers | Sciweavers

50 search results - page 8 / 10

» Nonparametric Return Distribution Approximation for Reinforc...

PKDD
2010
Springer

164 views, Data Mining

Efficient Planning in Large POMDPs through Policy Graph Based Factorized Approximations

15 years 11 days ago

Download users.ics.tkk.fi

Partially observable Markov decision processes (POMDPs) are widely used for planning under uncertainty. In many applications, the huge size of the POMDP state space makes straightf...

Joni Pajarinen, Jaakko Peltonen, Ari Hottinen, Mik...

ICML
2008
IEEE

144 views, Machine Learning

An HDP-HMM for systems with state persistence

16 years 3 months ago

Download www.cs.berkeley.edu

The hierarchical Dirichlet process hidden Markov model (HDP-HMM) is a flexible, nonparametric model which allows state spaces of unknown size to be learned from data. We demonstra...

Emily B. Fox, Erik B. Sudderth, Michael I. Jordan,...

JMLR
2006

135 views

Quantile Regression Forests

15 years 2 months ago

Download jmlr.csail.mit.edu

Random forests were introduced as a machine learning tool in Breiman (2001) and have since proven to be very popular and powerful for high-dimensional regression and classificatio...

Nicolai Meinshausen

ICML
1996
IEEE

162 views, Machine Learning

Learning Evaluation Functions for Large Acyclic Domains

16 years 3 months ago

Download www.ri.cmu.edu

Some of the most successful recent applications of reinforcement learning have used neural networks and the TD algorithm to learn evaluation functions. In this paper, we examine t...

Justin A. Boyan, Andrew W. Moore

NIPS
2001

206 views, Information Technology

Model-Free Least-Squares Policy Iteration

15 years 3 months ago

Download www.cs.duke.edu

We propose a new approach to reinforcement learning which combines least squares function approximation with policy iteration. Our method is model-free and completely off policy. ...

Michail G. Lagoudakis, Ronald Parr
