Search Sciweavers | Sciweavers

437 search results - page 11 / 88

» Policy Gradient Critics

ICML
2009
IEEE

131 views · Machine Learning

Monte-Carlo simulation balancing

16 years 7 months ago

Download www.cs.ualberta.ca

In this paper we introduce the first algorithms for efficiently learning a simulation policy for Monte-Carlo search. Our main idea is to optimise the balance of a simulation polic...

David Silver, Gerald Tesauro

JMLR
2006

143 views

Geometric Variance Reduction in Markov Chains: Application to Value Function and Gradient Estimation

15 years 6 months ago

Download www.aaai.org

We study a sequential variance reduction technique for Monte Carlo estimation of functionals in Markov Chains. The method is based on designing sequential control variates using s...

Rémi Munos

EWRL
2008

148 views · Machine Learning

Policy Learning - A Unified Perspective with Applications in Robotics

15 years 8 months ago

Download www.kyb.tuebingen.mpg.de

Policy Learning approaches are among the best suited methods for high-dimensional, continuous control systems such as anthropomorphic robot arms and humanoid robots. In this paper,...

Jan Peters, Jens Kober, Duy Nguyen-Tuong

MICRO
2005
IEEE

123 views · Hardware

A Criticality Analysis of Clustering in Superscalar Processors

16 years 7 days ago

Download www-faculty.cs.uiuc.edu

Clustered machines partition hardware resources to circumvent the cycle time penalties incurred by large, monolithic structures. This partitioning introduces a long inter-cluster ...

Pierre Salverda, Craig B. Zilles

NIPS
2003

180 views · Information Technology

Bounded Finite State Controllers

15 years 8 months ago

Download books.nips.cc

We describe a new approximation algorithm for solving partially observable MDPs. Our bounded policy iteration approach searches through the space of bounded-size, stochastic fini...

Pascal Poupart, Craig Boutilier
