Search Sciweavers | Sciweavers

87 search results - page 5 / 18

» Hybrid Least-Squares Algorithms for Approximate Policy Evalu...

IPPS
2005
IEEE

122 views Distributed And Parallel Com...

Scheduling Algorithms for Effective Thread Pairing on Hybrid Multiprocessors

14 years 1 month ago

Download people.cs.vt.edu

With the latest high-end computing nodes combining shared-memory multiprocessing with hardware multithreading, new scheduling policies are necessary for workloads consisting of mu...

Robert L. McGregor, Christos D. Antonopoulos, Dimi...

Publication

334 views

Rollout Sampling Approximate Policy Iteration

14 years 4 months ago

Download www.springerlink.com

Several researchers have recently investigated the connection between reinforcement learning and classification. We are motivated by proposals of approximate policy iteration schem...

Christos Dimitrakakis, Michail G. Lagoudakis

posted by olethros

ICML
2006
IEEE

143 views Machine Learning

Fast direct policy evaluation using multiscale analysis of Markov diffusion processes

14 years 8 months ago

Download www.cs.umass.edu

Policy evaluation is a critical step in the approximate solution of large Markov decision processes (MDPs), typically requiring O(|S|^3) to directly solve the Bellman system of |S...

Mauro Maggioni, Sridhar Mahadevan

NIPS
2008

165 views Information Technology

Regularized Policy Iteration

13 years 9 months ago

Download webdocs.cs.ualberta.ca

In this paper we consider approximate policy-iteration-based reinforcement learning algorithms. In order to implement a flexible function approximation scheme we propose the use o...

Amir Massoud Farahmand, Mohammad Ghavamzadeh, Csab...

ICML
2001
IEEE

185 views Machine Learning

Off-Policy Temporal Difference Learning with Function Approximation

14 years 8 months ago

Download www.cs.ualberta.ca

We introduce the first algorithm for off-policy temporal-difference learning that is stable with linear function approximation. Off-policy learning is of interest because it forms...

Doina Precup, Richard S. Sutton, Sanjoy Dasgupta
