Search Sciweavers | Sciweavers

50 search results - page 4 / 10

» Convergence and Divergence in Standard and Averaging Reinfor...

CORR
2007
Springer

73 views, Education

Universal Reinforcement Learning

13 years 7 months ago

Download www.stanford.edu

We consider an agent interacting with an unmodeled environment. At each time, the agent makes an observation, takes an action, and incurs a cost. Its actions can influence futu...

Vivek F. Farias, Ciamac Cyrus Moallemi, Tsachy Wei...

ICML
2000
IEEE

126 views, Machine Learning

Reinforcement Learning in POMDPs via Direct Gradient Ascent

14 years 8 months ago

Download reference.kfupm.edu.sa

This paper discusses theoretical and experimental aspects of gradient-based approaches to the direct optimization of policy performance in controlled POMDPs. We introduce GPOMDP, a...

Jonathan Baxter, Peter L. Bartlett

ICML
2001
IEEE

185 views, Machine Learning

Off-Policy Temporal Difference Learning with Function Approximation

14 years 8 months ago

Download www.cs.ualberta.ca

We introduce the first algorithm for off-policy temporal-difference learning that is stable with linear function approximation. Off-policy learning is of interest because it forms...

Doina Precup, Richard S. Sutton, Sanjoy Dasgupta

JMLR
2010

119 views

A Convergent Online Single Time Scale Actor-Critic Algorithm

13 years 2 months ago

Download jmlr.csail.mit.edu

Actor-critic-based approaches were among the first to address reinforcement learning in a general setting. Recently, these algorithms have gained renewed interest due to their gen...

Dotan Di Castro, Ron Meir

WSC
2008

154 views, Modeling and Simulation

On Step Sizes, Stochastic Shortest Paths, and Survival Probabilities in Reinforcement Learning

13 years 10 months ago

Download www.informs-sim.org

Reinforcement Learning (RL) is a simulation-based technique useful in solving Markov decision processes if their transition probabilities are not easily obtainable or if the probl...

Abhijit Gosavi
