Sciweavers

30 search results - page 3 / 6
» Model-Based Average Reward Reinforcement Learning
ICML
2000
IEEE
14 years 10 months ago
Reinforcement Learning in POMDP's via Direct Gradient Ascent
This paper discusses theoretical and experimental aspects of gradient-based approaches to the direct optimization of policy performance in controlled POMDPs. We introduce GPOMDP, a...
Jonathan Baxter, Peter L. Bartlett
NIPS
2001
13 years 11 months ago
The Steering Approach for Multi-Criteria Reinforcement Learning
We consider the problem of learning to attain multiple goals in a dynamic environment, which is initially unknown. In addition, the environment may contain arbitrarily varying ele...
Shie Mannor, Nahum Shimkin
NECO
2010
13 years 8 months ago
Derivatives of Logarithmic Stationary Distributions for Policy Gradient Reinforcement Learning
Most conventional Policy Gradient Reinforcement Learning (PGRL) algorithms neglect (or do not explicitly make use of) a term in the average reward gradient with respect to the pol...
Tetsuro Morimura, Eiji Uchibe, Junichiro Yoshimoto...
EUROCAST
2007
Springer
14 years 4 months ago
A k-NN Based Perception Scheme for Reinforcement Learning
Reinforcement Learning (RL) is a paradigm of modern Machine Learning (ML) which uses rewards and punishments to guide the learning process. One of the central ideas of RL is learning by “direct-online...
José Antonio Martin H., Javier de Lope Asia...
WSC
2008
14 years 3 days ago
On step sizes, stochastic shortest paths, and survival probabilities in Reinforcement Learning
Reinforcement Learning (RL) is a simulation-based technique useful in solving Markov decision processes if their transition probabilities are not easily obtainable or if the probl...
Abhijit Gosavi