Sciweavers

121 search results - page 14 / 25
» Toward Off-Policy Learning Control with Function Approximati...
ML
1998
ACM
13 years 8 months ago
Learning from Examples and Membership Queries with Structured Determinations
It is well known that prior knowledge or bias can speed up learning, at least in theory. It has proved difficult to make constructive use of prior knowledge, so that approximately c...
Prasad Tadepalli, Stuart J. Russell
NIPS
2001
13 years 10 months ago
Variance Reduction Techniques for Gradient Estimates in Reinforcement Learning
Policy gradient methods for reinforcement learning avoid some of the undesirable properties of the value function approaches, such as policy degradation (Baxter and Bartlett, 2001...
Evan Greensmith, Peter L. Bartlett, Jonathan Baxte...
ECAI
2010
Springer
13 years 10 months ago
Bayesian Monte Carlo for the Global Optimization of Expensive Functions
In the last decades enormous advances have been made possible for modelling complex (physical) systems by mathematical equations and computer algorithms. To deal with very long run...
Perry Groot, Adriana Birlutiu, Tom Heskes
ICANN
2003
Springer
14 years 2 months ago
Unsupervised Learning of a Kinematic Arm Model
Abstract. An abstract recurrent neural network trained by an unsupervised method is applied to the kinematic control of a robot arm. The network is a novel extension of the Neural ...
Heiko Hoffmann, Ralf Möller
JMLR
2010
13 years 3 months ago
A Convergent Online Single Time Scale Actor Critic Algorithm
Actor-Critic based approaches were among the first to address reinforcement learning in a general setting. Recently, these algorithms have gained renewed interest due to their gen...
Dotan Di Castro, Ron Meir