Sciweavers

113 search results - page 18 / 23
» Learning Representation and Control in Continuous Markov Dec...
ICML 2007 (IEEE)
Learning state-action basis functions for hierarchical MDPs
This paper introduces a new approach to action-value function approximation by learning basis functions from a spectral decomposition of the state-action manifold. This paper exten...
Sarah Osentoski, Sridhar Mahadevan
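The spectral idea behind this work can be sketched in a few lines: build a graph over states, take its Laplacian, and use the smoothest eigenvectors as basis functions for value-function approximation. The chain graph and all constants below are illustrative stand-ins, not the authors' state-action construction:

```python
import numpy as np

# Illustrative sketch: derive basis functions from the graph Laplacian
# of a small chain of states (proto-value-function style). The chain
# graph stands in for the state-action manifold used in the paper.
n = 10
A = np.zeros((n, n))
for i in range(n - 1):          # undirected chain: i <-> i+1
    A[i, i + 1] = A[i + 1, i] = 1.0

D = np.diag(A.sum(axis=1))
L = D - A                       # combinatorial graph Laplacian

# Eigenvectors of L with the smallest eigenvalues vary smoothly over
# the graph, so they make natural basis functions for approximation.
eigvals, eigvecs = np.linalg.eigh(L)    # eigenvalues in ascending order
basis = eigvecs[:, :4]                  # keep the 4 smoothest functions
print(basis.shape)                      # (10, 4)
```

A learned value function is then fit as a linear combination of these columns; the first eigenvector is constant over a connected graph, matching the intuition that the smoothest possible function is flat.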
JMLR 2010
Learning the Structure of Deep Sparse Graphical Models
Deep belief networks are a powerful way to model complex probability distributions. However, it is difficult to learn the structure of a belief network, particularly one with hidd...
Ryan Prescott Adams, Hanna M. Wallach, Zoubin Ghah...
NIPS 2001
Variance Reduction Techniques for Gradient Estimates in Reinforcement Learning
Policy gradient methods for reinforcement learning avoid some of the undesirable properties of the value function approaches, such as policy degradation (Baxter and Bartlett, 2001...
Evan Greensmith, Peter L. Bartlett, Jonathan Baxte...
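The core variance-reduction device studied in this line of work, subtracting a baseline from the reward inside a score-function gradient estimate, can be sketched on a toy 2-armed bandit. The bandit, the baseline value, and all constants are illustrative, not the paper's constructions:

```python
import numpy as np

# Minimal sketch of baseline variance reduction for a REINFORCE-style
# gradient estimate on a 2-armed bandit with a sigmoid policy.
rng = np.random.default_rng(0)
theta = 0.0                               # logit of P(action = 1)

def grad_estimates(baseline, n=5000):
    p = 1.0 / (1.0 + np.exp(-theta))      # sigmoid policy
    a = rng.random(n) < p                 # sampled actions
    r = np.where(a, 1.0, 0.0) + rng.normal(0, 1, n)  # noisy rewards
    score = np.where(a, 1.0 - p, -p)      # d/dtheta log pi(a)
    return score * (r - baseline)         # per-sample gradient samples

g_no_base = grad_estimates(baseline=0.0)
g_base = grad_estimates(baseline=0.5)     # baseline near mean reward
print(g_no_base.var(), g_base.var())      # baseline lowers variance
```

Because the baseline is independent of the action, the estimator stays unbiased (both sample means agree) while its variance drops, which is exactly the trade-off these techniques exploit.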
TSMC 2011
Cross-Entropy Optimization of Control Policies With Adaptive Basis Functions
This paper introduces an algorithm for direct search of control policies in continuous-state, discrete-action Markov decision processes. The algorithm looks for the best closed-l...
Lucian Busoniu, Damien Ernst, Bart De Schutter, Ro...
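The cross-entropy optimization loop at the heart of such direct policy search can be sketched generically: sample candidate parameter vectors from a Gaussian, score them, refit the Gaussian to the elite fraction, and repeat. The quadratic `evaluate` function below is a stand-in for a policy's return, and all constants are illustrative:

```python
import numpy as np

# Minimal cross-entropy method (CEM) sketch for direct policy search.
# `evaluate` stands in for the return of a parameterized policy.
rng = np.random.default_rng(1)
target = np.array([0.5, -1.0])              # unknown best parameters

def evaluate(params):
    return -np.sum((params - target) ** 2)  # higher is better

mean, std = np.zeros(2), np.ones(2)
for _ in range(30):
    samples = rng.normal(mean, std, size=(50, 2))  # candidate policies
    scores = np.array([evaluate(s) for s in samples])
    elite = samples[np.argsort(scores)[-10:]]      # top 20% of samples
    mean, std = elite.mean(axis=0), elite.std(axis=0) + 1e-3
print(mean)                                 # converges toward target
```

The small constant added to `std` keeps the sampling distribution from collapsing prematurely, a common practical safeguard in CEM implementations.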
ATAL 2009 (Springer)
SarsaLandmark: an algorithm for learning in POMDPs with landmarks
Reinforcement learning algorithms that use eligibility traces, such as Sarsa(λ), have been empirically shown to be effective in learning good estimated-state-based policies in pa...
Michael R. James, Satinder P. Singh
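The Sarsa(λ) machinery this paper builds on can be sketched in tabular form on a fully observable chain, leaving out the POMDP and landmark aspects entirely; the environment, constants, and tie-breaking rule below are illustrative:

```python
import numpy as np

# Tabular Sarsa(lambda) with accumulating eligibility traces on a
# small deterministic chain; reward 1 for reaching the right end.
n_states, n_actions = 5, 2      # actions: 0 = left, 1 = right
gamma, alpha, lam, eps = 0.9, 0.1, 0.8, 0.1
Q = np.zeros((n_states, n_actions))
rng = np.random.default_rng(2)

def step(s, a):
    s2 = min(s + 1, n_states - 1) if a == 1 else max(s - 1, 0)
    return s2, float(s2 == n_states - 1), s2 == n_states - 1

def policy(s):                  # epsilon-greedy with random tie-breaks
    if rng.random() < eps:
        return int(rng.integers(n_actions))
    best = np.flatnonzero(Q[s] == Q[s].max())
    return int(rng.choice(best))

for _ in range(200):
    e = np.zeros_like(Q)        # eligibility traces, reset per episode
    s, a, done = 0, policy(0), False
    while not done:
        s2, r, done = step(s, a)
        a2 = policy(s2)
        delta = r + gamma * Q[s2, a2] * (not done) - Q[s, a]
        e[s, a] += 1.0          # accumulating trace for current pair
        Q += alpha * delta * e  # update every pair by its trace
        e *= gamma * lam        # decay all traces
        s, a = s2, a2

print(np.round(Q[:, 1], 2))     # right-action values grow toward goal
```

The trace decay `gamma * lam` is what spreads each temporal-difference error back along the recently visited state-action pairs, the property that makes Sarsa(λ) effective even under the partial observability studied here.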