Sciweavers

» Estimation and Approximation Bounds for Gradient-Based Reinf...
ICML
2008
IEEE
14 years 7 months ago
Sample-based learning and search with permanent and transient memories
We present a reinforcement learning architecture, Dyna-2, that encompasses both sample-based learning and sample-based search, and that generalises across states during both learni...
David Silver, Martin Müller, Richard S. ...
ICML
2004
IEEE
14 years 7 months ago
Approximate inference by Markov chains on union spaces
A standard method for approximating averages in probabilistic models is to construct a Markov chain in the product space of the random variables with the desired equilibrium distr...
Max Welling, Michal Rosen-Zvi, Yee Whye Teh
NIPS
1998
13 years 8 months ago
Finite-Sample Convergence Rates for Q-Learning and Indirect Algorithms
In this paper, we address two issues of long-standing interest in the reinforcement learning literature. First, what kinds of performance guarantees can be made for Q-learning aft...
Michael J. Kearns, Satinder P. Singh
SODA
2008
ACM
13 years 8 months ago
Coresets, sparse greedy approximation, and the Frank-Wolfe algorithm
The problem of maximizing a concave function f(x) in a simplex S can be solved approximately by a simple greedy algorithm. For given k, the algorithm can find a point x(k) on a k-...
Kenneth L. Clarkson
TNN
2010
13 years 1 months ago
Simplifying mixture models through function approximation
The finite mixture model is a powerful tool in many statistical learning problems. In this paper, we propose a general, structure-preserving approach to reduce its model complexity, w...
Kai Zhang, James T. Kwok