Sciweavers

» Algorithms and Bounds for Rollout Sampling Approximate Polic...
NIPS
1998
13 years 8 months ago
Finite-Sample Convergence Rates for Q-Learning and Indirect Algorithms
In this paper, we address two issues of long-standing interest in the reinforcement learning literature. First, what kinds of performance guarantees can be made for Q-learning aft...
Michael J. Kearns, Satinder P. Singh
NIPS
2008
13 years 8 months ago
Regularized Policy Iteration
In this paper we consider approximate policy-iteration-based reinforcement learning algorithms. In order to implement a flexible function approximation scheme we propose the use o...
Amir Massoud Farahmand, Mohammad Ghavamzadeh, Csab...
AAAI
2012
11 years 9 months ago
Generalized Sampling and Variance in Counterfactual Regret Minimization
In large extensive form games with imperfect information, Counterfactual Regret Minimization (CFR) is a popular, iterative algorithm for computing approximate Nash equilibria. Whi...
Richard G. Gibson, Marc Lanctot, Neil Burch, Duane...
CDC
2008
IEEE
14 years 1 months ago
Approximate dynamic programming using support vector regression
This paper presents a new approximate policy iteration algorithm based on support vector regression (SVR). It provides an overview of commonly used cost approximation architect...
Brett Bethke, Jonathan P. How, Asuman E. Ozdaglar
CORR
2008
Springer
13 years 7 months ago
CoSaMP: Iterative signal recovery from incomplete and inaccurate samples
Compressive sampling offers a new paradigm for acquiring signals that are compressible with respect to an orthonormal basis. The major algorithmic challenge in compressive sampling...
Joel A. Tropp, Deanna Needell