Sciweavers

176 search results - page 9 / 36
» Optimal Sample Selection for Batch-mode Reinforcement Learni...
NIPS
2001
13 years 9 months ago
Variance Reduction Techniques for Gradient Estimates in Reinforcement Learning
Policy gradient methods for reinforcement learning avoid some of the undesirable properties of the value function approaches, such as policy degradation (Baxter and Bartlett, 2001...
Evan Greensmith, Peter L. Bartlett, Jonathan Baxte...
ICML
2009
IEEE
14 years 8 months ago
Robust bounds for classification via selective sampling
We introduce a new algorithm for binary classification in the selective sampling protocol. Our algorithm uses Regularized Least Squares (RLS) as base classifier, and for this reas...
Nicolò Cesa-Bianchi, Claudio Gentile, Franc...
ECIR
2009
Springer
14 years 4 months ago
Active Sampling for Rank Learning via Optimizing the Area under the ROC Curve
Learning ranking functions is crucial for solving many problems, ranging from document retrieval to building recommendation systems based on an individual user’s prefer...
Pinar Donmez, Jaime G. Carbonell
PKDD
2009
Springer
14 years 7 days ago
Boosting Active Learning to Optimality: A Tractable Monte-Carlo, Billiard-Based Algorithm
This paper focuses on Active Learning with a limited number of queries; in application domains such as Numerical Engineering, the size of the training set might be limite...
Philippe Rolet, Michèle Sebag, Olivier Teyt...
ICML
2003
IEEE
14 years 8 months ago
Q-Decomposition for Reinforcement Learning Agents
The paper explores a very simple agent design method called Q-decomposition, wherein a complex agent is built from simpler subagents. Each subagent has its own reward function and...
Stuart J. Russell, Andrew Zimdars