Search Sciweavers | Sciweavers

10 search results - page 2 / 2

» Near-optimal Regret Bounds for Reinforcement Learning

click to vote

ICML
2006
IEEE

131views Machine Learning» more ICML 2006»

PAC model-free reinforcement learning

14 years 9 months ago

Download cseweb.ucsd.edu

For a Markov Decision Process with finite state (size S) and action spaces (size A per state), we propose a new algorithm--Delayed Q-Learning. We prove it is PAC, achieving near o...

Alexander L. Strehl, Lihong Li, Eric Wiewiora, Joh...

claim paper

Read More »

click to vote

ML
2002
ACM

133views Machine Learning» more ML 2002»

Finite-time Analysis of the Multiarmed Bandit Problem

13 years 8 months ago

Download homes.dsi.unimi.it

Reinforcement learning policies face the exploration versus exploitation dilemma, i.e. the search for a balance between exploring the environment to find profitable actions while t...

Peter Auer, Nicolò Cesa-Bianchi, Paul Fisch...

claim paper

Read More »

click to vote

ATAL
2006
Springer

192views Intelligent Agents» more ATAL 2006»

A hierarchical approach to efficient reinforcement learning in deterministic domains

14 years 8 days ago

Download paul.rutgers.edu

Factored representations, model-based learning, and hierarchies are well-studied techniques for improving the learning efficiency of reinforcement-learning algorithms in large-sca...

Carlos Diuk, Alexander L. Strehl, Michael L. Littm...

claim paper

Read More »

click to vote

JMLR
2012

200views Programming Languages» more JMLR 2012»

Contextual Bandit Learning with Predictable Rewards

11 years 11 months ago

Download www.cs.princeton.edu

Contextual bandit learning is a reinforcement learning problem where the learner repeatedly receives a set of features (context), takes an action and receives a reward based on th...

Alekh Agarwal, Miroslav Dudík, Satyen Kale,...

claim paper

Read More »

click to vote

CIMCA
2008
IEEE

125views Intelligent Agents» more CIMCA 2008»

Tree Exploration for Bayesian RL Exploration

14 years 3 months ago

Download arxiv.org

Research in reinforcement learning has produced algorithms for optimal decision making under uncertainty that fall within two main types. The ﬁrst employs a Bayesian framework, ...

Christos Dimitrakakis

posted by olethros

Read More »

« Prev « First page 2 / 2 Last » Next »

Sciweavers

Explore & Download

Productivity Tools

Document Tools

Image Tools

Sciweavers