Search Sciweavers | Sciweavers

181 search results - page 33 / 37

» On Policy Learning in Restricted Policy Spaces

CORR
2011
Springer

Online Learning of Rested and Restless Bandits

In this paper we study the online learning problem involving rested and restless multiarmed bandits with multiple plays. The system consists of a single player/user and a set of K...

Cem Tekin, Mingyan Liu

FGR
2006
IEEE

Learning to Identify Facial Expression During Detection Using Markov Decision Process

While there has been a great deal of research in face detection and recognition, there has been very limited work on identifying the expression on a face. Many current face detect...

Ramana Isukapalli, Ahmed M. Elgammal, Russell Grei...

NIPS
1996

Multidimensional Triangulation and Interpolation for Reinforcement Learning

Dynamic Programming, Q-learning and other discrete Markov Decision Process solvers can be applied to continuous d-dimensional state-spaces by quantizing the state space into an arr...

Scott Davies

SDM
2007
SIAM

Bandits for Taxonomies: A Model-based Approach

We consider a novel problem of learning an optimal matching, in an online fashion, between two feature spaces that are organized as taxonomies. We formulate this as a multi-armed ...

Sandeep Pandey, Deepak Agarwal, Deepayan Chakrabar...

UAI
2008

Dyna-Style Planning with Linear Function Approximation and Prioritized Sweeping

We consider the problem of efficiently learning optimal control policies and value functions over large state spaces in an online setting in which estimates must be available afte...

Richard S. Sutton, Csaba Szepesvári, Alborz...
