Search Sciweavers | Sciweavers

92 search results - page 14 / 19

» A General Convergence Method for Reinforcement Learning in t...

ICML
2001
IEEE

159 views, Machine Learning

Direct Policy Search using Paired Statistical Tests

14 years 7 months ago

Download www.autonlab.org

Direct policy search is a practical way to solve reinforcement learning problems involving continuous state and action spaces. The goal becomes finding policy parameters that maxi...

Malcolm J. A. Strens, Andrew W. Moore

COLT
2005
Springer

113 views, Machine Learning

Ranking and Scoring Using Empirical Risk Minimization

14 years 5 days ago

Download www-connex.lip6.fr

A general model is proposed for studying ranking problems. We investigate learning methods based on empirical minimization of the natural estimates of the ranking risk. The empiric...

Stéphan Clémençon, Gáb...

CEC
2005
IEEE

99 views, Artificial Intelligence

XCS with computed prediction for the learning of Boolean functions

14 years 10 days ago

Download www.eskimo.com

Computed prediction represents a major shift in learning classifier system research. XCS with computed prediction, based on linear approximators, has been applied so far to functi...

Pier Luca Lanzi, Daniele Loiacono, Stewart W. Wils...

ECML
2007
Springer

192 views, Machine Learning

Policy Gradient Critics

14 years 26 days ago

Download www.idsia.ch

We present Policy Gradient Actor-Critic (PGAC), a new model-free Reinforcement Learning (RL) method for creating limited-memory stochastic policies for Partially Observable Markov ...

Daan Wierstra, Jürgen Schmidhuber

ISCC
2003
IEEE

110 views, Communications

Intelligent Agents Serving Based On The Society Information

13 years 12 months ago

Download www3.itu.edu.tr

In this paper, we propose a serving system consisting of intelligent agents processing society information in a multi-user domain. The agents use the similarity information on the us...

Sanem Sariel, B. Tevfik Akgün
