Search Sciweavers | Sciweavers

473 search results - page 52 / 95

» Optimal policy switching algorithms for reinforcement learni...

ECAI
2006
Springer

245 views, Artificial Intelligence

Least Squares SVM for Least Squares TD Learning

15 years 9 months ago

Download homepages.feis.herts.ac.uk

Abstract. We formulate the problem of least squares temporal difference learning (LSTD) in the framework of least squares SVM (LS-SVM). To cope with the large amount (and possible ...

Tobias Jung, Daniel Polani

ESANN
2007

148 views, Neural Networks

Applying the Episodic Natural Actor-Critic Architecture to Motor Primitive Learning

15 years 7 months ago

Download www.dice.ucl.ac.be

In this paper, we investigate motor primitive learning with the Natural Actor-Critic approach. The Natural Actor-Critic consists of actor updates which are achieved using natur...

Jan Peters, Stefan Schaal

ECML
2006
Springer

116 views, Machine Learning

Scaling Model-Based Average-Reward Reinforcement Learning for Product Delivery

15 years 9 months ago

Download web.engr.oregonstate.edu

Reinforcement learning in real-world domains suffers from three curses of dimensionality: explosions in state and action spaces, and high stochasticity. We present approaches that ...

Scott Proper, Prasad Tadepalli

ML
2002
ACM

133 views, Machine Learning

Finite-time Analysis of the Multiarmed Bandit Problem

15 years 5 months ago

Download homes.dsi.unimi.it

Reinforcement learning policies face the exploration versus exploitation dilemma, i.e. the search for a balance between exploring the environment to find profitable actions while t...

Peter Auer, Nicolò Cesa-Bianchi, Paul Fisch...

CORR
2007
Springer

73 views, Education

Universal Reinforcement Learning

15 years 6 months ago

Download www.stanford.edu

We consider an agent interacting with an unmodeled environment. At each time, the agent makes an observation, takes an action, and incurs a cost. Its actions can influence futu...

Vivek F. Farias, Ciamac Cyrus Moallemi, Tsachy Wei...
