Search Sciweavers | Sciweavers

92 search results - page 9 / 19

» A General Convergence Method for Reinforcement Learning in t...

UAI
2003

On the Convergence of Bound Optimization Algorithms

Many practitioners who use EM and related algorithms complain that they are sometimes slow. When does this happen, and what can be done about it? In this paper, we study the gener...

Ruslan Salakhutdinov, Sam T. Roweis, Zoubin Ghahra...

PKDD
2010
Springer

Gaussian Processes for Sample Efficient Reinforcement Learning with RMAX-Like Exploration

We present an implementation of model-based online reinforcement learning (RL) for continuous domains with deterministic transitions that is specifically designed to achi...

Tobias Jung, Peter Stone

JMLR
2006

Policy Gradient in Continuous Time

Policy search is a method for approximately solving an optimal control problem by performing a parametric optimization search in a given class of parameterized policies. In order ...

Rémi Munos

NIPS
2008

Policy Search for Motor Primitives in Robotics

Many motor skills in humanoid robotics can be learned using parametrized motor primitives as done in imitation learning. However, most interesting motor learning problems are high...

Jens Kober, Jan Peters

CORR
2006
Springer

Metric State Space Reinforcement Learning for a Vision-Capable Mobile Robot

We address the problem of autonomously learning controllers for vision-capable mobile robots. We extend McCallum's (1995) Nearest-Sequence Memory algorithm to allow for genera...

Viktor Zhumatiy, Faustino J. Gomez, Marcus Hutter,...
