Search Sciweavers | Sciweavers

58 search results - page 8 / 12

» A Dynamic Allocation Method of Basis Functions in Reinforcem...

EUROCAST
2007
Springer

182 views, Hardware

A k-NN Based Perception Scheme for Reinforcement Learning

15 years 9 months ago

Download www.dia.fi.upm.es

Abstract: Reinforcement Learning (RL) is a paradigm of modern Machine Learning (ML) which uses rewards and punishments to guide the learning process. One of the central ideas of RL is learning by “direct-online...

José Antonio Martin H., Javier de Lope Asia...

CDC
2010
IEEE

160 views, Control Systems

Adaptive bases for Q-learning

14 years 10 months ago

Download webee.technion.ac.il

Abstract: We consider reinforcement learning, and in particular, the Q-learning algorithm in large state and action spaces. In order to cope with the size of the spaces, a functio...

Dotan Di Castro, Shie Mannor

ECML
2005
Springer

193 views, Machine Learning

Natural Actor-Critic

15 years 8 months ago

Download www-clmc.usc.edu

This paper investigates a novel model-free reinforcement learning architecture, the Natural Actor-Critic. The actor updates are based on stochastic policy gradients employing Amari...

Jan Peters, Sethu Vijayakumar, Stefan Schaal

TMI
2008

138 views

Dynamic Positron Emission Tomography Data-Driven Analysis Using Sparse Bayesian Learning

15 years 2 months ago

Download ntur.lib.ntu.edu.tw

A method is presented for the analysis of dynamic positron emission tomography (PET) data using sparse Bayesian learning. Parameters are estimated in a compartmental framework usin...

Jyh-Ying Peng, John A. D. Aston, R. N. Gunn, Cheng...

Publication

222 views

Algorithms and Bounds for Rollout Sampling Approximate Policy Iteration

16 years 1 day ago

Download arxiv.org

Abstract: Several approximate policy iteration schemes without value functions, which focus on policy representation using classifiers and address policy learning as a supervis...

Christos Dimitrakakis, Michail G. Lagoudakis

posted by olethros
