Search Sciweavers | Sciweavers

34 search results - page 5 / 7

» Towards Finite-Sample Convergence of Direct Reinforcement Le...


EUROCAST
2007
Springer


A k-NN Based Perception Scheme for Reinforcement Learning

16 years 1 month ago

Download www.dia.fi.upm.es

Abstract. Reinforcement Learning (RL) is a paradigm of modern Machine Learning (ML) that uses rewards and punishments to guide the learning process. One of the central ideas of RL is learning by “direct-online...

José Antonio Martin H., Javier de Lope Asia...


AGENTS
1999
Springer


Team-Partitioned, Opaque-Transition Reinforcement Learning

15 years 12 months ago

Download www.cs.ucf.edu

In this paper, we present a novel multi-agent learning paradigm called team-partitioned, opaque-transition reinforcement learning (TPOT-RL). TPOT-RL introduces the concept of usin...

Peter Stone, Manuela M. Veloso


PKDD
2009
Springer


Boosting Active Learning to Optimality: A Tractable Monte-Carlo, Billiard-Based Algorithm

16 years 4 days ago

Download www.lri.fr

Abstract. This paper focuses on Active Learning with a limited number of queries; in application domains such as Numerical Engineering, the size of the training set might be limite...

Philippe Rolet, Michèle Sebag, Olivier Teyt...


FBIT
2007
IEEE


Learning to Drive a Real Car in 20 Minutes

16 years 1 month ago

Download www.ni.uos.de

The paper describes our first experiments on Reinforcement Learning to steer a real robot car. The applied method, Neural Fitted Q Iteration (NFQ), is purely data-driven based on ...

Martin Riedmiller, Michael Montemerlo, Hendrik Dah...


COGSR
2011


How groups develop a specialized domain vocabulary: A cognitive multi-agent model

15 years 2 months ago

Download www.david-reitter.com

We simulate the evolution of a domain vocabulary in small communities. Empirical data show that human communicators can evolve graphical languages quickly in a constrained task (P...

David Reitter, Christian Lebiere
