Search Sciweavers | Sciweavers

1234 search results - page 207 / 247

» Multi-criteria Reinforcement Learning

NIPS
2008

Signal-to-Noise Ratio Analysis of Policy Gradient Algorithms

Download groups.csail.mit.edu

Policy gradient (PG) reinforcement learning algorithms have strong (local) convergence guarantees, but their learning performance is typically limited by a large variance in the e...

John W. Roberts, Russ Tedrake

ICML
2005
IEEE

Combining model-based and instance-based learning for first order regression

Download www.cs.kuleuven.ac.be

(Extended abstract) Kurt Driessens (a), Saso Dzeroski (b). (a) Department of Computer Science, University of Waikato, Hamilton, New Zealand (kurtd@waikato.ac.nz); (b) Departm...

Kurt Driessens, Saso Dzeroski

HICSS
2003
IEEE

Modeling Instrumental Conditioning - The Behavioral Regulation Approach

Download www.hicss.hawaii.edu

Basically, instrumental conditioning is learning through consequences: Behavior that produces positive results (high “instrumental response”) is reinforced, and that which pro...

Jose J. Gonzalez, Agata Sawicka

ICML
2008
IEEE

Sample-based learning and search with permanent and transient memories

Download www.cs.ualberta.ca

We present a reinforcement learning architecture, Dyna-2, that encompasses both sample-based learning and sample-based search, and that generalises across states during both learni...

David Silver, Martin Müller, Richard S. ...

ATAL
2008
Springer

Adaptive Kanerva-based function approximation for multi-agent systems

Download www.aamas-conference.org

In this paper, we show how adaptive prototype optimization can be used to improve the performance of function approximation based on Kanerva Coding when solving large-scale instanc...

Cheng Wu, Waleed Meleis
