Search Sciweavers | Sciweavers

91 search results - page 16 / 19

» Parameter-exploring policy gradients

DAC
2008
ACM

120 views, Computer Architecture

Temperature management in multiprocessor SoCs using online learning

16 years 7 months ago

Download cseweb.ucsd.edu

In deep submicron circuits, thermal hot spots and high temperature gradients increase the cooling costs, and degrade reliability and performance. In this paper, we propose a low-co...

Ayse Kivilcim Coskun, Tajana Simunic Rosing, Kenny...

CDC
2010
IEEE

136 views, Control Systems

Pathologies of temporal difference methods in approximate dynamic programming

15 years 1 month ago

Download web.mit.edu

Approximate policy iteration methods based on temporal differences are popular in practice, and have been tested extensively, dating to the early nineties, but the associated conve...

Dimitri P. Bertsekas

PKDD
2009
Springer

181 views, Data Mining

Active Learning for Reward Estimation in Inverse Reinforcement Learning

16 years 15 days ago

Download users.isr.ist.utl.pt

Inverse reinforcement learning addresses the general problem of recovering a reward function from samples of a policy provided by an expert/demonstrator. In this paper, w...

Manuel Lopes, Francisco S. Melo, Luis Montesano

ICRA
2008
IEEE

129 views, Robotics

Compliant manipulation for peg-in-hole: Is passive compliance a key to learn contact motion?

16 years 12 days ago

Download groups.csail.mit.edu

We examine the usefulness of passive compliance in a manipulator that learns contact motion. Based on the observation that humans outperform robots at contact motion, we foll...

Seung-kook Yun

NIPS
2003

207 views, Information Technology

Extending Q-Learning to General Adaptive Multi-Agent Systems

15 years 7 months ago

Download books.nips.cc

Recent multi-agent extensions of Q-Learning require knowledge of other agents’ payoffs and Q-functions, and assume game-theoretic play at all times by all other agents. This pap...

Gerald Tesauro
