Sciweavers

437 search results - page 24 / 88
» Policy Gradient Critics
AAAI
2007
13 years 11 months ago
Temporal Difference and Policy Search Methods for Reinforcement Learning: An Empirical Comparison
Reinforcement learning (RL) methods have become popular in recent years because of their ability to solve complex tasks with minimal feedback. Both genetic algorithms (GAs) and te...
Matthew E. Taylor, Shimon Whiteson, Peter Stone
TVLSI
2008
13 years 8 months ago
Static and Dynamic Temperature-Aware Scheduling for Multiprocessor SoCs
Thermal hot spots and high temperature gradients degrade reliability and performance, and increase cooling costs and leakage power. In this paper, we explore the benefits of temper...
Ayse Kivilcim Coskun, T. T. Rosing, Keith Whisnant...
CHIMIT
2008
ACM
13 years 10 months ago
Policy-based IT automation: the role of human judgment
Policy-based automation is emerging as a viable approach to IT systems management, codifying high-level business goals into executable specifications for governing IT operations. ...
Eser Kandogan, John H. Bailey, Paul P. Maglio, Ebe...
TON
2010
13 years 3 months ago
Throughput Optimal Distributed Power Control of Stochastic Wireless Networks
The Maximum Differential Backlog (MDB) control policy of Tassiulas and Ephremides has been shown to adaptively maximize the stable throughput of multihop wireless networks with ran...
Yufang Xi, Edmund M. Yeh
ICRA
2010
IEEE
13 years 7 months ago
A simple learning strategy for high-speed quadrocopter multi-flips
We describe a simple and intuitive policy gradient method for improving parametrized quadrocopter multi-flips by combining iterative experiments with information from a first...
Sergei Lupashin, Angela Schöllig, Michael She...