Sciweavers

95 search results - page 14 / 19
» Policy Gradients for Cryptanalysis
Sort
View
TIT
2010
115views Education» more  TIT 2010»
13 years 2 months ago
On resource allocation in fading multiple-access channels-an efficient approximate projection approach
We consider the problem of rate and power allocation in a multiple-access channel. Our objective is to obtain rate and power allocation policies that maximize a general concave ut...
Ali ParandehGheibi, Atilla Eryilmaz, Asuman E. Ozd...
ICML
2001
IEEE
14 years 8 months ago
Off-Policy Temporal Difference Learning with Function Approximation
We introduce the first algorithm for off-policy temporal-difference learning that is stable with linear function approximation. Off-policy learning is of interest because it forms...
Doina Precup, Richard S. Sutton, Sanjoy Dasgupta
AAAI
2000
13 years 9 months ago
Localizing Search in Reinforcement Learning
Reinforcement learning (RL) can be impractical for many high dimensional problems because of the computational cost of doing stochastic search in large state spaces. We propose a ...
Gregory Z. Grudic, Lyle H. Ungar
ICRA
2010
IEEE
145views Robotics» more  ICRA 2010»
13 years 6 months ago
Reinforcement learning of motor skills in high dimensions: A path integral approach
— Reinforcement learning (RL) is one of the most general approaches to learning control. Its applicability to complex motor systems, however, has been largely impossible so far d...
Evangelos Theodorou, Jonas Buchli, Stefan Schaal
TVLSI
2008
107views more  TVLSI 2008»
13 years 7 months ago
Static and Dynamic Temperature-Aware Scheduling for Multiprocessor SoCs
Thermal hot spots and high temperature gradients degrade reliability and performance, and increase cooling costs and leakage power. In this paper, we explore the benefits of temper...
Ayse Kivilcim Coskun, T. T. Rosing, Keith Whisnant...