Sciweavers

» A Better Update Policy
ICML
2002
IEEE
14 years 9 months ago
Hierarchically Optimal Average Reward Reinforcement Learning
Two notions of optimality have been explored in previous work on hierarchical reinforcement learning (HRL): hierarchical optimality, or the optimal policy in the space defined by ...
Mohammad Ghavamzadeh, Sridhar Mahadevan
DATE
2007
IEEE
14 years 3 months ago
Temperature aware task scheduling in MPSoCs
In deep submicron circuits, elevation in temperatures has brought new challenges in reliability, timing, performance, cooling costs and leakage power. Conventional thermal managem...
Ayse Kivilcim Coskun, Tajana Simunic Rosing, Keith...
IASTEDSE
2004
13 years 10 months ago
An authorization and access control scheme for pervasive computing
The existence of a central security authority is too restrictive for pervasive computing environments. Existing distributed security schemes fail in a pervasive computing environm...
Linda Staffans, Titos Saridakis
GECCO
2009
Springer
13 years 6 months ago
Uncertainty handling CMA-ES for reinforcement learning
The covariance matrix adaptation evolution strategy (CMA-ES) has proven to be a powerful method for reinforcement learning (RL). Recently, the CMA-ES has been augmented with an ada...
Verena Heidrich-Meisner, Christian Igel
CDC
2010
IEEE
13 years 3 months ago
Pathologies of temporal difference methods in approximate dynamic programming
Approximate policy iteration methods based on temporal differences are popular in practice, and have been tested extensively, dating to the early nineties, but the associated conve...
Dimitri P. Bertsekas