Sciweavers

» Policy Gradient Critics
DASFAA
2010
IEEE
13 years 9 months ago
Efficient Database-Driven Evaluation of Security Clearance for Federated Access Control of Dynamic XML Documents
Achieving data security over cooperating web services is becoming a reality, but existing XML access control architectures do not consider this federated service computing. In this...
Erwin Leonardi, Sourav S. Bhowmick, Mizuho Iwaihar...
DAC
2008
ACM
14 years 9 months ago
Temperature management in multiprocessor SoCs using online learning
In deep submicron circuits, thermal hot spots and high temperature gradients increase cooling costs and degrade reliability and performance. In this paper, we propose a low-co...
Ayse Kivilcim Coskun, Tajana Simunic Rosing, Kenny...
GECCO
2009
Springer
13 years 6 months ago
Uncertainty handling CMA-ES for reinforcement learning
The covariance matrix adaptation evolution strategy (CMA-ES) has proven to be a powerful method for reinforcement learning (RL). Recently, the CMA-ES has been augmented with an ada...
Verena Heidrich-Meisner, Christian Igel
CDC
2010
IEEE
13 years 3 months ago
Pathologies of temporal difference methods in approximate dynamic programming
Approximate policy iteration methods based on temporal differences are popular in practice, and have been tested extensively, dating to the early nineties, but the associated conve...
Dimitri P. Bertsekas
PKDD
2009
Springer
14 years 3 months ago
Active Learning for Reward Estimation in Inverse Reinforcement Learning
Inverse reinforcement learning addresses the general problem of recovering a reward function from samples of a policy provided by an expert/demonstrator. In this paper, w...
Manuel Lopes, Francisco S. Melo, Luis Montesano