Sciweavers

95 search results - page 12 / 19
» Policy Gradients for Cryptanalysis
ICML
2010
IEEE
13 years 8 months ago
Toward Off-Policy Learning Control with Function Approximation
We present the first temporal-difference learning algorithm for off-policy control with unrestricted linear function approximation whose per-time-step complexity is linear in the ...
Hamid Reza Maei, Csaba Szepesvári, Shalabh ...
CEC
2011
IEEE
12 years 7 months ago
Stochastic Natural Gradient Descent by estimation of empirical covariances
Stochastic relaxation aims at finding the minimum of a fitness function by identifying a proper sequence of distributions, in a given model, that minimize the expected value o...
Luigi Malagò, Matteo Matteucci, Giovanni Pi...
ESANN
2007
13 years 9 months ago
Applying the Episodic Natural Actor-Critic Architecture to Motor Primitive Learning
In this paper, we investigate motor primitive learning with the Natural Actor-Critic approach. The Natural Actor-Critic consists of actor updates which are achieved using natur...
Jan Peters, Stefan Schaal
PCI
2005
Springer
14 years 1 month ago
TSIC: Thermal Scheduling Simulator for Chip Multiprocessors
Increased power density, hot-spots, and temperature gradients are severe limiting factors for today’s state-of-the-art microprocessors. However, the flexibility offer...
Kyriakos Stavrou, Pedro Trancoso
AAAI
2010
13 years 8 months ago
Bayesian Policy Search for Multi-Agent Role Discovery
Bayesian inference is an appealing approach for leveraging prior knowledge in reinforcement learning (RL). In this paper we describe an algorithm for discovering different classes...
Aaron Wilson, Alan Fern, Prasad Tadepalli