Sciweavers

147 search results - page 14 / 30
» Policy Gradient in Continuous Time
ISLPED
2005
ACM
14 years 3 months ago
A simple mechanism to adapt leakage-control policies to temperature
Leakage power reduction in cache memories continues to be a critical area of research because of the promise of a significant pay-off. Various techniques have been developed so fa...
Stefanos Kaxiras, Polychronis Xekalakis, Georgios ...
NIPS
2003
13 years 11 months ago
Extending Q-Learning to General Adaptive Multi-Agent Systems
Recent multi-agent extensions of Q-Learning require knowledge of other agents’ payoffs and Q-functions, and assume game-theoretic play at all times by all other agents. This pap...
Gerald Tesauro
WSC
2008
13 years 12 months ago
Supply chain risks analysis by using jump-diffusion model
This paper investigates the effects of demand risk on the performance of a supply chain in a continuous-time setting. The inventory level has been modeled as a jump-diffusion process ...
Xianzhe Chen, Jun Zhang
ESANN
2007
13 years 11 months ago
The Recurrent Control Neural Network
This paper presents our Recurrent Control Neural Network (RCNN), which is a model-based approach for data-efficient modelling and control of reinforcement learning problems in di...
Anton Maximilian Schäfer, Steffen Udluft, Han...
ICASSP
2011
IEEE
13 years 1 months ago
SRF: Matrix completion based on smoothed rank function
In this paper, we address the matrix completion problem and propose a novel algorithm based on a smoothed rank function (SRF) approximation. Among available algorithms like FPCA a...
Hooshang Ghasemi, Mohammadreza Malek-Mohammadi, Mas...