Sciweavers

64 search results - page 9 / 13
» Multi-Agent Learning with Policy Prediction
ICML
1999
IEEE
14 years 11 months ago
Least-Squares Temporal Difference Learning
Excerpted from: Boyan, Justin. Learning Evaluation Functions for Global Optimization. Ph.D. thesis, Carnegie Mellon University, August 1998. (Available as Technical Report CMU-CS-...
Justin A. Boyan
ICML
2010
IEEE
13 years 12 months ago
Asymptotic Analysis of Generative Semi-Supervised Learning
Semi-supervised learning has emerged as a popular framework for improving modeling accuracy while controlling labeling cost. Based on an extension of stochastic composite likeliho...
Joshua Dillon, Krishnakumar Balasubramanian, Guy L...
MICRO
2009
IEEE
14 years 5 months ago
Pseudo-LIFO: the foundation of a new family of replacement policies for last-level caches
Cache blocks often exhibit a small number of uses during their lifetime in the last-level cache. Past research has exploited this property in two different ways. First, replacem...
Mainak Chaudhuri
ICML
2008
IEEE
14 years 11 months ago
A worst-case comparison between temporal difference and residual gradient with linear function approximation
Residual gradient (RG) was proposed as an alternative to TD(0) for policy evaluation when function approximation is used, but there exists little formal analysis comparing them ex...
Lihong Li
HPCA
2008
IEEE
14 years 5 months ago
Prediction of CPU idle-busy activity pattern
Real-world workloads rarely saturate multi-core processors. CPU C-states can be used to reduce power consumption during processor idle time. The key unsolved problem is: when and h...
Qian Diao, Justin J. Song