Sciweavers

1166 search results - page 173 / 234
» Negotiating Using Rewards
SIGMETRICS
2002
ACM
13 years 9 months ago
Characterizing the d-TLB behavior of SPEC CPU2000 benchmarks
Despite the numerous optimization and evaluation studies that have been conducted with TLBs over the years, there is still a deficiency in an in-depth understanding of TLB characte...
Gokul B. Kandiraju, Anand Sivasubramaniam
ML
2007
ACM
13 years 9 months ago
A general criterion and an algorithmic framework for learning in multi-agent systems
We offer a new formal criterion for agent-centric learning in multi-agent systems, that is, learning that maximizes one’s rewards in the presence of other agents who might also...
Rob Powers, Yoav Shoham, Thuc Vu
SAB
2010
Springer
13 years 8 months ago
Distributed Online Learning of Central Pattern Generators in Modular Robots
In this paper we study distributed online learning of locomotion gaits for modular robots. The learning is based on a stochastic approximation method, SPSA, which optimiz...
David Johan Christensen, Alexander Spröwitz, ...
COLT
2010
Springer
13 years 8 months ago
An Asymptotically Optimal Bandit Algorithm for Bounded Support Models
The multiarmed bandit problem is a typical example of a dilemma between exploration and exploitation in reinforcement learning. This problem is expressed as a model of a gambler playi...
Junya Honda, Akimichi Takemura
GLOBECOM
2010
IEEE
13 years 8 months ago
Need-Based Communication for Smart Grid: When to Inquire Power Price?
In a smart grid, a home appliance can adjust its power consumption level according to the real-time power price obtained from communication channels. Most studies on smart grid do not...
Husheng Li, Robert C. Qiu