Sciweavers

1166 search results - page 173 / 234
» Negotiating Using Rewards
SIGMETRICS
2002
ACM
15 years 1 months ago
Characterizing the d-TLB behavior of SPEC CPU2000 benchmarks
Despite the numerous optimization and evaluation studies that have been conducted with TLBs over the years, there is still a lack of in-depth understanding of TLB characte...
Gokul B. Kandiraju, Anand Sivasubramaniam
ML
2007
ACM
15 years 1 months ago
A general criterion and an algorithmic framework for learning in multi-agent systems
We offer a new formal criterion for agent-centric learning in multi-agent systems, that is, learning that maximizes one’s rewards in the presence of other agents who might also...
Rob Powers, Yoav Shoham, Thuc Vu
SAB
2010
Springer
15 years 18 days ago
Distributed Online Learning of Central Pattern Generators in Modular Robots
In this paper we study distributed online learning of locomotion gaits for modular robots. The learning is based on a stochastic approximation method, SPSA, which optimiz...
David Johan Christensen, Alexander Spröwitz, ...
COLT
2010
Springer
15 years 6 days ago
An Asymptotically Optimal Bandit Algorithm for Bounded Support Models
The multiarmed bandit problem is a typical example of the dilemma between exploration and exploitation in reinforcement learning. This problem is expressed as a model of a gambler playi...
Junya Honda, Akimichi Takemura
GLOBECOM
2010
IEEE
15 years 6 days ago
Need-Based Communication for Smart Grid: When to Inquire Power Price?
In a smart grid, a home appliance can adjust its power consumption level according to the real-time power price obtained from communication channels. Most studies on the smart grid do not...
Husheng Li, Robert C. Qiu