Sciweavers

» Average-Reward Decentralized Markov Decision Processes
ICC
2007
IEEE
Optimality and Complexity of Opportunistic Spectrum Access: A Truncated Markov Decision Process Formulation
We consider opportunistic spectrum access (OSA), which allows secondary users to identify and exploit instantaneous spectrum opportunities resulting from the bursty traffic of ...
Dejan V. Djonin, Qing Zhao, Vikram Krishnamurthy
CAV
2010
Springer
Measuring and Synthesizing Systems in Probabilistic Environments
Often one has a preference order among the different systems that satisfy a given specification. Under a probabilistic assumption about the possible inputs, such a preference order...
Krishnendu Chatterjee, Thomas A. Henzinger, Barbar...
NECO
2007
Reinforcement Learning, Spike-Time-Dependent Plasticity, and the BCM Rule
Learning agents, whether natural or artificial, must update their internal parameters in order to improve their behavior over time. In reinforcement learning, this plasticity is ...
Dorit Baras, Ron Meir
NIPS
2007
Optimistic Linear Programming gives Logarithmic Regret for Irreducible MDPs
We present an algorithm called Optimistic Linear Programming (OLP) for learning to optimize average reward in an irreducible but otherwise unknown Markov decision process (MDP). O...
Ambuj Tewari, Peter L. Bartlett
JAIR
2008
Communication-Based Decomposition Mechanisms for Decentralized MDPs
Multi-agent planning in stochastic environments can be framed formally as a decentralized Markov decision problem. Many real-life distributed problems that arise in manufacturing,...
Claudia V. Goldman, Shlomo Zilberstein