Sciweavers

119 search results - page 12 / 24
» A Markov Reward Model Checker
Sort
View
CDC
2009
IEEE
169views Control Systems» more  CDC 2009»
13 years 11 months ago
Parametric regret in uncertain Markov decision processes
— We consider decision making in a Markovian setup where the reward parameters are not known in advance. Our performance criterion is the gap between the performance of the best ...
Huan Xu, Shie Mannor
ICASSP
2008
IEEE
14 years 1 months ago
Link throughput of multi-channel opportunistic access with limited sensing
—We aim to characterize the maximum link throughput of a multi-channel opportunistic communication system. The states of these channels evolve as independent and identically dist...
Keqin Liu, Qing Zhao
GLOBECOM
2008
IEEE
14 years 1 months ago
Foresighted Resource Reciprocation Strategies in P2P Networks
—We consider peer-to-peer (P2P) networks, where multiple peers are interested in sharing content. While sharing resources, autonomous and self-interested peers need to make decis...
Hyunggon Park, Mihaela van der Schaar
AIPS
2000
13 years 8 months ago
Representations of Decision-Theoretic Planning Tasks
Goal-directed Markov Decision Process models (GDMDPs) are good models for many decision-theoretic planning tasks. They have been used in conjunction with two different reward stru...
Sven Koenig, Yaxin Liu
CORR
2010
Springer
143views Education» more  CORR 2010»
13 years 3 months ago
The Non-Bayesian Restless Multi-Armed Bandit: a Case of Near-Logarithmic Regret
In the classic Bayesian restless multi-armed bandit (RMAB) problem, there are N arms, with rewards on all arms evolving at each time as Markov chains with known parameters. A play...
Wenhan Dai, Yi Gai, Bhaskar Krishnamachari, Qing Z...