Sciweavers

119 search results - page 13 / 24
» Average Reward Timed Games
COMSNETS
2012
12 years 3 months ago
Steptacular: An incentive mechanism for promoting wellness
This paper describes Steptacular, an online interactive incentive system for encouraging people to walk more. A trial offering Steptacular to the employees of Accenture-...
Naini Gomes, Deepak Merugu, Gearoid O'Brien, Chinm...
JMLR
2010
13 years 2 months ago
A Convergent Online Single Time Scale Actor Critic Algorithm
Actor-Critic based approaches were among the first to address reinforcement learning in a general setting. Recently, these algorithms have gained renewed interest due to their gen...
Dotan Di Castro, Ron Meir
CORR
2010
Springer
13 years 4 months ago
The Non-Bayesian Restless Multi-Armed Bandit: a Case of Near-Logarithmic Regret
In the classic Bayesian restless multi-armed bandit (RMAB) problem, there are N arms, with rewards on all arms evolving at each time as Markov chains with known parameters. A play...
Wenhan Dai, Yi Gai, Bhaskar Krishnamachari, Qing Z...
ATAL
2007
Springer
14 years 1 months ago
An incentive mechanism for message relaying in unstructured peer-to-peer systems
Distributed message relaying is an important function of a peer-to-peer system to discover service providers. Existing search protocols in unstructured peer-to-peer systems either ...
Cuihong Li, Bin Yu, Katia P. Sycara
JMLR
2010
13 years 2 months ago
Regret Bounds and Minimax Policies under Partial Monitoring
This work deals with four classical prediction settings, namely full information, bandit, label efficient and bandit label efficient as well as four different notions of regret: p...
Jean-Yves Audibert, Sébastien Bubeck