Abstract—This paper describes Steptacular, an online interactive incentive system for encouraging people to walk more. A trial offering Steptacular to the employees of Accenture-...
Actor-Critic based approaches were among the first to address reinforcement learning in a general setting. Recently, these algorithms have gained renewed interest due to their gen...
In the classic Bayesian restless multi-armed bandit (RMAB) problem, there are N arms, with rewards on all arms evolving at each time as Markov chains with known parameters. A play...
Wenhan Dai, Yi Gai, Bhaskar Krishnamachari, Qing Z...
Distributed message relaying is an important function of a peer-topeer system to discover service providers. Existing search protocols in unstructured peer-to-peer systems either ...
This work deals with four classical prediction settings, namely full information, bandit, label efficient and bandit label efficient as well as four different notions of regret: p...