We consider decision making in a Markovian setup where the reward parameters are not known in advance. Our performance criterion is the gap between the performance of the best ...
We aim to characterize the maximum link throughput of a multi-channel opportunistic communication system. The states of these channels evolve as independent and identically dist...
We consider peer-to-peer (P2P) networks, where multiple peers are interested in sharing content. While sharing resources, autonomous and self-interested peers need to make decis...
Goal-directed Markov Decision Process models (GDMDPs) are good models for many decision-theoretic planning tasks. They have been used in conjunction with two different reward stru...
In the classic Bayesian restless multi-armed bandit (RMAB) problem, there are N arms, with rewards on all arms evolving at each time as Markov chains with known parameters. A play...
Wenhan Dai, Yi Gai, Bhaskar Krishnamachari, Qing Z...