Abstract—This paper considers maximizing throughput utility in a multi-user network with partially observable Markov ON/OFF channels. Instantaneous channel states are never known...
Many elderly and physically impaired people experience difficulties when maneuvering a powered wheelchair. In order to provide improved maneuvering, powered wheelchairs ha...
Hidden Markov models (HMMs) and partially observable Markov decision processes (POMDPs) provide useful tools for modeling dynamical systems. They are particularly useful for represent...
We consider a variant of the classic multi-armed bandit problem (MAB), which we call FEEDBACK MAB, where the reward obtained by playing each of n independent arms varies according...
Reinforcement learning (RL) algorithms provide a sound theoretical basis for building learning control architectures for embedded agents. Unfortunately, all of the theory and much ...
Satinder P. Singh, Tommi Jaakkola, Michael I. Jord...