Mediation is the process of decomposing a task into subtasks, finding agents suitable for these subtasks and negotiating with agents to obtain commitments to execute these subtas...
When the transition probabilities and rewards of a Markov Decision Process are specified exactly, the problem can be solved without any interaction with the environment. When no s...
We study the problem of dynamic spectrum sensing and access in cognitive radio systems as a partially observed Markov decision process (POMDP). A group of cognitive users cooperati...
Jayakrishnan Unnikrishnan, Venugopal V. Veeravalli
— We consider the path-determination problem in Internet core routers that distribute flows across alternate paths leading to the same destination. We assume that the remainder ...
How should a reinforcement learning agent act if its sole purpose is to efficiently learn an optimal policy for later use? In other words, how should it explore, to be able to exp...