Abstract. We investigate the problem of using function approximation in reinforcement learning where the agent’s policy is represented as a classifier mapping states to actions....
— While the Partially Observable Markov Decision Process (POMDP) provides a formal framework for the problem of robot control under uncertainty, it typically assumes a known and ...
As computational learning agents move into domains that incur real costs (e.g., autonomous driving or financial investment), it will be necessary to learn good policies without n...
We study the problem of dynamic spectrum sensing and access in cognitive radio systems as a partially observed Markov decision process (POMDP). A group of cognitive users cooperati...
Jayakrishnan Unnikrishnan, Venugopal V. Veeravalli
This paper formalizes Feature Selection as a Reinforcement Learning problem, leading to a provably optimal though intractable selection policy. As a second contribution, this pape...