The fundamental dichotomy in evolutionary algorithms is that between exploration and exploitation. Recently, several algorithms [8, 9, 14, 16, 17, 20] have been introduced that gu...
In bandit problems, a decision-maker must choose between a set of alternatives, each of which has a fixed but unknown rate of reward, to maximize their total number of rewards ov...
Michael D. Lee, Shunan Zhang, Miles Munro, Mark St...
In reinforcement learning problems it has been considered that neither exploitation nor exploration can be pursued exclusively without failing at the task. The optimal balance bet...
While in general trading off exploration and exploitation in reinforcement learning is hard, under some formulations relatively simple solutions exist. Optimal decision thresholds ...
We present an efficient "sparse sampling" technique for approximating Bayes optimal decision making in reinforcement learning, addressing the well known exploration vers...
Tao Wang, Daniel J. Lizotte, Michael H. Bowling, D...