The goal of Reinforcement learning (RL) is to maximize reward (minimize cost) in a Markov decision process (MDP) without knowing the underlying model a priori. RL algorithms tend ...
Abstract. We describe a way to improve the performance of MDP planners by modifying them to use lower and upper bounds to eliminate non-optimal actions during their search. First, ...