An explicit exploration strategy is necessary in reinforcement learning (RL) to balance the need to reduce uncertainty about the expected outcome of an action against the need to converge to a solution. This tension is more acute in on-policy reinforcement learning, where exploration guides the search for an optimal solution. The need for self-regulating exploration is also manifest in knowledge transfer, where past solutions must be readapted. Tabu search (TS) is an adaptive, memory-based exploration method that has been successful in combinatorial optimization problems, systematically exploring the search space and avoiding cycles through action inhibition. Tabu search has also been used successfully in genetic algorithms to preserve diversity and protect against premature convergence. This paper presents an approach to tabu search exploration in reinforcement learning. Experimental results are presented for the discounted, tabular cases of the grid and packet routing problems.
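The action-inhibition idea mentioned above can be illustrated with a minimal sketch: tabular Q-learning in which the last few actions taken in each state are placed on a per-state tabu list and temporarily excluded from selection, discouraging short exploration cycles. This is a hypothetical illustration under assumed names and parameters (`tenure`, the chain environment, the aspiration fallback), not the paper's actual method.

```python
import random
from collections import deque

def tabu_select_action(q_row, tabu, epsilon, rng):
    """Epsilon-greedy action selection, excluding actions on the tabu list.

    q_row: Q-values for the current state; tabu: set of inhibited action indices.
    Aspiration rule: if every action is tabu, ignore the list entirely.
    """
    allowed = [a for a in range(len(q_row)) if a not in tabu]
    if not allowed:
        allowed = list(range(len(q_row)))
    if rng.random() < epsilon:
        return rng.choice(allowed)
    return max(allowed, key=lambda a: q_row[a])

def q_learning_with_tabu(transitions, rewards, n_states, n_actions,
                         episodes=200, tenure=2, alpha=0.5, gamma=0.9,
                         epsilon=0.1, seed=0):
    """Tabular Q-learning where the last `tenure` actions taken in a state
    are tabu (inhibited) on subsequent visits, a sketch of TS-style exploration."""
    rng = random.Random(seed)
    Q = [[0.0] * n_actions for _ in range(n_states)]
    tabu = [deque(maxlen=tenure) for _ in range(n_states)]  # per-state tabu memory
    goal = n_states - 1
    for _ in range(episodes):
        s = 0
        for _ in range(50):  # step cap per episode
            a = tabu_select_action(Q[s], set(tabu[s]), epsilon, rng)
            tabu[s].append(a)  # inhibit this action for the next `tenure` visits
            s2 = transitions[s][a]
            Q[s][a] += alpha * (rewards[s][a] + gamma * max(Q[s2]) - Q[s][a])
            s = s2
            if s == goal:
                break
    return Q
```

For example, on a three-state chain where action 1 moves right toward a rewarding goal state, the learned Q-values favor moving right, while the tabu memory prevents the agent from oscillating between the same state-action pairs early in training.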