In this paper, we address the tradeoff between exploration and exploitation for agents that need to learn more about the structure of their environment in order to perform more effectively. For example, a robot may need to learn the most efficient routes between important sites in its environment. We compare on-line and off-line exploration for a repeated task, where the agent is given some particular task to perform some number of times. Tasks are modeled as navigation on a graph embedded in the plane. This paper describes a utility-based on-line exploration algorithm for repeated tasks, which takes into account both the costs and potential benefits (over future task repetitions) of different exploratory actions. Exploration is performed in a greedy fashion, with the locally optimal exploratory action performed on each task repetition. We experimentally evaluated our utility-based on-line algorithm against a heuristic search algorithm for off-line exploration as well as a randomized on-line ...
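The greedy utility-based scheme described above can be illustrated with a minimal sketch. This is not the paper's algorithm, only a simplified illustration under stated assumptions: the graph of known edges is given as adjacency dictionaries, each unexplored edge carries an estimated traversal cost and a one-time exploration overhead, and the utility of exploring an edge is its estimated per-repetition savings multiplied by the remaining repetitions, minus that overhead. All function and variable names (`greedy_exploration_plan`, `shortest_path_cost`, etc.) are hypothetical.

```python
import heapq


def shortest_path_cost(graph, src, dst):
    """Dijkstra over the currently known edges.

    graph: {u: {v: cost, ...}, ...} with symmetric entries.
    """
    dist = {src: 0.0}
    pq = [(0.0, src)]
    while pq:
        d, u = heapq.heappop(pq)
        if u == dst:
            return d
        if d > dist.get(u, float("inf")):
            continue
        for v, w in graph.get(u, {}).items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(pq, (nd, v))
    return float("inf")


def greedy_exploration_plan(known, unexplored, src, dst, repetitions):
    """On each task repetition, explore the single unexplored edge whose
    utility -- (current cost - improved cost) * remaining repetitions,
    minus the one-time exploration overhead -- is largest and positive;
    otherwise just exploit the best known route.

    Simplifying assumption: on a repetition where we explore, we pay the
    current route cost plus the exploration overhead. Returns the total
    travel cost over all repetitions.
    """
    total = 0.0
    for rep in range(repetitions):
        remaining = repetitions - rep
        base = shortest_path_cost(known, src, dst)
        best_edge, best_utility = None, 0.0
        for (u, v), (est_cost, overhead) in unexplored.items():
            # Hypothetically add the edge and see how much the route improves.
            trial = {n: dict(nbrs) for n, nbrs in known.items()}
            trial.setdefault(u, {})[v] = est_cost
            trial.setdefault(v, {})[u] = est_cost
            improved = shortest_path_cost(trial, src, dst)
            utility = (base - improved) * remaining - overhead
            if utility > best_utility:
                best_edge, best_utility = (u, v), utility
        if best_edge is not None:
            u, v = best_edge
            est_cost, overhead = unexplored.pop(best_edge)
            known.setdefault(u, {})[v] = est_cost
            known.setdefault(v, {})[u] = est_cost
            total += base + overhead
        else:
            total += base
    return total
```

For example, with a single known route of cost 5 between two sites and an unexplored shortcut of estimated cost 2 (overhead 1), three repetitions yield a positive utility of (5 - 2) * 3 - 1 = 8 on the first repetition, so the greedy agent explores immediately and exploits the shortcut thereafter.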