Prefrontal cortex (PFC) has been implicated in the ability to switch behavioral strategies in response to changes in reward contingencies. A recent experimental study showed that separate subpopulations of neurons in the prefrontal cortex were activated when rats switched between allocentric place strategies and egocentric response strategies in a plus maze. In this paper we propose a simple neural-network model of strategy switching, in which the learning of the two strategies, as well as the learning to select between them, is governed by the same temporal-difference (TD) learning algorithm. We show that the model reproduces the experimental data at both the behavioral and neural levels. On the basis of our results we derive a testable prediction concerning the spatial dynamics of the phasic dopamine signal in the PFC, which is thought to encode the reward-prediction error of TD-learning theory.
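For readers unfamiliar with the reward-prediction-error signal referred to above, the following is a minimal sketch of a generic TD(0) value update; the state indices, learning rate, and discount factor are illustrative assumptions and do not reproduce the specific network architecture proposed in the paper.

```python
import numpy as np

# Hypothetical parameters (illustrative only, not taken from the paper's model).
alpha = 0.1   # learning rate
gamma = 0.9   # discount factor

# Value estimates for a toy set of states (e.g., locations in a maze).
V = np.zeros(5)

def td_update(s, r, s_next, terminal=False):
    """Apply one TD(0) update and return the reward-prediction error.

    The TD error delta = r + gamma * V(s') - V(s) is the quantity that,
    in TD-learning theory, the phasic dopamine signal is thought to encode.
    """
    target = r if terminal else r + gamma * V[s_next]
    delta = target - V[s]
    V[s] += alpha * delta
    return delta

# Example: a transition from state 2 to state 3 that yields a reward of 1.
delta = td_update(s=2, r=1.0, s_next=3)
print("reward-prediction error:", delta)
```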