Temporal difference (TD) learning methods [22] have become popular reinforcement learning techniques in recent years. TD methods have had some experimental successes and have been...
This paper presents postponed updates, a new strategy for TD methods that can improve sample efficiency without incurring the computational and space requirements of model-based ...