This paper investigates the application of reinforcement learning to Tetris; in particular, temporal difference learning is applied to estimate the state value function V. For two predefined reward functions, Tetris agents have been trained using an ε-greedy policy. The numerical experiments show that the trained agents can significantly outperform fixed-policy agents, e.g. by a factor of 5 for a complex reward function.

1 Machine learning for game playing

Playing games such as Chess, Go, Checkers, Backgammon, or Poker has always been a great intellectual challenge to humans, and game playing is therefore a natural scenario for testing and evaluating artificial intelligence methods; machine learning aspects in particular have received increasing attention in recent years. Many methods derived from traditional artificial intelligence and mathematical game theory have been utilized in computer games; for instance, game trees are one of the most popu...