The Cranfield evaluation method has some disadvantages, including its high cost in labor and inadequacy for evaluating interactive retrieval techniques. As a very promising alternative, automatic comparison of retrieval systems based on observed clicking behavior of users has recently been studied. Several methods have been proposed , but there has so far been no systematic way to assess which strategy is better, making it difficult to choose a good method for real applications. In this paper, we propose a general way to evaluate these relative comparison methods with two measures: utility to users(UtU) and effectiveness of differentiation(EoD). We evaluate two state of the art methods by systematically simulating different retrieval scenarios. Inspired by the weakness of these methods revealed through our evaluation, we further propose a novel method by considering the positions of clicked documents. Experiment results show that our new method performs better than the existing me...