Comparing retrieval approaches requires test collections, which consist of documents, queries, and relevance assessments. Obtaining consistent and exhaustive relevance assessments is crucial for the appropriate comparison of retrieval approaches. Whereas the evaluation methodology for flat text retrieval approaches is well established, the evaluation of XML retrieval approaches remains a research issue. This is because XML documents are composed of nested components that cannot be considered independent in terms of relevance. This paper describes the methodology adopted in INEX (the INitiative for the Evaluation of XML Retrieval) to ensure consistent and exhaustive relevance assessments.

Categories and Subject Descriptors
H.3.7 [Information Systems]: Information Storage and Retrieval—Digital Libraries; H.5.3 [Information Systems]: Information Interfaces and Presentation—Group and Organisation Interfaces

General Terms
Measurement, Standardisation

Keywords
XML, evaluation, relevance assessments