Abstract. We present a dataset for learning to rank in the medical domain, consisting of thousands of full-text queries linked to thousands of research articles. The queries are taken from health topics described in layman’s English on the non-commercial website NutritionFacts.org; relevance links are extracted at three levels from direct and indirect links from queries to research articles on PubMed. We demonstrate that ranking models trained on this dataset outperform standard bag-of-words retrieval models by a wide margin. The dataset can be downloaded from: www.cl.uni-heidelberg.de/statnlpgroup/nfcorpus/.