Researchers have proposed a number of tools for automatic bug localization. Given a program and a description of a failure, such tools pinpoint a set of statements that are most likely to contain the bug. Evaluating bug localization tools is difficult because existing benchmarks are limited in both the size of their subjects and the number of bugs they contain. In this paper we present iBUGS, an approach that semiautomatically extracts benchmarks for bug localization from the history of a project. For ASPECTJ, we extracted 369 bugs, 223 of which had associated test cases. We demonstrate the relevance of our dataset with a case study on the bug localization tool AMPLE.

Categories and Subject Descriptors: D.2.5 [Software Engineering]: Testing and Debugging—debugging aids, diagnostics, testing tools, tracing; D.2.7 [Software Engineering]: Distribution, Maintenance, and Enhancement—corrections, version control

General Terms: Management, Measurement, Reliability