A number of pitfalls of empirical scheduling research are illustrated using real experimental data. These pitfalls, in general, serve to slow the progress of scheduling research by obfuscating results, blurring comparisons among scheduling algorithms and algorithm components, and complicating validation of work in the literature. In particular, we examine difficulties brought about by viewing algorithms in a monolithic fashion, by concentrating on CPU time as the only evaluation criterion, by failing to prepare for the gathering of a variety of search statistics at the time of experimental design, by concentrating on benchmarks to the exclusion of other sources of experimental problems, and, more broadly, by a preoccupation with optimization of makespan as the sole goal of scheduling algorithms.
J. Christopher Beck, Andrew J. Davenport, Mark S. Fox