The effective use of processor caches is crucial to the performance of applications. It has been shown that cache misses are not evenly distributed throughout a program. In applications running on RISC-style processors, a small number of delinquent load instructions are responsible for most of the cache misses. Identification of delinquent loads is the key to the success of many cache optimization and prefetching techniques. In this paper, we propose a method for identifying delinquent loads that can be implemented at compile time. Our experiments over eighteen benchmarks from the SPEC suite shows that our proposed scheme is stable across benchmarks, inputs, and cache structures, identifying an average of 10% of the total number of loads in the benchmarks we tested that account for over 90% of all data cache misses. As far as we know, this is the first time a technique for static delinquent load identification with such a level of precision and coverage has been reported. While compar...