—The ever-increasing computational power of contemporary microprocessors reduces the execution time spent on arithmetic computations (i.e., the computations not involving slow memory operations such as cache misses) significantly. Therefore, for memory-intensive workloads, it becomes more important to overlap multiple cache misses than to overlap slow memory operations with other computations. In this paper, we propose a novel technique to parallelize sequential cache misses, thereby increasing memory-level parallelism (MLP). Our idea is based on value prediction, which was proposed originally as an instruction-level parallelism (ILP) optimization to break true data dependencies. In this paper, we advocate value prediction in its capability to enhance MLP instead of ILP. We propose using value prediction and value-speculative execution only for prefetching so that not only the complex prediction validation and misprediction recovery mechanisms are avoided, but better performance can ...
Huiyang Zhou, Thomas M. Conte