This paper describes a scalable, low-complexity alternative to the conventional load/store queue (LSQ) for superscalar processors that execute load and store instructions speculat...
We study the issue of performance prediction on the SGIPower Challenge, a typical SMP. On such a platform, the cost of memory accesses depends on their locality and on contention ...
Nancy M. Amato, Jack Perdue, Mark M. Mathis, Andre...
With the availability of chip multiprocessor (CMP) and simultaneous multithreading (SMT) machines, extracting thread level parallelism from a sequential program has become crucial...
Abstract—Traditionally, performance has been the most important metrics when evaluating a system. However, in the last decades industry and academia have been paying increasing a...
Memory operations remain a significant bottleneck in dynamically scheduled pipelined processors, due in part to the inability to statically determine the existence of memory addr...
Glenn Reinman, Brad Calder, Dean M. Tullsen, Gary ...