We consider a variety of dynamic, hardware-based methods for exploiting load/store parallelism, including mechanisms that use memory dependence speculation. While previous work ha...
As processors continue to exploit more instruction level parallelism, a greater demand is placed on reducing the effects of memory access latency. In this paper, we introduce a nov...
A configurable memory organisation for the execution of Hiperlan/2 transceiver baseband processing and MPEG2 decoding is presented. The configuration of the memory system is done ...
Juha-Pekka Soininen, Antti Pelkonen, Jussi Roivain...
As the disparity between processor and main memory performance grows, the number of execution cycles spent waiting for memory accesses to complete also increases. As a result, lat...
Teresa L. Johnson, Matthew C. Merten, Wen-mei W. H...
Software-controlled data prefetching is a promising technique for improving the performance of the memory subsystem to match today's high-performance processors. While prefet...