Cache misses form a major bottleneck for memory-intensive applications, due to the significant latency of main memory accesses. Loop tiling, in conjunction with other program tran...
Instruction fetch behavior has been shown to be very regular and predictable, even for diverse application areas. In this work, we propose the Lookahead Instruction Fetch Engine (...
Stephen Roderick Hines, Yuval Peress, Peter Gavin,...
Many hardware optimizations rely on collecting information about program behavior at runtime. This information is stored in lookup tables. To be accurate and effective, these opti...
Ioana Burcea, Stephen Somogyi, Andreas Moshovos, B...
By optimizing data layout at run-time, we can potentially enhance the performance of caches by actively creating spatial locality, facilitating prefetching, and avoiding cache con...
The rapid growth of the World Wide Web has caused serious performance degradation on the Internet. This paper offers an end-to-end approach to improving Web performance by collecti...
Edith Cohen, Balachander Krishnamurthy, Jennifer R...