Scheduling of processes onto processors of a parallel machine has always been an important and challenging area of research. The issue becomes even more crucial and difficult as we ...
Memory system bottlenecks limit performance for many applications, and computations with strided access patterns are among the hardest hit. The streams used in such applications h...
As the performance gap between the CPU and main memory continues to grow, techniques to hide memory latency are essential to deliver a high performance computer system. Prefetchin...
The Midimew network is an excellent contender for implementing the communication subsystem of a high performance computer. This network is an optimal 2D topology in the sense ther...
Complete system simulation to understand the influence of architecture and operating systems on application execution has been identified to be crucial for systems design. While t...
Tao Li, Lizy Kurian John, Narayanan Vijaykrishnan,...
Loop fusion is important to optimizing compilers because it is an important tool in managing the memory hierarchy. By fusing loops that use the same data elements, we can reduce t...