With the availability of chip multiprocessor (CMP) and simultaneous multithreading (SMT) machines, extracting thread level parallelism from a sequential program has become crucial...
High performance, freedom from deadlocks, and freedom from livelocks are desirable properties of interconnection networks. Unfortunately, these can be conflicting goals because n...
Mithuna Thottethodi, Alvin R. Lebeck, Shubhendu S....
To exploit the potential of multicore architectures, recent dense linear algebra libraries have used tile algorithms, which consist in scheduling a Directed Acyclic Graph (DAG) of...
Bilel Hadri, Hatem Ltaief, Emmanuel Agullo, Jack D...
Understanding why the performance of a multithreaded program does not improve linearly with the number of cores in a sharedmemory node populated with one or more multicore process...
This paper presents three techniques for improving the effectiveness of the recently proposed Adaptive Stream Detection (ASD) prefetching mechanism. The ASD prefetcher is a standa...