To improve the performance of a single application on Chip Multiprocessors (CMPs), the application must be split into threads which execute concurrently on multiple cores. In mult...
M. Aater Suleman, Onur Mutlu, Moinuddin K. Qureshi...
This paper presents the study and results of running several core multimedia applications on a simultaneous multithreading (SMT) architecture, including some detailed analysis ran...
Yen-Kuang Chen, Eric Debes, Rainer Lienhart, Matth...
Providing deterministic execution significantly simplifies the debugging, testing, replication, and deployment of multithreaded programs. Recent work has developed deterministic...
Joseph Devietti, Jacob Nelson, Tom Bergan, Luis Ce...
The advances in multicore technology and modern interconnects is rapidly accelerating the number of cores deployed in today’s commodity clusters. A majority of parallel applicat...
Amith R. Mamidala, Rahul Kumar, Debraj De, Dhabale...
In this paper, we develop a probabilistic model for estimation of the numbers of cache misses during the sparse matrix-vector multiplication (for both general and symmetric matrice...