In current Chip-multiprocessors (CMPs), a significant portion of the die is consumed by the last-level cache. Until recently, the balance of cache and core space has been primari...
Xiaowei Jiang, Asit K. Mishra, Li Zhao, Ravishanka...
Abstract--The increasing availability of multi-core and multiprocessor architectures provides new opportunities for improving the performance of many computer simulations. Markov C...
Jonathan M. R. Byrd, Stephen A. Jarvis, Abhir H. B...
Performance and power consumption of an on-chip interconnect that forms the backbone of Chip Multiprocessors (CMPs), are directly influenced by the underlying network topology. Bo...
Reetuparna Das, Soumya Eachempati, Asit K. Mishra,...
Several message passing-based parallel solvers have been developed for general (non-symmetric) sparse LU factorization with partial pivoting. Due to the fine-grain synchronizatio...
This paper proposes a new hardware technique for using one core of a CMP to prefetch data for a thread running on another core. Our approach simply executes a copy of all non-cont...