In order to meet the high throughput requirements of applications exhibiting high ILP, VLIW ASIPs may increasingly include large numbers of functional unitsFUs. Unfortunately, `sw...
Margarida F. Jacome, Gustavo de Veciana, Satish Pi...
Abstract-- The increasing wire delay constraints in deep submicron VLSI designs have led to the emergence of scalable and modular Network-on-Chip (NoC) architectures. As the power ...
Avinash Karanth Kodi, Ashwini Sarathy, Ahmed Louri...
Memory-intensive threads can hoard shared resources without making progress on a multithreading processor (SMT), thereby hindering the overall system performance. A recent promisi...
Abstract. Shared last level cache is crucial to performance. However, multithread program model incurs serious contention in shared cache. In this paper, to reduce average cache ac...
This paper presents a task-centric memory model for 1000-core compute accelerators. Visual computing applications are emerging as an important class of workloads that can exploit ...
John H. Kelm, Daniel R. Johnson, Steven S. Lumetta...