GPGPUs have recently emerged as powerful vehicles for general-purpose high-performance computing. Although a new Compute Unified Device Architecture (CUDA) programming model from N...
This paper describes our approaches to raise the level of abstraction at which hardware suitable for accelerating computationally-intensive applications can be specified. Field-Pr...
Qiang Liu, George A. Constantinides, Konstantinos ...
This paper presents pTask, a system that allows users to automatically exploit dynamic task-level parallelism in sequential array-based C programs. The system employs compiler an...
In distributed-memory message-passing architectures, reducing communication cost is extremely important. In this paper, we present a technique to optimize communication globally. O...
Mahmut T. Kandemir, Prithviraj Banerjee, Alok N. C...
Different approaches have been proposed over the years for automatically transforming high-level language (HLL) descriptions of applications into custom hardware implementations. ...