While commodity computing and graphics hardware has increased in capacity and dropped in cost, it is still quite difficult to make effective use of such systems for general-purpos...
E. Wes Bethel, Greg Humphreys, Brian E. Paul, J. D...
This paper presents pTask-- a system that allows users to automatically exploit dynamic task-level parallelism in sequential array-based C programs. The system employs compiler an...
A microprocessor's performance is fundamentally limited by the rate at which it can resolve branch mispredictions. Control independence (CI) architectures look for useful con...
Kshitiz Malik, Mayank Agarwal, Sam S. Stone, Kevin...
—Atomic operations are important building blocks in supporting general-purpose computing on graphics processing units (GPUs). For instance, they can be used to coordinate executi...
While a number of User-Level Protocols have been developed to reduce the gap between the performance capabilities of the physical network and the performance actually available, a...
Pavan Balaji, Piyush Shivam, Pete Wyckoff, Dhabale...