This paper describes Surge, a collection-oriented programming model that enables programmers to compose parallel computations using nested high-level data collections and operator...
Saurav Muralidharan, Michael Garland, Bryan C. Cat...
As modern hardware keeps evolving, an increasingly effective approach to developing energy-efficient and high-performance solvers is to design them to work on many small size indepe...
Azzam Haidar, Tingxing Dong, Piotr Luszczek, Stani...
This paper studies the effects of source-code optimizations on the performance, power draw, and energy consumption of a modern compute GPU. We evaluate 128 versions of two n-body ...
In this paper, we present the most extensive comparison of synchronization techniques. We evaluate 5 different synchronization techniques through a series of 31 data structure alg...
Large-scale graph-structured computation usually exhibits an iterative and convergence-oriented computing nature, where input data is computed iteratively until a convergence conditi...
Chenning Xie, Rong Chen, Haibing Guan, Binyu Zang,...
Many recent multiprocessor systems are realized with a nonuniform memory architecture (NUMA) and accesses to remote memory locations take more time than local memory accesses. Opt...
In this paper, we consider concurrent programs in which the shared state consists of instances of linearizable ADTs (abstract data types). We present an automated approach to concurrency ...
Guy Golan-Gueta, G. Ramalingam, Mooly Sagiv, Eran ...
Stream processing or data-flow programming is a compute paradigm that has been around for decades in many forms yet has failed to garner the same attention as other mainstream langu...
Hybrid MPI+Threads programming has emerged as an alternative model to the “MPI everywhere” model to better handle the increasing core density in cluster nodes. While the MPI s...