Transactional Memory (TM) simplifies parallel programming by supporting atomic and isolated execution of user-identified tasks. To date, TM programming has required the use of l...
Woongki Baek, Chi Cao Minh, Martin Trautmann, Chri...
As improvements in processor speed continue to outpace improvements in cache and memory speed, poor locality increasingly degrades performance. Because copying garbage collectors ...
Xianglong Huang, Stephen M. Blackburn, Kathryn S. ...
Automatic vectorization of programs for partitioned-ALU SIMD (Single Instruction Multiple Data) processors has been difficult because of not only data dependency issues but also n...
The trend in microprocessor design toward multicore and manycore processors means that future performance gains in software will largely come from harnessing parallelism. To reali...
This paper presents a semantic space profiler for parallel functional programs. Building on previous work in sequential profiling, our tools help programmers to relate runtime res...
Daniel Spoonhower, Guy E. Blelloch, Robert Harper,...