With the advent of chip-multiprocessors, we are faced with the challenge of parallelizing performance-critical software. Transactional memory (TM) has emerged as a promising progr...
Marek Olszewski, Jeremy Cutler, J. Gregory Steffan
This paper examines the performance of desktop applications running on the Microsoft Windows NT operating system on Intel x86 processors, and contrasts these applications to the p...
Dennis C. Lee, Patrick Crowley, Jean-Loup Baer, Th...
This paper presents recursion unrolling, a technique for improving the performance of recursive computations. Conceptually, recursion unrolling inlines recursive calls to reduce c...
We present Offload, a programming model for offloading parts of a C++ application to run on accelerator cores in a heterogeneous multicore system. Code to be offloaded is enclosed ...
Pete Cooper, Uwe Dolinsky, Alastair F. Donaldson, ...
Understanding why the performance of a multithreaded program does not improve linearly with the number of cores in a sharedmemory node populated with one or more multicore process...