Sciweavers

» Highly scalable parallel sorting
WMPI
2004
ACM
14 years 2 months ago
Scalable cache memory design for large-scale SMT architectures
The cache hierarchy design in existing SMT and superscalar processors is optimized for latency, but not for bandwidth. The size of the L1 data cache did not scale over the past dec...
Muhamed F. Mudawar
HPCA
2007
IEEE
14 years 9 months ago
A Scalable, Non-blocking Approach to Transactional Memory
Transactional Memory (TM) provides mechanisms that promise to simplify parallel programming by eliminating the need for locks and their associated problems (deadlock, livelock, pr...
Hassan Chafi, Jared Casper, Brian D. Carlstrom, Au...
PVM
2009
Springer
14 years 3 months ago
Fine-Grained Data Distribution Operations for Particle Codes
This paper proposes a new fine-grained data distribution operation, MPI_Alltoall_specific, that allows an element-wise distribution of data elements to specific target pro...
Michael Hofmann, Gudula Rünger
IPPS
1999
IEEE
14 years 28 days ago
BRISK: A Portable and Flexible Distributed Instrumentation System
Researchers and practitioners in the area of parallel and distributed computing have been lacking a portable, flexible and robust distributed instrumentation system. We present th...
Aleksandar M. Bakic, Matt W. Mutka, Diane T. Rover
SPAA
2012
ACM
11 years 11 months ago
A scalable framework for heterogeneous GPU-based clusters
GPU-based heterogeneous clusters continue to draw attention from vendors and HPC users due to their high energy efficiency and much improved single-node computational performance...
Fengguang Song, Jack Dongarra