Sciweavers

420 search results - page 76 / 84
» Scalable Parallel Programming with CUDA
POPL
2010
ACM
14 years 3 months ago
Lightweight asynchrony using parasitic threads
Message-passing is an attractive thread coordination mechanism because it cleanly delineates points in an execution when threads communicate, and unifies synchronization and comm...
K. C. Sivaramakrishnan, Lukasz Ziarek, Raghavendra...
CC
2006
Springer
14 years 4 days ago
Polyhedral Code Generation in the Real World
The polyhedral model is known to be a powerful framework to reason about high level loop transformations. Recent developments in optimizing compilers broke some generally accepted ...
Nicolas Vasilache, Cédric Bastoul, Albert C...
ASPLOS
1998
ACM
14 years 19 days ago
A Cost-Effective, High-Bandwidth Storage Architecture
This paper describes the Network-Attached Secure Disk (NASD) storage architecture, prototype implementations of NASD drives, array management for our architecture, and three files...
Garth A. Gibson, David Nagle, Khalil Amiri, Jeff B...
IPPS
2003
IEEE
14 years 1 month ago
Extending OpenMP to Support Slipstream Execution Mode
OpenMP has emerged as a widely accepted standard for writing shared memory programs. Hardware-specific extensions such as data placement are usually needed to improve the scalabi...
Khaled Z. Ibrahim, Gregory T. Byrd
OSDI
1994
ACM
13 years 9 months ago
The Design and Evaluation of a Shared Object System for Distributed Memory Machines
This paper describes the design and evaluation of SAM, a shared object system for distributed memory machines. SAM is a portable run-time system that provides a global name space ...
Daniel J. Scales, Monica S. Lam