Sciweavers

CC
2009
Springer
14 years 8 months ago
Compile-Time Analysis and Specialization of Clocks in Concurrent Programs
Clocks are a mechanism for providing synchronization barriers in concurrent programming languages. They are usually implemented using primitive communication mechanisms a...
Nalini Vasudevan, Olivier Tardieu, Julian Dolby, S...
CC
2009
Springer
14 years 8 months ago
Implementation and Use of Transactional Memory with Dynamic Separation
We introduce the design and implementation of dynamic separation (DS) as a programming discipline for using transactional memory. Our approach is based on the programmer ...
Andrew Birrell, Johnson Hsieh, Martín Abadi...
CC
2009
Springer
14 years 8 months ago
Register Spilling and Live-Range Splitting for SSA-Form Programs
Register allocation decides which parts of a variable's live range are held in registers and which in memory. The compiler inserts spill code to move the values of variables b...
Matthias Braun, Sebastian Hack
CC
2009
Springer
14 years 8 months ago
Extensible Proof-Producing Compilation
This paper presents a compiler which produces machine code from functions defined in the logic of a theorem prover, and at the same time proves that the generated code executes the...
Magnus O. Myreen, Konrad Slind, Michael J. C. Gord...
CC
2009
Springer
14 years 8 months ago
SSA Elimination after Register Allocation
SSA form uses a notational abstraction called φ-functions. These instructions have no analogue in actual machine instruction sets, and they must be replaced by ordinary instructions ...
Fernando Magno Quintão Pereira, Jens Palsbe...
ASPLOS
2009
ACM
14 years 8 months ago
Architecture-aware optimization targeting multithreaded stream computing
Byunghyun Jang, Synho Do, Homer H. Pien, David R. ...
ASPLOS
2009
ACM
14 years 8 months ago
3D finite difference computation on GPUs using CUDA
In this paper we describe a GPU parallelization of the 3D finite difference computation using CUDA. Data access redundancy is used as the metric to determine the optimal implement...
Paulius Micikevicius
ASPLOS
2009
ACM
14 years 8 months ago
QR decomposition on GPUs
QR decomposition is a computationally intensive linear algebra operation that factors a matrix A into the product of a unitary matrix Q and an upper triangular matrix R. Adaptive sys...
Andrew Kerr, Dan Campbell, Mark Richards
ASPLOS
2009
ACM
14 years 8 months ago
Optimization of tele-immersion codes
As computational power increases, tele-immersive applications are an emerging trend. These applications make extensive demands on computational resources through their heavy use o...
Albert Sidelnik, I-Jui Sung, Wanmin Wu, Marí...
ASPLOS
2009
ACM
14 years 8 months ago
Understanding software approaches for GPGPU reliability
Even though graphics processors (GPUs) are becoming increasingly popular for general purpose computing, current (and likely near future) generations of GPUs do not provide hardwar...
Martin Dimitrov, Mike Mantor, Huiyang Zhou