Sciweavers

1075 search results - page 190 / 215
» Parallel Programming with Transactional Memory
Sort
View
MICRO
2010
IEEE
156views Hardware» more  MICRO 2010»
13 years 6 months ago
Explicit Communication and Synchronization in SARC
SARC merges cache controller and network interface functions by relying on a single hardware primitive: each access checks the tag and the state of the addressed line for possible...
Manolis Katevenis, Vassilis Papaefstathiou, Stamat...
PPOPP
2009
ACM
14 years 9 months ago
OpenMP to GPGPU: a compiler framework for automatic translation and optimization
GPGPUs have recently emerged as powerful vehicles for generalpurpose high-performance computing. Although a new Compute Unified Device Architecture (CUDA) programming model from N...
Seyong Lee, Seung-Jai Min, Rudolf Eigenmann
MICRO
1997
IEEE
116views Hardware» more  MICRO 1997»
14 years 19 days ago
Tuning Compiler Optimizations for Simultaneous Multithreading
Compiler optimizations are often driven by specific assumptions about the underlying architecture and implementation of the target machine. For example, when targeting shared-mem...
Jack L. Lo, Susan J. Eggers, Henry M. Levy, Sujay ...
ISHPC
2000
Springer
14 years 1 days ago
Leveraging Transparent Data Distribution in OpenMP via User-Level Dynamic Page Migration
This paper describes transparent mechanisms for emulating some of the data distribution facilities offered by traditional data-parallel programming models, such as High Performance...
Dimitrios S. Nikolopoulos, Theodore S. Papatheodor...
ASPLOS
2008
ACM
13 years 10 months ago
Communication optimizations for global multi-threaded instruction scheduling
The recent shift in the industry towards chip multiprocessor (CMP) designs has brought the need for multi-threaded applications to mainstream computing. As observed in several lim...
Guilherme Ottoni, David I. August