Sciweavers

1075 search results - page 186 / 215
» Parallel Programming with Transactional Memory
Sort
View
ASPLOS
2009
ACM
14 years 3 months ago
Performance analysis of accelerated image registration using GPGPU
This paper presents a performance analysis of an accelerated 2-D rigid image registration implementation that employs the Compute Unified Device Architecture (CUDA) programming e...
Peter Bui, Jay B. Brockman
IPPS
2007
IEEE
14 years 2 months ago
Invited Paper: A Compile-time Cost Model for OpenMP
OpenMP has gained wide popularity as an API for parallel programming on shared memory and distributed shared memory platforms. It is also a promising candidate to exploit the emer...
Chunhua Liao, Barbara M. Chapman
IPPS
2003
IEEE
14 years 1 months ago
Extending OpenMP to Support Slipstream Execution Mode
OpenMP has emerged as a widely accepted standard for writing shared memory programs. Hardware-specific extensions such as data placement are usually needed to improve the scalabi...
Khaled Z. Ibrahim, Gregory T. Byrd
ICCD
2006
IEEE
115views Hardware» more  ICCD 2006»
14 years 5 months ago
Microarchitecture and Performance Analysis of Godson-2 SMT Processor
—This paper introduces the microarchitecture and logical implementation of SMT (Simultaneous Multithreading) improvement of Godson-2 processor which is a 64-bit, four-issue, out-...
Zusong Li, Xianchao Xu, Weiwu Hu, Zhimin Tang
SC
2000
ACM
14 years 25 days ago
Scalable Algorithms for Adaptive Statistical Designs
We present a scalable, high-performance solution to multidimensional recurrences that arise in adaptive statistical designs. Adaptive designs are an important class of learning al...
Robert H. Oehmke, Janis Hardwick, Quentin F. Stout