Sciweavers

1075 search results - page 186 / 215
» Parallel Programming with Transactional Memory
Sort
View
ASPLOS
2009
ACM
15 years 9 months ago
Performance analysis of accelerated image registration using GPGPU
This paper presents a performance analysis of an accelerated 2-D rigid image registration implementation that employs the Compute Unified Device Architecture (CUDA) programming e...
Peter Bui, Jay B. Brockman
IPPS
2007
IEEE
15 years 8 months ago
Invited Paper: A Compile-time Cost Model for OpenMP
OpenMP has gained wide popularity as an API for parallel programming on shared memory and distributed shared memory platforms. It is also a promising candidate to exploit the emer...
Chunhua Liao, Barbara M. Chapman
IPPS
2003
IEEE
15 years 7 months ago
Extending OpenMP to Support Slipstream Execution Mode
OpenMP has emerged as a widely accepted standard for writing shared memory programs. Hardware-specific extensions such as data placement are usually needed to improve the scalabi...
Khaled Z. Ibrahim, Gregory T. Byrd
ICCD
2006
IEEE
115views Hardware» more  ICCD 2006»
15 years 11 months ago
Microarchitecture and Performance Analysis of Godson-2 SMT Processor
—This paper introduces the microarchitecture and logical implementation of SMT (Simultaneous Multithreading) improvement of Godson-2 processor which is a 64-bit, four-issue, out-...
Zusong Li, Xianchao Xu, Weiwu Hu, Zhimin Tang
SC
2000
ACM
15 years 6 months ago
Scalable Algorithms for Adaptive Statistical Designs
We present a scalable, high-performance solution to multidimensional recurrences that arise in adaptive statistical designs. Adaptive designs are an important class of learning al...
Robert H. Oehmke, Janis Hardwick, Quentin F. Stout