Sciweavers

1993 search results - page 202 / 399
» Designing and Building Parallel Program
Sort
View
APPT
2009
Springer
14 years 4 months ago
Performance Improvement of Multimedia Kernels by Alleviating Overhead Instructions on SIMD Devices
SIMD extension is one of the most common and effective technique to exploit data-level parallelism in today’s processor designs. However, the performance of SIMD architectures i...
Asadollah Shahbahrami, Ben H. H. Juurlink
HPCA
1998
IEEE
14 years 1 months ago
Performance Study of a Concurrent Multithreaded Processor
The performance of a concurrent multithreaded architectural model, called superthreading 15 , is studied in this paper. It tries to integrate optimizing compilation techniques and...
Jenn-Yuan Tsai, Zhenzhen Jiang, Eric Ness, Pen-Chu...
ASPLOS
1996
ACM
14 years 1 months ago
An Integrated Compile-Time/Run-Time Software Distributed Shared Memory System
On a distributed memory machine, hand-coded message passing leads to the most efficient execution, but it is difficult to use. Parallelizing compilers can approach the performance...
Sandhya Dwarkadas, Alan L. Cox, Willy Zwaenepoel
HPCA
2007
IEEE
14 years 10 months ago
A Scalable, Non-blocking Approach to Transactional Memory
Transactional Memory (TM) provides mechanisms that promise to simplify parallel programming by eliminating the need for locks and their associated problems (deadlock, livelock, pr...
Hassan Chafi, Jared Casper, Brian D. Carlstrom, Au...
IPPS
2005
IEEE
14 years 3 months ago
COTS Clusters vs. the Earth Simulator: An Application Study Using IMPACT-3D
In 2002, Japan announced the Earth Simulator—a supercomputer based on low-volume vector processors and a custom network—and reported that computational scientists had used it ...
Daniel G. Chavarría-Miranda, Guohua Jin, Jo...