Atomic Vector Operations on Chip Multiprocessors



The current trend is for processors to deliver dramatic improvements in parallel performance while only modestly improving serial performance. Parallel performance is harvested through vector/SIMD instructions as well as multithreading (through both multithreaded cores and chip multiprocessors). Vector parallelism can be more efficiently supported than multithreading, but is often harder for software to exploit. In particular, code with sparse data access patterns cannot easily utilize the vector/SIMD instructions of mainstream processors. Hardware to scatter and gather sparse data has previously been proposed to enable vector execution for these codes. However, on multithreaded architectures, a number of applications spend significant time on atomic operations (e.g., parallel reductions), which cannot be vectorized using previously proposed schemes. This paper proposes architectural support for atomic vector operations (referred to as GLSC) that addresses this limitation. GLSC exte...
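To make the limitation described in the abstract concrete, the following minimal sketch simulates a sparse reduction (a histogram) in plain Python. It is an illustration only and does not reproduce GLSC, whose mechanism is cut off in the excerpt above. The function names, the lane width of 4, and the retry loop are all assumptions chosen for the example. The sketch shows that a plain gather, add, scatter sequence loses updates when two lanes target the same address, and that retrying the colliding lanes restores the correct result.

```python
def scalar_histogram(indices, num_bins):
    """Reference result: one update at a time, so duplicates accumulate."""
    hist = [0] * num_bins
    for i in indices:
        hist[i] += 1
    return hist


def naive_vector_histogram(indices, num_bins, width=4):
    """Gather, add, scatter in groups of `width` lanes, with no conflict handling."""
    hist = [0] * num_bins
    for start in range(0, len(indices), width):
        lanes = indices[start:start + width]
        gathered = [hist[i] for i in lanes]   # gather: every lane reads the old value
        updated = [v + 1 for v in gathered]   # element-wise add
        for i, v in zip(lanes, updated):      # scatter: lanes sharing an index overwrite each other
            hist[i] = v
    return hist


def conflict_aware_vector_histogram(indices, num_bins, width=4):
    """Hypothetical fix: lanes that collide within a group are retried."""
    hist = [0] * num_bins
    for start in range(0, len(indices), width):
        pending = list(indices[start:start + width])
        while pending:
            seen, winners, losers = set(), [], []
            for i in pending:
                if i in seen:
                    losers.append(i)          # duplicate index: defer to the next pass
                else:
                    seen.add(i)
                    winners.append(i)
            gathered = [hist[i] for i in winners]
            for i, v in zip(winners, gathered):
                hist[i] = v + 1               # winners are distinct, so no update is lost
            pending = losers
    return hist


indices = [0, 1, 1, 3, 2, 2, 2, 0]
print(scalar_histogram(indices, 4))                 # [2, 2, 3, 1]
print(naive_vector_histogram(indices, 4))           # [2, 1, 1, 1]
print(conflict_aware_vector_histogram(indices, 4))  # [2, 2, 3, 1]
```

The naive version undercounts bins 1 and 2 because the duplicate lanes all gather the same old value before any of them scatters. The retry loop is a software stand-in for whatever conflict handling the hardware would provide; it serializes only the colliding lanes and leaves the rest of the group vectorized.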



Hardware | ISCA 2008 | Parallel Performance | Sparse Data | Vector/SIMD Instructions |


Post Info

Added	31 May 2010
Updated	31 May 2010
Type	Conference
Year	2008
Where	ISCA
Authors	Sanjeev Kumar, Daehyun Kim, Mikhail Smelyanskiy, Yen-Kuang Chen, Jatin Chhugani, Christopher J. Hughes, Changkyu Kim, Victor W. Lee, Anthony D. Nguyen
