Sciweavers

311 search results - page 5 / 63
» Software-Controlled Multithreading Using Informing Memory Op...
Sort
View
ISCA
2008
IEEE
148views Hardware» more  ISCA 2008»
14 years 2 months ago
Atomic Vector Operations on Chip Multiprocessors
The current trend is for processors to deliver dramatic improvements in parallel performance while only modestly improving serial performance. Parallel performance is harvested th...
Sanjeev Kumar, Daehyun Kim, Mikhail Smelyanskiy, Y...
ISCA
2010
IEEE
185views Hardware» more  ISCA 2010»
14 years 24 days ago
Dynamic warp subdivision for integrated branch and memory divergence tolerance
SIMD organizations amortize the area and power of fetch, decode, and issue logic across multiple processing units in order to maximize throughput for a given area and power budget...
Jiayuan Meng, David Tarjan, Kevin Skadron
IPPS
2010
IEEE
13 years 5 months ago
Optimization of linked list prefix computations on multithreaded GPUs using CUDA
We present a number of optimization techniques to compute prefix sums on linked lists and implement them on multithreaded GPUs using CUDA. Prefix computations on linked structures ...
Zheng Wei, Joseph JáJá
IWMM
2009
Springer
114views Hardware» more  IWMM 2009»
14 years 2 months ago
Scalable support for multithreaded applications on dynamic binary instrumentation systems
Dynamic binary instrumentation systems are used to inject or modify arbitrary instructions in existing binary applications; several such systems have been developed over the past ...
Kim M. Hazelwood, Greg Lueck, Robert Cohn
IISWC
2008
IEEE
14 years 2 months ago
Accelerating multi-core processor design space evaluation using automatic multi-threaded workload synthesis
The design and evaluation of microprocessor architectures is a difficult and time-consuming task. Although small, handcoded microbenchmarks can be used to accelerate performance e...
Clay Hughes, Tao Li