Sciweavers

643 search results - page 118 / 129
» Using Hardware Counters to Automatically Improve Memory Perf...
Sort
View
ASAP
2007
IEEE
219views Hardware» more  ASAP 2007»
14 years 3 months ago
SIMD Vectorization of Histogram Functions
Existing SIMD extensions cannot efficiently vectorize the histogram function due to memory collisions. We propose two techniques to avoid this problem. In the first, a hierarchi...
Asadollah Shahbahrami, Ben H. H. Juurlink, Stamati...
ICCAD
1993
IEEE
111views Hardware» more  ICCAD 1993»
14 years 1 months ago
Unifying synchronous/asynchronous state machine synthesis
We present a design style and synthesis algorithm that encompasses both asynchronous and synchronous state machines. Our proposed design style not only supports generalized “bur...
Kenneth Y. Yun, David L. Dill
HPCA
2011
IEEE
13 years 18 days ago
HAQu: Hardware-accelerated queueing for fine-grained threading on a chip multiprocessor
Queues are commonly used in multithreaded programs for synchronization and communication. However, because software queues tend to be too expensive to support finegrained paralle...
Sanghoon Lee, Devesh Tiwari, Yan Solihin, James Tu...
MSS
2003
IEEE
151views Hardware» more  MSS 2003»
14 years 2 months ago
Accurate Modeling of Cache Replacement Policies in a Data Grid
Caching techniques have been used to improve the performance gap of storage hierarchies in computing systems. In data intensive applications that access large data files over wid...
Ekow J. Otoo, Arie Shoshani
ISCA
2000
IEEE
156views Hardware» more  ISCA 2000»
14 years 1 months ago
CHIMAERA: a high-performance architecture with a tightly-coupled reconfigurable functional unit
Reconfigurable hardware has the potential for significant performance improvements by providing support for application−specific operations. We report our experience with Chimae...
Zhi Alex Ye, Andreas Moshovos, Scott Hauck, Prithv...