Sciweavers

46 search results - page 6 / 10
» cf 2006
Sort
View
CF
2006
ACM
13 years 9 months ago
Intermediately executed code is the key to find refactorings that improve temporal data locality
The growing speed gap between memory and processor makes an efficient use of the cache ever more important to reach high performance. One of the most important ways to improve cac...
Kristof Beyls, Erik H. D'Hollander
CF
2006
ACM
14 years 1 months ago
Dynamic thread assignment on heterogeneous multiprocessor architectures
In a multi-programmed computing environment, threads of execution exhibit different runtime characteristics and hardware resource requirements. Not only do the behaviors of distin...
Michela Becchi, Patrick Crowley
CF
2006
ACM
14 years 1 months ago
Exploiting locality to ameliorate packet queue contention and serialization
Packet processing systems maintain high throughput despite relatively high memory latencies by exploiting the coarse-grained parallelism available between packets. In particular, ...
Sailesh Kumar, John Maschmeyer, Patrick Crowley
CF
2006
ACM
14 years 1 months ago
Kilo-instruction processors, runahead and prefetching
There is a continuous research effort devoted to overcome the memory wall problem. Prefetching is one of the most frequently used techniques. A prefetch mechanism anticipates the ...
Tanausú Ramírez, Alex Pajuelo, Olive...
CF
2006
ACM
13 years 11 months ago
Landing openMP on cyclops-64: an efficient mapping of openMP to a many-core system-on-a-chip
This paper presents our experience mapping OpenMP parallel programming model to the IBM Cyclops-64 (C64) architecture. The C64 employs a many-core-on-a-chip design that integrates...
Juan del Cuvillo, Weirong Zhu, Guang R. Gao