Sciweavers

442 search results - page 69 / 89
» Parallel programming over ChinaGrid
Sort
View
PPOPP
2009
ACM
14 years 10 months ago
A compiler-directed data prefetching scheme for chip multiprocessors
Data prefetching has been widely used in the past as a technique for hiding memory access latencies. However, data prefetching in multi-threaded applications running on chip multi...
Dhruva Chakrabarti, Mahmut T. Kandemir, Mustafa Ka...
CLUSTER
2007
IEEE
14 years 4 months ago
Balancing productivity and performance on the cell broadband engine
— The Cell Broadband Engine (BE) is a heterogeneous multicore processor, combining a general-purpose POWER architecture core with eight independent single-instructionmultiple-dat...
Sadaf R. Alam, Jeremy S. Meredith, Jeffrey S. Vett...
CODES
2003
IEEE
14 years 3 months ago
Design space exploration of a hardware-software co-designed GF(2m) galois field processor for forward error correction and crypt
This paper describes a hardware-software co-design approach for flexible programmable Galois Field Processing for applications which require operations over GF(2m ), such as RS an...
Wei Ming Lim, Mohammed Benaissa
HPCA
1999
IEEE
14 years 2 months ago
Dynamically Exploiting Narrow Width Operands to Improve Processor Power and Performance
In general-purpose microprocessors, recent trends have pushed towards 64-bit word widths, primarily to accommodate the large addressing needs of some programs. Many integer proble...
David Brooks, Margaret Martonosi
PLDI
1995
ACM
14 years 1 months ago
Unifying Data and Control Transformations for Distributed Shared Memory Machines
We present a unified approach to locality optimization that employs both data and control transformations. Data transformations include changing the array layout in memory. Contr...
Michal Cierniak, Wei Li