Sciweavers

309 search results - page 57 / 62
» Parallel Memory Architecture for Arbitrary Stride Accesses
SC
2009
ACM
14 years 2 months ago
Early performance evaluation of a "Nehalem" cluster using scientific and engineering applications
In this paper, we present an early performance evaluation of a 624-core cluster based on the Intel® Xeon® Processor 5560 (code named “Nehalem-EP”, and referred to as Xeon 55...
Subhash Saini, Andrey Naraikin, Rupak Biswas, Davi...
SPAA
1996
ACM
13 years 11 months ago
From AAPC Algorithms to High Performance Permutation Routing and Sorting
Several recent papers have proposed or analyzed optimal algorithms to route all-to-all personalized communication (AAPC) over communication networks such as meshes, hypercubes and ...
Thomas Stricker, Jonathan C. Hardwick
EMSOFT
2005
Springer
14 years 1 month ago
Optimizing inter-processor data locality on embedded chip multiprocessors
Recent research in embedded computing indicates that packing multiple processor cores on the same die is an effective way of utilizing the ever-increasing number of transistors. T...
Guilin Chen, Mahmut T. Kandemir
ISCA
2002
IEEE
13 years 7 months ago
Speculative Dynamic Vectorization
Traditional vector architectures have been shown to be very effective for regular codes where the compiler can detect data-level parallelism. However, this SIMD parallelism is also pre...
Alex Pajuelo, Antonio González, Mateo Valer...
IPPS
2008
IEEE
14 years 1 month ago
High-speed string searching against large dictionaries on the Cell/B.E. Processor
Our digital universe is growing, creating exploding amounts of data which need to be searched, protected and filtered. String searching is at the core of the tools we use to curb...
Daniele Paolo Scarpazza, Oreste Villa, Fabrizio Pe...