Sciweavers

HPCA
2005
IEEE
A Performance Comparison of DRAM Memory System Optimizations for SMT Processors
Memory system optimizations have been well studied on single-threaded systems; however, the wide use of simultaneous multithreading (SMT) techniques raises questions over their ef...
Zhichun Zhu, Zhao Zhang
CLUSTER
2003
IEEE
Improving the Performance of MPI Derived Datatypes by Optimizing Memory-Access Cost
The MPI Standard supports derived datatypes, which allow users to describe noncontiguous memory layout and communicate noncontiguous data with a single communication function. Thi...
Surendra Byna, William D. Gropp, Xian-He Sun, Raje...
PC
2007
High performance combinatorial algorithm design on the Cell Broadband Engine processor
The Sony–Toshiba–IBM Cell Broadband Engine (Cell/B.E.) is a heterogeneous multicore architecture that consists of a traditional microprocessor (PPE) with eight SIMD co-process...
David A. Bader, Virat Agarwal, Kamesh Madduri, Seu...
ACMMSP
2004
ACM
Instruction combining for coalescing memory accesses using global code motion
Instruction combining is an optimization that replaces a sequence of instructions with a more efficient instruction yielding the same result in fewer machine cycles. When we use it...
Motohiro Kawahito, Hideaki Komatsu, Toshio Nakatan...
PERCOM
2010
ACM
Collaborative real-time speaker identification for wearable systems
We present an unsupervised speaker identification system for personal annotations of conversations and meetings. The system dynamically learns new speakers and recognizes already k...
Mirco Rossi, Oliver Amft, Martin Kusserow, Gerhard...