Sciweavers

835 search results - page 127 / 167
» On optimal slicing of parallel programs
Sort
View
ISHPC
2003
Springer
14 years 3 months ago
Code and Data Transformations for Improving Shared Cache Performance on SMT Processors
Simultaneous multithreaded processors use shared on-chip caches, which yield better cost-performance ratios. Sharing a cache between simultaneously executing threads causes excessi...
Dimitrios S. Nikolopoulos
IPPS
2009
IEEE
14 years 5 months ago
Designing multi-leader-based Allgather algorithms for multi-core clusters
The increasing demand for computational cycles is being met by the use of multi-core processors. Having large number of cores per node necessitates multi-core aware designs to ext...
Krishna Chaitanya Kandalla, Hari Subramoni, Gopala...
ICASSP
2008
IEEE
14 years 4 months ago
Address assignment sensitive variable partitioning and scheduling for DSPS with multiple memory banks
Multiple memory banks design is employed in many high performance DSP processors. This architectural feature supports higher memory bandwidth by allowing multiple data memory acce...
Chun Jason Xue, Tiantian Liu, Zili Shao, Jingtong ...
LCPC
1997
Springer
14 years 2 months ago
Automatic Data Decomposition for Message-Passing Machines
The data distribution problem is very complex, because it involves trade-offdecisions between minimizing communication and maximizing parallelism. A common approach towards solving...
Mirela Damian-Iordache, Sriram V. Pemmaraju
EUROPAR
2009
Springer
14 years 2 months ago
Fast and Efficient Synchronization and Communication Collective Primitives for Dual Cell-Based Blades
The Cell Broadband Engine (Cell BE) is a heterogeneous multi-core processor specifically designed to exploit thread-level parallelism. Its memory model comprehends a common shared ...
Epifanio Gaona, Juan Fernández, Manuel E. A...