Sciweavers

140 search results - page 15 / 28
» Profiling and mapping of parallel workloads on network proce...
DATE
2003
IEEE
14 years 2 months ago
Communication Centric Architectures for Turbo-Decoding on Embedded Multiprocessors
Software implementations of channel decoding algorithms are attractive for communication systems with their large variety of existing and emerging standards due to their flexibil...
Frank Gilbert, Michael J. Thul, Norbert Wehn
HPCA
2003
IEEE
14 years 9 months ago
Active I/O Switches in System Area Networks
We present an active switch architecture to improve the performance of systems connected via system area networks. Our programmable active switches not only flexibly route packets...
Ming Hao, Mark Heinrich
ICPADS
2006
IEEE
14 years 2 months ago
Scalable Hybrid Designs for Linear Algebra on Reconfigurable Computing Systems
Recently, high-end reconfigurable computing systems that employ Field-Programmable Gate Arrays (FPGAs) as hardware accelerators for general-purpose processors have been built. T...
Ling Zhuo, Viktor K. Prasanna
EUROPAR
2008
Springer
13 years 10 months ago
Mapping Heterogeneous Distributed Applications on Clusters
Performance of distributed applications largely depends on the mapping of their components on the underlying architecture. On one side, component-based approaches provide an abstraction su...
Sylvain Jubertie, Emmanuel Melin, Jéré...
ICPP
2008
IEEE
14 years 3 months ago
Taming Single-Thread Program Performance on Many Distributed On-Chip L2 Caches
This paper presents a two-part study on managing distributed NUCA (Non-Uniform Cache Architecture) L2 caches in a future manycore processor to obtain high single-thread program per...
Lei Jin, Sangyeun Cho