Sciweavers

901 search results - page 158 / 181
» Hiding Communication Latency in Data Parallel Applications
Sort
View
ICS
2009
Tsinghua U.
14 years 11 days ago
Exploring pattern-aware routing in generalized fat tree networks
New static source routing algorithms for High Performance Computing (HPC) are presented in this work. The target parallel architectures are based on the commonly used fattree netw...
Germán Rodríguez, Ramón Beivi...
ICCAD
2002
IEEE
94views Hardware» more  ICCAD 2002»
14 years 4 months ago
High-level synthesis of distributed logic-memory architectures
Abstract— With the increasing cost of global communication onchip, high-performance designs for data-intensive applications require architectures that distribute hardware resourc...
Chao Huang, Srivaths Ravi, Anand Raghunathan, Nira...
HPCA
2007
IEEE
14 years 8 months ago
Thermal Herding: Microarchitecture Techniques for Controlling Hotspots in High-Performance 3D-Integrated Processors
3D integration technology greatly increases transistor density while providing faster on-chip communication. 3D implementations of processors can simultaneously provide both laten...
Kiran Puttaswamy, Gabriel H. Loh
PPOPP
1993
ACM
13 years 12 months ago
Integrating Message-Passing and Shared-Memory: Early Experience
This paper discusses some of the issues involved in implementing a shared-address space programming model on large-scale, distributed-memory multiprocessors. While such a programm...
David A. Kranz, Kirk L. Johnson, Anant Agarwal, Jo...
CLUSTER
2011
IEEE
12 years 7 months ago
Exploring Fine-Grained Task-Based Execution on Multi-GPU Systems
Using multi-GPU systems, including GPU clusters, is gaining popularity in scientific computing. However, when using multiple GPUs concurrently, the conventional data parallel GPU...
Long Chen, Oreste Villa, Guang R. Gao