Sciweavers

2226 search results - page 168 / 446
» Fault-Tolerant Parallel Applications with Dynamic Parallel S...
CLUSTER
2007
IEEE
14 years 2 months ago
Satisfying your dependencies with SuperMatrix
SuperMatrix out-of-order scheduling leverages high-level abstractions and straightforward data dependency analysis to provide a general-purpose mechanism for obtaining parallelism from...
Ernie Chan, Field G. Van Zee, Enrique S. Quintana-...
CLUSTER
2011
IEEE
12 years 7 months ago
Exploring Fine-Grained Task-Based Execution on Multi-GPU Systems
Using multi-GPU systems, including GPU clusters, is gaining popularity in scientific computing. However, when using multiple GPUs concurrently, the conventional data parallel GPU...
Long Chen, Oreste Villa, Guang R. Gao
IPPS
2003
IEEE
14 years 1 month ago
UMR: A Multi-Round Algorithm for Scheduling Divisible Workloads
In this paper we present an algorithm for scheduling parallel applications that consist of a divisible workload. Our algorithm uses multiple rounds to overlap communication and co...
Yang Yang, Henri Casanova
ISCA
2012
IEEE
11 years 10 months ago
Harmony: Collection and analysis of parallel block vectors
Efficient execution of well-parallelized applications is central to performance in the multicore era. Program analysis tools support the hardware and software sides of this effor...
Melanie Kambadur, Kui Tang, Martha A. Kim
INFOCOM
2007
IEEE
14 years 2 months ago
On the Extreme Parallelism Inside Next-Generation Network Processors
Next-generation high-end Network Processors (NP) must address demands from both diversified applications and ever-increasing traffic pressure. One major challenge is to design an e...
Lei Shi, Yue Zhang 0006, Jianming Yu, Bo Xu, Bin L...