Sciweavers

1022 search results - page 192 / 205
» Automatic data and computation decomposition on distributed ...
CCGRID
2011
IEEE
12 years 11 months ago
High Performance Pipelined Process Migration with RDMA
Coordinated Checkpoint/Restart (C/R) is a widely deployed strategy to achieve fault-tolerance. However, C/R by itself is not capable enough to meet the demands of upcoming exasc...
Xiangyong Ouyang, Raghunath Rajachandrasekar, Xavi...
ICS
2003
Tsinghua U.
14 years 22 days ago
Inferential queueing and speculative push for reducing critical communication latencies
Communication latencies within critical sections constitute a major bottleneck in some classes of emerging parallel workloads. In this paper, we argue for the use of Inferentially...
Ravi Rajwar, Alain Kägi, James R. Goodman
SPAA
2004
ACM
14 years 29 days ago
Cache-oblivious shortest paths in graphs using buffer heap
We present the Buffer Heap (BH), a cache-oblivious priority queue that supports Delete-Min, Delete, and Decrease-Key operations in $O\left(\frac{1}{B}\log_2\frac{N}{B}\right)$ amortized block transfers fro...
Rezaul Alam Chowdhury, Vijaya Ramachandran
HPCA
2000
IEEE
13 years 12 months ago
Impact of Chip-Level Integration on Performance of OLTP Workloads
With increasing chip densities, future microprocessor designs have the opportunity to integrate many of the traditional system-level modules onto the same chip as the processor. So...
Luiz André Barroso, Kourosh Gharachorloo, A...
SPAA
1997
ACM
13 years 11 months ago
Efficient Detection of Determinacy Races in Cilk Programs
A parallel multithreaded program that is ostensibly deterministic may nevertheless behave nondeterministically due to bugs in the code. These bugs are called determinacy races, an...
Mingdong Feng, Charles E. Leiserson