SC
2015
ACM

Clock delta compression for scalable order-replay of non-deterministic parallel applications

The ability to record and replay program execution helps significantly in debugging non-deterministic MPI applications by reproducing message-receive orders. However, the large amount of data that traditional record-and-replay techniques record precludes their practical application to massively parallel applications. In this paper, we propose a new compression algorithm, Clock Delta Compression (CDC), for scalable record and replay of non-deterministic MPI applications. CDC defines a reference order of message receives based on a totally ordered relation using Lamport clocks, and records only the differences between this reference logical-clock order and the observed order. Our evaluation shows that CDC significantly reduces the record data size. For example, when we apply CDC to the Monte Carlo particle transport Benchmark (MCB), which represents common non-deterministic communication patterns, CDC reduces the record size by approximately two orders of magnitude compared to traditional...
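The core idea in the abstract, recording only where the observed receive order departs from a Lamport-clock reference order, can be illustrated with a small sketch. This is not the paper's encoding: the `(clock, sender_rank)` tie-break, the function names, and the `(position, reference_index)` delta format are all assumptions made for illustration, and the sketch assumes each `(clock, sender_rank)` pair is unique and that clocks are reproduced identically during replay.

```python
from typing import Iterable, List, Tuple

# A received message, identified by (lamport_clock, sender_rank).
# Assumption for this sketch: each pair is unique.
Receive = Tuple[int, int]
Delta = Tuple[int, int]  # (position in observed order, index in reference order)


def reference_order(receives: Iterable[Receive]) -> List[Receive]:
    """Total order: sort by Lamport clock, break ties by sender rank."""
    return sorted(receives)


def encode_deltas(observed: List[Receive]) -> List[Delta]:
    """Record only the positions where observed differs from the reference."""
    reference = reference_order(observed)
    index_in_ref = {msg: i for i, msg in enumerate(reference)}
    return [
        (pos, index_in_ref[msg])
        for pos, msg in enumerate(observed)
        if reference[pos] != msg
    ]


def replay(messages: Iterable[Receive], deltas: List[Delta]) -> List[Receive]:
    """Rebuild the observed order from the message set plus the deltas."""
    reference = reference_order(messages)
    order = list(reference)
    for pos, ref_idx in deltas:
        order[pos] = reference[ref_idx]
    return order


# Two messages with clock 2 arrived in the opposite order to the reference.
observed = [(1, 0), (2, 1), (2, 0), (3, 2)]
deltas = encode_deltas(observed)
restored = replay(set(observed), deltas)
```

When the observed order mostly agrees with the reference order, `deltas` stays short, which is the intuition behind the size reduction the abstract reports; an order that matches the reference exactly yields an empty record.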
Added 17 Apr 2016
Updated 17 Apr 2016
Type Journal
Year 2015
Where SC