Existing approaches to debugging distributed systems involve a cycle of passive observation followed by replay of the computation. We propose predicate control as an active approach to debugging such systems. The predicate control approach involves a cycle of observation followed by controlled replay of computations, where the control is based on what was observed. We formalize the predicate control problem for both off-line and on-line scenarios. We prove that off-line predicate control for general Boolean predicates is NP-hard. However, we provide an efficient solution to off-line predicate control for the class of disjunctive predicates. We further solve on-line predicate control for disjunctive predicates under certain restrictions on the system. Lastly, we demonstrate how both off-line and on-line predicate control facilitate distributed debugging by allowing the programmer to control computations so that they maintain global safety properties.
Ashis Tarafdar, Vijay K. Garg