As parallel and distributed computers become more widely available and used, the already important process of understanding and debugging concurrent programs will take on even gre...
Stabilizability of linear time-invariant networked systems of general structure is studied with an observer-based approach. Under the assumption of piecewise constant controls an ...
The distributed nature and large scale of MapReduce programs and systems pose two challenges in using existing profiling and debugging tools to understand MapReduce pr...
This paper provides a technique, based on partially observable Markov decision processes (POMDPs), for building automatic recovery controllers to guide distributed system recovery...
Kaustubh R. Joshi, William H. Sanders, Matti A. Hi...
Buffered CoScheduled (BCS) MPI is a novel implementation of MPI based on global synchronization of all system activities. BCS-MPI imposes a model where all processes and their com...