This paper deals with a novel, distributed, QoS-aware, peer-to-peer checkpointing arrangement component for mobile Grid (MoG) computing systems middleware. Checkpointing is more crucial in MoG systems than in their wired counterparts because node mobility and less reliable wireless links result in frequent and dynamic connections and disconnections. Having determined that finding the globally optimal checkpoint arrangement is NP-complete, we consider our Reliability Driven (ReD) protocol, which employs QoS-aware heuristics to construct superior peer-to-peer checkpointing arrangements efficiently.

Categories and Subject Descriptors
D.4.7 [Operating Systems]: Organization and Design – Distributed systems.

General Terms
Algorithms, Measurement, Performance, Design, Reliability.

Keywords
Peer-to-Peer wireless checkpointing arrangement, computational Grids, collaborative job execution, Mobile Grids, Checkpointing
Paul J. Darby III, Nian-Feng Tzeng