This paper presents a new distributed computing framework for Many Task Computing (MTC) applications, based on the Extensible Messaging and Presence Protocol (XMPP). A lightweight, highly available system, named Kestrel, has been developed to explore XMPP-based techniques for improving MTC system tolerance to faults that result from scaling and intermittent computing agent presence. By leveraging technologies used in large instant messaging systems that scale to millions of clients, this MTC system is designed to scale to millions of agents at various levels of granularity: cores, machines, clusters, and even sensors, which makes it a good fit for MTC. Kestrel’s architecture is inspired by the distributed design of pilot job frameworks on the grid as well as botnets, with the addition of a commodity instant messaging protocol for communications. Whereas botnet command-and-control systems have frequently used a combination of Internet Relay Chat (IRC), Distributed Hash Table (DHT), a...
Lance Stout, Michael A. Murphy, Sebastien Goasguen