Mobile agent systems have many attractive features, including asynchrony, openness, dynamicity and anonymity, which make them indispensable in designing complex modern applications that involve moving devices, human participants and software. To be comprehensive, this list should include fault tolerance, yet, as our analysis shows, this property is unfortunately often overlooked by middleware designers. The few existing solutions for fault-tolerant mobile agents are developed mainly for tolerating hardware faults and provide no general support for application-specific recovery. In this paper we describe a novel exception handling model that allows application-specific recovery in coordination-based systems consisting of mobile agents. The proposed mechanism is general enough to be used in both loosely- and tightly-coupled communication models. The general ideas behind the mechanism are applied in the context of the Lime middleware.
Alexei Iliasov, Alexander B. Romanovsky