Abstract. Efficient utilization of cluster systems requires a suitable programming and operating environment. In this context, mobile agents are of growing interest as a basis for distributed and parallel applications. As mobile, autonomous software units, mobile agents can execute tasks given to the system and independently allocate all the resources they need. However, as cluster sizes grow, so does the probability that one or more system components fail and, with it, the risk of losing mobile agents. While fault tolerance for applications based on “traditional” processes has been studied extensively, current agent environments provide insufficient extensions, if any, for reacting adequately to such failures. We examine fault tolerance with regard to the properties and requirements of mobile agents and find that independent checkpointing with receiver-based message logging is appropriate in this context. We derive the FANTOMAS (Fault-Toleran...