The combination of Grid technology and web services has produced an attractive platform for deploying distributed applications: Grid services, as represented by the Open Grid Services Infrastructure (OGSI) and its Globus Toolkit implementation. As Grid services grow in popularity, tolerating failures becomes increasingly important. This paper addresses the problem of building a reliable and highly available Grid service by replicating the service on two or more hosts using the primary
Xianan Zhang, Dmitrii Zagorodnov, Matti A. Hiltune