Reliability is a key issue in Service-Oriented Architecture (SOA), which is widely employed in critical domains such as e-commerce and e-government. Redundancy-based fault tole...
Fault tolerance is a major concern to guarantee availability of critical services as well as application execution. Traditional approaches for fault tolerance include checkpoint/r...
In this paper we propose a distributed meta-index using a peer-to-peer protocol to allow spatio-temporal queries of moving objects on a large set of distributed database servers. W...
As parallel file systems span larger and larger numbers of nodes in order to provide the performance and scalability necessary for modern cluster applications, the need for fau...
Replication is a key technique for improving fault tolerance. Replication can also improve application performance under some circumstances, but can have the opposite effect under...
Replication is widely used to improve fault tolerance in distributed and multi-agent systems. In this paper, we present a different point of view on replication in multi-agent syst...
The ability to guarantee that a system will continue to operate correctly under degraded conditions is key to the success of adopting multi-agent systems (MAS) as a paradigm for d...
General-purpose middleware, by definition, cannot readily support domain-specific semantics without significant manual effort in specializing the middleware. This paper prese...
Sumant Tambe, Akshay Dabholkar, Aniruddha S. Gokha...
Considerable work has been done on providing fault tolerance capabilities for different software components on large-scale high-end computing systems. Thus far, however, these fa...
Rinku Gupta, Pete Beckman, Byung-Hoon Park, Ewing ...
Based on the framework of service-oriented architecture (SOA), complex distributed systems can be dynamically and automatically composed by integrating distributed Web services pr...