This paper analyzes the performability of client-server applications that use a separate fault management architecture to monitor and control the status of the application software and hardware. The analysis considers the impact of the management components and connections, and of their reliability, on performability. The approach combines minpath algorithms, Layered Queueing analysis, and non-coherent fault tree analysis techniques to compute the expected reward rate of the application efficiently.

Keywords: System Fault-tolerance, Performability, Distributed Systems, Noncoherent fault trees, Layered Queueing Networks.
Olivia Das, C. Murray Woodside