Distributed databases are a widely used application of distributed systems. One of the major advantages of distributed database systems is their potential for achieving high availability in the presence of faults. Faults must be handled so that the system continues to operate, either normally or in a degraded mode. This paper focuses on detecting component errors that can lead to system failures in the scheduling part of the lock manager of a distributed database system, using embedded executable assertions. Changeling provides a systematic approach, based on the mathematical model of program verification, to deriving executable assertions that can be evaluated in a faulty distributed computing environment. A complete case study of the development of an error-detecting distributed scheduler using Changeling is presented.
Hanan Lutfiyya, Bruce M. McMillin, Alan Su