Software-intensive systems often reach a size and complexity that exceed the comprehension of even talented, experienced system designers and analysts. With this complexity comes the potential for undetected errors in the system. While software often causes or exacerbates this problem, the form of the software itself can be used to ameliorate it through what is referred to as a survivability architecture. In a system with a survivability architecture, under adverse conditions such as system damage or software failures, some desirable functionality is eliminated but critical services are retained. Making a system survivable rather than highly reliable or highly available has many advantages, including overall system simplification and reduced demands on assurance technology. In this paper, we explore the motivation for survivability, how it might be used, what the concept means in a precise and testable sense, and how it is being implemented in two very different ap...
John C. Knight, Elisabeth A. Strunk