An important practical problem in fault diagnosis is discriminating between permanent faults and transient faults. In many computer systems, the majority of errors are due to transient faults. Many heuristic methods have been used for discriminating between transient and permanent faults; however, we have found no previous work stating this decision problem in clear probabilistic terms. We present an optimal procedure for discriminating between transients and permanent faults, based on applying Bayesian inference to the observed events (correct and erroneous results). We describe how the assessed probability that a module is permanently faulty must vary with observed symptoms. We describe and demonstrate our proposed method on a simple application problem, building the appropriate equations and showing numerical examples. The method can be implemented as a run-time diagnosis algorithm at little computational cost; it can also be used to evaluate any heuristic diagnostic procedure by c...
M. Pizza, Lorenzo Strigini, Andrea Bondavalli, Fel