A generic theoretical framework for managing critical events in ubiquitous computing systems is presented. The main idea is to respond automatically to occurrences of critical events in the system and to mitigate them in a timely manner. This differs from traditional fault-tolerance schemes, where fault management is performed only after system failures. To model critical event management, the concept of criticality, which characterizes the effects of critical events in the system, is defined. Each criticality is associated with a timing requirement, called its window-of-opportunity, that must be met when taking mitigative actions to prevent system failures. This is in addition to any application-level timing requirements. The criticality management framework analyzes the concept of criticality in detail and provides conditions that must be satisfied for successful management of multiple criticalities in a system. We have further simulated a criticality-aware system ...
Tridib Mukherjee, Krishna M. Venkatasubramanian, S