Sciweavers

6775 search results - page 74 / 1355
» Diagnosis of Active Systems
CLUSTER
2006
IEEE
14 years 3 months ago
JOSHUA: Symmetric Active/Active Replication for Highly Available HPC Job and Resource Management
Most of today's HPC systems employ a single head node for control, which represents a single point of failure as it interrupts an entire HPC system upon failure. Furthermore, it...
Kai Uhlemann, Christian Engelmann, Stephen L. Scot...
ICDCS
2010
IEEE
13 years 10 months ago
Minimizing Probing Cost and Achieving Identifiability in Network Link Monitoring
Continuously monitoring the link performance is important to network diagnosis. Recently, active probes sent between end systems are widely used to monitor the link performance. I...
Qiang Zheng, Guohong Cao
IPPS
2007
IEEE
14 years 3 months ago
Detecting Runtime Environment Interference with Parallel Application Behavior
Many performance problems observed in high end systems are actually caused by the runtime system and not the application code. Detecting these cases will require parallel performa...
Rashawn L. Knapp, Karen L. Karavanic, Douglas M. P...
FOSSACS
2006
Springer
14 years 19 days ago
Distributed Unfolding of Petri Nets
Some recent Petri net-based approaches to fault diagnosis of distributed systems suggest to factor the problem into local diagnoses based on the unfoldings of local views of the sy...
Paolo Baldan, Stefan Haar, Barbara König
CDC
2008
IEEE
14 years 3 months ago
Optimal sensor activation in controlled discrete event systems
The problem of sensor activation in a controlled discrete event system is considered. Sensors are assumed to be costly and can be turned on/off during the operation of the syst...
Weilin Wang, Stéphane Lafortune, Feng Lin