Sciweavers

483 search results - page 84 / 97
» Fault Management in P2P-MPI
IPPS
2010
IEEE
13 years 6 months ago
Robust control-theoretic thermal balancing for server clusters
Thermal management is critical for clusters because of the increasing power consumption of modern processors, compact server architectures and growing server density in data center...
Yong Fu, Chenyang Lu, Hongan Wang
HIPC
2009
Springer
13 years 6 months ago
Extracting the textual and temporal structure of supercomputing logs
Supercomputers are prone to frequent faults that adversely affect their performance, reliability and functionality. System logs collected on these systems are a valuable resource o...
Sourabh Jain, Inderpreet Singh, Abhishek Chandra, ...
IWCC
1999
IEEE
14 years 26 days ago
Nomad: A Scalable Operating System for Clusters of Uni and Multiprocessors
The recent improvements in workstation and interconnection network performance have popularized the clusters of off-the-shelf workstations. However, the usefulness of these cluste...
Eduardo Pinheiro, Ricardo Bianchini
ATAL
2008
Springer
13 years 10 months ago
WADE: a software platform to develop mission critical applications exploiting agents and workflows
In this paper, we describe two mission critical applications currently deployed by Telecom Italia in the Operations Support System domains. The first one called "Network Neut...
Giovanni Caire, Danilo Gotta, Massimo Banzi
SIGSOFT
2007
ACM
14 years 9 months ago
Which warnings should I fix first?
Automatic bug-finding tools have a high false positive rate: most warnings do not indicate real bugs. Usually bug-finding tools assign important warnings high priority. However, t...
Sunghun Kim, Michael D. Ernst