Sciweavers

198 search results - page 36 / 40
HPDC
2007
IEEE
14 years 1 month ago
Feedback-directed thread scheduling with memory considerations
This paper describes a novel approach to generate an optimized schedule to run threads on distributed shared memory (DSM) systems. The approach relies upon a binary instrumentatio...
Fengguang Song, Shirley Moore, Jack Dongarra
IPPS
2002
IEEE
14 years 11 days ago
Failure Behavior Analysis for Reliable Distributed Embedded Systems
Failure behavior analysis is a very important phase in developing large distributed embedded systems with weak safety requirements which do graceful degradation in case of failure...
Mario Trapp, Bernd Schürmann, Torsten Tettero...
EUROPAR
2005
Springer
14 years 1 month ago
Faults in Large Distributed Systems and What We Can Do About Them
Scientists are increasingly using large distributed systems built from commodity off-the-shelf components to perform scientific computation. Grid computing has expanded the scale ...
George Kola, Tevfik Kosar, Miron Livny
PODC
2010
ACM
13 years 11 months ago
Adaptive system anomaly prediction for large-scale hosting infrastructures
Large-scale hosting infrastructures require automatic system anomaly management to achieve continuous system operation. In this paper, we present a novel adaptive runtime anomaly ...
Yongmin Tan, Xiaohui Gu, Haixun Wang
ISORC
2006
IEEE
14 years 1 month ago
Dynamically Deploying Web Services on a Grid using Dynasoar
Dynasoar is an infrastructure for dynamically deploying Web Services over a Grid or the Internet. It enables an approach to Grid computing in which distributed applications are bu...
Paul Watson, Chris Fowler, Charles Kubicek, Arijit...