Sciweavers

181 search results - page 15 / 37
» Faults in Large Distributed Systems and What We Can Do About...
CODES
2011
IEEE
12 years 7 months ago
DistRM: distributed resource management for on-chip many-core systems
The trend towards many-core systems comes with various issues, among them their highly dynamic and non-predictable workloads. Hence, new paradigms for managing resources of many-c...
Sebastian Kobbe, Lars Bauer, Daniel Lohmann, Wolfg...
P2P
2005
IEEE
14 years 1 month ago
Video Management in Peer-to-Peer Systems
Providing scalable video services in a peer-to-peer (P2P) environment is challenging. Since videos are typically large and require high communication bandwidth for delivery, many ...
Ying Cai, Zhan Chen, Wallapak Tavanapong
ISPAN
2005
IEEE
14 years 1 month ago
Supervised Peer-to-Peer Systems
In this paper we present a general methodology for designing supervised peer-to-peer systems. A supervised peer-to-peer system is a system in which the overlay network is formed b...
Kishore Kothapalli, Christian Scheideler
CLUSTER
2004
IEEE
13 years 11 months ago
Management of grid jobs and data within SAMGrid
When designing SAMGrid, a project for distributing high-energy physics computations on a grid, we discovered that it was challenging to decide where to place user's jobs. Job...
A. Baranovski, Gabriele Garzoglio, Igor Terekhov, ...
HIPC
2009
Springer
13 years 5 months ago
Extracting the textual and temporal structure of supercomputing logs
Supercomputers are prone to frequent faults that adversely affect their performance, reliability and functionality. System logs collected on these systems are a valuable resource o...
Sourabh Jain, Inderpreet Singh, Abhishek Chandra, ...