Sciweavers

SPDP
1991
IEEE
14 years 19 days ago
Fault-tolerant meshes with minimal numbers of spares
This paper presents several techniques for adding fault-tolerance to distributed memory parallel computers. More formally, given a target graph with n nodes, we create a fault-tol...
Jehoshua Bruck, Robert Cypher, Ching-Tien Ho
HPDC
2007
IEEE
14 years 3 months ago
Smartsockets: solving the connectivity problems in grid computing
Tightly coupled parallel applications are increasingly run in Grid environments. Unfortunately, on many Grid sites the ability of machines to create or accept network connections ...
Jason Maassen, Henri E. Bal
ICPP
2008
IEEE
14 years 3 months ago
On the Reliability of Large-Scale Distributed Systems: A Topological View
In large-scale, self-organized and distributed systems, such as peer-to-peer (P2P) overlays and wireless sensor networks (WSN), a small proportion of nodes are likely to be more c...
Yuan He, Hao Ren, Yunhao Liu, Baijian Yang
ICDCS
2007
IEEE
14 years 29 days ago
Communication-Efficient Tracking of Distributed Cumulative Triggers
In recent work, we proposed D-Trigger, a framework for tracking a global condition over a large network that allows us to detect anomalies while only collecting a very limited amo...
Ling Huang, Minos N. Garofalakis, Anthony D. Josep...
CLUSTER
2005
IEEE
14 years 2 months ago
Search-based Job Scheduling for Parallel Computer Workloads
To balance performance goals and allow administrators to declaratively specify high-level performance goals, we apply complete search algorithms to design on-line job scheduling p...
Sangsuree Vasupongayya, Su-Hui Chiang, B. Massey