Sciweavers

2226 search results - page 26 / 446
» Fault-Tolerant Parallel Applications with Dynamic Parallel S...
FOCS
1992
IEEE
13 years 11 months ago
On the Fault Tolerance of Some Popular Bounded-Degree Networks
In this paper, we analyze the fault tolerance of several bounded-degree networks that are commonly used for parallel computation. Among other things, we show that an N-node butterf...
Frank Thomson Leighton, Bruce M. Maggs, Ramesh K. ...
IPPS
2007
IEEE
14 years 1 months ago
Implementing and Evaluating Automatic Checkpointing
As the size and popularity of computer clusters continue to grow, fault tolerance is becoming a crucial factor to ensure high performance and reliability for applications. To provide...
Antonio S. Martins, Ronaldo Augusto Lara Gonç...
CCGRID
2001
IEEE
13 years 11 months ago
XtremWeb: A Generic Global Computing System
Global Computing achieves high throughput computing by harvesting a very large number of unused computing resources connected to the Internet. This parallel computing model target...
Gilles Fedak, Cécile Germain, Vincent Né...
ICA3PP
2010
Springer
13 years 7 months ago
Checkpointing and Migration of Communication Channels in Heterogeneous Grid Environments
A grid checkpointing service providing migration and transparent fault tolerance is important for distributed and parallel applications executed in heterogeneous grids. I...
John Mehnert-Spahn, Michael Schoettner
HPDC
2010
IEEE
13 years 7 months ago
Toward high performance computing in unconventional computing environments
Parallel computing on volatile distributed resources requires schedulers that consider job and resource characteristics. We study unconventional computing environments containing ...
Brent Rood, Nathan Gnanasambandam, Michael J. Lewi...