Search Sciweavers | Sciweavers

212 search results - page 28 / 43

» Model-based fault localization in large-scale computing syst...

IPPS 2005, IEEE

154 views, Distributed And Parallel Com...

Fault-Tolerant Parallel Applications with Dynamic Parallel Schedules

16 years 14 days ago

Download dps.epfl.ch

Commodity computer clusters are often composed of hundreds of computing nodes. These generally off-the-shelf systems are not designed for high reliability. Node failures therefore...

Sebastian Gerlach, Roger D. Hersch

GRID 2006, Springer

93 views, Distributed And Parallel Com...

The Palantir Grid Meta-Information System

15 years 6 months ago

Download personals.ac.upc.edu

Grids allow large scale resource-sharing across different administrative domains. Those diverse resources are likely to join or quit the Grid at any moment or possibly to break dow...

Francesc Guim, Ivan Rodero, M. Tomas, Julita Corba...

HPDC 2008, IEEE

155 views, Distributed And Parallel Com...

DataLab: transactional data-parallel computing on an active storage cloud

16 years 1 month ago

Download www.cse.nd.edu

Active storage clouds are an attractive platform for executing large data intensive workloads found in many fields of science. However, active storage presents new system managem...

Brandon Rich, Douglas Thain

CCGRID 2006, IEEE

131 views, Distributed And Parallel Com...

Proposal of MPI Operation Level Checkpoint/Rollback and One Implementation

16 years 29 days ago

Download icl.cs.utk.edu

With the increasing number of processors in modern HPC (High Performance Computing) systems, there are two emergent problems to solve. One is scalability, the other is fault tolera...

Yuan Tang, Graham E. Fagg, Jack Dongarra

ATAL 2003, Springer

150 views, Intelligent Agents

A protocol for multi-agent diagnosis with spatially distributed knowledge

16 years 4 days ago

Download www.st.ewi.tudelft.nl

In a large distributed system it is often infeasible or even impossible to perform diagnosis using a single model of the whole system. Instead, several spatially distributed local...

Nico Roos, Annette ten Teije, Cees Witteveen
