Sciweavers

» High accuracy failure injection in parallel and distributed ...
IPPS
2006
IEEE
14 years 1 month ago
Fault tolerance with real-time Java
After surveying the state of the art on the theoretical feasibility of a system of periodic tasks scheduled by a preemptive fixed-priority algorithm, we show in this art...
Damien Masson, Serge Midonnet
CCGRID
2001
IEEE
13 years 11 months ago
Adaptive Prefetching Technique for Shared Virtual Memory
Though shared virtual memory (SVM) systems promise low cost solutions for high performance computing, they suffer from long memory latencies. These latencies are usually caused by...
Sang-Kwon Lee, Hee-Chul Yun, Joonwon Lee, Seungryo...
CLUSTER
2008
IEEE
14 years 2 months ago
DLM: A distributed Large Memory System using remote memory swapping over cluster nodes
Emerging 64-bit OSs supply a huge amount of memory address space that is essential for new applications using very large data. It is expected that the memory in connect...
Hiroko Midorikawa, Motoyoshi Kurokawa, Ryutaro Him...
ICDCS
1997
IEEE
13 years 12 months ago
Distributed Recovery with K-Optimistic Logging
Fault-tolerance techniques based on checkpointing and message logging have been increasingly used in real-world applications to reduce service down-time. Most industrial applicati...
Yi-Min Wang, Om P. Damani, Vijay K. Garg
ICS
2007
Tsinghua U.
14 years 1 month ago
Proactive fault tolerance for HPC with Xen virtualization
Large-scale parallel computing is relying increasingly on clusters with thousands of processors. At such large counts of compute nodes, faults are becoming commonplace. Current t...
Arun Babu Nagarajan, Frank Mueller, Christian Enge...