Sciweavers

161 search results - page 29 / 33
» Using speculative execution for fault tolerance in a real-ti...
SIGSOFT
2007
ACM
14 years 8 months ago
Efficient checkpointing of Java software using context-sensitive capture and replay
Checkpointing and replaying is an attractive technique that has been used widely at the operating/runtime system level to provide fault tolerance. Applying such a technique at the...
Guoqing Xu, Atanas Rountev, Yan Tang, Feng Qin
HPDC
2008
IEEE
14 years 2 months ago
DataLab: transactional data-parallel computing on an active storage cloud
Active storage clouds are an attractive platform for executing large data intensive workloads found in many fields of science. However, active storage presents new system managem...
Brandon Rich, Douglas Thain
ISSRE
2007
IEEE
13 years 9 months ago
Towards Self-Protecting Enterprise Applications
Enterprise systems must guarantee high availability and reliability to provide 24/7 services without interruptions and failures. Mechanisms for handling exceptional cases and impl...
Davide Lorenzoli, Leonardo Mariani, Mauro Pezzè...
SC
2009
ACM
14 years 2 months ago
Kepler + Hadoop: a general architecture facilitating data-intensive applications in scientific workflow systems
MapReduce provides a parallel and scalable programming model for data-intensive business and scientific applications. MapReduce and its de facto open source project, called Hadoop...
Jianwu Wang, Daniel Crawl, Ilkay Altintas
NOMS
2010
IEEE
13 years 5 months ago
Checkpoint-based fault-tolerant infrastructure for virtualized service providers
Crash and omission failures are common in service providers: a disk can break down or a link can fail anytime. In addition, the probability of a node failure increases with the num...
Iñigo Goiri, Ferran Julià, Jordi Gui...