Sciweavers

» Fault-Tolerant Parallel Applications with Dynamic Parallel S...
ICPP
2003
IEEE
14 years 1 month ago
Data Conversion for Process/Thread Migration and Checkpointing
Process/thread migration and checkpointing schemes support load balancing, load sharing and fault tolerance to improve application performance and system resource usage on worksta...
Hai Jiang, Vipin Chaudhary, John Paul Walters
HPDC
2012
IEEE
11 years 10 months ago
Understanding the Effects and Implications of Compute Node Related Failures in Hadoop
Hadoop has become a critical component in today’s cloud environment. Ensuring good performance for Hadoop is paramount for the wide range of applications built on top of it. In ...
Florin Dinu, T. S. Eugene Ng
IPPS
1999
IEEE
14 years 5 days ago
The MuSE System: A Flexible Combination of On-Stack Execution and Work-Stealing
Executing subordinate activities by pushing return addresses on the stack is the most efficient working mode for sequential programs. It is supported by all current processors, yet i...
Markus Leberecht
SC
2003
ACM
14 years 1 month ago
BCS-MPI: A New Approach in the System Software Design for Large-Scale Parallel Computers
Buffered CoScheduled MPI (BCS-MPI) introduces a new approach to design the communication layer for large-scale parallel machines. The emphasis of BCS-MPI is on the global coordinat...
Juan Fernández, Eitan Frachtenberg, Fabrizi...
IPPS
2003
IEEE
14 years 1 month ago
A Case Study of Selected SPLASH-2 Applications and the SBT Debugging Tool
SBT is a portable library and tool for on-line debugging and performance monitoring of shared-memory parallel programs using the single-program-multiple-data (SPMD) model of paralle...
Ernesto Novillo, Paul Lu