Sciweavers

175 search results - page 10 / 35
» Operating System Extensions for the Teradata Parallel VLDB
CCGRID
2006
IEEE
14 years 1 month ago
Proposal of MPI Operation Level Checkpoint/Rollback and One Implementation
With the increasing number of processors in modern HPC (High Performance Computing) systems, there are two emergent problems to solve. One is scalability, the other is fault tolera...
Yuan Tang, Graham E. Fagg, Jack Dongarra
IPPS
2005
IEEE
14 years 1 month ago
Fault-Tolerant Parallel Applications with Dynamic Parallel Schedules
Commodity computer clusters are often composed of hundreds of computing nodes. These generally off-the-shelf systems are not designed for high reliability. Node failures therefore...
Sebastian Gerlach, Roger D. Hersch
DSRT
2008
IEEE
14 years 1 month ago
An Adaptive Energy-Conserving Strategy for Parallel Disk Systems
In the past decade parallel disk systems have been highly scalable and able to alleviate the problem of disk I/O bottleneck, thereby being widely used to support a wide range of d...
Mais Nijim, Adam Manzanares, Xiao Qin
PPOPP
2010
ACM
14 years 1 month ago
Thread to strand binding of parallel network applications in massive multi-threaded systems
In processors with several levels of hardware resource sharing, like CMPs in which each core is an SMT, the scheduling process becomes more complex than in processors with a singl...
Petar Radojkovic, Vladimir Cakarevic, Javier Verd&...
HCW
1998
IEEE
13 years 11 months ago
CCS Resource Management in Networked HPC Systems
CCS is a resource management system for parallel high-performance computers. At the user level, CCS provides vendor-independent access to parallel systems. At the system administr...
Axel Keller, Alexander Reinefeld