Sciweavers

184 search results - page 3 / 37
» Job Management Requirements for NAS Parallel Systems and Clu...
CCGRID
2009
IEEE
13 years 11 months ago
Failure-Aware Construction and Reconfiguration of Distributed Virtual Machines for High Availability Computing
In large-scale clusters and computational grids, component failures become norms instead of exceptions. Failure occurrence as well as its impact on system performance and operatio...
Song Fu
CLUSTER
2007
IEEE
13 years 7 months ago
The computer as software component: A mechanism for developing and testing resource management software
In this paper, we present an architecture that encapsulates system hardware inside a software component used for job execution and status monitoring. The development of this in...
Narayan Desai, Theron Voran, Ewing L. Lusk, Andrew...
TPDS
2008
13 years 7 months ago
Security-Aware Resource Allocation for Real-Time Parallel Jobs on Homogeneous and Heterogeneous Clusters
Security is increasingly becoming an important issue in the design of real-time parallel applications, which are widely used in the industry and academic organizations. However, ex...
Tao Xie 0004, Xiao Qin
CCGRID
2006
IEEE
14 years 1 months ago
A Failure-Aware Scheduling Strategy in Large-Scale Cluster System
As the scale is expanding, node failure becomes a commonplace feature of large-scale cluster systems. As an important part of cluster operating system software, job scheduling tak...
Linping Wu, Dan Meng, Jianfeng Zhan, Wang Lei, Bib...
CLUSTER
2006
IEEE
14 years 1 months ago
Resource Management for Interactive Jobs in a Grid Environment
Most recent Grid middleware technologies have been aimed at the execution of sequential batch jobs. However, some users require interactive access when running jobs on Grid sites...
Enol Fernández, Elisa Heymann, Miquel A. Se...