Sciweavers

ICAC
2005
IEEE
14 years 6 months ago
Distributed Troubleshooting Agents
Key issues to address in autonomic job recovery for cluster computing are recognizing job failure; understanding the failure sufficiently to know if and how to restart the job; an...
Charles Earl, Emilio Remolina, Jim Ong, John Brown