Sciweavers

335 search results - page 48 / 67
» A job monitoring system for the LCG computing grid
GRID
2000
Springer
13 years 11 months ago
MeSch - An Approach to Resource Management in a Distributed Environment
Resource management in the typical Grid environment based on multi-MPP systems or clusters today still is one of the challenging problems. We will present MeSch, a solution for the...
Gerd Quecke, Wolfgang Ziegler
GRID
2007
Springer
14 years 1 month ago
Log summarization and anomaly detection for troubleshooting distributed systems
Today’s system monitoring tools are capable of detecting system failures such as host failures, OS errors, and network partitions in near-real time. Unfortunately, the same c...
Dan Gunter, Brian Tierney, Aaron Brown, D. Martin ...
CCGRID
2008
IEEE
14 years 2 months ago
Grid Differentiated Services: A Reinforcement Learning Approach
Large scale production grids are a major case for autonomic computing. Following the classical definition of Kephart, an autonomic computing system should optimize its own beha...
Julien Perez, Cécile Germain-Renaud, Bal&aa...
PPPJ
2006
ACM
14 years 1 month ago
Juxta-Cat: a JXTA-based platform for distributed computing
In this paper we present a JXTA-based platform, called Juxta-CAT, which is an effort to use the JXTA architecture to build a job execution-sharing distributed environment. The Ju...
Joan Esteve Riasol, Fatos Xhafa
NSDI
2010
13 years 9 months ago
MapReduce Online
MapReduce is a popular framework for data-intensive distributed computing of batch jobs. To simplify fault tolerance, many implementations of MapReduce materialize the entire outp...
Tyson Condie, Neil Conway, Peter Alvaro, Joseph M....